Skip to content

Commit

Permalink
[jaeger-v2] Fix e2e storage integration v0.99.0 OTEL col failing (#5419)
Browse files Browse the repository at this point in the history
## Which problem is this PR solving?
- Fix failing CI at all of the v2 storage integration #5397 

> 1. What was the observable problem after upgrade? I.e. how did it
manifest?

My assumption was because of OTEL col changed the `ToClientConn` gRPC
implementation from DialContext to NewClient here
open-telemetry/opentelemetry-collector#9965.
Both integration spanWriter and jaegerexporter use OTLP exporter which
creates the gRPC connection through the `ToClientConn` above.

> 2. What do you believe the root cause was?

The gRPC connection from OTLP exporter has not yet been successfully
settled but the requests have been sent.

From the #5397 PR failing CI at Badger and gRPC storages, Badger failed
only at GetServices and GetOperations while gRPC failed at GetServices,
GetOperations, and FindTraces.

Since both of them failed at GetServices and GetOperations which is the
beginning of the tests, the "spanWriter OTLP exporter" to "OTEL col
binary OTLP receiver" connection probably has not yet been successfully
settled.

Then, the Badger storage doesn't have any issue with the FindTraces
because it cleanup the storage with purge() functionality but the gRPC
restarts the remote storage server. With the same assumption, the gRPC
connection from OTEL col binary to the remote storage server probably
has not been settled yet.

> 3. How does the change fix the root cause?

Retrying the requests solves the issue. Here are my arguments:
- The spanWriter's OTLP exporter retries only at GetServices after the
retry config has been enabled, previously requests failed at GetServices
and GetOperations.
- The jaegerexporter's OTLP exporter has the same error message "plugin
error: rpc error: code = Unavailable desc = connection error: desc =
\"transport: Error while dialing: dial tcp [::1]:17271: connect:
connection refused" but success at the 2nd attempt.

## Description of the changes
- Change the spanWriter log level to DEBUG
- Enable retry to integration spanWriter

## How was this change tested?
- Run `make jaeger-v2-storage-integration-test` until it fails since it
has a low probability of reproducing the same error

## Checklist
- [x] I have read
https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [ ] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `yarn lint` and `yarn test`

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Yuri Shkuro <github@ysh.us>
Signed-off-by: James Ryans <james.ryans2012@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yuri Shkuro <github@ysh.us>
Co-authored-by: Yuri Shkuro <yurishkuro@users.noreply.github.com>
  • Loading branch information
4 people committed May 5, 2024
1 parent 2bb3d01 commit 4dc7708
Show file tree
Hide file tree
Showing 5 changed files with 179 additions and 177 deletions.
2 changes: 1 addition & 1 deletion cmd/jaeger/internal/exporters/storageexporter/factory.go
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ func createTracesExporter(ctx context.Context, set exporter.CreateSettings, conf
exporterhelper.WithCapabilities(consumer.Capabilities{MutatesData: false}),
// Disable Timeout/RetryOnFailure and SendingQueue
exporterhelper.WithTimeout(exporterhelper.TimeoutSettings{Timeout: 0}),
exporterhelper.WithRetry(configretry.BackOffConfig{Enabled: false}),
exporterhelper.WithRetry(configretry.NewDefaultBackOffConfig()),
exporterhelper.WithQueue(exporterhelper.QueueSettings{Enabled: false}),
exporterhelper.WithStart(ex.start),
exporterhelper.WithShutdown(ex.close),
Expand Down
5 changes: 3 additions & 2 deletions cmd/jaeger/internal/integration/e2e_integration.go
Original file line number Diff line number Diff line change
Expand Up @@ -15,10 +15,11 @@ import (
"time"

"github.com/stretchr/testify/require"
"go.uber.org/zap"
"go.uber.org/zap/zaptest"
"gopkg.in/yaml.v3"

"github.com/jaegertracing/jaeger/cmd/jaeger/internal/integration/storagecleaner"
"github.com/jaegertracing/jaeger/pkg/testutils"
"github.com/jaegertracing/jaeger/plugin/storage/integration"
"github.com/jaegertracing/jaeger/ports"
)
Expand All @@ -44,7 +45,7 @@ type E2EStorageIntegration struct {
// it also initialize the SpanWriter and SpanReader below.
// This function should be called before any of the tests start.
func (s *E2EStorageIntegration) e2eInitialize(t *testing.T, storage string) {
logger, _ := testutils.NewLogger()
logger := zaptest.NewLogger(t, zaptest.Level(zap.DebugLevel))
configFile := createStorageCleanerConfig(t, s.ConfigFile, storage)
t.Logf("Starting Jaeger-v2 in the background with config file %s", configFile)
cmd := exec.Cmd{
Expand Down
1 change: 0 additions & 1 deletion cmd/jaeger/internal/integration/span_writer.go
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,6 @@ func createSpanWriter(logger *zap.Logger, port int) (*spanWriter, error) {
factory := otlpexporter.NewFactory()
cfg := factory.CreateDefaultConfig().(*otlpexporter.Config)
cfg.Endpoint = fmt.Sprintf("localhost:%d", port)
cfg.RetryConfig.Enabled = false
cfg.QueueConfig.Enabled = false
cfg.TLSSetting = configtls.ClientConfig{
Insecure: true,
Expand Down
116 changes: 59 additions & 57 deletions go.mod
Original file line number Diff line number Diff line change
@@ -1,6 +1,8 @@
module github.com/jaegertracing/jaeger

go 1.21
go 1.21.0

toolchain go1.22.2

require (
github.com/HdrHistogram/hdrhistogram-go v1.1.2
Expand All @@ -24,12 +26,12 @@ require (
github.com/hashicorp/go-plugin v1.6.0
github.com/kr/pretty v0.3.1
github.com/olivere/elastic v6.2.37+incompatible
github.com/open-telemetry/opentelemetry-collector-contrib/connector/spanmetricsconnector v0.98.0
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/kafkaexporter v0.98.0
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/translator/jaeger v0.98.0
github.com/open-telemetry/opentelemetry-collector-contrib/receiver/jaegerreceiver v0.98.0
github.com/open-telemetry/opentelemetry-collector-contrib/receiver/kafkareceiver v0.98.0
github.com/open-telemetry/opentelemetry-collector-contrib/receiver/zipkinreceiver v0.98.0
github.com/open-telemetry/opentelemetry-collector-contrib/connector/spanmetricsconnector v0.99.0
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/kafkaexporter v0.99.0
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/translator/jaeger v0.99.0
github.com/open-telemetry/opentelemetry-collector-contrib/receiver/jaegerreceiver v0.99.0
github.com/open-telemetry/opentelemetry-collector-contrib/receiver/kafkareceiver v0.99.0
github.com/open-telemetry/opentelemetry-collector-contrib/receiver/zipkinreceiver v0.99.0
github.com/prometheus/client_golang v1.19.0
github.com/prometheus/client_model v0.6.1
github.com/prometheus/common v0.53.0
Expand All @@ -40,28 +42,28 @@ require (
github.com/stretchr/testify v1.9.0
github.com/uber/jaeger-client-go v2.30.0+incompatible
github.com/xdg-go/scram v1.1.2
go.opentelemetry.io/collector/component v0.98.0
go.opentelemetry.io/collector/config/configgrpc v0.98.0
go.opentelemetry.io/collector/config/confighttp v0.98.0
go.opentelemetry.io/collector/config/configretry v0.98.0
go.opentelemetry.io/collector/config/configtls v0.98.0
go.opentelemetry.io/collector/confmap v0.98.0
go.opentelemetry.io/collector/connector v0.98.0
go.opentelemetry.io/collector/connector/forwardconnector v0.98.0
go.opentelemetry.io/collector/consumer v0.98.0
go.opentelemetry.io/collector/exporter v0.98.0
go.opentelemetry.io/collector/exporter/otlpexporter v0.98.0
go.opentelemetry.io/collector/exporter/otlphttpexporter v0.98.0
go.opentelemetry.io/collector/extension v0.98.0
go.opentelemetry.io/collector/extension/ballastextension v0.98.0
go.opentelemetry.io/collector/extension/zpagesextension v0.98.0
go.opentelemetry.io/collector/otelcol v0.98.0
go.opentelemetry.io/collector/pdata v1.5.0
go.opentelemetry.io/collector/processor v0.98.0
go.opentelemetry.io/collector/processor/batchprocessor v0.98.0
go.opentelemetry.io/collector/processor/memorylimiterprocessor v0.98.0
go.opentelemetry.io/collector/receiver v0.98.0
go.opentelemetry.io/collector/receiver/otlpreceiver v0.98.0
go.opentelemetry.io/collector/component v0.99.0
go.opentelemetry.io/collector/config/configgrpc v0.99.0
go.opentelemetry.io/collector/config/confighttp v0.99.0
go.opentelemetry.io/collector/config/configretry v0.99.0
go.opentelemetry.io/collector/config/configtls v0.99.0
go.opentelemetry.io/collector/confmap v0.99.0
go.opentelemetry.io/collector/connector v0.99.0
go.opentelemetry.io/collector/connector/forwardconnector v0.99.0
go.opentelemetry.io/collector/consumer v0.99.0
go.opentelemetry.io/collector/exporter v0.99.0
go.opentelemetry.io/collector/exporter/otlpexporter v0.99.0
go.opentelemetry.io/collector/exporter/otlphttpexporter v0.99.0
go.opentelemetry.io/collector/extension v0.99.0
go.opentelemetry.io/collector/extension/ballastextension v0.99.0
go.opentelemetry.io/collector/extension/zpagesextension v0.99.0
go.opentelemetry.io/collector/otelcol v0.99.0
go.opentelemetry.io/collector/pdata v1.6.0
go.opentelemetry.io/collector/processor v0.99.0
go.opentelemetry.io/collector/processor/batchprocessor v0.99.0
go.opentelemetry.io/collector/processor/memorylimiterprocessor v0.99.0
go.opentelemetry.io/collector/receiver v0.99.0
go.opentelemetry.io/collector/receiver/otlpreceiver v0.99.0
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.51.0
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.51.0
go.opentelemetry.io/otel v1.26.0
Expand All @@ -85,7 +87,7 @@ require (
require (
github.com/IBM/sarama v1.43.1 // indirect
github.com/VividCortex/gohistogram v1.0.0 // indirect
github.com/aws/aws-sdk-go v1.51.17 // indirect
github.com/aws/aws-sdk-go v1.51.22 // indirect
github.com/beorn7/perks v1.0.1 // indirect
github.com/cenkalti/backoff/v4 v4.3.0 // indirect
github.com/cespare/xxhash v1.1.0 // indirect
Expand Down Expand Up @@ -148,14 +150,14 @@ require (
github.com/mostynb/go-grpc-compression v1.2.2 // indirect
github.com/oklog/run v1.1.0 // indirect
github.com/onsi/ginkgo v1.16.5 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/extension/storage v0.98.0
github.com/open-telemetry/opentelemetry-collector-contrib/internal/common v0.98.0 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/internal/coreinternal v0.98.0 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/internal/kafka v0.98.0 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/batchpersignal v0.98.0 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/pdatautil v0.98.0 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/translator/azure v0.98.0 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/translator/zipkin v0.98.0 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/extension/storage v0.99.0
github.com/open-telemetry/opentelemetry-collector-contrib/internal/common v0.99.0 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/internal/coreinternal v0.99.0 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/internal/kafka v0.99.0 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/batchpersignal v0.99.0 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/pdatautil v0.99.0 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/translator/azure v0.99.0 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/translator/zipkin v0.99.0 // indirect
github.com/opentracing/opentracing-go v1.2.0 // indirect
github.com/openzipkin/zipkin-go v0.4.2 // indirect
github.com/pelletier/go-toml/v2 v2.1.0 // indirect
Expand Down Expand Up @@ -187,25 +189,25 @@ require (
github.com/xdg-go/stringprep v1.0.4 // indirect
github.com/yusufpapurcu/wmi v1.2.4 // indirect
go.opencensus.io v0.24.0 // indirect
go.opentelemetry.io/collector v0.98.0 // indirect
go.opentelemetry.io/collector/config/configauth v0.98.0 // indirect
go.opentelemetry.io/collector/config/configcompression v1.5.0 // indirect
go.opentelemetry.io/collector/config/confignet v0.98.0 // indirect
go.opentelemetry.io/collector/config/configopaque v1.5.0 // indirect
go.opentelemetry.io/collector/config/configtelemetry v0.98.0 // indirect
go.opentelemetry.io/collector/config/internal v0.98.0 // indirect
go.opentelemetry.io/collector/confmap/converter/expandconverter v0.98.0 // indirect
go.opentelemetry.io/collector/confmap/provider/envprovider v0.98.0 // indirect
go.opentelemetry.io/collector/confmap/provider/fileprovider v0.98.0 // indirect
go.opentelemetry.io/collector/confmap/provider/httpprovider v0.98.0 // indirect
go.opentelemetry.io/collector/confmap/provider/httpsprovider v0.98.0 // indirect
go.opentelemetry.io/collector/confmap/provider/yamlprovider v0.98.0 // indirect
go.opentelemetry.io/collector/exporter/debugexporter v0.98.0
go.opentelemetry.io/collector/extension/auth v0.98.0 // indirect
go.opentelemetry.io/collector/featuregate v1.5.0 // indirect
go.opentelemetry.io/collector/semconv v0.98.0 // indirect
go.opentelemetry.io/collector/service v0.98.0 // indirect
go.opentelemetry.io/contrib/config v0.4.0 // indirect
go.opentelemetry.io/collector v0.99.0 // indirect
go.opentelemetry.io/collector/config/configauth v0.99.0 // indirect
go.opentelemetry.io/collector/config/configcompression v1.6.0 // indirect
go.opentelemetry.io/collector/config/confignet v0.99.0 // indirect
go.opentelemetry.io/collector/config/configopaque v1.6.0 // indirect
go.opentelemetry.io/collector/config/configtelemetry v0.99.0 // indirect
go.opentelemetry.io/collector/config/internal v0.99.0 // indirect
go.opentelemetry.io/collector/confmap/converter/expandconverter v0.99.0 // indirect
go.opentelemetry.io/collector/confmap/provider/envprovider v0.99.0 // indirect
go.opentelemetry.io/collector/confmap/provider/fileprovider v0.99.0 // indirect
go.opentelemetry.io/collector/confmap/provider/httpprovider v0.99.0 // indirect
go.opentelemetry.io/collector/confmap/provider/httpsprovider v0.99.0 // indirect
go.opentelemetry.io/collector/confmap/provider/yamlprovider v0.99.0 // indirect
go.opentelemetry.io/collector/exporter/debugexporter v0.99.0
go.opentelemetry.io/collector/extension/auth v0.99.0 // indirect
go.opentelemetry.io/collector/featuregate v1.6.0 // indirect
go.opentelemetry.io/collector/semconv v0.99.0 // indirect
go.opentelemetry.io/collector/service v0.99.0 // indirect
go.opentelemetry.io/contrib/config v0.5.0 // indirect
go.opentelemetry.io/contrib/propagators/b3 v1.25.0 // indirect
go.opentelemetry.io/contrib/zpages v0.50.0 // indirect
go.opentelemetry.io/otel/bridge/opencensus v1.25.0 // indirect
Expand Down
Loading

0 comments on commit 4dc7708

Please sign in to comment.