Migrating to 2.8.0 from 1.44.0 (mainly collector-related confusion) #7302
Replies: 1 comment
-
Hi @luckyycode! I'm Dosu and I'm helping the Jaeger team. There are known issues with trace sampling and visibility when migrating to Jaeger 2.x, especially with the new Go-based collector architecture and its reliance on OpenTelemetry Collector components. Your config looks structurally correct, but a few things stand out.
A few troubleshooting steps:
If you're using OpenTelemetry SDKs on the client side, be aware that adaptive/remote sampling may not work as expected due to missing attributes; you may need to use a constant sampler or adjust your collector's sampling configuration accordingly (discussion). Let me know if you see any relevant metrics or logs after making these changes, or if traces start flowing again with a simplified processor pipeline. To reply, just mention @dosu.
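For what it's worth, here is a minimal sketch of the constant-sampler fallback mentioned above, expressed as a Kubernetes pod-spec fragment using the standard OpenTelemetry SDK environment variables. The container name and collector endpoint are placeholders, not details from this thread.

```yaml
# Hypothetical pod-spec fragment: pin a parent-based always-on sampler in the
# OpenTelemetry SDK instead of relying on remote/adaptive sampling.
spec:
  containers:
    - name: my-service                          # placeholder container name
      env:
        - name: OTEL_TRACES_SAMPLER
          value: parentbased_always_on          # constant, parent-based sampling
        - name: OTEL_EXPORTER_OTLP_ENDPOINT
          value: http://jaeger-collector:4317   # placeholder OTLP/gRPC endpoint
```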
-
In 1.44 I had the default_strategy type set to 'const', and every trace was visible in my project, as I wanted.
After reading the new docs and upgrading to 2.8.0 I got weird behavior: for the first minute after starting the collector I get part of my traces, and then nothing; it just does not output any traces anymore for hours.
The client side is set up correctly: parent-based always-sample or plain always-sample samplers, over gRPC.
Here's my config. NOTE: I tried a probabilistic value of 1 (always sample), I tried queue size params and timeouts, and I tried a batch-only processor with no params. No luck; the behavior is the same as described above.
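Since the config block itself is not reproduced above, here is a minimal, hypothetical sketch of a Jaeger v2 collector pipeline of the shape described (OTLP over gRPC in, batch processor, Cassandra storage through the jaeger_storage extension). The backend name, keyspace, and exact Cassandra connection keys are assumptions and have shifted between 2.x releases, so compare against the config-cassandra.yaml shipped with your Jaeger version.

```yaml
# Hypothetical Jaeger v2 collector config sketch; not the poster's actual file.
service:
  extensions: [jaeger_storage]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [jaeger_storage_exporter]
  telemetry:
    logs:
      level: debug                         # matches the debug-level logging mentioned below

extensions:
  jaeger_storage:
    backends:
      cassandra_main:                      # placeholder backend name
        cassandra:
          schema:
            keyspace: jaeger_v1_dc1        # assumed keyspace
          connection:
            servers: ["cassandra:9042"]    # placeholder host; key layout may differ per release

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch: {}                                # defaults only, as in the batch-only experiment above

exporters:
  jaeger_storage_exporter:
    trace_storage: cassandra_main          # must reference the backend defined above
```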
The jaeger-query extension runs by itself in another cluster; there are no problems with jaeger-query there.
These are the only logs I got from the jaeger process at debug level, with no errors:
Cassandra is accessible and there are no strict firewall rules. Rolling back to 1.44 makes everything work again.
The average trace rate is under 1k traces per second, and there are no memory issues.
Running in Kubernetes, with nginx-ingress for gRPC. Every port is exposed as per the docs.
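As an aside, routing OTLP/gRPC through nginx-ingress usually needs the backend-protocol annotation and TLS on the ingress. The host, secret, service name, and port below are placeholders, not details from the original post.

```yaml
# Hypothetical Ingress fragment for forwarding gRPC (OTLP) to the collector.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: jaeger-collector-grpc                          # placeholder
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
spec:
  ingressClassName: nginx
  tls:
    - hosts: ["collector.example.com"]                 # placeholder host; gRPC via nginx needs TLS
      secretName: collector-tls                        # placeholder TLS secret
  rules:
    - host: collector.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: jaeger-collector                 # placeholder service
                port:
                  number: 4317                         # OTLP/gRPC port
```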
I am wondering if anybody has run into the same behavior.