Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor exporter - step 1 #1078

Merged
merged 3 commits into from
Aug 14, 2020
Merged

Refactor exporter - step 1 #1078

merged 3 commits into from
Aug 14, 2020

Conversation

reyang
Copy link
Member

@reyang reyang commented Aug 13, 2020

Trying to fix several issues in the current exporter design. Related to #896 and #896 (comment).

The overall thinking:

  1. We don't want to have two queues - one in the core SDK and another in the exporter. The exporter shouldn't handle queuing at all, that is the responsibility of batch export processor.
  2. Processor/Exporter shouldn't shoot Task randomly to the worker pool and have nondeterministic behavior or unbounded memory usage.
  3. Simple export processor should be simple, it shouldn't have any in-memory queue.
  4. Jaeger/Zipkin/Otlp exporters should use batch export processor by default.
  5. ActivityListener is sync and dispose is also sync, it doesn't make much sense to have async interface in the exporter. The actual exporter implementation can always use async pattern as they would want. This would give memory allocation benefit to sync exporters.
  6. SimpleActivityProcessor should be renamed to SimpleExportActivityProcessor to avoid the confusion (and the same for batch exporter) - OpenTelemetry Python is doing the right job here.
  7. maxExportBatchSize should be enforced - the exporter should never receive a batch with more than maxExportBatchSize items specified in the BatchExportActivityProcessor.

Changes

  • Added a new class ActivityExporterSync, which I plan to replace the current ActivityExporter in the next couple PRs (and rename back to ActivityExporter after the refactor is done).
  • I've also changed the result code to Success and Failure, as the spec is saying there are only two values.
  • Similar like the provider/processor refactor, will send a list of small PRs, step by step, and add test cases after the refactor is done.

For significant contributions please make sure you have completed the following items:

  • Design discussion issue #
  • Changes in public API reviewed

@reyang reyang requested a review from a team August 13, 2020 20:30
@reyang
Copy link
Member Author

reyang commented Aug 13, 2020

@cijothomas @CodeBlanch

/// </summary>
/// <param name="batch">Batch of activities to export.</param>
/// <returns>Result of export.</returns>
public abstract ExportResult Export(IEnumerable<Activity> batch);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this IEnumerable should be replaced with something that doesn't have heap allocation, not trying to cover it in this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️ this! Just got this merged so we can prevent allocations doing foreachs on the Activity data (tags, links, events, etc.): dotnet/runtime#40544

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#1080 for tracking.

@codecov
Copy link

codecov bot commented Aug 13, 2020

Codecov Report

Merging #1078 into master will decrease coverage by 0.06%.
The diff coverage is 0.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1078      +/-   ##
==========================================
- Coverage   77.40%   77.33%   -0.07%     
==========================================
  Files         220      221       +1     
  Lines        6080     6085       +5     
==========================================
  Hits         4706     4706              
- Misses       1374     1379       +5     
Impacted Files Coverage Δ
src/OpenTelemetry/Trace/ActivityExporterSync.cs 0.00% <0.00%> (ø)

/// <summary>
/// Enumeration used to define the result of an export operation.
/// </summary>
public enum ExportResultSync
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we simply use a bool to indicate success/failure?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we want to keep close to the spec.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I wonder why spec didn't simply use a boolean? I guess its leftover from old behavior where ExportResult had more than 2 possible values!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it leaves the room so in the future folks can add more value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mm.. Is there any value in Export() returning anything at all? I mean, irrespective of the return value, the Processor simply moves on to next batch. Is it to let Processor do some logging that batch is being dropped? Or to allow future possibilities..

Copy link
Member Author

@reyang reyang Aug 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see BIG value, but I do see value - for example - the processor could log an internal error, or the processor can use this indication to report exporting statistics (e.g. metrics telling folks how many activities are exported successfully vs. dropped).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the processor could log an internal error, or the processor can use this indication to report exporting statistics

These are definitely interesting use cases, but seems to me to be more of a concern for the exporter itself since it has the underlying knowledge of either the error or statistics.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. lets stick with spec and keep enum with 2 values for now :)

Copy link
Member

@cijothomas cijothomas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Will wait till end of day to get additional feedback before merging.

@alanwest
Copy link
Member

The spec says "Export() must not block indefinitely, there must be a reasonable upper limit after which the call must time out with an error result (Failure).".

I'm curious what is considered reasonable?

@reyang
Copy link
Member Author

reyang commented Aug 13, 2020

The spec says "Export() must not block indefinitely, there must be a reasonable upper limit after which the call must time out with an error result (Failure).".

I'm curious what is considered reasonable?

I've seen 15 seconds, 1 minutes and some value in between :)
I think individual exporters should have their own clarification of what's reasonable for them.

@cijothomas cijothomas merged commit e52d76e into master Aug 14, 2020
@cijothomas cijothomas deleted the reyang/exporter branch August 14, 2020 03:55
@CodeBlanch
Copy link
Member

@reyang Plan looks solid to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants