Define span attributes for messaging systems #418

arminru · 2020-01-23T12:51:32Z

We already prepared a proposal regarding semantic conventions for messaging systems but @kbrockhoff was faster with his PR #395. I sent him our proposal and he's fine with it, so I integrated his suggestions into ours and we're moving forward with this PR now.

This proposal includes the `component` field, as the other semantic conventions currently do. There are ongoing discussions (#271 and #336) about removing that field, but no agreement has been reached yet.

Regarding an attribute describing the service account for connecting to the messaging system (if desired): Should we add a "user" attribute per type (db.user, messaging.user, rpc.user and the like) or should we define a general attribute for that? (brought up by @bogdandrutu on #395 (comment))

arminru · 2020-01-30T10:15:10Z

@open-telemetry/specs-approvers PTAL 🙂

arminru · 2020-02-06T08:05:49Z

@open-telemetry/specs-approvers PTAL 🙂

carlosalberto · 2020-02-12T18:03:03Z

@open-telemetry/specs-approvers Please review this PR. It's around semantic conventions, and it augments what we have (in this case, adding a section for messaging Spans specifically), so it's not breaking anything.

Would love to have this merged (or reviewed/tuned) soon ;)

SergeyKanzhelev

LGTM in general. It would be great to have some specific examples. I also requested some clarifications.

SergeyKanzhelev · 2020-02-18T17:26:00Z

The last open question I have is `library.name` vs. `messaging.system`.

lmolkova · 2020-02-19T17:04:11Z

specification/data-messaging.md

+| `messaging.temp_destination` | A boolean that is `true` if the message destination is temporary. | If temporary (assumed to be `false` if missing). |
+| `messaging.protocol` | The transport protocol such as `AMQP` or `MQTT`. | No |
+| `messaging.url` | Connection substring such as `tibjmsnaming://localhost:7222` or `https://queue.amazonaws.com/80398EXAMPLE/MyQueue`. | No |
+| `messaging.message_id` | A value used by the messaging system as an identifier for the message, represented as a string. | No |


What if messages are batched? I assume `message_id` is not populated in that case; is that correct?

arminru · 2020-02-20

Are you talking about producing, receiving, or processing the messages?
If they are produced one at a time, received in a batch, and then processed one at a time, each producing span and each processing span should have the respective message ID. The receiver span won't have any. Please see the batch example at the end of the file.
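The batch case described in this reply can be sketched with plain dictionaries standing in for span attributes. This is only an illustration, not OpenTelemetry API code: the helper names (`send_attributes`, `receive_attributes`, `process_attributes`), the queue name `orders`, and the message IDs are made up. The attribute keys are the ones used in the proposal.

```python
def send_attributes(destination, message_id):
    # Producer span: one message per span, so its ID is known.
    # messaging.operation is deliberately not set for "send".
    return {
        "messaging.destination": destination,
        "messaging.message_id": message_id,
    }


def receive_attributes(destination):
    # Batch receive span: it covers several messages, so it carries no message_id.
    return {
        "messaging.destination": destination,
        "messaging.operation": "receive",
    }


def process_attributes(destination, message_id):
    # Process span: handles exactly one message, so the ID is set again.
    return {
        "messaging.destination": destination,
        "messaging.operation": "process",
        "messaging.message_id": message_id,
    }


# Hypothetical batch: produced one at a time, received together, processed one at a time.
message_ids = ["a1", "a2", "a3"]
send_spans = [send_attributes("orders", m) for m in message_ids]
receive_span = receive_attributes("orders")
process_spans = [process_attributes("orders", m) for m in message_ids]
```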

lmolkova · 2020-02-19T17:14:43Z

specification/data-messaging.md

+| `messaging.operation` | A string identifying which part and kind of message consumption this span describes: either `receive` or `process`. (If the operation is `send`, this attribute must not be set: the operation can be inferred from the span kind in that case.) | No |
+
+The _receive_ span is be used to track the time used for receiving the message(s), whereas the _process_ span(s) track the time for processing the message(s).
+Note that one or multiple Spans with `messaging.operation` = `process` may often be the children of a Span with `messaging.operation` = `receive`.


Can you provide the reasons/preconditions behind making the receive span a parent of the process spans? What about the receive duration then? It seems to end before its children (which is likely OK).

Also, in the context of sampling, how is receive sampled, considering that it only knows the linked contexts after the call completes?

arminru · 2020-02-20

This is to track the time spent receiving the message(s) separately from the time spent processing them. The receiver span might or might not end before the processor spans are started; both would be correct.

I cannot quite follow the last question. If receiving is tracked separately, the context will usually not have been propagated at the time the receiver span is started. Also, in batch receive scenarios, there won't be one single context to act as a parent for the receive span. Therefore, the receive span will usually be a root span. The child spans tracking message processing will link to the message producer spans as shown in the batch example at the bottom of the file.
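The span relationships described in this reply can be modeled with a few dataclasses. This is a sketch under stated assumptions rather than real tracing code: `SpanContext`, `Span`, and `start_span` are simplified stand-ins, and the integer IDs are generated by a counter purely for illustration.

```python
import itertools
from dataclasses import dataclass, field
from typing import List, Optional

_next_id = itertools.count(1)


@dataclass(frozen=True)
class SpanContext:
    trace_id: int
    span_id: int


@dataclass
class Span:
    name: str
    context: SpanContext
    parent: Optional[SpanContext] = None
    links: List[SpanContext] = field(default_factory=list)


def start_span(name, parent=None, links=()):
    # A span without a parent starts a new trace, i.e. it is a root span.
    trace_id = parent.trace_id if parent is not None else next(_next_id)
    return Span(name, SpanContext(trace_id, next(_next_id)), parent, list(links))


# Three producers, each in its own trace; the context travels with the message.
producers = [start_span(f"send {i}") for i in range(3)]
batch = [{"body": f"msg {i}", "context": p.context} for i, p in enumerate(producers)]

# The receive span starts before any message (and its context) is available,
# so it has no parent.
receive = start_span("receive")

# Each process span is a child of receive and links to the span that produced its message.
process_spans = [
    start_span("process", parent=receive.context, links=[m["context"]])
    for m in batch
]
```

In this model the batch of three messages yields one root receive span with three children, each carrying exactly one link back to a different producer trace.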

lmolkova · 2020-02-21

It's likely that a consumer writes code like:

    messages = queue.receive(5);
    foreach (message in messages) {
        processSpan = trace.StartActiveSpan("my-queue");
        // do stuff
        processSpan.end();
    }

Logically, by the time processing starts, receive is completed and its span context is not current/may be lost. Do I have to pass it to the process span?

I'm on the library instrumentation side (Azure EventHubs and other messaging offerings in Azure), i.e. if I instrument receive, I would have no reasonable means to pass its context to users, so they cannot possibly start the 'process' span as a child of 'receive'.

I want to understand the reasons behind making process a child of receive, and perhaps remove the mentions of it if this is optional and there is no strong need for it.

Regarding sampling:
OTel samplers check links, and if any linked context is sampled in, the sampler will also make a 'sampled in' decision for the span.
But, as you mentioned, the links are not known before the 'receive' span starts. This basically means there is a low chance that receive is sampled consistently with the messages in the batch, which makes it useless.
I decided against instrumenting receive at all for the near future, and this is one of the reasons for it.

I've heard some instrumentations solve it through delaying span creation until messages are received to make sampling consistent. Was it considered?
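The consistency problem and the delayed-creation workaround mentioned in this comment can be illustrated with a toy sampler. This is a sketch under stated assumptions, not the OpenTelemetry sampler interface: `LinkedContext` and `toy_should_sample` are hypothetical names, and the rule "sample in if any linked context is sampled" simply follows the behavior described in the comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkedContext:
    span_id: int
    sampled: bool


def toy_should_sample(links, default=False):
    # Sample in if any linked context is sampled in; otherwise fall back to a default.
    return any(link.sampled for link in links) or default


# Contexts extracted from a received batch; only the second producer was sampled in.
batch_contexts = [LinkedContext(1, False), LinkedContext(2, True)]

# Span started before the receive call returns: no links are known yet.
eager_decision = toy_should_sample(links=[])

# Span created only after the messages arrived: the links are available to the sampler.
delayed_decision = toy_should_sample(links=batch_contexts)
```

With the eager variant the decision cannot depend on the batch at all, which is the inconsistency raised above; with the delayed variant the same sampler sees the sampled producer and samples the receive span in.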

arminru · 2020-02-24

> Logically, by the time processing starts, receive is completed and its span context is not current/may be lost. Do I have to pass it to the process span?

This depends on the OTel implementation. In Java, for example, one could create a scope around the foreach in which the receive span is active. Ending the receive span does not "deactivate" it, so it would still be picked up as parent by the process spans. Passing the receive span or its spancontext explicitly would also be an option, yes.
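A rough model of the scoping behavior described here, written in Python rather than against the real Java API. Everything in it (`Span`, `activate`, `start_span`, the module-level stack) is a hypothetical stand-in. It only shows that ending a span and deactivating it are separate steps, so process spans started inside the scope still pick up the already ended receive span as their parent.

```python
from contextlib import contextmanager

_active_spans = []  # stack of currently active spans (illustrative only)


class Span:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.ended = False

    def end(self):
        # Ending records completion; it does not touch the active-span stack.
        self.ended = True


@contextmanager
def activate(span):
    # Make the span the implicit parent for everything started inside the block.
    _active_spans.append(span)
    try:
        yield span
    finally:
        _active_spans.pop()


def start_span(name):
    parent = _active_spans[-1] if _active_spans else None
    return Span(name, parent)


receive = start_span("receive")
process_spans = []
with activate(receive):
    messages = ["m1", "m2"]  # stands in for queue.receive(...)
    receive.end()  # receive is finished, but still active within this scope
    for message in messages:
        process = start_span(f"process {message}")
        process_spans.append(process)
        process.end()

after_scope = start_span("unrelated")  # started outside the scope: no implicit parent
```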

> I'm on the library instrumentation side (Azure EventHubs and other messaging offerings in Azure), i.e. if I instrument receive, I would have no reasonable means to pass its context to users, so they cannot possibly start the 'process' span as a child of 'receive'.

If you're on the library instrumentation side and only instrument the receive call, which returns a number of messages, without having any further hooks to instrument, then you will not be able to produce the process spans, yes. This is a limitation of the instrumentation approach, and it should be fine to have only receive spans without process spans in this case, since this is all you can trace this way. How else would you trace the time spent processing messages given these limitations?

> Regarding sampling
I'd expect the sampler used to sample the receive calls if the information is deemed interesting/valuable based on the span attributes being set. In this case the sampler can deduce from the required span attributes that this is a message-receive span and decide accordingly.

If the receive call is sampled, the process spans would be children of the receive span and bear links to the send/create spans from the spancontext being propagated in the message metadata, if available.
If the receive call is not sampled, the process spans will either be the child of some other span and add the aforementioned links or - if there is no implicit parent - use the spancontext extracted from the message metadata as a parent.
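The two cases above amount to a small decision rule for a single message. A sketch of that rule, where `process_span_relations`, `Relations`, and the string stand-ins for span contexts are invented for illustration:

```python
from typing import NamedTuple, Optional, Tuple


class Relations(NamedTuple):
    parent: Optional[str]
    links: Tuple[str, ...]


def process_span_relations(receive_ctx, receive_sampled, implicit_parent, message_ctx):
    # The context extracted from the message metadata becomes a link, if available.
    links = (message_ctx,) if message_ctx is not None else ()
    if receive_sampled:
        # Receive was sampled: process is its child and links to the producer.
        return Relations(receive_ctx, links)
    if implicit_parent is not None:
        # Receive was not sampled, but some other span is the implicit parent.
        return Relations(implicit_parent, links)
    # No implicit parent: the extracted context is used as the parent instead of a link.
    return Relations(message_ctx, ())


sampled_case = process_span_relations("receive", True, None, "producer")
other_parent_case = process_span_relations("receive", False, "handler", "producer")
no_parent_case = process_span_relations("receive", False, None, "producer")
```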

> I've heard some instrumentations solve it through delaying span creation until messages are received to make sampling consistent. Was it considered?

The process spans are supposed to be created after the message was received, so this is already solved by having two different spans for receiving and processing.

arminru · 2020-04-07T07:07:12Z

@open-telemetry/specs-approvers who haven't had the chance to review this PR yet, please consider taking a look. Thank you!

arminru requested review from AloisReitbauer, bogdandrutu, c24t, carlosalberto, iredelmeier, jmacd, reyang, SergeyKanzhelev, songy23, tedsuo, tigrannajaryan and yurishkuro as code owners January 23, 2020 12:51

arminru commented Jan 23, 2020

Oberon00 reviewed Jan 23, 2020

jmacd approved these changes Jan 27, 2020

arminru force-pushed the semconv-messaging branch from 2b9f71c to c699fdb Compare January 29, 2020 08:21

vmihailenco mentioned this pull request Jan 31, 2020

Add semantic conventions for logs and errors #432

Closed

carlosalberto approved these changes Feb 11, 2020

arminru mentioned this pull request Feb 13, 2020

Common event attribute names #397

Closed

arminru force-pushed the semconv-messaging branch from b7e26b2 to ba3ebe7 Compare February 17, 2020 13:00

SergeyKanzhelev suggested changes Feb 17, 2020

lmolkova reviewed Feb 19, 2020

lmolkova reviewed Feb 19, 2020

lmolkova reviewed Feb 19, 2020

arminru and others added 25 commits April 7, 2020 09:00

Remove peek operation

b97bf67

Formatting

9491f14

Merge suggestions from open-telemetry#395

80727c4

Fix samples for messaging.system

d976443

Remove component attribute as per open-telemetry#271

9a7254a

Apply suggestions from code review

d30a56d

Co-Authored-By: Sergey Kanzhelev <S.Kanzhelev@live.com>

Define message_id as string

de0afba

Typo

a8b6e1b

Add guidance for span names for temp destinations

e8931de

Add section on RabbitMQ

5da7c84

Add note that messaging.destination might be equal to span name

f2ea69e

Add note on receive/process timing

cc4fb06

Remove colon 👨‍⚕️

daf1874

Add examples

a9b8209

Make markdownlint happy

bb968b7

Clarify processing span kind

4613bd1

Fix table in example

e00d1e8

Make destination_kind optional

d73aa5e

Add batch processor example

0be1d95

Apply suggestions from code review

6355871

Co-Authored-By: Christian Neumüller <christian.neumueller@dynatrace.com>

Rename spans in examples

713b78c

Declare messaging.protocol to be without version

feeb2c5

Add messaging.protocol_version

c639012

Recommend setting net.transport

92748ab

Move file after directories were restructured upstream

1cf72e5

arminru force-pushed the semconv-messaging branch from 18e240a to 1cf72e5 Compare April 7, 2020 07:03

carlosalberto merged commit 6b29667 into open-telemetry:master Apr 7, 2020

arminru deleted the semconv-messaging branch April 7, 2020 15:09

arminru added this to the v0.4 milestone May 5, 2020

arminru commented Jan 23, 2020

arminru commented Jan 30, 2020

arminru commented Feb 6, 2020

carlosalberto commented Feb 12, 2020

SergeyKanzhelev left a comment

SergeyKanzhelev commented Feb 18, 2020

lmolkova Feb 19, 2020

arminru Feb 20, 2020

lmolkova Feb 19, 2020

arminru Feb 20, 2020

lmolkova Feb 21, 2020

arminru Feb 24, 2020

arminru Feb 24, 2020

arminru commented Apr 7, 2020
