Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use low cardinality names for database spans #341

Closed
tedsuo opened this issue Apr 23, 2020 · 6 comments
Closed

Use low cardinality names for database spans #341

tedsuo opened this issue Apr 23, 2020 · 6 comments
Milestone

Comments

@tedsuo
Copy link

tedsuo commented Apr 23, 2020

Currently, in some cases database spans use the SQL statement as the span name. This creates a couple of issues:

  • The cardinality is too high to be useful as an index
  • Statements may need to be scrubbed for security/privacy reasons. While it is fine to scrub a secondary attribute like db.statement, mutating a primary index like span name
  • Users find it disturbing to see SQL statements in the span name.

To clarify, SQL templates without variables may work well as a span name. The spec currently recommends using SQL templates when available. But, when a template is not available, we should fall back to a low cardinality option.

Spec details: https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/semantic_conventions/database.md

Span name should be set to low cardinality value representing the statement 
executed on the database. It may be stored procedure name (without argument), 
sql statement without variable arguments, etc. When it's impossible to get any 
meaningful representation of the span name, it can be populated using the same 
value as db.instance.

The current spec is vague about what to use in place of a template. The suggestion I have heard requested the most is to use the SQL Command (SELECT, INSERT, UPDATE, etc). If it is easy enough to switch, I would suggest this option for now. Otherwise, falling back to the db.instance value would be sufficient.

@trask
Copy link
Member

trask commented Apr 23, 2020

Based on discussion in SIG meeting today, let's proceed with this:

  • For PreparedStatements, we continue using the query as the span name (no change).

  • For non-PreparedStatements, let's avoid parsing the SQL query for now, and just use db.instance as the span name. @johnbley is planning to add SQL parsing in the near future, at which time we can leverage that to produce a better span name.

@trask trask added good first issue Good for newcomers contribution welcome Request makes sense, maintainers probably won't have time, contribution would be welcome labels Apr 23, 2020
@trask trask added this to the v0.3.0 milestone Apr 28, 2020
@sleighzy
Copy link
Contributor

sleighzy commented May 5, 2020

I have some changes for this issue, but have a couple of quick questions.

@trask
Copy link
Member

trask commented May 5, 2020

Hi @sleighzy! Sorry for not keeping this issue updated. #366 was just merged yesterday implementing SQL scrubbing, which removes sensitive data and reduces cardinality. I wonder if we still need this?

@sleighzy
Copy link
Contributor

sleighzy commented May 5, 2020

I agree, the SQL scrubbing brings this in alignment with the prepared statement implementation. I'll pull down those latest changes and test them in our environments. Up to you folk but I see no further need for this ticket then. Thanks.

@trask trask removed the good first issue Good for newcomers label May 6, 2020
@trask
Copy link
Member

trask commented May 6, 2020

Great, I'm going to close it then 😄.

@trask trask closed this as completed May 6, 2020
@iNikem
Copy link
Contributor

iNikem commented May 7, 2020

Just for historical record. Otel spec (https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/semantic_conventions/database.md) says the following: "Span name should be set to low cardinality value representing the statement executed on the database". So we have to use sql statement for spans.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants