Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix wrong timestamp for DBM beta feature #9024

Merged
merged 2 commits into from
Mar 25, 2021

Conversation

djova
Copy link
Contributor

@djova djova commented Mar 24, 2021

What does this PR do?

The timestamp derived from the last state_change was wrong because it didn't take into account the timezone, causing some events to be dropped during intake. Now we do an explicit conversion to a unix timestamp from the UTC Unix epoch date.

Motivation

Fix wrong timestamps.

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • PR title must be written as a CHANGELOG entry (see why)
  • Files changes must correspond to the primary purpose of the PR as described in the title (small unrelated changes should have their own PR)
  • PR must have changelog/ and integration/ labels attached

@djova djova requested a review from justiniso March 24, 2021 15:07
@djova djova requested a review from a team as a code owner March 24, 2021 15:07
@ghost ghost added the integration/postgres label Mar 24, 2021
@djova djova force-pushed the djova/postgres-statement-samples-fix-timestamp-bug branch 2 times, most recently from edffc9c to 9ea8f7e Compare March 24, 2021 15:13
@@ -22,6 +23,8 @@

VALID_EXPLAIN_STATEMENTS = frozenset({'select', 'table', 'delete', 'insert', 'replace', 'update'})

unix_epoch = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can always assume UTC here. Won't this depend on the configured TZ of the database?

@djova djova force-pushed the djova/postgres-statement-samples-fix-timestamp-bug branch from 9ea8f7e to a8badb7 Compare March 24, 2021 15:26
The timestamp derived from the last `state_change` was wrong because it didn't take into account the timezone, causing some events to be dropped during intake. Update to use internal helpers that derive the unix timestamp correctly.
@djova djova force-pushed the djova/postgres-statement-samples-fix-timestamp-bug branch from a8badb7 to 2c9d9a4 Compare March 24, 2021 15:55
# If the transaction is idle then we have a more specific "end time" than the current time at
# which we're collecting this event. Can only be done if the datetime has a timezone.
if row['state_change'].tzinfo:
event['timestamp'] = get_timestamp(row['state_change']) * 1000
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want to make sure I have all of the cases down:

  1. row['state_change'] is timezone-aware. Everything will work as expected.
  2. row['state_change'] is not timezone-aware, the agent is running on the database host.
  3. row['state_change'] is not timezone-aware, the agent is running off-host.

As long as case #1 is guaranteed, we won't have any problems. #2 should also be fine (unless the database is using a different timezone than the host it's running on).

#3 would mean we default to using the agent timestamp. But because all events are aligned on UTC, we will always be close to the actual DB timestamp? Is this correct?

Copy link
Contributor Author

@djova djova Mar 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the docs this is a timestamp with timezone so the timezone info should always be there. It would only be missing if there's a bug in psycopg or if the database host is misconfigured somehow.

#2 should also be fine (unless the database is using a different timezone than the host it's running on).

Yes in that case we use now() as the timestamp, which is still fine because if we're doing one collection per second we should be off by at most one second.

#3 would mean we default to using the agent timestamp. But because all events are aligned on UTC, we will always be close to the actual DB timestamp? Is this correct?

Yeah, same reasoning as #2, falling back to now() should still be within one second of the actual event time, unless the agent was down for a while then woke up and saw some stale connection that had finished its last query a long time ago.

@ofek ofek changed the title postgres integration: fix wrong timestamp Fix wrong timestamp for DBM beta feature Mar 24, 2021
@ofek ofek merged commit 2244dde into master Mar 25, 2021
@ofek ofek deleted the djova/postgres-statement-samples-fix-timestamp-bug branch March 25, 2021 18:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants