
Audit data corruption on NFS volumes #1351

Closed
russjones opened this issue Sep 30, 2017 · 7 comments

@russjones (Contributor) commented Sep 30, 2017

Problem

When running multiple Teleport Auth Servers in an HA configuration, the recommended approach for the audit log is to mount a shared NFS volume that all Auth Servers write to. This, however, will not work: multiple clients opening a file with the O_APPEND flag leads to data corruption, as outlined in sections A8 and A9 of the NFS documentation.

  • TCP guarantees ordered delivery in the context of a single server; however, out-of-order writes are possible when there are multiple auth servers.
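
For illustration, a minimal Go sketch of the append pattern that breaks down here (the function name is hypothetical): the kernel serializes O_APPEND writes on a local filesystem, but an NFS client computes the write offset itself, so two servers appending concurrently can overwrite each other's data.

package example

import "os"

// appendChunk shows the unsafe-on-NFS pattern: O_APPEND is atomic on a
// local filesystem, but each NFS client computes the append offset on its
// own, so concurrent appends from two auth servers can clobber each other.
func appendChunk(path string, chunk []byte) error {
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_APPEND, 0640)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = f.Write(chunk)
	return err
}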

Proposed Solution

To clarify the design algorithm a little bit:

The only way to solve the problem with NFS, which does not guarantee atomicity of concurrent appends, is to make sure there is only one writer per opened file.

If several auth servers write concurrently in the context of the same session, they will write to different files.

The file format will be exactly the same as the existing format.

1. When receiving session chunks, the auth server opens a file whose name starts with the counter of the first received chunk.
2. The auth server continues writing to that file until one of the following happens:
2.a. the session ends, and the auth server closes the file;
2.b. the auth server receives a chunk whose counter is not successive to the previously received one, e.g. the previously written chunk has counter 8 while the newly received chunk has counter 10, which means another auth server wrote chunk 9. The auth server then resets its state to step 1.
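
A minimal Go sketch of this writer loop, assuming a hypothetical chunkWriter type and the counter-prefixed file naming used in the example below; this is an illustration of the proposed scheme, not Teleport's actual implementation:

package example

import (
	"fmt"
	"os"
	"path/filepath"
)

type chunkWriter struct {
	dir       string
	sessionID string
	file      *os.File // currently open chunk file, nil if none
	nextChunk int      // counter the next chunk must carry to stay in this file
}

// writeChunk implements steps 1 and 2: open a new file named after the first
// chunk's counter, keep appending while counters are successive, and reset to
// step 1 when a gap shows that another auth server wrote the missing chunk.
func (w *chunkWriter) writeChunk(counter int, data []byte) error {
	if w.file == nil || counter != w.nextChunk {
		if w.file != nil {
			w.file.Close() // step 2.b: gap detected
		}
		name := filepath.Join(w.dir,
			fmt.Sprintf("%d_%s.session.bytes", counter, w.sessionID))
		// O_EXCL enforces the one-writer-per-file invariant.
		f, err := os.OpenFile(name, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0640)
		if err != nil {
			return err
		}
		w.file = f
	}
	if _, err := w.file.Write(data); err != nil {
		return err
	}
	w.nextChunk = counter + 1
	return nil
}

// close implements step 2.a: the session has ended.
func (w *chunkWriter) close() error {
	if w.file == nil {
		return nil
	}
	err := w.file.Close()
	w.file = nil
	return err
}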

For example, auth server 1 will write the following blocks:

# will contain blocks from 0 to 92
0_b0ca00f5-a4a9-11e7-9b5d-0a6859bf1618.session.bytes
# will contain blocks from 103 to 500
103_b0ca00f5-a4a9-11e7-9b5d-0a6859bf1618.session.bytes

auth server 2 will write the following blocks:

# will contain blocks from 93 to 102
93_b0ca00f5-a4a9-11e7-9b5d-0a6859bf1618.session.bytes

Then for playback, we simply gather and join all chunks.
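
A sketch of that playback step, under the same counter-prefixed naming assumption; joinSession is an illustrative helper, not an existing API:

package example

import (
	"os"
	"path/filepath"
	"sort"
	"strconv"
	"strings"
)

// joinSession finds every chunk file written for a session, orders the files
// by the starting counter encoded in their names, and concatenates them.
func joinSession(dir, sessionID string) ([]byte, error) {
	files, err := filepath.Glob(filepath.Join(dir, "*_"+sessionID+".session.bytes"))
	if err != nil {
		return nil, err
	}
	// Sort numerically by the counter prefix, not lexicographically,
	// so that 93_... lands between 0_... and 103_... .
	sort.Slice(files, func(i, j int) bool {
		return counterOf(files[i]) < counterOf(files[j])
	})
	var joined []byte
	for _, f := range files {
		data, err := os.ReadFile(f)
		if err != nil {
			return nil, err
		}
		joined = append(joined, data...)
	}
	return joined, nil
}

// counterOf extracts the leading chunk counter from a file name.
func counterOf(path string) int {
	n, _ := strconv.Atoi(strings.SplitN(filepath.Base(path), "_", 2)[0])
	return n
}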

We would need to apply a similar scheme to the session metadata and to the audit log itself, because they also reside on an NFS volume and are subject to the same issues.

Audit events

The Web UI uses the audit log to figure out which sessions are complete and can be played back, and which are active and can be joined, based off the values of session.start and session.end.

Direct integrations with external structured logging facilities for querying and logging would solve this problem, e.g. using the ELK/Splunk APIs to query those backends would reduce the amount of work.

@klizhentas (Contributor)

Correction - as discussed, we don't need to implement this scheme for the audit log, since we don't have to put audit log entries on the NFS volume and can simply use local storage with log forwarders.

@klizhentas klizhentas changed the title Audit data corruption Audit data corruption on NFS volumes Sep 30, 2017
@mechastorm

So just to clarify: how do we centralize the session logs for those of us who may not have a preferred log forwarder yet?

The other concern is how the session logs are accessed from the Web UI when there are multiple Auth Servers.

@klizhentas (Contributor)

made several edits

@russjones (Contributor, Author) commented Sep 30, 2017

@klizhentas The Web UI uses the audit log to figure out which sessions are complete and can be played back, and which are active and can be joined, based off the values of session.start and session.end. So we need a way for each Auth Server to see all events that have occurred in the system, at least those of type session.start and session.end.

An idea: we store the audit log in the backend and provide a log forwarder that forwards to a file. This allows us to build more log forwarders in the future and maintain existing functionality with the file-based events.
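
A minimal sketch of that idea, assuming a hypothetical Forwarder interface and AuditEvent type, and assuming the existing file log is newline-delimited JSON:

package example

import (
	"encoding/json"
	"io"
)

// AuditEvent is a single structured audit entry, e.g. session.start.
type AuditEvent map[string]interface{}

// Forwarder ships audit events read from the backend to an external sink.
// A file-based forwarder preserves today's behavior; ELK/Splunk forwarders
// could be added later behind the same interface.
type Forwarder interface {
	Forward(event AuditEvent) error
}

// fileForwarder writes one JSON-encoded event per line.
type fileForwarder struct {
	w io.Writer
}

func (f *fileForwarder) Forward(event AuditEvent) error {
	line, err := json.Marshal(event)
	if err != nil {
		return err
	}
	_, err = f.w.Write(append(line, '\n'))
	return err
}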

@klizhentas (Contributor)

@russjones We can explore bringing audit logs back to the backends, or we can add direct integrations with external structured logging facilities for querying as well; e.g. it would be no problem to log directly to ELK/Splunk and simply query those backends, reducing the amount of work.

@pmorton commented Oct 25, 2017

Echoing @mechastorm, if using the recommended shared NFS volume causes corruption, how does one implement high availability? Is it possible?

@klizhentas klizhentas added this to the 2.5.0 milestone Jan 4, 2018
@klizhentas (Contributor)

Fixed in 2.5.0 by #1549.

@klizhentas klizhentas mentioned this issue Feb 19, 2018
hatched pushed a commit to hatched/teleport-merge that referenced this issue Nov 30, 2022
hatched pushed a commit that referenced this issue Dec 20, 2022