Kibana Security to use Server Side Sessions #17870

Closed
elasticmachine opened this issue Feb 10, 2017 · 17 comments · Fixed by #68117
Labels
blocker Feature:Security/Authentication Platform Security - Authentication release_note:enhancement Team:Security Team focused on: Auth, Users, Roles, Spaces, Audit Logging, and more!

Comments

@elasticmachine
Contributor

elasticmachine commented Feb 10, 2017

Summary

Today, the Security Plugin stores user authentication information (username/password for basic or access/refresh tokens for token, saml, oidc, etc.) in the cookie that gets stored in the user's browser. This cookie is encrypted with two-way encryption using a salt based on the xpack.security.encryptionKey in the kibana.yml.

This approach has a number of benefits (e.g. easy to scale) and drawbacks (e.g. we're hitting the browser cookie size limit more and more). But more importantly, there are a number of use cases we cannot really handle with the current approach, like #18162 or #53478.

This issue proposes using a more secure approach - maintaining server-side sessions in the Kibana backend. Instead of putting the encrypted authentication information in the cookie, the cookie would contain an encrypted session ID, which the server side would generate on first authentication and store, alongside all additional information, in an Elasticsearch index. After session expiration (see below) or on explicit logout, it would be removed.

Where to store session information?

Since we started to discourage Kibana administrators from giving users access to the .kibana index, we can safely (to be assessed) use this index to store session information. Alternatively, we can use a dedicated index similar to the .security-tokens index used by Elasticsearch.

We'd likely want to store the session information in an encrypted form. Either leveraging xpack.security.encryptionKey for all sessions or separate unique keys for every session that'd be stored in the cookie, we'll need to find an acceptable security-performance trade-off. If we use single encryption key we may want to throw some AAD into the mix.

When to clean up session information?

We can implement a number of optimizations to avoid overburdening the Kibana server with session management. For example, when a user explicitly logs out, or Kibana decides to end a session on its own, we'd remove not only that specific session but also all other sessions that have expired (based on the xpack.security.session.idleTimeout and xpack.security.session.lifespan settings).

In addition to that we can leverage TaskManager to schedule a periodic 24h task to do the cleanup as well.

Also we may want to just mark sessions as deleted and physically delete them later for audit purposes.
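
A minimal sketch of the expiration check such a cleanup pass could run. The `StoredSession` and `SessionTimeouts` shapes, the function names, and the use of millisecond timestamps are assumptions made for illustration; only the `idleTimeout` and `lifespan` setting names come from the text above.

```typescript
// Hypothetical shapes: only the idleTimeout/lifespan names mirror the settings above.
interface StoredSession {
  sid: string;
  createdAt: number; // epoch milliseconds
  lastAccessedAt: number; // epoch milliseconds
}

interface SessionTimeouts {
  idleTimeout: number | null; // milliseconds, null disables the check
  lifespan: number | null; // milliseconds, null disables the check
}

// A session is expired when it has been idle too long or has outlived its lifespan.
function isExpired(session: StoredSession, timeouts: SessionTimeouts, now: number): boolean {
  const idleExpired =
    timeouts.idleTimeout !== null && now - session.lastAccessedAt > timeouts.idleTimeout;
  const lifespanExpired =
    timeouts.lifespan !== null && now - session.createdAt > timeouts.lifespan;
  return idleExpired || lifespanExpired;
}

// Collects the IDs a logout handler or periodic task would remove in one pass.
function expiredSessionIds(
  sessions: StoredSession[],
  timeouts: SessionTimeouts,
  now: number
): string[] {
  return sessions.filter((s) => isExpired(s, timeouts, now)).map((s) => s.sid);
}
```

The same predicate could back both the opportunistic cleanup on logout and the scheduled task, so the two paths cannot disagree about what "expired" means.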

What fields session information should include?

These are the first that come to mind:

  • Version (e.g. to properly handle migration scenarios)
  • Creation/removal/last-access timestamps (e.g. for audit purposes)
  • User and Elasticsearch realm names (e.g. to limit concurrent sessions)
  • Provider type and name (for internal Kibana purposes)
  • The flag that tells whether it's an intermediate (e.g. used during SAML handshake) or final session
  • [Optional] URL that initiated log in (may go to the miscellaneous data field for the realms that actually use that data instead)
  • Miscellaneous data field (e.g. username/password for basic, access/refresh token pair for saml or certificate fingerprint for pki)
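
The field list above could be sketched as a type. Every name below (`SessionDocument`, `createSessionDocument`, and the individual property names) is hypothetical and chosen only to mirror the bullets; the thread does not fix a schema.

```typescript
// Hypothetical document shape mirroring the proposed fields; names are illustrative.
interface SessionDocument {
  version: number; // for migrations
  createdAt: number; // epoch milliseconds
  lastAccessedAt: number; // epoch milliseconds
  removedAt: number | null; // set when a session is marked as deleted
  username: string;
  realm: string; // Elasticsearch realm name
  provider: { type: string; name: string };
  isIntermediate: boolean; // true during e.g. a SAML handshake
  initiatingUrl: string | null; // optional URL that initiated login
  state: string; // miscellaneous data, assumed to be stored encrypted
}

interface NewSessionParams {
  username: string;
  realm: string;
  provider: { type: string; name: string };
  state: string;
  isIntermediate?: boolean;
  initiatingUrl?: string;
}

// Fills in the bookkeeping fields for a freshly created session.
function createSessionDocument(params: NewSessionParams, now: number): SessionDocument {
  return {
    version: 1,
    createdAt: now,
    lastAccessedAt: now,
    removedAt: null,
    username: params.username,
    realm: params.realm,
    provider: params.provider,
    isIntermediate: params.isIntermediate === true,
    initiatingUrl: params.initiatingUrl === undefined ? null : params.initiatingUrl,
    state: params.state,
  };
}
```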

Concerns

There are a number of concerns regarding server-side sessions compared to the current approach:

  • Performance:
    • Additional request to retrieve session information for every request that requires authentication
    • Decryption of the session information in addition to cookie content decryption for every request that requires authentication
    • Increase of .kibana index size (only if we decide to use existing .kibana index)
  • Security (only if we decide to use existing .kibana index):
    • Potential leakage of the sensitive information during .kibana index backup
    • Users with direct access to .kibana index can drop user sessions (not swap or alter though)
  • Complexity:
    • We need to specifically handle the case when multiple requests try to initiate login at the same time (e.g. when pki or kerberos is used we can receive multiple "login" requests at the same time, and we'll figure out that they belong to the same user only after we talk to Elasticsearch, which would issue different access tokens for both requests)
    • There is a chance we may need help from the Elasticsearch team (see previous point)
    • Additional index to take care of (only if we decide to use separate index to store session information)

Original comment by @skearns64:

Today, the Security component of X-Pack for Kibana encodes both the username and the password in the cookie that gets stored in the user's browser. This cookie is encrypted with two-way encryption using a salt based on the `shield.encryptionKey` in the kibana.yml.

This issue proposes using a more secure approach - maintaining server-side sessions in the Kibana backend. Instead of putting the encrypted username/password in the cookie, the cookie would contain an encrypted session ID, which the server side would generate on first authentication, and keep in memory, updating the expiration date with each valid request while the session was active. After session expiration, it would be removed.

I see two potential phases to this.
Phase 1 would be simply storing the session state in the Kibana server memory. The drawback to this approach is that, in scenarios where multiple Kibana instances are being load-balanced, the load balancer would have to ensure that the same users were always directed to the same Kibana instance (so their requests go to the Kibana server holding their session).

Phase 2 would be to store the session state in Elasticsearch, potentially in the .kibana index. I think doing this depends on LINK REDACTED, which would ensure that the .kibana index (or whatever index we choose to use here) was not accessible to end-users.

@elasticmachine
Contributor Author

Original comment by @jaymode:

Just to note for phase 2, I see some overlap with the idea of supporting sessions in security on the elasticsearch side (LINK REDACTED). The writeup there is probably a bit outdated in terms of how I think we'd do it now with the move to a full on HttpClient coming in the future. With that move I'd tend to think that we'd implement a lightweight version of an HttpSession (similar to what exists in the Java EE world) that we would just use for authentication. If we did add this, it could be possible that Kibana could delegate to this and just need to deal with error handling for session expiry and such.

@elasticmachine
Contributor Author

Original comment by @lukasolson:

@kjbekkelund and @alexbrasetvik Would like to hear your opinions on how this could potentially affect/integrate with Cloud.

@elasticmachine
Contributor Author

Original comment by @alexbrasetvik:

Hi. Currently on vacation. Traveling home on Friday. Will comment properly then. We should not make any assumptions on which backend server gets which request, and we cannot turn on sticky sessions on the Cloud load balancers.

@elasticmachine
Contributor Author

Original comment by @kjbekkelund:

It is definitely possible to implement secure alternatives that don't rely on state on the backend, e.g. LINK REDACTED. These tokens can either be saved in Web Storage (localstorage, sessionstorage) and sent in an Authorization header, or they can be handled by cookies. LINK REDACTED detailing some of this. Of course, as soon as a cookie is used we need to think about CSRF, but if I'm not mistaken that is already in place for Kibana?

@elasticmachine
Contributor Author

Original comment by @skearns64:

I'm +1 on further investigation of how we do this - server side sessions may not be the best way to solve the problem, but they do have some advantages. JWT is a good standard, but if I understand it correctly, our current model doesn't align with it; we're storing encrypted usernames and passwords in the cookie (and would need them in JWT), because we need to pass them both to ES for auth purposes.

We could consider switching the Kibana model further, so that all requests are made by the Kibana Server user, and impersonates the end-user when appropriate. That way, we wouldn't need Kibana to ever store or pass passwords anywhere.

@elasticmachine elasticmachine added Team:Security Team focused on: Auth, Users, Roles, Spaces, Audit Logging, and more! Feature:Security/Authentication Platform Security - Authentication discuss release_note:enhancement labels Apr 24, 2018
@kobelb kobelb removed the discuss label Sep 4, 2019
@arisonl arisonl assigned arisonl and unassigned arisonl Dec 3, 2019
@kobelb kobelb assigned jportner and azasypkin and unassigned jportner Feb 20, 2020
@azasypkin
Member

azasypkin commented Apr 7, 2020

UPDATE: updated #17870 (comment) with the current state of things and preliminary plan

@legrego
Member

legrego commented Apr 8, 2020

We could consider storing sessions in an alternate index, similar to how Elasticsearch stores its tokens in the .security-tokens index instead of the .security index. This could help address some of the complications you mentioned above:

Increase of .kibana index size

Potential leakage of the sensitive information during .kibana index backup

Users with direct access to .kibana index can drop user sessions (not swap or alter though)


Instead of putting the encrypted authentication information in the cookie, the cookie would contain an encrypted session ID, which the server side would generate on first authentication, and keep in memory, updating the expiration date with each valid request while the session was active.

When you say "and keep in memory", are you thinking about a session cache on the Kibana server?

@azasypkin
Member

We could consider storing sessions in an alternate index, similar to how Elasticsearch stores its tokens in the .security-tokens index instead of the .security index. This could help address some of the complications you mentioned above:

Yep, that's an option for sure! The more I think about it, the more reasonable it feels.

When you say "and keep in memory", are you thinking about a session cache on the Kibana server?

Ugh, sorry, it's a left over from the original comment - I don't think we need to store anything in the server memory unless we really need to for the performance reasons - I'll update comment.

@jportner
Contributor

jportner commented Apr 16, 2020

Here are my initial thoughts:


the cookie would contain an encrypted session ID

I’m not sure we get any concrete benefit from encrypting the session ID. If anything, this may just pose additional risk, as it would allow an attacker to force Kibana to decrypt data (could allow algorithmic complexity attack to result in DoS). Since we don’t need to allow users to set their own session IDs, we should generate them on the server side. The session ID should just be an opaque identifier, nothing more.

We'd likely want to store the session information in an encrypted form.

Just to clarify: I think we probably only want to encrypt sensitive data (such as credentials or PII), right? That way we can query on the other data.

If we use single encryption key we may want to throw some AAD into the mix.

I think this is the right approach. We are talking about storing user credentials with reversible encryption (in the case of basic authN). If we don’t handle this properly, we will weaken all of the security assertions for these credentials in Elasticsearch as well. I propose that, along with each SID, we generate another random value to be used as AAD. We should set this AAD in the user’s session cookie, and never store this AAD on the server side (or log it), otherwise a malicious administrator could potentially decrypt all users’ session data. This reduces our risk: a malicious administrator could only decrypt a user’s session data if he could intercept requests from that user.
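
A minimal sketch of the AAD idea using Node's built-in `crypto` module with AES-256-GCM. The choice of cipher, the `EncryptedState` shape, and the function names are assumptions for illustration; the thread does not specify an algorithm. The point it demonstrates is that decryption fails unless the caller presents the same AAD that was used at encryption time, so a server-side key alone is not enough.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "crypto";

// Hypothetical stored form of the encrypted session state.
interface EncryptedState {
  iv: string; // base64
  tag: string; // base64 GCM authentication tag
  data: string; // base64 ciphertext
}

// Encrypts the sensitive part of a session, binding it to a per-session AAD
// value that is assumed to live only in the user's cookie.
function encryptState(plaintext: string, key: Buffer, aad: Buffer): EncryptedState {
  const iv = randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  cipher.setAAD(aad);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return {
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"),
    data: data.toString("base64"),
  };
}

// Throws if the key, the AAD, or the ciphertext does not match what was encrypted.
function decryptState(enc: EncryptedState, key: Buffer, aad: Buffer): string {
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(enc.iv, "base64"));
  decipher.setAAD(aad);
  decipher.setAuthTag(Buffer.from(enc.tag, "base64"));
  const plain = Buffer.concat([
    decipher.update(Buffer.from(enc.data, "base64")),
    decipher.final(),
  ]);
  return plain.toString("utf8");
}
```

Note that in GCM the AAD is authenticated rather than secret key material: a wrong AAD makes decryption fail its integrity check. Whether that property alone is the right defense against a malicious administrator is a design question this sketch does not settle.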

What fields session information should include?

I think we should consider adding the following:

  • username hash (not encrypted), so we can find all sessions for a given user, which is a requirement from Limit the number of concurrent user sessions #18162. Then we can encrypt the username, which could be considered PII in some implementations.
  • roles (not encrypted), so we can find all sessions for a given role, which is a requirement from Limit the number of concurrent user sessions #18162.
    • Note: this is a sticky one — the session info is not the single source of truth for a user’s roles. To accommodate for this, conventional wisdom is to invalidate all of a user’s sessions when his role(s) change. However, a user’s roles can be changed directly in Elasticsearch, and we don’t have a way that I know of to ensure that Kibana is informed of this. We may need to consider adding something to Kibana authZ checks — to invalidate the session if the roles that are reflected in the Kibana sessionInfo do not match with the current roles in Elasticsearch. Not sure, interested to hear what others think.
  • IP address geolocation: so we can provide this in a UI
  • User agent (browser/OS): so we can provide this in a UI

last-access timestamps (e.g. for audit purposes)

This would mean two cluster calls every time a user makes an HTTP request — one to fetch the session, another to update its last-access timestamp. I don’t think we need this in the session store, as this information will be available in the audit log (after it is overhauled)

User and Elasticsearch realm names (e.g. to limit concurrent sessions)

As mentioned above, we’ll also need to capture roles.

Additional request to retrieve session information for every request that requires authentication

This could potentially allow for a DoS attack vector — if an attacker can force Kibana to make a round trip and

(The statement above was incomplete in my original comment; edited to the following)
If we are only encrypting session info, then without a valid session ID, an attacker couldn't force Kibana to consume resources on decryption. However, there's still a performance hit with this approach. We could implement some sort of in-memory caching on the Kibana side, but that would prevent us from invalidating sessions. This could be mitigated with a short window (say, 5 minutes) before session info would need to be retrieved again from Elasticsearch. And/or we could query Elasticsearch periodically in the background (say, every minute) to get a list of invalidated session IDs.
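
A rough sketch of the short-window cache described above. The `SessionCache` class and its `load` callback are hypothetical, and the loader is synchronous only to keep the example small; a real lookup against Elasticsearch would be asynchronous.

```typescript
// Hypothetical in-memory cache: entries are trusted for ttlMs, then re-fetched.
class SessionCache<T> {
  private readonly entries = new Map<string, { value: T; fetchedAt: number }>();

  constructor(
    private readonly ttlMs: number,
    // Stand-in for the round trip to the session store.
    private readonly load: (sid: string) => T | undefined
  ) {}

  get(sid: string, now: number): T | undefined {
    const cached = this.entries.get(sid);
    if (cached !== undefined && now - cached.fetchedAt < this.ttlMs) {
      return cached.value; // still inside the window, no round trip
    }
    const value = this.load(sid);
    if (value === undefined) {
      this.entries.delete(sid); // session is gone or was invalidated
      return undefined;
    }
    this.entries.set(sid, { value, fetchedAt: now });
    return value;
  }

  // Lets a background poll of invalidated session IDs evict entries early.
  invalidate(sid: string): void {
    this.entries.delete(sid);
  }
}
```

The trade-off is visible in the code: a session removed from the store can still be served from memory until its window elapses, unless `invalidate` is called first.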

Other thoughts:

  • we should generate SIDs and AAD using a secure PRNG — current OWASP guidance suggests a minimum of 128 bits, it wouldn’t hurt to just use 256 bits to be safe.
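
For illustration, generating both values with Node's `crypto.randomBytes`, which draws from the operating system's cryptographically secure source. The function name and the hex encoding are assumptions; 32 bytes gives the 256 bits mentioned above.

```typescript
import { randomBytes } from "crypto";

// Hypothetical helper: returns an opaque session ID and a separate AAD value,
// each 256 bits of secure random data encoded as hex.
function generateSessionSecrets(): { sid: string; aad: string } {
  return {
    sid: randomBytes(32).toString("hex"),
    aad: randomBytes(32).toString("hex"),
  };
}
```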

@legrego
Member

legrego commented Apr 16, 2020

I want to chew on this some more, but I agree with your assessment @jportner w/r/t not encrypting session identifiers, and including additional AAD so that a malicious actor can't perform a credential dump.

username hash (not encrypted), so we can find all sessions for a given user, which is a requirement from #18162. Then we can encrypt the username, which could be considered PII in some implementations.

++

roles (not encrypted), so we can find all sessions for a given role, which is a requirement from #18162.

I want to challenge this requirement. Is this really something that folks need for compliance? If so, then it's not enough to know that the list of roles has changed; it's possible that the privileges granted to one of the user's assigned roles were updated. Having just a list of roles would not be enough to detect that a user's privileges have in fact changed.

IP address geolocation: so we can provide this in a UI

I think storing the IP address initially would be sufficient. We can always specify a GeoIP Ingest Pipeline to populate this for us if we ever want or need this capability.

@legalastic I'm seeing mixed messages online -- is an IP address considered PII? We are considering storing and associating a user's IP address with their Kibana session, for auditing purposes, but also for potential features down the line (e.g., limiting number of sessions per IP, or invalidating sessions if the user's IP address has changed). This data lives within the user's cluster, and never leaves the user's cluster. It will also be periodically cleaned up as sessions are invalidated/expired, but the lifespan of this data is TBD and user-configurable to an extent.

User agent (browser/OS): so we can provide this in a UI

Would this be something we update every time we see it change, or would the initial user agent be sufficient?

last-access timestamps (e.g. for audit purposes)

We could try to challenge this one as well, to reduce the impact of server-side sessions. If it turns out to be problematic, we could issue an update if the last seen access timestamp > 5 minutes ago (configurable), or something to that effect.
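
The throttled update suggested here could look like the following. The function name and the default of five minutes (taken from the example figure in the comment) are illustrative assumptions.

```typescript
// Hypothetical check: only write a new last-access timestamp when the stored
// one is older than minIntervalMs, so most requests skip the extra update.
function shouldUpdateLastAccess(
  lastAccessedAt: number,
  now: number,
  minIntervalMs: number = 5 * 60 * 1000
): boolean {
  return now - lastAccessedAt > minIntervalMs;
}
```

With this approach the stored timestamp can lag real activity by up to `minIntervalMs`, so any idle-timeout check would need to tolerate that much slack.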

@jportner
Contributor

roles (not encrypted), so we can find all sessions for a given role, which is a requirement from #18162.

I want to challenge this requirement. Is this really something that folks need for compliance? If so, then it's not enough to know that the list of roles has changed; it's possible that the privileges granted to one of the user's assigned roles were updated. Having just a list of roles would not be enough to detect that a user's privileges have in fact changed.

The requirement is from NIST SP 800-53, AC-10 CONCURRENT SESSION CONTROL.

See text excerpt

Organizations may define the maximum number of concurrent sessions for information system accounts globally, by account type (e.g., privileged user, non-privileged user, domain, specific application), by account, or a combination. For example, organizations may limit the number of concurrent sessions for system administrators or individuals working in particularly sensitive domains or mission-critical applications. This control addresses concurrent sessions for information system accounts and does not address concurrent sessions by single users via multiple system accounts.

Since this control would be in place to prevent concurrent sessions by role to satisfy the "by account type" stipulation, we don't necessarily care if that role has been edited / underlying privileges have changed. We just care which users have that role.

IP address geolocation: so we can provide this in a UI

I think storing the IP address initially would be sufficient.

Yeah, I was under the impression that it might be considered PII so that's why I suggested storing the geolocation, I should have mentioned that. I'm glad you called it out.

User agent (browser/OS): so we can provide this in a UI

Would this be something we update every time we see it change, or would the initial user agent be sufficient?

I see this as for informational purposes only, not to be used as any sort of security control. So I think we only need to store the initial user agent, as that's not something that would change in normal usage.

@azasypkin
Member

If anything, this may just pose additional risk, as it would allow an attacker to force Kibana to decrypt data (could allow algorithmic complexity attack to result in DoS).

Yep, that's what we already have.

The session ID should just be an opaque identifier, nothing more.

Yes, I was just thinking that we may want to store part of the AAD in the cookie or some other info we don't want the user to change, but if we end up with only the session ID then we don't need to encrypt it, agree.

Just to clarify: I think we probably only want to encrypt sensitive data (such as credentials or PII), right? That way we can query on the other data.

Correct.

I propose that, along with each SID, we generate another random value to be used as AAD

++

username hash (not encrypted), so we can find all sessions for a given user, which is a requirement from #18162. Then we can encrypt the username, which could be considered PII in some implementations.

You can have equal usernames in different realms, so the hash most likely should be based on username, provider type and provider name (or ES realm name, TBD) to cover "limit user's concurrent session" requirement.
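
A sketch of such a hash, assuming SHA-256 from Node's `crypto` module and a JSON-encoded tuple as input. Both choices, and the function name, are illustrative; the thread does not name a hash function or an input encoding.

```typescript
import { createHash } from "crypto";

// Hypothetical lookup key: the same username under a different provider
// yields a different hash, so sessions can be grouped per user per provider.
function sessionUserHash(username: string, providerType: string, providerName: string): string {
  // JSON-encoding the tuple avoids ambiguity between e.g. ("ab", "c") and ("a", "bc").
  const input = JSON.stringify([providerType, providerName, username]);
  return createHash("sha256").update(input, "utf8").digest("hex");
}
```

An unkeyed hash of a guessable username can be brute-forced by anyone who can read the index; a keyed construction such as HMAC would be one way to address that if the hash is meant to protect PII.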

roles (not encrypted), so we can find all sessions for a given role, which is a requirement from #18162.
Note: this is a sticky one — the session info is not the single source of truth for a user’s roles. To accommodate for this, conventional wisdom is to invalidate all of a user’s sessions when his role(s) change. However, a user’s roles can be changed directly in Elasticsearch, and we don’t have a way that I know of to ensure that Kibana is informed of this. We may need to consider adding something to Kibana authZ checks — to invalidate the session if the roles that are reflected in the Kibana sessionInfo do not match with the current roles in Elasticsearch. Not sure, interested to hear what others think.

That one is tricky indeed... In the absolute majority of the installations this won't be needed, so I think we can optimize here a bit with bulk requests depending on whether these controls are enabled or not for the Kibana installation (or specific provider/realm). Invalidating user session if their current role set is affected by the "concurrent account type sessions" requirement sounds acceptable. We retrieve user with their roles on every successful authenticate attempt, we know which roles are controlled by the requirement (through kibana.yml) so it feels like we have everything to figure out whether we should log user out or forbid login.
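
One possible reading of this check, as a sketch. The `ActiveSession` shape, the `roleLimits` map (standing in for whatever would be configured in kibana.yml), and the function name are all hypothetical.

```typescript
// Hypothetical view of a live session: just enough to count sessions per role.
interface ActiveSession {
  sid: string;
  roles: string[];
}

// Returns the controlled roles whose concurrent-session limit is already reached.
// An empty result means a new login for a user with userRoles may proceed.
function exceededRoleLimits(
  userRoles: string[],
  roleLimits: Map<string, number>,
  activeSessions: ActiveSession[]
): string[] {
  return userRoles.filter((role) => {
    const limit = roleLimits.get(role);
    if (limit === undefined) {
      return false; // role is not subject to a concurrent-session control
    }
    const inUse = activeSessions.filter((s) => s.roles.indexOf(role) !== -1).length;
    return inUse >= limit;
  });
}
```

Because roles are fetched on every successful authentication, the same function could be run on ordinary requests to decide whether an existing session should be logged out after a role change.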

IP address geolocation: so we can provide this in a UI
User agent (browser/OS): so we can provide this in a UI

Do we have an immediate need for those, or can we consider adding them at a later stage?

last-access timestamps (e.g. for audit purposes)
This would mean two cluster calls every time a user makes an HTTP request — one to fetch the session, another to update its last-access timestamp. I don’t think we need this in the session store, as this information will be available in the audit log (after it is overhauled)

That's true, but don't we do this already (and we'll have to) to constantly extend idleTimeout anyway?

@carlspataro

@legrego
Responding to:

IP address geolocation: so we can provide this in a UI

I think storing the IP address initially would be sufficient. We can always specify a GeoIP Ingest Pipeline to populate this for us if we ever want or need this capability.

@legalastic I'm seeing mixed messages online -- is an IP address considered PII? We are considering storing and associating a user's IP address with their Kibana session, for auditing purposes, but also for potential features down the line (e.g., limiting number of sessions per IP, or invalidating sessions if the user's IP address has changed). This data lives within the user's cluster, and never leaves the user's cluster. It will also be periodically cleaned up as sessions are invalidated/expired, but the lifespan of this data is TBD and user-configurable to an extent.
+++++++++

Technically, under GDPR, an IP address is "personal data" where it is identifiable to an individual. (Parenthetically, I note the reference to geolocation (via MaxMind lookup) and would not consider that identifiable because the database there is too imprecise to provide address of a specific user.) As you may anticipate, the manner in which the IP address is used is also relevant to the analysis. Here, it looks like we would be using the IP address in a manner attributable to an individual.

For my perspective--is the issue here whether to encrypt the IP address or is there another issue under review?

@jportner
Contributor

username hash (not encrypted), so we can find all sessions for a given user, which is a requirement from #18162. Then we can encrypt the username, which could be considered PII in some implementations.

You can have equal usernames in different realms, so the hash most likely should be based on username, provider type and provider name (or ES realm name, TBD) to cover "limit user's concurrent session" requirement.

Good call!

roles (not encrypted), so we can find all sessions for a given role, which is a requirement from #18162.
Note: this is a sticky one [...]

That one is tricky indeed... In the absolute majority of the installations this won't be needed, so I think we can optimize here a bit with bulk requests depending on whether these controls are enabled or not for the Kibana installation (or specific provider/realm). Invalidating user session if their current role set is affected by the "concurrent account type sessions" requirement sounds acceptable. We retrieve user with their roles on every successful authenticate attempt, we know which roles are controlled by the requirement (through kibana.yml) so it feels like we have everything to figure out whether we should log user out or forbid login.

I like it!

IP address geolocation: so we can provide this in a UI
User agent (browser/OS): so we can provide this in a UI

Do we have an immediate need for those, or can we consider adding them at a later stage?

I think making a note of this for a later stage makes sense.

last-access timestamps (e.g. for audit purposes)
This would mean two cluster calls every time a user makes an HTTP request — one to fetch the session, another to update its last-access timestamp. I don’t think we need this in the session store, as this information will be available in the audit log (after it is overhauled)

That's true, but don't we do this already (and we'll have to) to constantly extend idleTimeout anyway?

Ah, I don't know why I wasn't thinking about idleTimeout. Well, right now we just update the cookie so it doesn't involve any extra round trip to Elasticsearch. Since the primary security concern driving server-side sessions is to be able to limit concurrent sessions, we could potentially leave idleTimeout in the cookie.

Moving idleTimeout to server side...

Pros:

  • All session info would be together in one place
  • Last access time would be useful for our planned server-side clean-up job to delete stale session data
  • We wouldn't have to worry about encrypting the cookie anymore (because it would only contain the SID and AAD, which can't be directly tampered with)
    • Side note on this point: Thinking this through again -- we could continue encrypting the SID+AAD cookie like we do today. This would prevent attackers from attempting to guess SIDs (which is not really a concern anyway). However, as a side benefit it would prevent attackers from forcing Kibana to call Elasticsearch with each attempt, since Kibana will only do so if the cookie is successfully decrypted.

Cons:

  • Extra round trip on every authenticated call, which could potentially add up to an enormous amount of traffic

So, could we come up with some other reasonable mechanism to facilitate cleanups of stale session data? If so, we could keep idleTimeout in the cookie.

@legrego
Member

legrego commented Apr 21, 2020

@legalastic:

Technically, under GDPR, an IP address is "personal data" where it is identifiable to an individual. (Parenthetically, I note the reference to geolocation (via MaxMind lookup) and would not consider that identifiable because the database there is too imprecise to provide address of a specific user.) As you may anticipate, the manner in which the IP address is used is also relevant to the analysis. Here, it looks like we would be using the IP address in a manner attributable to an individual.

Yes, we would be associating the IP address with a username/individual, but we don't have plans in this initial phase to expose this data to end-users. However, users with sufficient privileges may be able to query this index directly, but those read privileges would have to be explicitly granted by one or more assigned roles.

If it matters, we will not be storing the username in plain text, so querying the index directly will not allow someone to directly associate an IP address with a username, even if the IP address is stored in plain text.

For my perspective--is the issue here whether to encrypt the IP address or is there another issue under review?

Yeah, I'm interested to know if we need to encrypt the IP address. Doing so would prevent anyone with direct access to the Elasticsearch index from reading this data, but I expect it will also prevent us from being able to enrich this with geolocation data.

@carlspataro

@legrego I am going to see if we can get 15 minutes or so via zoom this week so that I can better visualize what you have in mind. My gut feeling is that the privacy implications here are pretty limited but I'd want to discuss live so I better understand. On the question of encrypting IP address--that may be as much a security-by-design question as it is a privacy-by-design question (e.g., if there are adequate controls over access to the Elasticsearch index then encryption is less of a requirement; plus, what is the risk to the customer/individual if the IP address were harvested.) I look forward to discussing in real time.

@carlspataro

@legrego to follow up on our call. There's no issue here from a legal perspective as the IP addresses remain in the cluster and only accessible to authorized users. The IP addresses are not being extracted or otherwise accessed by Elastic. As discussed, whether to encrypt is a judgment call as to the likely risk that an unauthorized user would get access to that information and, if they did, what the impact to the privacy rights of the individual would be. With regard to the latter---the impact would likely be slight since there is no readily available way for an unauthorized user to further be able to correlate to a person. With regard to the former, it sounds as if the product has security features designed into it that, if properly configured, would not permit unauthorized access. So the risk of unauthorized access is (by design) low. If I can provide more info, please let me know.


7 participants