New version negotiation mechanism makes updates impossible for cached resources #66

Closed
domenic opened this issue Dec 16, 2019 · 4 comments · Fixed by #70

Comments

@domenic
Collaborator

domenic commented Dec 16, 2019

The newish version negotiation mechanism (introduced in #47) has a pretty fatal flaw. Consider the following scenario:

  • The user visits https://example.com/static-page.html.
    • This page is cached for a year with Cache-Control: max-age=31536000.
    • This page sends Origin-Policy: allowed=("policy-1")
  • The site operator updates their origin policy from "policy-1" to "policy-2", for some good reason.
    • They remember to update all pages on their site (including static-page.html) to send Origin-Policy: allowed=("policy-1" "policy-2"); preferred="policy-2", of course.
  • The user visits https://example.com/dynamic-page.html for the first time.
    • This page is not cached, e.g. Cache-Control: no-store.
    • This has the same Origin-Policy: allowed=("policy-1" "policy-2"); preferred="policy-2" header, so it uses "policy-1" for the initial load, but updates the https://example.com/.well-known/origin-policy cache entry to contain "policy-2" in the background.
    • Subsequent visits to https://example.com/dynamic-page.html use "policy-2", as intended.
  • The user visits https://example.com/static-page.html.
    • They get the cached copy...
    • Including the header Origin-Policy: allowed=("policy-1")...
    • But https://example.com/.well-known/origin-policy is "policy-2", which is not in the allowed list...
    • So the user gets a hard-failure interstitial network error loading https://example.com/static-page.html---oh no.
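The scenario above can be replayed in a few lines. This is only a minimal sketch: the regex-based `parse_origin_policy` helper is a simplification invented for illustration (it is not a real structured-header parser), and the single `policy_cache_id` variable stands in for the browser's cache entry for https://example.com/.well-known/origin-policy.

```python
import re

def parse_origin_policy(header):
    """Simplified, hypothetical parser for values like:
    allowed=("policy-1" "policy-2"); preferred="policy-2"
    """
    allowed_match = re.search(r'allowed=\(([^)]*)\)', header)
    preferred_match = re.search(r'preferred="([^"]*)"', header)
    allowed = re.findall(r'"([^"]*)"', allowed_match.group(1)) if allowed_match else []
    preferred = preferred_match.group(1) if preferred_match else None
    return {"allowed": allowed, "preferred": preferred}

def policy_is_acceptable(header, cached_policy_id):
    """True if the origin's one cached policy ID is in the header's allowed list."""
    return cached_policy_id in parse_origin_policy(header)["allowed"]

# First visit: static-page.html is stored in the HTTP cache with the old header.
cached_static_header = 'allowed=("policy-1")'
policy_cache_id = "policy-1"

# After the operator's update, dynamic-page.html arrives fresh from the network.
dynamic_header = 'allowed=("policy-1" "policy-2"); preferred="policy-2"'
dynamic_page_ok = policy_is_acceptable(dynamic_header, policy_cache_id)

# The background update replaces the origin's single cached policy.
policy_cache_id = parse_origin_policy(dynamic_header)["preferred"]

# Second visit to static-page.html: the cached header no longer matches.
static_page_ok = policy_is_acceptable(cached_static_header, policy_cache_id)
print(dynamic_page_ok, static_page_ok)
```

Under these assumptions the dynamic page loads but the cached static page is rejected, which corresponds to the hard failure in the last step.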

Notably, this problem did not happen with the previous design, because the previous design didn't restrict us to one origin policy per origin; instead it had multiple origin policies, at different URLs of the form https://example.com/.well-known/origin-policy/$policy-name. However, upon re-reading, the previous design handled this in a different (but also somewhat broken) way. Visiting the cached page would update the default origin policy for the origin back to the old policy. The impact of this was somewhat limited: it threw away the (at that time normative) in-memory cache, and made it so that pages without the Sec-Origin-Policy header got the old policy.

This problem is also exacerbated by the change that allows any resource to deliver Origin-Policy headers; although a long-lived HTML page is a bit rare these days, a long-lived image or JS bundle is common.

Having one origin policy per origin seems like an intuitively good thing. But, maybe it is not really something you can reconcile with the existence of an HTTP cache; as long as pages exist in the cache with a preference for an old version of the origin policy, it seems you need to keep that old version around.

I'll continue thinking about potential ways to fix this, but thoughts would be welcome...

@domenic
Collaborator Author

domenic commented Dec 16, 2019

One idea: change the "id" JSON field to an "ids" JSON field, which accepts an array. Then it would be the website author's responsibility to ensure that they don't remove something from "ids" until all cached resources that reference that ID are expired.

Additional idea, which could combine with that one or be independent: network errors due to mismatched origin policies on cached documents cause a refetch from the server, similar to a cache miss.

I'm unsure whether these are minor reasonable tweaks to a reasonable model, or desperate attempts to patch a broken model. I think it comes down to how important one thinks "one origin policy per origin" is.
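To make the "ids" idea concrete, here is a rough sketch of the matching rule it implies. The `policy_matches` function and the dictionary shapes are assumptions for illustration; only the "ids" field name and the allowed list come from the proposal.

```python
def policy_matches(policy, allowed):
    """True if any ID listed in the policy's "ids" array is in the resource's allowed list."""
    return any(policy_id in allowed for policy_id in policy.get("ids", []))

# During the transition, the operator keeps the old ID in the array...
transitional_policy = {"ids": ["policy-1", "policy-2"]}
# ...and drops it only once resources cached with allowed=("policy-1") have expired.
final_policy = {"ids": ["policy-2"]}

old_cached_allowed = ["policy-1"]             # from a long-lived cached page
updated_allowed = ["policy-1", "policy-2"]    # from a freshly served page

print(policy_matches(transitional_policy, old_cached_allowed))
print(policy_matches(final_policy, old_cached_allowed))
print(policy_matches(final_policy, updated_allowed))
```

With the transitional policy the old cached page still matches; it only breaks if "policy-1" is removed from the array too early.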

@domenic
Collaborator Author

domenic commented Jan 9, 2020

OK, so, I think this comes down to a choice between two main designs. One is an evolution of the current design, and one is an evolution of the previous design, but they're different enough that it's easier to just lay them out independently.

Multiple "ids" design

  • From the server's point of view, there is only one origin policy per origin, located at https://example.com/.well-known/origin-policy.
  • From the client's point of view, there can be multiple origin policies per origin due to:
    • Multiple open tabs
    • Different HTTP cache keys
  • That policy contains an "ids" field, which is an array of strings. The server operator is responsible for ensuring that they do not remove any ID values from this array until any resources which were cached with allowed=("some-id") are expired from the cache.
  • If the server operator fails to do this, then such cached resources will get invalidated, and thus re-fetched from the network. Hopefully they will be updated with new Origin-Policy headers with allowed=() values that match the current ID. If not, we need to fail for real. (I guess this means introducing a special flag on that fetch, which prevents it from looping infinitely.)
  • The browser can proactively pre-fetch the most current origin policy by fetching https://example.com/.well-known/origin-policy.
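A minimal sketch of the invalidate-and-refetch step, including the flag that stops it from looping. Everything here is hypothetical: `load_resource`, the `already_refetched` flag, and the dictionaries standing in for the HTTP cache and the network are not taken from any spec text.

```python
def load_resource(url, http_cache, network, policy_ids, already_refetched=False):
    """Return (response, source); raise if nothing matches even after one refetch."""
    if not already_refetched and url in http_cache:
        response, source = http_cache[url], "cache"
    else:
        response, source = network[url], "network"
        http_cache[url] = response

    # A response is usable if any current policy ID is in its allowed list.
    if any(policy_id in response["allowed"] for policy_id in policy_ids):
        return response, source

    if source == "cache":
        # Mismatch on a cached copy: invalidate it and retry exactly once.
        del http_cache[url]
        return load_resource(url, http_cache, network, policy_ids, already_refetched=True)

    # Mismatch on a fresh network response: fail for real.
    raise RuntimeError("network error: origin policy mismatch for " + url)

url = "https://example.com/static-page.html"
policy_ids = ["policy-2"]  # the operator removed "policy-1" too early
http_cache = {url: {"allowed": ["policy-1"]}}
network = {url: {"allowed": ["policy-1", "policy-2"]}}

response, source = load_resource(url, http_cache, network, policy_ids)
print(source, response["allowed"])
```

Because the retry is forced to the network and a network mismatch raises, the function recurses at most once.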

Multiple origin policy URLs design

  • The server maintains multiple origin policies, located at https://example.com/.well-known/origin-policy/$policy-name.
  • From the client's point of view, there can be multiple origin policies per origin due to:
    • Multiple open tabs
    • Different HTTP cache keys
    • Different policies requested by different resources on the origin
  • The policies do not contain any ID in their JSON (instead it is inferred from the URL).
  • The server operator is responsible for ensuring that they do not cause any origin policy URLs to 404 until any resources which were cached with allowed=("some-id") are expired from the cache.
  • If the server operator fails to do this, then we can invalidate the cached resource, which will hopefully be updated with new Origin-Policy headers with allowed=() values that match a non-404ing origin policy URL. If not, we need to fail for real.
  • The browser can proactively pre-fetch the most current origin policy by fetching https://example.com/.well-known/origin-policy, if the server operator has responsibly set that up to do a 302 redirect to a specific policy URL.
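For comparison, a sketch of how policy selection might look in this design. The `select_policy` and `prefetch_latest` helpers, the "redirect" key, and the choice to try the preferred name first are all assumptions; the `server` dictionary stands in for the origin, with any missing URL treated as a 404.

```python
WELL_KNOWN = "https://example.com/.well-known/origin-policy"

def policy_url(name):
    """The policy ID is not in the JSON; it is implied by the URL."""
    return WELL_KNOWN + "/" + name

def select_policy(allowed, preferred, server):
    """Return the URL of the first non-404ing allowed policy, or None."""
    # Assumption: try the preferred name first, then the rest in listed order.
    candidates = ([preferred] if preferred in allowed else []) + [
        name for name in allowed if name != preferred
    ]
    for name in candidates:
        url = policy_url(name)
        if url in server:
            return url
    return None  # caller would invalidate the cached resource and refetch it

def prefetch_latest(server):
    """Follow the operator's optional 302 from the bare well-known URL."""
    entry = server.get(WELL_KNOWN)
    return entry["redirect"] if entry and "redirect" in entry else None

# The operator has already deleted policy-1 and points the bare URL at policy-2.
server = {
    policy_url("policy-2"): {"body": {}},
    WELL_KNOWN: {"redirect": policy_url("policy-2")},
}

print(select_policy(["policy-1"], None, server))
print(select_policy(["policy-1", "policy-2"], "policy-2", server))
print(prefetch_latest(server))
```

A `None` result corresponds to the case where the cached resource has to be invalidated; if the refetched headers still yield `None`, the load fails for real.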

Comparison

  • Multiple URLs makes it slightly harder to break previously-working static pages (like the OP scenario). You would have to proactively delete an origin policy resource to cause breakage.
  • Multiple URLs gives an additional way to have multiple origin policies per origin, which could be bad (e.g. loss of intended security) or good (not accidentally imposing policies a resource was unaware of).
  • Multiple "ids" makes it easier to fetch the most current origin policy without relying on the server operator to do extra work.
  • Multiple URLs requires more storage in the implementation, both in the HTTP cache, and in any memory cache used to speed things up.
  • Multiple URLs is way easier to write web platform tests for.

Specific scenario comparisons (variants of the OP)

Mismatches on cached resources comparison where the server operator is minimizing round trips:

  • Multiple URLs flow:
    • https://example.com/static-page.html second request is answered from cache with Origin-Policy: allowed=("policy-1")
    • https://example.com/.well-known/origin-policy/policy-1 is answered from cache because the server operator set a longer lifetime on it
    • The page loads with no server round trips and with policy-1
  • Multiple "ids" flow:
    • https://example.com/static-page.html second request is answered from cache with Origin-Policy: allowed=("policy-1")
    • https://example.com/.well-known/origin-policy is looked up in the cache and found to have "ids": ["policy-1", "policy-2"], so it is used
    • The page loads with no server round trips and with policy-2

Mismatches on cached resources comparison where the server operator is ensuring policy versions are always matching what resources expect:

  • Multiple URLs flow:
    • https://example.com/static-page.html second request is answered from cache with Origin-Policy: allowed=("policy-1")
    • https://example.com/.well-known/origin-policy/policy-1 404s because the server operator set a shorter lifetime on it
    • The browser reloads https://example.com/static-page.html from the network, which answers with Origin-Policy: allowed=("policy-1" "policy-2"); preferred="policy-2"
    • The browser retrieves https://example.com/.well-known/origin-policy/policy-2 from the cache
    • The page loads with 1 server round trip and with policy-2
  • Multiple "ids" flow:
    • https://example.com/static-page.html second request is answered from cache with Origin-Policy: allowed=("policy-1")
    • https://example.com/.well-known/origin-policy is looked up in the cache and found to have "ids": ["policy-2"]
    • The browser reloads https://example.com/static-page.html from the network, which answers with Origin-Policy: allowed=("policy-1" "policy-2"); preferred="policy-2"
    • The browser sees that https://example.com/.well-known/origin-policy now works so it's good to go
    • The page loads with 1 server round trip and with policy-2

Conclusion

It looks like multiple "ids" wins, to me, by a slight margin.

I'll work on a spec PR adding that. The re-fetch-on-mismatch business seems a little hairy to spec correctly, so we'll see how that goes...

I'd welcome notes on anything I missed, as this is still feeling fairly tricky.

@annevk

annevk commented Jan 10, 2020

It would be useful to hear from @arturjanc, @n8schloss, and others about deployment tradeoffs I think. I personally tend to agree that for something called Origin Policy we really ought to aim for having one and only allow multiple due to caching.

@domenic
Collaborator Author

domenic commented Jan 10, 2020

The "reload the page from network if its cached Origin-Policy header mismatches the current origin policy" idea above is no longer seeming as nice to me, as I write the spec text. In particular, who's to say that the cached page is the one that needs to be reloaded, and not the origin policy?

In the above scenarios, it is the cached page that is older (by construction), but it could easily be the other way around. Indeed, normally when we see a mismatch between an allowed=() value and the current origin policy, we re-fetch the origin policy.

You could try to fix that by ... refetching whichever cached thing is older? ... but that seems messy.
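For illustration only, the "refetch whichever is older" heuristic might look like the sketch below. The function name, the timestamp arguments, and the tie-breaking rule are all assumptions, and this is not something the proposal adopts.

```python
def choose_refetch_target(page_cached_at, policy_cached_at):
    """Pick which cached entry to refetch on a mismatch, given storage times in seconds."""
    # Assumption: on a tie, fall back to refetching the origin policy,
    # which is what normally happens on a mismatch.
    return "page" if page_cached_at < policy_cached_at else "policy"

# The page was cached long before the policy entry was updated.
print(choose_refetch_target(page_cached_at=1_000.0, policy_cached_at=5_000.0))
# The policy entry is the stale one.
print(choose_refetch_target(page_cached_at=5_000.0, policy_cached_at=1_000.0))
```

Even this small version has to compare the ages of two independent cache entries and pick a tie-breaking rule, which hints at why it feels messy.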

So for now at least, I'll omit that. People should be sure to include both "ids"...

domenic added a commit that referenced this issue Jan 10, 2020
domenic added a commit that referenced this issue Jan 13, 2020