Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add caching to service worker #92

Merged
merged 20 commits into from
Mar 14, 2024
Merged
Changes from 10 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
112 changes: 104 additions & 8 deletions src/sw.ts
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,13 @@ interface GetVerifiedFetchUrlOptions {
path: string
}

interface StoreReponseInCacheOptions {
response: Response
cacheKey: string
isMutable: boolean
cache: Cache
}

/**
******************************************************
* "globals"
Expand All @@ -40,6 +47,8 @@ interface GetVerifiedFetchUrlOptions {
declare let self: ServiceWorkerGlobalScope
let verifiedFetch: VerifiedFetch
const channel = new HeliaServiceWorkerCommsChannel('SW')
const IMMUTABLE_CACHE = 'IMMUTABLE_CACHE'
const MUTABLE_CACHE = 'MUTABLE_CACHE'
const urlInterceptRegex = [new RegExp(`${self.location.origin}/ip(n|f)s/`)]
const updateVerifiedFetch = async (): Promise<void> => {
verifiedFetch = await getVerifiedFetch()
Expand Down Expand Up @@ -85,6 +94,7 @@ self.addEventListener('fetch', (event) => {
const request = event.request
const urlString = request.url
const url = new URL(urlString)
log('helia-sw: incoming request url: %s:', event.request.url)

if (isConfigPageRequest(url) || isSwAssetRequest(event)) {
// get the assets from the server
Expand All @@ -98,10 +108,9 @@ self.addEventListener('fetch', (event) => {
}

if (isRootRequestForContent(event)) {
// intercept and do our own stuff...
event.respondWith(fetchHandler({ path: url.pathname, request }))
} else if (isSubdomainRequest(event)) {
event.respondWith(fetchHandler({ path: url.pathname, request }))
event.respondWith(getResponseFromCacheOrFetch(event))
2color marked this conversation as resolved.
Show resolved Hide resolved
}
})

Expand Down Expand Up @@ -177,13 +186,93 @@ function isSwAssetRequest (event: FetchEvent): boolean {
return isActualSwAsset
}

/**
* Set the expires header on a response object to a timestamp based on the passed ttl interval
*/
function setExpiresHeader (response: Response, ttlSeconds: number = 3600): void {
const expirationTime = new Date(Date.now() + ttlSeconds * 1000)

response.headers.set('Expires', expirationTime.toUTCString())
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be sensible to make this cache header explicitly SW specific (originally suggested in https://phyks.me/2019/01/manage-expiration-of-cached-assets-with-service-worker-caching.html)

Suggested change
response.headers.set('Expires', expirationTime.toUTCString())
response.headers.set('sw-cache-expires', expirationTime.toUTCString())

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that makes sense

Copy link
Member

@lidel lidel Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long term, if we can avoid inventing things specific to service worker, and reuse HTTP semantics, that is better for verified-fetch being useful in more contexts, and easier to test (if these are set upstream, places like https://github.com/ipfs/helia-http-gateway get these headers for free, and we also increase surface of things that can be tested there).

My initial thought here was that if we reuse semantically meaningful Expires and set it only if not set yet, in the future it can be set inside verified-fetch, and not only we don't need to invent special header here, but also improve the upstream library and the code here will be smaller.

But fine to go with custom header here for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lidel Because the SW Cache API always takes precedence over browser HTTP cache and and verified-fetch will be returning the Cache-Control header, the Expires header would be: no-op.

Even if we were to use the Expires header, by respecting it when there's a Cache-Control header, we'd be breaking from HTTP semantics. This is why I think it makes sense to make it clear that this header is purely for SW purposes.

}

/**
* Checks whether a cached response object has expired by looking at the expires header
* Note that this ignores the Cache-Control header since the expires header is set by us
*/
function hasExpired (response: Response): boolean {
const expiresHeader = response.headers.get('Expires')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const expiresHeader = response.headers.get('Expires')
const expiresHeader = response.headers.get('sw-cache-expires')

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved in #109


if (expiresHeader == null) {
return false
}

const expires = new Date(expiresHeader)
const now = new Date()

if (expires < now) {
return true
}

return false
2color marked this conversation as resolved.
Show resolved Hide resolved
}

async function getResponseFromCacheOrFetch (event: FetchEvent): Promise<Response> {
const { protocol } = getSubdomainParts(event.request.url)
const url = new URL(event.request.url)
const isMutable = protocol === 'ipns'
const cacheKey = `${event.request.url}-${event.request.headers.get('Accept') ?? ''}`
trace('helia-sw: cache key: %s', cacheKey)
const cache = await caches.open(isMutable ? MUTABLE_CACHE : IMMUTABLE_CACHE)
const cachedResponse = await cache.match(cacheKey)

if ((cachedResponse != null) && !hasExpired(cachedResponse)) {
// If there is an entry in the cache for event.request,
// then response will be defined and we can just return it.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spacing here seems off

Suggested change
// If there is an entry in the cache for event.request,
// then response will be defined and we can just return it.
// If there is an entry in the cache for event.request,
// then response will be defined and we can just return it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved in #109

log('helia-ws: cached response HIT for %s (expires: %s) %o', cacheKey, cachedResponse.headers.get('Expires'), cachedResponse)

trace('helia-ws: updating cache for %s in the background', cacheKey)
// 👇 update cache in the background (don't await)
fetchHandler({ path: url.pathname, request: event.request })
.then(async response => storeReponseInCache({ response, isMutable, cache, cacheKey }))
.catch(err => {
err('helia-ws: failed updating response in cache for %s in the background', cacheKey, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
err('helia-ws: failed updating response in cache for %s in the background', cacheKey, err)
error('helia-ws: failed updating response in cache for %s in the background', cacheKey, err)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved in #109

})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to call fetchHandler and update the cache if we already have a cached response?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved in #109. stale-while-revalidate only for ipns cache HIT


return cachedResponse
}

// 👇 fetch because no cached response was found
const response = await fetchHandler({ path: url.pathname, request: event.request })

void storeReponseInCache({ response, isMutable, cache, cacheKey }).catch(err => {
err('helia-ws: failed storing response in cache for %s', cacheKey, err)
})

return response
Copy link
Member

@SgtPooki SgtPooki Mar 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we could combine these so we don't have two potential places where cache storing happens and could fail

Copy link
Member

@SgtPooki SgtPooki Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking at the code more locally, we don't need to fetch from the network at all on cache-hit. Since we're currently fighting rate limits from the trustless gateway, we could just not do network requests. The likelihood of something updating on IPFS is very low, whereas stale-while-revalidate is good for frequently updated content that you want a quick response for.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved in #109

}

async function storeReponseInCache ({ response, isMutable, cache, cacheKey }: StoreReponseInCacheOptions): Promise<void> {
// 👇 only cache successful responses
if (!response.ok) {
return
}

// Clone the response since streams can only be consumed once.
const respToCache = response.clone()

if (isMutable) {
trace('helia-ws: setting expires header on response key %s before storing in cache', cacheKey)
// 👇 Set expires header to an hour from now for mutable (ipns://) resources
// Note that this technically breaks HTTP semantics, whereby the cache-control max-age takes precendence
2color marked this conversation as resolved.
Show resolved Hide resolved
// Seting this header is only used by the service worker asd a mechanism similar to stale-while-revalidate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Seting this header is only used by the service worker asd a mechanism similar to stale-while-revalidate
// Setting this header is only used by the service worker using a mechanism similar to stale-while-revalidate

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved in #109

setExpiresHeader(respToCache, 3600)
}

log('helia-ws: storing cache key %s in cache', cacheKey)
void cache.put(cacheKey, respToCache)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void cache.put(cacheKey, respToCache)
await cache.put(cacheKey, respToCache)

control flow should be done in higher level fetchHandler or getResponseFromCacheOrFetch

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved in #109

}

async function fetchHandler ({ path, request }: FetchHandlerArg): Promise<Response> {
/**
* > Any global variables you set will be lost if the service worker shuts down.
*
* @see https://developer.chrome.com/docs/extensions/develop/concepts/service-workers/lifecycle
*/
verifiedFetch = verifiedFetch ?? await getVerifiedFetch()
// test and enforce origin isolation before anything else is executed
const originLocation = await findOriginIsolationRedirect(new URL(request.url))
if (originLocation !== null) {
Expand All @@ -197,6 +286,13 @@ async function fetchHandler ({ path, request }: FetchHandlerArg): Promise<Respon
})
}

/**
* > Any global variables you set will be lost if the service worker shuts down.
*
* @see https://developer.chrome.com/docs/extensions/develop/concepts/service-workers/lifecycle
*/
verifiedFetch = verifiedFetch ?? await getVerifiedFetch()

2color marked this conversation as resolved.
Show resolved Hide resolved
/**
* Note that there are existing bugs regarding service worker signal handling:
* * https://bugs.chromium.org/p/chromium/issues/detail?id=823697
Expand Down
Loading