CI: set up auto-deployment of helia-service-worker gateway with nginx config #20

Closed
SgtPooki opened this issue Feb 13, 2024 · 11 comments

@SgtPooki
Member

from @lidel

we recently started naming things in a boring way where they describe what they do (trustless-gateway.link, delegated-ipfs.dev./routing/v1 etc 😄) so maybe we could mention why this gateway is special right in the name:
*.ipfs.in-service-worker.tld
*.ipfs.in-sw.tld
*.ipfs.in-browser.tld
(not feeling strongly, just brainstorming)

we should get one and set up CI to dogfood and autodeploy https://github.com/ipfs-shipyard/helia-service-worker-gateway
to *.ipfs.in-service-worker.tld (latest release) and *.ipfs.staging.in-service-worker.tld (staging branch for experimentation)

@SgtPooki, @aschmahmann any preference on the name and tld?

@lidel
Member

lidel commented Feb 13, 2024

cc @2color as this is something we will use to educate people on the difference between HTTP servers like ipfs.io, which do all the IPFS work on the backend and cover all the cost, and the service worker approach, where we do the IPFS work (trustless retrieval from multiple providers, local hash verification) on the client.

To give a more solid proposal, how does in-service-worker.dev sound?
Pitch: if the goal is to use this to educate developers, explicit is better. It does not bury the lede, it makes clear how this gateway differs from other gateways and that SW is the mechanism, and the name composes nicely with .ipfs. and .ipns. subdomains.

@2color
Member

2color commented Feb 13, 2024

Let's do it.
I'd vote for the shorter variant: ipfs.in-sw.xyz or .dev

@SgtPooki
Member Author

SgtPooki commented Feb 13, 2024

I personally think in-sw is a little vague, though I would like typing that more.

I prefer a more explicit in-service-worker visually, but I think if someone is using in-sw, they will be familiar with what sw actually means, so maybe it's not an issue.

@SgtPooki
Member Author

@lidel could we do in-browser.ipfs.io?

that could cause confusion, but could be nice?

https://specs-ipfs-tech.ipns.in-browser.ipfs.io/
https://specs-ipfs-tech.ipns.ipfs.io/

@lidel
Member

lidel commented Feb 15, 2024

I am afraid we can't use a subdomain of ipfs.io because that would defeat origin isolation (there is no isolation on ipfs.io, because it has a path gateway at ipfs.io/ipfs/). The general rule of thumb is to not reuse domains.
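
To illustrate the isolation point, here is a minimal Python sketch that compares browser origins for path-gateway and subdomain-gateway URLs. The `origin` helper and the `cid-a` / `cid-b` labels are hypothetical placeholders. The sketch only computes origins; it does not model how sharing a parent domain such as ipfs.io can weaken isolation for its subdomains.

```python
from urllib.parse import urlsplit


def origin(url: str) -> tuple:
    """Return the (scheme, host, port) triple that browsers use as the origin."""
    parts = urlsplit(url)
    default_port = {"http": 80, "https": 443}[parts.scheme]
    return (parts.scheme, parts.hostname, parts.port or default_port)


# Path gateway: every site is served from the same origin (https://ipfs.io),
# so unrelated content shares storage, service workers and permissions.
path_a = origin("https://ipfs.io/ipfs/cid-a/")
path_b = origin("https://ipfs.io/ipfs/cid-b/")

# Subdomain gateway: each content root gets its own origin.
# "cid-a" and "cid-b" are placeholder labels, not real CIDs.
sub_a = origin("https://cid-a.ipfs.inbrowser.link/")
sub_b = origin("https://cid-b.ipfs.inbrowser.link/")

print("path gateway shares an origin:", path_a == path_b)
print("subdomain gateway shares an origin:", sub_a == sub_b)
```

The comparison shows why a dedicated domain without a path gateway is the safer base for subdomain gateways.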

I'd say let's go with @2color's suggestion and pick short https://specs-ipfs-tech.ipns.in-sw.dev.

@ns4plabs are you able to get in-sw.dev the same way we got delegated-ipfs.dev ?

@2color
Member

2color commented Feb 16, 2024

Since this is going to be targeted at users, I'm a bit wary of the .dev TLD. What about .xyz?

Another idea is to use inbrowser.xyz, which would work well as CID.ipfs.inbrowser.xyz; it's memorable and self-explanatory.

@lidel lidel self-assigned this Feb 20, 2024
@lidel
Member

lidel commented Feb 20, 2024

Thank you for all suggestions. We did traditional naming bikeshed 🙃 during colo today and ended up buying

  • inbrowser.link (stable production, will run latest release, thing we will use on slides, docs and comms)
  • inbrowser.dev (backup domain returning the same OR we may use it for testing, staging)

Tomorrow I'll start working on a system for building and deploying to both, along with wildcard certs.
It would be nice if we had CI here update production, but we can also live with a semi-manual deployment process.

Below are my initial thoughts, but better ideas are welcome:

  • Investigate CF Pages

    • Benefit: delegate everything (TLS, deployments)
    • 🔴 We can't depend on Cloudflare Pages or its TLS certs: even though second-level wildcard certs are available as a paid product, they are not compatible with Cloudflare Pages due to "certificate prioritization" (docs):

      Advanced certificates are not used with Cloudflare Pages nor R2 due to certificate prioritization. Both Pages and R2 custom domains use Cloudflare for SaaS certificates.

  • Investigate DNSLink and Fleek

    • Benefit: automated CI/CD, we delegate build and deployments to Fleek, but handle TLS ourselves
    • Check if Cloudflare DNS allows for CNAME _dnslink.*.ip[fn]s to _dnslink; if so, it means we can have all subdomains share the same DNSLink.
      • 🔴 sadly no, setting CNAME _dnslink.*.ipns to _dnslink.inbrowser.link does not work because we already have CNAME *.ipns to inbrowser.link, which takes precedence.
      • ❓ But we may set up recursive dnslink via TXT record, will try that
        _dnslink.inbrowser.link
    • Idea: CNAME _dnslink.inbrowser.link to _dnslink.tbd.on.fleek.co and use Fleek for building and updating the DNSLink for the SW payload.
    • 🟠 Fleek has no support for wildcards, so we still need to point at our infra to have TLS certs for subdomains (similar to dweb.link)
    • 🟠 A bit awkward when DNSLink is detected and loaded from a local gateway. Not a deal breaker, but we need extra logic to avoid unnecessary recursion and to fix things up by redirecting to the proper subdomain
  • lo-fi: do it in-house, manual deploys

    • Benefit: we don't depend on anyone
    • point at our infra, handle TLS certs for subdomains similar to dweb.link, host the same static directory with prebuilt index.html + js for all of them.
    • deployment: done via PR against https://github.com/ipshipyard/waterworks-infra. Similar to dweb.link, but instead of deploying a Docker image of rainbow, build and deploy a git tag or revision from this repo

@lidel
Member

lidel commented Feb 21, 2024

Ok, I've been successful with

  • leveraging Fleek (for CI/CD and DNSLink automation)
  • and gateway-int.ipfs.io (for HTTP gateway based on Host header).

Things will start working in the browser once we set up TLS for HTTPS, because many Web APIs are not enabled in insecure contexts. For now you can test with curl:

$ curl http://cid.ipfs.inbrowser.link 
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta http-equiv="X-UA-Compatible" content="IE=edge" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />

    <title>Helia bundle by Webpack</title>

    <link
      rel="stylesheet"
      href="https://unpkg.com/tachyons@4.10.0/css/tachyons.min.css"
    />
    <link rel="stylesheet" href="https://unpkg.com/ipfs-css@0.12.0/ipfs.css" />
  <link rel="icon" href="/favicon.ico"><script defer src="main.js"></script><script defer src="sw.js"></script></head>
  <body>
    <div id="root" class="montserrat f5"></div>
  </body>
</html>

Details for posterity / review

DNS is configured like this (omitted irrelevant records):

;; CNAME Records
_dnslink.inbrowser.link.	60	IN	CNAME	_dnslink.helia-service-worker-gateway.on.fleek.co. ; this DNSLink defines the source of truth for root and all subdomains
inbrowser.link.	60	IN	CNAME	gateway-int.ipfs.io. ; note: cloudflare flattens this to make it work
*.ipfs.inbrowser.link.	60	IN	CNAME	inbrowser.link. ; catch-all wildcard for IPFS subdomain gateway
*.ipns.inbrowser.link.	60	IN	CNAME	inbrowser.link. ; catch-all wildcard for IPNS subdomain gateway

;; TXT Records
inbrowser.link.	60	IN	TXT	"dnslink=/ipns/inbrowser.link" ; this TXT is returned on all _dnslink.*.ip[fn]s. subdomains thanks to wildcard CNAMEs

Making a request to any domain under http://inbrowser.link, http://*.ipfs.inbrowser.link or http://*.ipns.inbrowser.link returns the same payload from Fleek:

  1. Our gateway infra at gateway-int.ipfs.io reads the Host header and resolves DNSLink based on its value. For example, if we have Host: cid.ipfs.inbrowser.link (subdomain request)...
  2. ...it resolves DNSLink from _dnslink.cid.ipfs.inbrowser.link thanks to the *.ipfs CNAME, and reads dnslink=/ipns/inbrowser.link from the main domain
  3. that recursive DNSLink triggers a read from _dnslink.inbrowser.link, which thanks to the CNAME to _dnslink.helia-service-worker-gateway.on.fleek.co returns the final CID managed by Fleek
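
The three steps above can be traced with a small offline simulation. This is only a sketch under stated assumptions: the record tables mirror the DNS configuration listed earlier, the wildcard matching is simplified, the `follow_cnames` and `resolve_dnslink` helpers are hypothetical names, and `bafyexamplecid` is a placeholder rather than the real CID published by Fleek. It is not how the gateway itself is implemented.

```python
# In-memory stand-in for the DNS records above (no network access).
# The inbrowser.link -> gateway-int.ipfs.io CNAME is omitted because
# Cloudflare flattens it and it plays no part in TXT lookups.
CNAME = {
    "_dnslink.inbrowser.link": "_dnslink.helia-service-worker-gateway.on.fleek.co",
    "*.ipfs.inbrowser.link": "inbrowser.link",
    "*.ipns.inbrowser.link": "inbrowser.link",
}
TXT = {
    "inbrowser.link": "dnslink=/ipns/inbrowser.link",
    # Placeholder value: the real record is managed by Fleek.
    "_dnslink.helia-service-worker-gateway.on.fleek.co": "dnslink=/ipfs/bafyexamplecid",
}


def follow_cnames(name: str) -> str:
    """Follow exact and wildcard CNAMEs until a name with no CNAME is reached."""
    seen = set()
    while name not in seen:
        seen.add(name)
        target = CNAME.get(name)  # an exact match wins over wildcards
        if target is None:
            for pattern, wildcard_target in CNAME.items():
                # Simplified wildcard rule: "*.x" matches any name ending in ".x".
                if pattern.startswith("*.") and name.endswith(pattern[1:]):
                    target = wildcard_target
                    break
        if target is None:
            return name
        name = target
    raise ValueError("CNAME loop detected")


def resolve_dnslink(domain: str, max_hops: int = 8) -> str:
    """Resolve the DNSLink of `domain` down to an immutable /ipfs/ path."""
    for _ in range(max_hops):
        owner = follow_cnames("_dnslink." + domain)
        record = TXT.get(owner, "")
        if not record.startswith("dnslink="):
            raise LookupError("no DNSLink TXT record for " + domain)
        path = record[len("dnslink="):]
        if path.startswith("/ipfs/"):
            return path
        if not path.startswith("/ipns/"):
            raise ValueError("unsupported DNSLink value: " + path)
        domain = path[len("/ipns/"):]  # recursive DNSLink: resolve the next name
    raise RuntimeError("too many DNSLink hops")


print(resolve_dnslink("cid.ipfs.inbrowser.link"))
```

In this model the root domain and every *.ipfs / *.ipns subdomain resolve to the same path, which matches the "same payload" behaviour described above.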

This comes with interesting maintenance benefits:

  • We set DNS once and don't touch DNS at all.
  • No vendor lock-in: Fleek DNSLink could be replaced with something else, including an IPNS record, at any time, and it would be a trivial change of one DNS record at _dnslink.inbrowser.link
  • Ecosystem can follow our gateway: we can use DNSLink as a means of publishing software updates of the service worker gateway. Other gateways could follow our gateway and benefit from automated updates by setting up CNAME _dnslink.example.com to _dnslink.inbrowser.link.

Remaining work

  • waterworks infra – I'll need some help from @ns4plabs – tracked in https://github.com/ipshipyard/waterworks-infra/pull/21
    • Set up TLS for http://inbrowser.link, http://*.ipfs.inbrowser.link or http://*.ipns.inbrowser.link (and .dev variants)
    • configure Nginx to return /index.html instead of HTTP 404 (allows us to delegate and handle all redirects to JS)
  • JS - file separate issues for any identified gaps
  • Ops: Decide if we simply deploy main to both .link and .dev (current setup) or if we want more sophisticated setup
    • For end goal: my preference would be to create separate Fleek project that builds from production branch, and use it for .link, and have .dev for main branch builds.
    • For now, let's keep things simple, point both at Fleek DNSLink built from main. When we stabilize codebase we can start making releases and refine this.
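
The Nginx item above (return /index.html instead of HTTP 404) amounts to a simple fallback rule. The Python sketch below only models that decision and is not the actual Nginx configuration. The `pick_file` helper is hypothetical, and the file set is assumed from the assets referenced in the HTML shown earlier.

```python
import posixpath

# Assumed contents of the published static directory, based on the HTML above.
PUBLISHED = {"/index.html", "/main.js", "/sw.js", "/favicon.ico"}


def pick_file(request_path: str, published_files: set) -> str:
    """Return the file to serve: the requested one if it exists, else /index.html."""
    normalized = posixpath.normpath("/" + request_path.lstrip("/"))
    if normalized in published_files:
        return normalized
    # Unknown paths get the app shell, so the JS can handle routing and redirects.
    return "/index.html"


for path in ("/main.js", "/", "/ipfs/some-cid/readme.md"):
    print(path, "->", pick_file(path, PUBLISHED))
```

With this rule, any path on any subdomain loads the page that registers the service worker, which can then take over the request.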

@SgtPooki
Member Author

For end goal: my preference would be to create separate Fleek project that builds from production branch, and use it for .link, and have .dev for main branch builds.

I definitely prefer this approach. We could also have an automated PR (release-please style) that auto-updates on releases of "main" with a checklist of things to test against the .dev URL; the merge would then just fast-forward prod to main HEAD.

For now, it would be nice to have .dev for manual deployments too (but I can also deploy to my digitalocean box)

@lidel
Member

lidel commented Feb 27, 2024

Adin reported some POPs struggle to get data from Fleek:

$ curl -I https://inbrowser.link/
HTTP/1.1 504 Gateway Time-out
Server: openresty
Date: Tue, 27 Feb 2024 15:07:41 GMT
Content-Type: text/html
Content-Length: 164
Connection: keep-alive
X-IPFS-LB-POP: gateway-bank1-sv15
X-BFID: cfd48b6b097b9e0f45810a2936bf26f6
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload

After more pressing bugs are tackled, we will switch production at .link to a setup similar to dist.ipfs.tech (cluster + manual DNSLink update); Fleek will only be used for dev and PR previews.

@lidel
Member

lidel commented Mar 1, 2024

@SgtPooki @aschmahmann @2color FYSA we no longer depend on Fleek for publishing and pinning inbrowser.link and inbrowser.dev.

  • 👉️ TLDR: we build the DAG ourselves (in a CI job), pin it to our collab cluster, and update DNSLinks.

Example:

Follow-up work is to stabilize .link and only deploy releases there:

@lidel lidel closed this as completed Mar 1, 2024