To host stats.jenkins.io GSoC 2024 project in jenkins-infra #4132

Open
krisstern opened this issue Jun 7, 2024 · 81 comments
Assignees
Labels
helpdesk Infrastructure related issues management in Github stats.jenkins.io

Comments

@krisstern
Member

krisstern commented Jun 7, 2024

Service(s)

Helpdesk, stats.jenkins.io

Summary

We are in the process of redeveloping the frontend for the data presentation of the Jenkins Infra Statistics project, and will need help migrating the GSoC contributor's repo (https://github.com/shlomomdahan/stats.jenkins.io/) from his personal GitHub account to the jenkins-infra org.

@lemeurherve @gounthar @shlomomdahan

Reproduction steps

N/A

@krisstern krisstern added the triage Incoming issues that need review label Jun 7, 2024
@jenkins-infra-helpdesk-app jenkins-infra-helpdesk-app bot added helpdesk Infrastructure related issues management in Github stats.jenkins.io labels Jun 7, 2024
@lemeurherve lemeurherve self-assigned this Jun 7, 2024
@lemeurherve lemeurherve removed triage Incoming issues that need review helpdesk Infrastructure related issues management in Github labels Jun 7, 2024
@dduportal dduportal added the triage Incoming issues that need review label Jun 11, 2024
@dduportal dduportal added this to the infra-team-sync-2024-06-18 milestone Jun 11, 2024
@dduportal dduportal removed the triage Incoming issues that need review label Jun 11, 2024
@jenkins-infra-helpdesk-app jenkins-infra-helpdesk-app bot added the helpdesk Infrastructure related issues management in Github label Jun 11, 2024
@krisstern krisstern changed the title To host https://github.com/shlomomdahan/stats2.jenkins.io/ docs in jenkins-infra for GSoC 2024 To host https://github.com/shlomomdahan/stats.jenkins.io/ docs in jenkins-infra for GSoC 2024 Jun 12, 2024
@lemeurherve
Member

@shlomomdahan could you initiate the transfer of your repository to https://github.com/jenkins-infra please?

See https://docs.github.com/en/repositories/creating-and-managing-repositories/transferring-a-repository#transferring-a-repository-owned-by-your-personal-account

@lemeurherve
Member

lemeurherve commented Jun 14, 2024

Action plan

Similar to:

@lemeurherve
Member

lemeurherve commented Jun 14, 2024

@krisstern transferred the repository to the jenkins-infra org (thanks!).

I've set @krisstern as "maintainer", and https://github.com/orgs/jenkins-infra/teams/core as "admin".

@krisstern
Member Author

Thanks @lemeurherve!

@dduportal
Contributor

Question about Fastly: what is the rationale behind using Fastly in this case (compared to serving the website ourselves)?

@lemeurherve
Member

lemeurherve commented Jun 14, 2024

Using it as CDN, same logic as for other websites like jenkins.io, stories.jenkins.io, contributors.jenkins.io, docs.jenkins.io.

The website would be served from an nginx on publick8s at (new.)stats.origin.jenkins.io like the others.

Am I missing something?

@dduportal
Contributor

Using it as CDN, same logic as for other websites like jenkins.io, stories.jenkins.io, contributors.jenkins.io, docs.jenkins.io.

The website would be served from an nginx on publick8s at (new.)stats.origin.jenkins.io like the others.

Am I missing something?

For jenkins.io, the rationale has always been that the volume of requests costs too much in outbound bandwidth, and that was measured.

Do we have current stats for stats.jenkins.io?

The question is: do we need a CDN? If not, then what is the point?

I have the same question for stories and contributors, but I never got an answer, or at least I have not seen any numbers or facts for it.

@lemeurherve
Member

lemeurherve commented Jun 14, 2024

As stats.jenkins.io is currently hosted on GitHub Pages, I'm not sure we can access traffic stats from it.
I found https://github.blog/2014-01-07-introducing-github-traffic-analytics/ but it doesn't seem active anymore.

I don't think there's a lot of traffic on these pages, so we can start serving it without Fastly and then see what the actual numbers are in terms of visits.

Removing the Fastly step points and reworking jenkins-infra/azure-net#257

@lemeurherve
Member

FTR, I invited @shlomomdahan as writer on stats.jenkins.io.

Could be revised when a dedicated team is created.

@lemeurherve
Member

Opened jenkins-infra/stats.jenkins.io#3 to add a pipeline for building the website on ci.jenkins.io and infra.ci.jenkins.io (pending jenkins-infra/kubernetes-management#5310).

@krisstern
Member Author

Hi @lemeurherve, would it be possible to add @gounthar as a co-maintainer?

@lemeurherve
Member

lemeurherve commented Jun 17, 2024

https://github.com/orgs/jenkins-infra/teams/stats-jenkins-io team including you and @gounthar created and set as "maintainer" on https://github.com/jenkins-infra/stats.jenkins.io/.

@dduportal dduportal self-assigned this Sep 12, 2024
@dduportal
Contributor

Can we also add @Vandit1604 and @shlomomdahan as the repo maintainers along with @gounthar and me?

For @dduportal @smerle33: they should be added to the @jenkins-infra/stats-jenkins-io team (which has the "maintainer" role on the repository) and their possible custom/personal roles removed.

Just wanted to check, has this task been completed yet?

Hi @krisstern, it has been done:

  • Both @gounthar and you have been set up as "maintainers" of the team. It means you will be able to add/change/remove users in the team
  • Invitations sent to @Vandit1604 and @shlomomdahan so they can join the team

@dduportal
Contributor

Update:

@lemeurherveCB

lemeurherveCB commented Sep 12, 2024

(copied from #4265)

As noted in badges/shields#10522 (comment), there are far more consumers than we (I) initially thought: https://github.com/search?q=stats.jenkins.io&type=code 😱

Links to (old.)stats.jenkins.io were pointing to files (still) stored on GitHub (currently on the gh-pages, generated from the pipeline of https://github.com/jenkins-infra/infra-statistics), while the new frontend doesn't serve them.

To avoid breaking existing consumers and without serving these files from our service, I propose to put in place redirections from the three folders of https://github.com/jenkins-infra/infra-statistics/tree/gh-pages (then from the data branch) to https://raw.githubusercontent.com/jenkins-infra/infra-statistics/gh-pages/ with the help of jenkins-infra/helm-charts#1332, WDYT?

Ex:
https://stats.jenkins.io/plugin-installation-trend/view-job-filters.stats.json
would redirect to
https://raw.githubusercontent.com/jenkins-infra/infra-statistics/gh-pages/plugin-installation-trend/view-job-filters.stats.json

Note: as https://stats.jenkins.io/jenkins-stats, https://stats.jenkins.io/plugin-installation-trend & https://stats.jenkins.io/pluginversions are not actual pages of the new frontend, this shouldn't cause conflict.
The remaining issue that I see with this proposal is that consumers not following redirections would still need to adapt their fetching process.
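To make the proposed mapping concrete, here is a minimal Python sketch of the path rewrite described above. It is only an illustration: the actual redirection would presumably live in the web server configuration touched by jenkins-infra/helm-charts#1332, and the function and constant names below are hypothetical. The folder names and the target base URL are taken from this comment.

```python
from typing import Optional
from urllib.parse import urlsplit

# The three legacy folders named in the comment above.
LEGACY_FOLDERS = ("jenkins-stats", "plugin-installation-trend", "pluginversions")
# Proposed redirect target (gh-pages branch of infra-statistics).
RAW_BASE = "https://raw.githubusercontent.com/jenkins-infra/infra-statistics/gh-pages"


def redirect_target(url: str) -> Optional[str]:
    """Return the raw.githubusercontent.com URL for a legacy file, else None."""
    path = urlsplit(url).path
    parts = path.lstrip("/").split("/", 1)
    # Only files *inside* a legacy folder are redirected; everything else
    # (including the pages of the new frontend) is left untouched.
    if len(parts) == 2 and parts[0] in LEGACY_FOLDERS and parts[1]:
        return f"{RAW_BASE}/{parts[0]}/{parts[1]}"
    return None


if __name__ == "__main__":
    print(redirect_target(
        "https://stats.jenkins.io/plugin-installation-trend/view-job-filters.stats.json"
    ))
```

Because only paths under the three legacy folders match, any other path returns `None` and would keep being handled by the new frontend.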

@halkeye
Member

halkeye commented Sep 12, 2024

why take over the old domain at all? it's pretty bad practice to break old files, and stats has always been raw metrics, csvs, and stuff

why not call the new site data.jenkins.io or dataviz and keep them separate?

@lemeurherveCB

Good point.

@timja
Member

timja commented Sep 12, 2024

I think fine to take over the domain but old links need to be maintained

@dduportal
Contributor

As underlined by @timja in another thread, any reason not to roll back and re-architect?
For instance, putting the new frontend in the GitHub Pages site so that both sites are merged?

@lemeurherveCB

lemeurherveCB commented Sep 12, 2024 via email

@dduportal
Contributor

Yeah that should be doable, the content is static. I take the blame here I should have thought about that possibility. I'll try on a fork tomorrow then propose a plan to adapt the service and decommission the resources that won't be used anymore if that's OK for you.

That's OK in terms of planning 👍 thanks!

@krisstern @lemeurherve @gounthar WDYT if we roll back the DNS to the old site to repair user use cases? It would give you time to plan, act and retest.

@krisstern
Member Author

Sure @dduportal, the arrangement is okay with me

@dduportal
Contributor

Just realized something: jenkins-infra/helm-charts#1332 (comment)

Just a point about GitHub Pages: it only accepts a single CNAME.
Looks like it is a problem here as we have to choose between old.stats.jenkins.io or stats.jenkins.io if we roll back.

  1. If we roll back, we'll break users who already migrated to old.stats.jenkins.io, including the plugin site (but we can revert the PRs quickly for this specific one).
  2. GitHub Pages is useful for not paying for bandwidth, BUT it does not easily handle multiple CNAMEs (which makes sense: it's free). Have to think about it for the new architecture.
  • Also: in terms of permissions, it is a nightmare for ensuring production works as expected for the Jenkins Infra team. Any repository maintainer can accidentally break it.

@timja
Member

timja commented Sep 13, 2024

Also: in terms of permissions, it is a nightmare for ensuring production works as expected for the Jenkins Infra team. Any repository maintainer can accidentally break it.

I would say not a real concern, you are highly unlikely to touch this area.

@lemeurherveCB

I propose to revise the decision to not use Fastly: activating it would allow us to serve files from our service without worrying about the bandwidth, while leaving old.stats.jenkins.io in place for consumers who already switched.

@timja
Member

timja commented Sep 13, 2024

Sounds good to me, any concerns about using fastly?

Enough credits?

@dduportal
Contributor

Sounds good to me, any concerns about using fastly?

Enough credits?

As discussed privately with @lemeurherve :

Cons of setting up Fastly:

  • Additional work (to set it up, changing DNSes) as it is an additional component
  • We're not sure if it is really needed here (e.g. how much data is really served by stats.jio?)
  • Need to monitor if it's not adding too much usage in the sponsorship (I don't know, we can look now but we'll have to check it monthly)

Pros of using Fastly:

  • Solve the bandwidth issue
  • Fire and forget for the stats.jio website
  • Same pattern as www.jenkins.io, plugins.jenkins.io, etc.

Any objection if we start without Fastly (less work and @lemeurherve can focus on solving the missing files), wait for 1 month, check usage and enable Fastly later if needed?
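The thread does not say how usage would be checked after that month. As one possible approach, assuming the nginx access logs of the service are available and use nginx's default "combined" log format, the outbound volume could be estimated by summing the `$body_bytes_sent` field. The sketch below is hypothetical (function name and sample lines are made up for illustration) and is not the team's actual tooling.

```python
import re
from typing import Iterable

# Default nginx "combined" format starts with:
# $remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent ...
LINE_RE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) (?P<bytes>\d+)'
)


def total_bytes_sent(lines: Iterable[str]) -> int:
    """Sum the response body sizes of all parseable access-log lines."""
    total = 0
    for line in lines:
        match = LINE_RE.match(line)
        if match:  # silently skip lines that are not in the expected format
            total += int(match.group("bytes"))
    return total


# Made-up sample lines (documentation IP range), for illustration only.
sample = [
    '203.0.113.7 - - [13/Sep/2024:10:00:00 +0000] '
    '"GET /plugin-installation-trend/view-job-filters.stats.json HTTP/1.1" '
    '200 2048 "-" "curl/8.0"',
    '203.0.113.8 - - [13/Sep/2024:10:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    "not a log line",
]

if __name__ == "__main__":
    print(total_bytes_sent(sample))
```

A monthly total obtained this way could then be compared against the bandwidth cost to decide whether a CDN is worth the extra setup.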

@timja
Member

timja commented Sep 13, 2024

Sounds good to me

@lemeurherveCB

lemeurherveCB commented Sep 13, 2024

Opened jenkins-infra/stats.jenkins.io#168 to restore broken links by serving all existing static files from stats.jenkins.io while leaving old.stats.jenkins.io in place for now.

@lemeurherveCB

lemeurherveCB commented Sep 13, 2024

@lemeurherveCB

lemeurherveCB commented Sep 13, 2024

Proposition for the next steps:

@dduportal
Contributor

The production deployment process has been broken since jenkins-infra/stats.jenkins.io#168, as per @lemeurherveCB's message in https://matrix.to/#/!JLUOInpEYmxJIYXlzs:matrix.org/$lK7-7uRQPpVDLey3LeH0KpMj5EJlxIJx7rGDxTsVlLA?via=g4v.dev&via=gitter.im&via=matrix.org

The azcopy step keeps failing on subsequent replays or builds.

Currently investigating.

@dduportal
Contributor

Currently investigating.

Failures from error logs are described in jenkins-infra/stats.jenkins.io#168 (comment)

@dduportal
Contributor

dduportal commented Sep 16, 2024

Additional issue with the new website (and the addition of the old files): we need to reconsider/fine-tune the "try file" method (due to frontend-managed routing) - #4291
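The thread does not show the actual server configuration or the exact failure tracked in #4291. As a rough illustration of what a "try file" rule implies when static files and frontend-managed routes share one domain, the Python sketch below mimics a simplified lookup order (exact file, then `index.html` under that path, then the frontend entry point). The rule, the function name and the file set are assumptions made for this example only.

```python
from typing import Set


def resolve(uri: str, files: Set[str], fallback: str = "/index.html") -> str:
    """Return the file that a simplified 'try file' rule would serve for uri."""
    candidates = [uri, uri.rstrip("/") + "/index.html"]
    for candidate in candidates:
        if candidate in files:
            # A real file on disk wins over frontend routing.
            return candidate
    # Unknown paths are handed to the frontend, which does its own routing.
    return fallback


# Hypothetical content of the served directory.
files = {
    "/index.html",
    "/assets/app.js",
    "/plugin-installation-trend/view-job-filters.stats.json",
}

if __name__ == "__main__":
    print(resolve("/plugin-installation-trend/view-job-filters.stats.json", files))
    print(resolve("/some-frontend-route", files))
```

With this kind of rule, a request for a missing legacy file is answered with the frontend entry point instead of a 404, which is one reason the lookup order may need tuning.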
