Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: minify HTML on build #2537

Merged
merged 4 commits into from
Nov 24, 2019

Conversation

nschonni
Copy link
Member

@nschonni nschonni commented Sep 8, 2019

No description provided.

@XhmikosR
Copy link
Contributor

XhmikosR commented Sep 8, 2019

IMO html-minifier definitely needs custom options and most importantly this shouldn't happen in dev mode. This can change the rendering with the default options. The safest approach would be to use conservative collapse.

My suggestion is to use this config:

{
  collapseBooleanAttributes: true,
  collapseWhitespace: true,
  conservativeCollapse: true,
  decodeEntities: true,
  minifyCSS: {
    level: {
      1: {
        specialComments: 0
      }
    }
  },
  minifyJS: true,
  minifyURLs: false,
  processConditionalComments: true,
  removeAttributeQuotes: true,
  removeComments: true,
  removeOptionalAttributes: true,
  removeOptionalTags: true,
  removeRedundantAttributes: true,
  removeScriptTypeAttributes: true,
  removeStyleLinkTypeAttributes: true,
  removeTagWhitespace: false,
  sortAttributes: true,
  sortClassName: true
}

This was referenced Sep 8, 2019
@XhmikosR
Copy link
Contributor

XhmikosR commented Sep 8, 2019

Also, it goes without saying, we need to test this a lot before landing. :)

About the env, I made #2540 which you can use.

@XhmikosR
Copy link
Contributor

XhmikosR commented Sep 8, 2019

To maintainers: I just tried this PR, and I was right; it does changes the rendered HTML. It shouldn't land with the default options. Also, we definitely need this to run only for production because it's a lot slower.

But I think we should also address #2524 and #2523. html-minifier is using uglify-js which doesn't work with ES6. Plus we need to be extra careful with semicolons missing in the inline JS.

So, I'd put this on hold for now. Patch for reference master...XhmikosR:feat--minify-HTML-on-build

@XhmikosR
Copy link
Contributor

XhmikosR commented Sep 8, 2019

On my machine with my patch, it takes like ~7 minutes to finish the production build. I definitely want this to land at some point, just pointing out what I see so far.

There's also some new HTML errors which should be fixed after #2527 lands and we have #2469 landed too, to verify on each build.

@Trott
Copy link
Member

Trott commented Sep 8, 2019

Given the benefits of compression during HTTP transfers, is the cost/work of safely implementing HTML minification worth it? The current main page HTML is 4.2 Kb over the wire after compression. Even if HTML minification saved us a whopping 15%, that would be around 600 bytes. Is that worthwhile when it risks messing up rendering and complicating testing/debugging, even if those risks are small?

@XhmikosR
Copy link
Contributor

XhmikosR commented Sep 9, 2019

I thought about it more, too.

In total, it reduces the total HTML a lot. It's ~11K files. That being said, the gain from this after compression won't be big. At least for the English pages. For the translations, where people keep commented out the English strings too, it helps a lot more.

Perhaps my config is too strict hence why I get such times. I haven't tried a production deploy without uglify for example. This should speed things up, maybe to an acceptable number.

That being said, since we use Cloudflare, we could also just enable HTML minification there alternatively.

I'd say, let's keep this open for a little more. I'd like to experiment more with this later after we clean up some of the other PRs.

@XhmikosR
Copy link
Contributor

XhmikosR commented Sep 9, 2019

BTW with my config, the output doesn't change. This is a known issue due to inline-block elements.

@XhmikosR
Copy link
Contributor

XhmikosR commented Oct 1, 2019

@Trott I think this should be closed since I can't push my changes to @nschonni's fork. I have updated the patch on my fork master...XhmikosR:feat--minify-HTML-on-build and the plan is to try it again after #2543 is merged so that we don't minify inline JS which seems to cause the biggest slowdown.

@Trott
Copy link
Member

Trott commented Oct 1, 2019

@Trott I think this should be closed since I can't push my changes to @nschonni's fork. I have updated the patch on my fork master...XhmikosR:feat--minify-HTML-on-build and the plan is to try it again after #2543 is merged so that we don't minify inline JS which seems to cause the biggest slowdown.

@nschonni Does that sound good to you?

@nschonni
Copy link
Member Author

nschonni commented Oct 1, 2019

Yeah, there is not much in this PR besides adding it to the pipeling.
Just kept it open for the discussion

@XhmikosR
Copy link
Contributor

XhmikosR commented Oct 9, 2019

I rebased this and pushed it to @nschonni's branch.

@XhmikosR XhmikosR force-pushed the feat--minify-HTML-on-build branch 2 times, most recently from 7558c8c to 701ef8a Compare October 12, 2019 08:43
Copy link
Contributor

@MylesBorins MylesBorins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rubber Stamp LGTM assuming that the final build still renders without issue.

@XhmikosR
Copy link
Contributor

I need to investigate this a little more.

It would certainly help if we had PR previews.

@Trott
Copy link
Member

Trott commented Nov 10, 2019

@nschonni If you still want this (and my objection is minimal because if this ever does cause a problem, it would be easy to disable), can you resolve the conflict?

@XhmikosR
Copy link
Contributor

Rebased. I'm still not sure this is worth it for nodejs.org. For translations it has bigger gains for sure. But on the other hand we don't minify anything in core docs either.

That being said, personally, I always do this in my projects when possible. Also, with the changes I made, it should be "safe" layout-wise.

We could always land it and see how it goes.

Here's a live preview: https://hopeful-bartik-af0f39.netlify.com/en/

@nodejs/website please check if everything looks good, mostly in translations.

@XhmikosR
Copy link
Contributor

XhmikosR commented Nov 10, 2019

Also, we've yet to see the slowdown when deploying to production. CI doesn't run the full build but it's almost 2.5x slower.

@tniessen
Copy link
Member

From what I can tell based on a few tests, this reduces the page size by less than 20% (before compression, likely less after compression), while adding a dependency, making the build pipeline more complex and the output harder to read for humans. The vast majority of our traffic comes from downloads, not HTML pages. Is this tiny reduction really worth it?

@nschonni
Copy link
Member Author

I don't feel super strongly about this, but the benefit of minification (even if it's small), is that not everyone that is reading the docs will have fast internet connections.

@tniessen
Copy link
Member

I don't feel super strongly about this, but the benefit of minification (even if it's small), is that not everyone that is reading the docs will have fast internet connections.

Does this apply to the docs? I was under the impression that the website build pipeline does not touch the docs at all, since we generate the HTML files in node core.


But I might be wrong, so let's assume that this does apply to the docs. The largest doc file right now is the fs API documentation, so we can expect the largest difference there. The HTML file alone has a size of 375 KB, but Brotli encoding reduces that to 37 KB. Assuming that minification reduces the size by 20%, and making the simplified assumption that this will be reflected after compression, we will save less than 8 KB of data.

So how much does this benefit users with a slow internet connection? Let's assume that the user is on a very old 3G connection, and in a moving vehicle. That leaves us with an estimated download rate of 348kbit/s. At this network speed, saving 8 KB equals saving 188 milliseconds. (On the first page load only, after that, the browser will cache the document and the benefit diminishes.)

@XhmikosR
Copy link
Contributor

Does this apply to the docs? I was under the impression that the website build pipeline does not touch the docs at all, since we generate the HTML files in node core.

Nope, the core docs are generated in the main repo so this change doesn't apply there.

I think the wise thing would be to not land this patch.

@tniessen
Copy link
Member

Let's see if the author or one of the approving reviewers is against closing, ping @nschonni @MylesBorins @aymen94.

@XhmikosR
Copy link
Contributor

Alright, I'll rebase tomorrow and merge it and we see how it goes. I went with the settings that from my experience find the safest. The only thing I'm not sure about is the decode entities setting, which is what I wanted people to test with translations on #2537 (comment)

@bnb
Copy link
Contributor

bnb commented Nov 20, 2019

For what it's worth I'm not necessarily advocating for merging – just for us to make a more informed decision based on previous context we have around usage and if it'll be a meaningfully impactful change :)

@XhmikosR
Copy link
Contributor

Rebased and also updated the Preview site https://hopeful-bartik-af0f39.netlify.com/en/. Do note that the preview site is using gzip while we use brotli on prod

@SEWeiTung SEWeiTung merged commit 0cc3648 into nodejs:master Nov 24, 2019
@XhmikosR
Copy link
Contributor

@MaledongGit why was this squashed? Please do not blindly merge. I kept the patches separate for a reason. I was the author of the last 2 patches.

@XhmikosR
Copy link
Contributor

I'm gonna have to stay away from this repo until this never happens again.

@XhmikosR
Copy link
Contributor

Also just FYI, this takes ~7 min to build on prod.

Before this patch
stdout: > nodejs.org@ deploy /website
stdout: > npm-run-all load-schedule build:deploy external:survey
stdout: 
stdout: 
stdout: > nodejs.org@ load-schedule /website
stdout: > curl -sS https://raw.githubusercontent.com/nodejs/Release/master/schedule.json -o source/schedule.json
stdout: 
stdout: 
stdout: > nodejs.org@ build:deploy /website
stdout: > node build.js --preserveLocale
stdout: 
stdout: [ncp] build/static started
stdout: [sass] static/css started
stdout: [sass] static/css finished: 371.851ms
stdout: [metalsmith] build/ar started
stdout: [metalsmith] build/ca started
stdout: [metalsmith] build/de started
stdout: [metalsmith] build/en started
stdout: [metalsmith] build/es started
stdout: [metalsmith] build/fa started
stdout: [metalsmith] build/fr started
stdout: [metalsmith] build/gl started
stdout: [metalsmith] build/it started
stdout: [metalsmith] build/ja started
stdout: [metalsmith] build/ko started
stdout: [metalsmith] build/pt-br started
stdout: [metalsmith] build/ru started
stdout: [metalsmith] build/tr started
stdout: [metalsmith] build/uk started
stdout: [metalsmith] build/zh-cn started
stdout: [metalsmith] build/zh-tw started
stdout: [ncp] build/static finished: 43.397s
stdout: [metalsmith] build/gl finished: 2:07.771 (m:ss.mmm)
stdout: [metalsmith] build/fa finished: 2:07.802 (m:ss.mmm)
stdout: [metalsmith] build/fr finished: 2:07.832 (m:ss.mmm)
stdout: [metalsmith] build/ca finished: 2:07.869 (m:ss.mmm)
stdout: [metalsmith] build/de finished: 2:07.858 (m:ss.mmm)
stdout: [metalsmith] build/it finished: 2:07.825 (m:ss.mmm)
stdout: [metalsmith] build/tr finished: 2:07.780 (m:ss.mmm)
stdout: [metalsmith] build/es finished: 2:07.845 (m:ss.mmm)
stdout: [metalsmith] build/zh-tw finished: 2:07.771 (m:ss.mmm)
stdout: [metalsmith] build/ar finished: 2:07.911 (m:ss.mmm)
stdout: [metalsmith] build/en finished: 2:07.857 (m:ss.mmm)
stdout: [metalsmith] build/ja finished: 2:07.828 (m:ss.mmm)
stdout: [metalsmith] build/ko finished: 2:07.824 (m:ss.mmm)
stdout: [metalsmith] build/zh-cn finished: 2:07.776 (m:ss.mmm)
stdout: [metalsmith] build/pt-br finished: 2:07.819 (m:ss.mmm)
stdout: [metalsmith] build/uk finished: 2:07.780 (m:ss.mmm)
stdout: [metalsmith] build/ru finished: 2:07.814 (m:ss.mmm)

stdout: sent 102,378,051 bytes  received 293,220 bytes  5,866,929.77 bytes/sec
stdout: total size is 356,996,336  speedup is 3.48
This patch
stdout: > nodejs.org@ deploy /website
stdout: > npm-run-all load-schedule build:deploy external:survey
stdout: 
stdout: 
stdout: > nodejs.org@ load-schedule /website
stdout: > curl -sS https://raw.githubusercontent.com/nodejs/Release/master/schedule.json -o source/schedule.json
stdout: 
stdout: 
stdout: > nodejs.org@ build:deploy /website
stdout: > node build.js --preserveLocale
stdout: 
stdout: [ncp] build/static started
stdout: [sass] static/css started
stdout: [sass] static/css finished: 421.025ms
stdout: [metalsmith] build/ar started
stdout: [metalsmith] build/ca started
stdout: [metalsmith] build/de started
stdout: [metalsmith] build/en started
stdout: [metalsmith] build/es started
stdout: [metalsmith] build/fa started
stdout: [metalsmith] build/fr started
stdout: [metalsmith] build/gl started
stdout: [metalsmith] build/it started
stdout: [metalsmith] build/ja started
stdout: [metalsmith] build/ko started
stdout: [metalsmith] build/pt-br started
stdout: [metalsmith] build/ru started
stdout: [metalsmith] build/tr started
stdout: [metalsmith] build/uk started
stdout: [metalsmith] build/zh-cn started
stdout: [metalsmith] build/zh-tw started
stdout: [ncp] build/static finished: 45.596s
stdout: [metalsmith] build/gl finished: 7:06.471 (m:ss.mmm)
stdout: [metalsmith] build/ca finished: 7:06.524 (m:ss.mmm)
stdout: [metalsmith] build/de finished: 7:06.517 (m:ss.mmm)
stdout: [metalsmith] build/fa finished: 7:06.576 (m:ss.mmm)
stdout: [metalsmith] build/fr finished: 7:06.571 (m:ss.mmm)
stdout: [metalsmith] build/es finished: 7:06.583 (m:ss.mmm)
stdout: [metalsmith] build/it finished: 7:06.549 (m:ss.mmm)
stdout: [metalsmith] build/tr finished: 7:06.521 (m:ss.mmm)
stdout: [metalsmith] build/zh-tw finished: 7:06.509 (m:ss.mmm)
stdout: [metalsmith] build/ar finished: 7:06.700 (m:ss.mmm)
stdout: [metalsmith] build/pt-br finished: 7:06.600 (m:ss.mmm)
stdout: [metalsmith] build/ru finished: 7:06.594 (m:ss.mmm)
stdout: [metalsmith] build/uk finished: 7:06.585 (m:ss.mmm)
stdout: [metalsmith] build/ko finished: 7:06.607 (m:ss.mmm)
stdout: [metalsmith] build/zh-cn finished: 7:06.581 (m:ss.mmm)
stdout: [metalsmith] build/en finished: 7:06.658 (m:ss.mmm)
stdout: [metalsmith] build/ja finished: 7:06.612 (m:ss.mmm)

stdout: sent 99,517,386 bytes  received 282,204 bytes  5,117,927.69 bytes/sec
stdout: total size is 328,355,985  speedup is 3.29

IMO the gains don't justify the slowdown. Instead we could just use Cloudflare's minify HTML for this site only.

/me is out

@phillipj
Copy link
Member

phillipj commented Nov 24, 2019 via email

@XhmikosR
Copy link
Contributor

How is it intoxicating? I've asked numerous times to not change my commits or my commit messages, yet it keeps happening. I don't have any other way to communicate this.

Why would it seem normal to someone to squash patches from different authors is beyond my understanding. This is not fair to say the least and causes issues. For example, let's say this specific set of patches breaks something, but it's not @nschonni's initial patch. Why would @nschonni be blamed for my patches and vice versa?

I'm not sure how to document this. It's common sense for me to not squash patches from different authors without consent or if they are not trivial.

@MylesBorins
Copy link
Contributor

How is it intoxicating? I've asked numerous times to not change my commits or my commit messages, yet it keeps happening. I don't have any other way to communicate this.

There are other ways to communicate this such as explicitly requesting in the PR or attempting to make policy changes. Threatening to stop contributing without offering a solution isn't super helpful. That being said it seems to me you might not know how to go about making a change to how the repo is maintained.

The rules for how things should be landed can be found in our Collaborator Guide. It currently has no rules about squashing or not squashing... it specifically states how long PRs must remain open, that there must be consensus, and that it must run through CI if there are changes to executable code.

If you would like to implement new rules it could be done by submitting a PR against the collaborator guide and building consensus around those changes. TBH I'm a fan of squashing PRs into a single change unless there are explicit technical reasons to keep things separate. For example the PR to land the new ESM implementation in core was opened by me, co-authored by 5 people, originally made up of 20+ commits. We squashed to one to simplify maintenance, history, and backportability.

Why would @nschonni be blamed for my patches and vice versa?

I don't think this would ever happen TBH. We definitely do not operate any of the repos in the node.js org in a "blame" fashion. If something is broken, we fix it. I'm a huge fan of blameless post mortem's, looking at what allowed something to fail rather than blaming an individual. For example @MaledongGit did absolutely nothing wrong by landing this PR the way they did.

If you think there is a problem, it is with our rules and procedures (as mentioned above). So lets work on trying to change those. Although I will be honest that you will have to convince me about why the extra effort of preserving commits it worth the extra effort, I do promise to approach it with an open mind.

I'm not sure how to document this. It's common sense for me to not squash patches from different authors without consent or if they are not trivial.

Open a PR against the collab guide and lets take it from there. I'm sorry this has been so frustrating for you.

@XhmikosR
Copy link
Contributor

There are other ways to communicate this such as explicitly requesting in the PR or attempting to make policy changes. Threatening to stop contributing without offering a solution isn't super helpful. That being said it seems to me you might not know how to go about making a change to how the repo is maintained.

I wasn't threatening anyone, I was actually just mentioning my plans. But I do see how this may have sounded. Sorry for that, please read below why I was frustrated by this and some background info.

I don't think this would ever happen TBH. We definitely do not operate any of the repos in the node.js org in a "blame" fashion. If something is broken, we fix it. I'm a huge fan of blameless post mortem's, looking at what allowed something to fail rather than blaming an individual. For example @MaledongGit did absolutely nothing wrong by landing this PR the way they did.

I mean this in the git blame sense, not the literal blame sense. git log is a pretty valuable thing I'm using constantly, personally.

If you think there is a problem, it is with our rules and procedures (as mentioned above). So lets work on trying to change those. Although I will be honest that you will have to convince me about why the extra effort of preserving commits it worth the extra effort, I do promise to approach it with an open mind.

Well, to me, every change should be credited to person who did it. It's just the ethical thing to do, even if it's something trivial.

The fact that GitHub allows us now to squash PRs does not mean it should be done blindly. My patches in this PR were seemingly pretty trivial, but I'm a huge fan of giving credits where credits are due. It's a principle I follow all my life so far.

So, in order for me to provide this trivial patch, I had to do numerous tests and check if and when things break, which they did in the original patch.

Open a PR against the collab guide and lets take it from there. I'm sorry this has been so frustrating for you.

I wouldn't mind if it wasn't the first time someone changed my commit message to something I don't agree, and I have expressed my discomfort numerous times (in PRs though) without any success.

Open a PR against the collab guide and lets take it from there. I'm sorry this has been so frustrating for you.

I'll see what I can do. I'm not so good with words, and English isn't my native language. Also, I'd rather do something that has a biggest impact than docs, but I do acknowledge that docs are needed in big organizations, so I'll try to put something together.

@XhmikosR
Copy link
Contributor

Forgot to comment on this:

For example @MaledongGit did absolutely nothing wrong by landing this PR the way they did.

This is what's wrong IMHO. He squashed patches from different authors (me) without their consent.

@Trott
Copy link
Member

Trott commented Nov 24, 2019

Maybe the whole "We shouldn't squash by default in this repo" conversation might be better to have over in the discussion board? I think it's a perfectly fine conversation to have. The way this repo is managed has changed a lot in the last year or so and it would be good to get some documentation/agreement on what we do and how we do it.

@MylesBorins
Copy link
Contributor

This is what's wrong IMHO. He squashed patches from different authors (me) without their consent.

If you think it is wrong it is the process not the person, is what I was trying to point out

Well, to me, every change should be credited to person who did it. It's just the ethical thing to do, even if it's something trivial.

Perhaps we can enforce "if you are squashing comomits please include a co-authored-by label. AFAIK github does automatically generate this when squashing.

@MylesBorins
Copy link
Contributor

and to what trott said lets move this discussion to either an explicit thread or the discussion board so that we can keep it documented in a single place that is focused on improving our landing guifdelines

@XhmikosR
Copy link
Contributor

Is there a discussion board? Not super familiar with your internals, so bear with me :)

@Trott
Copy link
Member

Trott commented Nov 24, 2019

Is there a discussion board? Not super familiar with your internals, so bear with me :)

https://github.com/orgs/nodejs/teams/website

Sorry, I forgot that on GitHub, the discussions are for teams and not repos. So it would be that discussion board above.

@nschonni nschonni deleted the feat--minify-HTML-on-build branch November 24, 2019 17:09
@tniessen
Copy link
Member

@MaledongGit did absolutely nothing wrong by landing this PR the way they did.

Regardless of squashing, there was an active, ongoing discussion about whether this should land or not. The PR was approved, but that does not always mean that it is ready to land. Personally, I am still not convinced that this is a good addition.

@XhmikosR
Copy link
Contributor

XhmikosR commented Nov 24, 2019 via email

tniessen added a commit to tniessen/nodejs.org that referenced this pull request Nov 24, 2019
Trott pushed a commit that referenced this pull request Nov 27, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants