Introduce a font name parser #717

Closed
wants to merge 8 commits into from

Conversation

Finii
Collaborator

@Finii Finii commented Dec 15, 2021

[why]
A lot of the fonts have incorrect naming after patching. A completely
different approach can help to come up with a consistent naming scheme.

[how]
See README

Requirements / Checklist

  • Read the Contributing Guidelines
  • Read or at least glanced at the FAQ
  • Read or at least glanced at the Wiki
  • Scripts execute without error (if necessary):
    • If any of the scripts were modified they have been tested and execute without error, e.g.:
      • ./font-patcher Inconsolata.otf --fontawesome --octicons --pomicons
      • ./gotta-patch-em-all-font-patcher\!.sh Hermit
  • Extended the README and documentation if necessary, e.g. You added a new font please update the table

What does this Pull Request (PR) do?

How should this be manually tested?

Any background context you can provide?

In general (MS or Apple, is the same):
https://docs.microsoft.com/en-us/typography/opentype/spec/name#name-ids
https://www.fonttutorials.com/how-to-name-font-family/

About the length limits:
https://adobe-type-tools.github.io/font-tech-notes/pdfs/5088.FontNames.pdf
https://forum.glyphsapp.com/t/overly-strict-font-name-max-length-recommendation-in-naming-tutorial/10164
fonttools/fontbakery#1488
https://typedrawers.com/discussion/617/family-name/p2

What are the relevant tickets (if any)?

Screenshots (if appropriate or helpful)

Edit: Add all links in Background paragraph

@Finii Finii force-pushed the feature/rewrite-setup_font_names branch from 77e89f0 to 71e48f9 Compare December 15, 2021 11:07
[why]
Code Climate configuration for Python is per default rather restrictive
and would force us to split all stuff into too many files.

[how]
Have some faith in the cognitive capabilities of people.

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
DO NOT MERGE

[why]
A lot of the fonts have incorrect naming after patching. A completely
different approach can help to come up with a consistent naming scheme.

[how]
See bin/scripts/name-parser/README.md

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
[why]
Some CJK fonts seem to have no Fullname.

[how]
But they have a Postscript name. Use that for parsing the names.

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
@Finii Finii changed the title Draft: Introduce a file name parser Draft: Introduce a font name parser Dec 15, 2021
@Finii
Collaborator Author

Finii commented Dec 16, 2021

Here is an example.

This is how the fonts come out now:

 |PSname                                             | | Fullname                                           | | Family                         | | Subfamily                      | | Typogr. Family                 | | Typogr. Subfamily
 |-------------------------------------------------- |-| -------------------------------------------------- |-| ------------------------------ |-| ------------------------------ |-| ------------------------------ |-| ------------------------------
 |CaskaydiaCoveNerdFontComplete-BoldItalic           | | Caskaydia Cove Bold Italic Nerd Font Complete      | | CaskaydiaCove Nerd Font        | | Bold Italic                    | | CaskaydiaCove Nerd Font        | |
 |CaskaydiaCoveNerdFontComplete-Bold                 | | Caskaydia Cove Bold Nerd Font Complete             | | CaskaydiaCove Nerd Font        | | Bold                           | | CaskaydiaCove Nerd Font        | |
 |CaskaydiaCoveNerdFontComplete-ExtraLightItalic     | | Caskaydia Cove ExtraLight Italic Nerd Font Complet | | Cascadia Code ExtraLight       | | Italic                         | | CaskaydiaCove Nerd Font        | | ExtraLight Italic
 |CaskaydiaCoveNerdFontComplete-ExtraLight           | | Caskaydia Cove ExtraLight Nerd Font Complete       | | Cascadia Code ExtraLight       | | Regular                        | | CaskaydiaCove Nerd Font        | | ExtraLight
 |CaskaydiaCoveNerdFontComplete-LightItalic          | | Caskaydia Cove Light Italic Nerd Font Complete     | | Cascadia Code Light            | | Italic                         | | CaskaydiaCove Nerd Font        | | Light Italic
 |CaskaydiaCoveNerdFontComplete-Light                | | Caskaydia Cove Light Nerd Font Complete            | | Cascadia Code Light            | | Regular                        | | CaskaydiaCove Nerd Font        | | Light
 |CaskaydiaCoveNerdFontComplete-Italic               | | Caskaydia Cove Italic Nerd Font Complete           | | CaskaydiaCove Nerd Font        | | Italic                         | | CaskaydiaCove Nerd Font        | |
 |CaskaydiaCoveNerdFontComplete-Regular              | | Caskaydia Cove Regular Nerd Font Complete          | | CaskaydiaCove Nerd Font        | | Regular                        | | CaskaydiaCove Nerd Font        | |
 |CaskaydiaCoveNerdFontComplete-SemiBoldItalic       | | Caskaydia Cove SemiBold Italic Nerd Font Complete  | | Cascadia Code SemiBold         | | Italic                         | | CaskaydiaCove Nerd Font        | | SemiBold Italic
 |CaskaydiaCoveNerdFontComplete-SemiBold             | | Caskaydia Cove SemiBold Nerd Font Complete         | | Cascadia Code SemiBold         | | Regular                        | | CaskaydiaCove Nerd Font        | | SemiBold
 |CaskaydiaCoveNerdFontComplete-SemiLightItalic      | | Caskaydia Cove SemiLight Italic Nerd Font Complete | | Cascadia Code SemiLight        | | Italic                         | | CaskaydiaCove Nerd Font        | | SemiLight Italic
 |CaskaydiaCoveNerdFontComplete-SemiLight            | | Caskaydia Cove SemiLight Nerd Font Complete        | | Cascadia Code SemiLight        | | Regular                        | | CaskaydiaCove Nerd Font        | | SemiLight

Everything looks fine, but ... Family is unchanged for some/most fonts (because of unexpected fontforge behavior). And because this font has an RFN (Reserved Font Name), we need to change the Family (this is where the problems start).

If we were to 'write what we want' we would end up with this:

 |PSname                                             | | Fullname                                           | | Family                         | | Subfamily                      | | Typogr. Family                 | | Typogr. Subfamily
 |-------------------------------------------------- |-| -------------------------------------------------- |-| ------------------------------ |-| ------------------------------ |-| ------------------------------ |-| ------------------------------
 |CaskaydiaCoveNerdFont-BoldItalic                   | | Caskaydia Cove Bold Italic Nerd Font               | | CaskaydiaCove Nerd Font        | | Bold Italic                    | | CaskaydiaCove Nerd Font        | |
 |CaskaydiaCoveNerdFont-Bold                         | | Caskaydia Cove Bold Nerd Font                      | | CaskaydiaCove Nerd Font        | | Bold                           | | CaskaydiaCove Nerd Font        | |
 |CaskaydiaCoveNerdFont-ExtraLightItalic             | | Caskaydia Cove ExtraLight Italic Nerd Font         | | CaskaydiaCove Nerd Font        | | ExtraLightItalic               | | CaskaydiaCove Nerd Font        | | ExtraLight Italic
 |CaskaydiaCoveNerdFont-ExtraLight                   | | Caskaydia Cove ExtraLight Nerd Font                | | CaskaydiaCove Nerd Font        | | ExtraLight                     | | CaskaydiaCove Nerd Font        | | ExtraLight
 |CaskaydiaCoveNerdFont-Italic                       | | Caskaydia Cove Italic Nerd Font                    | | CaskaydiaCove Nerd Font        | | Italic                         | | CaskaydiaCove Nerd Font        | |
 |CaskaydiaCoveNerdFont-LightItalic                  | | Caskaydia Cove Light Italic Nerd Font              | | CaskaydiaCove Nerd Font        | | LightItalic                    | | CaskaydiaCove Nerd Font        | | Light Italic
 |CaskaydiaCoveNerdFont-Light                        | | Caskaydia Cove Light Nerd Font                     | | CaskaydiaCove Nerd Font        | | Light                          | | CaskaydiaCove Nerd Font        | | Light
 |CaskaydiaCoveNerdFont-Regular                      | | Caskaydia Cove Regular Nerd Font                   | | CaskaydiaCove Nerd Font        | | Regular                        | | CaskaydiaCove Nerd Font        | |
 |CaskaydiaCoveNerdFont-SemiBoldItalic               | | Caskaydia Cove SemiBold Italic Nerd Font           | | CaskaydiaCove Nerd Font        | | SemiBoldItalic                 | | CaskaydiaCove Nerd Font        | | SemiBold Italic
 |CaskaydiaCoveNerdFont-SemiBold                     | | Caskaydia Cove SemiBold Nerd Font                  | | CaskaydiaCove Nerd Font        | | SemiBold                       | | CaskaydiaCove Nerd Font        | | SemiBold
 |CaskaydiaCoveNerdFont-SemiLightItalic              | | Caskaydia Cove SemiLight Italic Nerd Font          | | CaskaydiaCove Nerd Font        | | SemiLightItalic                | | CaskaydiaCove Nerd Font        | | SemiLight Italic
 |CaskaydiaCoveNerdFont-SemiLight                    | | Caskaydia Cove SemiLight Nerd Font                 | | CaskaydiaCove Nerd Font        | | SemiLight                      | | CaskaydiaCove Nerd Font        | | SemiLight

The names in Family now use our new name, but they no longer contain the weight (compare with the Cascadia Family names above), e.g. SemiLight. Instead the weight ended up in the SubFamily.

The Family/SubFamily grouping is for sets of four fonts that have combinations of Bold and Italic, not for weights.
We want to write the same stuff there that belongs in the Typographic Family / Subfamily fields, which allow an indefinite number of fonts to be grouped... (BTW: We do not try to change the Typographic Subfamily at all).

Obviously what we want is not correct.

This is not so easy to tackle with the current naming code; and the reason I started the complete thing from scratch.
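
To make the grouping rule described above concrete, here is a minimal Python sketch. It is not the code from this MR: the function name `legacy_and_typographic` and its parameters are hypothetical, and it assumes the weight and the italic flag have already been determined. Only Regular, Italic, Bold and Bold Italic may appear in the SubFamily; any other weight has to move into the Family, while the typographic pair keeps the plain family name and the full style.

```python
def legacy_and_typographic(base, weight, italic):
    """Return (family, subfamily, typo_family, typo_subfamily).

    Hypothetical helper for illustration only, not part of font-patcher.
    """
    # Weights other than Regular/Bold cannot live in the four-style
    # SubFamily, so they are appended to the legacy Family name instead.
    extra_weight = weight not in ("", "Regular", "Bold")
    family = base + (" " + weight if extra_weight else "")

    # Legacy SubFamily: one of Regular, Italic, Bold, Bold Italic.
    styles = []
    if weight == "Bold":
        styles.append("Bold")
    if italic:
        styles.append("Italic")
    subfamily = " ".join(styles) or "Regular"

    # Typographic names: plain family plus the full weight/style description.
    typo_parts = []
    if weight not in ("", "Regular"):
        typo_parts.append(weight)
    if italic:
        typo_parts.append("Italic")
    typo_subfamily = " ".join(typo_parts) or "Regular"

    return family, subfamily, base, typo_subfamily


print(legacy_and_typographic("CaskaydiaCove Nerd Font", "SemiLight", True))
```

Unlike the tables in this comment, the sketch always fills the typographic fields, even when they repeat the legacy ones.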

A compromise is to Really set Family to what we want (a variant of #690).
But keep SubFamily untouched (accidentally).

With that it would look like this:

 |PSname                                             | | Fullname                                           | | Family                         | | Subfamily                      | | Typogr. Family                 | | Typogr. Subfamily
 |-------------------------------------------------- |-| -------------------------------------------------- |-| ------------------------------ |-| ------------------------------ |-| ------------------------------ |-| ------------------------------
 |CaskaydiaCoveNerdFont-BoldItalic                   | | Caskaydia Cove Bold Italic Nerd Font               | | CaskaydiaCove Nerd Font        | | Bold Italic                    | | CaskaydiaCove Nerd Font        | |
 |CaskaydiaCoveNerdFont-Bold                         | | Caskaydia Cove Bold Nerd Font                      | | CaskaydiaCove Nerd Font        | | Bold                           | | CaskaydiaCove Nerd Font        | |
 |CaskaydiaCoveNerdFont-ExtraLightItalic             | | Caskaydia Cove ExtraLight Italic Nerd Font         | | CaskaydiaCove Nerd Font        | | Italic                         | | CaskaydiaCove Nerd Font        | | ExtraLight Italic
 |CaskaydiaCoveNerdFont-ExtraLight                   | | Caskaydia Cove ExtraLight Nerd Font                | | CaskaydiaCove Nerd Font        | | Regular                        | | CaskaydiaCove Nerd Font        | | ExtraLight
 |CaskaydiaCoveNerdFont-Italic                       | | Caskaydia Cove Italic Nerd Font                    | | CaskaydiaCove Nerd Font        | | Italic                         | | CaskaydiaCove Nerd Font        | |
 |CaskaydiaCoveNerdFont-LightItalic                  | | Caskaydia Cove Light Italic Nerd Font              | | CaskaydiaCove Nerd Font        | | Italic                         | | CaskaydiaCove Nerd Font        | | Light Italic
 |CaskaydiaCoveNerdFont-Light                        | | Caskaydia Cove Light Nerd Font                     | | CaskaydiaCove Nerd Font        | | Regular                        | | CaskaydiaCove Nerd Font        | | Light
 |CaskaydiaCoveNerdFont-Regular                      | | Caskaydia Cove Regular Nerd Font                   | | CaskaydiaCove Nerd Font        | | Regular                        | | CaskaydiaCove Nerd Font        | |
 |CaskaydiaCoveNerdFont-SemiBoldItalic               | | Caskaydia Cove SemiBold Italic Nerd Font           | | CaskaydiaCove Nerd Font        | | Italic                         | | CaskaydiaCove Nerd Font        | | SemiBold Italic
 |CaskaydiaCoveNerdFont-SemiBold                     | | Caskaydia Cove SemiBold Nerd Font                  | | CaskaydiaCove Nerd Font        | | Regular                        | | CaskaydiaCove Nerd Font        | | SemiBold
 |CaskaydiaCoveNerdFont-SemiLightItalic              | | Caskaydia Cove SemiLight Italic Nerd Font          | | CaskaydiaCove Nerd Font        | | Italic                         | | CaskaydiaCove Nerd Font        | | SemiLight Italic
 |CaskaydiaCoveNerdFont-SemiLight                    | | Caskaydia Cove SemiLight Nerd Font                 | | CaskaydiaCove Nerd Font        | | Regular                        | | CaskaydiaCove Nerd Font        | | SemiLight

Now the Family is the same for all fonts, which puts too many fonts into one Family, but anyhow. Because we do not touch the SubFamily, it at least makes some sense if the user installs only one weight. Which can be the case; is often the case? I don't know.

And there are other issues aplenty. One is this:

The PostScript font name "CaskaydiaCoveNerdFont-Bold Italic" is invalid.
It should be printable ASCII, must not contain (){}[]<>%/ or space and must be shorter than 63 characters
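
The rule in that message is easy to check up front. The following is a small Python sketch of such a check, written only from the wording of the message above; the function name `is_valid_postscript_name` is hypothetical and the limits are taken literally from the message, not from the fontforge sources.

```python
# Characters the message lists as forbidden, plus the space.
FORBIDDEN = set("(){}[]<>%/ ")


def is_valid_postscript_name(name, max_len=63):
    """Check a PostScript name against the rule quoted in the message."""
    # "must be shorter than 63 characters"
    if len(name) >= max_len:
        return False
    # Printable ASCII without the space is the range 33..126.
    return all(33 <= ord(c) <= 126 and c not in FORBIDDEN for c in name)


print(is_valid_postscript_name("CaskaydiaCoveNerdFont-Bold Italic"))
print(is_valid_postscript_name("CaskaydiaCoveNerdFont-BoldItalic"))
```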

What font-patcher needs is some understanding of how fonts are to be grouped.

This MR adds a class FontnameTools, where the name is analyzed and classified into

  • Name Base
  • Name Suffix
  • Weight
  • Style
  • Other naming parts
  • Unclassified rest

Once you have this information it is easy to assemble all name fields in a consistent manner (FontnameParser).
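
As a rough illustration of the classification idea (a simplified sketch, not the actual FontnameTools code), the following Python snippet splits a name into base, weight, style and an unclassified rest. The token lists and the function name `classify_name` are assumptions made for this example, and the 'Name Suffix' and 'Other naming parts' categories are left out.

```python
# Assumed, deliberately short token lists; the real classifier knows more.
WEIGHTS = {"ExtraLight", "Light", "SemiLight", "SemiBold", "Bold"}
STYLES = {"Regular", "Italic", "Oblique"}


def classify_name(name):
    """Split a font name into base / weight / style / rest."""
    parts = {"base": [], "weight": [], "style": [], "rest": []}
    seen_classified = False
    for token in name.split():
        if token in WEIGHTS:
            parts["weight"].append(token)
            seen_classified = True
        elif token in STYLES:
            parts["style"].append(token)
            seen_classified = True
        elif seen_classified:
            # Unknown token after the first weight/style: keep it aside.
            parts["rest"].append(token)
        else:
            # Everything before the first weight/style is the base name.
            parts["base"].append(token)
    return {key: " ".join(tokens) for key, tokens in parts.items()}


print(classify_name("Cascadia Code SemiLight Italic"))
```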

I guess a lot of this is already in the README file, but I wanted to add the images here.

Edit: Correct Really set SubFamily to what we want to Really set Family to what we want

@Finii Finii mentioned this pull request Dec 16, 2021
5 tasks
@Finii
Collaborator Author

Finii commented Dec 23, 2021

Just stumbled over #107 with this quote:

When using the windows-compatible named fonts on MacOS, I found that some font families where not being handled properly. Different weights of the same font would conflict with each other. This was caused by the font name being the identical across the font weights. I solved this by tweaking how the windows name is generated, and ensure it is unique for each font weight. IMHO, it would be better to have only one naming convention that works everywhere. This would eliminate half of the binary fonts and generation time.

Well, I guess this is a solution. Except maybe for the limited Family name length on Windows. But then, maybe that is fixed in Windows now. I can check whether the (supposed) length restriction is ever violated; if not, we can surely drop the 'for Windows' prepatched fonts.

[why]
Under certain circumstances the WWS names (Family and Subfamily) are
used to identify a font. We do not touch these SFNT table entries, so
when the font is renamed these are wrong (have the original name).
Font-grouping will go wrong then.

[how]
The typographic ('Preferred') Family and Styles are set correctly
already and they follow the WWS pattern, so the WWS fields can be (and
should be) empty. They exist to allow font grouping in the case where
the typographic names do not follow that pattern.

Remove preexisting WWS entries (because they are not needed anymore,
otherwise we would need to write the corrected new names there).
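
A minimal Python sketch of that removal step, working on plain tuples so that it runs without fontforge. It assumes name entries shaped like (language, string id, value) and the string id labels 'WWS Family' and 'WWS Subfamily'; the helper name `drop_wws_names` and the sample entries are made up for this example.

```python
# Assumed string id labels of the two WWS name table entries.
WWS_IDS = {"WWS Family", "WWS Subfamily"}


def drop_wws_names(sfnt_names):
    """Return the name entries without any WWS Family/Subfamily entry."""
    return tuple(entry for entry in sfnt_names if entry[1] not in WWS_IDS)


# Made-up sample data in (language, string id, value) form.
names = (
    ("English (US)", "Family", "Example Nerd Font"),
    ("English (US)", "WWS Family", "Example"),
    ("English (US)", "WWS Subfamily", "Regular"),
)
print(drop_wws_names(names))
```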

We already set the WWS bit in fsSelection that is needed:
    def fs_selection(self, fs):
        """Modify a given fsSelection value for current name, bits 0, 5, 6, 8, 9 touched"""
        [...]
        b |= WWS # We assert this by our naming process
        return b

Unfortunately we have no way (yet) to set fsSelection.
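
For readers unfamiliar with the bit layout, here is a self-contained Python sketch of how the five touched bits could be handled. The bit positions follow the OS/2 fsSelection definition (0 italic, 5 bold, 6 regular, 8 WWS, 9 oblique); the function name `fs_selection_sketch` and its boolean parameters are hypothetical and stand in for the elided part of the method above.

```python
ITALIC = 1 << 0
BOLD = 1 << 5
REGULAR = 1 << 6
WWS = 1 << 8
OBLIQUE = 1 << 9


def fs_selection_sketch(fs, bold, italic, oblique=False):
    """Return fs with bits 0, 5, 6, 8 and 9 recomputed; other bits are kept."""
    touched = ITALIC | BOLD | REGULAR | WWS | OBLIQUE
    b = fs & ~touched
    if italic:
        b |= ITALIC
    if bold:
        b |= BOLD
    if oblique:
        b |= OBLIQUE
    if not bold and not italic:
        # REGULAR must only be set when BOLD and ITALIC are both clear.
        b |= REGULAR
    b |= WWS  # asserted by the naming process, as in the snippet above
    return b


print(hex(fs_selection_sketch(0, bold=True, italic=True)))
```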

This is only the case for Iosevka for all fonts in src/unpatched-fonts.

Reported-by: Rui Ming (Max) Xiong <xsrvmy>
Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
@Finii
Collaborator Author

Finii commented Jan 4, 2022

@xsrvmy: Now I understand the issue. We need to remove the WWS entries, so that the Typographic entries are used. They are set with the correct names already, and as they follow WWS no extra WWS entries are needed (and existing ones have to be removed).

I missed the step in parens, because no font in src/unpatched-fonts/ (except Iosevka) has WWS set.

Thanks for pointing that out.

I'm still looking for a nice way to modify fsSelection... Solved 🎉

fontforge has an undocumented call to set the fsSelection bits.
Never rely on documentation :-(

Found this here:
fontforge/fontforge#3174

And the readback values are actually not read from the source font, so
we do not use them.

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
[why]
A lot of people expect the font-patcher to be a stand alone script. They
even expect the source glyphs (symbols) to be added to somehow be
magically there, and one PR makes sure that they are fetched if missing.

The same problem arises when we have a script distributed over multiple
files. For maintenance reasons and code quality this is what one wants.
But that might hinder easy use of the font-patcher.

[how]
Put all the code in the main script.

That has an additional drawback: For the nameparser_test* scripts to
work we need stand alone files for those classes. Now the code is
duplicated and will get out of sync.

I have no solution for that, and it all boils down to what Nerd Fonts
wants to do.

One solution would be to have font-patcher properly set up / divided in
many .py files, and to create one monolithic font-patcher from all the
sources on demand (via github actions or manually when someone pushes
changes to any of the constituents). That approach is taken by a lot of
C++ 'header only libraries' that originally consist of a lot of files but
create one big 'all in one' file automatically from all the small files.

For now I guess we can live with the duplication, but we need to think
about a solution, as this will bite us sooner or later.

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
@Finii Finii marked this pull request as ready for review January 4, 2022 09:17
@Finii Finii changed the title Draft: Introduce a font name parser Introduce a font name parser Jan 4, 2022
@Finii
Collaborator Author

Finii commented Jan 4, 2022

I have forgotten what I wanted to add to these comments before review. Probably everything has been said already, but unfortunately all over the place and not only here. Maybe it is sufficient anyhow.

Please let me know if you need more information and / or examples.


What can be done in CI is to run a name_parser_test2 on any changed source-font, and it shall not find any new 'Issues'.
I would not run it on all fonts, because that takes ages.
If new fonts are added one needs to run the test2 and replace its known_issues file (with known_issues.new, that will be created).

@Finii
Collaborator Author

Finii commented Jan 5, 2022

fontforge/fontforge#4877

[why]
We want to patch Cascadia with `--parser` while all other fonts shall be
patched as before.

[how]
Use the config.cfg file that each source font can have to specify one
arbitrary option to the font-patcher calls.

This is just set in Cascadia's config.cfg, but can be extended to other
fonts gradually.

In this way the stand alone `font-patcher` works as before, unless
someone adds the `--parser` option. Which probably will become the
recommended way to use it over time.

The patch-em-all script on the other hand can be instructed to use or
not to use --parser on a font by font basis via their cfg file.

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
[why]
The fontname for Windows can be quite unusable, for example
  `CaskaydiaCoveNerdFontCompleteM-`
for several different fonts, as this is the maximum allowed length of 31
characters that is enforced.

The style/weight is completely lost.

[how]
Split the name into base and style (at a dash `-`) and just shrink the
base name. Result for example:
  `CaskaydiaCoveN-ExtraLightItalic`

Use the same approach for the PostScriptName (although it is less likely
that the length limit is ever met).

Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
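
The shortening described in this commit message can be sketched in a few lines of Python. This is an illustration under assumptions, not the committed code: the function name `shorten_font_name` and the fallback for names without a dash are made up, and the 31 character limit is the one quoted in the message.

```python
def shorten_font_name(name, max_len=31):
    """Shrink only the base part of 'Base-Style' so the style survives."""
    if len(name) <= max_len:
        return name
    base, dash, style = name.partition("-")
    # Number of base characters that still fit in front of '-Style'.
    keep = max_len - len(dash) - len(style)
    if not dash or keep < 1:
        # No style part (or it is too long itself): plain truncation.
        return name[:max_len]
    return base[:keep] + dash + style


print(shorten_font_name("CaskaydiaCoveNerdFontCompleteM-ExtraLightItalic"))
```

With the long Cascadia name this keeps `ExtraLightItalic` intact and cuts the base down to `CaskaydiaCoveN`, matching the example in the message.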
@Finii
Collaborator Author

Finii commented Jan 10, 2022

Pulled last two commits developed while working on #723 that would really belong here.

CodeClimate has some issue, but I can not see the job page... trying ...

The short names we generate for Windows are probably not needed, or only for very old software. The exact origin of that issue is hard to track down; there is talk about Microsoft Word 2011 for Mac having had it. Maybe we should try to run without the short names. I for my part usually use the 'normal' fonts with no issues on Windows 10.

See also
fontforge/fontforge@dadd4e5 (authored 2006!)
https://typedrawers.com/discussion/617/family-name/p2
https://www.fonttutorials.com/how-to-name-font-family/

@Finii
Collaborator Author

Finii commented Jan 13, 2022

Note to self: Check regression #526

@Finii
Collaborator Author

Finii commented Apr 6, 2022

I guess this is finally superseded by #723, as that contains THIS and some more.

@torarnv
Contributor

torarnv commented Apr 10, 2022

One thing I also noticed in the current state of the code (not applying this patch), is that:

  • The family name is pulled out of the PS name, resulting in "Avenir Next" becoming "AvenirNext" in the patched font
  • The family name and full name not agreeing on where to inject the "Nerd Font" part:
    • Family name: Menlo Nerd Font
    • Full name: Menlo Regular Nerd Font
      • This should be "Menlo Nerd Font Regular" I think

@Finii
Collaborator Author

Finii commented Apr 10, 2022

The family name is pulled out of the PS name, resulting in "Avenir Next" becoming "AvenirNext" in the patched font

The code in this MR keeps the blank, I believe.
Relevant code here

def camel_explode(word):
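
Only the signature is shown above. As a guess at the kind of logic behind such a function (an assumption, not the body from this MR), a camel-case word can be split at every lower-to-upper transition. The name `camel_explode_sketch` is hypothetical.

```python
import re


def camel_explode_sketch(word):
    """Insert a blank wherever a lowercase letter is followed by an uppercase one."""
    return re.sub(r"([a-z])([A-Z])", r"\1 \2", word)


print(camel_explode_sketch("AvenirNext"))
```

Applied to a family name taken from the PostScript name, this would turn "AvenirNext" back into "Avenir Next".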

  • The family name and full name not agreeing on where to inject the "Nerd Font" part:

    • Family name: Menlo Nerd Font

    • Full name: Menlo Regular Nerd Font

      • This should be "Menlo Nerd Font Regular" I think

You are correct. It's fixed by this MR. This is the reason why the code must be 'smart' enough to differentiate between 'name parts' and 'style/weight' parts.

@Finii
Collaborator Author

Finii commented Apr 10, 2022

* This should be "Menlo Nerd Font Regular" I think

You are correct.

The FamilyName should of course not contain Regular, probably you mean FontName. (very few fonts have that).

For those fonts it can be selected... see

def set_keep_regular_in_family(self, keep):

Here is a list of the changes that this MR makes (+: new, -: old, >: original) for all our prepatched fonts.
https://github.com/ryanoasis/nerd-fonts/blob/feature/cascadia-2111.01/bin/scripts/name_parser/name_parser_test2.known_issues

@Finii
Collaborator Author

Finii commented Apr 10, 2022

The family name is pulled out of the PS name, resulting in "Avenir Next" becoming "AvenirNext" in the patched font

This 'camelcasing' (i.e. removing blank between name parts) is a Nerd Fonts quirk 😬
Can be turned on or off ;-) with this MR:

def enable_short_families(self, camelcase_name, prefix):

@torarnv
Contributor

torarnv commented Apr 10, 2022

The FamilyName should of course not contain Regular, probably you mean FontName. (very few fonts have that).

Right, in the above example the family name should be "Menlo Nerd Font", and the full name "Menlo Nerd Font Regular".

Which seems to be the pattern followed by this, which is great:

https://github.com/ryanoasis/nerd-fonts/blob/feature/cascadia-2111.01/bin/scripts/name_parser/name_parser_test2.known_issues

@Finii
Collaborator Author

Finii commented Apr 10, 2022

Just noticed that Meslo is missing from the name_parser_test2.known_issues 'database'??!
Which is strange, because Meslo is mentioned in a comment in name_parser_test2.
That's a bug. Need to investigate.

...

According to the README one has to call (from the bin/scripts/name-parser/ directory)
fontforge name_parser_test2 ../../../src/unpatched-fonts/**/*.[ot]tf 2>/dev/null

And it finds the Meslo fonts, so why are they missing?

$ ll  ../../../src/unpatched-fonts/**/*.[ot]tf | grep Meslo
-rw-rw-r-- 1 fini fini  442516 Mär 11 10:46 ../../../src/unpatched-fonts/Meslo/L/Bold-Italic/Meslo LG L Bold Italic for Powerline.ttf
[...]

Puzzling

@Finii
Collaborator Author

Finii commented Aug 22, 2022

This has been pulled indirectly via #723
