Check glyph CID difference across different SHS versions #259
Comments
Sorry for the delay in responding. One way you can do this is to compare the cmap mappings with fontTools; this works for any pair of fonts, not just these. Here's something I threw together in a couple of minutes, so it could definitely be improved, but it gives the basic idea. You just provide the first and second font and you will get the diffs.

    import sys
    from fontTools.ttLib import TTFont

    path1 = sys.argv[1]
    path2 = sys.argv[2]
    font1 = TTFont(path1)
    font2 = TTFont(path2)
    # getBestCmap() maps Unicode code points to glyph names
    cmap1 = font1['cmap'].getBestCmap()
    cmap2 = font2['cmap'].getBestCmap()

    def report_missing(cmap1, cmap2):
        # Report mappings in the first font that are not in the second font
        missing = []
        for uni in cmap1:
            if uni not in cmap2:
                missing.append(hex(uni))
        if missing:
            report = "\n".join(missing)
            print(f'Mappings not found in {path2}:\n{report}')

    def report_changed(cmap1, cmap2):
        # Report code points present in both fonts that map to different glyphs
        changed = []
        for uni, cid in cmap1.items():
            if uni in cmap2 and cmap2[uni] != cid:
                changed.append(f'{hex(uni)} {cid} -> {cmap2[uni]}')
        if changed:
            report = "\n".join(changed)
            print(f'Mappings differ between {path1} and {path2}:\n{report}')

    report_missing(cmap1, cmap2)
    report_changed(cmap1, cmap2)

For what you are asking there will be a lot of differences, but a really minor example is the diff between SourceHanSansJP-Regular.otf and NotoSansJP-Regular.otf (explained here):
I'm not sure where that image was generated. It looks like @tamcy has a lot of those. That's definitely useful, wherever it came from. @tamcy is that something that can be shared?
@punchcutter Sadly this can't be used for my issue, as I'm checking across different SHS versions, which have different CID mappings: new glyphs were introduced and old ones removed, drastically changing the CIDs in SHS between 1.004 (which Genne Gothic is based on) and 2.001. A few glyphs that Genne Gothic had used as substitutes (notably inherited glyphs that can be used in the inherited glyphs project) were removed from SHS 2.001. From the SHS Readme, page 28:
The picture below illustrates how CID changes do not reflect glyph changes across versions:

Unless there is software that can detect vector outline changes, or the official site can provide the full glyph/CID changes across versions, I don't think there's an easy way to find out which characters changed between SHS v1.004 and v2.001.
@NightFurySL2001 A script like this does exactly what you said about finding differences in CID mapping between different versions of SHS or any other fonts. If you also want to have visual differences then that's a completely different thing. I guess I'm not entirely clear what you are trying to do.
@punchcutter Well, I want to find which characters differ visually between Genne Gothic and SHS KR v2.001. The problem is that the same glyph from v1.004 will probably have a different CID number in v2.001, while different glyphs in v1.004 and v2.001 can share the same CID number. E.g., in the picture, CID-47131 (and a lot of others) has the same CID number but a different glyph shape (either pointing to the same character or to a different character entirely), while others whose glyphs are fully identical, like U+9F2D, have different CID numbers between versions, e.g. CID-47221 in v1.004 moved to CID-47220 in v2.001. I want to find out which glyphs differ in visual appearance between fonts; for example, the 龍 in Genne Gothic is different from the one in SHS KR v2.001, and was probably removed in v2.001.
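For the visual comparison described above, one rough option is to skip the CIDs entirely and compare outlines per code point. The sketch below is only an illustration, not something posted in this thread: it uses fontTools' `RecordingPen` to record the drawing commands of the glyph each code point maps to, and the helper names `load_outlines` and `diff_outlines` are made up for this example. It assumes both fonts use the same units per em and that an exact match of drawing commands is a good enough test of "same shape".

```python
from typing import Dict, List, Tuple


def load_outlines(path: str) -> Dict[int, Tuple]:
    """Map each code point in the font's best cmap to its recorded outline."""
    # Imported lazily so diff_outlines() can be used on plain dicts as well.
    from fontTools.ttLib import TTFont
    from fontTools.pens.recordingPen import RecordingPen

    font = TTFont(path)
    cmap = font["cmap"].getBestCmap()
    glyph_set = font.getGlyphSet()
    outlines = {}
    for uni, glyph_name in cmap.items():
        pen = RecordingPen()
        glyph_set[glyph_name].draw(pen)
        # pen.value is a list of (operator, operands) tuples
        outlines[uni] = tuple(pen.value)
    return outlines


def diff_outlines(outlines1: Dict[int, Tuple], outlines2: Dict[int, Tuple]) -> List[int]:
    """Return code points present in both fonts whose outlines are not identical."""
    return sorted(
        uni
        for uni, outline in outlines1.items()
        if uni in outlines2 and outlines2[uni] != outline
    )


# Hypothetical usage (the file names are placeholders):
#   old = load_outlines("GenneGothic.otf")
#   new = load_outlines("SourceHanSansKR-Regular.otf")
#   for uni in diff_outlines(old, new):
#       print(f"U+{uni:04X} {chr(uni)}")
```

Because the lookup goes through the cmap, CID renumbering between versions does not matter here. The comparison is strict, though: a glyph that was merely redrawn with different points, or shifted by one unit, is reported as different, and glyphs reachable only through IVS or OpenType features are not covered.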
@punchcutter Sorry for the late response. I have been very busy since last month. Indeed, what @NightFurySL2001 posted in the first post came from a tool that I wrote. It was developed for two purposes: (1) to help me inspect and identify issues regarding the Source Han Sans HK release so that I can report them, and (2) to help the development of my "opinionated" version of the font, Chiron Sans HK.

The tool, which is simply a webpage printing characters in different regions on each row, helps me decide which glyph to use in my font, or whether a redesign is needed for a codepoint (okay, there's a search function, but you get my point). Because of this goal, I made a lot of assumptions when writing it. In addition, it's HK-version centric (the HK reference glyph is shown alongside the TW glyph), and you can only view the glyphs by codepoint, not by CID. Non-default glyphs, like those only accessible through IVS, are currently not supported. So this probably isn't what @NightFurySL2001 is looking for.

All Source Han Sans related information used by this tool came from this repository. In particular, the CMap file and the CID file are very useful to me. The former describes how codepoints are mapped to CIDs, while the latter contains a map of CIDs to glyph names, which looks like the "CID List" @NightFurySL2001 had asked for.
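To illustrate how a CID-to-glyph-name list could be used to follow glyphs across versions, here is a minimal sketch. The exact layout of the CID file is not shown in this thread, so the code assumes each useful line starts with a CID followed by whitespace and a glyph name; it also assumes glyph names stay stable between versions, which is not guaranteed. The function names `parse_cid_list` and `cid_moves` are hypothetical.

```python
from typing import Dict, List, Tuple


def parse_cid_list(text: str) -> Dict[str, int]:
    """Return {glyph name: CID} from lines of the assumed form 'CID<whitespace>name'."""
    names = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].isdigit():
            continue  # skip blank lines and anything not matching the assumed layout
        names[fields[1]] = int(fields[0])
    return names


def cid_moves(
    old: Dict[str, int], new: Dict[str, int]
) -> Tuple[List[Tuple[str, int, int]], List[str]]:
    """Match glyphs by name; return (renumbered glyphs, names missing from new)."""
    moved = sorted(
        (name, cid, new[name])
        for name, cid in old.items()
        if name in new and new[name] != cid
    )
    removed = sorted(name for name in old if name not in new)
    return moved, removed


# Hypothetical usage (the paths are placeholders for the two versions' CID files):
#   old = parse_cid_list(open("v1.004/cid-list.txt", encoding="utf-8").read())
#   new = parse_cid_list(open("v2.001/cid-list.txt", encoding="utf-8").read())
#   moved, removed = cid_moves(old, new)
```

A glyph that keeps its name but gets a new CID shows up in `moved`, and a name that disappears shows up in `removed`. A matching name does not prove the outline is unchanged, so this only narrows down which glyphs need a visual check.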
@tamcy Thank you for your reply.
中易宋體 has tons of errors compared to the latest GB18030. Better not to rely on it.
I have contacted a developer of the project and got the info that I need. Closing this issue as solved.
With a little help from my friend in sorting through all the info, here is a full list:
Is there any list/file that specifies the CID-glyph relationship? I am trying to get a modified subset from Genne Gothic compared to the SHS 2.001 KR version, but some of the CIDs in cmap changed between 1.004 (Genne Gothic) and 2.001 (SHS). Is there any documentation of the CID list or of CID changes between versions?

P/S: also, how is this image generated? Is there any software that lets devs look through all the CIDs and variants per font?