Letter and word spacing #171

therahedwig · 2022-05-02T13:06:20Z

Implements letter and word spacing. This patch applies on top of #169

As discussed in #166 , LTR has the spacing added to the advance, RTL added to offset and advance, TTB spacing 'removed' from the advance because TTB advance is negative.

Fixes #166

therahedwig · 2022-05-05T10:34:40Z

I've rebased it o it doesn't apply on the baseline patch anymore, because I am increasingly skeptical I can get that one to work.

src/raqm.c

khaledhosny · 2022-05-06T02:30:04Z

I think the letter spacing logic needs to be reworked. You currently iterate over input code points, but you need to iterate over grapheme clusters (not to be confused with HarfBuzz clusters) and add the letter spacing to the last code point in the cluster.

Then, when adding the spacing to the glyphs themselves, you need to add it to the last glyph (in visual order) in any sequence of glyphs that has the same HarfBuzz cluster. You can test Latin text with decomposed combining marks (e.g. x̣v̇) and Arabic text with vowel marks (e.g. وَرِقْ).

Furthermore, you need to disable ligatures when letter spacing is used (at least liga features, but I think browsers disable a few other features as well), otherwise text with ligatures will be unevenly spaced.

therahedwig · 2022-05-07T13:11:55Z

I think the letter spacing logic needs to be reworked. You currently iterate over input code points, but you need to iterate over grapheme clusters (not to be confused with HarfBuzz clusters) and add the letter spacing to the last code point in the cluster.

Then, when adding the spacing to the glyphs themselves, you need to add it to the last glyph (in visual order) in any sequence of glyphs that has the same HarfBuzz cluster. You can test Latin text with decomposed combining marks (e.g. x̣v̇) and Arabic text with vowel marks (e.g. وَرِقْ).

Ok, so I had some trouble getting this to work. Namely, when I reuse raqm's grapheme break functionality, and then use the harfbuzz clusters to apply spacing, the spacing just did not get applied, especially with grapheme clusters that had more than two characters. Part of my confusion here is also that : if harfbuzz clusters and grapheme clusters are not the same thing, why are we applying these to the last entry in the grapheme cluster, furthermore, if we are only applying letter spacing once per harfbuzz cluster, why do we worry about grapheme clusters at all?

I currently have offset added to RTL characters that aren't the last in their cluster, which works, but I now realize that's somewhat inconsistent and potentially very troublsome in the future, and will try to adjust this now.

Furthermore, you need to disable ligatures when letter spacing is used (at least liga features, but I think browsers disable a few other features as well), otherwise text with ligatures will be unevenly spaced.

I've implemented this in the letter-spacing function, using clig and liga, as those are what css-fonts specifies are the common ligatures. I could also turn off dlig, hlig and calt, as all those are also considered ligatures according to css-fonts. I am reusing the set_font_feature function here, as that reduces code duplication, but that does rely on letter-spacing to happen after setting the font-features. If you'd prefer this to happen just before shaping, please say so.

Furthermore, I am having a bit of trouble with the tests: It seems deja vu sans doesn't do decomposition for latin with diacritics, so testing letterspacing on latin is somewhat tricky. Is this why we should be using the grapheme clusters?

EDIT: I forgot to say that I did apply a check for grapheme breaks for wordspacing, as I've noticed that browsers don't consider a space with a combining mark a word space. I have also fixed RTL now.

…tures.

therahedwig · 2022-05-08T10:21:48Z

Ok, after a night of sleep, I managed to get grapheme breaking to work by checking whether the loop is at the start of a cluster instead of the end, and then adding the spacing to the first codepoint.

I can't get my head around applying it to the last code point, even though I understand there might be a situation where, say, the grapheme cluster goes from 3 to 9 and the harfbuzz cluster from 3 to 5. I've tried the following, which would need adjustment for RTL, but should work on LTR:

diff --git a/src/raqm.c b/src/raqm.c
index 2e7ac71..452532a 100644
--- a/src/raqm.c
+++ b/src/raqm.c
@@ -1086,28 +1086,25 @@ _raqm_set_spacing (raqm_t *rq,
 
   for (size_t i = start; i < end; i++)
   {
-    bool set_spacing = i == 0;
+    bool set_spacing = i == rq->text_len - 1;
     if (!set_spacing)
-      set_spacing = _raqm_allowed_grapheme_boundary (rq->text[i-1], rq->text[i]);
+      set_spacing = _raqm_allowed_grapheme_boundary (rq->text[i], rq->text[i+1]);
 
     if (set_spacing)
     {
       if (word_spacing)
       {
-        if (_raqm_allowed_grapheme_boundary (rq->text[i], rq->text[i+1]))
+        /* CSS word seperators, word spacing is only applied on these.*/
+        if (rq->text[i] == 0x0020  || /* Space */
+            rq->text[i] == 0x00A0  || /* No Break Space */
+            rq->text[i] == 0x1361  || /* Ethiopic Word Space */
+            rq->text[i] == 0x10100 || /* Aegean Word Seperator Line */
+            rq->text[i] == 0x10101 || /* Aegean Word Seperator Dot */
+            rq->text[i] == 0x1039F || /* Ugaric Word Divider */
+            rq->text[i] == 0x1091F)   /* Phoenician Word Separator */
         {
-          /* CSS word seperators, word spacing is only applied on these.*/
-          if (rq->text[i] == 0x0020  || /* Space */
-              rq->text[i] == 0x00A0  || /* No Break Space */
-              rq->text[i] == 0x1361  || /* Ethiopic Word Space */
-              rq->text[i] == 0x10100 || /* Aegean Word Seperator Line */
-              rq->text[i] == 0x10101 || /* Aegean Word Seperator Dot */
-              rq->text[i] == 0x1039F || /* Ugaric Word Divider */
-              rq->text[i] == 0x1091F)   /* Phoenician Word Separator */
-          {
-            rq->text_info[i].spacing_after = spacing;
-            rq->text_info[i].spacing_is_percentage = percentage;
-          }
+          rq->text_info[i].spacing_after = spacing;
+          rq->text_info[i].spacing_is_percentage = percentage;
         }
       }
       else
@@ -2175,11 +2172,22 @@ _raqm_shape (raqm_t *rq)
       pos = hb_buffer_get_glyph_positions (run->buffer, &len);
       info = hb_buffer_get_glyph_infos (run->buffer, &len);
 
+      unsigned int cluster_length = 0;
+      unsigned int current_cluster = 0;
       for (unsigned int i = 0; i < len; i++)
       {
         _raqm_ft_transform (&pos[i].x_advance, &pos[i].y_advance, matrix);
         _raqm_ft_transform (&pos[i].x_offset, &pos[i].y_offset, matrix);
 
+        if (current_cluster == info[i].cluster)
+        {
+          cluster_length += 1;
+        }
+        else
+        {
+          current_cluster = info[i].cluster;
+          cluster_length = 0;
+        }
         bool set_spacing = false;
         if (run->direction == HB_DIRECTION_RTL)
         {
@@ -2193,7 +2201,7 @@ _raqm_shape (raqm_t *rq)
             set_spacing = info[i].cluster != info[i+1].cluster;
         }
 
-        _raqm_text_info rq_info = rq->text_info[info[i].cluster];
+        _raqm_text_info rq_info = rq->text_info[current_cluster + cluster_length];
 
         if (rq_info.spacing_after != 0 && set_spacing)
         {

The problem I am seeing is that Deja Vu Sans and Noto Sans will give different results for letter spacing for letters with excessive combining marks, with the former spacing out each letter while the latter clumps some letters. With the current solution, all my test-cases work, though I am worried about the edge case identified earlier on in the post.

I've also added disabling the other opentype features.

It isn't actually parrt of CSS and seems odd in any case.

therahedwig · 2022-11-04T13:30:06Z

Ok, half a year later I haven't really found an example of the problem I was worried about, despite my other text work, and in hindsight it may have also been a bug somewhere else...

I've removed 'percentage' based spacing because it is not actually in the CSS spec, and I must've mixed something up.

khaledhosny · 2023-01-08T20:58:35Z

Thanks! (and sorry for the late review).

therahedwig mentioned this pull request May 2, 2022

Fix font feature range #172

Merged

therahedwig force-pushed the letter-and-word-spacing branch from d33b18d to ad0428a Compare May 5, 2022 10:32

khaledhosny requested changes May 6, 2022

View reviewed changes

src/raqm.c Show resolved Hide resolved

src/raqm.c Show resolved Hide resolved

src/raqm.c Outdated Show resolved Hide resolved

src/raqm.c Show resolved Hide resolved

khaledhosny reviewed May 6, 2022

View reviewed changes

src/raqm.c Show resolved Hide resolved

therahedwig added 2 commits May 6, 2022 13:41

Implement letter and word spacing.

3226b6c

Apply suggestions, make letter-spacing work for combining marks.

e85b395

therahedwig force-pushed the letter-and-word-spacing branch from ad0428a to e85b395 Compare May 7, 2022 12:52

therahedwig added 2 commits May 7, 2022 15:35

Fix RTL letter spacing so spacing is applied to the correct base.

54e0662

Use grapheme breaking and disable more letterspacing related font fea…

793d7a5

…tures.

therahedwig added 2 commits November 4, 2022 14:06

Remove percentage option from the spacing options.

fdaccc9

It isn't actually parrt of CSS and seems odd in any case.

Fix the tests.

4cfd7fe

therahedwig requested a review from khaledhosny November 4, 2022 13:30

khaledhosny merged commit d63abe5 into HOST-Oman:master Jan 8, 2023

nulano mentioned this pull request Nov 7, 2023

Fix incorrect color blending for overlapping glyphs python-pillow/Pillow#7497

Merged

cmb69 mentioned this pull request Aug 25, 2024

gdImageStringFT doesn't progress by the correct distance in y direction between characters libgd/libgd#887

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Letter and word spacing #171

Letter and word spacing #171

therahedwig commented May 2, 2022

therahedwig commented May 5, 2022

khaledhosny commented May 6, 2022

therahedwig commented May 7, 2022 •

edited

Loading

therahedwig commented May 8, 2022

therahedwig commented Nov 4, 2022

khaledhosny commented Jan 8, 2023

Letter and word spacing #171

Letter and word spacing #171

Conversation

therahedwig commented May 2, 2022

therahedwig commented May 5, 2022

khaledhosny commented May 6, 2022

therahedwig commented May 7, 2022 • edited Loading

therahedwig commented May 8, 2022

therahedwig commented Nov 4, 2022

khaledhosny commented Jan 8, 2023

therahedwig commented May 7, 2022 •

edited

Loading