This repository has been archived by the owner on May 7, 2021. It is now read-only.

apply normalization to input #18

Open
balmas opened this issue Mar 24, 2016 · 0 comments

balmas commented Mar 24, 2016

We should have the option to normalize text input through the input forms. Most of our services (such as tokenizing and splitting) require precomposed Unicode characters to function properly, so we should normalize on input if we can.

Found by @vgorman1: her μήδε used a combining sequence (\u03b7 followed by the combining acute accent \u0301), but the code looks for the precomposed character (\u03ae).
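
As a rough illustration of the kind of normalization meant here, this is a minimal sketch in Python. It assumes that Unicode NFC (canonical composition) is the form the services expect, which this issue does not state outright. `normalize_input` is a hypothetical helper name, not something from this repository.

```python
import unicodedata


def normalize_input(text: str) -> str:
    """Hypothetical helper: return text in Unicode NFC (composed) form."""
    return unicodedata.normalize("NFC", text)


# The word from the report, typed with a combining accent:
# mu, eta + combining acute (U+03B7 U+0301), delta, epsilon
decomposed = "\u03bc\u03b7\u0301\u03b4\u03b5"

# The same word with the precomposed eta-with-tonos (U+03AE)
precomposed = "\u03bc\u03ae\u03b4\u03b5"

# Before normalization the two strings differ, so a lookup for U+03AE fails.
assert decomposed != precomposed

# After NFC normalization the combining sequence collapses to U+03AE.
assert normalize_input(decomposed) == precomposed
```

If normalization were applied at the input forms, downstream services could keep matching only the precomposed characters. Whether NFC alone covers every input the forms receive is not established here.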
