consider making 'ascii' the default/recommended casefolding #1718

Closed
slingamn opened this issue Jun 28, 2021 · 2 comments

@slingamn
Member

I'm not at all certain about this and would appreciate input from all stakeholders.

Right now, the default/recommended casefolding is 'precis', i.e. RFC 8264, allowing the use of non-ASCII characters in nicknames and channel names. However, it seems like we've given up hope of the wider IRCv3 community adopting internationalized identifiers at the protocol level; the community seems to be going in the direction of "display names" instead. (Although: I haven't seen a proposal for assigning display names to channels.)

This creates a problem for client developers, who won't have a reliable algorithm for determining whether Ergo considers two identifiers to be equivalent under case normalization. The workaround we've been pushing is #1083, i.e., always publishing the canonical form of the identifier. This approach is vulnerable to bugs and edge cases.
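A minimal sketch of the client-side problem, not Ergo's implementation: `ascii_fold`, `guess_lower`, and `guess_casefold` are hypothetical helpers, and neither Unicode variant is RFC 8264 PRECIS. They only show that ASCII folding is trivial for any client to reproduce, while two plausible Unicode guesses from the Python standard library already disagree with each other.

```python
import unicodedata


def ascii_fold(identifier: str) -> str:
    """ASCII-only folding: any client can reproduce this exactly."""
    if not identifier.isascii():
        raise ValueError(f"non-ASCII identifier: {identifier!r}")
    return identifier.lower()  # for ASCII input this only maps A-Z to a-z


def guess_lower(identifier: str) -> str:
    """One plausible client guess: NFC-normalize, then lowercase."""
    return unicodedata.normalize("NFC", identifier).lower()


def guess_casefold(identifier: str) -> str:
    """Another plausible guess: NFC-normalize, then full Unicode case folding."""
    return unicodedata.normalize("NFC", identifier).casefold()


print(ascii_fold("SlingAmn") == ascii_fold("slingamn"))        # True
print(guess_lower("Straße") == guess_lower("STRASSE"))         # False
print(guess_casefold("Straße") == guess_casefold("STRASSE"))   # True
```

A client that picks the wrong guess would treat two nicknames as distinct when the server considers them the same, or the reverse.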

The other problem with PRECIS is confusable characters, which are only imperfectly addressed by the skeleton algorithm.
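To illustrate what a skeleton comparison does, here is a toy sketch. `TOY_CONFUSABLES` is a hypothetical, hand-picked table of five Cyrillic letters, not the real Unicode confusables data, and `toy_skeleton` is not Ergo's code; it only follows the general normalize, map, normalize shape of a skeleton function.

```python
import unicodedata

# Hypothetical, hand-picked subset for illustration only.
TOY_CONFUSABLES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
}


def toy_skeleton(identifier: str) -> str:
    """Decompose, replace known look-alikes, then decompose again."""
    decomposed = unicodedata.normalize("NFD", identifier)
    mapped = "".join(TOY_CONFUSABLES.get(ch, ch) for ch in decomposed)
    return unicodedata.normalize("NFD", mapped)


spoof = "\u0430dmin"  # starts with Cyrillic 'а', renders like "admin"
print(spoof == "admin")                                     # False
print(toy_skeleton(spoof) == toy_skeleton("admin"))         # True

# Greek small omicron (U+03BF) is not in the toy table, so it slips through.
print(toy_skeleton("\u03bfper") == toy_skeleton("oper"))    # False
```

The last line shows the general weakness of any table-driven approach: it only catches the look-alikes the table happens to list.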

So the proposal here is to change the default/recommended value of casefolding from 'precis' to 'ascii'. ('precis' would remain fully supported in the codebase, especially because operators can't safely switch between casefoldings: doing so risks making existing account or channel registrations unusable.)
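A rough sketch of why switching is unsafe, under the assumption that registrations are stored keyed by their folded name. The `registrations` dict, `unicode_fold`, `ascii_fold`, and `lookup` are all hypothetical stand-ins, not Ergo's datastore or its folding functions.

```python
import unicodedata


def unicode_fold(name: str) -> str:
    """Stand-in for a Unicode-aware folding (not real PRECIS)."""
    return unicodedata.normalize("NFC", name).lower()


def ascii_fold(name: str) -> str:
    """ASCII-only folding: non-ASCII names are simply invalid."""
    if not name.isascii():
        raise ValueError(f"non-ASCII identifier: {name!r}")
    return name.lower()


# Accounts registered while the Unicode-aware folding was active.
registrations = {unicode_fold(n): n for n in ["Zoë", "Alice"]}


def lookup(name: str, fold):
    """Return the registered name, or None if the name is invalid or unknown."""
    try:
        return registrations.get(fold(name))
    except ValueError:
        return None


print(lookup("ZOË", unicode_fold))   # Zoë
print(lookup("ZOË", ascii_fold))     # None
print(lookup("ALICE", ascii_fold))   # Alice
```

After the switch, the pure-ASCII account is still reachable, but the non-ASCII one can no longer be named at all.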

@Mikaela
Contributor

Mikaela commented Jun 28, 2021

Does this have implications for #1441?

@DanielOaks
Member

I'd probably do this once we actually have a way to support display names (i.e. once Metadata is implemented which I... should be doing eventually). That also takes care of the display names for channel things, since you'd just attach the key to the channel as well.

I like the idea of core protocol identifiers being restricted to be as simple as possible (that's why I pulled the Unicode identifiers spec out of v3, after all), but it is something that differentiates us from other servers in a fairly big way, so keeping it around for now, until there's a recommended method we can switch to, probably makes sense.

slingamn added a commit to slingamn/ergo that referenced this issue Dec 12, 2022
slingamn added this to the v2.11 milestone Dec 12, 2022