create_spoken_forms: If file_extensions.csv is empty, FILE_EXTENSIONS_REGEX is wrong #1220

Open
rntz opened this issue Jun 24, 2023 · 0 comments
rntz commented Jun 24, 2023

In create_spoken_forms.py we define FILE_EXTENSIONS_REGEX like so:

FILE_EXTENSIONS_REGEX = "|".join(
    re.escape(file_extension.strip()) + "$"
    for file_extension in file_extensions.values()
)

The problem with this is: what if file_extensions.values() is empty? Then we get "|".join([]) which is "". Unfortunately this is wrong. We want it to be the regex which matches nothing (no strings, not even the empty string). Instead it is the regex which matches the empty string, so it will match every zero-length position in a string.
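A small sketch of the failure mode, assuming an empty dict stands in for what an empty file_extensions.csv would produce (how the real dict is populated is not shown here):

```python
import re

# Assumption: an empty file_extensions.csv yields an empty mapping.
file_extensions = {}

# Same construction as in create_spoken_forms.py.
FILE_EXTENSIONS_REGEX = "|".join(
    re.escape(file_extension.strip()) + "$"
    for file_extension in file_extensions.values()
)

# The joined pattern is "", which matches the empty string at every position.
empty_matches = [m.span() for m in re.finditer(FILE_EXTENSIONS_REGEX, "foo")]
print(repr(FILE_EXTENSIONS_REGEX), empty_matches)
```

On a three-character input this yields four zero-length matches, one at each position including the end of the string.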

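One way to write a regex that matches nothing is r"^\b$": with ^ and $ anchored at the same position there is no word character on either side, so \b can never succeed.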
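One possible fix, sketched under assumptions: build_file_extensions_regex and NEVER_MATCHES are hypothetical names that do not exist in the repository, and whether a fallback like this is the right fix depends on how FILE_EXTENSIONS_REGEX is used downstream.

```python
import re
from typing import Mapping

# A pattern that matches no string, not even the empty one.
NEVER_MATCHES = r"^\b$"


def build_file_extensions_regex(file_extensions: Mapping[str, str]) -> str:
    """Hypothetical helper: join the extensions, or fall back when there are none."""
    alternatives = [
        re.escape(file_extension.strip()) + "$"
        for file_extension in file_extensions.values()
    ]
    # "|".join([]) would be "", which matches everywhere, so guard against it.
    return "|".join(alternatives) if alternatives else NEVER_MATCHES
```

With an empty mapping the result matches nothing, and with at least one extension it behaves like the original expression.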

I can't follow how FILE_EXTENSIONS_REGEX is used in the rest of the file, so I'm not sure whether this could actually cause any problems. @pokey ?

(Noticed this while reviewing #1199.)
