Improved conversion from LaTeX to HTML/Unicode #841
Assert.assertEquals("ı", layout.format("\\i"));
Assert.assertEquals("ı", layout.format("\\{i}"));
Assert.assertEquals("ıı", layout.format("\\i\\i"));
Assert.assertEquals("ı ı", layout.format("\\i \\i"));
Why does an HTMLChars() formatter convert LaTeX to Unicode and not to HTML?
Is something wrong with the implementation, or is the naming just suboptimal?
Historical reasons. Earlier, the huge conversion list resided in that file, so adding a method "formatUnicode" was simple. Indeed, there should (very soon) be a UnicodeToLatexConverter extracted... Doing this in steps...
Ah, sorry, wrong answer. :-) The answer above holds for HTMLConverter...
I think HTMLChars does format to HTML? Named entities are now used where they exist; otherwise the numeric entities are used.
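The fallback mentioned above (a named entity when one exists, otherwise a numeric one) could look roughly like the sketch below. This is only an illustration, not the actual HTMLChars code: the class `EntitySketch`, the method `toEntity`, and the two-entry table are assumptions made up for the example.

```java
import java.util.Map;

public class EntitySketch {
    // Hypothetical, deliberately tiny table; the real conversion list is far larger.
    private static final Map<Integer, String> NAMED = Map.of(
            0x00E4, "auml",   // a with diaeresis
            0x00E9, "eacute"  // e with acute accent
    );

    /** Returns a named HTML entity if the table has one, otherwise a numeric entity. */
    public static String toEntity(int codePoint) {
        String name = NAMED.get(codePoint);
        return name != null ? "&" + name + ";" : "&#" + codePoint + ";";
    }

    public static void main(String[] args) {
        System.out.println(toEntity(0x00E4)); // named entity from the table
        System.out.println(toEntity(0x0142)); // not in the table, numeric fallback
    }
}
```

With this table, U+00E4 comes out as `&auml;`, while U+0142 (l with stroke), which the sketch has no name for, comes out as `&#322;`.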
Ah my mistake! Sorry....
I fixed the equation issue and added support for some more LaTeX text styles. For example, \textsuperscript and \textsubscript are supported for both HTML and OO. This also means that the preview looks quite a bit better now.
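To illustrate what the HTML side of \textsuperscript and \textsubscript support amounts to, here is a minimal regex-based sketch. It is not the code in this PR: the class `TextStyleSketch` and the method `toHtml` are hypothetical, and the sketch assumes the argument contains no nested braces.

```java
import java.util.regex.Pattern;

public class TextStyleSketch {
    // Match \textsuperscript{...} and \textsubscript{...} with a brace-free argument.
    private static final Pattern SUPERSCRIPT =
            Pattern.compile("\\\\textsuperscript\\{([^{}]*)\\}");
    private static final Pattern SUBSCRIPT =
            Pattern.compile("\\\\textsubscript\\{([^{}]*)\\}");

    /** Rewrites the two LaTeX text styles to their HTML counterparts. */
    public static String toHtml(String latex) {
        String result = SUPERSCRIPT.matcher(latex).replaceAll("<sup>$1</sup>");
        return SUBSCRIPT.matcher(result).replaceAll("<sub>$1</sub>");
    }

    public static void main(String[] args) {
        System.out.println(toHtml("H\\textsubscript{2}O"));
    }
}
```

A real formatter would need a proper brace-matching parser to cope with nested commands; the regex is only enough to show the mapping.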
Please fix the failing tests. Then this can be merged in as well.
Converters could be added to the Save Actions under the Formatter Interface as well.
Yeah, I just noted that they failed. :-)
That's what you get trying to distinguish italic and emphasise...
I'll correct it later and merge. Thanks!
Indeed. I'll also extract the Unicode formatter from HTMLConverter now that
they are not tightly connected anymore.
Ok, please add the label ready-for-review again when you are done with the other changes.
It turned out that it also made sense to move the layout formatters from export.layout to logic.layout, so the final(?) commit handles that. Some minor refactoring had to be done, but nothing controversial.
I'm giving up on this PR for now... It doesn't really make sense to add a JournalAbbreviationRepository argument to PdfImporter or MoveFileAction, although that is required if it is propagated (and so far I've only gone two steps back...). I'll be back after a week or so of holidays (true story, nothing to do with this)... Accessing a global variable in a single formatter, or passing an argument through hundreds of unrelated classes so that it can possibly be used in a single formatter? I'm not convinced it is worth it...
The idea is that the GUI part can have globals, whereas the logic does not. With this in mind, the PdfImporter or MoveFileAction do not need to have this class, but can merely pass in the global variable if necessary. Does this make sense?
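As a rough sketch of that split: the GUI side owns the global and hands it to the logic side at the call site, so the logic classes never reach for a global themselves. All names here (`InjectionSketch`, `AbbreviationRepository`, `JournalFormatter`, `GuiGlobals`, `runGuiAction`) are hypothetical stand-ins, not the real JabRef classes or their APIs.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

public class InjectionSketch {
    /** Hypothetical stand-in for an abbreviation repository. */
    public static final class AbbreviationRepository {
        private final Map<String, String> abbreviations;

        public AbbreviationRepository(Map<String, String> abbreviations) {
            this.abbreviations = abbreviations;
        }

        public Optional<String> abbreviate(String journal) {
            return Optional.ofNullable(abbreviations.get(journal));
        }
    }

    /** Logic layer: no globals, the dependency arrives through the constructor. */
    public static final class JournalFormatter {
        private final AbbreviationRepository repository;

        public JournalFormatter(AbbreviationRepository repository) {
            // Fail fast instead of hitting a NullPointerException later on.
            this.repository = Objects.requireNonNull(repository, "repository");
        }

        public String format(String journal) {
            return repository.abbreviate(journal).orElse(journal);
        }
    }

    /** GUI layer: the only place in this sketch that holds a global. */
    public static final class GuiGlobals {
        public static final AbbreviationRepository REPOSITORY = new AbbreviationRepository(
                Map.of("Physical Review Letters", "Phys. Rev. Lett."));
    }

    /** A GUI-side action passes the global in; the formatter stays global-free. */
    public static String runGuiAction(String journal) {
        return new JournalFormatter(GuiGlobals.REPOSITORY).format(journal);
    }
}
```

The `requireNonNull` check is one way to surface a missing repository at construction time rather than deep inside a worker.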
It makes sense from the perspective that I understand where to start/stop. :-)
👍 LGTM (but can you remove the unused imports, please?)
All(?) unused imports are now removed and I managed to sort out the PdfCleanup thing as well. Feel free to merge. :-)
Sorry for being such a pain in the ass with this abbreviation thing... but one last remark: did you try out the PDF rename cleanup as a user? I think the dialog always passes a null repository to the cleanup worker, which in the end would result in an NPE. Not sure though...
Now the huge list of conversion symbols is used when converting from LaTeX to HTML and Unicode. I also added a field right-click menu item to convert from LaTeX to Unicode. This sort of works except that it doesn't handle text within $$ in a good way (remove the $$ and the symbol conversion works).
The huge diff is caused by reformatting the huge list, as the newly added entries were aligned differently on saving than the existing ones...
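The workaround mentioned for text within $$ (remove the $$ and the symbol conversion works) can be sketched as a pre-processing step. This is an assumption-laden illustration, not the converter in this PR: `MathDelimiterSketch`, `stripMathDelimiters`, `toUnicode`, and the two-entry symbol table are all made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MathDelimiterSketch {
    // Hypothetical two-entry table; the real conversion list is far larger.
    private static final Map<String, String> SYMBOLS = new LinkedHashMap<>();

    static {
        SYMBOLS.put("\\alpha", "\u03B1"); // Greek small letter alpha
        SYMBOLS.put("\\beta", "\u03B2");  // Greek small letter beta
    }

    /** Drops every '$' so that the symbol lookup sees the bare commands. */
    public static String stripMathDelimiters(String text) {
        return text.replace("$", "");
    }

    /** Strips the math delimiters first, then replaces the known symbols. */
    public static String toUnicode(String text) {
        String result = stripMathDelimiters(text);
        for (Map.Entry<String, String> entry : SYMBOLS.entrySet()) {
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }
}
```

Note that this naive version also removes an escaped `\$` that was meant as a literal dollar sign, so a real implementation would have to treat that case separately.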