Improved conversion from LaTeX to HTML/Unicode #841

oscargus · 2016-02-20T09:10:41Z

Now the huge list of conversion symbols is used when converting from LaTeX to HTML and Unicode. I also added a field right-click menu item to convert from LaTeX to Unicode. This sort of works except that it doesn't handle text within $$ in a good way (remove the $$ and the symbol conversion works).

The huge diff is caused by reformatting the huge list, as the newly added entries were aligned differently from the earlier ones when saving...
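
To illustrate the table-driven conversion described above, here is a minimal sketch, assuming a small symbol map and a hypothetical class name `LatexToUnicodeSketch`. It is not JabRef's actual converter, and the list in this PR is far larger. Unknown commands are left untouched, and the space after a command is kept rather than swallowed.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a lookup-table LaTeX-to-Unicode conversion; not JabRef code. */
final class LatexToUnicodeSketch {

    // Tiny illustrative table mapping command names to Unicode strings.
    private static final Map<String, String> SYMBOLS = new HashMap<>();

    static {
        SYMBOLS.put("alpha", "\u03B1"); // Greek small letter alpha
        SYMBOLS.put("beta", "\u03B2");  // Greek small letter beta
        SYMBOLS.put("i", "\u0131");     // dotless i
    }

    /** Replaces each known backslash command with its Unicode symbol. */
    static String convert(String input) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (pos < input.length()) {
            char c = input.charAt(pos);
            if (c == '\\') {
                // Collect the letters that form the command name.
                int end = pos + 1;
                while (end < input.length() && Character.isLetter(input.charAt(end))) {
                    end++;
                }
                String replacement = SYMBOLS.get(input.substring(pos + 1, end));
                if (replacement != null) {
                    out.append(replacement);
                    pos = end;
                    continue;
                }
            }
            // Not a known command: copy the character unchanged.
            out.append(c);
            pos++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(convert("\\alpha + \\beta"));
    }
}
```

A real converter would also have to deal with braces, accents and math mode, which this sketch ignores.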

tobiasdiez · 2016-02-20T10:18:20Z

src/test/java/net/sf/jabref/exporter/layout/format/HTMLCharsTest.java

-        Assert.assertEquals("&#305;", layout.format("\\i"));
-        Assert.assertEquals("&#305;", layout.format("\\{i}"));
-        Assert.assertEquals("&#305;&#305;", layout.format("\\i\\i"));
+        Assert.assertEquals("&imath; &imath;", layout.format("\\i \\i"));


Why does an HTMLChars() formatter convert LaTeX to Unicode and not to HTML?
Is there something wrong with the implementation or is the naming just suboptimal?

Historical reasons. Earlier the huge conversion list resided in that file, so adding a method "formatUnicode" was simple. Indeed there should (very soon) be a UnicodeToLatexConverter extracted... Doing this in steps...

Ah, sorry, wrong answer. :-) The answer above holds for HTMLConverter...

I think HTMLChars does format to HTML? Now that named entities are available, they are used where one exists; otherwise the numeric ones are used.

Ah my mistake! Sorry....
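
The "named entity if available, otherwise numeric" behaviour mentioned in this exchange could look roughly like the following. This is a sketch under assumptions: the class `HtmlEntitySketch` and its two-entry table are hypothetical, and it does not escape `&`, `<` or `>`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: prefer a named HTML entity, fall back to a numeric one. */
final class HtmlEntitySketch {

    // Code point to entity name; only two entries, for illustration.
    private static final Map<Integer, String> NAMED = new HashMap<>();

    static {
        NAMED.put(0x0131, "imath"); // dotless i, as in the changed test
        NAMED.put(0x03B1, "alpha");
    }

    /** Returns the named entity when the table has one, else the decimal form. */
    static String toEntity(int codePoint) {
        String name = NAMED.get(codePoint);
        if (name != null) {
            return "&" + name + ";";
        }
        return "&#" + codePoint + ";";
    }

    /** Leaves ASCII unchanged and replaces every other code point by an entity. */
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp < 128) {
                out.appendCodePoint(cp);
            } else {
                out.append(toEntity(cp));
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeNonAscii("a\u0131\u0142"));
    }
}
```

With this table, the dotless i becomes `&imath;`, while a character that has no entry (here U+0142) falls back to `&#322;`.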

oscargus · 2016-02-20T12:37:00Z

I fixed the equation issue and added support for some more LaTeX text styles. For example, \textsuperscript and \textsubscript are supported for both HTML and OO.

This also means that the preview looks quite a bit better now.
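
As a rough illustration of the text-style support for HTML, the sketch below maps `\textsuperscript{...}` and `\textsubscript{...}` to `<sup>` and `<sub>` with regular expressions. The class name `TextStyleSketch` is hypothetical, nested braces are not handled, and the formatters in this PR may well work differently.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch of LaTeX text styles rendered as HTML tags; not JabRef code. */
final class TextStyleSketch {

    // Match the command followed by one brace group without nested braces.
    private static final Pattern SUPERSCRIPT =
            Pattern.compile("\\\\textsuperscript\\{([^{}]*)\\}");
    private static final Pattern SUBSCRIPT =
            Pattern.compile("\\\\textsubscript\\{([^{}]*)\\}");

    /** Rewrites both commands; all other input is returned unchanged. */
    static String toHtml(String latex) {
        String result = SUPERSCRIPT.matcher(latex).replaceAll("<sup>$1</sup>");
        return SUBSCRIPT.matcher(result).replaceAll("<sub>$1</sub>");
    }

    public static void main(String[] args) {
        System.out.println(toHtml("H\\textsubscript{2}O and x\\textsuperscript{2}"));
    }
}
```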

simonharrer · 2016-02-22T09:37:40Z

Please fix the failing tests. Then this can be merged in as well.

simonharrer · 2016-02-22T09:38:34Z

Converters could be added to the Save Actions under the Formatter Interface as well.

oscargus · 2016-02-22T09:41:21Z

Yeah, I just noticed that they failed. :-) That's what you get trying to distinguish italic and emphasis... I'll correct it later and merge. Thanks!

oscargus · 2016-02-22T09:43:22Z

Indeed. I'll also extract the Unicode formatter from HTMLConverter now that they are not tightly connected anymore.

simonharrer · 2016-02-22T11:09:45Z

Ok, please add the label ready-for-review again when you are done with the other changes.

oscargus · 2016-02-22T22:03:47Z

It turned out that it also made sense to move the layout formatters from export.layout to logic.layout, so the final(?) commit is handling that. Some minor refactoring had to be done, but nothing controversial.

oscargus · 2016-02-23T22:32:41Z

I'm giving up on this PR for now... Doesn't really make sense to add a JournalAbbreviationRepository argument to PdfImporter or MoveFileAction, although that is required if it is propagated (and now I've only gone two steps back...). Will be back after a week or so of holidays (true story, nothing to do with this)...

Accessing a global variable in a single formatter, or passing an argument through hundreds of unrelated classes so that it can possibly be used in a single formatter? I'm not convinced it is worth it...

simonharrer · 2016-02-24T08:50:43Z

The idea is that the GUI part can have globals, whereas the logic does not. With this in mind, the PdfImporter or MoveFileAction do not need to have this class, but can merely pass in the global variable if necessary. Does this make sense?
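
The split described here (globals allowed in the GUI, none in the logic) can be sketched as follows. All three class names are hypothetical stand-ins rather than JabRef's real classes: the logic-side formatter receives the repository through its constructor, and only the GUI-side class touches the global.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical stand-in for the abbreviation repository discussed above. */
final class AbbreviationRepositorySketch {
    private final Map<String, String> abbreviations = new HashMap<>();

    void add(String fullName, String abbreviation) {
        abbreviations.put(fullName, abbreviation);
    }

    /** Returns the abbreviation, or the input itself when none is known. */
    String abbreviate(String fullName) {
        return abbreviations.getOrDefault(fullName, fullName);
    }
}

/** Logic layer: no global access, the dependency is passed in. */
final class JournalFormatterSketch {
    private final AbbreviationRepositorySketch repository;

    JournalFormatterSketch(AbbreviationRepositorySketch repository) {
        // Fail early instead of hitting a NullPointerException during formatting.
        this.repository = Objects.requireNonNull(repository, "repository");
    }

    String format(String journal) {
        return repository.abbreviate(journal);
    }
}

/** GUI layer: the only place in this sketch that holds a global. */
final class GuiGlobalsSketch {
    static final AbbreviationRepositorySketch REPOSITORY = new AbbreviationRepositorySketch();

    /** The GUI hands its global to the logic class at the boundary. */
    static JournalFormatterSketch newFormatter() {
        return new JournalFormatterSketch(REPOSITORY);
    }

    public static void main(String[] args) {
        REPOSITORY.add("Journal of Examples", "J. Ex.");
        System.out.println(newFormatter().format("Journal of Examples"));
    }
}
```

In this arrangement, intermediate classes need no repository parameter as long as they sit on the GUI side and can reach the global themselves.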

oscargus · 2016-02-26T00:04:26Z

It makes sense from the perspective that I understand where to start/stop. :-)

simonharrer · 2016-02-26T09:24:59Z

👍 LGTM (but can you remove the unused imports, please?)

oscargus · 2016-02-26T15:13:44Z

All(?) unused imports are now removed and I managed to sort out the PdfCleanup thing as well.

Feel free to merge. :-)

tobiasdiez · 2016-02-26T16:04:38Z

Sorry for being such a pain in the ass with this abbreviation thing... but one last remark: did you try out the PDF rename cleanup as a user? I think the dialog always passes a null repository to the cleanup worker, which in the end would result in an NPE. Not sure though...

oscargus added type: enhancement status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers labels Feb 20, 2016

oscargus mentioned this pull request Feb 20, 2016

Support decoding and encoding of LaTeX characters #161

Closed

tobiasdiez reviewed Feb 20, 2016

oscargus force-pushed the bigconversionlist branch 2 times, most recently from 1783e9d to aa1ad68 Compare February 20, 2016 12:24

oscargus force-pushed the bigconversionlist branch 2 times, most recently from 6c48bbe to 1c851ea Compare February 21, 2016 19:14

oscargus mentioned this pull request Feb 21, 2016

Updated HTMLChars documentation JabRef/user-documentation#7

Merged

oscargus force-pushed the bigconversionlist branch from 1c851ea to 101a15c Compare February 22, 2016 09:10

simonharrer removed the status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers label Feb 22, 2016

oscargus force-pushed the bigconversionlist branch from 3442a94 to 4c0b31f Compare February 22, 2016 13:56

oscargus added status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers and removed status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers labels Feb 22, 2016

oscargus force-pushed the bigconversionlist branch from f8e42ab to cde908f Compare February 22, 2016 21:59

oscargus added the status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers label Feb 22, 2016

oscargus force-pushed the bigconversionlist branch 2 times, most recently from bf49de3 to c3526b8 Compare February 22, 2016 22:50

oscargus removed the status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers label Feb 23, 2016

Improved conversion from LaTeX to HTML/Unicode

af49fe4

oscargus added 10 commits February 26, 2016 01:05

Improved handling of equations

2e12ea7

Improved HTMLChars and OOPreFormatter to handle more text styles

c1175ad

Separate Unicode to Latex formatter

bc71448

Moved and renamed HTML/Unicode to Latex converters

49ad8aa

Moved layout formatters from export to logic

dc49159

Fixed NPEs in tests which should have seen these NPEs much earlier than now...

92a27f5

Fixed comments

4a8ab81

Restructured OO formatting code

387ea84

Restructured HTMLChars

497f422

Finally fixed the tests(?)

6111eda

oscargus force-pushed the bigconversionlist branch from 18d47fe to 6111eda Compare February 26, 2016 00:23

oscargus added the status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers label Feb 26, 2016

oscargus force-pushed the bigconversionlist branch from 09c750e to af73747 Compare February 26, 2016 03:25

oscargus force-pushed the bigconversionlist branch from af73747 to 57b13f0 Compare February 26, 2016 11:35

Got rid of some warnings

fafb2c7

oscargus force-pushed the bigconversionlist branch from 57b13f0 to fafb2c7 Compare February 26, 2016 11:50

Fixed comments

68153fa

oscargus force-pushed the bigconversionlist branch from c248faf to 68153fa Compare February 26, 2016 12:34

oscargus mentioned this pull request Feb 26, 2016

Preserve capital Umlauts during cleanup #405

Closed

A few less warnings

a96d70c

simonharrer added a commit that referenced this pull request Feb 26, 2016

Merge pull request #841 from oscargus/bigconversionlist

69e4fe2

Improved conversion from LaTeX to HTML/Unicode

simonharrer merged commit 69e4fe2 into JabRef:master Feb 26, 2016

oscargus mentioned this pull request Mar 7, 2016

Cleanup: Offer conversion from Unicode to LaTeX #809

Closed

1 task

oscargus deleted the bigconversionlist branch March 16, 2016 09:51
