Python 3: find a suitable and compatible replacement for txt2tags #8734

josephsl · 2018-09-10T16:55:51Z

Hi,

Similar to #8375:

Background:

NVDA's documentation is written with txt2tags (t2t) markup, similar to how people would use Markdown markup and then generate HTML, wiki pages and what not. At the moment this works for Python 2.7, but could pose a problem for the Python 3 transition.

Apart from the postproc/heading conversion problem (#3031), txt2tags continues to prefer Python 2. Although there was an experimental Python 3 branch, there haven't been any updates since 2010.

Also, as NVDA was localized into more languages, other issues with our current docs translation process emerged, including byte order marks, encoding problems, typos and resulting syntax errors and what not. In the past, there has been a discussion on the NVDA translations list regarding using a different docs translation process, with some people suggesting using Markdown/Gettext for documentation translation management, similar to how add-on entries on the community add-ons website are localized.

Although this issue won't have immediate impact, it will become a showstopper once we declare transition to Python 3. Thus, to minimize shock throughout the community (especially for translators and code contributors), I propose that we find a suitable and compatible alternative to t2t.

Dependency requirements:

Compatible with both Python 2.7 and at least 3.5.
Makes it easier for translators to localize documentation and minimizes errors.
A suitable path for moving from t2t to the new format.

Steps to reproduce:

Try compiling NVDA with Python 3 (scons, ignoring a bit about winreg module name).

Actual behavior:

Various errors are thrown by Python due to txt2tags issues.

Expected behavior:

NVDA compiles in Python 3 mode.

System configuration:

NVDA Installed/portable/running from source:

Not applicable

NVDA version:

N/A

Windows version:

N/A

Name and version of other software in use when reproducing the issue:

Python 2.7.15, 3.7.0, txt2tags 2.5

Other information about your system:

N/A

Possible new dependencies:

The closest is Markdown (md), as the add-ons community and contributors using GitHub are familiar with this. Also, it might be possible to use Ikiwiki's Gettext plug-in to transform po into md files.

Other suggestions are welcome.

Thanks.

zstanecic · 2018-09-10T17:17:40Z

Ok, markdown has some issues, which maybe we cannot solve: the key commands are organized in tables. How can we manage that?

josephsl · 2018-09-10T17:19:12Z

Hi, I believe the one the community is using (2.6.x) supports tables, but need to confirm it with an add-on. Thanks.

zstanecic · 2018-09-10T17:21:43Z

Oh, Joseph, I remember, Braille extender uses tables in the docs

zstanecic · 2018-09-10T17:27:06Z

btw, if we switch fully to poedit with markdown conversion, we can avoid the structural differences thing. This is the biggest pain for the translators.

LeonarddeR · 2018-12-11T13:48:27Z

@zstanecic: I believe the PHP Markdown extensions support tables to some extent, and the python markdown module should support these extensions. There is also an extension API.

We would need more extensions, such as table of contents

I would be an advocate of using markdown. As @josephsl stated, translators are familiar with Markdown, and it is used in other areas as well, i.e. on github.

zstanecic · 2018-12-11T13:50:47Z

Hi Leonard, I am an advocate of markdown too, if the user guide can be translated via poedit, as we do for the ikiwiki add-on pages.

Adriani90 · 2018-12-11T16:24:26Z

We can convert the userguide and changes with Pandoc to GitHub flavoured markdown (gfm) and then save them as .md files. They can easily be converted to html with pandoc, also automatically. The next step would be to restructure the content so that strings can be extracted by gettext and squashed into a .pot file. But this means much more work for NV Access in the long term, because every addition to the userguide and to changes must be localized, which I guess is not as easy as it is currently. In my view I can also go with .md files in gfm format, especially because I can browse the userguide much faster and can see the context of the sentences immediately.
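As a rough sketch of the Pandoc step described above: the snippet below only assembles the command line for converting one t2t file to gfm, and a second helper runs it. `t2t` and `gfm` are Pandoc format names; the helper names and the file names are made up for illustration, and nothing here deals with txt2tags-specific pre/post-processing.

```python
import subprocess
from typing import List


def build_pandoc_command(source: str, target: str) -> List[str]:
    """Return the argv for converting a txt2tags file to GitHub flavoured markdown."""
    return ["pandoc", "--from", "t2t", "--to", "gfm", "--output", target, source]


def convert(source: str, target: str) -> None:
    """Run Pandoc; raises CalledProcessError if the conversion fails."""
    # Requires the pandoc executable to be on PATH (an assumption of this sketch).
    subprocess.run(build_pandoc_command(source, target), check=True)
```

For example, `convert("userGuide.t2t", "userGuide.md")` would write the converted guide next to the original.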

Adriani90 · 2018-12-11T16:26:01Z

but in general .po files can be translated much much faster.

Adriani90 · 2018-12-11T16:31:29Z

there is also a txt2po and a ini2po converter. So we can convert with pandoc to .txt and then generate .po files which in turn can be converted to html with translate toolkit.

LeonarddeR · 2018-12-11T17:29:20Z

I don't think a conversion to pot files is going to happen for the user guide. The current user guide translation process gives users much more freedom than gettext based translation. This means that for some languages, some user guides might have extra paragraphs, or other paragraphs are extended, missing or shortened.

derekriemer · 2018-12-12T14:32:59Z

Overall, I am a big supporter of switching to markdown. Apparently pandoc can convert from t2t, and pandoc can do tons of things, so we have a lot of freedom here.

LeonarddeR · 2018-12-12T14:37:22Z

Note that we're using several hooks in the user guide to produce the keyboard commands document. These have to be converted to a markdown compatible approach.

Adriani90 · 2018-12-12T14:54:04Z

Yes, and not only this, but also, for example, in changes, where issues in brackets link directly to GitHub. The hook problem was the one I ran into yesterday. I am still trying to find the proper syntax.

Adriani90 · 2018-12-13T14:28:00Z

This does not seem very easy to accomplish, because GitHub flavoured markdown does not seem to have an equivalent for %includeconf, which is currently used by txt2tags. With Pandoc you can create a sort of bash script, but it is quite limited, at least from what I can see. The problem is that I don't know how to include configs of bash scripts or other external files into the userguide or changes file. Translators, at least, are more used to the markup of the add-ons homepage, which is not GitHub flavoured. Maybe that would also be an option, but I doubt it, because that markup does not seem very rich in functionality. There are only standard elements like tables, bullets and what not.

Actually txt2tags is a quite good markup. It is too bad that it probably does not comply with Python 3 requirements.

LeonarddeR · 2018-12-20T15:40:48Z

I think we should start with the changes files as an experiment for this. After that, we can expand to the user guide as well.

LeonarddeR · 2018-12-20T18:15:55Z

Here is an overview of how the extension API can be used to accomplish several tasks that are now performed by t2t.

Global

Looks like we can drop them all :)

Build

NVDA_VERSION, NVDA_URL and NVDA_COPYRIGHT_YEARS all can be passed at runtime and should be defined in the extension as options. See Integrating Your Code Into Markdown > Configuration Settings. Then, they can be replaced using inline patterns.
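To make the substitution idea concrete, here is a minimal sketch that uses only the standard `re` module rather than the markdown extension API. The three placeholder names come from the comment above; the function name and the sample values are hypothetical. Inside a real extension, the same regular expression could back an inline pattern, with the values arriving as extension options.

```python
import re
from typing import Mapping

# Placeholder names taken from the comment above; values are supplied at build time.
_PLACEHOLDER = re.compile(r"\b(NVDA_VERSION|NVDA_URL|NVDA_COPYRIGHT_YEARS)\b")


def substitute_build_vars(text: str, values: Mapping[str, str]) -> str:
    """Replace each known placeholder with its runtime value.

    Placeholders without a supplied value are left untouched.
    """
    return _PLACEHOLDER.sub(lambda m: values.get(m.group(1), m.group(0)), text)
```

Calling `substitute_build_vars("NVDA NVDA_VERSION User Guide", {"NVDA_VERSION": "2019.1"})` fills in the version and leaves the plain word "NVDA" alone, because the pattern only matches the full placeholder names.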

Changes file:

Make ticket references into links: see Inline Patterns>future
Make links open in a new tab/window: the markdown extensions overview mentions a newtab extension, but the github URL is dead. Alternatively, we can easily create something ourselves, see Working with the ElementTree > def set_link_class(self, element):
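Both changes-file tasks above can be sketched with one regular expression. This is a standalone illustration, not a markdown extension: the issue URL template and the function name are assumptions, and the output is plain HTML with `target="_blank"` so that the link opens in a new tab or window.

```python
import re

# Assumed tracker URL template; adjust if the issues live elsewhere.
ISSUE_URL = "https://github.com/nvaccess/nvda/issues/{number}"

# A '#' followed by digits, not preceded by a word character or a slash,
# so that strings such as "nvaccess#8734" or URL fragments are skipped.
_TICKET = re.compile(r"(?<![\w/])#(\d+)\b")


def link_tickets(line: str) -> str:
    """Turn bare ticket references like (#8734) into links that open in a new tab."""
    def _to_link(match):
        number = match.group(1)
        url = ISSUE_URL.format(number=number)
        return '<a href="{}" target="_blank">#{}</a>'.format(url, number)

    return _TICKET.sub(_to_link, line)
```

In a markdown extension the same pattern would more likely be registered as an inline pattern that builds an element instead of a string; the regular expression itself would stay the same.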

User guide

A preprocessor should be created that ignores every line that starts with %kc:
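A minimal sketch of that filtering step, written as a plain function that takes a list of lines and returns a list of lines, which is the same shape a Python-Markdown preprocessor's `run` method works with. The function name and the sample lines are made up for illustration.

```python
from typing import Iterable, List


def strip_kc_lines(lines: Iterable[str]) -> List[str]:
    """Return the lines unchanged, minus every line that starts with '%kc:'."""
    return [line for line in lines if not line.startswith("%kc:")]


if __name__ == "__main__":
    sample = ["%kc:title: Sample title", "Some user guide text", "%kc:beginInclude"]
    print(strip_kc_lines(sample))
```

Only lines that begin with the marker are dropped; a line that merely mentions `%kc:` somewhere in the middle is kept.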

Key commands

I think that the keyCommandsDoc module can be converted to a markdown extension as well, where the whole walk through the user guide can be a preprocessor

nishimotz · 2018-12-21T13:34:13Z

I have worked around txt2tags regarding Python 3.

https://github.com/nvdajp/txt2tags

So far, only tested with NVDA documents in Japanese and English.

Please take a look.

Thank you @LeonarddeR for telling me about the modernizer tool elsewhere.

LeonarddeR · 2018-12-21T13:52:22Z

Thanks @nishimotz! Good to know that there is actually a way to stay at txt2tags for a while.

dpy013 · 2019-04-28T10:19:52Z

> there is also a txt2po and a ini2po converter. So we can convert with pandoc to .txt and then generate .po files which in turn can be converted to html with translate toolkit.

> We can convert the userguide and changes with Pandoc to github flavour markdown (gfm) and then save them as .md files. They can easily be converted to html with pandoc, also automatically. The next step would be to restructure the content so that strings can be extracted by getText and squashed into a .pot file. But this means much more work for NV Access in the long term because every addition to userguide and to changes must be localized which I guess it is not as easy as it is done currently. In my view I can also go with .md files in gfm format. Especially because I can browse the userguide much faster and can see the context of the sentences immediately.

Hello Adriani,
Is it possible to convert the NVDA User Guide to a Sphinx-supported file and then generate a .pot with Sphinx?

Adriani90 · 2019-04-28T20:55:53Z

@dingpengyu unfortunately I am not an expert in reST files and sphinx markup. It is certainly possible to convert parts of the userguide into sphinx with pandoc, but we actually want an easy markup like txt2tags or gfm because it is very user friendly. However, if we converted the files to sphinx markup, I am not sure whether the markup supports all the bash scripts used in the translation system and in the document structure (i.e. direct links to github issues etc.). Could you please elaborate on the benefits we would have with Sphinx? Where are the limits of this markup?

LeonarddeR · 2019-05-29T15:02:23Z

@nishimotz: would you be able to provide a pr for all the code you changed that is part of the NVDA repository? Also, I think we prefer a conversion to Python 3 code, not necessarily compatible with both (e.g. using six). I think we should also stick to the txt2tags version we're currently using in the build process, regardless whether that's the most recent one.

I think it is really important that we step away from txt2tags at some point. However for now, I think it is important to focus on creating a distribution based on Python 3 that actually works. Converting all the documentation really sounds like a separate project.

nishimotz · 2019-05-30T04:26:52Z

@LeonarddeR Firstly I will remove Python 2 support from my PoC work, then make PR against NVDA repository.
What branch of NVDA should I use?

LeonarddeR · 2019-05-30T10:21:23Z

I think that should be threshold. Note that txt2tags itself resides in https://github.com/nvaccess/nvda-misc-deps/ . I think it makes sense to keep it that way for now.

nishimotz · 2019-05-31T07:42:54Z

@LeonarddeR you mean I should fork nvda-misc-deps repository to make PR?

LeonarddeR · 2019-05-31T07:56:15Z

For the txt2tags file, yes. For other files, such as site_scons.site_tools.t2t, the code resides in the normal NVDA repository. I agree it is slightly confusing.

nishimotz · 2019-05-31T10:15:40Z

@LeonarddeR created PR #9648

* txt2tags for python3 #8734
* address review comments
  Co-Authored-By: Leonard de Ruijter <leonardder@users.noreply.github.com>
* address review comments
* address review comments. revert miscDeps
* address review comment #9648
* Update miscDeps to master containing python3 txt2tags.

feerrenrut · 2019-06-21T07:16:37Z

I'll close this issue, text2tags with python 3 has been addressed with #8734. However, in the future we may look at replacing text2tags. At this point we are assuming the text2tags project is dead. The information here will likely be helpful when we come back to this idea.

LeonarddeR · 2019-07-13T17:29:43Z

We're discussing alternatives for epydoc in #9840.

commit befffdd
Author: Takuya Nishimoto <nishimotz@gmail.com>
Date: Fri May 31 22:35:55 2019 +0900
address review comments. revert miscDeps

commit 3070fcc
Author: Takuya Nishimoto <nishimotz@gmail.com>
Date: Fri May 31 22:32:13 2019 +0900
address review comments

commit e32fa0e
Author: Takuya Nishimoto <nishimotz@gmail.com>
Date: Fri May 31 22:27:26 2019 +0900
address review comments
Co-Authored-By: Leonard de Ruijter <leonardder@users.noreply.github.com>

commit adcfdea
Author: Takuya Nishimoto <nishimotz@gmail.com>
Date: Fri May 31 17:34:24 2019 +0900
txt2tags for python3 nvaccess#8734

aureliojargas · 2019-12-28T23:00:35Z

Hi, just a friendly heads up that now finally there's an official Python 3 version for txt2tags in https://github.com/txt2tags/txt2tags/tree/v3, also available in PyPI at https://pypi.org/project/txt2tags/. Maybe that one can fit your needs.

josephsl · 2019-12-29T02:32:35Z

Hi, thanks for this update. Let’s take a look at it after 2019.3 comes out. CC @LeonarddeR, @nishimotz

dpy013 · 2019-12-29T03:34:11Z

In the short term we can keep using txt2tags.
I personally think it would be better to use sphinx to manage documents.

rojanu · 2020-05-31T11:24:58Z

Here is a good comparison list;
https://github.com/KiCad/kicad-doc/tree/master/doc_alternatives.

Personally, I prefer asciidoc; it's human readable and not so different from MD. We use it at work to generate documentation in PDF and html formats, as well as reading the raw format on github. https://po4a.org can be used to translate it as well.

Adriani90 · 2023-12-16T21:17:56Z

The translation system is in the process of changing to Crowdin, and a solution for converting t2t to markdown or something else will become relevant again. I am reopening.

derekriemer · 2024-01-23T05:33:20Z

Any way we can autogenerate the numbered sections? Those are useful for telling someone verbally where to look in the user guide.

seanbudd · 2024-01-23T05:35:02Z

Hi @derekriemer - this is planned and tracked in #16059 using CSS

Closes nvaccess#8734
Part of nvaccess#15014
Related nvaccess/nvda-misc-deps#30, nvaccess#16002, nvaccess#15950, nvaccess#15939, nvaccess#15981

Summary of the issue:

In order to migrate to Crowdin, we must convert txt2tags to markdown. In order to transition safely, beta will build docs from t2t to markdown to html. Eventually the t2t will be removed and the markdown will become the source of truth.

Description of user facing changes:

The User Guide no longer has numbered sections.
Translators and documentation writers now use extended markdown syntax rather than txt2tags.

Description of development approach:

The build system performs certain pre-processing and post-processing when converting t2t to HTML. This equivalent system should be retained, i.e. the translator and documentation contribution experience should remain the same for markdown to HTML.

Additionally, when converting t2t to markdown, new processing rules had to be created. There is no universal standard for custom anchors in markdown. As such, text2tags doesn't have default rules for this. To retain our custom anchors, I added rules for a common markdown extended syntax. Similarly, a shortcut is used for generating the table of contents. See nvaccess/nvda-misc-deps#30

Setting the language code is done using the user_docs folder name, and the direction of RTL is manually set for the 3 current languages that require it: Persian, Arabic, Hebrew.

Special Catalan processing for adding hreflang attributes is converted to a markdown extension syntax of {hreflang=en}

The key commands file is generated using a custom made markdown python extension.
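The language and direction rule in the description above can be sketched as follows. This is an illustration under assumptions, not the code from the pull request: the path layout `user_docs/<language>/<file>`, the function name, and the use of the ISO codes `fa`, `ar` and `he` for Persian, Arabic and Hebrew are all assumed here.

```python
from pathlib import PurePosixPath
from typing import Tuple

# Persian, Arabic and Hebrew, as listed in the description (ISO codes assumed).
RTL_LANGUAGES = frozenset({"fa", "ar", "he"})


def lang_and_direction(doc_path: str) -> Tuple[str, str]:
    """Derive the language code and text direction from a user_docs path.

    Raises ValueError if the path has no 'user_docs' component.
    """
    parts = PurePosixPath(doc_path).parts
    lang = parts[parts.index("user_docs") + 1]
    # Compare only the base language, so a regional variant such as "pt_BR" works too.
    direction = "rtl" if lang.split("_")[0] in RTL_LANGUAGES else "ltr"
    return lang, direction
```

The returned pair could then feed the `lang` and `dir` attributes of the generated HTML page.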

josephsl added the z Python 3 transition (archived) Python 3 transition label Sep 10, 2018

LeonarddeR mentioned this issue Dec 20, 2018

Title of NVDA documentation pages is h2 instead of h1 #3031

Open

LeonarddeR mentioned this issue May 15, 2019

Update setup script to be compatible with Py2exe for Python 3.7 #8375

Closed

LeonarddeR mentioned this issue May 29, 2019

Update NVDA build environment to Python 3 #9638

Closed

nishimotz mentioned this issue May 31, 2019

txt2tags for python3 #8734 #9648

Merged

nishimotz mentioned this issue May 31, 2019

Python3 compatible version of txt2tags nvaccess/nvda-misc-deps#12

Merged

feerrenrut closed this as completed Jun 21, 2019

josephsl mentioned this issue Jul 23, 2019

What's new and readme: we are moving to Python 3.7 #9942

Merged

feerrenrut added the component/documentation label May 2, 2020

Adriani90 mentioned this issue Feb 27, 2023

Please use po file to make documentation translation. #12576

Open

Adriani90 reopened this Dec 16, 2023

Adriani90 mentioned this issue Dec 20, 2023

Convert t2t to markdown, then to HTML #15945

Merged

5 tasks

seanbudd added this to the 2024.1 milestone Dec 21, 2023

seanbudd linked a pull request Dec 28, 2023 that will close this issue

Convert t2t to markdown, then to HTML #15945

Merged

5 tasks

seanbudd mentioned this issue Jan 4, 2024

Remove txt2tags nvaccess/nvda-misc-deps#31

Merged

seanbudd closed this as completed in dac5aa2 Jan 9, 2024
