Python 3: find a suitable and compatible replacement for txt2tags #8734

Closed
josephsl opened this issue Sep 10, 2018 · 35 comments · Fixed by #15945

Comments

@josephsl
Collaborator

Hi,

Similar to #8375:

Background:

NVDA's documentation is written in txt2tags (t2t) markup, much as people would write Markdown and then generate HTML, wiki pages and so on. At the moment this works for Python 2.7, but it could pose a problem for the Python 3 transition.

Apart from the postproc/heading conversion problem (#3031), txt2tags continues to prefer Python 2. Although there was an experimental Python 3 branch, there haven't been any updates since 2010.

Also, as NVDA was localized into more languages, other issues with our current docs translation process emerged, including byte order marks, encoding problems, typos and resulting syntax errors. In the past, there has been a discussion on the NVDA translations list about using a different docs translation process, with some people suggesting Markdown/Gettext for documentation translation management, similar to how add-on entries on the community add-ons website are localized.
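
To make the byte order mark and encoding problems concrete, here is a minimal sketch of a check that could be run over a translated documentation file before it is built. The function name `check_doc_bytes` is hypothetical, and the assumption that the docs are meant to be plain UTF-8 without a BOM is mine, not something stated above.

```python
import codecs
from typing import List


def check_doc_bytes(data: bytes) -> List[str]:
    """Return a list of problems found in the raw bytes of a documentation file."""
    problems = []
    # A UTF-8 byte order mark at the very start of the file (assumed unwanted here).
    if data.startswith(codecs.BOM_UTF8):
        problems.append("starts with a UTF-8 byte order mark")
    # Bytes that do not decode as UTF-8 were probably saved in a legacy encoding.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        problems.append(f"not valid UTF-8 at byte offset {exc.start}")
    return problems
```

A caller would read the file in binary mode and report any non-empty result to the translator.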

Although this issue won't have immediate impact, it will become a showstopper once we declare transition to Python 3. Thus, to minimize shock throughout the community (especially for translators and code contributors), I propose that we find a suitable and compatible alternative to t2t.

Dependency requirements:

  1. Compatible with both Python 2.7 and at least 3.5.
  2. Makes it easier for translators to localize documentation and minimizes errors.
  3. A suitable path for moving from t2t to the new format.

Steps to reproduce:

Try compiling NVDA with Python 3 (scons, ignoring a bit about winreg module name).

Actual behavior:

Various errors are thrown by Python due to txt2tags issues.

Expected behavior:

NVDA compiles in Python 3 mode.

System configuration:

NVDA Installed/portable/running from source:

Not applicable

NVDA version:

N/A

Windows version:

N/A

Name and version of other software in use when reproducing the issue:

Python 2.7.15, 3.7.0, txt2tags 2.5

Other information about your system:

N/A

Other questions:

Does the issue still occur after restarting your PC?

Yes

Have you tried any other versions of NVDA?

Yes

Possible new dependencies:

The closest is Markdown (md), as the add-ons community and contributors using GitHub are familiar with this. Also, it might be possible to use Ikiwiki's Gettext plug-in to transform po into md files.

Other suggestions are welcome.

Thanks.

@josephsl josephsl added the z Python 3 transition (archived) Python 3 transition label Sep 10, 2018
@zstanecic
Contributor

zstanecic commented Sep 10, 2018 via email

@josephsl
Collaborator Author

josephsl commented Sep 10, 2018 via email

@zstanecic
Contributor

zstanecic commented Sep 10, 2018 via email

@zstanecic
Contributor

zstanecic commented Sep 10, 2018 via email

@LeonarddeR
Collaborator

@zstanecic: I believe the PHP Markdown extensions support tables to some extent, and the Python Markdown module should support these extensions. There is also an extension API.

We would need more extensions, such as a table of contents.

I would be an advocate of using Markdown. As @josephsl stated, translators are familiar with Markdown, and it is used in other areas as well, e.g. on GitHub.

@zstanecic
Contributor

zstanecic commented Dec 11, 2018 via email

@Adriani90
Collaborator

We can convert the user guide and the changes file with Pandoc to GitHub Flavored Markdown (gfm) and then save them as .md files. They can easily be converted to HTML with Pandoc, also automatically. The next step would be to restructure the content so that strings can be extracted by gettext and squashed into a .pot file. But this means much more work for NV Access in the long term, because every addition to the user guide and to the changes file must be localized, which I guess is not as easy as it is currently. In my view, I can also go with .md files in gfm format, especially because I can browse the user guide much faster and can see the context of the sentences immediately.
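
As a rough sketch of the Pandoc step, the conversion could be driven from Python as below. It assumes a Pandoc build that has the `t2t` reader and the `gfm` writer and is available on PATH; the function names and the file names in the usage note are illustrative only.

```python
import subprocess
from typing import List


def build_pandoc_args(source: str, target: str) -> List[str]:
    """Build a pandoc command line that reads txt2tags and writes GitHub Flavored Markdown."""
    return ["pandoc", "--from", "t2t", "--to", "gfm", source, "--output", target]


def convert(source: str, target: str) -> None:
    """Run the conversion; raises CalledProcessError if pandoc reports a failure."""
    subprocess.run(build_pandoc_args(source, target), check=True)
```

For example, `convert("userGuide.t2t", "userGuide.md")` would write the Markdown next to the source. Whether Pandoc handles NVDA-specific t2t settings is a separate question that this sketch does not address.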

@Adriani90
Collaborator

But in general, .po files can be translated much faster.

@Adriani90
Collaborator

There is also a txt2po and an ini2po converter. So we can convert with Pandoc to .txt and then generate .po files, which in turn can be converted to HTML with the Translate Toolkit.

@LeonarddeR
Collaborator

I don't think a conversion to pot files is going to happen for the user guide. The current user guide translation process gives users much more freedom than gettext based translation. This means that for some languages, some user guides might have extra paragraphs, or other paragraphs are extended, missing or shortened.

@derekriemer
Collaborator

Overall, I am a big supporter of switching to markdown. Apparently pandoc can convert from t2t, and pandoc can do tons of things, so we have a lot of freedom here.

@LeonarddeR
Collaborator

LeonarddeR commented Dec 12, 2018 via email

@Adriani90
Collaborator

Adriani90 commented Dec 12, 2018 via email

@Adriani90
Collaborator

Adriani90 commented Dec 13, 2018

This does not seem very easy to accomplish, because GitHub Flavored Markdown does not seem to have an equivalent for %includeconf, which is currently used by txt2tags. With Pandoc you can create a sort of bash script, but it is quite limited, at least from what I can see. The problem is that I don't know how to include configs of bash scripts or other external files in the user guide or changes file. Translators, at least, are more used to the markup of the add-ons homepage, which is not GitHub flavored. Maybe that would also be an option, but I doubt it, because that markup does not seem very rich in functionality. There are only standard elements like tables, bullets and so on.

Actually, txt2tags is quite a good markup. It is too bad that it probably does not meet the Python 3 requirements.

@LeonarddeR
Collaborator

I think we should start with the changes files as an experiment for this. After that, we can expand to the user guide as well.

@LeonarddeR
Collaborator

LeonarddeR commented Dec 20, 2018

Here is an overview of how the extension API can be used to accomplish several tasks that are now performed by t2t.

Global

Looks like we can drop them all :)

Build

NVDA_VERSION, NVDA_URL and NVDA_COPYRIGHT_YEARS all can be passed at runtime and should be defined in the extension as options. See Integrating Your Code Into Markdown > Configuration Settings. Then, they can be replaced using inline patterns.
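
To illustrate the idea of passing the build values at runtime, here is a plain-regex sketch rather than an actual Python-Markdown inline pattern. It assumes the three names appear in the source text as bare words; `substitute_build_vars` is a hypothetical helper, not part of any existing module.

```python
import re
from typing import Dict

# Matches any of the three build-time names as a whole word.
_BUILD_VAR = re.compile(r"\b(NVDA_VERSION|NVDA_URL|NVDA_COPYRIGHT_YEARS)\b")


def substitute_build_vars(text: str, values: Dict[str, str]) -> str:
    """Replace build-time names with the values supplied at runtime."""
    # Names without a supplied value are left untouched.
    return _BUILD_VAR.sub(lambda m: values.get(m.group(1), m.group(0)), text)
```

In an extension, the `values` mapping would come from the extension's configuration options, and the same regular expression could back an inline pattern.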

Changes file:

  • Make ticket references into links: see Inline Patterns>future
  • Make links open in a new tab/window: the markdown extensions overview mentions a newtab extension, but the github URL is dead. Alternatively, we can easily create something ourselves, see Working with the ElementTree > def set_link_class(self, element):
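
Both changes-file tasks can be sketched with a single regular expression. This is a standalone illustration and not a Python-Markdown inline pattern. The issue URL template is an assumption based on the nvaccess/nvda repository name, and the sketch does not try to skip references inside code spans or existing links.

```python
import re

# Assumed location of tickets; adjust if references should point elsewhere.
ISSUE_URL = "https://github.com/nvaccess/nvda/issues/{number}"
_TICKET = re.compile(r"#(\d+)")


def link_tickets(text: str) -> str:
    """Turn '#1234' style references into HTML links that open in a new tab."""
    def repl(match):
        number = match.group(1)
        url = ISSUE_URL.format(number=number)
        return f'<a href="{url}" target="_blank">#{number}</a>'
    return _TICKET.sub(repl, text)
```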

User guide

A preprocessor should be created that ignores every line that starts with %kc:
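
A minimal sketch of that filtering step, written as a plain function with the lines-in, lines-out shape that a Python-Markdown preprocessor also uses. The name `strip_kc_lines` is hypothetical.

```python
from typing import List


def strip_kc_lines(lines: List[str]) -> List[str]:
    """Drop every line that starts with '%kc:' and keep the rest in order."""
    return [line for line in lines if not line.startswith("%kc:")]
```

Wrapping this in a preprocessor class would mostly be a matter of calling it from the class's `run` method.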

Key commands

I think that the keyCommandsDoc module can be converted to a markdown extension as well, where the whole walk through the user guide can be a preprocessor

@nishimotz
Contributor

I have worked around txt2tags regarding Python 3.

https://github.com/nvdajp/txt2tags

So far, only tested with NVDA documents in Japanese and English.

Please take a look.

Thank you @LeonarddeR for letting me know the modernizer tool at somewhere else.

@LeonarddeR
Collaborator

Thanks @nishimotz! Good to know that there is actually a way to stay at txt2tags for a while.

@dpy013
Contributor

dpy013 commented Apr 28, 2019

there is also a txt2po and a ini2po converter. So we can convert with pandoc to .txt and then generate .po files which in turn can be converted to html with translate toolkit.

We can convert the userguide and changes with Pandoc to github flavour markdown (gfm) and then save them as .md files. They can easily be converted to html with pandoc, also automatically. The next step would be to restructure the content so that strings can be extracted by getText and squashed into a .pot file. But this means much more work for NV Access in the long term because every addition to userguide and to changes must be localized which I guess it is not as easy as it is done currently. In my view I can also go with .md files in gfm format. Especially because I can browse the userguide much faster and can see the context of the sentences immediately.
Hello Adriani,
Is it possible to convert the NVDA User Guide to a Sphinx-supported file and then generate a .pot with Sphinx?

@Adriani90
Collaborator

@dingpengyu unfortunately I am not an expert in reST files and Sphinx markup. It is certainly possible to convert parts of the user guide to Sphinx with Pandoc, but we actually want an easy markup like txt2tags or gfm because it is very user friendly. However, if we converted the files to Sphinx markup, I am not sure whether the markup supports all the bash scripts used in the translation system and in the document structure (e.g. direct links to GitHub issues). Could you please elaborate on the benefits we would have with Sphinx? Where are the limits of this markup?

@LeonarddeR
Collaborator

@nishimotz: would you be able to provide a pr for all the code you changed that is part of the NVDA repository? Also, I think we prefer a conversion to Python 3 code, not necessarily compatible with both (e.g. using six). I think we should also stick to the txt2tags version we're currently using in the build process, regardless whether that's the most recent one.

I think it is really important that we step away from txt2tags at some point. However for now, I think it is important to focus on creating a distribution based on Python 3 that actually works. Converting all the documentation really sounds like a separate project.

@nishimotz
Contributor

@LeonarddeR Firstly I will remove Python 2 support from my PoC work, then make PR against NVDA repository.
What branch of NVDA should I use?

@LeonarddeR
Collaborator

LeonarddeR commented May 30, 2019 via email

@nishimotz
Contributor

@LeonarddeR you mean I should fork nvda-misc-deps repository to make PR?

@LeonarddeR
Collaborator

For the txt2tags file, yes. For other files, such as site_scons.site_tools.t2t, the code resides in the normal NVDA repository. I agree it is slightly confusing.

@nishimotz
Contributor

@LeonarddeR created PR #9648

michaelDCurran pushed a commit that referenced this issue Jun 11, 2019
* txt2tags for python3 #8734

* address review comments

Co-Authored-By: Leonard de Ruijter <leonardder@users.noreply.github.com>

* address review comments

* address review comments. revert miscDeps

* address review comment #9648

* Update miscDeps to master containing python3 txt2tags.
@feerrenrut
Contributor

I'll close this issue; txt2tags with Python 3 has been addressed with #8734. However, in the future we may look at replacing txt2tags. At this point we are assuming the txt2tags project is dead. The information here will likely be helpful when we come back to this idea.

@LeonarddeR
Collaborator

We're discussing alternatives for epydoc in #9840.

LeonarddeR pushed a commit to LeonarddeR/nvda that referenced this issue Nov 14, 2019
commit befffdd
Author: Takuya Nishimoto <nishimotz@gmail.com>
Date:   Fri May 31 22:35:55 2019 +0900

    address review comments. revert miscDeps

commit 3070fcc
Author: Takuya Nishimoto <nishimotz@gmail.com>
Date:   Fri May 31 22:32:13 2019 +0900

    address review comments

commit e32fa0e
Author: Takuya Nishimoto <nishimotz@gmail.com>
Date:   Fri May 31 22:27:26 2019 +0900

    address review comments

    Co-Authored-By: Leonard de Ruijter <leonardder@users.noreply.github.com>

commit adcfdea
Author: Takuya Nishimoto <nishimotz@gmail.com>
Date:   Fri May 31 17:34:24 2019 +0900

    txt2tags for python3 nvaccess#8734
@aureliojargas

Hi, just a friendly heads up that now finally there's an official Python 3 version for txt2tags in https://github.com/txt2tags/txt2tags/tree/v3, also available in PyPI at https://pypi.org/project/txt2tags/. Maybe that one can fit your needs.

@josephsl
Collaborator Author

josephsl commented Dec 29, 2019 via email

@dpy013
Contributor

dpy013 commented Dec 29, 2019

For short periods we can use txt2tags
I personally think it would be better to use sphinx to manage documents.

@rojanu

rojanu commented May 31, 2020

Here is a good comparison list;
https://github.com/KiCad/kicad-doc/tree/master/doc_alternatives.

Personally, I prefer AsciiDoc: it's human readable and not so different from MD. We use it at work to generate documentation in PDF and HTML formats, as well as reading the raw format on GitHub. https://po4a.org can be used to translate it as well.

@Adriani90
Collaborator

The translation system is in the process of changing to Crowdin, and a solution for converting t2t to Markdown or something else will become relevant again. I am reopening.

@Adriani90 Adriani90 reopened this Dec 16, 2023
@seanbudd seanbudd added this to the 2024.1 milestone Dec 21, 2023
@seanbudd seanbudd linked a pull request Dec 28, 2023 that will close this issue
5 tasks
@derekriemer
Collaborator

derekriemer commented Jan 23, 2024 via email

@seanbudd
Member

Hi @derekriemer - this is planned and tracked in #16059 using CSS

Adriani90 pushed a commit to Adriani90/nvda that referenced this issue Mar 13, 2024
Closes nvaccess#8734
Part of nvaccess#15014
Related nvaccess/nvda-misc-deps#30, nvaccess#16002, nvaccess#15950, nvaccess#15939, nvaccess#15981

Summary of the issue:
In order to migrate to Crowdin, we must convert txt2tags to markdown.
In order to transition safely, beta will build docs from t2t to markdown to html.
Eventually the t2t will be removed and the markdown will become the source of truth.

Description of user facing changes
The user Guide no longer has numbered sections

Translators and documentation writers now use extended markdown syntax rather than txt2tags.

Description of development approach
The build system performs certain pre-processing and post-processing when converting t2t to HTML.
This equivalent system should be retained - i.e. the translator and documentation contribution experience should remain the same for markdown to HTML.
Additionally, when converting t2t to markdown, new processing rules had to be created.

There is no universal standard for custom anchors in markdown.
As such, txt2tags doesn't have default rules for this. To retain our custom anchors, I added rules for a common markdown extended syntax.
Similarly, a shortcut is used for generating the table of contents.
See nvaccess/nvda-misc-deps#30

Setting the language code is done using the user_docs folder name, and the direction of RTL is manually set for the 3 current languages that require it, Persian, Arabic, Hebrew.
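
The folder-name rule can be sketched as follows. The path layout `user_docs/<lang>/<file>` with forward slashes and the function name `lang_and_direction` are assumptions for illustration; the three right-to-left codes are the ISO 639-1 codes for Arabic, Persian and Hebrew.

```python
from pathlib import PurePosixPath
from typing import Tuple

# ISO 639-1 codes for Arabic, Persian and Hebrew, set manually as described above.
RTL_LANGUAGES = {"ar", "fa", "he"}


def lang_and_direction(doc_path: str) -> Tuple[str, str]:
    """Derive (language code, text direction) from a path like 'user_docs/<lang>/<file>'."""
    # The language code is simply the name of the folder that contains the file.
    lang = PurePosixPath(doc_path).parent.name
    direction = "rtl" if lang in RTL_LANGUAGES else "ltr"
    return lang, direction
```

The two values would then typically end up in the `lang` and `dir` attributes of the generated HTML.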

Special Catalan processing for adding hreflang attributes is converted to a markdown extension syntax of {hreflang=en}

The key commands file is generated using a custom made markdown python extension.