Switching to YAML for config #128

Open
brokkr opened this issue May 5, 2021 · 5 comments

Owner

brokkr commented May 5, 2021

I believe either JSON or YAML would allow for significant code simplification (there are a lot of lines just parsing the XML). And I don't believe we really depend on the things that XML brings to the table (namespaces etc.)?

Advantages of JSON: Built-in (one less dependency), fast
Advantages of YAML: Easy to read, supplanting pickle?

It really depends on how well either one deals with the heavily nested structure of, say, a subscription with filters, metadata and renames. Also, how would XML attributes translate?

Owner Author

brokkr commented Jun 21, 2021

Leaving aside the question of attributes, YAML seems like an easy choice.

  • It's easy to read and write
  • Recreating the poca.xml structure is easy
  • Parsing it is two lines of code and I have the entire dictionary in hand

Example:

defaults:
  metadata:
    genre: podcast
settings:
  base_dir: /mnt/media/
  filenames: permissive
  id3removev1: true
  id3v2version: 3
  useragent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0
subscriptions:
- filters:
    title: natur
  max_number: 2
  metadata:
    artist: Det:Er
  rename:
  - title
  - episode_title
  title: "Er h\xE5bet stadig gr\xF8nt?"
  url: https://www.dr.dk/mu/feed/er-habet-stadig-groent.xml?format=podcast
- filters:
    filename: CCEpSpecial
  max_number: 2
  title: The Crate and Crowbar
  url: http://crateandcrowbar.com/category/episode/feed/
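
As a rough illustration of the "two lines of code" point, loading a config shaped like the one above with PyYAML might look like the sketch below. It assumes PyYAML is installed; the inline string stands in for a config file (reading from an open file object works the same way with safe_load), and the variable names are made up for the example.

```python
import yaml  # PyYAML, a third-party package (assumed to be installed)

# A trimmed-down copy of the example config, inlined so the sketch is
# self-contained. In practice this text would come from a config file.
CONFIG_TEXT = """\
settings:
  base_dir: /mnt/media/
  id3v2version: 3
subscriptions:
- filters:
    filename: CCEpSpecial
  max_number: 2
  title: The Crate and Crowbar
  url: http://crateandcrowbar.com/category/episode/feed/
"""

# safe_load builds plain dicts, lists, strings and numbers only.
config = yaml.safe_load(CONFIG_TEXT)

# The nested structure is now ordinary Python data.
first = config["subscriptions"][0]
print(first["title"], first["filters"]["filename"], first["max_number"])
```

Nested mappings come back as dicts and the subscription sequence as a list, so no XML-style tree walking is needed.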

A few issues:

  • Dictionary keys are reordered on dump (PyYAML sorts them alphabetically by default). Apparently this can be avoided using sort_keys=False when dumping (https://stackoverflow.com/a/64902048), but it doesn't seem to work out of the box with safe_dump.
  • YAML dumping by default escapes non-ASCII characters (hence the \xE5 and \xF8 in the title above). Apparently it can write them as-is using the allow_unicode=True setting (https://stackoverflow.com/a/29600111). This also works for safe_dump (tested).
  • Biggest one: How to validate input? YAML doesn't come with schemas so some options are:
  • Read up on how safe safe_load actually is. And if we safe_load, does it matter whether we also safe_dump?
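
To make the dumping and loading points above concrete, here is a small sketch. It assumes a reasonably recent PyYAML (as far as I know, sort_keys was added in 5.1, which may explain it not working on older installs); the config dict and the helper name rejects_python_tags are made up for the example.

```python
import yaml  # PyYAML, assumed to be version 5.1 or newer for sort_keys

config = {
    "settings": {"base_dir": "/mnt/media/", "id3v2version": 3},
    "defaults": {"metadata": {"genre": "podcast"}},
    "subscriptions": [{"title": "Er håbet stadig grønt?", "max_number": 2}],
}

# Defaults: keys are sorted alphabetically and non-ASCII is escaped.
default_text = yaml.safe_dump(config)

# Keep insertion order and write non-ASCII characters as-is.
tuned_text = yaml.safe_dump(config, sort_keys=False, allow_unicode=True)


def rejects_python_tags():
    """Check that safe_load refuses a Python-specific object tag."""
    try:
        yaml.safe_load("!!python/object/apply:os.getcwd []")
    except yaml.YAMLError:
        return True
    return False


print(default_text)
print(tuned_text)
print(rejects_python_tags())
```

Both dumps load back to the same dictionary; only the text on disk differs. The last check hints at what "safe" means on the loading side: tags that would construct arbitrary Python objects are refused rather than executed.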

Owner Author

brokkr commented Jul 4, 2021

Also: pathlib.Path (Python 3.4) for all Paths class operations

Owner Author

brokkr commented Jul 5, 2021

Note: using os.access on a path-like object (e.g. pathlib's Path class) requires Python 3.6
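
A minimal sketch of what that looks like in practice, using only the standard library. The helper name is_writable_dir is hypothetical and the temporary directory merely stands in for something like base_dir:

```python
import os
import tempfile
from pathlib import Path


def is_writable_dir(path: Path) -> bool:
    """True if path is an existing directory we may create files in."""
    # Passing a Path straight to os.access needs Python 3.6+;
    # on 3.4/3.5 it would have to be os.access(str(path), ...).
    return path.is_dir() and os.access(path, os.W_OK | os.X_OK)


with tempfile.TemporaryDirectory() as tmp:
    base_dir = Path(tmp)
    print(is_writable_dir(base_dir))
    print(is_writable_dir(base_dir / "missing"))
```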

Owner Author

brokkr commented Jul 26, 2021

Still to do:

  • input validation
  • safe_loading/safe_dumping
  • updating 301
  • save state.yaml
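
For the input validation item, one option that needs no extra dependency is a hand-rolled check over the dictionary that safe_load returns. The sketch below is only an illustration: the function name validate_config and the choice of which keys are required (settings.base_dir, and title and url per subscription) are assumptions, not decisions made in this issue.

```python
def validate_config(config):
    """Return a list of readable problems; an empty list means no issues."""
    if not isinstance(config, dict):
        return ["top level: not a mapping"]
    errors = []

    settings = config.get("settings")
    if not isinstance(settings, dict):
        errors.append("settings: missing or not a mapping")
    elif not isinstance(settings.get("base_dir"), str):
        errors.append("settings.base_dir: missing or not a string")

    subs = config.get("subscriptions")
    if not isinstance(subs, list):
        errors.append("subscriptions: missing or not a list")
        return errors

    for i, sub in enumerate(subs):
        if not isinstance(sub, dict):
            errors.append(f"subscriptions[{i}]: not a mapping")
            continue
        # Assumed required keys for every subscription.
        for key in ("title", "url"):
            if not isinstance(sub.get(key), str):
                errors.append(f"subscriptions[{i}].{key}: missing or not a string")
        # Optional key; bool is excluded because it is a subclass of int.
        max_number = sub.get("max_number")
        if max_number is not None and (
            isinstance(max_number, bool) or not isinstance(max_number, int)
        ):
            errors.append(f"subscriptions[{i}].max_number: not an integer")
    return errors


bad = {"settings": {}, "subscriptions": [{"title": "x", "max_number": "two"}]}
for problem in validate_config(bad):
    print(problem)
```

Collecting all problems instead of raising on the first one lets the user fix a config file in one pass.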

Owner Author

brokkr commented Jul 27, 2021

We can now save and read state.yaml including converting time_structs and pathlib.Paths back and forth. Still to do in addition to those mentioned above:

  • consolidate handling of yaml in one submodule
  • save 410 codes to state
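
For readers wondering what the back-and-forth conversion involves, here is one possible approach, not necessarily the one used in poca: turn time.struct_time and pathlib.Path values into plain lists and strings before dumping, and rebuild them after loading. The entry layout and the field names path and published are hypothetical.

```python
import time
from pathlib import Path


def encode_entry(entry):
    """Convert one state entry to plain types that safe_dump can handle."""
    return {
        "path": str(entry["path"]),
        # A struct_time is a sequence of nine integers.
        "published": list(entry["published"]),
    }


def decode_entry(plain):
    """Rebuild the Path and struct_time values after safe_load."""
    return {
        "path": Path(plain["path"]),
        "published": time.struct_time(plain["published"]),
    }


entry = {"path": Path("/mnt/media/show/ep1.mp3"), "published": time.gmtime(0)}
plain = encode_entry(entry)
print(plain)
print(decode_entry(plain) == entry)
```

Keeping the stored form to strings, integers and lists means the state file can be read and written with safe_load and safe_dump alone.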

brokkr changed the title from "Switching to JSON/YAML for config" to "Switching to YAML for config" on Jul 27, 2021