Readium Speech

Readium Speech is a TypeScript library for implementing a read-aloud feature with Web technologies. It follows best practices gathered through interviews with members of the digital publishing industry.

While this project is still in a very early stage, it is meant to power the read aloud feature for two different Readium projects: Readium Web and Thorium.

Readium Speech was spun out as a separate project in order to facilitate its integration as a shared component, but also because of its potential outside of the realm of ebook reading apps.

Scope

  • Extracting Guided Navigation objects from a document (or a fragment of a document)
  • Generating utterances from these Guided Navigation objects
  • Processing utterances (prepending/appending text to utterances based on context, pronunciation through SSML/PLS…)
  • Voice selection
  • TTS playback
  • Highlighting

Current focus

For our initial work on this project, we're focusing on voice selection based on recommended voices.

The outline of this work has been explored in a GitHub discussion and through a best practices document.

This work is currently under review in a draft PR.

Demo

A live demo of the voice selection API is available.

It demonstrates the following features:

  • fetching a list of all available languages, translating their names into the user's locale and sorting them based on these translations
  • returning a list of voices for a given language, grouped by region and sorted based on quality
  • filtering languages and voices based on gender and offline availability
  • using embedded test utterances to demo voices
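
To make the first three features more concrete, here is a minimal sketch of how they could be approached. It is not the library's actual API: the `Voice` shape, the quality levels, and the function names `filterVoices`, `listLanguages` and `voicesByRegion` are all hypothetical and chosen for illustration only. The only platform APIs it relies on are the standard `Intl.DisplayNames` and `Intl.Collator`.

```typescript
// Hypothetical voice shape for illustration; the library's real types may differ.
type Quality = "veryHigh" | "high" | "normal" | "low";
type Gender = "female" | "male" | "neutral";

interface Voice {
  name: string;
  language: string; // BCP 47 tag such as "en-GB"
  quality: Quality;
  gender?: Gender;
  offline: boolean;
}

interface VoiceFilter {
  gender?: Gender;
  offline?: boolean;
}

// Lower rank means better quality (assumed ordering).
const QUALITY_RANK: Record<Quality, number> = { veryHigh: 0, high: 1, normal: 2, low: 3 };

// Simplification: assumes tags of the form "language-REGION" (no script subtag).
const primaryLanguage = (tag: string): string => tag.split("-")[0].toLowerCase();
const regionOf = (tag: string): string => (tag.split("-")[1] ?? "").toUpperCase();

// Keep only the voices matching every criterion that is set in the filter.
function filterVoices(voices: Voice[], filter: VoiceFilter): Voice[] {
  return voices.filter(
    (v) =>
      (filter.gender === undefined || v.gender === filter.gender) &&
      (filter.offline === undefined || v.offline === filter.offline)
  );
}

// List distinct languages, labelled in the user's locale and sorted by that label.
function listLanguages(voices: Voice[], userLocale: string): { code: string; label: string }[] {
  const codes = Array.from(new Set(voices.map((v) => primaryLanguage(v.language))));
  const names = new Intl.DisplayNames([userLocale], { type: "language" });
  const collator = new Intl.Collator(userLocale);
  return codes
    .map((code) => ({ code, label: names.of(code) ?? code }))
    .sort((a, b) => collator.compare(a.label, b.label));
}

// Group the voices of one language by region, best quality first within each region.
function voicesByRegion(voices: Voice[], language: string): Map<string, Voice[]> {
  const groups = new Map<string, Voice[]>();
  for (const v of voices) {
    if (primaryLanguage(v.language) !== language) continue;
    const region = regionOf(v.language);
    const group = groups.get(region) ?? [];
    group.push(v);
    groups.set(region, group);
  }
  groups.forEach((group) =>
    group.sort((a, b) => QUALITY_RANK[a.quality] - QUALITY_RANK[b.quality])
  );
  return groups;
}

// Made-up sample data, not real system voices.
const sampleVoices: Voice[] = [
  { name: "Alice", language: "en-GB", quality: "normal", gender: "female", offline: true },
  { name: "Bruno", language: "fr-FR", quality: "high", gender: "male", offline: false },
  { name: "Carol", language: "en-US", quality: "low", gender: "female", offline: true },
  { name: "Dana", language: "en-US", quality: "veryHigh", gender: "female", offline: false },
  { name: "Emil", language: "de-DE", quality: "normal", gender: "male", offline: true },
];

const languages = listLanguages(sampleVoices, "en");
const englishByRegion = voicesByRegion(sampleVoices, "en");
const offlineFemale = filterVoices(sampleVoices, { gender: "female", offline: true });
```

In a browser, the input list would typically be derived from `speechSynthesis.getVoices()`. The Web Speech API does not expose quality or gender, so those fields would have to come from another source, such as a curated list of recommended voices.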