Hera - An Operating System Level Voice Recognition Package

This project presents Hera, an operating system level voice recognition package that understands voice commands and performs actions to simplify the user's workflow. We propose a modern way of interacting with Linux systems, in which the latency of conventional physical input is minimized through natural language speech recognition.

Introduction

Our project proposes a new way of interacting with the operating system, one that prioritizes improving the user experience through voice commands. The system recognizes spoken language, draws meaningful conclusions from it, and responds accordingly. Unlike the traditional approach, which relies heavily on physical input, Hera offers an alternative through voice interaction. The traditional physical input remains available alongside the voice interface, so the user gets the best of both worlds.

Features

  • Custom wake word detection
  • Natural Language Understanding
  • Ability to launch applications (see the sketch after this list)
  • Launch custom scripts
  • Play music and movies from a specified folder
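
As a concrete illustration of the application-launching feature, the sketch below shows how a recognized voice intent could be mapped to a desktop program using Python's standard subprocess module. The intent names and commands are placeholders for illustration only, not Hera's actual configuration.

# Hypothetical example: launching an application from a recognized intent.
# The intent-to-command table below is a placeholder, not Hera's real mapping.
import subprocess

APP_COMMANDS = {
    "open_browser": ["firefox"],
    "open_editor": ["gedit"],
}

def launch_application(intent: str) -> bool:
    command = APP_COMMANDS.get(intent)
    if command is None:
        return False
    # Popen returns immediately, so the assistant can keep listening.
    subprocess.Popen(command)
    return True

if __name__ == "__main__":
    launch_application("open_browser")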

Features to be added

  • Usage analysis

Methodology

To embed speech recognition into the Linux operating system effectively and efficiently, we employ a multi-module approach consisting of the Assistant, Coordinator, and Skill modules. These modules determine how voice data is collected, processed, and evaluated. The system operates in two phases: the Assistant-Coordinator (primary) phase and the Coordinator-Skill-Synthesis (secondary) phase. The primary phase transcribes voice data into the corresponding intents. The secondary phase maps intents to the corresponding skills and provides feedback in the form of speech or raw data, as sketched below.
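
The sketch below is a minimal, hypothetical rendering of this two-phase design in Python; the class and method names follow the terminology above, but they are assumptions rather than Hera's actual code.

# Hypothetical sketch of the Assistant -> Coordinator -> Skill pipeline.
# The names mirror the terminology above; the real implementation differs.

class Assistant:
    """Primary phase: capture audio and transcribe it into an intent."""

    def transcribe(self, audio_bytes: bytes) -> str:
        # In Hera this would run the speech-to-text and intent models;
        # a canned intent is returned here for illustration.
        return "play_music"


class Skill:
    """A single capability, e.g. launching an app or playing media."""

    def handle(self, intent: str) -> str:
        raise NotImplementedError


class PlayMusicSkill(Skill):
    def handle(self, intent: str) -> str:
        return "Playing music from your media folder."


class Coordinator:
    """Secondary phase: map an intent to a skill and return feedback."""

    def __init__(self) -> None:
        self.skills = {"play_music": PlayMusicSkill()}

    def dispatch(self, intent: str) -> str:
        skill = self.skills.get(intent)
        if skill is None:
            return f"Sorry, I cannot handle '{intent}' yet."
        return skill.handle(intent)


if __name__ == "__main__":
    assistant = Assistant()
    coordinator = Coordinator()
    intent = assistant.transcribe(b"...raw audio...")  # primary phase
    print(coordinator.dispatch(intent))                # secondary phase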

Installation

Python 3.7 is required for the dependencies. Check your Python version by running:

python --version
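
If you prefer checking from within Python, here is a minimal equivalent (the 3.7 requirement is the one stated above):

import sys

# The dependencies target Python 3.7, as stated above.
if sys.version_info[:2] != (3, 7):
    print(f"Warning: running Python {sys.version.split()[0]}, but Hera expects 3.7")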

Setting up a virtual environment

We recommend installing Hera in a separate virtual environment:

sudo apt install python3-venv
python3 -m venv env
source env/bin/activate

Clone the repository

git clone https://github.com/HeyHera/Hera.git

Installing dependencies

pip install -r requirements.txt

Download necessary models

Models for wake word detection and intent classification are provided in the repository itself. Other models need to be downloaded and placed in the appropriate directory.
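
As a small, optional sanity check before the first run, the snippet below verifies that the expected model files exist. The directory and file names are placeholders only; use the paths the repository actually specifies.

from pathlib import Path

# Placeholder paths for illustration; substitute the model directory and
# file names that the repository actually uses.
MODEL_DIR = Path("models")
EXPECTED_MODELS = ["wake_word.model", "intent_classifier.model"]

missing = [name for name in EXPECTED_MODELS if not (MODEL_DIR / name).exists()]
if missing:
    print("Missing model files:", ", ".join(missing))
else:
    print("All expected model files are in place.")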

Running Hera

python app.py

Our Mentor

  • Ahammed Siraj K K

Members of the team

Troubleshooting

Before running Hera, test your microphone:

arecord -f cd -d 10 --device="hw:0,0" /tmp/test-mic.wav
aplay /tmp/test-mic.wav
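
If arecord and aplay are not available, a roughly equivalent check can be done from Python with the third-party sounddevice and soundfile packages. These packages are an assumption for this sketch; they are not necessarily part of Hera's requirements.

# Record ten seconds from the default microphone and play it back.
# Requires: pip install sounddevice soundfile  (assumed; not part of Hera)
import sounddevice as sd
import soundfile as sf

SAMPLE_RATE = 44100  # CD quality, matching `arecord -f cd`
DURATION = 10        # seconds, matching the arecord example above

recording = sd.rec(int(DURATION * SAMPLE_RATE), samplerate=SAMPLE_RATE, channels=2)
sd.wait()  # block until the recording finishes
sf.write("/tmp/test-mic.wav", recording, SAMPLE_RATE)

data, rate = sf.read("/tmp/test-mic.wav")
sd.play(data, rate)
sd.wait()  # block until playback finishes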
