News! The project has been split into three modules:
- backend - the core module of the project
- plugins - the default set of built-in plugins
- webapp - an administration/system-panel
"Voice Consultant" (the Assistant) is a project that aims to provide a system operated entirely by the user's voice.
The system recognizes and analyses a user's speech (hot-word listening) and reacts with different kinds of actions, depending on what the user said.
For example, a user can ask for the current date and time or the current weather simply by saying the keywords "current date" or "current weather".
Keyword | How the system responds |
---|---|
Hello | Simple speech recognition test, "Hello User" |
Date | The current date and time |
Status | The available network interfaces whose addresses start with "192." |
Weather | The weather from OpenWeatherMap |
... more to come |
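The keyword table above can be pictured as a small dispatch map. The following is only a minimal sketch, not the project's actual implementation: the class `KeywordDispatcher`, its method names, the fallback sentence and the wording of the "date" response are all hypothetical. It covers "Hello", "Date" and "Status" using only the Java standard library and leaves out the weather lookup.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: maps a recognized keyword to a response text.
class KeywordDispatcher {

    // Keyword -> response supplier, checked in insertion order.
    private final Map<String, Supplier<String>> handlers = new LinkedHashMap<>();

    KeywordDispatcher() {
        handlers.put("hello", () -> "Hello User");
        handlers.put("date", () -> "It is " + LocalDateTime.now());
        handlers.put("status", () -> String.join(", ", localAddresses("192.")));
    }

    // Returns the response of the first keyword contained in the utterance,
    // or a fallback sentence (an assumption) if no keyword matches.
    String respond(String utterance) {
        String text = utterance.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, Supplier<String>> entry : handlers.entrySet()) {
            if (text.contains(entry.getKey())) {
                return entry.getValue().get();
            }
        }
        return "Sorry, I did not understand that.";
    }

    // Lists "interfaceName address" for every address starting with the prefix.
    static List<String> localAddresses(String prefix) {
        List<String> result = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
            if (nics == null) {
                return result; // no interfaces available
            }
            for (NetworkInterface nic : Collections.list(nics)) {
                for (InetAddress address : Collections.list(nic.getInetAddresses())) {
                    String host = address.getHostAddress();
                    if (host.startsWith(prefix)) {
                        result.add(nic.getName() + " " + host);
                    }
                }
            }
        } catch (SocketException e) {
            // Interfaces could not be queried; report none.
        }
        return result;
    }

    public static void main(String[] args) {
        KeywordDispatcher dispatcher = new KeywordDispatcher();
        System.out.println(dispatcher.respond("hello"));
        System.out.println(dispatcher.respond("current date"));
        System.out.println(dispatcher.respond("status"));
    }
}
```

Matching by substring means that the spoken phrase "current date" triggers the "date" entry, as described above. In a real setup the response text would be handed to the speech synthesis instead of being printed.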
This project relies heavily on Gradle, Spring (Boot), JPA, and the Java ecosystem in general.
- Clone this repository: `git clone https://github.com/benjaminfoo/Voice-Consultant.git`
- Run `./gradlew backend:bootRun` for the standalone assistant.
- Run `./gradlew webapp:bootRun` for the standalone assistant and the administration panel, then open http://localhost:8080/ for the administration panel (in development).
- Run `./gradlew backend:jar` for compiling the default plugins.
- Run `./gradlew tasks` to list every task in this project.
- Run `./gradlew clean` to delete all files produced by compilation etc.
You can use the web application to administer your voice system. It lists all installed plugins and some details about the system properties.
Used web-frameworks:
- JQuery 3.3.1
- Popper 1.14.6
- Bootstrap 4.0.0
This project contains the plug-in management API and the ServiceLoader implementations that provide the plug-in mechanism, along with a set of implementations of the previously mentioned commands. It is based on the Java Service-Loader API.
To load your own plug-ins into the backend, provide a jar with implementations of org.owls.voice.plugins.api.PlugInInterface.
Remember to put a service descriptor listing your implementations in the resources: <project>\src\main\resources\META-INF\services\org.owls.voice.plugins.api.PlugInInterface
- I recommend taking a look at the plugins project if you're interested in development.
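To illustrate the Service-Loader mechanism described above, here is a minimal sketch. The methods of the real `org.owls.voice.plugins.api.PlugInInterface` are not shown in this README, so the interface `VoicePlugin`, its methods `keyword()` and `respond()`, and the classes `HelloPlugin` and `PluginLoaderSketch` are hypothetical stand-ins. Only `java.util.ServiceLoader` is the real JDK API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical stand-in for org.owls.voice.plugins.api.PlugInInterface;
// the real interface may declare different methods.
interface VoicePlugin {
    String keyword();
    String respond();
}

// Example implementation a plug-in jar might ship. In a real jar this class
// would be public, live in its own file and have a public no-arg constructor,
// which ServiceLoader requires.
class HelloPlugin implements VoicePlugin {
    public String keyword() { return "hello"; }
    public String respond() { return "Hello User"; }
}

class PluginLoaderSketch {

    // Collects every implementation announced via a service descriptor
    // (META-INF/services/<fully qualified interface name>) on the classpath.
    static List<VoicePlugin> loadPlugins() {
        List<VoicePlugin> plugins = new ArrayList<>();
        for (VoicePlugin plugin : ServiceLoader.load(VoicePlugin.class)) {
            plugins.add(plugin);
        }
        return plugins;
    }

    public static void main(String[] args) {
        List<VoicePlugin> plugins = loadPlugins();
        System.out.println("Found " + plugins.size() + " plug-in(s)");
        for (VoicePlugin plugin : plugins) {
            System.out.println(plugin.keyword() + " -> " + plugin.respond());
        }
    }
}
```

The service descriptor is a plain text file named after the interface that lists one fully qualified implementation class per line. Without such a file on the classpath, `ServiceLoader` simply finds no implementations, so the sketch reports zero plug-ins when run on its own.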
The goal of the core is to provide an expandable system / platform for using a computer without a keyboard, hands, or even a monitor / display. Speech recognition is provided by CMU Sphinx and text-to-speech synthesis by Mary-TTS. The overall architecture is based mostly on Spring-related frameworks.
As the FAQ of CMU Sphinx states: "Speech recognition accuracy is not always great. To test speech recognition you need to run recognition on prerecorded reference database to see what happens and optimize parameters." More information about voice recognition is available in the CMU Sphinx FAQ.
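One common way to put a number on such a test run is the word error rate (WER): the word-level edit distance between a reference transcript and the recognized text, divided by the number of reference words. The sketch below assumes WER as the metric. Neither this README nor the quoted FAQ passage prescribes it, and the class `WordErrorRate` is hypothetical.

```java
import java.util.Locale;

// Hypothetical helper: word error rate between a reference transcript
// and the text produced by the recognizer.
class WordErrorRate {

    // Word-level Levenshtein distance divided by the reference length.
    static double compute(String reference, String hypothesis) {
        String[] ref = tokenize(reference);
        String[] hyp = tokenize(hypothesis);
        if (ref.length == 0) {
            throw new IllegalArgumentException("reference must not be empty");
        }
        int[][] d = new int[ref.length + 1][hyp.length + 1];
        for (int i = 0; i <= ref.length; i++) d[i][0] = i; // deletions only
        for (int j = 0; j <= hyp.length; j++) d[0][j] = j; // insertions only
        for (int i = 1; i <= ref.length; i++) {
            for (int j = 1; j <= hyp.length; j++) {
                int substitution = ref[i - 1].equals(hyp[j - 1]) ? 0 : 1;
                d[i][j] = Math.min(
                        Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                        d[i - 1][j - 1] + substitution);
            }
        }
        return (double) d[ref.length][hyp.length] / ref.length;
    }

    // Lower-cases the text and splits it on whitespace.
    private static String[] tokenize(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        System.out.println(compute("current weather", "current whether"));
    }
}
```

Averaging this value over a set of prerecorded utterances gives a baseline against which changed recognizer parameters can be compared.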
- Install Oracle JDK 8: `sudo apt-get update && sudo apt-get install oracle-java8-jdk`, then check the installation with `java -version`
- Clone this project to your Raspberry Pi
- Run `./gradlew webapp:bootRun` in the checked-out directory
- Visit the admin panel at http://192.168.178.24:8080 (use the IP address of your own Raspberry Pi)
- CMU Sphinx - https://cmusphinx.github.io/
- Mary TTS - https://github.com/marytts/marytts
- Spring Boot - https://spring.io/
- H2 Database Engine - http://www.h2database.com
- Gradle - https://gradle.org/
- Gson - https://github.com/google/gson
- Jackson - https://github.com/FasterXML/jackson
- LMMS - https://lmms.io/
- benjaminfoo (https://github.com/benjaminfoo/)