- Overview
- Features
- Prototype
- Technological Stack
- System Architecture
- Installation
- Model Development
- Challenges and Solutions
- Future Enhancements
- Project Admin
- Contributing
- License
InfoBuddy is a machine learning project that identifies key physical attributes of objects (height, weight, voltage, wattage, width, volume, and depth) from uploaded images. It was created by Team Prime Predictors for the Unstop Amazon ML Challenge 2024. The solution features an interactive frontend that supports multilingual interaction (English, Spanish, French, Hindi) and offers both voice input and voice output for a seamless user experience.
- Attribute Detection: Detects height, weight, voltage, wattage, width, volume, and depth from uploaded images.
- Multilingual Support: Interface supports English, Spanish, French, and Hindi.
- Voice Interaction: Includes voice input and output features using native JavaScript libraries.
- Text-to-Speech (TTS): Read-along feature that reads out the responses for enhanced accessibility.
- Chat History: Keeps track of user interactions for easy reference.
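To make the attribute detection feature concrete, here is a minimal sketch of how a textual prediction could be turned into a structured value. The response format (`"<number> <unit>"`), the `Measurement` type, and the `parse_measurement` helper are assumptions made for illustration; they are not taken from the project's code.

```python
import re
from typing import NamedTuple, Optional

# Attribute names taken from the feature list above.
ATTRIBUTES = {"height", "weight", "voltage", "wattage", "width", "volume", "depth"}


class Measurement(NamedTuple):
    attribute: str
    value: float
    unit: str


# Assumed (hypothetical) response format: a number followed by a unit,
# for example "12.5 centimetre" or "230 volt".
_PATTERN = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*([A-Za-z][A-Za-z ]*?)\s*$")


def parse_measurement(attribute: str, text: str) -> Optional[Measurement]:
    """Parse a prediction string; return None if it does not match the format."""
    if attribute not in ATTRIBUTES:
        raise ValueError(f"unsupported attribute: {attribute}")
    match = _PATTERN.match(text)
    if match is None:
        return None
    return Measurement(attribute, float(match.group(1)), match.group(2).lower())
```

Returning `None` for unparseable text, rather than raising, lets a caller fall back to showing the raw response in the chat.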
Prototype demo recordings: Recording.2024-10-04.102049.1.mp4 and Recording.2024-10-04.103917.1.mp4
- Backend: Python, Jupyter Notebook, GEMINI API.
- Frontend: React, JavaScript, HTML, CSS.
- Automation and Testing: Selenium.
- Speech and Voice: Native JavaScript libraries for speech-to-text and text-to-speech.
- Backend: Python-based machine learning model built in Jupyter Notebook.
  - Utilizes the GEMINI API for data processing.
  - Integrated with Selenium for testing and automation.
- Frontend: React-based application with interactive elements.
  - Offers chat-style interaction and dynamic response rendering.
  - Allows language switching and voice integration.
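As a rough illustration of how the frontend's language choice might reach the backend, the sketch below builds a text prompt for one attribute in one of the supported languages. It deliberately makes no API call: how the project actually talks to the GEMINI API is not described here, and the `build_prompt` function, the language codes, and the prompt wording are all assumptions.

```python
# Languages listed in this README, keyed by hypothetical short codes.
SUPPORTED_LANGUAGES = {"en": "English", "es": "Spanish", "fr": "French", "hi": "Hindi"}

# Attributes listed in this README.
ATTRIBUTES = ("height", "weight", "voltage", "wattage", "width", "volume", "depth")


def build_prompt(attribute: str, language_code: str = "en") -> str:
    """Return an illustrative prompt asking for one attribute in one language."""
    if attribute not in ATTRIBUTES:
        raise ValueError(f"unsupported attribute: {attribute}")
    if language_code not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language_code}")
    language = SUPPORTED_LANGUAGES[language_code]
    # Hypothetical wording; the real prompt used by the project may differ.
    return (
        f"Look at the attached product image and report its {attribute} "
        f"as a number followed by a unit. Answer in {language}."
    )


if __name__ == "__main__":
    print(build_prompt("voltage", "hi"))
```

Validating the attribute and language before building the prompt keeps unsupported requests from reaching the model at all.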
To set up the project locally, follow these steps:
- Node.js
- Python 3.x
- Jupyter Notebook
- GEMINI API Access
git clone https://github.com/apu52/INFOBUDDY_ML_CHALLANGE_2k24.git
cd INFOBUDDY_ML_CHALLANGE_2k24
The machine learning model is built using Python and Jupyter Notebook. The following steps were followed for model development:
- Data Collection: Compiled a dataset of various objects to train the model for accurate attribute detection.
- Model Training: Leveraged deep learning techniques to train the model for predicting height, weight, and other parameters.
- Testing: Utilized Selenium for testing the model's output against expected values.
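The testing step compares model output against expected values. The sketch below shows one simple way to score such a comparison, using exact-match accuracy after trimming whitespace and ignoring case. It does not reproduce the Selenium setup, and the metric, the `exact_match_accuracy` function, and the image keys are assumptions chosen for illustration.

```python
from typing import Dict


def _normalize(text: str) -> str:
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def exact_match_accuracy(predicted: Dict[str, str], expected: Dict[str, str]) -> float:
    """Fraction of expected entries whose prediction matches after normalization.

    Entries missing from `predicted` count as wrong.
    """
    if not expected:
        raise ValueError("expected must not be empty")
    hits = sum(
        1
        for key, want in expected.items()
        if _normalize(predicted.get(key, "")) == _normalize(want)
    )
    return hits / len(expected)


if __name__ == "__main__":
    # Hypothetical sample data keyed by image name.
    predicted = {"img1": "12.5 Centimetre", "img2": "5 volt"}
    expected = {"img1": "12.5 centimetre", "img2": "9 volt", "img3": "1 kilogram"}
    print(exact_match_accuracy(predicted, expected))
```

In the sample data only `img1` matches, so one of three expected entries is counted as correct.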
- Data Processing: Ensured data consistency and variety during the model training phase to enhance accuracy.
- Voice Integration: Overcame voice recognition challenges by leveraging native JavaScript libraries and optimizing the voice flow.
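One common source of inconsistency in measurement data is that the same unit appears under several spellings. As a minimal sketch of the data processing point above, the snippet below maps unit spellings to a single canonical form. The alias table and the `normalize_unit` helper are hypothetical; the project's actual cleaning steps are not documented here.

```python
# Hypothetical alias table: each spelling maps to one canonical unit name.
UNIT_ALIASES = {
    "cm": "centimetre",
    "centimeter": "centimetre",
    "centimeters": "centimetre",
    "centimetres": "centimetre",
    "kg": "kilogram",
    "kilograms": "kilogram",
    "v": "volt",
    "volts": "volt",
    "w": "watt",
    "watts": "watt",
}


def normalize_unit(unit: str) -> str:
    """Return the canonical spelling of a unit, or the cleaned input if unknown."""
    key = unit.strip().lower().rstrip(".")
    return UNIT_ALIASES.get(key, key)


if __name__ == "__main__":
    for raw in (" CM ", "Volts", "kg.", "litre"):
        print(repr(raw), "->", normalize_unit(raw))
```

Unknown units are passed through in cleaned form instead of being rejected, so new spellings can be added to the table as they turn up in the data.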
- Expand the dataset for more accurate attribute detection.
- Incorporate additional languages and improve voice command capabilities.
- Optimize the real-time detection feature to handle more complex objects.
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Commit your changes and open a pull request.
Show your support by giving this repository a star.
This project is licensed under the Apache-2.0 License. See the [LICENSE] file for details.