📊 InfoBuddy

Table of Contents📚

📝Overview
✨ Features
🚀 Prototype
💻 Technological Stack
🏗️ System Architecture
🔧 Installation
📈 Model Development
⚙️ Challenges and Solutions
🌟 Future Enhancements
Team "Prime Predictors"⚡
🤝 Contributing
📄 License

📝 Overview

InfoBuddy is a machine learning project that identifies key physical attributes of objects (height, weight, voltage, wattage, width, volume, and depth) from uploaded images. It was created by Team Prime Predictors for the Unstop Amazon ML Challenge 2024. The interactive frontend supports four languages (English, Spanish, French, Hindi) and offers both voice input and voice output.

✨ Features

🔍 Attribute Detection: Detects height, weight, voltage, wattage, width, volume, and depth from uploaded images.
🌐 Multilingual Support: Interface supports English, Spanish, French, and Hindi.
🎙️ Voice Interaction: Includes voice input and output features using native JavaScript libraries.
🔊 Text-to-Speech (TTS): Read-along feature that reads out the responses for enhanced accessibility.
💬 Chat History: Keeps track of user interactions for easy reference.
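
As an illustration of the attribute detection feature, the sketch below shows one way a textual model response could be turned into a numeric value and a unit. The response format (a number followed by a unit word) and the helper name `parse_measurement` are assumptions for this example and are not taken from the project code.

```python
import re
from typing import Optional, Tuple

# Assumed response shape: a number followed by a unit word, e.g. "2.5 kg".
_VALUE_UNIT = re.compile(r"(\d+(?:\.\d+)?)\s*([A-Za-z]+)")


def parse_measurement(text: str) -> Optional[Tuple[float, str]]:
    """Return (value, unit) for the first number-plus-unit pair, or None."""
    match = _VALUE_UNIT.search(text)
    if match is None:
        return None
    value, unit = match.groups()
    return float(value), unit


if __name__ == "__main__":
    print(parse_measurement("Weight: 2.5 kg"))
    print(parse_measurement("no measurement here"))
```

The pattern takes the first number it finds, so a response that mentions another number before the measurement would need a stricter pattern.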

🚀 Prototype

Demo recordings:

Recording.2024-10-04.102049.1.mp4
Recording.2024-10-04.103917.1.mp4

💻 Technological Stack

Backend: Python, Jupyter Notebook, GEMINI API.
Frontend: React, JavaScript, HTML, CSS.
Automation and Testing: Selenium.
Speech and Voice: Native JavaScript libraries for speech-to-text and text-to-speech.

🏗️ System Architecture

Backend: Python-based machine learning model built in Jupyter Notebook.
- Uses the GEMINI API for data processing.
- Integrated with Selenium for testing and automation.

Frontend: React-based application with interactive elements.
- Offers chat-style interaction and dynamic response rendering.
- Allows language switching and voice integration.
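
To make the backend flow more concrete, here is a minimal sketch of how an image and an attribute name might be sent to the Gemini API using the `google-generativeai` Python package. The prompt wording, the model name `gemini-1.5-flash`, the environment variable `GEMINI_API_KEY`, and both function names are assumptions for illustration; the project's own `GEMINI_API.py` may work differently.

```python
import os

# Attributes listed in this README.
ATTRIBUTES = ("height", "weight", "voltage", "wattage", "width", "volume", "depth")


def build_prompt(attribute: str) -> str:
    """Build a hypothetical prompt asking for one attribute of the pictured object."""
    if attribute not in ATTRIBUTES:
        raise ValueError(f"unsupported attribute: {attribute}")
    return (
        f"Report the {attribute} of the object in this image as a number "
        "followed by a unit. Reply with an empty string if it is not visible."
    )


def query_attribute(image_path: str, attribute: str) -> str:
    """Send the image and prompt to Gemini and return the raw text reply."""
    # Imported lazily so build_prompt works without these packages installed.
    import google.generativeai as genai
    from PIL import Image

    genai.configure(api_key=os.environ["GEMINI_API_KEY"])  # assumed variable name
    model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model name
    with Image.open(image_path) as image:
        response = model.generate_content([build_prompt(attribute), image])
    return response.text.strip()
```

Calling `query_attribute("images/sample.jpg", "weight")` would require network access, a valid API key, and an image at that (hypothetical) path.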

🔧 Installation

To set up the project locally, follow these steps:

1. Prerequisites

Node.js
Python 3.x
Jupyter Notebook
GEMINI API Access

2. Clone the Repository

git clone https://github.com/apu52/INFOBUDDY_ML_CHALLANGE_2k24.git
cd INFOBUDDY_ML_CHALLANGE_2k24
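
Before going further, it can help to confirm that the prerequisites are in place. The following sketch checks for them from Python; the environment variable name `GEMINI_API_KEY` is an assumption, since this README does not say how the API key is supplied.

```python
import os
import shutil
import sys


def check_prerequisites() -> dict:
    """Report which of the listed prerequisites appear to be available."""
    return {
        "python3": sys.version_info >= (3,),
        "node": shutil.which("node") is not None,
        "jupyter": shutil.which("jupyter") is not None,
        # Assumed variable name; adjust to however the key is configured.
        "gemini_api_key": bool(os.environ.get("GEMINI_API_KEY")),
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```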

📈 Model Development

The machine learning model is built using Python and Jupyter Notebook. Development followed these steps:

📊 Data Collection: Compiled a dataset of various objects to train the model for accurate attribute detection.
🤖 Model Training: Leveraged deep learning techniques to train the model for predicting height, weight, and other parameters.
🔍 Testing: Utilized Selenium for testing the model’s output against expected values.
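
The testing step compares model output against expected values. The sketch below shows that comparison in plain Python, without Selenium, using a simple exact-match score. The function name, the dictionary layout (image identifier mapped to an answer string), and the choice of metric are assumptions for illustration and not the project's actual evaluation code.

```python
from typing import Dict


def exact_match_accuracy(predicted: Dict[str, str], expected: Dict[str, str]) -> float:
    """Fraction of expected answers matched exactly, ignoring case and outer spaces."""
    if not expected:
        return 0.0
    hits = sum(
        1
        for key, truth in expected.items()
        if predicted.get(key, "").strip().lower() == truth.strip().lower()
    )
    return hits / len(expected)


if __name__ == "__main__":
    expected = {"img1": "2.5 kg", "img2": "10 cm"}
    predicted = {"img1": "2.5 KG", "img2": "12 cm"}
    print(exact_match_accuracy(predicted, expected))
```

A missing prediction counts as a miss, so the score is always measured against the full set of expected values.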

⚙️ Challenges and Solutions

🔄 Data Processing: Ensured data consistency and variety during the model training phase to enhance accuracy.
🎙️ Voice Integration: Overcame voice recognition challenges by leveraging native JavaScript libraries and optimizing the voice flow.

🌟 Future Enhancements

📊 Expand the dataset for more accurate attribute detection.
🌐 Incorporate additional languages and improve voice command capabilities.
🔄 Optimize the real-time detection feature to handle more complex objects.

Team "Prime Predictors"⚡

Shreya Gupta

Surya R

Debasri Pal

Arpan Chowdhury

🤝 Contributing

Contributions are welcome! Please follow these steps to contribute:

Fork the repository.
Create a new branch for your feature or bug fix.
Commit your changes and open a pull request.

Show some ❤️ by giving a star ⭐ to this repository.

📄 License

This project is licensed under the Apache-2.0 License. See the LICENSE file for details.

