Gemini AI is an advanced AI-powered web application that provides a suite of services, including a chatbot, image captioning, text embedding, and a Q&A feature. The application is built using Streamlit, leveraging the Gemini Pro model from Google’s generative AI capabilities.
- Engage with the Gemini Pro chatbot, capable of understanding and generating human-like text responses.
- The chat history is displayed to maintain the conversation context.
- Upload an image and receive a descriptive caption generated by the Gemini AI model.
- Supports
.jpg
,.jpeg
, and.png
file formats.
- Input text to generate embeddings useful for document retrieval and other NLP tasks.
- Embeddings are displayed as a list of floating-point numbers.
- Pose any question, and the Gemini AI model will generate an informative answer.
main.py
: The main file that sets up the Streamlit app and handles navigation between the different services offered.streamlit_util.py
: Contains utility functions used across the Streamlit app for layout and styling.gemini_util.py
: Contains utility functions for interacting with the Gemini AI models, including loading models, generating captions, embedding text, and handling user prompts.
git clone https://github.com/username/gemini-ai.git
cd gemini-ai
-
Ensure you have Python 3.8 or above. Install dependencies using
pip
.pip install -r requirements.txt
-
You'll need an API key from Google’s generative AI services. Set it up in your environment.
export API_KEY='your_api_key_here'
-
Start the Streamlit app.
streamlit run main.py
- ChatBot: Select the "ChatBot" option from the sidebar to interact with the AI chatbot.
- Image Captioning: Upload an image in the "Image Captioning" section to receive a descriptive caption.
- Text Embedding: Enter a piece of text in the "Embed text" section to get embeddings.
- Ask Me Anything: Pose a question in the "Ask Me Anything" section to get an answer.
- Python 3.8+
- Streamlit
- Pillow
- Google Generative AI SDK
- streamlit-option-menu
Check out requirements.txt for more details.
This project is licensed under the MIT License. Check the LICENSE file for more details.