Gemini LLM Image Assistant | Gemini Flash Model
This project is a Streamlit web app built on Google's Gemini generative AI model. Users upload an image and can optionally enter a text prompt. The app sends both to the model through the Google Gemini API and displays the generated description or insights in the app interface.
Features
- Environment Variables: Loads the necessary credentials from the environment using dotenv, keeping them out of the source code.
- Streamlit Interface: Provides a user-friendly, web-based interface for interaction.
- Image Processing: Supports uploading and processing JPEG (.jpg, .jpeg) and PNG images.
- Gemini 1.5 Flash Model: Uses the Gemini 1.5 Flash model, which accepts both image and text input.
- Dynamic Content Generation: Generates text from the uploaded image and an optional text prompt.
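The credential loading and model setup listed above might look like the following minimal sketch. It assumes the key is stored under the name `GOOGLE_API_KEY` and that the `google-generativeai` package is the client library. Both are assumptions, as are the helper names `load_api_key` and `make_model`; the project may use different ones.

```python
import os


def load_api_key(var_name: str = "GOOGLE_API_KEY") -> str:
    """Return the API key from the environment (variable name is an assumption)."""
    try:
        from dotenv import load_dotenv  # provided by the python-dotenv package
        load_dotenv()  # copies KEY=value pairs from a local .env file into os.environ
    except ImportError:
        pass  # fall back to variables already exported in the shell
    key = os.getenv(var_name)
    if not key:
        raise RuntimeError(f"{var_name} is not set; add it to .env or export it")
    return key


def make_model(api_key: str, model_name: str = "gemini-1.5-flash"):
    """Configure the client and return a model handle.

    Assumes the google-generativeai package; imported lazily so that
    load_api_key can be used on its own.
    """
    import google.generativeai as genai

    genai.configure(api_key=api_key)
    return genai.GenerativeModel(model_name)
```

Keeping the key in `.env` (and out of version control) means the same code runs unchanged on any machine that supplies its own key.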
Prerequisites
- Python 3.8+
- Streamlit
- python-dotenv (imported as dotenv)
- Pillow (imported as PIL) for image handling
- os (part of the Python standard library; no installation needed)
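A quick way to confirm the prerequisites are installed is to check whether each module can be found before launching the app. The sketch below is illustrative only: the helper `missing_packages` is hypothetical, and the `google.generativeai` entry assumes the app talks to Gemini through the `google-generativeai` package, which the list above does not name.

```python
from importlib.util import find_spec

# Maps import name -> pip package name. The two differ for some packages:
# python-dotenv is imported as dotenv, Pillow as PIL.
REQUIRED = {
    "streamlit": "streamlit",
    "dotenv": "python-dotenv",
    "PIL": "Pillow",
    "google.generativeai": "google-generativeai",  # assumed client library
}


def missing_packages(required=REQUIRED):
    """Return the pip names of required modules that cannot be found."""
    missing = []
    for module, pip_name in required.items():
        try:
            found = find_spec(module) is not None
        except ModuleNotFoundError:
            # Raised for dotted names when the parent package is absent.
            found = False
        if not found:
            missing.append(pip_name)
    return missing


if __name__ == "__main__":
    absent = missing_packages()
    print("Missing:", ", ".join(absent) if absent else "none")
```

Anything reported as missing can be installed with pip using the name shown.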
Usage
- Upload an image using the file uploader.
- Optionally, enter a text prompt related to the image.
- Click 'Tell me about the image' to generate the response. This works with or without a prompt.
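The steps above can be sketched as a small Streamlit script. This is not the project's actual source: the function names `build_parts` and `main`, the widget labels other than the button, and the `GOOGLE_API_KEY` variable name are assumptions, and the model call assumes the `google-generativeai` package.

```python
import os


def build_parts(prompt, image):
    """Build the model input: prompt first when given, otherwise the image alone."""
    prompt = (prompt or "").strip()
    return [prompt, image] if prompt else [image]


def main():
    # Imported lazily so build_parts can be reused without these packages.
    import streamlit as st
    from PIL import Image
    from dotenv import load_dotenv
    import google.generativeai as genai  # assumed client library

    load_dotenv()  # read the key from a local .env file, if present
    genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))  # assumed variable name
    model = genai.GenerativeModel("gemini-1.5-flash")

    st.header("Gemini LLM Image Assistant")
    prompt = st.text_input("Prompt (optional)")
    uploaded = st.file_uploader("Choose an image", type=["jpg", "jpeg", "png"])

    image = None
    if uploaded is not None:
        image = Image.open(uploaded)
        st.image(image, caption="Uploaded image")

    if st.button("Tell me about the image"):
        if image is None:
            st.warning("Please upload an image first.")
        else:
            # Works with or without a prompt, matching the usage steps.
            response = model.generate_content(build_parts(prompt, image))
            st.write(response.text)
```

To try it, save the sketch to a file (for example a hypothetical `app.py`), add a call to `main()` as the last line, and start it with `streamlit run app.py`.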