Gemini LLM Image Assistant | Gemini Flash Model

This project is a Streamlit-based web app built on Google's Gemini generative AI model. Users upload an image and can optionally enter text; the app sends both to the Gemini API and displays the model's description of, or insights about, the image in the app interface.

Services:

LLM
Streamlit

Client:

Personal

Project link:

www.flatheme.net

Duration:

N/A

The app uses Google's Gemini model to describe images and generate content from user prompts. Built with Streamlit, it lets users upload an image, optionally add a text prompt, and receive a detailed description or generated content in response.

Features

Environment Variables: Securely loads necessary credentials using dotenv.
Streamlit Interface: Provides a user-friendly, web-based interface for interaction.
Image Processing: Supports uploading and processing JPEG (.jpg, .jpeg) and PNG images.
Gemini 1.5 Flash Model: Sends the image and prompt to the Gemini 1.5 Flash model, which accepts both image and text input.
Dynamic Content Generation: Generates textual content based on image and optional text input.
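
The content-generation feature above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the app uses the google-generativeai Python SDK, and the names build_parts, describe_image, and MODEL_NAME are hypothetical. The helper shows how a request can be built with or without a prompt.

```python
MODEL_NAME = "gemini-1.5-flash"  # model named under Features


def build_parts(prompt, image):
    """Return the request parts: the prompt (if any) followed by the image."""
    parts = []
    if prompt and prompt.strip():
        parts.append(prompt.strip())
    parts.append(image)
    return parts


def describe_image(image, prompt, api_key):
    """Send the image and optional prompt to Gemini and return the reply text.

    Assumes the google-generativeai package is installed; it is imported
    here so the helper above stays usable without it.
    """
    import google.generativeai as genai

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(MODEL_NAME)
    response = model.generate_content(build_parts(prompt, image))
    return response.text
```

Here image would typically be a PIL image opened from the uploaded file. An empty or whitespace-only prompt is dropped, so the model receives the image alone.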

Prerequisites

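Python 3.8+
Streamlit
python-dotenv (imported as dotenv)
Pillow (imported as PIL, for image handling)
os (part of the Python standard library; no installation needed)
A Google Gemini API key, supplied through an environment variable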
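
The dotenv and os prerequisites are typically used together to read the API key. The sketch below shows one way to do that; the variable name GOOGLE_API_KEY and the function load_api_key are assumptions for illustration, not names confirmed by the project.

```python
import os


def load_api_key(var_name="GOOGLE_API_KEY"):
    """Read the API key from the environment, loading a .env file if possible."""
    try:
        # python-dotenv copies entries from a local .env file into os.environ
        # without overriding variables that are already set.
        from dotenv import load_dotenv
        load_dotenv()
    except ImportError:
        pass  # fall back to whatever is already in the environment

    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} in the environment or in a .env file.")
    return key
```

Keeping the key in a .env file that is excluded from version control avoids hard-coding credentials in the app source.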

Usage

Upload an image using the file uploader.
Optionally, enter a text prompt related to the image.
Click 'Tell me about the image' to generate a response. If no prompt is given, the app generates content from the image alone.
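
The three steps above map onto a small Streamlit page, sketched below. The layout, widget labels other than the button text, and the names run_app and generate are hypothetical; generate stands in for whatever function calls the Gemini model and returns its text.

```python
ALLOWED_TYPES = ["jpg", "jpeg", "png"]  # formats listed under Features


def run_app(generate):
    """Render the page; generate(image, prompt) must return the response text."""
    import streamlit as st
    from PIL import Image

    st.header("Gemini LLM Image Assistant")
    prompt = st.text_input("Input prompt (optional):")
    uploaded = st.file_uploader("Choose an image...", type=ALLOWED_TYPES)

    image = None
    if uploaded is not None:
        # The uploaded file is file-like, so Pillow can open it directly.
        image = Image.open(uploaded)
        st.image(image, caption="Uploaded image")

    if st.button("Tell me about the image"):
        if image is None:
            st.warning("Please upload an image first.")
        else:
            st.subheader("Response")
            st.write(generate(image, prompt))
```

To try it, save the function in a script such as app.py, call run_app with a suitable generate function at the bottom of the file, and start it with `streamlit run app.py`.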

Utsho