Car Price Predictor

A machine learning system that predicts car prices. The project implements a complete pipeline, using MLflow for model management and Streamlit for an interactive web interface with real-time visualization.

Architecture

Frontend: Streamlit application for data visualization and interaction
Backend: FastAPI REST API for data management
ML Pipeline: MLflow for model management and serving
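Storage: PostgreSQL for data storage, MinIO for model artifacts
Messaging: Debezium for change data capture from PostgreSQL, Kafka for event streaming between the database, the ML service, and the API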

graph TB
    UI[Streamlit UI:8501] --> API[FastAPI:8000]
    API --> DB[(PostgreSQL:5432)]
    DB --> DEB[Debezium:8083]
    DEB --> KAFKA[Kafka:9092]
    KAFKA --> ML[ML Service]
    ML --> MLFLOW[MLflow:5000]
    MLFLOW --> MINIO[(MinIO:9000)]
    ML --> KAFKA
    KAFKA --> API
    API --> DB

    ADMIN[Adminer:8081] --> DB
    KAFKAUI[Kafka UI:8080] --> KAFKA
    ZK[Zookeeper:2181] --> KAFKA
    
    subgraph "User Interface"
        UI
        ADMIN
        KAFKAUI
    end

    subgraph "Storage"
        DB
        MINIO
    end

    subgraph "Processing"
        KAFKA
        DEB
        ML
        MLFLOW
        ZK
    end

    style UI fill:#2563eb,stroke:#1d4ed8,color:#fff
    style API fill:#2563eb,stroke:#1d4ed8,color:#fff
    style DB fill:#059669,stroke:#047857,color:#fff
    style MINIO fill:#059669,stroke:#047857,color:#fff
    style KAFKA fill:#4b5563,stroke:#374151,color:#fff
    style ML fill:#7c3aed,stroke:#6d28d9,color:#fff
    style MLFLOW fill:#7c3aed,stroke:#6d28d9,color:#fff

Prerequisites

Docker and Docker Compose
Python 3.9+ (for local development)

Quick Start

Clone the repository:

git clone https://github.com/Stefen-Taime/car-price-predictor
cd car-price-predictor

Start the services:

docker-compose up --build

Train the initial model (from the repository root):

cd ml_service
python train.py

Start the FastAPI backend (in a separate terminal, from the repository root):

cd backend
uvicorn main:app --reload

Start the Streamlit frontend (in a separate terminal, from the repository root):

cd web-ui
streamlit run app.py

Access the applications:

Streamlit UI: http://localhost:8501
FastAPI Docs: http://localhost:8000/docs
MLflow UI: http://localhost:5000
MinIO Console: http://localhost:9001
Kafka UI: http://localhost:8080
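
Once the stack is up, a small probe script is a quick way to confirm which of the URLs above respond. The following is a minimal sketch and not part of the repository: the `SERVICES` mapping only restates the URLs listed above, and `http_probe` and `check_services` are hypothetical helper names.

```python
from http.client import HTTPException
from typing import Callable, Dict
from urllib.error import URLError
from urllib.request import urlopen

# URLs copied from the list above; adjust them if your setup maps other ports.
SERVICES: Dict[str, str] = {
    "Streamlit UI": "http://localhost:8501",
    "FastAPI Docs": "http://localhost:8000/docs",
    "MLflow UI": "http://localhost:5000",
    "MinIO Console": "http://localhost:9001",
    "Kafka UI": "http://localhost:8080",
}


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET on the URL succeeds, False on any connection or HTTP error."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (URLError, OSError, HTTPException):
        # urlopen raises HTTPError (a URLError subclass) for 4xx/5xx responses.
        return False


def check_services(
    services: Dict[str, str],
    probe: Callable[[str], bool] = http_probe,
) -> Dict[str, bool]:
    """Map each service name to whether its URL responded."""
    return {name: probe(url) for name, url in services.items()}
```

Calling `check_services(SERVICES)` returns a dictionary from service name to `True` or `False`. The `probe` parameter is injectable so the function can be exercised without a running stack.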

Project Structure

.
├── backend/                # FastAPI backend service
│   ├── main.py             # Main API application
│   └── requirements.txt    # Python dependencies
├── data/                   # Training data
│   └── ford.csv            # Sample car data
├── ml_service/             # ML training and prediction service
│   ├── train.py            # Model training script
│   └── main.py             # Prediction service
├── mlflow/                 # MLflow service configuration
├── postgres/               # PostgreSQL initialization scripts
├── web-ui/                 # Streamlit frontend application
│   ├── app.py              # Main Streamlit application
│   └── requirements.txt    # Python dependencies
├── docker-compose.yml      # Docker services configuration
└── README.md

Features

Real-time car price predictions
Interactive data visualization with Streamlit
RESTful API with FastAPI
ML model versioning and tracking with MLflow
Interactive charts with Plotly
Event-driven, containerized architecture (Docker Compose, Kafka, Debezium)
API documentation with Swagger UI

Development

Backend Development

cd backend
pip install -r requirements.txt
uvicorn main:app --reload --host 0.0.0.0 --port 8000

Frontend Development

cd web-ui
pip install -r requirements.txt
streamlit run app.py

ML Service Development

cd ml_service
pip install -r requirements.txt
python train.py

API Documentation

The API documentation is available at http://localhost:8000/docs when the backend service is running. The following endpoints are available:

GET /cars: List all cars
POST /cars: Add a new car
GET /cars/{car_id}: Get car details
GET /health: Check service health
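
As an illustration of calling these endpoints from Python, the sketch below builds a `POST /cars` request with the standard library only. It is a hedged example, not the project's client: `build_add_car_request` and `send` are hypothetical helpers, and the payload field names are assumed to follow the Ford used car dataset columns, so check the Swagger UI for the schema the backend actually expects.

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8000"  # backend address used throughout this README


def build_add_car_request(car: dict, base_url: str = BASE_URL) -> Request:
    """Build (but do not send) a POST /cars request with a JSON body."""
    return Request(
        url=f"{base_url}/cars",
        data=json.dumps(car).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: Request, timeout: float = 5.0) -> dict:
    """Send a request and decode the JSON response body."""
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Assumed payload: illustrative values, with field names that may not
# match the backend's actual schema.
example_car = {
    "model": "Fiesta",
    "year": 2017,
    "transmission": "Manual",
    "mileage": 15000,
    "fuelType": "Petrol",
    "tax": 150,
    "mpg": 57.7,
    "engineSize": 1.0,
}
```

With the backend running, `send(build_add_car_request(example_car))` submits the car. Keeping request construction separate from sending makes it possible to inspect the request without a live server.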

Data Flow

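User submits car data through the Streamlit interface
Streamlit sends the data to the FastAPI backend, which stores it in PostgreSQL
Debezium captures the database change and publishes it to Kafka
ML service consumes the event and makes a prediction using MLflow
Prediction is published back to Kafka, and the FastAPI backend stores the result in PostgreSQL
Updated data is displayed in the Streamlit UI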
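
In the architecture diagram, the ML service receives its input from Kafka as Debezium change events. The sketch below shows one way such an event could be unpacked and a result encoded for publishing. It is a sketch under stated assumptions, not the code in `ml_service/main.py`: the function names, the `id` column, and the result message shape are hypothetical. The `op` codes and the `before`/`after` envelope follow Debezium's standard change event format.

```python
import json
from typing import Optional


def extract_new_car(raw_message: Optional[bytes]) -> Optional[dict]:
    """Return the new row from a Debezium change event, or None.

    Only create ("c") and snapshot-read ("r") events are treated as rows
    that need a prediction. Skipping updates is an assumption made here so
    that writing a prediction back to the table is not processed again.
    """
    if raw_message is None:  # Kafka tombstone: no value to process
        return None
    event = json.loads(raw_message)
    # With schemas enabled, Debezium nests the envelope under "payload".
    envelope = event.get("payload", event)
    if envelope.get("op") not in ("c", "r"):
        return None
    return envelope.get("after")


def build_prediction_message(car: dict, predicted_price: float) -> bytes:
    """Encode a prediction result for publishing back to Kafka.

    The "id" column and the message fields are hypothetical.
    """
    result = {"car_id": car["id"], "predicted_price": round(predicted_price, 2)}
    return json.dumps(result).encode("utf-8")
```

A consumer loop would call `extract_new_car` on each message value, pass the returned row to the model, and publish the output of `build_prediction_message` to a results topic.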

Technologies Used

Backend:
- FastAPI for REST API
- Pydantic for data validation
- SQLAlchemy for database ORM
Frontend:
- Streamlit for UI
- Plotly for data visualization
- Pandas for data manipulation
ML Pipeline:
- MLflow for model management
- scikit-learn for ML models
Storage:
- PostgreSQL for data storage
- MinIO for artifact storage

Contributing

Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add some amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Acknowledgments

Ford Used Car Dataset
MLflow for ML model management
FastAPI for the backend API
Streamlit for the interactive UI
