prodigyflow

🚀 ProdigyFlow: Intelligent Data Analytics Agent

ProdigyFlow is an AI-powered multi-agent data analytics framework that automates end-to-end exploratory analysis, visualization, insight generation, and reporting. It helps students, analysts, and researchers turn raw datasets into meaningful business insights with minimal manual effort.

This project has been developed as part of an academic capstone initiative by: Komal Harshita & Priyamvadha Sahasvi Nune,
Department of Computer Science & Engineering.


Project Motivation: Why We Chose This

In real-world business environments, analysts spend 70–80% of their time cleaning, exploring, and summarizing data before any modelling or decision-making. This process is repetitive, time-consuming, and prone to error.

We wanted to build a system that:

ProdigyFlow reflects our goal to create simple, useful, modular tools that solve real analytical problems while being easy for students and businesses to adopt.


🎯 Objectives


🧠 Core Features

| Feature | Description |
| --- | --- |
| Automated Data Analysis Agent | Generates insights, metadata & summaries |
| Visualization Agent | Creates automated charts & visual summaries |
| Gemini-powered AI Summary | Natural-language insights from data |
| Structured Output Formatting | Clean and professional console reporting |
| Modular Agent Design | Add or replace agents independently |
| CSV/Excel Ingestion Support | Easily test custom datasets |
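
To illustrate what "add or replace agents independently" can look like in practice, here is a minimal sketch of a registry-based pipeline. The names `AGENTS`, `register`, and `run_pipeline`, and the dict-based context passed between agents, are assumptions made for this example; they are not taken from the ProdigyFlow source.

```python
from typing import Callable, Dict, List

# Hypothetical contract: an agent takes a context dict and returns a new one.
Agent = Callable[[dict], dict]

# Hypothetical registry mapping an agent name to its function.
AGENTS: Dict[str, Agent] = {}


def register(name: str) -> Callable[[Agent], Agent]:
    """Decorator that stores an agent function in the registry under `name`."""
    def wrap(fn: Agent) -> Agent:
        AGENTS[name] = fn
        return fn
    return wrap


@register("cleaning")
def cleaning_agent(ctx: dict) -> dict:
    # Illustrative rule only: drop rows that contain an empty value.
    rows = [r for r in ctx["rows"] if all(v != "" for v in r.values())]
    return {**ctx, "rows": rows}


@register("analysis")
def analysis_agent(ctx: dict) -> dict:
    # Illustrative insight only: count the rows that survived cleaning.
    return {**ctx, "insights": {"row_count": len(ctx["rows"])}}


def run_pipeline(names: List[str], ctx: dict) -> dict:
    """Run the named agents in order, feeding each one the previous result."""
    for name in names:
        ctx = AGENTS[name](ctx)
    return ctx
```

With this shape, swapping an agent means registering a different function under the same name, and the pipeline order is just a list such as `["cleaning", "analysis"]`.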

🤖 Core System Agents

| Agent Name | Responsibility | Output |
| --- | --- | --- |
| analysis_agent.py | Reads dataset, extracts statistics, generates Gemini summary | Insights, metadata JSON |
| visualization_agent.py | Generates visual graphs and saves locally | PNG charts |
| cleaning_agent.py | Cleans missing values, formatting, and structure | Cleaned dataset |
| test_gemini.py | Tests Gemini API connection | Model response output |
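
As a rough idea of the "reads dataset, extracts statistics" step, the sketch below summarizes a CSV using only the Python standard library. It is not the code in `analysis_agent.py`; the function name `summarize_csv`, the output keys, and the sample data are assumptions chosen for illustration, and the standard library is used here only to keep the example dependency-free.

```python
import csv
import io
import json
import statistics


def summarize_csv(text: str) -> dict:
    """Return row count, column names, and mean/min/max per numeric column."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    columns = list(reader.fieldnames or [])
    numeric = {}
    for col in columns:
        try:
            # Skip empty or missing cells; any non-numeric text raises ValueError.
            values = [float(r[col]) for r in rows if r[col] not in ("", None)]
        except ValueError:
            continue  # treat the column as non-numeric and leave it out
        if values:
            numeric[col] = {
                "mean": statistics.mean(values),
                "min": min(values),
                "max": max(values),
            }
    return {"rows": len(rows), "columns": columns, "numeric": numeric}


# Made-up sample data, with one missing science mark.
sample = "name,maths,science\nAsha,80,90\nRavi,60,\n"
print(json.dumps(summarize_csv(sample), indent=2))
```

A dictionary like this could then be serialized as the metadata JSON or passed to the model as context for the natural-language summary.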

🧪 Running the Project Locally

1️⃣ Create and activate a virtual environment

python -m venv .venv
.\.venv\Scripts\activate      # Windows
source .venv/bin/activate     # macOS/Linux

2️⃣ Install dependencies

pip install -r requirements.txt

3️⃣ Run the Analysis Agent

python agents/analysis_agent.py

4️⃣ Test Gemini API

python agents/test_gemini.py

Add a Custom CSV File

Place your dataset inside the data/ folder and update the path in the code:

file_path = "data/your_file.csv"
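
If editing the path in the code for every dataset becomes tedious, one alternative is to accept it as a command-line argument. The sketch below is only a suggestion under assumptions: the function name `resolve_dataset`, the positional `file_path` argument, and the list of accepted extensions are hypothetical and not part of the current scripts.

```python
import argparse
from pathlib import Path

# Assumed set of supported extensions, based on the CSV/Excel ingestion feature.
ALLOWED_SUFFIXES = {".csv", ".xlsx", ".xls"}


def resolve_dataset(argv=None) -> Path:
    """Parse an optional dataset path, falling back to the README's default."""
    parser = argparse.ArgumentParser(description="Choose the dataset to analyse.")
    parser.add_argument(
        "file_path",
        nargs="?",
        default="data/your_file.csv",
        help="path to a CSV or Excel file (default: %(default)s)",
    )
    args = parser.parse_args(argv)
    path = Path(args.file_path)
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        # parser.error prints a usage message and exits with status 2.
        parser.error(f"unsupported file type: {path.suffix or '(none)'}")
    return path
```

A script could then call `file_path = resolve_dataset()` at startup, so that `python agents/analysis_agent.py data/student_marks.csv` would select the file without a code change.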

Sample Terminal Output Preview

🚀 Running a dry test of analysis_agent...

📂 Using file: data/student_marks.csv

📊 INSIGHTS (Structured Data Overview)
------------------------------------
{ ... dataset overview JSON ... }

🤖 AI-GENERATED SUMMARY
-----------------------
• Key performance trends detected
• Distribution shows variation in subject performance
• Potential improvement insights

📝 METADATA
-----------
{ ... summary JSON ... }

✔ Analysis completed successfully!
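
A report laid out like the preview above (a title, a dashed underline, then the body) could be assembled along the following lines. This is a sketch, not the project's formatter: `format_section` and `format_report` are hypothetical helpers, and the underline length here simply matches the title length.

```python
import json


def format_section(title: str, body: str) -> str:
    """Return the title, a dashed underline of equal length, and the body."""
    return f"{title}\n{'-' * len(title)}\n{body}\n"


def format_report(insights: dict, summary_points: list, metadata: dict) -> str:
    """Join the three report sections, separated by blank lines."""
    sections = [
        format_section("INSIGHTS (Structured Data Overview)",
                       json.dumps(insights, indent=2)),
        format_section("AI-GENERATED SUMMARY",
                       "\n".join(f"• {point}" for point in summary_points)),
        format_section("METADATA", json.dumps(metadata, indent=2)),
    ]
    return "\n".join(sections)
```

Keeping the formatting in one place means every agent can return plain dictionaries and lists, and the console layout can change without touching the analysis code.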

✨ What We Learned


Future Scope

🔹 Build a web-based interface using FastAPI/Streamlit
🔹 Add database integration and Auto-EDA dashboards
🔹 Support PDF report generation
🔹 Multi-file dataset comparison
🔹 Plug-and-play Machine Learning agent

ProdigyFlow is only the beginning; we plan to expand it into a fully intelligent analytical automation assistant.


🤝 Contributors

| Name | Role |
| --- | --- |
| Komal Harshita | Lead Developer, Agent Architecture, AI Integration |
| Priyamvadha Sahasvi Nune | Data Research, Analytics, Testing & Documentation |

🌟 Support

If you like this project, please ⭐ star the repository and share feedback!

📦 Repository: https://github.com/komalharshita/prodigyflow

📘 Project Documentation: https://komalharshita.github.io/prodigyflow/


📝 License

This project is released under the MIT License. Feel free to use or modify it with attribution.


Thank you for exploring ProdigyFlow!