🏦 AI Bank Statement Automation with LLM & Personal Financial Analysis

Intelligent document automation for bank statements — Extract, structure, analyze, and query financial data from PDFs using YOLO + OCR + LLM, powered by CrewAI agents and local LLMs (LM Studio / Ollama).

✨ Key Features

Advanced PDF Parsing — YOLO layout detection + OCR + LLM-based table extraction
CrewAI Agents & Skills — Specialized skills for bank statement parsing, PII redaction, financial analysis, and RAG querying
Local LLM First — Full support for LM Studio and Ollama via LiteLLM (with async acompletion)
Reasoning Model Support — Automatically handles models that return content in reasoning_content
Secure RAG Pipeline — PII redaction before embedding into vector database (Qdrant / Chroma)
Financial Intelligence — Income/expense categorization, trend analysis, and natural language querying
Async Ready — Modern async LLM calls for better performance with CrewAI
Development Notebook — Rich Jupyter notebook for experimentation and rapid iteration
MLflow Integration — Trace and monitor LLM calls and full AI agent workflows

⚠️ Important Notes for Local LLMs (LM Studio / Ollama)

When using local models via LM Studio or Ollama, please follow these recommendations for best performance:

Model Size: Use models with 9B parameters or larger (e.g. Qwen2.5-14B, Qwen3-27B, Gemma-2-9B, Llama-3.1-8B or above). Smaller models (7B and below) significantly reduce output quality and reasoning ability.
Context Length: Set context size to 16K tokens or higher (recommended 32K+). Lower context sizes (e.g. 8K) often cause the agent to lose important instructions and produce incomplete or incorrect results.
JSON Output Weakness: Local LLMs generally perform poorly with strict JSON/structured output. They tend to mix reasoning with JSON or fail validation.
Recommendation: Prefer asking the agent to output clean Markdown reports instead of forcing JSON.

Tip: If you must use structured output, consider post-processing the Markdown result with a second LLM call or instructor library.

🛠 Tech Stack

Component	Technology
LLM Orchestration	LiteLLM (LM Studio, Ollama, OpenAI, DeepSeek, etc.)
Agent Framework	CrewAI + Skills
Document Processing	PyMuPDF, YOLO, OCR
Vector Database	Qdrant, Chroma
RAG	LangChain + PII redaction
Tracing & Monitoring	MLflow
Frontend (Optional)	Streamlit

📁 Project Structure

AI-Bank-Statement-Document-Automation/
├── backend/
│   └── app/
│       ├── core/
│       │   └── ai_agent_skills_dev.ipynb   # ← Main development notebook
│       ├── skills/                         # CrewAI Skills
│       │   ├── bank-statement-parsing/
│       │   ├── financial-analysis/
│       │   ├── pii-handling/
│       │   └── output-formatting/
│       └── ...
├── data/
│   ├── bank-statement-document/            # Sample PDFs
│   └── uploads/
├── requirements.txt
└── README.md

🚀 Quick Start (Local LLM Recommended)

1. Setup Environment

git clone https://github.com/johnsonhk88/AI-Bank-Statement-Document-Automation-By-LLM-And-Personal-Finanical-Analysis-Prediction.git
cd AI-Bank-Statement-Document-Automation-By-LLM-And-Personal-Finanical-Analysis-Prediction

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

2. Start LM Studio (Recommended)

Open LM Studio

Load a 9B+ model (e.g. qwen2.5-14b-instruct or qwen3-27b)
Set Context Length to 16K or higher (32K preferred)
Start the local server on http://localhost:1234

3. Run the Main Notebook

jupyter notebook backend/app/core/ai_agent_skills_dev.ipynb

🔧 Key Configuration

Using LM Studio (Recommended)

llm = LLM(
    model="openai/qwen2.5-14b-instruct",   # Use 9B+ model
    base_url="http://localhost:1234/v1",
    api_key="lm-studio",
    temperature=0.6,
    max_tokens=2048,
)

Async Direct Calls

from litellm import acompletion

response = await acompletion(
    model="openai/your-model",
    base_url="http://localhost:1234/v1",
    api_key="lm-studio",
    messages=[{"role": "user", "content": prompt}]
)

MLflow Tracing (Agent Flow Monitoring)

MLflow is integrated to help you trace and debug the full AI agent workflow, including LLM calls, tool usage, and task execution.

import mlflow
mlflow.crewai.autolog()
mlflow.litellm.autolog()

You can view the agent execution traces in the MLflow UI.

🗺 Roadmap

Production FastAPI backend
Multi-document collections / workspaces
Advanced financial forecasting
Docker + Kubernetes deployment
Improved Streamlit dashboard with charts

📄 License

This project is licensed under the Apache License 2.0.

Made with ❤️ for smarter personal finance automation in Hong Kong. ⭐ If you find this project useful, please consider giving it a star!

Name		Name	Last commit message	Last commit date
Latest commit History 47 Commits
backend		backend
data/bank-statement-document		data/bank-statement-document
frontend/streamlit_app		frontend/streamlit_app
notebooks		notebooks
scripts		scripts
yolo-base-layout-analysis		yolo-base-layout-analysis
.env_test		.env_test
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🏦 AI Bank Statement Automation with LLM & Personal Financial Analysis

✨ Key Features

⚠️ Important Notes for Local LLMs (LM Studio / Ollama)

🛠 Tech Stack

📁 Project Structure

🚀 Quick Start (Local LLM Recommended)

1. Setup Environment

2. Start LM Studio (Recommended)

Open LM Studio

3. Run the Main Notebook

🔧 Key Configuration

Using LM Studio (Recommended)

Async Direct Calls

MLflow Tracing (Agent Flow Monitoring)

MLflow is integrated to help you trace and debug the full AI agent workflow, including LLM calls, tool usage, and task execution.

🗺 Roadmap

📄 License

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

🏦 AI Bank Statement Automation with LLM & Personal Financial Analysis

✨ Key Features

⚠️ Important Notes for Local LLMs (LM Studio / Ollama)

🛠 Tech Stack

📁 Project Structure

🚀 Quick Start (Local LLM Recommended)

1. Setup Environment

2. Start LM Studio (Recommended)

Open LM Studio

3. Run the Main Notebook

🔧 Key Configuration

Using LM Studio (Recommended)

Async Direct Calls

MLflow Tracing (Agent Flow Monitoring)

MLflow is integrated to help you trace and debug the full AI agent workflow, including LLM calls, tool usage, and task execution.

🗺 Roadmap

📄 License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages