Turn markdown notes into a small local model you can chat with.
md2LLM reads your personal markdown notes, generates training pairs, fine-tunes a compact base model, and runs the result locally through a browser workflow. Your notes stay on your machine except for the optional API call used to create training data.
Your notes -> Training data -> Fine-tuned model -> Chat
- Point md2LLM at an Obsidian vault or any folder of markdown files
- Review the notes selected for training data generation
- Generate Q&A, style, reasoning, or chatbot training pairs
- Select a base model; md2LLM maps it to a Hugging Face repo for training
- Fine-tune locally on MLX or Unsloth, or use the Google Colab fallback
- Export a model artifact and chat with it through the local app
- Python 3.10 or higher
- Node.js 18 or higher
- Ollama installed and running (ollama.com)
- OpenAI API key (for training data generation)
- 16GB+ RAM or an NVIDIA GPU recommended for local training
# Clone the repo
git clone https://github.com/yourusername/md2LLM
cd md2LLM
# Install Python dependencies
pip install -r requirements.txt
# Install frontend dependencies
cd frontend && npm install && cd ..

# Copy the environment template
cp .env.example .env
# Open .env and add your OpenAI API key
# OPENAI_API_KEY=your-openai-api-key-here

# Start the backend server
uvicorn server.app:app --host 127.0.0.1 --port 8000
# In a separate terminal, start the frontend (development)
cd frontend && npm run dev
# Open your browser
# http://localhost:5173

- Upload your Obsidian vault or point to a markdown folder
- Pick your goal - Knowledge, Style, Reasoning, or Chatbot
- Generate training data from your notes
- Select a base model, then confirm the Hugging Face repo used for training
- Configure training settings
- Start training - takes 15-40 minutes depending on hardware
- Chat with your personalized model
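Generated training data is written as JSONL, one JSON object per line. The exact schema md2LLM produces is not shown here; the chat-message shape below is an illustrative assumption, not the project's actual format.

```python
import json

# Hypothetical shape of a single Q&A training pair; the schema actually
# produced by md2LLM's data generator may differ.
pair = {
    "messages": [
        {"role": "user", "content": "What did I write about spaced repetition?"},
        {"role": "assistant", "content": "Your notes describe spaced repetition as ..."},
    ]
}

# One such serialized object per line makes up the JSONL training file.
print(json.dumps(pair))
```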
| Hardware | Training Time | Notes |
|---|---|---|
| Mac Apple Silicon 16GB+ | 15-25 min | Uses MLX - recommended |
| Mac Apple Silicon 8GB | Varies | Colab recommended; local training may freeze |
| NVIDIA GPU 8GB+ | 10-20 min | Uses Unsloth |
| NVIDIA GPU 4-6GB | 20-35 min | Use smaller model |
| CPU only or low RAM | 15-25 min | Uses Google Colab fallback |
| Google Colab | 15-25 min | Free T4 GPU |
Training uses Hugging Face model repositories. Before training starts, md2LLM
checks the local Hugging Face cache at ~/.cache/huggingface/hub for the
selected repo. If the model is not cached, md2LLM downloads it from Hugging Face.
Ollama is only used later to run the exported model after training.
| UI model family | Training repo |
|---|---|
| Qwen 2.5 1.5B | Qwen/Qwen2.5-1.5B-Instruct |
| Llama 3.2 1B | meta-llama/Llama-3.2-1B-Instruct |
| Llama 3.2 3B | meta-llama/Llama-3.2-3B-Instruct |
| Phi-3 Mini | microsoft/Phi-3-mini-4k-instruct |
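The table above amounts to a simple lookup from UI model family to training repo. The dictionary name below is illustrative, not taken from md2LLM's source.

```python
# UI model family -> Hugging Face repo used for training (from the table above)
MODEL_REPOS = {
    "Qwen 2.5 1.5B": "Qwen/Qwen2.5-1.5B-Instruct",
    "Llama 3.2 1B": "meta-llama/Llama-3.2-1B-Instruct",
    "Llama 3.2 3B": "meta-llama/Llama-3.2-3B-Instruct",
    "Phi-3 Mini": "microsoft/Phi-3-mini-4k-instruct",
}
```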
After training, md2LLM exports the result so it can be loaded and run locally.
Select Hugging Face repo
|
v
Check ~/.cache/huggingface/hub
|
+--> cached: train immediately
|
+--> not cached: download from Hugging Face, then train
|
v
Export trained model
|
v
Load/run with Ollama
The app runs locally on your machine. Training data generation uses your configured LLM provider, so note excerpts sent for pair generation follow that provider's API policy. Local training, model export, and chat run through the local backend and your local model files.
frontend/ React + Vite app
|
v
server/ FastAPI routes
|
+--> pipeline/ markdown reader and training data generator
|
+--> training/ MLX, Unsloth, export, and Colab notebook
|
+--> models/ trained model artifacts
|
+--> output/ generated JSONL training data
frontend/ contains the React workflow:
- SelectVault loads markdown files and starts a generation job
- ReviewFiles lets the user inspect candidate notes
- Generate streams training data generation progress
- SelectModel and TrainConfig choose a base model and training settings
- TrainRun checks hardware, runs local training, or shows Colab steps
- Chat opens the local chat experience after the model is available
The Vite dev server proxies API requests to the FastAPI backend.
server/app.py creates the FastAPI app and mounts route modules:
- server/routes/jobs.py manages job state, generated files, and downloads
- server/routes/training.py detects hardware, recommends local vs. Colab training, and starts training jobs
- server/routes/models.py discovers Ollama and local model files
- server/routes/chat.py handles chat requests
- server/routes/frontend.py serves the built frontend in production
Runtime configuration and output paths live in server/config.py.
pipeline/vault_reader.py parses markdown files, extracting frontmatter, wikilinks,
and tags, and assesses note quality. pipeline/data_generator.py turns accepted
notes into JSONL training pairs using the selected goal mode. Generated data is
written to output/.
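The kind of parsing vault_reader.py performs can be sketched with a few regexes. This is an illustrative sketch, not the project's implementation; the real parser and its return shape may differ.

```python
import re

# [[Target]] or [[Target|alias]] -> captures "Target"
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]+)?\]\]")
# #tag, including nested tags like #ml/notes
TAG = re.compile(r"(?<!\S)#([\w/-]+)")

def parse_note(text: str) -> dict:
    """Split off YAML frontmatter, then collect wikilinks and tags from the body."""
    frontmatter, body = "", text
    if text.startswith("---\n"):
        end = text.find("\n---", 4)
        if end != -1:
            frontmatter = text[4:end]
            body = text[end + 4:]
    return {
        "frontmatter": frontmatter,
        "wikilinks": WIKILINK.findall(body),
        "tags": TAG.findall(body),
    }

print(parse_note("---\ntitle: Demo\n---\nSee [[Other Note]] and #ml/notes"))
```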
training/ contains the fine-tuning entry points:
- train_mlx.py for Apple Silicon MLX training
- train_unsloth.py for NVIDIA GPU training with Unsloth
- trainer.py for shared dataset and progress helpers
- export.py and Modelfile for local model export and Ollama setup
- md2LLM_colab.ipynb for the Google Colab fallback path
The backend blocks local training on weak devices and routes the user to Colab when CPU-only or low-memory hardware would be unsafe.
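The routing decision follows the hardware table earlier in this README. The function below is a sketch of that logic; the thresholds and function name are assumptions, and the real detection code lives in server/routes/training.py.

```python
def recommend_backend(total_ram_gb: float,
                      has_nvidia_gpu: bool,
                      is_apple_silicon: bool) -> str:
    """Pick a training path per the hardware table (illustrative thresholds)."""
    if is_apple_silicon and total_ram_gb >= 16:
        return "mlx"        # local Apple Silicon training
    if has_nvidia_gpu:
        return "unsloth"    # local NVIDIA GPU training
    return "colab"          # CPU-only or low memory: route to Colab

print(recommend_backend(8.0, False, True))  # 8GB Mac -> "colab"
```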
See docs/architecture.md for a deeper technical overview.
MIT