A minimal, easy-to-read PyTorch re-implementation of Qwen3 and Qwen2.5-VL, supporting both text + vision as well as dense and mixture-of-experts models.
If you find Hugging Face code verbose and challenging to interpret, this repo is for you!
Join my Discord channel for more discussion!
I recommend using uv and creating a virtual environment:
```bash
pip install uv && uv venv

# activate the environment
source .venv/bin/activate   # Linux/macOS
.venv\Scripts\activate      # Windows

# install dependencies
uv pip install -r requirements.txt
```
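Optionally, check that PyTorch can see your GPU before launching; the examples below use `device="cuda"`. This assumes PyTorch is pulled in by requirements.txt:

```bash
# prints True if a CUDA device is visible to PyTorch
python -c "import torch; print(torch.cuda.is_available())"
```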
Launch the interactive chat:

```bash
python run.py
```
Note: Qwen3 is text-only. Use `@path/to/image.jpg` to reference images with Qwen2.5-VL.
```text
USER: @data/test-img-1.jpg tell me what you see in this image?
✓ Found image: data/test-img-1.jpg
ASSISTANT: The image shows a vibrant sunflower field with a close-up of a sunflower...
```
Running Qwen2.5-VL:
```python
from PIL import Image

from model.model import Qwen2VL
from model.processor import Processor

model_name = "Qwen/Qwen2.5-VL-3B-Instruct"
model = Qwen2VL.from_pretrained(repo_id=model_name, device_map="auto")
processor = Processor(repo_id=model_name, vision_config=model.config.vision_config)

# The context is a flat list interleaving chat-template text with PIL images
context = [
    "<|im_start|>user\n<|vision_start|>",
    Image.open("data/test-img-1.jpg"),
    "<|vision_end|>What's on this image?<|im_end|>\n<|im_start|>assistant\n",
]

# tokenize the text and preprocess the image patches in one pass
inputs = processor(context, device="cuda")

# stream=True yields token ids one at a time as they are generated
generator = model.generate(
    input_ids=inputs["input_ids"],
    pixels=inputs["pixels"],
    d_image=inputs["d_image"],
    max_new_tokens=64,
    stream=True,
)

for token_id in generator:
    token_text = processor.tokenizer.decode([token_id])
    print(token_text, end="", flush=True)
print()
```
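The context list can also hold more than one image. Below is a minimal sketch of a two-image prompt, assuming the processor batches multiple images into `pixels`/`d_image` the same way; the second image path is hypothetical and the per-image vision-token spans are an assumption, not verified against the processor internals:

```python
from PIL import Image

# Assumption: each image sits inside its own
# <|vision_start|> ... <|vision_end|> span.
context = [
    "<|im_start|>user\n<|vision_start|>",
    Image.open("data/test-img-1.jpg"),
    "<|vision_end|><|vision_start|>",
    Image.open("data/test-img-2.jpg"),  # hypothetical second image
    "<|vision_end|>How do these two images differ?<|im_end|>\n<|im_start|>assistant\n",
]

inputs = processor(context, device="cuda")
generator = model.generate(
    input_ids=inputs["input_ids"],
    pixels=inputs["pixels"],
    d_image=inputs["d_image"],
    max_new_tokens=64,
    stream=True,
)
```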
Running Qwen3:
```python
from model.model import Qwen3MoE
from model.processor import Processor

model_name = "Qwen/Qwen3-4B-Instruct-2507"
model = Qwen3MoE.from_pretrained(repo_id=model_name)
processor = Processor(repo_id=model_name)

# Qwen3 is text-only, so the context needs no vision tokens or images
context = [
    "<|im_start|>user\nExplain reverse linked list<|im_end|>\n<|im_start|>assistant\n",
]

inputs = processor(context, device="cuda")

generator = model.generate(
    input_ids=inputs["input_ids"],
    max_new_tokens=64,
    stream=True,
)

for token_id in generator:
    token_text = processor.tokenizer.decode([token_id])
    print(token_text, end="", flush=True)
print()
```
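If you'd rather get the full reply as one string than print tokens as they arrive, you can collect the streamed ids and decode them in a single call. A small sketch, reusing `model`, `processor`, and `inputs` from above and only the streaming API shown there:

```python
generator = model.generate(
    input_ids=inputs["input_ids"],
    max_new_tokens=64,
    stream=True,
)

# drain the generator, then decode the whole sequence at once
token_ids = list(generator)
reply = processor.tokenizer.decode(token_ids)
print(reply)
```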