kvcache
Here are 9 public repositories matching this topic...
R-KV: Redundancy-aware KV Cache Compression for Reasoning Models
Updated Aug 8, 2025 - Python
kvcached: Elastic KV cache for dynamic GPU sharing and efficient multi-LLM inference.
Updated Aug 18, 2025 - Python
(ACL 2025 oral) SCOPE: Optimizing KV Cache Compression in Long-context Generation
Updated May 28, 2025 - Jupyter Notebook
PiKV: KV Cache Management System for Mixture of Experts [Efficient ML System]
Updated Aug 17, 2025 - Python
Span Queries: What if we had a way to plan and optimize GenAI like we do for SQL?
Updated Aug 19, 2025 - Rust
This project implements an Emotion-Aware Music Generator (EAMG) that turns natural-language prompts into emotion-aligned music in real time. It uses a LoRA-tuned DistilBERT to classify emotions, maps them to musical parameters using music theory, and generates MIDI via a transformer model with KV caching for low-latency output.
Updated Jul 6, 2025 - Jupyter Notebook
[SIGMOD 2025] PQCache: Product Quantization-based KVCache for Long Context LLM Inference
Updated Feb 3, 2025 - Python