Zyphra Releases Zamba2-VL: Hybrid Mamba2–Transformer Vision-Language Models

Zyphra has released Zamba2-VL, a family of open vision-language models (VLMs) in 1.2B, 2.7B, and 7B parameter sizes. The models use a hybrid backbone that combines a Mamba2 state-space model with a Transformer, which cuts time-to-first-token latency by roughly an order of magnitude compared with conventional Transformer-based VLMs. Zamba2-VL performs strongly on visual counting and document understanding, with room for improvement on knowledge-heavy reasoning tasks.

Why it matters

Zamba2-VL offers a new vision-language architecture that prioritizes low latency while keeping accuracy competitive. That makes it a promising option for applications that need fast processing of both visual and textual input, especially in edge and mid-range deployments.
