
Zyphra Releases Zamba2-VL: Hybrid Mamba2–Transformer Vision-Language Models

Zyphra has released Zamba2-VL, a family of open vision-language models (VLMs) at 1.2B, 2.7B, and 7B parameters. The models use a hybrid State Space Model (SSM)–Transformer backbone in place of a dense Transformer, which achieves competitive accuracy at significantly lower latency. Zamba2-VL is particularly strong at visual counting and document understanding, and delivers up to an order of magnitude faster Time-to-First-Token (TTFT) than Transformer baselines.

Why it matters

Replacing a dense Transformer backbone with a hybrid Mamba2–Transformer one makes vision-language inference faster and more efficient, which matters most in latency-sensitive applications.
