Tencent Open-Sources Hy-MT2 Series: Three Models Redefine Translation
"Fast-thinking" multilingual translation models are here.
TL;DR
On May 21, 2026, Tencent Hunyuan officially open-sourced the Hy-MT2 family of multilingual translation models with three sizes:
- Hy-MT2-1.8B — Lightweight, fits in a phone at 440MB quantized
- Hy-MT2-7B — Mid-range, runs on a single GPU
- Hy-MT2-30B-A3B — MoE architecture, 30B total params, 3B active
All three support 33 languages including Mandarin, Cantonese, English, French, Japanese, Korean, Arabic, Russian, Tibetan, and Uyghur.
GitHub: 213 stars. HuggingFace and ModelScope available now.
What is "Fast-Thinking" Translation?
Traditional LLM translation uses "slow thinking" — understand the full semantics first, then generate. Tencent introduced a "fast-thinking" paradigm here: react like a professional human translator, cutting unnecessary reasoning overhead.
Results: 7B and 30B-A3B in fast-thinking mode outperform DeepSeek-V4-Pro and Kimi K2.6 on translation tasks. And the lightweight 1.8B model beats mainstream commercial APIs like Microsoft and Doubao overall.
That's remarkable — a 1.8B on-device model outperforming commercial cloud services.
Choosing the Right Model
Hy-MT2-1.8B: The Mobile Powerhouse
- Parameters: 1.8B
- Quantized size: 440MB (1.25-bit extreme quantization)
- Inference speed: 1.5x faster
Target: on-device deployment. Phones, tablets, embedded devices. Tencent's AngelSlim quantization compresses 1.8B to 440MB while actually speeding up inference.
Available on HuggingFace in FP8, GGUF, and even 2-bit / 1.25-bit extreme quantization variants.
Hy-MT2-7B: The Sweet Spot
- Parameters: 7B
- Recommended: Single A100 or RTX 4090
- Quantization: FP8 / GGUF
7B is the most popular open-source model size. Tencent provides four inference solutions: transformers, vLLM, SGLang, and llama.cpp — covering research to production.
Ideal for server-side deployment where you need high quality without operating a massive model.
Hy-MT2-30B-A3B: MoE Brutality
- Architecture: Mixture of Experts (MoE)
- Total params: 30B
- Active params: 3B per forward pass
MoE logic: 30B knowledge, 3B compute cost. Only 3B parameters activate per inference, but theoretically taps 30B's knowledge capacity.
Best for highest translation quality demands: legal, medical, or technical documentation.
Supported Languages (33)
Mandarin, Cantonese, English, French, Spanish, Japanese, Korean, Russian, Arabic, Thai, Vietnamese, Hindi, Traditional Chinese, Tibetan, Uyghur, and more.
Instruction-Following Capabilities
Hy-MT2 isn't just a "translator." It follows complex translation instructions:
- Terminology consistency: Provide reference translations, model keeps terminology unified
- Style control: Specify formal/casual/literary tone
- Delimiter preservation: Special characters in code/templates stay intact
- Structured data translation: JSON keys don't translate, only user-visible text
- Personalized preferences: e.g., "translate with a Northern Chinese dialect feel"
- Context integration: Provide background, model translates with context
These capabilities are evaluated via IFMTBench, which Tencent also open-sourced.
Deployment Options
- Quick prototype / research: transformers (HuggingFace)
- Production / high throughput: vLLM / SGLang
- Local lightweight: llama.cpp (GGUF)
- Mobile / on-device: AngelSlim 1.25-bit quantization
llama.cpp inference relies on Tencent's open-source STQ kernel (llama.cpp PR #22836) — requires building from source.
Open Source Ecosystem
- HuggingFace: https://huggingface.co/collections/tencent/hy-mt2
- ModelScope: https://modelscope.cn/collections/Tencent-Hunyuan/Hy-MT2
- GitHub: https://github.com/Tencent-Hunyuan/Hy-MT2
- AngelSlim: https://github.com/tencent/AngelSlim
Tencent also partnered with WMT26 to sponsor a video subtitle translation task.
Summary
Hy-MT2's core strengths:
- Three sizes covering phones to servers
- 33 languages + 5 dialects, true multilingual
- "Fast-thinking" paradigm for efficient inference
- Strong instruction following beyond plain MT
- Fully open source: quantization tools, inference scripts, training pipeline
If you're building multilingual products, translation tools, or need high-quality localization, Hy-MT2 is worth trying. A 1.8B model that fits in a phone is interesting enough on its own.