
    Tencent Researchers Launch Tencent HY-MT1.5: New Translation Models Featuring 1.8B and 7B Models Designed for Seamless On-Device and Cloud Deployment

    By Naveed Ahmad · 05/01/2026 · Updated: 06/02/2026 · 4 Mins Read

    **Tencent Researchers Roll Out HY-MT1.5: A New AI Translation Model That Dares to Simplify On-Device and Cloud Translation**

    Hey there, language enthusiasts! Today, we’re excited to dive into Tencent’s latest AI translation model, HY-MT1.5. This impressive model is designed for both on-device and cloud deployment, and the best part? It comes in two flavors: HY-MT1.5-1.8B and HY-MT1.5-7B.

    **Breaking Down the Models**

    So, what’s the difference between these two models? Well, the HY-MT1.5-7B model is essentially an upgraded version of the WMT25 championship system, Hunyuan-MT-7B. It’s optimized for explanatory translation and mixed-language scenarios, which means it’s got native support for terminology intervention, contextual translation, and formatted translation. This model is perfect for server and high-end edge deployment, where latency isn’t a major concern.

    On the other hand, the HY-MT1.5-1.8B model is a more compact variant, with fewer parameters than its bigger sibling. But don’t let that fool you – it still delivers comparable translation performance in reported benchmarks. After some quantization magic, this model can run on edge devices, supporting real-time translation. Yes, you read that right – real-time translation on the edge!
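    Curious what calling one of these models looks like? Below is a minimal sketch using the Hugging Face transformers library. To be clear, the repo id and the plain-instruction prompt are our own assumptions for illustration; check the official release for the exact names and prompt format.

```python
# Minimal sketch: translating with HY-MT1.5 via Hugging Face transformers.
# The repo id "tencent/HY-MT1.5-1.8B" is a hypothetical placeholder, and the
# plain-instruction prompt is a guess; consult the official release for both.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tencent/HY-MT1.5-1.8B"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Translate the following text from Chinese to English:\n\n今天天气很好。"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```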

    **The Holistic Training Framework**

    So, how did Tencent’s researchers manage to create such an impressive model? They used a multi-stage pipeline that includes:

    1. Basic pre-training: The base model is trained on large-scale multilingual text with a language modeling objective, building shared representations across languages.
    2. MT-oriented pre-training: The model is then exposed to parallel corpora and translation-oriented objectives, aligning the generation distribution with real translation tasks.
    3. Supervised fine-tuning: The model is fine-tuned with a supervised loss, sharpening literal correctness, semantic preservation, and direction-specific behavior, like translation from ZH to EN versus EN to ZH.
    4. On-policy distillation from 7B to 1.8B: HY-MT1.5-7B is used as a teacher for HY-MT1.5-1.8B, with about 1 million monolingual prompts across 33 languages run through the teacher, and a reverse Kullback-Leibler divergence on the student rollouts used to match the teacher distribution (see the sketch after this list).
    5. Reinforcement learning with rubrics-based evaluation: In the final stage, both models are optimized with a group relative policy optimization (GRPO) algorithm and a rubrics-based reward model. Human reviewers rate translations on a number of axes, such as accuracy, fluency, idiomaticity, and cultural appropriateness.
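    To make step 4 a little more concrete, here is a toy sketch of a reverse-KL distillation loss over student rollouts, written in PyTorch. This is our illustration of the general technique, not Tencent’s training code; the model calls and tensor shapes are simplified assumptions.

```python
# Toy sketch of on-policy distillation with a reverse-KL objective (step 4).
# NOT Tencent's training code: model calls, shapes, and the sampling loop
# are simplified assumptions for illustration.
import torch
import torch.nn.functional as F

def reverse_kl_loss(student_logits: torch.Tensor,
                    teacher_logits: torch.Tensor) -> torch.Tensor:
    """KL(student || teacher), averaged over rollout tokens.

    Both tensors have shape [batch, seq_len, vocab] and are computed on
    the student's own samples, which is what makes this "on-policy".
    """
    student_logp = F.log_softmax(student_logits, dim=-1)
    teacher_logp = F.log_softmax(teacher_logits, dim=-1)
    # E_student[log p_student - log p_teacher], summed over the vocabulary
    kl_per_token = (student_logp.exp() * (student_logp - teacher_logp)).sum(-1)
    return kl_per_token.mean()

# Schematic training step, assuming `student` and `teacher` are causal LMs
# and `prompt_ids` comes from the pool of monolingual prompts:
#
#   rollout_ids = student.generate(prompt_ids)        # on-policy sample
#   s_logits = student(rollout_ids).logits
#   with torch.no_grad():
#       t_logits = teacher(rollout_ids).logits
#   loss = reverse_kl_loss(s_logits, t_logits)
#   loss.backward()
```

    The design point worth noting: because the expectation runs over the student’s own samples, reverse KL is mode-seeking, so the student learns to avoid outputs the teacher considers unlikely rather than thinly spreading probability over everything the teacher can do.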

    **Benchmarking the Models**

    HY-MT1.5 was evaluated on Flores 200, WMT25, and a Mandarin-to-minority-language benchmark, scored with XCOMET-XXL and CometKiwi. The results? Well, let’s just say these models are quite impressive:

    * On Flores 200, HY-MT1.5-7B reaches XCOMET-XXL scores of 0.8690 for ZH to XX, 0.9093 for EN to XX, and 0.8098 for XX to XX, outperforming translation-specific models like iFLYTEK Translator and Doubao Translator.
    * On WMT25, HY-MT1.5-7B reaches XCOMET-XXL 0.6159, slightly above Gemini 3.0 Pro and significantly above translation-oriented models like Seed-X-PPO-7B and Tower-Plus-72B.
    * On Mandarin to minority language pairs, HY-MT1.5-7B achieves 0.6174 in XCOMET-XXL, higher than all baselines, including Gemini 3.0 Pro.

    **In-Depth Features for Practical Use**

    The models expose three prompt-driven capabilities that matter in production systems (illustrated in the sketch after this list):

    1. **Terminology intervention**: A prompt template allows you to inject term mappings like “混元珠 → Chaos Pearl”. This is crucial for legal, medical, or brand-constrained content.
    2. **Context-aware translation**: A second template accepts a context block plus the sentence to translate. The report shows the word “pilot” mistranslated as a person when context is absent; when a paragraph about a TV series is added, the model correctly renders “pilot” as an episode.
    3. **Format-preserving translation**: A third template wraps the source in <source> tags and marks spans with <span> tags. The instruction forces the model to keep the tags and emit its output within <target> tags, allowing HTML- or XML-like text to survive translation with structure preserved.
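    To illustrate, here is roughly how such templates might be assembled. The wording and tag handling below are guesses reconstructed from the descriptions above, not the official prompt formats shipped with the models.

```python
# Illustrative prompt templates for the three capabilities above.
# These are reconstructions from the article's descriptions, not the
# official HY-MT1.5 prompt formats.

def terminology_prompt(src: str, terms: dict[str, str]) -> str:
    """Inject term mappings, e.g. {"混元珠": "Chaos Pearl"}."""
    mapping = "\n".join(f"{zh} -> {en}" for zh, en in terms.items())
    return (
        "Translate from Chinese to English, using these term mappings:\n"
        f"{mapping}\n\nText:\n{src}"
    )

def contextual_prompt(sentence: str, context: str) -> str:
    """Supply a context block so ambiguous words (like 'pilot') resolve."""
    return (
        f"Context:\n{context}\n\n"
        f"Translate the following sentence into Chinese:\n{sentence}"
    )

def format_preserving_prompt(src_html: str) -> str:
    """Wrap the source in <source> tags and request output in <target> tags."""
    return (
        "Translate into English. Keep all <span> tags unchanged and put "
        f"the translation inside <target> tags.\n<source>{src_html}</source>"
    )

print(terminology_prompt("混元珠诞生了。", {"混元珠": "Chaos Pearl"}))
```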

    **Quantization and Edge Deployment**

    HY-MT1.5-1.8B is evaluated with FP8 and Int4 post-training quantization using GPTQ. This means it can run efficiently on edge devices, supporting real-time translation.
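    If you want to reproduce that kind of setup yourself, the transformers library can apply GPTQ post-training quantization directly. Here is a minimal sketch, again with a hypothetical repo id and an assumed calibration dataset:

```python
# Sketch: Int4 post-training quantization with GPTQ via transformers.
# The repo id is a hypothetical placeholder, and the "c4" calibration
# dataset is an assumption, not the calibration data Tencent used.
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "tencent/HY-MT1.5-1.8B"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)

gptq_config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=gptq_config, device_map="auto"
)
model.save_pretrained("HY-MT1.5-1.8B-gptq-int4")  # quantized weights for edge use
```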

    **The Bottom Line**

    So, what’s the takeaway from all this? HY-MT1.5 is a two-model translation family, HY-MT1.5-1.8B and HY-MT1.5-7B, supporting mutual translation across 33 languages plus 5 dialect or variant types. Both models are released with open weights on GitHub and Hugging Face, so you can try them out for yourself!

    Want to dive deeper into the paper or try out the model weights on HF or the GitHub repo?
