Chenchen Zhang

Senior Researcher, Tencent · Hunyuan Department

I am a Senior Researcher at Tencent Hunyuan, where I work on RL post-training and agentic code-reasoning systems for foundation models. I have been a core contributor to Hy3.0 preview, HY2.0, Hunyuan-A13B, Hunyuan-TurboS, Hunyuan-T1, and Hunyuan-Large.

Research focus: RL post-training, agentic RL, code/reasoning models, multimodal evaluation, and data-centric model iteration.

Google Scholar · GitHub · X · Hugging Face · Email

Current Focus

What I am working on now

RL for Reasoning

Post-training recipes for math, STEM, code, and long-horizon reasoning models.

Agentic RL

Training loops for web coding agents, tool use, critic feedback, and credit assignment.

Code Intelligence

Repository-level generation, automatic benchmark generation, visual-to-code, and code critique.

Multimodal Evaluation

Evaluation for audio-visual understanding, artifacts, tool use, and model iteration loops.

Hunyuan

Model Family Timeline

A compact view of the Hunyuan model releases and technical reports to which I have been a core contributor.

11/2024

Hunyuan-Large

Open-source MoE model with long-context support.

Details HF Report

03/2025

Hunyuan-T1

Reasoning-focused model with code-reasoning optimization and code evaluation.

Details Tech page

05-06/2025

Hunyuan-TurboS & A13B

Efficient reasoning releases: TurboS with adaptive CoT, and the open-source MoE model A13B with hybrid reasoning.

TurboS A13B TurboS arXiv HF

12/2025

HY2.0

Think and Instruct variants for reasoning and instruction-following.

Details

04/2026

Hy3.0 preview

Open-source preview model with reasoning, coding, and agentic workflows.

Details HF OpenRouter

Research

Research Directions

I currently work on foundation models, with a focus on code and reasoning models, reinforcement learning (including agentic RL), math and STEM reasoning, code intelligence, multimodal systems, and evaluation.

Foundation: Hunyuan Models (Hy3.0 · HY2.0 · A13B · T1 · TurboS · Large)
Optimization: Math / STEM RL (CriticLean · ConceptMath · RLPT)
Agents: Agentic RL (ReLook · Rhombus · Credit Assignment)
Code: Code Intelligence (SWE-Compass · NL2Repo · OpenCoder)
Evaluation: Multimodal & Tool Use (OmniVideoBench · MTU-Bench · ArtifactsBench)

Pretraining & Foundation Models

Hy3.0 preview, HY2.0, Hunyuan-A13B, Hunyuan-TurboS, Hunyuan-T1, Hunyuan-Large, MAP-Neo, OpenCoder, D-CPT Law, E2-LLM, DDK.

Hy3.0 preview · HY2.0 · A13B · TurboS · T1 · Hunyuan-Large · MAP-Neo · OpenCoder · D-CPT Law · DDK

Reasoning & Agentic RL

Math/STEM reasoning optimization, RL for code and reasoning models, reinforcement learning on pre-training data, critic-guided RL, parallel thinking, and credit assignment.

Hunyuan-T1 · RLPT · CriticLean · ConceptMath · MTU-Bench · Credit Assignment · ReLook · Rhombus

Code & Agentic LLMs

Agentic coding evaluation, long-horizon repository generation, automatic benchmark generation, visual-to-code, code critique, artifact evaluation, and code LLM pretraining.

SWE-Compass · WebCompass · NL2Repo-Bench · Diagrams to Code · AutoCodeBench · ReLook · OpenCoder · ArtifactsBench · CodeCriticBench

Evaluation, Multimodal & Surveys

Evaluation across math, code, tool use, audio-visual understanding, multimodal browsing, and group identity, plus survey resources.

OmniVideoBench · MTU-Bench · ConceptMath · GIEBench · ArtifactsBench · Long-context Survey

Updates

News

Selected paper releases, conference updates, and open-source project milestones.

05/2026
ICML 2026: SWE-Compass, NL2Repo-Bench, and From Diagrams to Code.
04/2026
ACL 2026: CriticLean, RLPT, ReLook, and Rhombus.
04/2026
Hy3.0 preview open-sourced; ranked #1 on OpenRouter by daily global API usage.
01/2026
ICLR 2026: AutoCodeBench and OmniVideoBench.
12/2025
Core contributor to the HY2.0 model family.
06/2025
Core contributor to open-source Hunyuan-A13B, an efficient MoE model with hybrid reasoning and agent capabilities.
05/2025
Hunyuan-TurboS technical report released.
05/2025
ACL 2025: OpenCoder.
03/2025
Core contributor to reasoning-focused Hunyuan-T1.
01/2025
ICLR 2025: MTU-Bench.
11/2024
Hunyuan-Large weights and technical report released.
09/2024
NeurIPS 2024: DDK and D-CPT Law.
05/2024
ACL 2024: E2-LLM and ConceptMath.

Selected Work

Selected Publications & Projects

Full list on Google Scholar

Representative models, papers, benchmarks, and survey resources.

Models & Technical Reports

Hy3.0 preview

Open Model 2026

Tencent Hunyuan Team

Open-source Hunyuan 3.0 preview model for reasoning, coding, and agentic workflows.

My role: Core contributor to post-training, RL recipes, and reasoning optimization across STEM and code.

Open-source OpenRouter #1 HF

HF OpenRouter Release

HY2.0

Model Family 2025

Tencent Hunyuan Team

Next-generation Hunyuan model family for reasoning and instruction-following scenarios.

My role: Core contributor to post-training and evaluation for reasoning and instruction-following models.

Reasoning Instruction

Release

Hunyuan-A13B: An Efficient Open-Source MoE Model with 13B Active Parameters

Open Model 2025

Tencent Hunyuan Team

Open-source fine-grained MoE model with 80B total parameters, 13B active parameters, hybrid reasoning, long-context support, and strong agent capabilities.

My role: Core author and contributor, with a contribution level comparable to that on Hunyuan-T1.

Open-source MoE 13B active Hybrid reasoning Agent capabilities

HF Code

Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought

Tech Report 2025

Tencent Hunyuan Team

My role: Core contributor to reasoning model optimization and adaptive CoT evaluation.

Tech report CoT

Report arXiv

Hunyuan-T1

Tech Page 2025

Tencent Hunyuan Team

My role: Core contributor to reasoning post-training, code-reasoning optimization (including the LiveCodeBench SOTA result), and code evaluation.

Reasoning Code reasoning LiveCodeBench SOTA Tech page

Tech page

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

arXiv 2024

Tencent Hunyuan Team

Open-source MoE LLM with long-context support.

My role: Core contributor to model development and evaluation for the Hunyuan open-source LLM family.

Open-source MoE Long-context

HF Report Code

Code & Agentic LLMs

SWE-Compass: Towards Unified Evaluation of Agentic Coding Abilities for Large Language Models

ICML 2026

Jingxuan Xu*, Ken Deng, Weihao Li, Songwei Yu, Yanan Wu, Chenchen Zhang*, et al.

* Equal contribution. My role: Built unified evaluation for agentic coding abilities and code-agent capability analysis.

ICML 2026 Co-first author Agentic coding

Paper

NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents

ICML 2026

Jingzhe Ding, Shengda Long, Changxin Pu, Huan Zhou, Daoguang Zan, Chenchen Zhang, et al.

Paper Code

From Diagrams to Code: Multilingual Programming with Visual Design

ICML 2026

Linzheng Chai, Jian Yang, Shukai Liu, Wei Zhang, Liran Wang, Chenchen Zhang, et al.

Paper Code

ReLook: Vision-Grounded RL with a Multimodal LLM Critic for Agentic Web Coding

ACL 2026

Yuhang Li*, Chenchen Zhang*, Ruilin Lv, Ao Liu, Ken Deng, et al.

* Equal contribution. My role: Designed agentic RL evaluation and multimodal critic loops for web coding.

ACL 2026 Co-first author Agentic RL Multimodal critic

Paper

WebCompass: Towards Multimodal Web Coding Evaluation for Code Language Models

arXiv 2026

Xinping Lei*, Xinyu Che, Junqi Xiong, Chenchen Zhang*, Yukai Huang, Chenyu Zhou, et al.

* Equal contribution. My role: Built multimodal web-coding evaluation and analysis for stronger code and agentic capabilities.

Co-first author Web coding Multimodal eval

Paper

AutoCodeBench: Large Language Models are Automatic Code Benchmark Generators

ICLR 2026

Jason Chou, Ao Liu, Yuchi Deng, Zhiying Zeng, Tao Zhang, Yue Mao, Chenchen Zhang, et al.

Paper Project

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

ACL 2025

Siming Huang, Tianhao Cheng, Jason Klein Liu, Jiaran Hao, Liuyihan Song, Yang Xu, J. Yang, J. H. Liu, Chenchen Zhang, et al.

ACL 2025 Open-source Code LLM

Paper Code

ArtifactsBench: Bridging the Visual-Interactive Gap in LLM Code Generation Evaluation

arXiv 2025

Chenchen Zhang, Yuhang Li, Can Xu, Jiaheng Liu, Ao Liu, et al.

Automated multimodal evaluation for interactive visual artifacts.

My role: First author; built visual-interactive code evaluation for artifact generation.

First author Code evaluation Multimodal

Paper Code

CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models

arXiv 2025

Chenchen Zhang, Marcus Dong, Jiaheng Liu, Wei Zhang, Yejie Wang, et al.

My role: First author; built holistic critique evaluation for code LLMs and critic-model analysis.

First author Code critique Benchmark

Paper Code

RL & Reasoning

CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization

ACL 2026

Zhongyuan Peng, Yifan Yao, Kaijing Ma, Shuyue Guo, Yizhe Li, Chenchen Zhang, et al.

ACL 2026 Math formalization

Paper

Reinforcement Learning on Pre-Training Data

ACL 2026

Siheng Li, Kejiao Li, Zenan Xu, Guanhua Huang, Zihao Zheng, Chenchen Zhang, et al.

ACL 2026 RLPT

Paper OpenReview

Rhombus: Incentivizing Coordination in Parallel Thinking through Reinforcement Learning

ACL 2026

Including Chenchen Zhang et al.

ACL 2026 Parallel thinking

OpenReview

From Reasoning to Agentic: Credit Assignment in Reinforcement Learning for Large Language Models

arXiv 2026

Chenchen Zhang et al.

Survey and resource hub for reinforcement learning credit assignment in LLMs and agents.

Survey Agentic RL GitHub

Paper Code

Foundation, Pretraining & Efficiency

MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series

arXiv 2024

Ge Zhang, Scott Qu, Jiaheng Liu, Chenchen Zhang, et al.

Paper Code

DDK: Distilling Domain Knowledge for Efficient Large Language Models

NeurIPS 2024

Jiaheng Liu*, Chenchen Zhang*, Jinyang Guo, Yuanxing Zhang, et al.

* Equal contribution. My role: Co-first author.

NeurIPS 2024 Co-first author Efficient LLMs

Paper

D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models

NeurIPS 2024

Haoran Que, Jiaheng Liu, Ge Zhang, Chenchen Zhang, et al.

My role: Core author; wrote most of the underlying codebase for domain-specific continual pre-training and scaling-law experiments.

NeurIPS 2024 Core author Scaling law

Paper

E2-LLM: Efficient and Extreme Length Extension of Large Language Models

ACL 2024

Jiaheng Liu, Zhiqi Bai, Yuanxing Zhang, Chenchen Zhang, et al.

Paper

Multimodal, Evaluation & Surveys

OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs

ICLR 2026

Caorui Li, Yu Chen, Yiyan Ji, Jin Xu, Zhenyu Cui, Shihao Li, Yuanxing Zhang, et al., including Chenchen Zhang

Paper

MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models

ICLR 2025

Pei Wang, Yanan Wu, Zekun Wang, Jiaheng Liu, Xiaoshuai Song, Zhongyuan Peng, Ken Deng, Chenchen Zhang, et al.

ICLR 2025 Tool use

Paper Project

ConceptMath: A Bilingual Concept-wise Benchmark for Mathematical Reasoning of LLMs

ACL 2024

Yanan Wu, Jie Liu, Xingyuan Bu, Jiaheng Liu, Chenchen Zhang, et al.

ACL 2024 Math reasoning

Paper

GIEBench: Towards Holistic Evaluation of Group Identity-based Empathy for Large Language Models

EMNLP 2024

Leyan Wang, Yonggang Jin, Tianhao Shen, Tianyu Zheng, Xinrun Du, Chenchen Zhang, et al.

Paper Code

A Comprehensive Survey on Long Context Language Modeling

Survey

LCLM-Horizon contributors

Survey Long-context GitHub

Code

Background

Research, AI infrastructure, and engineering roles across large-scale language model systems.

Experience

Tencent

Hunyuan Department · Senior Researcher

2024.06 - Present

Alibaba

Algorithm Engineer · AI Infrastructure & Research, with early work focused on AI infrastructure

2022.06 - 2024.06

Alibaba

Algorithm Engineer Intern · AI Infrastructure & Research Systems

2021.04 - 2022.06

Baidu

Algorithm Engineer Intern

2020.09 - 2021.06