Jie Yang 杨杰

Senior Researcher

WeChat Vision, Tencent
📍 Beijing

Email: cvjieyang@tencent.com

Biography

I am currently a Senior Researcher at WeChat Vision, Tencent, working on multimodal foundation models. Before that, I received my Ph.D. from The Chinese University of Hong Kong, Shenzhen, under the co-supervision of Prof. Ruimao Zhang and Prof. Zhen Li. During my Ph.D. studies (2021-2025), I had a wonderful time interning at IDEA, BIGAI, SenseTime Research, and Tencent.

We are actively seeking self-motivated interns to work on related research topics, including image/video/omni pretraining/post-training and reinforcement learning. If you're interested, feel free to reach out!

News

One paper is accepted to NeurIPS 2025.
We present WeThink for general-purpose vision-language reasoning.
One paper is accepted to T-PAMI.
One paper is accepted to CVPR 2025.
One paper is accepted to ICRA 2025.
One paper is accepted to NeurIPS 2024.
Grounding DINO is selected as the Most Influential Paper in ECCV 2024.
Three papers are accepted to ECCV 2024.
One paper is accepted to CVPR 2024.
We present X-Pose to detect any keypoints of any objects.
Grounded SAM is accepted to the ICCV 2023 Demo Track.
One paper is accepted to ICCV 2023.
Two papers are accepted to MIDL 2023, and one is selected as an oral presentation.
One paper is accepted to CVPR 2023.
One paper is accepted to ICLR 2023.
One paper is accepted to NeurIPS 2022.

Selected Publications [Google Scholar]

WeThink: Toward General-purpose Vision-Language Reasoning via Reinforcement Learning
Jie Yang, Feipeng Ma, Zitian Wang, Dacheng Yin, Kang Rong, Fengyun Rao, Ruimao Zhang.
arXiv preprint, 2025.

VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning
Li Kang, Xiufeng Song, Heng Zhou, Yiran Qin, Jie Yang, Xiaohong Liu, Philip Torr, Lei Bai, Zhenfei Yin.
Conference on Neural Information Processing Systems Datasets and Benchmarks Track (NeurIPS D&B), 2025.

ED-Pose++: Enhanced Explicit Box Detection for Conventional and Interactive Multi-Object Keypoint Detection
Jie Yang, Ailing Zeng, Tianhe Ren, Shilong Liu, Feng Li, Ruimao Zhang, Lei Zhang.
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2025.

InteractAnything: Zero-shot Human Object Interaction Synthesis via LLM Feedback and Object Affordance Parsing
Jinlu Zhang, Yixin Chen, Zan Wang, Jie Yang, Yizhou Wang, Siyuan Huang.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2025.

Unlock the Power of Unlabeled Data in Language Driving Model
Chaoqun Wang, Jie Yang, Xiaobin Hong, Ruimao Zhang.
IEEE International Conference on Robotics and Automation (ICRA), 2025.

KptLLM: Unveiling the Power of Large Language Model for Keypoint Comprehension
Jie Yang, Wang Zeng, Sheng Jin, Lumin Xu, Wentao Liu, Chen Qian, Ruimao Zhang.
Conference on Neural Information Processing Systems (NeurIPS), 2024.

X-Pose: Detecting Any Keypoints
Jie Yang, Ailing Zeng, Ruimao Zhang, Lei Zhang.
European Conference on Computer Vision (ECCV), 2024.

F-HOI: Toward Fine-grained Semantic-Aligned 3D Human-Object Interactions
Jie Yang, Xuesong Niu, Nan Jiang, Ruimao Zhang, Siyuan Huang.
European Conference on Computer Vision (ECCV), 2024.

Open-World Human-Object Interaction Detection via Multi-modal Prompts
Jie Yang, Bingliang Li, Ailing Zeng, Lei Zhang, Ruimao Zhang.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024.

Neural Interactive Keypoint Detection
Jie Yang, Ailing Zeng, Feng Li, Shilong Liu, Ruimao Zhang, Lei Zhang.
IEEE International Conference on Computer Vision (ICCV), 2023.

Semantic Human Parsing via Scalable Semantic Transfer over Multiple Label Domains
Jie Yang, Chaoqun Wang, Zhen Li, Junle Wang, Ruimao Zhang.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.

Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation
Jie Yang, Ailing Zeng, Shilong Liu, Feng Li, Ruimao Zhang, Lei Zhang.
International Conference on Learning Representations (ICLR), 2023.

Toward Unpaired Multi-modal Medical Image Segmentation via Learning Structured Semantic Consistency
Jie Yang, Ye Zhu, Chaoqun Wang, Zhen Li, Ruimao Zhang.
International Conference on Medical Imaging with Deep Learning (MIDL), 2023.

Honors & Awards

The First Prize Scholarship, 2018, 2019, 2020.
National Scholarship, 2018.