Welcome to My Homepage!

I’m a general partner of BeingBeyond, a startup dedicated to advancing foundation models for general-purpose humanoid robots, where I collaborate closely with Prof. Zongqing Lu. Prior to this, I was a researcher at the Beijing Academy of Artificial Intelligence (BAAI). I obtained my PhD and bachelor’s degrees from Renmin University of China (RUC), under the guidance of Prof. Qin Jin. My research primarily focuses on human behavior understanding, vision-and-language learning, and the development of open-world embodied agents. Currently, I’m working towards an intelligent humanoid robot. For more details, please refer to my CV.

Join Us!

We are actively recruiting full-time researchers and interns to join our team. If you’re passionate about embodied AI, feel free to reach out.

Research Interests

  • Large language models and large multimodal models
  • Open-world embodied agent learning
  • Human behavior and motion understanding
  • Robot learning

🔥 News

  • 2025.07: 🎉 Our next LMM version, Being-VL-0.5, is released.
  • 2025.07: 🎉 We release Being-H0, the first VLA pretrained from large-scale human videos with hand motion.
  • 2025.06: 🎉 Three papers are accepted to ICCV’25.
  • 2025.06: 🎉 We won 1st place in GemBench Challenge at CVPR 2025 Workshop GRAIL.
  • 2025.05: 🎉 We present our first million-level motion model, Being-M0, which is accepted by ICML 2025.
  • 2024.10: 🎉 We present Being-VL-0, which is accepted by ICLR 2025.

📝 Publications

* denotes equal contribution

🤖 BeingBeyond Series

arxiv

Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos
Hao Luo*, Yicheng Feng*, Wanpeng Zhang*, Sipeng Zheng*, Ye Wang, Haoqi Yuan, Jiazheng Liu, Chaoyi Xu, Qin Jin, Zongqing Lu

Project

  • Being-H0 is the first VLA pretrained from large-scale human videos with hand motion.

arxiv

RLPF: Physical Feedback: Aligning Large Motion Models with Humanoid Control
Junpeng Yue, Zepeng Wang, Yuxuan Wang, Weishuai Zeng, Jiangxing Wang, Xinrun Xu, Yu Zhang, Sipeng Zheng, Ziluo Ding, Zongqing Lu

Project

  • RLPF translates text-driven human motions into executable actions for humanoid robots.

ICML 2025

Being-M0: Scaling Large Motion Models with Million-Level Human Motions
Ye Wang*, Sipeng Zheng*, Bin Cao, Qianshan Wei, Weishuai Zeng, Qin Jin, Zongqing Lu

ICML25

Project

  • Being-M0 is the first large motion generation model scaling to 1 million motion sequences.

ICCV 2025

Being-VL-0.5: Unified Multimodal Understanding via Byte-Pair Visual Encoding
Wanpeng Zhang, Yicheng Feng, Hao Luo, Yijiang Li, Zihao Yue, Sipeng Zheng, Zongqing Lu

ICCV25 (Highlight)

Project | Code

  • Being-VL is the first large multimodal model based on compressed discrete visual representation using 2D-BPE.

Being-VL-0: From Pixels to Tokens: Byte-Pair Encoding on Quantized Visual Modalities (ICLR 2025) | page

🎙 Before BeingBeyond

ICLR 2024

Steve-Eye: Equipping LLM-based Embodied Agents with Visual Perception in Open Worlds, Sipeng Zheng, Jiazheng Liu, Yicheng Feng, Zongqing Lu.

ICLR24 (Spotlight 5.02%)

Code | Project

ECCV 2022
CVPR 2022

VRDFormer: End-to-end video visual relation detection with transformer, Sipeng Zheng, Shizhe Chen, Qin Jin.

CVPR22 (Oral 4.14%)

Code

📚 Paper List

🎖 Honors and Awards

  • 2025 Ranked 1st in GemBench Challenge at CVPR 2025 Workshop GRAIL.
  • 2022 Ranked 3rd in CVPR 2022 Ego4D Natural Language Query Challenge.
  • 2021 Ranked 3rd in NIST TRECVID 2021 Ad-hoc Video Search (AVS) Challenge.
  • 2021 Ranked 2nd in CVPR 2021 HOMAGE Scene-graph Generation Challenge.
  • 2020 Ranked 2nd in ACM MM 2020 Video Relationship Understanding Grand Challenge.
  • 2019 Ranked 2nd in ACM MM 2019 Video Relationship Understanding Grand Challenge.
  • 2022 National Scholarship for Ph.D. Students.
  • 2019 Best Method Prize in ACM MM 2019 Grand Challenge.
  • 2019 First Class Scholarship for Ph.D. Students from 2018 to 2021.
  • 2015 First Prize in National University Mathematical Modeling Competition of Beijing Area.

📖 Education

  • 2018.09 - 2023.06, PhD, Computer Science and Engineering, Renmin University of China, China.
  • 2014.09 - 2018.06, Undergraduate, Computer Science and Engineering, Renmin University of China, China.

💻 Work Experience

  • 2025.05 - now, Research Scientist; BeingBeyond, Beijing, China.
  • 2023.07 - 2025.05, Researcher; Beijing Academy of Artificial Intelligence, Beijing, China.
  • 2022.04 - 2022.10, Research Intern; Microsoft Research Asia, Beijing, China.
  • 2021.11 - 2022.04, Research Intern; Beijing Academy of Artificial Intelligence, Beijing, China.

🔧 Services

  • Conference Reviewer for CVPR, ICCV, ECCV, ACCV, NeurIPS, AAAI, ACM MM.
  • Journal Reviewer for IJCV, TCSVT, TMM, JATS.