Welcome to My Homepage!

I’m a general partner of BeingBeyond, a startup dedicated to advancing foundation models for general-purpose humanoid robots, where I collaborate closely with Prof. Zongqing Lu. Prior to this, I was a researcher at the Beijing Academy of Artificial Intelligence (BAAI). I obtained my PhD and bachelor’s degrees from Renmin University of China (RUC) under the guidance of Prof. Qin Jin. My research primarily focuses on human behavior understanding, vision-and-language learning, and the development of open-world embodied agents. Currently, I am working towards building intelligent humanoid robots. For more details, please refer to my CV.

Join Us!

We are actively recruiting full-time researchers and interns to join our team. If you’re passionate about embodied AI, feel free to reach out.

Research Interests

  • Large language models and large multimodal models
  • Open-world embodied agent learning
  • Human behavior and motion understanding
  • Robot learning

🔥 News

  • 2025.06: 🎉 We won 1st place in the GemBench Challenge at the CVPR 2025 GRAIL workshop.
  • 2025.05: 🎉 We present Being-M0, our first million-level motion model, which has been accepted by ICML 2025.
  • 2024.10: 🎉 We present Being-VL-0, which has been accepted by ICLR 2025.

📝 Publications

* denotes equal contribution

🤖 BeingBeyond

ICML 2025

Being-M0: Scaling Large Motion Models with Million-Level Human Motions
Ye Wang*, Sipeng Zheng*, Bin Cao, Qianshan Wei, Weishuai Zeng, Qin Jin, Zongqing Lu

Project

  • Being-M0 is the first large motion generation model scaling to 1 million motion sequences.

ICLR 2025

Being-VL-0: From Pixels to Tokens: Byte-Pair Encoding on Quantized Visual Modalities
Wanpeng Zhang, Zilong Xie, Yicheng Feng, Yijiang Li, Xingrun Xing, Sipeng Zheng, Zongqing Lu

Project

  • Being-VL-0 is the first large multimodal model based on compressed discrete visual representations obtained with 2D-BPE (see the illustrative sketch below).
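
A rough, hypothetical sketch of the underlying idea, assuming standard 1D byte-pair encoding over a flattened sequence of VQ codebook indices; this is not the Being-VL-0 implementation (its 2D-BPE operates on token grids), and all names and values below are made up for illustration.

```python
# Toy BPE merging over quantized visual token IDs (illustration only, not Being-VL-0 code).
# Frequent adjacent pairs of codebook indices are merged into new vocabulary entries,
# compressing the token sequence before it is fed to a multimodal language model.
from collections import Counter


def most_frequent_pair(seq):
    """Return the most frequent adjacent (left, right) pair of token IDs, or None."""
    pairs = Counter(zip(seq, seq[1:]))
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(seq, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` in `seq` with `new_id`."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


# Hypothetical flattened grid of quantized visual tokens (original codebook size 4).
tokens = [3, 1, 3, 1, 2, 3, 1, 0]
next_id = 4  # merged tokens receive IDs beyond the original codebook
for _ in range(2):  # two merge steps, purely for illustration
    pair = most_frequent_pair(tokens)
    if pair is None:
        break
    print(f"merging {pair} -> {next_id}")
    tokens = merge_pair(tokens, pair, next_id)
    next_id += 1
print("compressed sequence:", tokens)
```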

🎙 Large Multimodal Model

ICLR 2024

Steve-Eye: Equipping LLM-based Embodied Agents with Visual Perception in Open Worlds (Oral)
Sipeng Zheng, Jiazheng Liu, Yicheng Feng, Zongqing Lu

Code | Project

📚 Paper List

🎖 Honors and Awards

  • 2025 Ranked 1st in GemBench Challenge at CVPR 2025 Workshop GRAIL.
  • 2022 Ranked 3rd in CVPR 2022 Ego4D Natural Language Query Challenge.
  • 2021 Ranked 3rd in NIST TRECVID 2021 Ad-hoc Video Search (AVS) Challenge.
  • 2021 Ranked 2nd in CVPR 2021 HOMAGE Scene-graph Generation Challenge.
  • 2020 Ranked 2nd in ACM MM 2020 Video Relationship Understanding Grand Challenge.
  • 2019 Ranked 2nd in ACM MM 2019 Video Relationship Understanding Grand Challenge.
  • 2022 National Scholarship for Ph.D. Students.
  • 2019 Best Method Prize in ACM MM 2019 Grand Challenge.
  • 2019 First-Class Scholarship for Ph.D. Students from 2018 to 2021.
  • 2015 First Prize in the National University Mathematical Modeling Competition (Beijing Area).

📖 Educations

  • 2018.09 - 2023.06, PhD, Computer Science and Engineering, Renmin University of China, China.
  • 2014.09 - 2018.06, Bachelor’s, Computer Science and Engineering, Renmin University of China, China.

💻 Work Experience

  • 2025.05 - now, Research Scientist; BeingBeyond, Beijing, China.
  • 2023.07 - 2025.05, Researcher; Beijing Academy of Artificial Intelligence, Beijing, China.
  • 2022.04 - 2022.10, Research Intern; Microsoft Research Asia, Beijing, China.
  • 2021.11 - 2022.04, Research Intern; Beijing Academy of Artificial Intelligence, Beijing, China.

🔧 Services

  • Conference Reviewer for CVPR, ICCV, ECCV, ACCV, NeurIPS, AAAI, ACM MM.
  • Journal Reviewer for IJCV, TCSVT, TMM, JATS.