About Me
I am SHEN Haiyang, a Ph.D. candidate at the School of Computer Science, Peking University. I am advised by Assistant Professor Yun Ma.
My research trajectory started from software systems, then moved toward the intersection of software and AI, and has gradually evolved into my current focus on LLM-based agents. My long-term vision is to better integrate AI with existing software systems, enabling AI to use tools, interact with applications, and improve real-world software workflows more reliably.
Specifically, I am interested in:
- Search & Web Agents: building intelligent agents for deep information seeking and web interaction.
- Coding Agents: developing benchmarks and systems for automated software engineering.
- Tool-Augmented LLMs: integrating large language models with APIs and external tools.
- Financial Agents: exploring how agents can trade stocks and generate returns in financial markets.
- LLM Inference on Edge Devices: efficient LLM deployment on web and mobile platforms.
My research group is affiliated with the Data Space Technology and Systems Research Center, led by Academician Hong Mei and Professor Gang Huang, with faculty members including Xuanzhe Liu, Xin Jin, and Yun Ma. The center is a leading research group in China for machine learning systems, software engineering, and systems research.
Experience
- 2025.12 – now Intern, UniPat AI. Research on code data synthesis.
- 2025.06 – 2025.11 Intern, Tongyi DeepResearch Group, Alibaba. Research on data synthesis for DeepResearch, including WebShaper and WebLeaper. DeepResearch GitHub.
- 2022 – now Outstanding Research Award, Ubiquant Scholarship, Peking University.
- 2022.09 – now Ph.D. candidate in Computer Science and Technology, School of Computer Science, Peking University.
- 2024.10 – 2025.03 Intern, Miracleplus & Shanghai AI Lab. Research on RAG data synthesis.
- 2022.08 – 2022.10 Intern, Alibaba Innovative Research (Technical & Quality, Fliggy), Alibaba. Research on anomaly detection using page access logs. Resigned early to work remotely due to the pandemic.
- 2019 – 2021 National Scholarship, First Class Scholarship, and Wu Yajun Scholarship, Northwestern Polytechnical University.
- 2018.09 – 2022.07 B.Sc. in Computer Science and Technology, School of Computer Science, Northwestern Polytechnical University.
Publications
* Co-first author or project leader. ✉ Corresponding author.
- Haiyang Shen. EvoCodeBench: Evaluating Coding Agents in Multi-Turn Iterative Interactions. 2026. NeurIPS Submission.
- Haiyang Shen. MindLoom: Composing Thought Modes for Frontier-Level Reasoning Data Synthesis. 2026. NeurIPS Submission.
- Haiyang Shen. QuestBench: A Course-Curated Benchmark for Expert-Level Cross-Domain Deep Search in Language Models. 2026. NeurIPS Submission.
- Haiyang Shen. Genesis: Coding Task Synthesis via Iterative Multi-Agent Coordination. 2026. NeurIPS Submission.
- Siqi Zhong, Mugeng Liu, Haiyang Shen, Chongyang Pan, Yun Ma✉. LaTune: Lightweight and Adaptive Configuration Tuning for LLM Inference on Edge Devices. Proceedings of the ACM Web Conference 2026, 5404–5414. 2026. Top Conference on Web.
- Wenchun Jing, Haiyang Shen, Haoran Wang, Qi Liu, Ningyuan Li, Chaoran Luo, Ning Zhang, Yun Ma. MCP-Focus: Leveraging Function-Oriented Document Enhancement for MCP Server Retrieval. The ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 2026.
- Haiyang Shen*, Xinbo Xu*, Xuanzhong Chen, Wendong Xu✉, Elvis Zhang, Kaiyuan Chen, Xiaobo Hu, Rui Wang, Yang Liu, Yixin Ren, Yuan Gong, Liang Chen, Kuan Li✉. Monthly-SWEBench: A Living, Rigorously Verified Benchmark for Real-World Software Engineering. 2026. Benchmark.
- Haiyang Shen*, Hang Yan*, Zhongshi Xing, Mugeng Liu, Yue Li, Zhiyang Chen, Yuxiang Wang, Jiuzheng Wang, Yun Ma✉. DRAGON: Domain-specific Robust Automatic Data Generation for RAG Optimization. Findings of the Association for Computational Linguistics: EACL 2026. 2026. Top Conference on NLP.
- Taian Guo*, Haiyang Shen*, Junyu Luo, Zhongshi Xing, Hanchun Lian, Jinsheng Huang, Binqi Chen, Luchen Liu, Yun Ma✉, Ming Zhang✉. MEME: Modeling the Evolutionary Modes of Financial Markets. arXiv preprint arXiv:2602.11918. 2026.
- Taian Guo*, Haiyang Shen*, Junyu Luo, Binqi Chen, Hao Ding, Jinsheng Huang, Luchen Liu, Yun Ma✉, Ming Zhang✉. AlphaPROBE: Alpha Mining via Principled Retrieval and On-graph Biased Evolution. arXiv preprint arXiv:2602.11917. 2026.
- Sixiong Xie, Zhuofan Shi, Haiyang Shen, Gang Huang, Yun Ma, Xiang Jing✉. M3-BENCH: Process-Aware Evaluation of LLM Agents’ Social Behaviors in Mixed-Motive Games. arXiv preprint arXiv:2601.08462. 2026.
- Liang Chen*✉, Weichu Xie*, Yiyan Liang*, Hongfeng He*, Hans Zhao*, …, Haiyang Shen, Yixin Ren, Yang Liu, Yuan Gong, Kuan Li✉. BabyVision: Visual Reasoning Beyond Language. arXiv preprint arXiv:2601.06521. 2026.
- Zhuofan Shi, Yufei Shao, Minxuan Dai, Yuchao Yu, Pengfei Xiang, Dongliang Huang, Hongxu An, Chunxiao Xin, Haiyang Shen, et al. MDAgent2: Large Language Model for Code Generation and Knowledge Q&A in Molecular Dynamics. arXiv preprint arXiv:2601.02075. 2026.
- Guoqing Wang, Zeyu Sun, Yizhou Chen, Yifan Zhao, Haiyang Shen, Qingyuan Liang, Dan Hao✉. Beyond the Sum of Parts: Leveraging Entanglement for Bug Inducing Commit Localization. IEEE Transactions on Software Engineering. 2025. Top Journal in Software Engineering.
- Zhengwei Tao*, Haiyang Shen*, Baixuan Li*, Wenbiao Yin, Jialong Wu, Kuan Li, Zhongwang Zhang, Huifeng Yin, Rui Ye, Yong Jiang, Pengjun Xie, Fei Huang, Jingren Zhou, Wentao Zhang✉, Yun Ma✉, Zhiqiang Gao✉. Empowering Efficiency and Efficacy in WebAgent via Enabling Info-Rich Seeking. The Fourteenth International Conference on Learning Representations (ICLR). 2026. Top Conference on Machine Learning.
- Tongyi DeepResearch Team, Baixuan Li, Bo Zhang, Dingchu Zhang, …, Haiyang Shen, Xinyu Geng, Yuning Wu, Zijian Li, Yong Jiang✉. Tongyi DeepResearch Technical Report. arXiv preprint arXiv:2510.24701. 2025.
- Baixuan Li*✉, Dingchu Zhang*, Jialong Wu*, Wenbiao Yin✉, Zhengwei Tao, Yida Zhao, Liwen Zhang, Haiyang Shen, Runnan Fang, Pengjun Xie, Jingren Zhou, Yong Jiang✉. ParallelMuse: Agentic Parallel Thinking for Deep Information Seeking. arXiv preprint arXiv:2510.24698. 2025.
- Zhiyang Chen, Dongqi Xu, Haiyang Shen, Chao Lou, Mengwei Xu, Sheng Wang, Xin Jin, Yun Ma. Accelerating Mobile Language Model via Speculative Decoding and NPU-Coordinated Execution. arXiv preprint arXiv:2510.15312. 2025.
- Zijian Shao*, Haiyang Shen*, Mugeng Liu, Guangyu Fu, Yaoqi Guo, Yuxiang Wang, Yun Ma. Rethinking Explainable Disease Prediction: Synergizing Accuracy and Reliability via Reflective Cognitive Architecture. arXiv preprint arXiv:2509.21266. 2025.
- Zhengwei Tao*, Jialong Wu*, Wenbiao Yin✉, Junkai Zhang, Baixuan Li, Haiyang Shen, Kuan Li, Liwen Zhang, Xinyu Wang, Yong Jiang✉, Pengjun Xie, Fei Huang, Jingren Zhou. WebShaper: Agentically Data Synthesizing via Information-Seeking Formalization. arXiv preprint arXiv:2507.15061. 2025.
- Mugeng Liu, Haiyang Shen, Yixuan Zhang, Hong Mei, Yun Ma✉. WebAssembly for Container Runtime: Are We There Yet? ACM Transactions on Software Engineering and Methodology 34 (6), 1–22. 2025. Top Journal in Software Engineering.
- Taian Guo*, Haiyang Shen*, Jinsheng Huang, Zhengyang Mao, Junyu Luo, Binqi Chen, Zhuoru Chen, Luchen Liu, Bingyu Xia, Yun Ma✉, Ming Zhang✉. MASS: Multi-Agent Simulation Scaling for Portfolio Construction. arXiv preprint arXiv:2505.10278. 2025.
- Haiyang Shen, Yue Li, Zhiyang Chen, Yun Ma. EasIPA: Enhancing LLM’s Ability to Select APIs for IPA. International Conference on Service Science, 34–48. 2025.
- Zhiyang Chen, Yun Ma✉, Haiyang Shen, Mugeng Liu. WeInfer: Unleashing the Power of WebGPU on LLM Inference in Web Browsers. Proceedings of the ACM on Web Conference 2025, 4264–4273. 2025. Top Conference on Web.
- Qi Yang, Weichen Bi, Haiyang Shen, Yaoqi Guo, Yun Ma✉. PixelWeb: The First Web GUI Dataset with Pixel-Wise Labels. arXiv preprint arXiv:2504.16419. 2025.
- Haiyang Shen, Yun Ma✉. Characterizing the Developer Groups for Metaverse Services in Roblox. 2024 IEEE International Conference on Software Services Engineering (SSE). 2024.
- Haiyang Shen, Yue Li, Desong Meng, Dongqi Cai, Sheng Qi, Li Zhang, Mengwei Xu, Yun Ma✉. ShortcutsBench: A Large-Scale Real-world Benchmark for API-based Agents. arXiv preprint arXiv:2407.00132. 2024.
- Haiyang Shen, Yun Ma✉, Yue Li, Xiaoling Wang, Deyu Tian, Tong Jia, Tengfei He, Shenghua Luo. ADPal: Automatic Detection of Troubled Users in Online Service Systems via Page Access Logs. 2023 IEEE International Conference on Web Services (ICWS), 638–646. 2023. Top Conference on Service Computing.
- Deyu Tian, Haiyang Shen, Yun Ma✉. Parallelizing DNN Inference in Mobile Web Browsers on Heterogeneous Hardware. Proceedings of the 20th Annual International Conference on Mobile Systems, Applications and Services (MobiSys). 2022. Top Conference on Mobile Computing.
Correspondence
- Email: hyshen@stu.pku.edu.cn
- GitHub: https://github.com/eachsheep
- Google Scholar: https://scholar.google.com/citations?user=BI-Mb_EAAAAJ
- Homepage: https://eachsheep.space