Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.
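For reference, a minimal sketch of what that setting looks like in a standard Jekyll _config.yml (the option name and behavior are standard Jekyll; the surrounding file contents are your own and are not shown here):

    # _config.yml
    # When false, Jekyll skips posts whose date is in the future
    future: false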

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Honors and Awards

Publications

Evaluating the Smooth Control of Attribute Intensity in Text Generation with LLMs

Published in Findings of the Association for Computational Linguistics (ACL), 2024

We developed a novel evaluation framework to assess the fine-grained control of attributes in text generated by LLMs. Using GPT-4 as a judge and an Elo rating system, our work quantifies control calibration and consistency across five attributes for various prompting and representation editing (RepE) methods.

Recommended citation: Shang Zhou*, Feng Yao*, Chengyu Dong, Zihan Wang, Jingbo Shang. (2024). "Evaluating the Smooth Control of Attribute Intensity in Text Generation with LLMs." In Findings of the Association for Computational Linguistics: ACL 2024.
Download Paper

Scaling LLM Inference with Optimized Sample Compute Allocation

Published in Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2025

This paper introduces OSCA, a pioneering algorithm that formulates sample distribution for LLM inference as a learning problem. By using a mixed-allocation strategy, OSCA achieves superior accuracy with significantly less compute—128x less on code generation and 25x less on reasoning tasks.

Recommended citation: Kexun Zhang*, Shang Zhou*, Danqing Wang, William Yang Wang, Lei Li. (2025). "Scaling LLM Inference with Optimized Sample Compute Allocation." In Proceedings of the 2025 Annual Conference of the North American Chapter of the Association for Computational Linguistics.
Download Paper

LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?

Under review at Conference on Neural Information Processing Systems (NeurIPS), 2025

We introduce LiveCodeBench Pro, a new benchmark for evaluating Large Language Models on competitive programming problems, judged by human Olympiad medalists. Our work, co-led by a team of 20 researchers, quantifies a major performance gap: LLMs score nearly 0% on hard problems and consistently fail on tasks requiring deep observation and reasoning.

Recommended citation: Zihan Zheng*, Zerui Cheng*, Zeyu Shen*, Shang Zhou*, Kaiyuan Liu*, Hansen He*, et al. (2025). "LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?" arXiv preprint arXiv:2506.11928.
Download Paper