Jiachen Hu
Algorithm Developer at ByteDance. PhD from Peking University.
My name is Jiachen Hu (胡家琛), and I am currently working as an algorithm developer at ByteDance. I received my PhD from Peking University in 2025, where I was fortunate to be advised by Professor Liwei Wang, and I spent wonderful times over the past years working remotely with Chi Jin and Lihong Li. Before my PhD, I obtained my B.S. from the Turing Class at Peking University.
I have broad interests in sample-efficient reinforcement learning and online learning, especially application-driven problems. In the past few years, my research has focused on statistically efficient bandits (e.g., multi-armed bandits, linear bandits), online exploration in structured MDPs/POMDPs, and understanding sim-to-real transfer. Please feel free to contact me if you are interested in my research or would like to have a chat!
Contact: nickh at pku.edu.cn
news
| Jul 2025 | Joining ByteDance as an algorithm developer! |
|---|---|
| May 2025 | One paper accepted at ICML 2025! |
| May 2024 | One paper accepted at ICML 2024! |
| May 2024 | One paper accepted at TQC 2024! |
| Jun 2023 | Will visit Princeton University for the next 6 months! |
| Jan 2023 | One paper accepted at ICLR 2023! |
selected publications
- Preprint: New Sphere Packings from the Antipode Construction. arXiv preprint, 2025
We construct non-lattice sphere packings in dimensions 19, 20, 21, 23, 44, 45, and 47, demonstrating record densities that surpass all previously documented results in these dimensions. The construction applies the antipode method to suboptimal cross-sections of \(\Lambda_{24}\) and \(P_{48p}\).
- ICLR: Provable Sim-to-real Transfer in Continuous Domain with Partial Observations. In International Conference on Learning Representations, 2023
We study sim-to-real transfer in continuous domains with partial observations, modeled by linear quadratic Gaussian (LQG) systems. We show that a popular robust adversarial training algorithm can learn a policy from simulation that is competitive to the optimal real-world policy, providing the first provable guarantee in this setting.
- ICLR: Understanding Domain Randomization for Sim-to-real Transfer. In International Conference on Learning Representations (Spotlight, top 6%), 2022
We provide a theoretical framework for domain randomization, modeling the simulator as a set of MDPs with tunable parameters. We prove sharp bounds on the sim-to-real gap and show that successful transfer is achievable without any real-world training samples, highlighting the importance of history-dependent policies.
- ICML: Near-Optimal Representation Learning for Linear Bandits and Linear RL. In Proceedings of the 38th International Conference on Machine Learning, 2021
We study multi-task representation learning for linear bandits and episodic RL with linear value function approximation. Our algorithm MTLR-OFUL achieves \(\tilde{O}(M\sqrt{dkT} + d\sqrt{kMT})\) regret, significantly improving over the \(\tilde{O}(Md\sqrt{T})\) baseline, yielding the first theoretical characterization of multi-task representation learning benefits in RL exploration.
- ICLR: Distributed Bandit Learning: Near-Optimal Regret with Efficient Communication. In International Conference on Learning Representations, 2020
We design communication protocols for distributed bandit learning with M agents under central coordination. For multi-armed bandits, we achieve near-optimal regret with only \(O(M\log(MK))\) communication cost, independent of the time horizon T and matching the lower bound up to a log factor.