publications

Publications by category in reverse chronological order. Generated by jekyll-scholar.

2025

  1. ICML
    The Sample Complexity of Online Strategic Decision Making with Information Asymmetry and Knowledge Transportability
    Jiachen Hu, Rui Ai, Han Zhong, and 4 more authors
    In Proceedings of the 42nd International Conference on Machine Learning, 2025
  2. Preprint
    New Sphere Packings from the Antipode Construction
    Ruitao Chen, Jiachen Hu, Binghui Li, and 2 more authors
    In arXiv preprint, 2025

    We construct non-lattice sphere packings in dimensions 19, 20, 21, 23, 44, 45, and 47, demonstrating record densities that surpass all previously documented results in these dimensions. The construction applies the antipode method to suboptimal cross-sections of \(\Lambda_{24}\) and \(P_{48p}\).

2024

  1. Preprint
    On Limitation of Transformer for Learning HMMs
    Jiachen Hu, Qinghua Liu, and Chi Jin
    In arXiv preprint, 2024
  2. ICML
    Provably Efficient Exploration in Quantum Reinforcement Learning with Logarithmic Worst-Case Regret
    Han Zhong*, Jiachen Hu*, Yecheng Xue, and 2 more authors
    In Proceedings of the 41st International Conference on Machine Learning, 2024
  3. TQC
    Quantum Non-Identical Mean Estimation: Efficient Algorithms and Fundamental Limits
    Jiachen Hu, Tongyang Li, Xinzhao Wang, and 3 more authors
    In 19th Conference on the Theory of Quantum Computation, Communication and Cryptography (TQC 2024), 2024
  4. FC
    ZeroSwap: Data-Driven Optimal Market Making in Decentralized Finance
    Viraj Nadkarni, Jiachen Hu, Ranvir Rana, and 3 more authors
    In Financial Cryptography and Data Security, 2024

2023

  1. ICLR
    Provable Sim-to-real Transfer in Continuous Domain with Partial Observations
    Jiachen Hu*, Han Zhong*, Chi Jin, and 1 more author
    In International Conference on Learning Representations, 2023

    We study sim-to-real transfer in continuous domains with partial observations, modeled by linear quadratic Gaussian (LQG) systems. We show that a popular robust adversarial training algorithm can learn a policy from simulation that is competitive to the optimal real-world policy, providing the first provable guarantee in this setting.

2022

  1. ICLR
    Understanding Domain Randomization for Sim-to-real Transfer
    Xiaoyu Chen*, Jiachen Hu*, Chi Jin, and 2 more authors
    In International Conference on Learning Representations (Spotlight, top 6%), 2022

    We provide a theoretical framework for domain randomization, modeling the simulator as a set of MDPs with tunable parameters. We prove sharp bounds on the sim-to-real gap and show that successful transfer is achievable without any real-world training samples, highlighting the importance of history-dependent policies.

  2. ICLR
    Near-Optimal Reward-Free Exploration for Linear Mixture MDPs with Plug-in Solver
    Xiaoyu Chen, Jiachen Hu, Lin F. Yang, and 1 more author
    In International Conference on Learning Representations (Spotlight, top 6%), 2022

2021

  1. ICML
    Near-Optimal Representation Learning for Linear Bandits and Linear RL
    Jiachen Hu*, Xiaoyu Chen*, Chi Jin, and 2 more authors
    In Proceedings of the 38th International Conference on Machine Learning, 2021

    We study multi-task representation learning for linear bandits and episodic RL with linear value function approximation. Our algorithm MTLR-OFUL achieves \(\tilde{O}(M\sqrt{dkT} + d\sqrt{kMT})\) regret, significantly improving over the \(\tilde{O}(Md\sqrt{T})\) baseline, yielding the first theoretical characterization of multi-task representation learning benefits in RL exploration.

  2. ICLR
    Efficient Reinforcement Learning in Factored MDPs with Application to Constrained RL
    Xiaoyu Chen, Jiachen Hu, Lihong Li, and 1 more author
    In International Conference on Learning Representations, 2021

2020

  1. ICLR
    Distributed Bandit Learning: Near-Optimal Regret with Efficient Communication
    Yuanhao Wang*, Jiachen Hu*, Xiaoyu Chen, and 1 more author
    In International Conference on Learning Representations, 2020

    We design communication protocols for distributed bandit learning with M agents under central coordination. For multi-armed bandits, we achieve near-optimal regret with only \(O(M\log(MK))\) communication cost — independent of the time horizon T and matching the lower bound up to a log factor.