Google at ICLR 2026

Machine Learning


Algorithmic Guarantees for Distilling Supervised and Offline RL Datasets
Aaryan Gupta, Rishi Saket, Aravindan Raghuveer

An Evolutionary Perspective on Modes of Learning in Transformers
Alexander Y. Ku, Thomas L. Griffiths, Stephanie C.Y. Chan

An Improved Model-free Decision-estimation Coefficient with Applications in Adversarial MDPs
Haolin Liu, Chen-Yu Wei, Julian Zimmert

ATLAS: Adaptive Transfer Scaling Laws for Multilingual Pretraining, Finetuning, and Decoding the Curse of Multilinguality
Shayne Longpre, Sneha Kudugunta, Niklas Muennighoff, I-Hung Hsu, Isaac Caswell, Alex Pentland, Sercan Ö. Arık, Chen-Yu Lee, Sayna Ebrahimi

ATLAS: Constraints-Aware Multi-Agent Collaboration for Real-World Travel Planning
Jihye Choi*, Jinsung Yoon, Jiefeng Chen, Somesh Jha, Tomas Pfister

Back to Square Roots: An Optimal Bound on the Matrix Factorization Error for Multi-Epoch Differentially Private SGD
Nikita P. Kalinin, Ryan McKenna, Jalaj Upadhyay, Christoph H. Lampert

Benchmarking Open-ended Segmentation
Cristina González, Santiago Rodríguez, Kevis-Kokitsi Maninis, Jordi Pont-Tuset, Pablo Arbeláez

Beyond Accuracy: Are Time Series Foundation Models Well-Calibrated?
Coen Adler, Yuxin Chang, Felix Draxler, Samar Abdi, Padhraic Smyth

Beyond Markovian: Reflective Exploration via Bayes-Adaptive RL for LLM Reasoning
Shenao Zhang*, Yaqing Wang, Yinxiao Liu, Tianqi Liu, Peter Grabowski, Eugene Ie, Zhaoran Wang, Yunxuan Li

Black-Box Privacy Attacks on Shared Representations in Multitask Learning
John Abascal, Alina Oprea, Jonathan Ullman, Nicolás Berrios, Adam Smith, Matthew Jagielski

Bridging Kolmogorov Complexity and Deep Learning: Asymptotically Optimal Description Length Objectives for Transformers
Peter Shaw, James Cohan, Jacob Eisenstein, Kristina Toutanova

Cautious Weight Decay
Lizhang Chen, Jonathan Li, Kaizhao Liang, Baiyu Su, Cong Xie, Nuo Wang Piers, Chen Liang, Ni Lao, Qiang Liu

CMT-Benchmark: A Benchmark for Condensed Matter Theory Built by Expert Researchers
Haining Pan, James V. Roggeveen, Erez Berg, Juan Carrasquilla, Debanjan Chowdhury, Surya Ganguli, Federico Ghimenti, Juraj Hasik, Henry Hunt, Hong-Chen Jiang, Mason Kamb, Ying-Jer Kao, Ehsan Khatami, Michael J. Lawler, Di Luo, Titus Neupert, Xiaoliang Qi, Michael P. Brenner, Eun-Ah Kim

CoDA: Agentic Systems for Collaborative Data Visualization
Zichen Chen*, Jiefeng Chen, Sercan Ö. Arık, Misha Sra, Tomas Pfister, Jinsung Yoon

Code World Models for General Game Playing
Wolfgang Lehrach, Daniel Hennes, Miguel Lázaro-Gredilla, Xinghua Lou, Carter Wendelken, Zun Li, Antoine Dedieu, Marc Lanctot, Atil Iscen, John Schultz, Marcus Chiam, Ian Gemp, Piotr Zielinski, Satinder Singh, Kevin P. Murphy

Cognitive Models Can Reveal Interpretable Value Trade-offs in Language Models
Sonia K. Murthy, Rosie Zhao, Jennifer Hu, Sham Kakade, Markus Wulfmeier, Peng Qian, Tomer Ullman

Continuous Chain of Thought Enables Parallel Exploration and Reasoning
Halil Alperen Gozeten, M. Emrullah Ildiz, Xuechen Zhang, Hrayr Harutyunyan, Ankit Singh Rawat, Samet Oymak

Converge Faster, Talk Less: Hessian-Informed Federated Zeroth-Order Optimization
Zhe Li, Bicheng Ying, Zidong Liu, Chaosheng Dong, Haibo Yang

DevOps-Gym: Benchmarking AI Agents in Software DevOps Cycle
Yuheng Tang, Kaijie Zhu, Bonan Ruan, Chuqi Zhang, Michael Yang, Hongwei Li, Suyue Guo, Tianneng Shi, Zekun Li, Christopher Kruegel, Giovanni Vigna, Dawn Song, William Yang Wang, Lun Wang, Yangruibo Ding, Zhenkai Liang, Wenbo Guo

Diagnosing Generalization Failures from Representational Geometry Markers
Chi-Ning Chou, Artem Kirsanov, Yao-Yuan Yang, SueYeon Chung

Difference-Aware Retrieval Policies for Imitation Learning
Quinn Pfeifer, Ethan Pronovost, Paarth Shah, Khimya Khetarpal, Siddhartha Srinivasa, Abhishek Gupta

Distributed Algorithms for Euclidean Clustering
Vincent Cohen-Addad, Liudeng Wang, David P. Woodruff, Samson Zhou

Do 3D Large Language Models Really Understand 3D Spatial Relationships?
Xianzheng Ma, Tao Sun, Shuai Chen, Yash Bhalgat, Jindong Gu, Angel X Chang, Iro Armeni, Iro Laina, Songyou Peng, Victor Adrian Prisacariu

Dynamic Classifier-Free Diffusion Guidance via Online Feedback
Pinelopi Papalampidi, Olivia Wiles, Ira Ktena*, Aleksandar Shtedritski*, Emanuele Bugliarello, Ivana Kajić, Isabela Albuquerque, Aida Nematzadeh

Dynamic Reflections: Probing Video Representations with Text Alignment
Tyler Zhu*, Tengda Han, Leonidas Guibas, Viorica Pătrăucean, Maks Ovsjanikov

Dynamic Speculative Agent Planning
Yilin Guan, Qingfeng Lan, Fei Sun, Dujian Ding, Devang Acharya, Chi Wang, William Yang Wang, Wenyue Hua

Early Signs of Steganographic Capabilities in Frontier LLMs
Artur Zolkowski, Kei Nishimura-Gasparian, Robert McCarthy, Roland S. Zimmermann, David Lindner

Efficient Approximate Posterior Sampling with Annealed Langevin Monte Carlo
Advait Parulekar, Litu Rout, Karthikeyan Shanmugam, Sanjay Shakkottai

Fine-Tuning Diffusion Models via Intermediate Distribution Shaping
Gautham Govind Anil, Shaan Ul Haque*, Nithish Kannen, Dheeraj Nagaraj, Sanjay Shakkottai, Karthikeyan Shanmugam

Frequency-Domain Better than Time-Domain for Causal Structure Recovery in Dynamical Systems on Networks
Mohammed Tuhin Rana, Mishfad Shaikh Veedu, James Melbourne, Murti V. Salapaka

FSPO: Few-Shot Optimization of Synthetic Preferences Effectively Personalizes to Real Users
Anikait Singh, Sheryl Hsu, Kyle Hsu, Eric Mitchell, Stefano Ermon, Tatsunori Hashimoto, Archit Sharma, Chelsea Finn

FutureFill: Fast Generation from Convolutional Sequence Models
Naman Agarwal, Xinyi Chen, Evan Dogariu, Devan Shah, Hubert Strauss, Vlad Feinberg, Daniel Suo, Peter Bartlett, Elad Hazan

Generalization in LLM Problem Solving: The Case of the Shortest Path
Yao Tong, Jiayuan Ye, Anastasia Borovykh, Reza Shokri

Graph Random Features for Scalable Gaussian Processes
Matthew Zhang, Jihao Andreas Lin, Krzysztof Choromanski, Adrian Weller, Richard E. Turner, Isaac Reid

GuidedSampling: Steering LLMs Towards Diverse Candidate Solutions at Inference-Time
Divij Handa, Mihir Parmar, Aswin RRV, Md Nayem Uddin, Hamid Palangi, Chitta Baral

Hidden Breakthroughs in Language Model Training
Sara Kangaslahti, Elan Rosenfeld, Naomi Saphra

Hot PATE: Private Aggregation of Distributions for Diverse Tasks
Edith Cohen, Benjamin Cohen-Wang, Xin Lyu, Jelani Nelson, Tamás Sarlós, Uri Stemmer

How to train data-efficient LLMs
Noveen Sachdeva, Benjamin Coleman, Wang-Cheng Kang, Jianmo Ni, Lichan Hong, Ed H. Chi, Derek Z. Cheng, James Caverlee, Julian McAuley

HYPER: A Foundation Model for Inductive Link Prediction with Knowledge Hypergraphs
Xingyue Huang, Mikhail Galkin, Michael M. Bronstein, İsmail İlkan Ceylan

Improving Online-to-Nonconvex Conversion for Smooth Optimization via Double Optimism
Francisco Patitucci, Ruichen Jiang, Aryan Mokhtari

Incentive-Aligned Multi-Source LLM Summaries
Yanchen Jiang*, Zhe Feng, Aranyak Mehta

Incentivizing Agentic Reasoning in LLM Judges via Tool-Integrated Reinforcement Learning
Ran Xu*, Jingjing Chen, Jiayu Ye, Yu Wu, Jun Yan, Carl Yang, Hongkun Yu

Information-Theoretic Membership Inference for Granular Quantification of Memorization
Jiashu Tao, Reza Shokri

Is This Just Fantasy? Language Model Representations Reflect Human Judgments of Event Plausibility
Michael A. Lepori, Jennifer Hu, Ishita Dasgupta, Roma Patel, Thomas Serre, Ellie Pavlick

It’s All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization
Ali Behrouz, Meisam Razaviyayn, Peilin Zhong, Vahab Mirrokni

Language-Instructed Vision Embeddings for Controllable and Generalizable Perception
Chengzhi Mao, Xudong Lin, Wen-Sheng Chu

Latent Concept Disentanglement in Transformer-based Language Models
Guanzhe Hong, Bhavya Vasudeva, Vatsal Sharan, Cyrus Rashtchian, Prabhakar Raghavan, Rina Panigrahy

Latent Stochastic Interpolants
Saurabh Singh*, Dmitry Lagun

Learn to Guide Your Diffusion Model
Alexandre Galashov, Ashwini Pokle, Arnaud Doucet, Arthur Gretton, Mauricio Delbracio, Valentin De Bortoli

LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities
Thomas Schmied*, Jörg Bornschein, Jordi Grau-Moya, Markus Wulfmeier, Razvan Pascanu

ManagerBench: Evaluating the Safety-Pragmatism Trade-off in Autonomous LLMs
Adi Simhi, Jonathan Herzig, Martin Tutek, Itay Itzhak, Idan Szpektor, Yonatan Belinkov

MesaNet: Sequence Modeling by Locally Optimal Test-Time Training
Johannes von Oswald, Nino Scherrer, Seijin Kobayashi, Luca Versari, Songlin Yang, Maximilian Schlegel, Kaitlin Maile, Yanick Schimpf, Oliver Sieberling, Alexander Meulemans, Rif A. Saurous, Guillaume Lajoie, Charlotte Frenkel, Razvan Pascanu, Blaise Agüera y Arcas, João Sacramento

Mitigating Non-IID Drift in Zeroth-Order Federated LLM Fine-Tuning with Transferable Sparsity
Yide Ran, Wentao Guo, Jingwei Sun, Yanzhou Pan, Xiaodong Yu, Hao Wang, Jianwen Xie, Yiran Chen, Denghui Zhang, Zhaozhuo Xu

MoGen: Detailed Neuronal Morphology Generation via Point Cloud Flow Matching (see blog post)
Franz Rieger*, Jan-Matthis Lueckmann, Viren Jain, Michal Januszewski

Multi-Agent Design: Optimizing Agents with Better Prompts and Topologies
Han Zhou*, Xingchen Wan, Ruoxi Sun, Hamid Palangi, Shariq Iqbal, Ivan Vulić, Anna Korhonen, Sercan Ö. Arık

Multi-turn Evaluation of Anthropomorphic Behaviors in Large Language Models
Lujain Ibrahim, Canfer Akbulut, Rasmi Elasmar, Charvi Rastogi, Minsuk Kahng, Meredith Ringel Morris, Kevin R. McKee, Verena Rieser, Murray Shanahan, Laura Weidinger

Multiple-Prediction-Powered Inference
Charlie Cowen-Breen*, Alekh Agarwal, Stephen Bates, William W. Cohen, Jacob Eisenstein, Amir Globerson, Adam Fisch

Nearly Space-Optimal Graph and Hypergraph Sparsification in Insertion-Only Data Streams
Vincent Cohen-Addad, David P. Woodruff, Shenghao Xie, Samson Zhou

Neologism Learning for Controllability and Self-Verbalization
John Hewitt, Oyvind Tafjord, Robert Geirhos, Been Kim

On the Geometry and Topology of Representations: the Manifolds of Modular Addition
Gabriela Moisescu-Pareja, Gavin McCracken, Harley Wiltzer, Vincent Létourneau, Colin Daniels, Doina Precup, Jonathan Love

On the Interpolation Effect of Score Smoothing in Diffusion Models
Zhengdao Chen

On the Theoretical Limitations of Embedding-Based Retrieval
Orion Weller, Michael Boratko, Iftekhar Naim, Jinhyuk Lee

OpenPros: A Large-Scale Dataset for Limited View Prostate Ultrasound Computed Tomography
Hanchen Wang, Yixuan Wu, Yinan Feng, Peng Jin, Luoyuan Zhang, Shihang Feng, James Wiskin, Baris Turkbey, Peter A. Pinto, Bradford J. Wood, Songting Luo, Yinpeng Chen, Emad Boctor, Youzuo Lin

Planned Diffusion
Daniel Israel, Tian Jin, Ellie Cheng, Guy Van den Broeck, Aditya Grover, Suvinay Subramanian, Michael Carbin

Poisson Midpoint Method for Log Concave Sampling: Beyond the Strong Error Lower Bounds
Rishikesh Srinivasan, Dheeraj Nagaraj

PropensityBench: Evaluating Latent Safety Risks in Large Language Models via an Agentic Approach
Udari Madhushani Sehwag, Shayan Shabihi, Alex McAvoy, Vikash Sehwag, Yuancheng Xu, Dalton Towers, Furong Huang

Randomization Boosts KV Caching, Learning Balances Query Load: A Joint Perspective
Fangzhou Wu, Sandeep Silwal, Qiuyi (Richard) Zhang

RAS: Retrieval-And-Structuring for Knowledge-Intensive LLM Generation
Pengcheng Jiang, Lang Cao, Ruike Zhu, Minhao Jiang, Yunyi Zhang, Jiaming Shen, Jimeng Sun, Jiawei Han

ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory (see blog post)
Siru Ouyang*, Jun Yan, I-Hung Hsu, Yanfei Chen, Ke Jiang, Zifeng Wang, Rujun Han, Long T. Le, Samira Daruki, Xiangru Tang, Vishy Tirumalashetty, George Lee, Mahsan Rofouei, Hangfei Lin, Jiawei Han, Chen-Yu Lee, Tomas Pfister

Redirection For Erasing Memory (REM): Towards A Universal Unlearning Method for Corrupted Data
Stefan Schoepf*, Michael C. Mozer, Nicole Mitchell, Alexandra Brintrup, George Kaissis, Peter Kairouz, Eleni Triantafillou

Rethinking Reasoning in Document Ranking: Why Chain-of-Thought Falls Short
Xuan Lu, Haohang Huang, Rui Meng, Yaohui Jin, Wenjun Zeng, Xiaoyu Shen

Robust Reward Modeling via Causal Rubrics
Pragya Srivastava, Harman Singh*, Rahul Madhavan, Gandharv Patil*, Sravanti Addepalli, Arun Suggala, Rengarajan Aravamudhan, Soumya Sharma, Anirban Laha, Aravindan Raghuveer, Karthikeyan Shanmugam, Doina Precup

Robust Training of Neural Networks at Arbitrary Precision and Sparsity
Chengxi Ye, Grace Chu, Yanfeng Liu, Yichi Zhang, Lukasz Lew, Li Zhang, Mark Sandler, Andrew Howard

Self-harmony: Learning to Harmonize Self-supervision and Self-play in Test-time Reinforcement Learning
Ru Wang, Wei Huang, Qi Cao, Yusuke Iwasawa, Yutaka Matsuo, Jiaxian Guo

Self-Predictive Representations for Combinatorial Generalization in Behavioral Cloning
Daniel Lawson, Adriana Hugessen, Charlotte Cloutier, Glen Berseth, Khimya Khetarpal

Self-Speculative Masked Diffusions
Andrew Campbell*, Valentin De Bortoli, Jiaxin Shi*, Arnaud Doucet

SocialJax: An Evaluation Suite for Multi-agent Reinforcement Learning in Sequential Social Dilemmas
Zihao Guo, Shuqing Shi, Richard Willis, Tristan Tomilin, Joel Z. Leibo, Yali Du

Spectral Bellman Method: Unifying Representation and Exploration in RL
Ofir Nabati, Bo Dai, Shie Mannor, Guy Tennenholtz

Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning
Yihe Deng*, I-Hung Hsu, Jun Yan, Zifeng Wang, Rujun Han, Gufeng Zhang, Yanfei Chen, Wei Wang, Tomas Pfister, Chen-Yu Lee

SynthWorlds: Controlled Parallel Worlds for Disentangling Reasoning and Knowledge in Language Models
Ken Gu, Advait Bhat, Mike A. Merrill, Robert West, Xin Liu, Daniel McDuff, Tim Althoff

Taming Imperfect Process Verifiers: A Sampling Perspective on Backtracking
Dhruv Rohatgi, Abhishek Shetty, Donya Saless, Yuchen Li, Ankur Moitra, Andrej Risteski, Dylan J. Foster

Text2Arch: A Dataset for Generating Scientific Architecture Diagrams from Natural Language Descriptions
Shivank Garg, Sankalp Mittal, Manish Gupta

TNT: Improving Chunkwise Training for Test-Time Memorization
Zeman Li*, Ali Behrouz, Yuan Deng, Peilin Zhong, Praneeth Kacham, Mahdi Karami, Meisam Razaviyayn, Vahab Mirrokni

Tools are Under-documented: Simple Document Expansion Boosts Tool Retrieval
Xuan Lu, Haohang Huang, Rui Meng, Yaohui Jin, Wenjun Zeng, Xiaoyu Shen

Trust The Typical
Debargha Ganguly, Sreehari Sankar, Biyao Zhang, Vikash Singh, Kanan Gupta, Harshini Kavuru, Alan Luo, Weicong Chen, Warren Morningstar, Raghu Machiraju, Vipin Chaudhary

TUMIX: Multi-Agent Test-Time Scaling with Tool-Use Mixture
Yongchao Chen*, Jiefeng Chen, Rui Meng, Ji Yin, Na Li, Chuchu Fan, Chi Wang, Tomas Pfister, Jinsung Yoon

TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate (see blog post)
Amir Zandieh, Majid Daliri, Majid Hadian, Vahab Mirrokni

Type-Compliant Adaptation Cascades: Adapting Programmatic LM Workflows to Data
Chu-Cheng Lin, Daiyi Peng, Yifeng Lu, Ming Zhang, Eugene Ie

UFO-4D: Unposed Feedforward 4D Reconstruction from Two Images
Junhwa Hur, Charles Herrmann, Songyou Peng, Philipp Henzler, Zeyu Ma, Todd Zickler, Deqing Sun

Understanding the Role of Training Data in Test-Time Scaling
Adel Javanmard, Baharan Mirzasoleiman, Vahab Mirrokni

Universal Model Routing for Efficient LLM Inference
Wittawat Jitkrittum, Harikrishna Narasimhan, Ankit Singh Rawat, Jeevesh Juneja, Congchao Wang, Zifeng Wang, Alec Go, Chen-Yu Lee, Pradeep Shenoy, Rina Panigrahy, Aditya Krishna Menon, Sanjiv Kumar

VideoMathQA: Benchmarking Mathematical Reasoning via Multimodal Understanding in Video
Hanoona Rasheed, Abdelrahman Shaker, Anqi Tang, Muhammad Maaz, Ming-Hsuan Yang, Salman Khan, Fahad Shahbaz Khan

VideoPhy-2: A Challenging Action-Centric Physical Commonsense Evaluation in Video Generation
Hritik Bansal, Clark Peng, Yonatan Bitton, Roman Goldenberg, Aditya Grover, Kai-Wei Chang

What’s the Plan? Metrics for Implicit Planning in LLMs and Their Application to Rhyme Generation
Jim Maar, Denis Paperno, Callum McDougall, Neel Nanda

When Does Divide and Conquer Work for Long Context LLM? A Noise Decomposition Framework
Zach Xu, Shang Zhu, Jue Wang, Junlin Wang, Ben Athiwaratkun, Chi Wang, James Zou, Ce Zhang

WorldGym: World Model as An Environment for Policy Evaluation
Julian Quevedo, Ansh Kumar Sharma, Yixiang Sun, Varad Suryavanshi, Percy Liang, Sherry Yang

WRING Out the Bias: A Rotation-Based Alternative to Projection Debiasing
Walter Gerych, Cassandra Parent, Quinn Perian, Rafiya Javed, Justin Solomon, Marzyeh Ghassemi


