Publications
First-order Optimization for Superquantile-based Supervised Learning
Yassine Laguel, Jérôme Malick, Zaid Harchaoui
MLSP, 2020
Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon
Zihan Zhang, Xiangyang Ji, Simon S. Du
arXiv, 2020
Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences
Daniel S. Brown, Russell Coleman, Ravi Srinivasan, Scott Niekum
ICML, 2020
Learning Deep ReLU Networks is Fixed Parameter Tractable
Sitan Chen, Adam R. Klivans, Raghu Meka
arXiv, 2020
Learning to Improve Multi-Robot Hallway Navigation
Jin-Soo Park, Brian Tsang, Harel Yedidsion, Garrett Warnell, Daehyun Kyoung, Peter Stone
CoRL, 2020
Entanglement is Necessary for Optimal Quantum Property Testing
Sébastien Bubeck, Sitan Chen, Jerry Li
arXiv, 2020
Adaptive Sampling to Reduce Disparate Performance
Jacob Abernethy, Pranjal Awasthi, Matthäus Kleindessner, Jamie Morgenstern, Jie (Claire) Zhang
arXiv, 2020
Value Alignment Verification
Daniel S. Brown, Jordan Schneider, Scott Niekum
NeurIPS, 2020
Superpolynomial Lower Bounds for Learning One-Layer Neural Networks using Gradient Descent
Surbhi Goel, Aravind Gollakota, Zhihan Jin, Sushrut Karmalkar, Adam Klivans
ICML, 2020
Kernel and Rich Regimes in Overparametrized Models
Blake Woodworth, Suriya Gunasekar, Jason D. Lee, Edward Moroshko, Pedro Savarese, Itay Golan, Daniel Soudry, Nathan Srebro
PMLR, 2020
RIDM: Reinforced Inverse Dynamics Modeling for Learning from a Single Observed Demonstration
Brahma Pavse, Faraz Torabi, Josiah Hanna, Garrett Warnell, Peter Stone
IROS, 2020
Reducing Sampling Error in Batch Temporal Difference Learning
Brahma Pavse, Ishan Durugkar, Josiah Hanna, Peter Stone
ICML, 2020
The EMPATHIC Framework for Task Learning from Implicit Human Feedback
Yuchen Cui, Qiping Zhang, Alessandro Allievi, Peter Stone, Scott Niekum, W. Bradley Knox
CoRL, 2020
On Sampling Error in Batch Action-Value Prediction Algorithms
Brahma S. Pavse, Josiah P. Hanna, Ishan Durugkar, Peter Stone
NeurIPS, 2020
Learning and Reasoning for Robot Dialog and Navigation Tasks
Keting Lu, Shiqi Zhang, Peter Stone, Xiaoping Chen
July, 2020
When random initializations help: a study of variational inference for community detection
Purnamrita Sarkar, Rachel Ward, Soumendu Sundar Mukherjee
arXiv, 2019
AdaGrad stepsizes: Sharp convergence over nonconvex landscapes
Rachel Ward, Xiaoxia Wu, Léon Bottou
PMLR, 2019
Splitting Steepest Descent for Growing Neural Architectures
Q. Liu, D. Wang, L. Wu
NeurIPS, 2019