Statistical and Algorithmic ML Research

The Statistical and Algorithmic ML Research Group, directed by AP Thanh Nguyen-Tang and currently hosted at the Department of Data Science, NJIT, is dedicated to studying the statistical and algorithmic foundations of learning in modern AI settings.

[Gold Reviewer Award ICML 2026]
05/14/2026

04/30/2026

[ICML 2026 Spotlight]

Modern interactive AI systems (e.g., recommendation systems, social platforms) learn from your interactions, such as clicks and queries, but this data footprint can reveal your identity. If a user later requests their data be removed—a "right to be forgotten" protected by laws like the GDPR and CCPA—how do we build AI systems that optimally facilitate such requests?

Our ICML Spotlight paper, "Exact Unlearning in Reinforcement Learning," provides the first rigorous theoretical foundation for this challenge. We formulate the problem and provide (nearly) minimax-optimal solutions, making it possible to efficiently remove a user’s data (and its effect) from an interactive AI system, as if the data had never been seen.
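
For readers new to the topic, one common way to formalize "as if it was never seen" in the unlearning literature is a distributional equality between unlearning and retraining from scratch. The notation below is an assumption made for illustration (the learning algorithm \mathcal{A}, the unlearning procedure \mathcal{U}, the full interaction dataset D, and the departing user's data D_u are not taken from the paper, whose precise definition for interactive RL may differ):

```latex
% Exact unlearning (generic form, assumed notation):
% the output of unlearning user u from a model trained on D has the same
% distribution as the output of training on D with u's data removed.
\mathcal{U}\bigl(\mathcal{A}(D),\, D,\, D_u\bigr)
  \;\overset{d}{=}\;
  \mathcal{A}\bigl(D \setminus D_u\bigr)
```

Retraining from scratch satisfies this trivially, so the interesting question is how cheaply \mathcal{U} can be implemented while the equality still holds.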

04/03/2026

📢 Call for Papers: Decision-Making from Offline Datasets to Online Adaptation: Black-Box Optimization to Reinforcement Learning @ ICML 2026 (International Conference on Machine Learning)

Website: https://decision-making-offline2online-icml2026.github.io/

We invite the submission of research papers and position papers for our ICML 2026 workshop on Decision-Making from Offline Datasets to Online Adaptation. This workshop aims to explore methods for learning policies, acquisition strategies, and decision rules entirely from previously collected data (offline) or with a small amount of new real-world data (online), spanning settings such as black-box optimization, contextual bandits, reinforcement learning (RL), and their synergies. The workshop will highlight both foundational advances and real-world applications in domains where online experimentation is costly, unsafe, or infeasible, including scientific discovery, engineering design, healthcare, education, recommender systems, and beyond.

Topics of interest include, but are not limited to:
• Offline RL: Algorithms, theory, and applications of RL trained from offline datasets, including long-horizon and safety-constrained settings.
• Offline RL for Foundation Models: RLHF, reasoning model training, and alignment using offline data.
• Black-Box Optimization from Offline Data: Model-based optimization and high-throughput experimental design in few- or single-round settings.
• Contextual Bandits from Logged Data: Learning and evaluation using large-scale interaction logs.
• Off-Policy Evaluation and Policy Comparison: Reliable evaluation, confidence estimation, and counterfactual reasoning.
• Hybrid Offline-to-Online Learning: Methods combining offline datasets with limited online interaction.
• Uncertainty Quantification for Offline Decision-Making: Conformal prediction and risk-aware learning.
• Causal Inference from Observational Data: Leveraging causal structure for improved decision-making.
• Generative Models for Decision-Making: Deep generative approaches for policy learning and design optimization.
• Multi-Task and Multi-Objective Learning: Scaling offline methods across tasks and objectives.
• Benchmarks and Evaluation Protocols: Realistic datasets and metrics reflecting real-world deployment challenges.
• Applications in Science and Engineering: Materials discovery, drug design, chip design, robotics, healthcare, education, and industrial systems.

Submission Deadline: May 5, 2026, AoE
Author Notification: May 15, 2026, AoE
Camera Ready Deadline: June 15, 2026, AoE

OpenReview submission site: https://lnkd.in/e6agxXFB

We have an amazing lineup of Invited Speakers covering academia and industry:
Jacob Gardner
Wen Sun
Clara Wong-Fannjiang
Eytan Bakshy
Aarti Singh
Sergey Levine

A great team of Organizers: A***n Deshwal, Haruka Kiyohara, Willie Neiswanger, Nghia Hoang, Syrine Belakaria, Thanh Nguyen-Tang, and Janardhan Rao (Jana) Doppa

12/19/2025

This is what one of my students wrote about me. It's always great to see students grow 😄 (and don't run away from a 30-page proof 😂)

11/08/2025
11/06/2025

How do transformers learn in-context recall tasks?

Consider the incomplete sentence, “After talking to Bob about Anna, Charles gives her email address to [?].” A traditional machine learning model would often make predictions based on global statistical patterns. For example, since words that most frequently follow “to” in English tend to be common function words or verbs such as “the,” “be,” “go,” or “have,” a traditional ML model might predict one of these. However, a correct prediction here requires in-context understanding—recognizing that “her” refers to “Anna,” making “Bob” a more plausible continuation. This type of reasoning reflects a model’s ability to recall and use context within the same sequence.

In our recent work (https://arxiv.org/abs/2505.15009) with Quan Nguyen (Victoria University, now at The University of British Columbia), we present a formal study of how transformers learn such in-context recall tasks. Specifically:

- Formalization: We introduce a formal definition for a class of in-context recall tasks.

- Optimality: We prove that one-layer transformers with linear, ReLU, or softmax attention mechanisms can realize the Bayes-optimal next-token predictor for these tasks.

- Training Dynamics: We show that normalized gradient descent converges to the optimal solution at a linear rate. A key technical challenge arises from the softmax attention normalization, which bounds attention scores and limits the ability to place mass on the target token’s logit. Nevertheless, we establish both approximation and convergence results by applying an appropriate scaling to the value vectors.

- Out-of-distribution Generalization: Finally, we demonstrate that the trained transformer generalizes to next-token prediction tasks involving novel target tokens—tokens never seen during training—indicating that transformers learn in-context positional associations rather than merely memorizing specific tokens.
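
To make the points above concrete, here is a minimal hand-built sketch in Python, not the construction or the training procedure from the paper. It assumes a toy recall task (the sequence ends with a trigger token that appeared once earlier, and the answer is the token that followed it) and assumes each position exposes its previous token on the key side and its own token on the value side. The names `recall_probs`, `beta`, and `value_scale` are hypothetical. The sketch shows why softmax normalization caps the target logit and how scaling the value vectors restores a confident prediction.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def recall_probs(tokens, vocab_size, beta=10.0, value_scale=10.0):
    """Next-token distribution from a hand-built one-layer softmax attention.

    Assumed toy task: the sequence ends with a trigger token that appeared
    once earlier; the answer is the token that followed that occurrence.
    """
    tokens = np.asarray(tokens)
    trigger = tokens[-1]
    prev_tok = tokens[:-1]   # key side: the token just before each position
    cur_tok = tokens[1:]     # value side: the token at each position
    # Attention scores are large where the preceding token equals the trigger.
    scores = beta * (prev_tok == trigger)
    attn = softmax(scores)   # after normalization every weight is at most 1
    # Output logits: attention-weighted sum of scaled one-hot value vectors,
    # so the target logit can never exceed value_scale.
    logits = value_scale * np.bincount(cur_tok, weights=attn, minlength=vocab_size)
    return softmax(logits)

seq = [5, 2, 7, 3, 4, 2]   # trigger 2 was followed by 7 earlier in the sequence
p_small = recall_probs(seq, vocab_size=10, value_scale=1.0)
p_large = recall_probs(seq, vocab_size=10, value_scale=10.0)
novel = recall_probs([5, 2, 9, 3, 4, 2], vocab_size=10)  # a different target token
print(int(p_large.argmax()), float(p_small[7]), float(p_large[7]), int(novel.argmax()))
```

With a value scale of 1 the correct token is still the most likely one but receives only a modest probability, while a larger scale concentrates almost all the mass on it. Because the rule depends on position rather than token identity, swapping in a different target token changes the prediction accordingly.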



——
We are still looking for PhD students to work on the theory of Reinforcement Learning and Transformers for our Statistical and Algorithmic ML group. Details at https://www.facebook.com/share/p/1DKDBopPk8/?mibextid=wwXIfr

We study the approximation capabilities, convergence speeds and on-convergence behaviors of transformers trained on in-context recall tasks, which require recognizing the positional association between a pair of tokens from in-context examples. Existing theoretical results only focus on t...

10/30/2025

How can we learn safe, reward-maximizing policies from offline datasets (no additional interaction with the environment)?
This challenge arises in many real-world problems in safety-critical applications where online exploration becomes infeasible.

Our NeurIPS'25 paper (https://arxiv.org/abs/2510.22027) provides an elegant theoretical and algorithmic framework that formulates the safety-constrained learning problem as a minimax game between: (i) a *max player* who selects the policy and (ii) a *min player* who selects the Lagrange multiplier.
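
As a rough illustration, a game of this kind is usually written as the Lagrangian of a constrained problem. The symbols below are assumptions made for this sketch and are not taken from the paper: J_r(\pi) and J_c(\pi) denote the expected cumulative reward and cost of policy \pi, \tau is the cost threshold, and B is an upper bound on the multiplier.

```latex
% Constrained problem (assumed notation):
%   maximize J_r(\pi) subject to J_c(\pi) \le \tau
% Lagrangian minimax form, with the max player choosing \pi
% and the min player choosing \lambda:
\max_{\pi} \; \min_{\lambda \in [0, B]} \;
  L(\pi, \lambda)
  \;=\;
  J_r(\pi) \;-\; \lambda \,\bigl( J_c(\pi) - \tau \bigr)
```

If a policy violates the constraint, the min player can push \lambda up to penalize it; if the constraint holds with slack, the min player's best choice is \lambda = 0.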

We provide an *oracle-efficient* learning algorithm that allows the max player to leverage the power of a stochastic oracle for offline RL while the min player pursues no-regret learning, provably converging to an approximate min-max equilibrium.

A practical version of our algorithmic framework allows us to turn the min player into a multi-armed bandit instance while employing iterative steps of any off-the-shelf offline RL algorithm as an approximation to the stochastic oracle. Our experiments show that the proposed approximate algorithm outperforms state-of-the-art methods, especially under stringent safety/cost constraints.
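
The interplay between the two players can be sketched in a few lines of Python. This is a toy under stated assumptions, not the algorithm from the paper: the "offline RL oracle" is replaced by a hypothetical lookup over three candidate policies with made-up reward and cost estimates, and the min player runs exponential weights (Hedge) with full-information feedback over a grid of multipliers rather than a true bandit algorithm. All names (`offline_rl_oracle`, `solve_minimax`, `lambdas`, `tau`) are illustrative.

```python
import numpy as np

def offline_rl_oracle(lam, rewards, costs):
    """Hypothetical stand-in for an offline RL oracle.

    Returns the index of the candidate policy maximizing reward - lam * cost.
    """
    return int(np.argmax(rewards - lam * costs))

def solve_minimax(rewards, costs, tau, lambdas, rounds=2000, eta=0.5):
    """Approximate the equilibrium of max_pi min_lambda r(pi) - lambda * (c(pi) - tau)."""
    weights = np.full(len(lambdas), 1.0 / len(lambdas))
    counts = np.zeros(len(rewards))
    for _ in range(rounds):
        # The Lagrangian is linear in lambda, so responding to the mean
        # multiplier is a best response to the min player's mixed strategy.
        lam = float(weights @ lambdas)
        pi = offline_rl_oracle(lam, rewards, costs)
        counts[pi] += 1
        # Lagrangian value of every arm against the chosen policy; the min
        # player shifts weight toward arms with a smaller value.
        lagrangian = rewards[pi] - lambdas * (costs[pi] - tau)
        weights = weights * np.exp(-eta * lagrangian)
        weights /= weights.sum()
    return counts / rounds  # empirical mixture over the candidate policies

rewards = np.array([1.0, 0.6, 0.2])   # made-up reward estimates
costs = np.array([0.9, 0.4, 0.1])     # made-up cost estimates
lambdas = np.linspace(0.0, 5.0, 11)   # arms of the min player
mixture = solve_minimax(rewards, costs, tau=0.5, lambdas=lambdas)
avg_reward, avg_cost = float(mixture @ rewards), float(mixture @ costs)
print(mixture, avg_reward, avg_cost)
```

In this toy the returned mixture blends the high-reward, high-cost policy with a safer one so that the average cost sits close to the threshold, which is the behavior one would expect from an approximate equilibrium of the Lagrangian game.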

-------
I will be at NeurIPS'25 on Dec 2-7. Happy to discuss RL (offline, multi-agent) and transformer theory, and to catch up with old and new friends. I will also be looking for PhD students there to join my Statistical and Algorithmic ML group (https://thanhnguyentang.github.io/) to work on these topics. Feel free to email me if you are interested in the positions.

We study the problem of Offline Safe Reinforcement Learning (OSRL), where the goal is to learn a reward-maximizing policy from fixed data under a cumulative cost constraint. We propose a novel OSRL approach that frames the problem as a minimax objective and solves it by combining offline RL with onl...

We are looking for PhD students on Machine Learning Theory for Spring/Fall 2026! Nguyễn Tăng Thành
10/16/2025

Address

New Jersey Institute Of Technology
Newark, NJ
07102
