SIMPAS Group Meeting 2025 –
Learning under Requirements: Supervised and Reinforcement Learning with Constraints

This tutorial is geared towards researchers and practitioners interested in imposing requirements on ML systems, such as fairness, robustness, and safety. Typically, these statistical, data-driven requirements are imposed by folding requirement-violation metrics into the learning objective, producing a single training loss. To guarantee that the solution satisfies the requirements, however, this approach demands careful tuning of hyperparameters (penalty coefficients) via cross-validation, which can be computationally intensive and time consuming. Constrained learning instead incorporates requirements as explicit statistical constraints, rather than by modifying the training objective.
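For concreteness, the two formulations can be written side by side. The notation below is ours and deliberately generic: l_0 is the learning objective, l_1 a requirement-violation metric, and c the required level.

    % Penalty approach: the requirement is folded into the training loss
    % and the coefficient \lambda must be tuned, e.g., by cross-validation.
    \min_{\theta} \; \mathbb{E}\big[\ell_0(f_\theta(x), y)\big]
      + \lambda \, \mathbb{E}\big[\ell_1(f_\theta(x), y)\big]

    % Constrained learning: the requirement enters as an explicit
    % statistical constraint with an interpretable level c.
    \min_{\theta} \; \mathbb{E}\big[\ell_0(f_\theta(x), y)\big]
    \quad \text{subject to} \quad \mathbb{E}\big[\ell_1(f_\theta(x), y)\big] \le c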

In this tutorial, we provide an overview of theoretical and algorithmic advances from the past five years that show when and how it is possible to learn under constraints and to impose requirements on ML systems effectively, both during training and at test time. Specifically, we explore the role and impact of different types of requirements in supervised learning, robust learning, and RL:

  1. Constrained supervised learning, where we establish generalization guarantees based on new non-convex duality results and use them to derive practical algorithms;
  2. Constrained reinforcement learning, where we develop a parallel theory for constrained RL, show that it is strictly more expressive than unconstrained RL, and provide algorithms with guarantees. Both parts build on the Lagrangian formulation sketched after this list.
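In the generic notation introduced above, the constrained problem admits a saddle-point (dual) view in which the dual variable lambda is learned rather than hand-tuned; this is a sketch, not the tutorial's exact statement.

    % Dual ascent on \lambda replaces cross-validation of penalty coefficients.
    \max_{\lambda \ge 0} \; \min_{\theta} \;
      \mathbb{E}\big[\ell_0(f_\theta(x), y)\big]
      + \lambda \Big( \mathbb{E}\big[\ell_1(f_\theta(x), y)\big] - c \Big)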

Throughout the tutorial, we illustrate the effectiveness and flexibility of constrained learning in a diverse set of applications, such as fairness, federated learning, robust image classification, safe RL, and wireless communications. Ultimately, this tutorial provides a general tool that can be used to tackle a variety of problems in ML and sequential decision-making.

Prerequisite knowledge: only a basic understanding of optimization, ML, and RL is expected. Specifically: familiarity with the basics of convex optimization and its algorithms (convex functions, gradients, and gradient descent); the fundamentals of empirical risk minimization (ERM) and its associated learning theory (basic notions of generalization and sample complexity); and familiarity with Markov decision processes (MDPs) and basic RL algorithms (policy gradient methods such as REINFORCE).

Schedule

Time Topic
10:00 - 10:15  Introduction
10:15 - 11:00  Constrained supervised learning (slides)
    - (un)constrained learning and ERM
    - non-convex duality and constrained learning theory
    - constrained learning algorithms
    - (robust or resilient) learning
11:00 - 11:10  Break
11:10 - 11:45  Constrained reinforcement learning (slides)
    - (C)MDPs and (C)RL
    - CRL duality
    - CRL algorithms
11:45 - 12:00  Q&A and discussion

A selected bibliography for the tutorial is provided below.

Content

Constrained Supervised Learning: Duality, Generalization, and Resilience

In this module, we formalize statistical learning and empirical risk minimization (ERM), discuss different types of requirements that arise in ML, and show how these requirements are imposed using constrained optimization. We then introduce constrained learning theory and prove a family of generalization bounds for constrained learning based not on constrained ERM, but on non-convex duality. Building on these results, we develop primal-dual training algorithms, discuss practical aspects of their implementation (step sizes, stopping criteria), and elucidate their convergence properties. We also explore how these duality results can be used to adapt the trade-off between objective and requirements, balancing the marginal costs and benefits of relaxing each constraint. We dub these techniques "resilient learning," after the ecological concept describing the ability of a system to adapt to changes in its environment.
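To make the primal-dual scheme concrete, here is a minimal, self-contained sketch on a toy problem (quadratic objective, norm-ball constraint, illustrative step sizes). This is not the tutorial's implementation; in a learning setting the primal update would be a stochastic gradient step on model parameters.

    import numpy as np

    # Toy primal-dual (gradient descent-ascent) loop for
    #   min_theta f(theta)  subject to  g(theta) <= c.
    a = np.array([2.0, 0.0])
    f = lambda th: np.sum((th - a) ** 2)      # objective
    g = lambda th: np.sum(th ** 2)            # constraint function
    grad_f = lambda th: 2.0 * (th - a)
    grad_g = lambda th: 2.0 * th

    theta, lam, c = np.zeros(2), 0.0, 1.0
    eta_p, eta_d = 0.05, 0.05                 # primal/dual step sizes

    for _ in range(2000):
        # Primal descent on the Lagrangian f + lam * (g - c)
        theta -= eta_p * (grad_f(theta) + lam * grad_g(theta))
        # Dual ascent: lam grows while the constraint is violated
        lam = max(0.0, lam + eta_d * (g(theta) - c))

    print(theta, lam)  # approaches theta = [1, 0], lam = 1 (constraint active)

Note how the dual variable plays the role of the penalty coefficient, but is adjusted automatically by the violation of the constraint instead of being cross-validated.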

Applications: We use fairness as an example of a rate requirement (e.g., equalized odds, churn), adversarial examples in image classification as an example of robustness, and federated learning to illustrate how to tackle ML problems with simultaneous goals (e.g., heterogeneous data or heterogeneous performance targets). We then revisit this heterogeneous federated learning application to illustrate the use of resilient learning.

Constrained Reinforcement Learning: Duality and Algorithms

The final module develops a parallel constrained learning theory in the dynamic setting of sequential decision-making. We start by introducing the MDP formalism and the RL algorithms commonly used to tackle it. Using a different proof technique, we then derive non-convex duality results analogous to those of the supervised case and use them to put forward a primal-dual algorithm for constrained RL. We then show how this algorithm can fail even on simple tasks and describe a systematic state augmentation procedure that provably overcomes this issue.
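As a rough illustration of the state-augmentation idea, the sketch below appends the dual variable to the observation so the policy can condition on it while being trained on the Lagrangian reward. The gym-style environment API, names, and update schedule are our assumptions for illustration, not the tutorial's code.

    import numpy as np

    class StateAugmentedEnv:
        """Appends a dual variable lam to the observation and returns the
        Lagrangian reward r - lam * cost; lam is updated by dual ascent
        every T0 steps based on the average constraint cost."""

        def __init__(self, env, cost_fn, c, eta_d=0.01, T0=100):
            self.env, self.cost_fn, self.c = env, cost_fn, c
            self.eta_d, self.T0 = eta_d, T0
            self.lam, self.t, self.cum_cost = 0.0, 0, 0.0

        def reset(self):
            self.t, self.cum_cost = 0, 0.0
            return np.append(self.env.reset(), self.lam)

        def step(self, action):
            obs, reward, done, info = self.env.step(action)
            cost = self.cost_fn(obs, action)
            self.t += 1
            self.cum_cost += cost
            if self.t % self.T0 == 0:  # periodic dual ascent on lam
                avg_cost = self.cum_cost / self.T0
                self.lam = max(0.0, self.lam + self.eta_d * (avg_cost - self.c))
                self.cum_cost = 0.0
            return np.append(obs, self.lam), reward - self.lam * cost, done, info

The key point is that the policy sees lam as part of the state, so its behavior can track the multiplier dynamics rather than committing to a single fixed penalty.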

Applications: We consider safe policy learning and wireless resource allocation to motivate and illustrate the results of this module. We then turn to a continuous monitoring problem to illustrate the limitations of unconstrained RL and the need for state augmentation.

Bibliography

Main references

Constrained supervised learning

Constrained reinforcement learning

Others