Robustness of Machine Learning Systems: under distribution shift & in adversarial environments / Gupta, Srishti. - (2026 May 18).
Robustness of Machine Learning Systems: under distribution shift & in adversarial environments
GUPTA, SRISHTI
18/05/2026
Abstract
Machine learning has become integral to high-stakes decision-making systems, from medical diagnosis and autonomous vehicles to real-time threat detection and large-scale conversational agents. As these systems move from controlled research settings into open-world deployment, their robustness becomes a first-order concern. A model that performs well on a curated benchmark but fails silently in the wild offers not just diminished utility but a false sense of security that can be more dangerous than no system at all. In this dissertation, we study the robustness of machine learning systems against two distinct but related threats: (a) natural distributional drift, arising from changing trends or anomalous inputs in the wild, and (b) deliberately crafted adversarial samples designed to deceive deployed models while appearing legitimate. Both threats, though they differ fundamentally in intentionality, challenge the foundational i.i.d. assumption that training and test data are drawn from the same distribution. This assumption is a cornerstone of statistical learning: it provides a tractable surrogate for the unknowable true data-generating process. In deployment, however, treating it as a guarantee is untenable. Whether caused by shifting trends or a hostile adversary, its violation degrades performance and erodes user trust. In the first part, we study distributional drift in non-malicious environments along two axes: how models can be updated incrementally as new data arrives, and how deployed models can detect inputs that fall outside their training distribution. We study the two jointly, with the aim of making models adaptive to an open and evolving world.
In the second part, we study robustness under adversarial conditions: we empirically characterise how robustness scales with model size, extend this analysis to autonomous driving systems that rely on deep reinforcement learning, and then turn to large language models, where adversarial manipulation takes new forms and the evaluation infrastructure itself emerges as a source of systematic risk. Together, these contributions argue that trustworthy machine learning in the real world is not a single problem to be solved but a commitment to rigor at every layer: in how systems learn, how they withstand attack, and how we measure whether they have succeeded.

| File | Size | Format |
|---|---|---|
| Tesi_dottorato_Gupta.pdf (open access; note: complete thesis; type: doctoral thesis; license: Creative Commons) | 7.06 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


