Adversarial Robustness in Deep Neural Networks
DOI:
https://doi.org/10.15662/IJRAI.2018.0103003

Keywords:
Adversarial robustness, deep neural networks, adversarial examples, FGSM, adversarial training, defensive distillation, model vulnerability

Abstract
Adversarial robustness—the capability of deep neural networks (DNNs) to resist intentionally crafted perturbations—has emerged as a critical concern in fields such as computer vision, autonomous systems, and cybersecurity. Early research uncovered that imperceptible input perturbations can reliably mislead DNNs, producing so-called adversarial examples that appear unaltered to humans. This vulnerability stems primarily from the locally linear behavior of high-dimensional models, as argued by Goodfellow et al. (2014). Subsequent studies achieved alarming misclassification rates with minimal perturbations, highlighting the need for effective defenses. Among early mitigation approaches, adversarial training—augmenting the training data with adversarial examples—offered practical improvements, while defensive distillation (Papernot et al., 2015) reduced vulnerability by smoothing the gradients that attackers exploit. This paper surveys these foundational works and others that dissect adversarial mechanics, assess their limitations, and propose defenses. We review key generation methods (e.g., FGSM), defense mechanisms (adversarial training, distillation), and theoretical analyses of vulnerability. Synthesizing these findings, we elucidate the fundamental tradeoff between accuracy and robustness and highlight how early defenses shaped subsequent advances. We conclude with reflections on the structural challenges of building adversarially robust DNNs and propose directions for future investigation, such as robust optimization and robustness certification methods, which began emerging shortly after 2017.
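As a brief illustration of the FGSM generation method referenced above, the perturbation proposed by Goodfellow et al. (2014) can be written as a single gradient-sign step; here x is the input, y its label, \theta the model parameters, J the training loss, and \epsilon the perturbation budget:

\[
x_{\mathrm{adv}} = x + \epsilon \cdot \operatorname{sign}\bigl(\nabla_{x} J(\theta, x, y)\bigr)
\]

Adversarial training, in its early form, simply mixes such perturbed inputs into the training set so that the model also minimizes its loss on them.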