Loughborough University

Self-adaptive logit balancing for deep neural network robustness: Defence and detection of adversarial attacks

journal contribution
posted on 2023-07-20, 07:56 authored by Jiefei Wei, Luyan Yao, Qinggang Meng
With the widespread application of Deep Neural Networks (DNNs), their safety has become a significant issue. The vulnerability of neural networks to adversarial examples deepens concerns about the safety of DNN applications. This paper proposes a novel defence method that improves the adversarial robustness of DNN classifiers without using adversarial training. The method introduces two new loss functions. First, a zero-cross-entropy loss is used to punish overconfidence and find the appropriate confidence for different instances. Second, a logit balancing loss is proposed to protect DNNs from non-targeted attacks by regularising the logit distribution of the incorrect classes. The method achieved competitive adversarial robustness compared to advanced adversarial training methods. Meanwhile, a novel robustness diagram is proposed to analyse, interpret and visualise the robustness of DNN classifiers against adversarial attacks. Furthermore, a Log-Softmax-pattern-based adversarial attack detection method is proposed. This detection method can distinguish between clean inputs and multiple adversarial attacks using a single multi-class MLP. In particular, it is state-of-the-art in identifying white-box gradient-based attacks: it achieved at least 95.5% accuracy in classifying four white-box gradient-based attacks, with a maximum false positive ratio of 0.1%.

School

  • Science

Department

  • Computer Science

Published in

Neurocomputing

Volume

531

Pages

180 - 194

Publisher

Elsevier

Version

  • VoR (Version of Record)

Rights holder

© The Authors

Publisher statement

This is an Open Access Article. It is published by Elsevier under the Creative Commons Attribution 4.0 International Licence (CC BY). Full details of this licence are available at: https://creativecommons.org/licenses/by/4.0/

Acceptance date

2023-02-11

Publication date

2023-02-17

Copyright date

2023

ISSN

0925-2312

eISSN

1872-8286

Language

  • en

Depositor

Deposit date: 12 July 2023
