Loughborough University

IN3: A framework for in-network computation of neural networks in the programmable data plane

journal contribution
posted on 2024-08-21, 15:41 authored by Xiaoquan Zhang, Lin Cui, Fung Po Tso, Wenzhi Li, Weijia Jia
Neural networks are widely used in networking applications because of their high accuracy and ability to generalize. However, the traditional approach of collecting network features from switches and transmitting them to the controller introduces high traffic overhead and extra communication latency. In-network computing (INC) mitigates this issue by running computing tasks directly on the data path using programmable data planes (PDP). Embedding more sophisticated computing tasks, such as neural networks, in the network remains challenging because of the limited computation and storage resources of PDP. To address this challenge, we propose IN3, a framework that enables complete neural network inference in PDP. IN3 uses model compression techniques to reduce the memory and computational requirements of a given neural network. It also provides a purpose-built data plane pipeline for per-flow feature computation and inference. We implemented a testbed prototype based on the Intel Tofino ASIC, and experimental results demonstrate that IN3 effectively reduces memory usage while significantly decreasing inference time. IN3 demonstrates the feasibility of implementing neural networks in PDP, and we identify potential future research directions in this area.

School

  • Science

Department

  • Computer Science

Published in

IEEE Communications Magazine

Volume

62

Issue

4

Pages

96 - 102

Publisher

IEEE

Version

  • AM (Accepted Manuscript)

Rights holder

© IEEE

Publisher statement

© 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Publication date

2024-04-08

Copyright date

2024

ISSN

0163-6804

eISSN

1558-1896

Language

  • en

Depositor

Dr Posco Tso. Deposit date: 6 August 2024
