<p dir="ltr">Deep neural networks have demonstrated exceptional performance in extracting task-specific representations from datasets, earning widespread recognition and application. However, the internal representations often reside in abstract, high-dimensional spaces that are unsupervised and difficult to interpret. Additionally, their complex and tightly coupled structures hinder researchers' ability to understand the models effectively. To tackle these challenges, we introduce NeuronExplorer, an analytical framework that employs self-supervised techniques for learning high-dimensional information representations. NeuronExplorer analyzes the high-dimensional representations derived from the basic units, namely neurons, within the neural network, predicting the clusters to which these neurons belong. This process facilitates the ‘community’ of neurons, enhancing interpretability.Moreover, we refine this neuron community structure by assessing the causal effects of intervening in neuron outputs, allowing us to measure the impact on model performance. NeuronExplorer ultimately enables a deeper understanding of the internal information representation within deep neural networks. Comprehensive experiments conducted across multiple models demonstrate that NeuronExplorer effectively mines internal representations, thereby improving model transparency.</p>
Funding
National Natural Science Foundation of China (Grant Numbers: 62271345 and 62301356)
Joint Fund of Ministry of Education for Equipment Pre-Research (Grant Number: 8091B032254)