P-XDream: Inferring maximally exciting stimuli for populations of units in Convolutional Neural Networks
QUINTAVALLE, SEBASTIANO
2023/2024
Abstract
Since the pioneering work of Hubel and Wiesel, one of the central aims of visual neuroscience has been to characterize the visual features that neurons extract along the visual processing hierarchy. However, given the non-linear chain of transformations performed across this hierarchy, it has been extremely challenging to derive effective models to fulfill this goal. A significant advance in this direction is the XDream approach (Ponce et al., 2019), which employs gradient-free optimization to synthesize a visual stimulus that maximizes the activity of a unit in either a convolutional neural network (CNN) or a visual cortical neuron. Despite the success of this paradigm, studying the behavior of individual cells in isolation offers limited biological relevance and reduces interpretability in architectures involving interactions among thousands or millions of units. In this work, we introduce P-XDream, a framework that extends XDream by generalizing the synthesis of “superstimuli” to populations of neural units. Specifically, P-XDream employs unsupervised learning methods to identify populations of units that tend to co-activate in response to natural visual statistics. It then synthesizes a visual stimulus that optimizes the collective activity of the selected population. Among these methods, Dominant Set clustering has proven to be the most effective, achieving high activation levels while also providing a notable degree of interpretability. The results indicate that population-based maximization yields ensemble activations at least twice as high as those obtained by maximizing randomly selected unit groups of the same size, reinforcing the importance of population-based representations in CNNs. Additionally, we illustrate the effects of P-XDream maximization across hierarchical levels within a CNN, providing insights into feature extraction mechanisms across layers.
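To make the population-selection step concrete, the sketch below illustrates the standard Dominant Set extraction via replicator dynamics (Pavan and Pelillo) applied to a unit co-activation affinity matrix. It is a minimal illustration, not code from the thesis: the simulated `responses` array, the correlation-based affinity, and the above-uniform-weight thresholding rule are all assumptions introduced here for demonstration.

```python
import numpy as np

def dominant_set(A, n_iter=1000, tol=1e-8):
    """Extract a dominant set from a non-negative affinity matrix A
    via replicator dynamics: x <- x * (A x) / (x^T A x)."""
    n = A.shape[0]
    x = np.full(n, 1.0 / n)              # start from the barycenter of the simplex
    for _ in range(n_iter):
        x_new = x * (A @ x)
        x_new /= x_new.sum()             # sum equals x^T A x, so this is the replicator update
        if np.linalg.norm(x_new - x, 1) < tol:
            return x_new
        x = x_new
    return x

# Hypothetical usage: affinity built from unit co-activations on natural images
rng = np.random.default_rng(0)
responses = rng.random((500, 64))             # assumed (images x units) activation matrix
A = np.corrcoef(responses.T).clip(min=0.0)    # non-negative unit-to-unit affinity
np.fill_diagonal(A, 0.0)                      # zero self-affinity, as usual for dominant sets
weights = dominant_set(A)
population = np.where(weights > 1.0 / A.shape[0])[0]  # units with above-uniform support
print("co-activating population:", population)
```

In this reading, the resulting population would then serve as the target whose summed (or otherwise aggregated) activation the stimulus-synthesis loop maximizes; the aggregation choice here is an assumption, not a statement of the thesis implementation.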
File | Size | Format
---|---|---
quintavalle_master_thesis.pdf (open access) | 19.74 MB | Adobe PDF
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14247/24481