Neural Regression, Representational Similarity, Model Zoology & Neural Taskonomy at Scale in Rodent Visual Cortex
This article is an abridged summary of a longer work appearing at NeurIPS 2021, as well as a conceptual introduction to the Deep Mouse Trap GitHub repo.
What goes on in the brain of a mouse? It’s a seemingly simple question that belies a devilishly complicated scientific endeavor: understanding how the firing and wiring together of neurons in a nervous system produce intelligent behavior. The mouse is arguably the centerpiece of a modern neuroscientific praxis that has availed itself of everything from genetics to cybernetics, yet we still know surprisingly little about key aspects of its neural software. In this project, we’ll be looking at vision, and in particular the ways we’ve increasingly come to model it.
The relative paucity of models we have for characterizing the vision of mice is made all the more conspicuous by the relative excess of models we have for characterizing the vision of another paradigmatic lab animal: monkeys (and in particular the rhesus macaque). Over the last 5 years, our ability to characterize and predict the neural activity of macaque visual cortex has surged in large part thanks to a singular class of model: object-recognizing deep neural networks. So powerful are these models that we can even use them as a sort of neural remote control, synthesizing visual stimuli that drive neural activity beyond the range evoked by handpicked natural images. The success of these models in predicting mouse visual cortex, on the other hand, has been a bit more modest, with some even suggesting that randomly initialized neural networks (neural networks that have never actually learned anything) are as predictive as trained ones – a particularly worrisome suggestion if we’d like to make mechanistic claims about the neural activity we’re predicting as having something to do with visual intelligence.
Here, we re-examine the state of neural network modeling in mouse visual cortex, using a massive optical physiology dataset of over 6,600 highly reliable visual cortical neurons (courtesy of the Allen Brain Observatory), a large battery of neural networks (both trained and randomly initialized), and multiple methods of comparing the activity of those networks to the brain (including both representational similarity and linear mapping). Our intent here is not necessarily to converge on the single best model of the mouse brain per se, but to better understand the kinds of pressures that shape the representations inherent to those models with greater or lesser neural predictivity.
We first preprocess the neural data such that we have the average response per neuron to each of the 119 natural scene images that were used by the Brain Observatory as a free-viewing probe. (The 6,619 neurons in our final set are a subsample of a larger set of 37,398 unique neurons that we’ve filtered for reliability.) Our neural sample includes neurons from 6 different cortical areas that span what has (neuroanatomically) been demarcated as the rodent ventral and dorsal visual pathways.
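This preprocessing step can be sketched in numpy. Everything here is illustrative: the shapes, the split-half reliability measure, and the threshold are stand-ins for the actual Allen Brain Observatory pipeline, which is more involved.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical response tensor: (n_trials, n_images, n_neurons) for the
# 119 natural scenes; the real data come from the Allen Brain Observatory.
n_trials, n_images, n_neurons = 10, 119, 200
responses = rng.normal(size=(n_trials, n_images, n_neurons))

# Trial-averaged response of each neuron to each image.
mean_responses = responses.mean(axis=0)  # (n_images, n_neurons)

def columnwise_pearson(x, y):
    # Pearson r between matching columns of two (n_images, n_neurons) arrays.
    xz = (x - x.mean(0)) / x.std(0)
    yz = (y - y.mean(0)) / y.std(0)
    return (xz * yz).mean(0)

# Split-half reliability: correlate each neuron's image tuning across
# two halves of the trials, then keep only neurons above a threshold.
half_a = responses[0::2].mean(axis=0)
half_b = responses[1::2].mean(axis=0)
reliability = columnwise_pearson(half_a, half_b)  # (n_neurons,)
reliable = mean_responses[:, reliability > 0.5]   # threshold is illustrative
```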
We then compare these responses systematically to the responses of artificial neurons across the layers of a variety of deep net models, selected deliberately to engender meaningful experimental foils we can use to answer thematic questions about representations in the mouse brain. These models include over 90 distinct architectures (e.g. ConvNets, transformers, MLP-Mixers), all trained on object classification with the ImageNet training set; the randomly initialized (untrained) versions of these same architectures; the 24 models of the Taskonomy project (all of which share the same encoder architecture as a backbone); and 20 models (all ResNet50 architectures) trained on a variety of self-supervised tasks. A list of the models we use is available in the table below.
| Model | Description |
|---|---|
| AlexNet | AlexNet trained on image classification with the ImageNet dataset. |
| VGG11 | VGG11 trained on image classification with the ImageNet dataset. |
| VGG13 | VGG13 trained on image classification with the ImageNet dataset. |
| VGG16 | VGG16 trained on image classification with the ImageNet dataset. |
| VGG19 | VGG19 trained on image classification with the ImageNet dataset. |
| VGG11-BatchNorm | VGG11-BatchNorm trained on image classification with the ImageNet dataset. |
| VGG13-BatchNorm | VGG13-BatchNorm trained on image classification with the ImageNet dataset. |
| VGG16-BatchNorm | VGG16-BatchNorm trained on image classification with the ImageNet dataset. |
| VGG19-BatchNorm | VGG19-BatchNorm trained on image classification with the ImageNet dataset. |
| ResNet18 | ResNet18 trained on image classification with the ImageNet dataset. |
| ResNet34 | ResNet34 trained on image classification with the ImageNet dataset. |
| ResNet50 | ResNet50 trained on image classification with the ImageNet dataset. |
| ResNet101 | ResNet101 trained on image classification with the ImageNet dataset. |
| ResNet152 | ResNet152 trained on image classification with the ImageNet dataset. |
| SqueezeNet1.0 | SqueezeNet1.0 trained on image classification with the ImageNet dataset. |
| SqueezeNet1.1 | SqueezeNet1.1 trained on image classification with the ImageNet dataset. |
| DenseNet121 | DenseNet121 trained on image classification with the ImageNet dataset. |
| DenseNet161 | DenseNet161 trained on image classification with the ImageNet dataset. |
| DenseNet169 | DenseNet169 trained on image classification with the ImageNet dataset. |
| DenseNet201 | DenseNet201 trained on image classification with the ImageNet dataset. |
| GoogleNet | GoogleNet trained on image classification with the ImageNet dataset. |
| ShuffleNet-V2-x0.5 | ShuffleNet-V2-x0.5 trained on image classification with the ImageNet dataset. |
| ShuffleNet-V2-x1.0 | ShuffleNet-V2-x1.0 trained on image classification with the ImageNet dataset. |
| MobileNet-V2 | MobileNet-V2 trained on image classification with the ImageNet dataset. |
| ResNext50-32x4D | ResNext50-32x4D trained on image classification with the ImageNet dataset. |
| ResNext101-32x8D | ResNext101-32x8D trained on image classification with the ImageNet dataset. |
| Wide-ResNet50 | Wide-ResNet50 trained on image classification with the ImageNet dataset. |
| Wide-ResNet101 | Wide-ResNet101 trained on image classification with the ImageNet dataset. |
| MNASNet0.5 | MNASNet0.5 trained on image classification with the ImageNet dataset. |
| MNASNet1.0 | MNASNet1.0 trained on image classification with the ImageNet dataset. |
| Inception-V3 | Inception-V3 trained on image classification with the ImageNet dataset. |
| Autoencoder | Image compression and decompression. |
| Object Classification | 1000-way object classification (via knowledge distillation from ImageNet). |
| Scene Classification | Scene Classification (via knowledge distillation from MIT Places). |
| Curvatures | Magnitude of 3D principal curvatures. |
| Denoising | Uncorrupted version of corrupted image. |
| Euclidean Depth | Euclidean depth estimation. |
| Z-Buffer Depth | Z-buffer depth estimation. |
| Occlusion Edges | Edges which include parts of the scene. |
| Texture Edges | Edges computed from RGB only (texture edges). |
| Egomotion | Odometry (camera poses) given three input images. |
| Camera Pose (Fixated) | Relative camera pose with matching optical centers. |
| Inpainting | Filling in masked center of image. |
| Jigsaw | Putting scrambled image pieces back together. |
| 2D Keypoints | Keypoint estimation from RGB-only (texture features). |
| 3D Keypoints | 3D Keypoint estimation from underlying scene 3D. |
| Camera Pose (Nonfixated) | Relative camera pose with distinct optical centers. |
| Surface Normals | Pixel-wise surface normals. |
| Point Matching | Classifying if centers of two images match or not. |
| Reshading | Reshading with new lighting placed at camera location. |
| Room Layout | Orientation and aspect ratio of cubic room layout. |
| Semantic Segmentation | Pixel-wise semantic labeling (via knowledge distillation from MS COCO). |
| Unsupervised 2.5D Segmentation | Segmentation (graph cut approximation) on RGB-D-Normals-Curvature image. |
| Unsupervised 2D Segmentation | Segmentation (graph cut approximation) on RGB. |
| Vanishing Point | Three Manhattan-world vanishing points. |
| Random Weights | Taskonomy architecture randomly initialized. |
| CaIT-S24 | CaIT-S24 trained on image classification with the ImageNet dataset. |
| CoaT-Lite-Mini | CoaT-Lite-Mini trained on image classification with the ImageNet dataset. |
| ConViT-B | ConViT-B trained on image classification with the ImageNet dataset. |
| ConViT-S | ConViT-S trained on image classification with the ImageNet dataset. |
| CSP-DarkNet53 | CSP-DarkNet53 trained on image classification with the ImageNet dataset. |
| CSP-ResNet50 | CSP-ResNet50 trained on image classification with the ImageNet dataset. |
| DLA34 | DLA34 trained on image classification with the ImageNet dataset. |
| DLA169 | DLA169 trained on image classification with the ImageNet dataset. |
| ECA-NFNeT-L0 | ECA-NFNeT-L0 trained on image classification with the ImageNet dataset. |
| ECA-NFNeT-L1 | ECA-NFNeT-L1 trained on image classification with the ImageNet dataset. |
| ECA-Resnet50-D | ECA-Resnet50-D trained on image classification with the ImageNet dataset. |
| ECA-Resnet101-D | ECA-Resnet101-D trained on image classification with the ImageNet dataset. |
| EfficientNet-V2-S | EfficientNet-V2-S trained on image classification with the ImageNet dataset. |
| FBNetC100 | FBNetC100 trained on image classification with the ImageNet dataset. |
| GerNet-L | GerNet-L trained on image classification with the ImageNet dataset. |
| GerNet-S | GerNet-S trained on image classification with the ImageNet dataset. |
| GhostNet100 | GhostNet100 trained on image classification with the ImageNet dataset. |
| HardCoreNAS-A | HardCoreNAS-A trained on image classification with the ImageNet dataset. |
| HardCoreNAS-F | HardCoreNAS-F trained on image classification with the ImageNet dataset. |
| LeViT128 | LeViT128 trained on image classification with the ImageNet dataset. |
| LeViT256 | LeViT256 trained on image classification with the ImageNet dataset. |
| Inception-Resnet-V2 | Inception-Resnet-V2 trained on image classification with the ImageNet dataset. |
| Inception-V3 | Inception-V3 trained on image classification with the ImageNet dataset. |
| Inception-V4 | Inception-V4 trained on image classification with the ImageNet dataset. |
| MLP-Mixer-B16 | MLP-Mixer-B16 trained on image classification with the ImageNet dataset. |
| MLP-Mixer-L16 | MLP-Mixer-L16 trained on image classification with the ImageNet dataset. |
| MixNet-L | MixNet-L trained on image classification with the ImageNet dataset. |
| MixNet-S | MixNet-S trained on image classification with the ImageNet dataset. |
| MNASNet100 | MNASNet100 trained on image classification with the ImageNet dataset. |
| MobileNet-V3 | MobileNet-V3 trained on image classification with the ImageNet dataset. |
| NASNet-A-Large | NASNet-A-Large trained on image classification with the ImageNet dataset. |
| NF-ResNet50 | NF-ResNet50 trained on image classification with the ImageNet dataset. |
| NF-Net-L0 | NF-Net-L0 trained on image classification with the ImageNet dataset. |
| PNASNet-5-Large | PNASNet-5-Large trained on image classification with the ImageNet dataset. |
| RegNetX-64 | RegNetX-64 trained on image classification with the ImageNet dataset. |
| RegNetY-64 | RegNetY-64 trained on image classification with the ImageNet dataset. |
| RepVGG-B3 | RepVGG-B3 trained on image classification with the ImageNet dataset. |
| RepVGG-B3G4 | RepVGG-B3G4 trained on image classification with the ImageNet dataset. |
| Res2Net50-26W-4S | Res2Net50-26W-4S trained on image classification with the ImageNet dataset. |
| ResNest50D | ResNest50D trained on image classification with the ImageNet dataset. |
| ResNetRS50 | ResNetRS50 trained on image classification with the ImageNet dataset. |
| RexNet100 | RexNet100 trained on image classification with the ImageNet dataset. |
| SemNASNet100 | SemNASNet100 trained on image classification with the ImageNet dataset. |
| SEResNet152D | SEResNet152D trained on image classification with the ImageNet dataset. |
| SEResNext50-32x4D | SEResNext50-32x4D trained on image classification with the ImageNet dataset. |
| SKResNet18 | SKResNet18 trained on image classification with the ImageNet dataset. |
| SKResNext50-32x4D | SKResNext50-32x4D trained on image classification with the ImageNet dataset. |
| SPNasNet100 | SPNasNet100 trained on image classification with the ImageNet dataset. |
| Swin-B-P4-W7-224 | Swin-B-P4-W7-224 trained on image classification with the ImageNet dataset. |
| Swin-L-P4-W7-224 | Swin-L-P4-W7-224 trained on image classification with the ImageNet dataset. |
| Swin-S-P4-W7-224 | Swin-S-P4-W7-224 trained on image classification with the ImageNet dataset. |
| EfficientNet-B1 | EfficientNet-B1 trained on image classification with the ImageNet dataset. |
| EfficientNet-B3 | EfficientNet-B3 trained on image classification with the ImageNet dataset. |
| EfficientNet-B5 | EfficientNet-B5 trained on image classification with the ImageNet dataset. |
| EfficientNet-B7 | EfficientNet-B7 trained on image classification with the ImageNet dataset. |
| Visformer | Visformer trained on image classification with the ImageNet dataset. |
| ViT-L-P16-224 | ViT-L-P16-224 trained on image classification with the ImageNet dataset. |
| ViT-S-P16-224 | ViT-S-P16-224 trained on image classification with the ImageNet dataset. |
| ViT-B-P16-224 | ViT-B-P16-224 trained on image classification with the ImageNet dataset. |
| XCeption | XCeption trained on image classification with the ImageNet dataset. |
| XCeption65 | XCeption65 trained on image classification with the ImageNet dataset. |
| ResNet50-JigSaw-P100 | ResNet50-JigSaw-P100 trained via self supervision with the ImageNet dataset. |
| ResNet50-JigSaw-Goyal19 | ResNet50-JigSaw-Goyal19 trained via self supervision with the ImageNet dataset. |
| ResNet50-RotNet | ResNet50-RotNet trained via self supervision with the ImageNet dataset. |
| ResNet50-ClusterFit-16K-RotNet | ResNet50-ClusterFit-16K-RotNet trained via self supervision with the ImageNet dataset. |
| ResNet50-NPID-4KNegative | ResNet50-NPID-4KNegative trained via self supervision with the ImageNet dataset. |
| ResNet50-PIRL | ResNet50-PIRL trained via self supervision with the ImageNet dataset. |
| ResNet50-SimCLR | ResNet50-SimCLR trained via self supervision with the ImageNet dataset. |
| ResNet50-DeepClusterV2 | ResNet50-DeepClusterV2-2x224 trained via self supervision with the ImageNet dataset. |
| ResNet50-DeepClusterV2 | ResNet50-DeepClusterV2-2x224+6x96 trained via self supervision with the ImageNet dataset. |
| ResNet50-SwAV-BS4096 | ResNet50-SwAV-BS4096-2x224 trained via self supervision with the ImageNet dataset. |
| ResNet50-SwAV-BS4096 | ResNet50-SwAV-BS4096-2x224+6x96 trained via self supervision with the ImageNet dataset. |
| ResNet50-MoCoV2-BS256 | ResNet50-MoCoV2-BS256 trained via self supervision with the ImageNet dataset. |
| ResNet50-BarlowTwins-BS2048 | ResNet50-BarlowTwins-BS2048 trained via self supervision with the ImageNet dataset. |
| Dino-VIT-S16 | Dino-VIT-S16 trained via self supervision with the ImageNet dataset. |
| Dino-VIT-S8 | Dino-VIT-S8 trained via self supervision with the ImageNet dataset. |
| Dino-VIT-B16 | Dino-VIT-B16 trained via self supervision with the ImageNet dataset. |
| Dino-VIT-B8 | Dino-VIT-B8 trained via self supervision with the ImageNet dataset. |
| Dino-XCIT-S12-P16 | Dino-XCIT-S12-P16 trained via self supervision with the ImageNet dataset. |
| Dino-XCIT-S12-P8 | Dino-XCIT-S12-P8 trained via self supervision with the ImageNet dataset. |
| Dino-XCIT-M24-P16 | Dino-XCIT-M24-P16 trained via self supervision with the ImageNet dataset. |
| Dino-XCIT-M24-P8 | Dino-XCIT-M24-P8 trained via self supervision with the ImageNet dataset. |
| Dino-ResNet50 | Dino-ResNet50 trained via self supervision with the ImageNet dataset. |
Equipped with our neural data and models, we then employ two distinct methods for mapping the responses of our biological neurons to the responses of our artificial neurons. The first, often called classic representational similarity analysis, is designed to assess representational structure (sometimes referred to as representational geometry) at the level of neural populations – in our case, the neural populations of 6 different visual cortical areas. The key component of representational similarity analysis is the representational (dis)similarity matrix (RDM), a distance matrix computed by taking the pairwise distance (1 - Pearson correlation coefficient, in our case) of each stimulus to every other stimulus across all neural responses in the target population. A given model’s neural predictivity score in this classic representational similarity analysis is simply an average second-order distance (in our case, another 1 - Pearson correlation coefficient) between the RDM of its maximally correspondent layer and each of the cortical RDMs in our sample. Note that this kind of classic representational similarity analysis is a nonparametric mapping, and requires no fits or transformations – just an emergent similarity in how stimuli are organized across the responses of the two neural populations (one artificial, one biological) being compared.
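The RDM computation and the second-order comparison can be sketched in a few lines of numpy. The responses here are hypothetical placeholders; only the upper triangle of each RDM is compared, since the diagonal is zero by construction.

```python
import numpy as np

def rdm(responses):
    # responses: (n_stimuli, n_neurons). np.corrcoef treats rows as
    # variables, giving the stimulus-by-stimulus Pearson matrix, so
    # 1 - r is the pairwise dissimilarity between stimuli.
    return 1.0 - np.corrcoef(responses)

def rsa_similarity(model_responses, brain_responses):
    # Second-order comparison: correlate the off-diagonal (upper-triangle)
    # entries of the two RDMs.
    iu = np.triu_indices(model_responses.shape[0], k=1)
    m = rdm(model_responses)[iu]
    b = rdm(brain_responses)[iu]
    return np.corrcoef(m, b)[0, 1]

rng = np.random.default_rng(0)
# Hypothetical responses of a model layer and a cortical area to 119 images.
layer_responses = rng.normal(size=(119, 512))
area_responses = rng.normal(size=(119, 300))
score = rsa_similarity(layer_responses, area_responses)
```

Note that the two populations need not have the same number of neurons: each RDM is 119 × 119 regardless, which is what lets us compare a 512-unit model layer to a 300-neuron cortical area with no fitting at all.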
Rather than target an entire neural population simultaneously, we can also more closely scrutinize individual neural responses using a method broadly called neural encoding or neural regression. With this method, we take the artificial neural responses of our model as the set of predictors in a regression where we try to predict (always with some form of cross-validation) the responses of a biological neuron to images not included in the regression. What we’re effectively doing in this method is mixing and matching our set of artificial neurons (often with some sort of dimensionality reduction along the way) to approximate the representational profile of a single biological neuron. The better suited those artificial neurons are to this mixing and matching (which we measure with a correlation between the neural responses predicted by the regression and the actual responses of a target neuron), the higher the score of the model that hosts them. (A schematic of our neural regression method can be seen in the figure below.)
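A minimal version of this cross-validated regression can be sketched with scikit-learn. The PCA-plus-ridge combination here is one illustrative choice of dimensionality reduction and regularized linear map, not necessarily the exact pipeline used in the paper, and the data are random placeholders.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
# Hypothetical data: one model layer's features and one biological
# neuron's mean responses to the same 119 images.
X = rng.normal(size=(119, 2048))  # artificial neurons (predictors)
y = rng.normal(size=119)          # one biological neuron (target)

fold_scores = []
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    # Dimensionality reduction is fit on the training images only, so no
    # information about the held-out images leaks into the mapping.
    pca = PCA(n_components=40).fit(X[train])
    reg = Ridge(alpha=1.0).fit(pca.transform(X[train]), y[train])
    pred = reg.predict(pca.transform(X[test]))
    # Score each fold by correlating predicted and actual responses.
    fold_scores.append(np.corrcoef(pred, y[test])[0, 1])

predictivity = float(np.mean(fold_scores))
```

Repeating this per neuron and per layer, then aggregating the held-out correlations, yields the regression-based predictivity scores we report for each model.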
Combining our neural data, models, and mapping methods, an intuitive first result we obtain is a large set of model rankings, which in the plots below we’ve organized broadly into one of three categories.