Dataset#

We will start exploring the parameters of the dataset to learn what data is available.

To access the dataset, we use the AllenSDK and its BrainObservatoryCache. The key setup step is to provide a manifest file, which the SDK uses to track what data is available and to organize the files it downloads. If you instantiate the BrainObservatoryCache without providing a manifest file, it will create one in your working directory.

from allensdk.core.brain_observatory_cache import BrainObservatoryCache
manifest_file = '../../../data/allen-brain-observatory/visual-coding-2p/manifest.json'
boc = BrainObservatoryCache(manifest_file=manifest_file)

We can use the BrainObservatoryCache to explore the parameters of the dataset.

Targeted structures#

What brain regions were recorded across the dataset? To determine this we use a function called get_all_targeted_structures() to create a list of the regions.

boc.get_all_targeted_structures()

['VISal', 'VISam', 'VISl', 'VISp', 'VISpm', 'VISrl']

We see that data was collected in six different visual areas. VISp is the primary visual cortex, also known as V1. The others are higher visual areas (HVAs) that surround VISp. You can learn more about these areas and how we map them here.

Cre lines and reporters#

We used Cre lines to drive the expression of GCaMP6 in specific populations of neurons. A similar function, get_all_cre_lines(), lists all the Cre lines used in this dataset:

boc.get_all_cre_lines()

['Cux2-CreERT2',
 'Emx1-IRES-Cre',
 'Fezf2-CreER',
 'Nr5a1-Cre',
 'Ntsr1-Cre_GN220',
 'Pvalb-IRES-Cre',
 'Rbp4-Cre_KL100',
 'Rorb-IRES2-Cre',
 'Scnn1a-Tg3-Cre',
 'Slc17a7-IRES2-Cre',
 'Sst-IRES-Cre',
 'Tlx3-Cre_PL56',
 'Vip-IRES-Cre']

Cre is a driver that drives the expression of a reporter. get_all_reporter_lines() returns five entries for this dataset, two of which are variants of Ai93.

boc.get_all_reporter_lines()

['Ai148(TIT2L-GC6f-ICL-tTA2)',
 'Ai162(TIT2L-GC6s-ICL-tTA2)',
 'Ai93(TITL-GCaMP6f)',
 'Ai93(TITL-GCaMP6f)-hyg',
 'Ai94(TITL-GCaMP6s)']

Note

Reporter lines: All the experiments in this dataset use GCaMP6. The large majority use GCaMP6f and only a few use GCaMP6s. However, you see five entries listed here, covering four reporters (Ai93 appears both with and without a -hyg suffix). Why is this? Ai93 is the GCaMP6f reporter we used with the excitatory Cre lines. However, this reporter does not work well for inhibitory Cre lines. We used Ai148, another GCaMP6f reporter, with Vip-IRES-Cre and Sst-IRES-Cre. However, this didn’t work with Pvalb-IRES-Cre, so we used Ai162, a GCaMP6s reporter, with Pvalb. Additionally, to have a GCaMP6f vs GCaMP6s comparison, we collected a small number of experiments using Ai94 with Slc17a7-IRES2-Cre. Ai94 is a GCaMP6s reporter that complements Ai93. Slc17a7-IRES2-Cre is the only Cre line that was recorded using multiple reporter types.
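To make the GCaMP6f vs GCaMP6s split concrete, here is a minimal sketch that reads the indicator variant out of each reporter name. It assumes the variant is always spelled "GCaMP6f"/"GCaMP6s" or abbreviated "GC6f"/"GC6s" inside the name, which holds for the five entries listed above but is not something the SDK guarantees. The helper indicator_for is our own, not part of the AllenSDK.

```python
import re

# Reporter names copied from the get_all_reporter_lines() output above.
reporter_lines = [
    "Ai148(TIT2L-GC6f-ICL-tTA2)",
    "Ai162(TIT2L-GC6s-ICL-tTA2)",
    "Ai93(TITL-GCaMP6f)",
    "Ai93(TITL-GCaMP6f)-hyg",
    "Ai94(TITL-GCaMP6s)",
]

# Assumption: the variant appears as "GCaMP6f"/"GCaMP6s" or "GC6f"/"GC6s".
INDICATOR_PATTERN = re.compile(r"(?:GCaMP6|GC6)([fs])")


def indicator_for(reporter_line):
    """Return 'GCaMP6f' or 'GCaMP6s' for a reporter name, or None if unrecognized."""
    match = INDICATOR_PATTERN.search(reporter_line)
    return f"GCaMP6{match.group(1)}" if match else None


# Map every reporter name to its indicator variant and print the result.
indicators = {name: indicator_for(name) for name in reporter_lines}
for name, indicator in indicators.items():
    print(f"{name:30s} {indicator}")
```

Grouping experiments by this value, rather than by the raw reporter name, is one way to compare GCaMP6f and GCaMP6s data without treating the two Ai93 entries as different indicators.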

See Transgenic tools to learn more about these Cre lines and reporters.

Imaging depths#

Each experiment was collected at a single imaging depth.

boc.get_all_imaging_depths()

These values are in µm below the surface of the cortex. This is a long list and some of the values don’t differ by very much. How meaningful is it? We roughly consider depths less than 250 µm to be layer 2/3, 250-350 µm to be layer 4, 350-500 µm to be layer 5, and over 500 µm to be layer 6. Keep in mind that much of the imaging here was done with layer-specific Cre lines, so for most purposes the best way to get layer specificity is to select appropriate Cre lines.
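The rough depth-to-layer rule above can be sketched as a small helper. The function name depth_to_layer is our own, and the handling of the boundary values is an assumption: the text leaves 350 µm ambiguous, so here 350 µm is counted as layer 5, and 500 µm also stays in layer 5 because only depths over 500 µm are called layer 6.

```python
def depth_to_layer(depth_um):
    """Map an imaging depth in µm to an approximate cortical layer label.

    Assumption: boundary depths go to the deeper layer at 250 and 350 µm,
    and exactly 500 µm is still counted as layer 5.
    """
    if depth_um < 250:
        return "layer 2/3"
    if depth_um < 350:
        return "layer 4"
    if depth_um <= 500:
        return "layer 5"
    return "layer 6"


# 175 and 275 appear in this dataset; 375 and 550 are illustrative values only.
for depth in (175, 275, 375, 550):
    print(depth, depth_to_layer(depth))
```

This is only a coarse guide. As noted above, selecting a layer-specific Cre line is usually the more reliable way to target a layer.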

Visual stimuli#

What were the visual stimuli that we showed to the mice?

boc.get_all_stimuli()

['drifting_gratings',
 'locally_sparse_noise',
 'locally_sparse_noise_4deg',
 'locally_sparse_noise_8deg',
 'natural_movie_one',
 'natural_movie_three',
 'natural_movie_two',
 'natural_scenes',
 'spontaneous',
 'static_gratings']

These are described more extensively in Visual stimuli.

Experiment containers & sessions#

The experiment container describes a set of three imaging sessions performed for the same field of view (i.e., the same targeted structure and imaging depth in the same mouse, targeting the same set of neurons). Each experiment container has a unique ID number.

We will identify all the experiment containers for a given structure and Cre line:

visual_area = 'VISp'
cre_line = 'Cux2-CreERT2'

exps = boc.get_experiment_containers(targeted_structures=[visual_area], cre_lines=[cre_line])

Note

get_experiment_containers returns all experiment containers that meet the conditions we have specified. The parameters that we could pass this function include targeted_structures, imaging_depths, cre_lines, reporter_lines, stimuli, session_types, and cell_specimen_ids. If we don’t pass any parameters, it returns all experiment containers.

We can make a dataframe of the list of experiment containers to see what information we get about them:

import pandas as pd

pd.DataFrame(exps)

	id	imaging_depth	targeted_structure	cre_line	reporter_line	donor_name	specimen_name	tags	failed
0	511510736	175	VISp	Cux2-CreERT2	Ai93(TITL-GCaMP6f)	222426	Cux2-CreERT2;Camk2a-tTA;Ai93-222426	[]	False
1	511510855	175	VISp	Cux2-CreERT2	Ai93(TITL-GCaMP6f)	229106	Cux2-CreERT2;Camk2a-tTA;Ai93-229106	[]	False
2	511509529	175	VISp	Cux2-CreERT2	Ai93(TITL-GCaMP6f)	222420	Cux2-CreERT2;Camk2a-tTA;Ai93-222420	[]	False
3	511507650	175	VISp	Cux2-CreERT2	Ai93(TITL-GCaMP6f)	222424	Cux2-CreERT2;Camk2a-tTA;Ai93-222424	[]	False
4	702934962	275	VISp	Cux2-CreERT2	Ai93(TITL-GCaMP6f)	382421	Cux2-CreERT2;Camk2a-tTA;Ai93-382421	[]	False
5	645413757	275	VISp	Cux2-CreERT2	Ai93(TITL-GCaMP6f)	348262	Cux2-CreERT2;Camk2a-tTA;Ai93-348262	[]	False
6	659767480	275	VISp	Cux2-CreERT2	Ai93(TITL-GCaMP6f)	360565	Cux2-CreERT2;Camk2a-tTA;Ai93-360565	[]	False
7	511510650	175	VISp	Cux2-CreERT2	Ai93(TITL-GCaMP6f)	222425	Cux2-CreERT2;Camk2a-tTA;Ai93-222425	[]	False
8	712178509	275	VISp	Cux2-CreERT2	Ai93(TITL-GCaMP6f)	390323	Cux2-CreERT2;Camk2a-tTA;Ai93-390323	[]	False
9	511510667	275	VISp	Cux2-CreERT2	Ai93(TITL-GCaMP6f)	222420	Cux2-CreERT2;Camk2a-tTA;Ai93-222420	[]	False
10	524691282	275	VISp	Cux2-CreERT2	Ai93(TITL-GCaMP6f)	243293	Cux2-CreERT2;Camk2a-tTA;Ai93-243293	[]	False
11	701412138	175	VISp	Cux2-CreERT2	Ai93(TITL-GCaMP6f)	382421	Cux2-CreERT2;Camk2a-tTA;Ai93-382421	[]	False
12	511510718	175	VISp	Cux2-CreERT2	Ai93(TITL-GCaMP6f)	231584	Cux2-CreERT2;Camk2a-tTA;Ai93-231584	[]	False
13	511510699	275	VISp	Cux2-CreERT2	Ai93(TITL-GCaMP6f)	225037	Cux2-CreERT2;Camk2a-tTA;Ai93-225037	[]	False
14	511510779	275	VISp	Cux2-CreERT2	Ai93(TITL-GCaMP6f)	225036	Cux2-CreERT2;Camk2a-tTA;Ai93-225036	[]	False
15	511510670	175	VISp	Cux2-CreERT2	Ai93(TITL-GCaMP6f)	225037	Cux2-CreERT2;Camk2a-tTA;Ai93-225037	[]	False

id: The experiment container id
imaging_depth: The imaging depth that data was acquired at, in µm from the surface of cortex.
targeted_structure: The brain structure that was imaged in this experiment container.
cre_line: The Cre line that the mouse has.
reporter_line: The reporter line that the mouse has.
donor_name: The id of the mouse that was imaged.
specimen_name: The full name of the mouse including its genotype and donor name.
tags: A list of tags
failed: Boolean indicating whether the experiment container failed QC. By default, only containers that pass QC are returned; failed experiment containers must be requested explicitly.

You see there are 16 experiment containers for this Cre line in VISp. They all have different experiment container ids (called “id” here) and they mostly have different donor names, which identify the mouse that was imaged. This Cre line was imaged at two different imaging depths, sampling both layer 2/3 and layer 4. But they all have the same Cre line, targeted structure and reporter line.
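The observations above can be checked with a short tally. This sketch does not call the SDK: it uses the id, imaging_depth and donor_name values copied from the table above, stored as a list of dicts with the same keys as the table's column names, so it runs without downloading anything.

```python
from collections import Counter

# (id, imaging_depth in µm, donor_name) copied from the table above.
rows = [
    (511510736, 175, "222426"), (511510855, 175, "229106"),
    (511509529, 175, "222420"), (511507650, 175, "222424"),
    (702934962, 275, "382421"), (645413757, 275, "348262"),
    (659767480, 275, "360565"), (511510650, 175, "222425"),
    (712178509, 275, "390323"), (511510667, 275, "222420"),
    (524691282, 275, "243293"), (701412138, 175, "382421"),
    (511510718, 175, "231584"), (511510699, 275, "225037"),
    (511510779, 275, "225036"), (511510670, 175, "225037"),
]
containers = [dict(zip(("id", "imaging_depth", "donor_name"), row)) for row in rows]

# How many containers were imaged at each depth?
depth_counts = Counter(c["imaging_depth"] for c in containers)

# Which mice contributed more than one container?
donor_counts = Counter(c["donor_name"] for c in containers)
repeated_donors = sorted(d for d, n in donor_counts.items() if n > 1)

print(dict(depth_counts))   # {175: 8, 275: 8}
print(len(donor_counts))    # 13
print(repeated_donors)      # ['222420', '225037', '382421']
```

The 16 containers split evenly between the two depths and come from 13 mice, three of which were imaged at both 175 µm and 275 µm.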

Exercise: How many experiment containers were collected in each cortical visual area for each Cre line?#

cre_lines = boc.get_all_cre_lines()
areas = boc.get_all_targeted_structures()
# One row per Cre line, one column per visual area
df = pd.DataFrame(columns=areas, index=cre_lines)
for cre in cre_lines:
    for area in areas:
        exps = boc.get_experiment_containers(targeted_structures=[area], cre_lines=[cre])
        # Assign with a single .loc call; chained df[area].loc[cre] assignment is unreliable
        df.loc[cre, area] = len(exps)
df

	VISal	VISam	VISl	VISp	VISpm	VISrl
Cux2-CreERT2	13	11	11	16	13	12
Emx1-IRES-Cre	7	3	8	10	4	9
Fezf2-CreER	0	0	5	4	0	0
Nr5a1-Cre	6	6	6	8	7	6
Ntsr1-Cre_GN220	0	0	7	6	5	0
Pvalb-IRES-Cre	0	0	5	16	0	0
Rbp4-Cre_KL100	6	8	7	7	6	4
Rorb-IRES2-Cre	6	8	6	8	7	5
Scnn1a-Tg3-Cre	0	0	0	9	0	0
Slc17a7-IRES2-Cre	2	2	19	60	15	2
Sst-IRES-Cre	1	0	17	30	14	2
Tlx3-Cre_PL56	0	0	3	6	0	0
Vip-IRES-Cre	0	0	24	36	16	0

You see that not all Cre lines were imaged in all areas.
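To see which Cre lines do cover every area, we can scan the table for rows with no zeros. The sketch below hard-codes the counts from the output above rather than querying the SDK, so the numbers reflect that output and may differ in other data releases.

```python
areas = ["VISal", "VISam", "VISl", "VISp", "VISpm", "VISrl"]

# Container counts per area, copied row by row from the table above.
counts = {
    "Cux2-CreERT2":      [13, 11, 11, 16, 13, 12],
    "Emx1-IRES-Cre":     [7, 3, 8, 10, 4, 9],
    "Fezf2-CreER":       [0, 0, 5, 4, 0, 0],
    "Nr5a1-Cre":         [6, 6, 6, 8, 7, 6],
    "Ntsr1-Cre_GN220":   [0, 0, 7, 6, 5, 0],
    "Pvalb-IRES-Cre":    [0, 0, 5, 16, 0, 0],
    "Rbp4-Cre_KL100":    [6, 8, 7, 7, 6, 4],
    "Rorb-IRES2-Cre":    [6, 8, 6, 8, 7, 5],
    "Scnn1a-Tg3-Cre":    [0, 0, 0, 9, 0, 0],
    "Slc17a7-IRES2-Cre": [2, 2, 19, 60, 15, 2],
    "Sst-IRES-Cre":      [1, 0, 17, 30, 14, 2],
    "Tlx3-Cre_PL56":     [0, 0, 3, 6, 0, 0],
    "Vip-IRES-Cre":      [0, 0, 24, 36, 16, 0],
}

# Cre lines with at least one container in every area.
imaged_everywhere = [cre for cre, row in counts.items() if all(n > 0 for n in row)]

# For each Cre line, the areas in which it was imaged at all.
areas_per_cre = {
    cre: [area for area, n in zip(areas, row) if n > 0]
    for cre, row in counts.items()
}

print(imaged_everywhere)
print(areas_per_cre["Scnn1a-Tg3-Cre"])  # ['VISp']
```

Six of the thirteen Cre lines have containers in all six areas, while every Cre line was imaged in VISp.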

Session types#

The responses to this full set of visual stimuli were recorded across three imaging sessions. We returned to the same targeted structure and same imaging depth in the same mouse to record the same group of neurons across three different days.

Let’s look at all of the sessions in a single experiment container.

experiment_container_id = 511510736
sessions = boc.get_ophys_experiments(experiment_container_ids=[experiment_container_id])

Note

Much like get_experiment_containers, get_ophys_experiments returns all experiment sessions that meet the conditions we have specified. The parameters that we could pass this function include targeted_structures, imaging_depths, cre_lines, reporter_lines, stimuli, session_types, experiment_container_ids, and cell_specimen_ids. If we don’t pass any parameters, it returns all experiment sessions.

Let’s look at a DataFrame of the results:

pd.DataFrame(sessions)

	id	imaging_depth	targeted_structure	cre_line	reporter_line	acquisition_age_days	experiment_container_id	session_type	donor_name	specimen_name	fail_eye_tracking
0	501559087	175	VISp	Cux2-CreERT2	Ai93(TITL-GCaMP6f)	103	511510736	three_session_B	222426	Cux2-CreERT2;Camk2a-tTA;Ai93-222426	True
1	501704220	175	VISp	Cux2-CreERT2	Ai93(TITL-GCaMP6f)	104	511510736	three_session_A	222426	Cux2-CreERT2;Camk2a-tTA;Ai93-222426	True
2	501474098	175	VISp	Cux2-CreERT2	Ai93(TITL-GCaMP6f)	102	511510736	three_session_C	222426	Cux2-CreERT2;Camk2a-tTA;Ai93-222426	True

id: The session id for the session. This is the id that is used to access data for that session.
imaging_depth: The imaging depth that data was acquired at, in µm from the surface of cortex.
targeted_structure: The brain structure that was imaged in this session.
cre_line: The Cre line that the mouse has.
reporter_line: The reporter line that the mouse has.
acquisition_age_days: The age of the mouse when this session was recorded, in days.
experiment_container_id: The id of the experiment container that this session belongs to.
session_type: The name of the session, which describes the set of stimuli that are presented during the experiment.
donor_name: The id of the mouse that was imaged.
specimen_name: The full name of the mouse including its genotype and donor name.
fail_eye_tracking: Boolean flag for eye tracking; as the field name suggests, True marks sessions where eye tracking failed. This might be obsolete.

When looking at all of the sessions in a single experiment container, as we have done above, you will notice that the experiment container id, Cre line, reporter line, donor name, specimen name, imaging depth, and targeted structure are all the same, while the id, acquisition age, and session type differ.
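A quick way to confirm which fields vary within a container is to compare the values of each key across the session records. This sketch uses values copied from the table above (a subset of the columns, to keep it short) instead of the live sessions list, and also orders the sessions by the age of the mouse at acquisition.

```python
# Fields shared by all three sessions, copied from the table above.
shared = {
    "imaging_depth": 175,
    "targeted_structure": "VISp",
    "cre_line": "Cux2-CreERT2",
    "reporter_line": "Ai93(TITL-GCaMP6f)",
    "experiment_container_id": 511510736,
    "donor_name": "222426",
}

# Per-session fields: id, acquisition age in days, and session type.
sessions = [
    dict(shared, id=501559087, acquisition_age_days=103, session_type="three_session_B"),
    dict(shared, id=501704220, acquisition_age_days=104, session_type="three_session_A"),
    dict(shared, id=501474098, acquisition_age_days=102, session_type="three_session_C"),
]

# A key varies if its values are not identical across the sessions.
varying = sorted(
    key for key in sessions[0]
    if len({session[key] for session in sessions}) > 1
)
print(varying)  # ['acquisition_age_days', 'id', 'session_type']

# Order the session types by the age of the mouse at acquisition.
by_age = sorted(sessions, key=lambda session: session["acquisition_age_days"])
acquisition_order = [session["session_type"] for session in by_age]
print(acquisition_order)  # ['three_session_C', 'three_session_B', 'three_session_A']
```

For this container the sessions were not acquired in alphabetical order: session C was recorded first, then B, then A.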

As you see, each experiment container has three different session types. For the data published in June 2016 and October 2016, the last session is three_session_C, while the data published after this were collected using three_session_C2. The key difference between these sessions is a change in the locally sparse noise stimulus. This is described more here.

Cell specimen ids#

During data processing, we matched identified ROIs across each of the sessions within experiment containers. Approximately one third of the neurons in the dataset were matched across all three sessions, one third were matched in two of the three sessions, and one third were only found in one session. Neurons have unique ids, called cell_specimen_ids, that are shared across the sessions they are found in.
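Because cell_specimen_ids are shared across sessions, cross-session matching reduces to set operations on the ids found in each session. The sketch below uses small made-up ids purely for illustration; real cell_specimen_ids are large integers read from each session's data, and the variable names here are our own.

```python
from collections import Counter

# Hypothetical cell_specimen_ids per session (illustrative values only).
cells_by_session = {
    "three_session_A": {1, 2, 3, 4},
    "three_session_B": {1, 2, 5},
    "three_session_C": {1, 3, 6},
}

# Number of sessions in which each cell was found.
sessions_per_cell = Counter(
    cell for cells in cells_by_session.values() for cell in cells
)

# Cells matched across all three sessions.
matched_in_all = set.intersection(*cells_by_session.values())

# How many cells were found in exactly 1, 2, or 3 sessions.
match_histogram = Counter(sessions_per_cell.values())

print(matched_in_all)         # {1}
print(dict(match_histogram))
```

In this toy example one cell is found in all three sessions, two cells in two sessions, and three cells in a single session.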

Why don’t we always match ROIs across all three sessions for all neurons?

Several factors, biological, experimental, and analytical, could explain why we don’t always match ROIs across all sessions. Biologically, a neuron must be active within a session to be identifiable during segmentation. For various reasons, a neuron might not be active during some sessions while it is active during others. Experimentally, there are challenges to returning to precisely the same field of view. Being at a slightly different depth, or having just a bit of tilt in the imaging plane, might result in some neurons that were in view during one session not being in view during another. Analytically, the methods for identifying ROIs and for matching ROIs from multiple sessions can make mistakes.

We explore how to look at neurons across sessions in Cross session data.

By Allen Institute Summer Workshop on the Dynamic Brain

© Copyright 2024.