Aligning behavioral data to task events with the stimulus and trials tables#

This notebook describes the stimulus presentations table and the trials table and shows how you can use them to align behavioral data like running, licking and pupil measurements to task events. Please note that the VBN project used the same change detection task as the Visual Behavior 2-Photon dataset. Users are encouraged to explore the documentation and example notebooks for that project for additional context.


Import the cache#

import os
from pathlib import Path
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from allensdk.brain_observatory.behavior.behavior_project_cache.\
    behavior_neuropixels_project_cache \
    import VisualBehaviorNeuropixelsProjectCache

%matplotlib inline

Get the sessions table#

# Update this to a valid directory in your filesystem. This is where the data will be stored.
cache_dir = '/root/capsule/data/'

cache = VisualBehaviorNeuropixelsProjectCache.from_local_cache(
            cache_dir=cache_dir, use_static_cache=True)
ecephys_sessions_table = cache.get_ecephys_session_table()

Introduction to the stimulus presentations table#

Every recording session consisted of three major visual stimulus epochs, presented in the following order:

  • An active behavior session during which the mouse performed the change detection task

  • Receptive field mapping and full-field flash stimuli

  • ‘Passive’ replay of the stimulus shown during active behavior, but with the lick spout retracted so the mouse could no longer respond.

Let’s grab a random session and look at the stimulus presentations dataframe to see how these epochs are labeled.

session = cache.get_ecephys_session(
           ecephys_session_id=1065437523)
stimulus_presentations = session.stimulus_presentations
stimulus_presentations.columns
Index(['active', 'color', 'contrast', 'duration', 'end_frame',
       'flashes_since_change', 'image_name', 'is_change', 'is_image_novel',
       'omitted', 'orientation', 'position_x', 'position_y', 'rewarded',
       'spatial_frequency', 'start_frame', 'start_time', 'stimulus_block',
       'stimulus_index', 'stimulus_name', 'end_time', 'temporal_frequency',
       'trials_id'],
      dtype='object')

This table is a record of every stimulus we presented to the mouse over the course of this experiment. The different stimuli are indexed by the ‘stimulus_block’ column. Let’s group this dataframe by stimulus block and see what stimulus was shown for each block.

stimulus_presentations.groupby('stimulus_block')[['stimulus_block',
                                                'stimulus_name',
                                                'active',
                                                'duration',
                                                'start_time']].head(1)
                           stimulus_block                                stimulus_name  active    duration   start_time
stimulus_presentations_id
0                                       0  Natural_Images_Lum_Matched_set_ophys_G_2019    True    0.250188    28.131464
4797                                    1                                  spontaneous   False   10.008420  3648.207579
4798                                    2                           gabor_20_deg_250ms   False    0.250208  3658.215999
8443                                    3                                  spontaneous   False  288.992998  4570.232761
8444                                    4                                  flash_250ms   False    0.250201  4859.225759
8594                                    5  Natural_Images_Lum_Matched_set_ophys_G_2019   False    0.250213  5183.198085

This shows us the structure of this experiment (and every experiment in this dataset). There are six stimulus blocks, as follows:

block 0: Change detection task. Natural images are flashed repeatedly and the mouse is rewarded for licking when the identity of the image changes. You can find more info about this task here. Also see here for info about our general training strategy.

block 1: Brief gray screen

block 2: Receptive field mapping. Gabor stimuli used for receptive field mapping. For more details on this stimulus consult this notebook.

block 3: Longer gray screen

block 4: Full-field flashes, shown at 80% contrast. Flashes can be black (color = -1) or white (color = 1).

block 5: Passive replay. Frame-for-frame replay of the stimulus shown during the change detection task (block 0), but now with the lick spout retracted so the animal can no longer engage in the task.
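As a quick illustration (a minimal sketch using the dataframe we loaded above), any one of these epochs can be pulled out by filtering on the ‘stimulus_block’ column:

# for example, select just the receptive field mapping stimuli (block 2)
rf_presentations = stimulus_presentations[stimulus_presentations['stimulus_block']==2]
rf_presentations['stimulus_name'].unique()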

Here’s a quick explanation for each of the columns in this table:

General#

active: Boolean indicating when the change detection task (with the lick spout available to the mouse) was run. This should only be TRUE for block 0.

stimulus_block: Index of stimulus as described in cells above.

stimulus_name: Indicates the stimulus category for this stimulus presentation.

contrast: Stimulus contrast as defined here.

duration: Duration of the stimulus in seconds.

start_time: Experiment time when stimulus started. This value is corrected for display lag and therefore indicates when the stimulus actually appeared on the screen.

end_time: Experiment time when stimulus ended, also corrected for display lag.

start_frame: Stimulus frame index when this stimulus started. This can be used to sync this table to the behavior trials table, for which behavioral data is collected every frame.

end_frame: Stimulus frame index when this stimulus ended.

Change detection task and Passive replay (blocks 0 and 5)#

flashes_since_change: Indicates how many flashes of the same image have occurred since the last stimulus change.

image_name: Indicates which natural image was flashed for this stimulus presentation. To see how to visualize this image, check out this tutorial.

is_change: Indicates whether the image identity changed for this stimulus presentation. When both this value and ‘active’ are TRUE, the mouse was rewarded for licking within the response window.

omitted: Indicates whether the image presentation was omitted for this flash. Most image flashes had a 5% probability of being omitted (producing a gray screen). Flashes immediately preceding a change or immediately following an omission could not be omitted.

rewarded: Indicates whether a reward was given after this image presentation. During the passive replay block (5), this value indicates that a reward was issued for the corresponding image presentation during the active behavior block (0). No rewards were given during passive replay.
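For example, here is a minimal sketch (using only the columns described above) that pulls out the change presentations from the active block, i.e. the flashes on which a lick within the response window triggered a reward:

# changes during active behavior (block 0)
active_changes = stimulus_presentations[
    stimulus_presentations['active'] & stimulus_presentations['is_change']]
active_changes[['image_name', 'start_time', 'rewarded']].head()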

Receptive field mapping gabor stimulus (block 2)#

orientation: Orientation of gabor.

position_x: Position of the gabor along azimuth. The units are in degrees relative to the center of the screen (negative values are nasal).

position_y: Position of the gabor along elevation. Negative values are lower elevation.

spatial_frequency: Spatial frequency of gabor in cycles per degree.

temporal_frequency: Temporal frequency of gabor in Hz.
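As a sketch of how these columns are used, you could tabulate how many times each gabor position was sampled during block 2:

# count presentations at each (azimuth, elevation) position in the RF mapping block
gabors = stimulus_presentations[stimulus_presentations['stimulus_block']==2]
gabors.groupby(['position_x', 'position_y']).size()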

Full field flashes (block 4)#

color: Color of the full-field flash stimuli. “1” is white and “-1” is black.

Let’s confirm that the active behavior block (0) and the passive replay block (5) match frame for frame:

active_image_presentations = stimulus_presentations[stimulus_presentations['stimulus_block']==0]
passive_image_presentations = stimulus_presentations[stimulus_presentations['stimulus_block']==5]
np.all(active_image_presentations['image_name'].values == passive_image_presentations['image_name'].values )
True

Taking block 0 as an example, let’s look at the timing of the stimuli. As a reminder, each flash was presented for 250 ms and was followed by a 500 ms gray period, giving an intended flash-to-flash interval of 750 ms. In addition, flashes were occasionally omitted to investigate expectation signals.

Now let’s check the timing of our flashes and see how it compares to our intended timing:

# get the active behavior part of the stim table
active_behavior = stimulus_presentations[stimulus_presentations['active']==True]

# for now, let's leave out the omitted stimuli
active_behavior_no_omissions = active_behavior[active_behavior['omitted']==False]

# plot histograms of the flash durations and inter-flash intervals
fig, axes = plt.subplots(1, 2)
fig.set_size_inches([10, 4])
_ = axes[0].hist(active_behavior_no_omissions.duration, 50)
axes[0].set_xlabel('Flash Duration (s)')
axes[0].set_ylabel('Count')

inter_flash = active_behavior_no_omissions['start_time'].diff()
_ = axes[1].hist(inter_flash, np.arange(0.7, 1.6, 0.05))
axes[1].set_xlabel('Inter-flash interval (s)')
axes[1].set_xticks(np.arange(0.75, 1.6, 0.25))
[Figure: histograms of flash duration (left) and inter-flash interval (right) for the active behavior block]

Looks like the flash duration and interflash intervals are generally what we expect. Note though that a number of inter-flash intervals are twice as long as expected (1.5 s). This is because a small percentage of flashes are omitted. The chance of omitting a flash is nominally 5%, but we’ve also added two extra criteria:

  • two omissions can’t happen in a row

  • an omission can’t directly precede the change

This makes the actual chances of an omission a bit less than 5%:

# fraction of flashes that were omitted
np.sum(active_behavior.omitted)/len(active_behavior)
0.03418803418803419

Introduction to the behavior trials table#

Now let’s explore the behavior trials table. This table contains lots of useful information about every trial in the change detection task.

trials = session.trials
trials.head()
start_time stop_time initial_image_name change_image_name is_change change_time_no_display_delay go catch lick_times response_time reward_time reward_volume hit false_alarm miss correct_reject aborted auto_rewarded change_frame trial_length
trials_id
0 28.08763 29.05453 im036_r im036_r False NaN False False [28.55387, 28.73684, 29.30404] NaN NaN 0.000 False False False False True False NaN 0.96690
1 29.58829 36.86108 im036_r im078_r True 32.59106 False False [33.04048, 33.20773, 33.30745, 33.3908, 33.507... 33.04048 32.74138 0.005 False False False False False True 330.0 7.27279
2 37.09446 40.78107 im078_r im078_r False NaN False False [40.48052] NaN NaN 0.000 False False False False True False NaN 3.68661
3 40.84754 50.37230 im078_r im111_r True 46.10256 False False [46.73531, 46.83539, 46.95218, 47.06898, 47.20... 46.73531 46.25277 0.005 False False False False False True 1140.0 9.52476
4 50.60569 51.75679 im111_r im111_r False NaN False False [51.43985] NaN NaN 0.000 False False False False True False NaN 1.15110

Unlike the stimulus presentations table, in which every row corresponds to a single visual stimulus presentation, every row of the behavior trials table corresponds to one trial of the change detection task. Here is a quick summary of the columns:

start_time: Experiment time when this trial began in seconds.

stop_time: Experiment time when this trial ended.

initial_image_name: Indicates which image was shown before the change (or sham change) for this trial.

change_image_name: Indicates which image was scheduled to be the change image for this trial. Note that if the trial is aborted, a new trial will begin before this change occurs.

is_change: Indicates whether an image change occurred for this trial.

change_time_no_display_delay: Experiment time when the task-control computer commanded an image change. This change time is used to determine the response window during which a lick will trigger a reward. Note that due to display lag, this is not the time when the change image actually appears on the screen. To get this time, you need the stimulus_presentations table (more about this below).

go: Indicates whether this trial was a ‘go’ trial. To qualify as a go trial, an image change must occur and the trial cannot be autorewarded.

catch: Indicates whether this trial was a ‘catch’ trial. To qualify as a catch trial, a ‘sham’ change must occur during which the image identity does not change. These sham changes are drawn to match the timing distribution of real changes and can be used to calculate the false alarm rate.

lick_times: A list indicating when the behavioral control software recognized a lick. Note that this is not identical to the lick times from the licks dataframe, which record when the licks were registered by the lick sensor. The licks dataframe should generally be used for analysis of the licking behavior rather than these times.

response_time: Indicates the time when the first lick was registered by the task control software for trials that were not aborted (go or catch). NaN for aborted trials. For a more accurate measure of response time, the licks dataframe should be used.

reward_time: Indicates when the reward command was triggered for hit trials. NaN for other trial types.

reward_volume: Indicates the volume of water dispensed as reward for this trial.

hit: Indicates whether this trial was a ‘hit’ trial. To qualify as a hit, the trial must be a go trial during which the stimulus changed and the mouse licked within the reward window (150-750 ms after the change time).

false_alarm: Indicates whether this trial was a ‘false alarm’ trial. To qualify as a false alarm, the trial must be a catch trial during which a sham change occurred and the mouse licked during the reward window.

miss: To qualify as a miss trial, the trial must be a go trial during which the stimulus changed but the mouse did not lick within the response window.

correct_reject: To qualify as a correct reject trial, the trial must be a catch trial during which a sham change occurred and the mouse withheld licking.

aborted: A trial is aborted when the mouse licks before the scheduled change or sham change.

auto_rewarded: During autorewarded trials, the reward is automatically triggered after the change regardless of whether the mouse licked within the response window. These always come at the beginning of the session to help engage the mouse in behavior.

change_frame: Indicates the stimulus frame index when the change (on go trials) or sham change (on catch trials) occurred. This column can be used to link the trials table with the stimulus presentations table, as shown below.

trial_length: Duration of the trial in seconds.
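Before combining tables, here is a minimal sketch of how these columns can summarize behavioral performance. The z-scoring via scipy.stats.norm.ppf is a common signal-detection convention rather than anything mandated by the dataset, and the rates are not corrected for extreme values (a hit rate of exactly 0 or 1 would give an infinite z-score):

from scipy.stats import norm

go_trials = trials[trials['go']]
catch_trials = trials[trials['catch']]

# hit rate on go trials, false alarm rate on catch trials
hit_rate = go_trials['hit'].mean()
false_alarm_rate = catch_trials['false_alarm'].mean()

# d-prime: separation between go and catch response distributions
d_prime = norm.ppf(hit_rate) - norm.ppf(false_alarm_rate)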

Calculating response latency#

Let’s combine info from both of these tables to calculate response latency for this session. Note that the change time in the trials table is not corrected for display lag. This is the time that the task control computer used to determine the response window. However, to calculate response latency, we want to use the display lag corrected change times from the stimulus presentations table. Below, we will grab these corrected times and add them to the trials table under the new column label change_time_with_display_delay.

def get_change_time_from_stim_table(row):
    '''
    Given a particular row in the trials table,
    find the corresponding change time in the
    stimulus presentations table
    '''
    table = stimulus_presentations
    change_frame = row['change_frame']
    if np.isnan(change_frame):
        return np.nan

    change_time = table[table.start_frame==change_frame]\
                    ['start_time'].values[0]

    return change_time

change_times = trials.apply(get_change_time_from_stim_table, axis=1)
trials['change_time_with_display_delay'] = change_times
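The row-wise apply above is easy to read, but for large tables a vectorized lookup is much faster. Here is a sketch that should give the same result; drop_duplicates keeps the first presentation starting on a given frame, mirroring the .values[0] indexing above:

# build a frame -> display-corrected time lookup, then map each change_frame through it
# aborted trials have change_frame NaN and remain NaN after the lookup
frame_to_time = (stimulus_presentations
                 .drop_duplicates('start_frame')
                 .set_index('start_frame')['start_time'])
trials['change_time_with_display_delay'] = trials['change_frame'].map(frame_to_time)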

Now we can use this new column to calculate the response latency on ‘hit’ trials. First, we’ll need to get the lick times for this session:

# get the licks table
licks = session.licks

Then we’ll use these to get the response latency:

# filter for the hit trials
hit_trials = trials[trials['hit']]

# find the time of the first lick after each change
lick_indices = np.searchsorted(licks.timestamps, hit_trials['change_time_with_display_delay'])
first_lick_times = licks.timestamps.values[lick_indices]
response_latencies = first_lick_times - hit_trials['change_time_with_display_delay']

# plot the latencies
fig, ax = plt.subplots()
fig.suptitle('Response Latency Histogram for Hit trials')
ax.hist(response_latencies, bins=np.linspace(-0.1, 0.8, 50))
ax.set_xlabel('Time from change (s)')
ax.set_ylabel('Trial count')
[Figure: response latency histogram for hit trials]
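For a quick numeric summary of the same distribution (what to report is a judgment call; the median is robust to the long tail):

print('median latency: {:.3f} s'.format(np.median(response_latencies)))
print('mean latency: {:.3f} s'.format(np.mean(response_latencies)))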

Aligning Running, Licking and Pupil data to task events#

Now let’s grab the licking, running and pupil tracking data for this session and align it to the behavior.

eye_tracking = session.eye_tracking
running_speed = session.running_speed
licks = session.licks

Eye tracking dataframe: One entry containing ellipse fit parameters for the eye, pupil and corneal reflection for every frame of the eye tracking video stream.

eye_tracking.head()
timestamps cr_area eye_area pupil_area likely_blink pupil_area_raw cr_area_raw eye_area_raw cr_center_x cr_center_y ... eye_center_x eye_center_y eye_width eye_height eye_phi pupil_center_x pupil_center_y pupil_width pupil_height pupil_phi
frame
0 1.98179 NaN NaN NaN True NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 1.99846 NaN NaN NaN True NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 2.01512 NaN NaN NaN True NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 2.03179 NaN NaN NaN True NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 2.04846 NaN NaN NaN True NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

5 rows × 23 columns

There are several rows for which there is no valid data. We can use the ‘likely_blink’ column to filter these out.

eye_tracking_noblinks = eye_tracking[~eye_tracking['likely_blink']]
eye_tracking_noblinks.head()
timestamps cr_area eye_area pupil_area likely_blink pupil_area_raw cr_area_raw eye_area_raw cr_center_x cr_center_y ... eye_center_x eye_center_y eye_width eye_height eye_phi pupil_center_x pupil_center_y pupil_width pupil_height pupil_phi
frame
16 2.69844 143.319910 64100.607811 18344.446003 False 18344.446003 143.319910 64100.607811 307.495920 253.050384 ... 293.215171 252.429377 154.298080 132.236624 -0.045388 296.119363 245.965454 76.414779 75.552596 0.186599
17 2.71512 144.637343 64128.814923 18500.396156 False 18500.396156 144.637343 64128.814923 307.703577 252.995883 ... 293.465116 252.383047 154.610779 132.027248 -0.047398 296.156537 246.179364 76.738901 75.066741 0.071866
18 2.73178 138.430143 64228.144166 18768.526187 False 18768.526187 138.430143 64228.144166 307.929085 253.125091 ... 293.563823 252.333781 154.851436 132.026242 -0.049470 296.457157 245.953752 77.292997 75.030102 0.000100
19 2.74845 140.384821 64356.597629 18790.177151 False 18790.177151 140.384821 64356.597629 307.710460 253.179979 ... 293.503738 252.582299 154.705897 132.414741 -0.049801 296.389647 246.022947 77.337566 75.251635 0.004288
20 2.76511 153.408346 64446.974420 18824.413215 False 18824.413215 153.408346 64446.974420 307.556380 253.339529 ... 293.251817 252.265215 154.720970 132.587775 -0.040922 296.259293 246.003218 77.407989 75.440085 -0.036076

5 rows × 23 columns

Running dataframe: One entry for each read of the analog input line monitoring the encoder voltage, polled at ~60 Hz.

running_speed.head()
timestamps speed
0 27.11889 21.929034
1 27.13549 31.705640
2 27.15263 40.125461
3 27.16893 46.906969
4 27.18556 51.945394

Licking dataframe: One entry for every detected lick onset time.

licks.head()
timestamps frame
0 28.55104 88
1 28.72504 99
2 29.30011 133
3 33.02635 357
4 33.20237 367
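Note that these three streams are each sampled on their own timebase. If you need two of them on a common clock, one simple approach (a sketch using plain np.interp rather than any AllenSDK helper) is to linearly interpolate one signal onto the timestamps of the other:

# interpolate running speed onto the (blink-filtered) eye tracking timestamps
running_on_eye_timebase = np.interp(
    eye_tracking_noblinks['timestamps'],
    running_speed['timestamps'],
    running_speed['speed'])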

Now let’s take a look at running, licking and pupil area around one rewarded trial:

time_before = 3.0  # how much time to plot before the reward
time_after = 3.0  # how much time to plot after the reward
reward_time = session.rewards.iloc[15]['timestamps']  # pick an example reward time

# get running data around this reward
trial_running = running_speed.query('timestamps >= {} and timestamps <= {}'.
                                    format(reward_time-time_before, reward_time+time_after))

# get pupil data around this reward
trial_pupil_area = eye_tracking_noblinks.query('timestamps >= {} and timestamps <= {}'.
                                    format(reward_time-time_before, reward_time+time_after))

# get stimulus presentations around this reward
behavior_presentations = stimulus_presentations[stimulus_presentations['active']]
behavior_presentations = behavior_presentations[behavior_presentations['omitted']==False]
trial_stimuli = behavior_presentations.query('end_time >= {} and start_time <= {}'.
                                             format(reward_time-time_before, reward_time+time_after))

# get licking around this reward
trial_licking = licks.query('timestamps >= {} and timestamps <= {}'.
                                    format(reward_time-time_before, reward_time+time_after))


# plot running, pupil area and licks
fig, axr = plt.subplots()
fig.set_size_inches(14,6)
axr.plot(trial_running['timestamps'], trial_running['speed'], 'k')
axp = axr.twinx()
axp.plot(trial_pupil_area['timestamps'], trial_pupil_area['pupil_area'], 'g')
rew_handle, = axr.plot(reward_time, 0, 'db', markersize=10)
lick_handle, = axr.plot(trial_licking['timestamps'], np.zeros(len(trial_licking['timestamps'])), 'mo')
axr.legend([rew_handle, lick_handle], ['reward', 'licks'])

axr.set_ylabel('running speed (cm/s)')
axp.set_ylabel('pupil area\n$(pixels^2)$')
axr.set_xlabel('Experiment time (s)')

axp.yaxis.label.set_color('g')
axp.spines['right'].set_color('g')
axp.tick_params(axis='y', colors='g')

# plot the image flashes as grey bars
colors = ['0.3', '0.8']
stimulus_colors = {stim: c for stim,c in zip(trial_stimuli['image_name'].unique(), colors)}
for idx, stimulus in trial_stimuli.iterrows():
    axr.axvspan(stimulus['start_time'], stimulus['end_time'], color=stimulus_colors[stimulus['image_name']], alpha=0.5)
[Figure: running speed, pupil area, licks and reward around an example rewarded trial, with image flashes shaded in grey]

Here we can see that just after the stimulus change (a little past 449 seconds), the mouse abruptly stops running and begins licking. The reward is delivered shortly after the first lick. We can also see that, before the change, pupil area and running speed become entrained to the image flashes.
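To quantify that entrainment across the whole session rather than in a single trial, one approach (a sketch, not an AllenSDK recipe) is a flash-triggered average of running speed using the active, non-omitted presentations we selected above:

# average running speed aligned to every flash onset in the active block
flash_times = behavior_presentations['start_time'].values
window = np.arange(-0.25, 0.75, 1/60)  # seconds around flash onset, ~60 Hz steps

aligned_running = np.array([
    np.interp(t + window, running_speed['timestamps'], running_speed['speed'])
    for t in flash_times])

fig, ax = plt.subplots()
ax.plot(window, aligned_running.mean(axis=0), 'k')
ax.set_xlabel('Time from flash onset (s)')
ax.set_ylabel('Mean running speed (cm/s)')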