Skeletons

Skeletons#

Often in thinking about neurons, you want to measure things along a linear dimension of a neuron. However, the segmentation is a full complex 3d shape that makes this non-trivial. There are methods for reducing the shape of a segmented neuron down to a linear tree like structure usually referred to as a skeleton. We have precalculated skeletons for a large number of cells in the dataset, including all the proofread cells, and many of the excitatory cells with a single soma in the dataset, and these are available on the harddrives and in the aws directory. These skeletons are stored in an h5 file format, and also contain annotations on the skeletons that have the synapses, which skeleton nodes are axon and which are dendrite, and which are likely the apical dendrite of excitatory neurons.

../_images/skeleton-cartoon.png — Fig. 23 Cartoon illustration of “level 2” graph and skeletons.#

In order to understand these skeletons, you have to understand how large scale EM data is represented. Portions of 3d space are broken up into chunks, such as the grid in the image above. Neurons, such as the cartoon green cell above, span many chunks. Components of the segmentation that live within a single chunk are called level 2 ids, this is because in fact the chunks get iteratively combined into larger chunks, until the chunks span the whole volume. We call this the PyChunkedGraph or PCG, after the library which we use to store and interact with this represenation. Level 0 is the voxels, level 1 refers to the grouping of voxels within the chunk (also known as supervoxels) and level 2 are the groups of supervoxels within a chunk. A segmentation result can be thought of as a graph at any of these levels, where the voxels, supervoxels, or level 2 ids that are part of the same object are connected. In the above diagram, the PCG level 2 graph is represented as the light red lines.

Such graphs are useful in that they track all the parts of the neuron that are connected to one another, but they aren’t skeletons, because the graph is not directed, and isn’t a simple branching structure.

In order to turn them into a skeleton we have to run an algorithm, that finds a tree like structure that covers the graph and gets close to every node. This is called the TEASAR algorithm and you can read more about how to skeletonize graphs in the MeshParty documentation.

The result of the algorithm, when applied to a level 2 graph, is a linear tree (like the dotted purple one shown above), where a subset of the level 2 chunks are vertices in that tree, but all the unused level 2 nodes “near” a vertex (red unfilled circles) on the graph are mapped to one of the skeleton vertices (black arrows). This means there are two sets of indices, mesh indices, and skeleton indices, which have a mapping between them (see diagram above).

The Meshwork library allows us to easily store these representations and helps us relate them to each other. A Meshwork object is a data structure that is designed to have three main components that are kept in sync with mesh and skeleton indices:

mesh: the graph of PCG level2 ID nodes that are skeletonized, stored as a standard meshparty mesh. Note that this is not the high resolution mesh that you see in neuroglancer.
skeleton: a meshparty skeleton,
anno : is a class that holds dataframes and adds some extra info to keep track of indexing.

Working with Meshwork Files#

Loading a meshwork file imports the level 2 graph (the “mesh”), the skeleton, and a collection of associated annotations.

from meshparty import meshwork
nrn = meshwork.load_meshwork(mesh_filename)

The main three properties of the meshwork object are:

nrn.mesh : The l2graph representation of the reconstruction.
nrn.skeleton : The skeleton representation of the reconstruction.
nrn.anno : A table of annotation dataframes and associated metadata that links them to specific vertices in the mesh and skeleton.

Meshwork nrn.mesh vs nrn.skeleton#

Skeletons are “tree-like”, where every vertex (except the root vertex) has a single parent that is closer to the root than it, and any number of child vertices. Because of this, for a skeleton there are well-defined directions “away from root” and “towards root” and few types of vertices have special names:

Branch point: vertices with two or more children, where a neuronal process splits.
End point: vertices with no childen, where a neuronal process ends.
Root point: The one vertex with no parent node. By convention, we typically set the root vertex at the cell body, so these are equivalent to “away from soma” and “towards soma”.
Segment: A collection of vertices along an unbranched region, between one branch point and the next end point or branch point downstream.

Meshes are arbitrary collections of vertices and edges, but do not have a notion of “parent” or “child” “branch point” or “end point”. Here, this means the “mesh” used here includes a vertex for every level 2 chunk, even where it is thick like at a cell body or very thick dendrite. However, by default this means that there is not always a well-defined notion of parent or child nodes, or towards or away from root.

In contrast “Meshes” (really, graphs of connected vertices) do not have a unique “inward” and “outward” direction. For the sake of rapid skeletonization, the “meshes” we use here are really the graph of level 2 vertices as described above. These aren’t a mesh in the visualization sense of the section on downloading Meshes, but have the same data representation.

To handle this, the meshwork object associates each mesh vertices with a single nearby skeleton vertex, and each skeleton vertex is associated with one or more mesh vertices. By representing data this way, annotations like synapses can be directly associated with a mesh vertex (because synapses can be anywhere on the object) and then mapped to the skeleton in order to enjoy the topological benefits of the skeleton representation.

# By the definition of skeleton vs mesh, we would expect that mesh contains more vertices than the skeleton. 
# We can see this by looking at the size of the skeleton vertex location array vs the size of the mesh vertex location array.

print('Skeleton vertices array length:', len(nrn.skeleton.vertices))
print('Mesh vertices array length:', len(nrn.mesh.vertices))

#Let us try to visualize the skeleton:
# Visualize the whole skeleton 

# here's a simple way to plot vertices of the skeleton
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d
%matplotlib notebook 

fig = plt.figure(figsize=(6, 6))
ax = plt.axes(projection='3d')
ax.scatter3D(nrn.skeleton.vertices[:,0], nrn.skeleton.vertices[:,1], nrn.skeleton.vertices[:,2], s=1)

../_images/skeleton_scatterplot.png — Fig. 24 Scatterplot of skeleton vertices as a point cloud.#

Skeleton Properties and Methods#

To plot this skeleton in a more sophisticated way, you have to start thinking of it as a graph, and the meshwork object has a bunch of tools and properties to help you utilize the skeleton graph.

Let’s list some of the most useful ones below You access each of these with nrn.skeleton.* Use the ? to read more details about each one

Properties

`branch_points``: a list of skeleton vertices which are branches
root: the skeleton vertice which is the soma
distance_to_root: an array the length of vertices which tells you how far away from the root each vertex is
root_position: the position of the root node in nanometers
end_points: the tips of the neuron
cover_paths: a list of arrays containing vertex indices that describe individual paths that in total cover the neuron without repeating a vertex. Each path starts at an end point and continues toward root, stopping once it gets to a vertex already listed in a previously defined path. Paths are ordered to start with the end points farthest from root first. Each skeleton vertex appears in exactly one cover path.
csgraph: a scipy.sparse.csr.csr_matrix containing a graph representation of the skeleton. Useful to do more advanced graph operations and algorithms. https://docs.scipy.org/doc/scipy/reference/sparse.csgraph.html
kdtree: a scipy.spatial.ckdtree.cKDTree containing the vertices of skeleton as a kdtree. Useful for quickly finding points that are nearby. https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.cKDTree.html

Methods

path_length(paths=None): the path length of the whole neuron if no arguments, or pass a list of paths to get the path length of that. A path is just a list of vertices which are connected by edges.
path_to_root(vertex_index): returns the path to the root from the passed vertex
path_between(source_index, target_index): the shortest path between the source vertex index and the target vertex index
child_nodes(vertex_indices): a list of arrays listing the children of the vertex indices passed in
parent_nodes(vertex_indices): an array listing the parent of the vertex indices passed in

# A better way to plot a neuron is to use cover_paths
# and plot those as 3d lines
def plot_neuron_skeleton(neuron, ax, c='b', linewidth=1):

    for cover_path in neuron.skeleton.cover_paths:
        path_verts = neuron.skeleton.vertices[cover_path,:]
        ax.plot(path_verts[:,0], path_verts[:,1], path_verts[:,2], c=c, linewidth=linewidth)

fig = plt.figure(figsize=(6, 6))
ax = plt.axes(projection='3d')
plot_neuron_skeleton(nrn, ax)

Indexing And Selecting Sets of Points#

The meshworks object contains a mesh with lots of vertices and a skeleton which holds a subset of these vertices. Therefore, in python these points have different “indices” in the mesh and skeleton. For example, if the mesh contains 10000 vertices, the indexing of those would run from 0 - 9999. The skeleton, which contains a subset of 100 of these would have indexing from 0-99. How would you figure out which of the mesh vertices these correspond to?

Luckily, we have some really nifty functions that help us distinguish those:

Let us first look at some attributes in the meshworks objects:

A few nifty function for subselecting points: downstream points and `path_between``. For a given point, downstream points are defined as points on paths from endpoints to the root which are further than the given point. For example, if the skeleton path is : A-B-C-D-E where A is the root, D and E are downstream points of C. With branching, this can be more complex. To find the downstream points from say the 9th branch point, we can do:

#Downstream points
nrn.downstream_of(nrn.branch_points[9])

path_between returns the vertices from one point to another. For example, we can get the every mesh vertex from the end point 5 to the root point. As a quick visualization, we can look at the distance to root along the, showing that it is descreasing.

fig, ax = plt.subplots()
ax.plot(
    nrn.distance_to_root(
        nrn.path_between(
            nrn.end_points[5],
            nrn.root,
        )
    ) / 1_000,
)

ax.set_ylabel('Distance from root ($\mu m$)')
ax.set_ylabel('Vertex along path')

../_images/distance_to_root.png — Fig. 25 Distance to root along graph path indicated above. Note that this flattens out at zero because “distance” is computed along the skeleton and many graph vertices in the level 2 graph are associated with the soma vertex of the skeleton.#

Masking#

Just like meshes, we can mask the meshwork object. Like all basic meshwork functions, the expected input is in mesh vertices. Importantly, doing so will be synchronized across the mesh, the skeleton, and annotations.

# Let's now use masking to highlight one particular cover path.

fig = plt.figure(figsize=(6, 6))
ax = plt.axes(projection="3d")

plot_neuron_skeleton(nrn, ax, "b")

nrn.reset_mask()  # This just makes sure we are working with the same baseline object.

# This will make a mask with only the vertices in the first cover path.
with neuron.mask_context(nrn.skeleton.cover_paths[0].to_mesh_mask) as nrnf:
  plot_neuron_skeleton(nrnf, ax, "r", linewidth=2)
  ax.scatter(
      nrnf.skeleton.root_position[0],
      nrnf.skeleton.root_position[1],
      nrnf.skeleton.root_position[2],
  )

While mask_context acts to mask the skeleton and then unmasks it at the end of operations, you can also just mask a skeleton and let it stay that way. In that case, use the nrn.apply_mask function.

Important

Do not use the nrn.mesh.apply_mask or nrn.skeleton.apply_mask functions, which will not synchronize the mask across the mesh, skeleton, and annotations.

Annotations#

nrn.anno has set of annotation tables containing some additional information for analysis. Each annotation table has both a Pandas DataFrame object storing data and additional information that allow the rows of the DataFrame to be mapped to mesh and skeleton vertices. For the neurons that have been pre-computed, there is a consisent set of annotation tables:

post_syn: Postsynaptic sites (inputs) for the cell
pre_syn: Presynaptic sites (outputs) for the cell
is_axon: List of vertices that have been labeled as part of the axon
lvl2_ids: Gives the PCG level 2 id for each mesh vertex, largely for book-keeping reasons.
segment_properties: For each vertex, information about the approximate radius, surface area, volume, and length of the segment it is on.
vol_prop: For every vertex, information about the volume and surface area of the level 2 id it is associated with.

To access one of the DataFrames, use the name of the table as an attribute of the anno object and then get the .df property. For example, to get the postsynaptic sites, we can do:

from meshparty import meshwork
nrn = meshwork.load_meshwork('data/864691134940133219_skel.h5')
nrn.anno.post_syn.df.head()

	ctr_pt_position	id	post_pt_level2_id	post_pt_mesh_ind	post_pt_position	post_pt_root_id	post_pt_supervoxel_id	pre_pt_position	pre_pt_root_id	pre_pt_supervoxel_id	size	post_pt_mesh_ind_filt
0	[167494, 178677, 22973]	147824610	159893593257935774	6887	[167518, 178672, 22972]	864691134940133219	87835999220528052	[167440, 178598, 22980]	864691135947747425	87765630476372999	3400	6887
1	[138824, 196821, 23152]	97189471	155955348832780498	2332	[138796, 196792, 23151]	864691134940133219	83897754795009976	[138892, 196914, 23159]	864691136785011310	83897823514489547	2012	2332
2	[133018, 175450, 24556]	80426591	155108106538320623	1471	[133022, 175372, 24562]	864691134940133219	83050512501003437	[132972, 175482, 24555]	864691135361023474	83050512500997552	1808	1471
3	[166258, 197684, 22697]	140170915	159685029646041267	6778	[166246, 197648, 22691]	864691134940133219	87627435608336600	[166200, 197720, 22703]	864691135444655596	87627435608350744	4648	6778
4	[162497, 198039, 22661]	133118811	159192517156274428	6438	[162508, 197976, 22664]	864691134940133219	87134923118540329	[162446, 198008, 22664]	864691136824031086	87134923118542231	8932	6438

However, in addition to these rows, you can also easily get the mesh vertex index or skeleton vertex index that a row corresponds to with nrn.anno.post_syn.mesh_index or nrn.anno.post_syn.skel_index respectively. This seems small, but it allows you to integrate skeleton-like measurements with annotations trivially.

For example, to get the distance from the cell body for each postsynaptic site, we can do:

nrn.distance_to_root(nrn.anno.post_syn.mesh_index)

array([107741.13371277,  50915.23168945, 130549.53106689, ...,
        52965.05181885,  53957.32983398,  67804.22151184])

Important

Note that the nrn.distance_to_root, like all basic meshwork object functions, expects a mesh vertex index rather than a skeleton vertex index.

A common pattern is to copy a synapse dataframe and then add columns to it. For example, to add the distance to root to the postsynaptic sites, we can do:

syn_df = nrn.anno.pre_syn.df.copy()
syn_df['dist_to_root'] = nrn.distance_to_root(nrn.anno.pre_syn.mesh_index)

Filtering Annotations by Skeleton Arbor#

Each annotation table has a ‘filter_query’ method that takes a boolean mesh mask and returns only those rows of the dataframe associated with those particular locations on the mesh. Let’s use what we learned above in two examples: first, getting all input synapses within 50 microns of the root and second, getting all input synapses on one particular branch off of the soma.

dtr = nrn.distance_to_root() / 1_000   # Convert from nanometers to microns
nrn.anno.post_syn.filter_query( dtr < 50).df

	ctr_pt_position	id	post_pt_level2_id	post_pt_mesh_ind	post_pt_position	post_pt_root_id	post_pt_supervoxel_id	pre_pt_position	pre_pt_root_id	pre_pt_supervoxel_id	size	post_pt_mesh_ind_filt
6	[151982, 191342, 23145]	120502078	157713880242454752	4892	[152038, 191390, 23149]	864691134940133219	85656286204696158	[151996, 191298, 23140]	864691135489252804	85656286204685856	6928	4892
7	[157726, 193046, 22788]	127341851	158488142519730768	5752	[157706, 193094, 22795]	864691134940133219	86430548482153326	[157682, 192972, 22780]	864691136824031086	86430548482140967	11020	5752
23	[154448, 194194, 23180]	120218955	158066067560726725	5285	[154378, 194200, 23178]	864691134940133219	86008473522980712	[154520, 194204, 23181]	864691136757444974	86008473522981601	312	5285
24	[150364, 191008, 23443]	112691936	157502705290445181	4655	[150330, 190992, 23435]	864691134940133219	85445111253025608	[150438, 190954, 23442]	864691136827445870	85445111253035274	5252	4655
27	[154090, 190584, 22953]	120378123	157995217713103485	5202	[154072, 190652, 22947]	864691134940133219	85937623675692998	[154078, 190578, 22960]	864691136713068398	85937623675711336	9116	5202
...	...	...	...	...	...	...	...	...	...	...	...	...
3433	[156296, 195908, 23183]	127450278	158347817415344328	5610	[156264, 195978, 23183]	864691134940133219	86290223377588084	[156252, 195838, 23182]	864691135210513856	86290223377588010	7036	5610
3435	[147054, 185786, 23994]	110381182	157079805697721156	3949	[147026, 185772, 23992]	864691134940133219	85022211660301024	[147124, 185818, 24002]	864691135658301314	85022211660316873	9124	3949
3436	[150724, 186076, 23779]	112536188	157572386906964425	4735	[150760, 185998, 23771]	864691134940133219	85514792869314896	[150720, 186090, 23783]	864691136195284556	85514861588802988	508	4735
3437	[149372, 190040, 23289]	112691418	157361830363136488	4400	[149382, 190088, 23280]	864691134940133219	85304236325525055	[149360, 190010, 23283]	864691136195284556	85304236325524912	2124	4400
3438	[154928, 196122, 23050]	116358701	158136711182811181	5356	[154940, 196164, 23050]	864691134940133219	86079117144942192	[154934, 196072, 23055]	864691136032031675	86079117144945472	3968	5356

473 rows × 12 columns

We can also use one set of annotations as an input to filter query from another set of annotations. For example, due to errors in segmentation or mistakes in synapse detection, there can be synaptic outputs on the dendrite. However, if we have an is_axon annotation that simply contains a collection of vertices that correspond to the cell’s axon. We can use this annotation to create a mask and filter out all of the synapses that are not on the axon.

axon_mask = nrn.anno.is_axon.mesh_mask
nrn.anno.pre_syn.filter_query(~axon_mask).df # The "~" is a logical not operation that flips True and False

	ctr_pt_position	id	post_pt_position	post_pt_root_id	post_pt_supervoxel_id	pre_pt_level2_id	pre_pt_mesh_ind	pre_pt_position	pre_pt_root_id	pre_pt_supervoxel_id	size	pre_pt_mesh_ind_filt
65	[162422, 198658, 23978]	134156279	[162390, 198688, 23981]	864691136785500782	87134991972525815	159192586009969622	6439	[162464, 198636, 23985]	864691134940133219	87134991972537317	972	6439
166	[157086, 198862, 23150]	127450957	[157010, 198908, 23151]	864691135952299043	86361004438641166	158418529756905664	5697	[157164, 198808, 23156]	864691134940133219	86360935719148064	668	5697
444	[174372, 191708, 23157]	155609310	[174306, 191714, 23158]	864691136210173628	88752579668012861	160810104986271912	7366	[174362, 191656, 23156]	864691134940133219	88752510948535116	2888	7366
456	[147822, 191848, 24173]	110894360	[147772, 191806, 24171]	864691135393532914	85093405104959624	157150999142727859	4073	[147868, 191768, 24170]	864691134940133219	85093405104962373	5068	4073
486	[134460, 194490, 22866]	84031640	[134458, 194454, 22867]	864691135528294084	83264161152691689	155321755190165812	1665	[134424, 194532, 22865]	864691134940133219	83264161152685771	4112	1665
490	[175468, 203358, 22627]	158325494	[175478, 203300, 22634]	864691134988925306	88894828917712054	160952422955484181	7444	[175454, 203396, 22626]	864691134940133219	88894828917704472	1308	7444
517	[135018, 192856, 23553]	86645797	[135062, 192872, 23550]	864691136210178236	83334323805717938	155391917910130691	1731	[135004, 192878, 23557]	864691134940133219	83334323872216719	292	1731
604	[148492, 207248, 23239]	112357047	[148486, 207244, 23240]	864691135853648185	85165835299321100	157223429336990100	4211	[148462, 207276, 23241]	864691134940133219	85165835299314732	2144	4211
1017	[156432, 189900, 23033]	125728175	[156470, 189928, 23033]	864691133262391923	86289398677197625	158346992714515502	5599	[156402, 189886, 23034]	864691134940133219	86289398677193502	2652	5599
1067	[128170, 199156, 23519]	71166059	[128102, 199184, 23512]	864691136012791778	82420354765068733	154477948802433940	902	[128160, 199106, 23519]	864691134940133219	82420354765076207	1584	902
1293	[156188, 191520, 22873]	127340911	[156258, 191546, 22880]	864691136210178236	86289604835470780	158276830128767512	5520	[156152, 191638, 22877]	864691134940133219	86219236091245799	1632	5520

As a sanity check, we can use nglui to see if these synapses we have labeled as being on the axon are all where we expect.

import os
from caveclient import CAVEclient
from nglui.statebuilder.helpers import make_synapse_neuroglancer_link

client = CAVEclient('minnie65_public')

make_synapse_neuroglancer_link(
    nrn.anno.pre_syn.filter_query(axon_mask).df,
    client,
    return_as="html"
)

Neuroglancer Link

(Click one of the synapse annotations to load the neuron mesh).

Another common example might be to pick one of the child nodes of the soma and get all of the synapses on that branch. We can do this by using the nrn.skeleton.get_child_nodes function to get the skeleton vertex indices of the child nodes and then use that to filter the synapses.

branch_index = 0 # Let's just use the first child vertex of the root node, which is at the soma by default.
branch_inds = nrn.downstream_of(nrn.child_index(nrn.root)[branch_index])
branch_mask = branch_inds.to_mesh_mask

make_synapse_neuroglancer_link(
    nrn.anno.post_syn.filter_query(branch_mask).df,
    client,
    return_as="html"
)