Applications in Temporal Coding Analysis

**Introduction**

Neurons signal over long distances using stereotyped, pulsed signals called action potentials. Since the shape of the action potential is believed to carry minimal information, the arrival time of the action potential carries essentially all the information. The detailed voltage waveform is therefore often abstracted into a stream of binary events, in which most of the stream represents no action potential occurrence ('zeros') and an isolated '1' represents an action potential. This binary waveform is referred to as a spike train.

The current talk is focused on temporal coding in spike trains. It seems to
me that *temporal coding*, as used in neurobiology, refers to any
time code, including a pure rate code as a special case. One can think of many
examples:

- A neuron could signal one value by rate, another by the variance in the rate, and another by skew in the rate.
- Relative timing in two neurons might signal intensity or phase or causality.
- Two neurons could signal four different conditions if their spikes were treated as binary words. Synchrony would be important.
- The order of occurrence of specific interspike intervals could modulate a synapse in a specific fashion.

Since spike trains are (typically) sending messages to synapses, it might be
useful to ask how we can interpret a spike train in a way which makes synaptic
sense. A first approximation might ask: *Over what time scale does the synapse
integrate information?* If the integration time is very short, then only
events which are close together matter, so synchrony in a code is a useful concept.
On the other hand, if the integration time is long, then the total number of spikes
matters, so a rate code is a useful concept. Perhaps we should analyse spike
trains on every time scale. In addition to time scale, we could analyse trains
in several ways, each corresponding to a different (measured, or hypothesized)
aspect of synaptic function. For instance, activation and learning might have
different effective coding from the same spike train.

We want to talk about schemes for finding patterns in spike trains. Bruce will
describe patterns in EOD discharge data, and I will describe some theoretical
techniques which seem to hold promise in their flexibility and potential *synaptocentric*
(love those neologisms) orientation. The thread between them is the use of methods
that avoid some problems with traditional binning methods. Specifically, we
will consider schemes that treat the spike time as the most important variable.

**Techniques**

Spike train analysis is the attempt to find patterns in spike trains which reflect some aspect of neural functioning. This could include

- relating neural activity to stimuli
- trying to find repetitive patterns in a motor discharge
- relationship between descending neuronal activity and motor output (Bruce Carlson)
- looking for codes distributed across a neuron population
- functional interactions among neurons

There are many techniques for understanding a temporal code, for instance:

- Post stimulus time histogram -- bin the spikes into time slots referenced to the stimulus.
- Spike density function (3) and other digital filters (4)
- Auto- and cross-correlation functions of the entire train using spike density function to smooth
- ISI distributions and return maps (3)
- Spike-triggered average (5)
- Spike-centered distance analysis (1), (2)

The techniques we will explore here are based on references (1), (2) and (3). These avoid problems with binning in time and allow the flexibility of imposing synaptic interpretations on the train. Bruce has talked about how convolving a train with a Gaussian distribution allows a smoother, more reliable estimate of burst parameters. What we want to do now is describe how spike trains might be compared to each other for similarity.

To set the stage, let us consider a pure rate code. Each train is then specified
by its rate, and the difference between trains is the difference of their rates.
We could say that the *distance* separating the trains is the difference
of their rates, or that rate difference is a *distance measure* between
spike trains. What we would like is a distance measure which works on *any*
time scale. Then we could speak of the distance between two trains with a given
time precision. Such a distance measure would allow us to move smoothly from strict
synchronization codes (very small spike time difference allowed) to rate codes
(large time differences allowed within an overall interval). The time resolution
at which you analyse the train might also correspond to the time constant of
the synapse.

Two recent papers describe techniques for computing the distance between two spike trains at any time resolution. Victor's paper (1) defines the distance between two spike trains in terms of the minimum cost of transforming one train into the other. Only three transforming operations are allowed: move a spike, add a spike, or delete a spike. The cost of an addition or deletion is set at one. The cost of moving a spike by a small time is a parameter which sets the time scale of the analysis. If the cost of motion is very small, then you can slide a spike anywhere, but it still costs to insert or delete a spike, so the distance between two trains is just the difference in number of spikes. This is similar to a rate code distance. If the cost of motion is very high, then any moved spike is a different spike, so the minimum cost becomes the cost to insert or delete all the spikes. The distance between two trains becomes approximately the total number of spikes in the two trains which are not exactly aligned, sort of a measure of synchrony. Intermediate values of cost interpolate smoothly between perfect synchrony and a rate code.
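The actual analysis programs are in Matlab (linked below). As a rough illustration, the minimum-cost transformation can be computed with a standard edit-distance dynamic program; this Python/numpy sketch is mine, not code from reference (1), and the example spike times are invented:

```python
import numpy as np

def victor_distance(a, b, cost):
    """Victor-Purpura style spike train distance.

    a, b : sorted sequences of spike times (seconds)
    cost : price per second of moving a spike; insert/delete each cost 1
    """
    na, nb = len(a), len(b)
    # D[i, j] = distance between the first i spikes of a and first j of b
    D = np.zeros((na + 1, nb + 1))
    D[:, 0] = np.arange(na + 1)   # delete all remaining spikes of a
    D[0, :] = np.arange(nb + 1)   # insert all remaining spikes of b
    for i in range(1, na + 1):
        for j in range(1, nb + 1):
            D[i, j] = min(
                D[i - 1, j] + 1,                                   # delete a[i-1]
                D[i, j - 1] + 1,                                   # insert b[j-1]
                D[i - 1, j - 1] + cost * abs(a[i - 1] - b[j - 1])  # move a spike
            )
    return D[na, nb]

# A regular train vs. a copy with two spikes moved by 10 ms:
a = [0.1, 0.2, 0.3, 0.4]
b = [0.1, 0.21, 0.3, 0.41]
```

With `cost` near zero the distance collapses to the difference in spike counts (a rate code); with a very large `cost` each unaligned spike must be deleted and re-inserted, so the distance counts the unaligned spikes in both trains.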

van Rossum's paper (2) computes a distance which is closely related to Victor's distance, but much easier to implement and easier to explain. Each spike train is convolved with an exponential function:

h(t - ti) = e^{-(t - ti)/tc}, for t > ti (and zero otherwise)

where ti is the time of occurrence of the ith spike. You get to choose the time constant tc of the exponential, which sets the time scale of the distance measurement. Call the convolved waveforms f(t) and g(t). You then form the distance as:

D(tc) = (dt/tc) * sum[ (f(t) - g(t))^{2} ]

where dt is the spike sampling time step. This distance could
be considered as an approximate difference between two post-synaptic current
sequences triggered by the respective spike trains, because such currents tend
to have approximately exponential shape. In a sense, the Rossum distance measures
the difference in the effect of the two trains on their respective synapses
(to a very crude approximation).
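The formulas above translate almost directly into code. The Matlab versions are linked below; this Python/numpy sketch (function name and default parameters are mine) builds the convolved waveforms on a time grid and sums the squared difference:

```python
import numpy as np

def rossum_distance(a, b, tc, dt=1e-4, tmax=None):
    """van Rossum style distance: convolve each train with a causal
    exponential of time constant tc, then form (dt/tc)*sum((f-g)^2)."""
    if tmax is None:
        # run the grid well past the last spike so the tails die out
        tmax = max(max(a, default=0.0), max(b, default=0.0)) + 5 * tc
    t = np.arange(0.0, tmax, dt)

    def convolve(spikes):
        f = np.zeros_like(t)
        for ti in spikes:
            mask = t >= ti
            f[mask] += np.exp(-(t[mask] - ti) / tc)   # e^{-(t-ti)/tc}, t > ti
        return f

    f, g = convolve(a), convolve(b)
    return (dt / tc) * np.sum((f - g) ** 2)
```

With this normalization, identical trains give zero, and a single unmatched spike contributes about 0.5 to the distance.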

The Rossum time scale, tc, and Victor's cost parameter are related by

`cost = 1/tc`

We can compare the two distance measures over a wide range of time scales using this reciprocal relation. Two of the Matlab programs given below do this comparison, but an example here might help. The following image shows two spike trains. The blue train is regular and the red train is a Gaussian-dithered version of the blue train. At small time scales, the Victor distance is 8 because 4 spikes are not exactly aligned: 4 spike deletions and 4 insertions are necessary to transform one train into the other. At time scales comparable to the dither (in this case std. dev. = 10 mSec), the Victor distance starts to drop because it becomes cheaper to move a spike. At long time scales the Victor distance goes to zero because both trains have the same number of spikes. The Rossum distance falls more smoothly because it depends on a smooth exponential weighting function. The distance at short time scales is similar because the exponentials have essentially fallen to zero. At large time scales the Rossum distance also goes to zero, because it too measures the total number of spikes. Which distance you decide to use depends on how you think the spike train is interpreted post-synaptically. Note that the distance is computed at all possible time scales, so that different criteria of synchrony are automatically available.
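This scale dependence is easy to reproduce without a time grid, because the overlap integral of two causal exponentials e^{-(t-ti)/tc} and e^{-(t-tj)/tc} is (tc/2)·e^{-|ti-tj|/tc}, which gives the Rossum distance in closed form. A sketch (function name, random seed, and dither parameters are mine):

```python
import numpy as np

def rossum_distance_exact(a, b, tc):
    """Grid-free van Rossum distance via the closed-form overlap
    integral of causal exponentials."""
    a, b = np.asarray(a, float), np.asarray(b, float)

    def k(x, y):
        # sum over all pairs of exp(-|xi - yj| / tc)
        return np.exp(-np.abs(x[:, None] - y[None, :]) / tc).sum()

    return 0.5 * (k(a, a) + k(b, b)) - k(a, b)

# A regular train vs. a jittered copy: the distance falls as the time
# scale tc grows past the jitter, approaching zero because both trains
# contain the same number of spikes.
rng = np.random.default_rng(0)
regular = np.arange(0.1, 1.0, 0.1)
jittered = regular + rng.normal(0.0, 0.01, regular.size)
for tc in (0.001, 0.01, 0.1, 1.0):
    print(tc, rossum_distance_exact(regular, jittered, tc))
```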

Once we can compute distances between spike trains, we can try out several techniques outlined in reference (1). The following assumes that trains are caused by some controlled stimulus, and can therefore be categorized a priori by the stimulus type. These steps have been implemented in the software described below.

__Information estimate.__ How well do the a-priori types and actual trains match up?

- Compute the pair-wise distances between all recorded trains (at some specified time scale).
- For each train:
  - Compute the average distance to each a-priori stimulus group (including the group of the current train).
  - Select the group with the minimum average distance as the effective group to which the train belongs.
- Build a *confusion matrix*. Each matrix element counts the number of trains with a given a-priori stimulus class and a given minimum-distance class. A diagonal matrix represents perfect classification.
- Compute the *information content* of the matrix. The maximum information content is log2(number of rows), which occurs if the matrix is diagonal. The information content for a 2x2 matrix (1 bit max) is approximately equal to the chi-square of the matrix divided by the maximum chi-square of the matrix.
- Repeat for each time scale of interest to determine the time scale with the maximum bit rate. This is the scale at which the trains are optimally separable, and could represent the scale on which the receiving cell is analysing the train.
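The classification steps can be sketched as follows (Python/numpy; the function name is mine, each train's zero self-distance is left out of its own group average, which is one reasonable choice, and the information content is taken as the mutual information of the confusion matrix):

```python
import numpy as np

def confusion_and_info(D, labels):
    """Classify each train to the group with minimum average distance,
    build the confusion matrix, and estimate transmitted information.

    D      : (n, n) symmetric pairwise distance matrix, zero diagonal
    labels : (n,) integer a-priori group labels 0..k-1
             (assumes every group has at least two members)
    """
    labels = np.asarray(labels)
    n, k = len(labels), labels.max() + 1
    conf = np.zeros((k, k))
    for i in range(n):
        means = []
        for g in range(k):
            members = (labels == g) & (np.arange(n) != i)  # leave self out
            means.append(D[i, members].mean())
        conf[labels[i], int(np.argmin(means))] += 1        # row: a-priori class
    # transmitted (mutual) information of the confusion matrix, in bits
    p = conf / conf.sum()
    pr, pc = p.sum(1, keepdims=True), p.sum(0, keepdims=True)
    nz = p > 0
    info = np.sum(p[nz] * np.log2(p[nz] / (pr @ pc)[nz]))
    return conf, info
```

A diagonal 2x2 confusion matrix with equal group sizes gives the maximum 1 bit; repeating this over a range of time scales locates the scale of maximum information.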

__Geometry of the distances.__ The pair-wise distances (particularly at the optimum time scale) represent an abstract metric space. One could ask whether they also happen to fit into a low-dimensional vector space. If they did, perhaps the vector space (when viewed graphically) would offer some insight into the distances. A standard multidimensional-scaling (MDS) technique was used to optimize a fit to a 2D, 3D, or nD space (6). If you analyse at a large time scale (rate code), the spaces I have looked at become one-dimensional, since the trains can only differ by one number (the rate).

__Meaning of the geometry.__ The MDS analysis yields axes which optimally fit the distances, but yields limited intuition about the characteristics of the trains which are distributed in the space. Aronov et al. (8) suggest a kind of *inverse principal components analysis* in which you ask: what linear combination of time-binned spike rates (in each train) could lead to the MDS fit? They refer to this as *temporal profiling*. An axis corresponding to a rate code yields a constant set of coefficients for each bin. A transient detector will have a positive weighting at the beginning of the train, and negative later. Since the rest of the analysis is nonlinear, there is no reason why this should work at all, so if it does, it probably implies something about how the cells are doing the train analysis.
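Reference (6) uses a nonmetric MDS optimization; as a simpler stand-in that conveys the same idea, classical (Torgerson) MDS embeds a distance matrix into a vector space by eigendecomposition (this sketch and its function name are mine):

```python
import numpy as np

def classical_mds(D, ndim=2):
    """Classical (Torgerson) MDS: find coordinates whose Euclidean
    distances approximate the entries of the distance matrix D."""
    D = np.asarray(D, float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered squared distances
    w, V = np.linalg.eigh(B)                 # eigh returns ascending eigenvalues
    idx = np.argsort(w)[::-1][:ndim]         # keep the largest ndim
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))
```

If the distances truly come from a rate code, the embedding collapses onto one dominant axis, matching the one-dimensional spaces mentioned above.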

Uses of a spike train distance measure:

- Song length and time scale analysis for cricket song (7). The authors conclude that the time scale of analysis is 10 mSec and that about 2 syllables of the song are necessary for full separation of songs.
- Function of the visual cortex (8).
- Burst similarity in a long train. See below, where we take a burst located by Bruce Carlson's convolution method and slide it along the spike train looking for minimum-distance fits.

**Code**

Matlab distance programs. To run the following programs, you will need to download the routine spkd from Cornell Med.

__Regular and dithered spike trains.__ code. An example image is given above.

__Currents applied to integrate-and-fire neurons (with adaptation).__ code. The first figure below shows the simulated voltage, current and spike trains from two groups of 5 IF neurons in response to a square current pulse and the same pulse dithered with Gaussian noise. The second figure shows the two trains which fire during the stimulus and the resulting Rossum distances, taken pairwise between every possible pair of spike trains (45 total). Red indicates one group, blue the other, and black the cross-distance. Summing the distances over all time scales and plotting them by group shows that the groups clump by distance. For comparison, the simple time-bin histogram distance is shown at the bottom.
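The simulations themselves are in the linked Matlab code. A minimal Python stand-in for one adapting integrate-and-fire neuron (all parameter values here are illustrative, not those of the original code) looks like:

```python
import numpy as np

def lif_adapt(I, dt=1e-4, tau=0.02, tau_a=0.1, g_inc=0.5, vth=1.0):
    """Leaky integrate-and-fire neuron with spike-rate adaptation.
    Units and parameters are arbitrary/illustrative."""
    v, g = 0.0, 0.0
    spikes = []
    for i, inp in enumerate(I):
        g -= dt * g / tau_a                # adaptation current decays
        v += dt * (-v + inp - g) / tau     # leaky integration minus adaptation
        if v >= vth:
            spikes.append(i * dt)
            v = 0.0                        # reset after a spike
            g += g_inc                     # adaptation builds with each spike
    return np.array(spikes)

# Square current pulse from 0.1 s to 0.4 s: the interspike intervals
# lengthen through the pulse as adaptation accumulates.
I = np.zeros(5000)
I[1000:4000] = 2.0
spikes = lif_adapt(I)
```

Dithering the input pulse with Gaussian noise for each neuron in a group, as in the figures above, then gives the sets of trains whose pairwise distances are compared.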

__Computing the bit rate from distance clusters.__ code. The first figure below shows the simulated voltage, current and spike trains from four groups of 5 IF neurons in response to a square current pulse modulated with 4 different amplitudes of sine wave. The image below shows the current, 20 spike trains (color coded) and the resulting information as a function of scale. The peak information is about 1 bit, implying fairly poor discrimination of the 4 groups of spike trains. At large scales the bit rate drops, implying that average spike rate carries little information.

__Multidimensional scaling and temporal profiling.__ This code is modified from (6) to include 3D plotting and the temporal profiling procedure. To run, it also needs the programs in this ZIP file, which is from (6). The call to `fminu` must be modified to call `fminunc`. The first image shows the optimal embedding of the train distances in 3D (data from the previous image). The numbers correspond to individual trains and the colors to the 4 groups. The second image is the temporal profile for this 3-space, using 5 time bins for each spike train. It shows that dimension 1 is coding a weighted sum, so that high values along the axis mean a low initial firing rate, then a high rate, followed again by a low rate. The second dimension is encoding a temporal pattern of approximately the first derivative of dimension 1.

__Find a burst by minimizing the Rossum distance.__ code. The figure below shows an EOD train (first panel) with one burst isolated (by hand, second panel). The summed distance is a simple sum of log-distributed time-scale distances (third panel). The bottom panel shows a burst (the one starting at about 13.8 sec) aligned by using the minimum distance (about 170). A separate fit to the burst at about 9.35 seconds (classified by Bruce as a different type) shows a minimum distance of about 200. The burst at about 18.7 gives a best-fit distance of about 150. A random chunk of train at about 6 seconds gives a minimum distance of 280. Another run scanning from 2 seconds to 19 seconds found all the bursts with distances around 200. Of course, the reference burst gave a distance of zero.
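The burst search can be sketched as follows (Python/numpy; the function names, the closed-form Rossum evaluation, and all the numbers in the example are illustrative, not the EOD data): slide the reference burst along the train and record the distance between the shifted burst and the spikes falling in its window.

```python
import numpy as np

def rossum(a, b, tc):
    # closed-form van Rossum distance (causal exponential kernel)
    a, b = np.asarray(a, float), np.asarray(b, float)
    def k(x, y):
        return np.exp(-np.abs(x[:, None] - y[None, :]) / tc).sum()
    return 0.5 * (k(a, a) + k(b, b)) - k(a, b)

def find_burst(train, burst, tc=0.01, step=0.005):
    """Slide a reference burst along the train; return (min distance,
    offset). Parameter values are illustrative."""
    train = np.asarray(train, float)
    burst = np.asarray(burst, float)
    burst = burst - burst[0]              # burst times relative to its start
    dur = burst[-1]
    eps = 1e-9                            # tolerance for float window edges
    best = (np.inf, None)
    for t0 in np.arange(train.min(), train.max() - dur, step):
        window = train[(train >= t0 - eps) & (train <= t0 + dur + eps)] - t0
        d = rossum(window, burst, tc)
        if d < best[0]:
            best = (d, t0)
    return best

# Toy train with a 4-spike burst embedded at 1.0 s among stray spikes;
# using the burst itself as the reference finds it with distance ~0.
train = np.array([0.2, 0.5, 1.0, 1.01, 1.02, 1.03, 1.5, 1.8])
burst = np.array([1.0, 1.01, 1.02, 1.03])
print(find_burst(train, burst))
```

As in the figure above, the reference burst matches itself with distance zero, and poorer matches elsewhere in the train show up as larger minimum distances.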

Tools on the Web.

- Orbital Spike 3 - Firing pattern analysis in Windoze including ISI, return map, firing rate estimation, and correlation (implements the techniques in (3)).
- Dataplore - Time series analysis
- SPIFF - Spike Interchange File Format
- Signal Processing Techniques for Spike Train Analysis using MatLab - These M-files implement the analysis procedures discussed in chapter 9 of "Methods in Neuronal Modeling".
- NTSA Workbench - Neuronal Time Series Analysis (NTSA) Workbench is a set of tools, techniques and standards designed to meet the needs of neuroscientists who work with neuronal time series data.
- NeuroExplorer - Flexible train analysis

**References**

- Victor, JD and Purpura, KP, *Metric-space analysis of spike trains: theory, algorithms and application*, Network 8, 127-164 (1997)
- van Rossum, MCW, *A novel spike distance*, Neural Computation 13, 751-763 (2001)
- Szucs, A, *Applications of spike density function in analysis of neuronal firing patterns*, J. Neurosci. Methods 81, 159-167 (1998)
- Paulin, M, *Digital filters for firing rate estimation*, Biol. Cybernet. 66, 525-531 (1992)
- Dayan, P and Abbott, LF, *Neural Encoding I: firing rate and spike statistics*, in Theoretical Neuroscience.
- Steyvers, M, Nonmetric Multidimensional Scaling (MDS) software, http://www-psych.stanford.edu/~msteyver/software.htm
- Machens, CK, Prinz, P, Stemmler, MB, Ronacher, B, and Herz, AVM, *Discrimination of behaviorally relevant signals by auditory receptor neurons*, Neurocomputing 38-40, 263-268 (2001)
- Aronov, D, Reich, DS, Mechler, F, and Victor, JD, *Neural coding of spatial phase in V1 of the macaque monkey*, J. Neurophysiol. (2003), doi:10.1152/jn.00826.2002

Copyright Cornell University, 2003