Automatic analysis pipeline for FRA data collected through my other TDT/single-unit extraction pipeline. It contains the automatic tuning analysis packaged for ease of use by others, plus the post-processing analyses that were used to make the figures in an associated paper/dataset.
Make sure your machine has Anaconda installed. To set up, download the GitHub files to your computer either by clicking the download button or by cloning this repository, then install the needed packages (the main dependency is pandas at version 2.2.1 or later, which is required for the new `map` function).
conda activate your_env
conda install pandas matplotlib seaborn scipy numpy
pip install ipynb
Once you have the needed packages, proceed with opening the files:
cd 'your/repo/path'
jupyter notebook
Navigate to the .ipynb file and open it.
This analysis pipeline takes a CSV structured as a dB x kHz matrix for each unit. This is not a typical data structure, but it is very useful for some of the visualization tools also included in the accessory analyses folder. There is another program that will produce this matrix from either TDT multi-unit data or spike-sorted single-unit data.
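For concreteness, here is a minimal sketch of loading one unit's matrix, assuming intensities (dB SPL) are rows and frequencies (kHz) are columns; the file name `unit_001.csv` is hypothetical:

```python
import pandas as pd

# Hypothetical example: one unit's dB x kHz matrix with spike counts
# in each cell. Rows = intensities, columns = tone frequencies.
fra = pd.read_csv("unit_001.csv", index_col=0)
fra.index.name = "dB"        # e.g. 0, 10, 20, ... 70 dB SPL
fra.columns.name = "kHz"     # e.g. 4, 5.7, 8, ... 32 kHz
print(fra.shape)             # (n_intensities, n_frequencies)
```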
Once data is in this structure, the main function calculates the prestimulus average spike count and standard deviation. Note that you may need to adjust the timing windows for your analysis depending on how the auditory stimulation was performed. In the example case, we used a 20 ms prestimulus window and a 100 ms post-stimulus window for our calculations. The spike counts are then binned into 5 ms bins, from which the peak firing bin and the latency to the peak bin are saved. The remaining information pertains to the first bin to rise above the prestimulus baseline (a criterion derived from the prestimulus mean and standard deviation).
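A minimal sketch of the baseline and peak-bin step, assuming spike times are given in ms relative to tone onset (negative = prestimulus); the function name and the choice to take the std across prestimulus bins are illustrative assumptions:

```python
import numpy as np

PRESTIM_MS, POSTSTIM_MS, BIN_MS = 20, 100, 5  # windows from the text

def peak_bin_stats(spike_times_ms, n_trials):
    """Prestimulus mean/std plus peak bin rate and latency to peak bin."""
    # Bin edges spanning -20 ms to +100 ms in 5 ms steps.
    edges = np.arange(-PRESTIM_MS, POSTSTIM_MS + BIN_MS, BIN_MS)
    counts, _ = np.histogram(np.asarray(spike_times_ms), bins=edges)
    counts = counts / n_trials                  # mean count per trial
    pre = counts[: PRESTIM_MS // BIN_MS]        # prestimulus bins
    post = counts[PRESTIM_MS // BIN_MS :]       # post-stimulus bins
    peak_idx = int(np.argmax(post))
    return {
        "prestim_mean": pre.mean(),
        "prestim_std": pre.std(),
        "peak_rate": post[peak_idx],
        "latency_to_peak_ms": peak_idx * BIN_MS,  # start of the peak bin
    }
```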
First, the dB x kHz matrix for a given unit is smoothed using a 3x3 median filter to remove some of the jaggedness before the later smoothing step[^1]. This matrix is then summed across frequency (i.e., across columns), producing a dB x 1 array that is essentially an input/output function with the frequency domain collapsed. If you graph this as a rate-level function (firing over sound intensity), you get a line. Thresholds are calculated using a Savitzky–Golay filter with a 5-point quadratic kernel, similar to Guo 2012[^1]. These thresholds were checked against hand-scored data.
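A sketch of this step, assuming a NumPy array input; note the crossing criterion (20% of the range above the floor) is an illustrative assumption, not the pipeline's exact rule:

```python
import numpy as np
from scipy.ndimage import median_filter
from scipy.signal import savgol_filter

def rate_level_threshold(fra_matrix, intensities_db, criterion=0.2):
    smoothed = median_filter(fra_matrix, size=3)      # 3x3 median filter
    rate_level = smoothed.sum(axis=1)                 # collapse frequency
    # 5-point quadratic Savitzky-Golay smooth of the I/O function.
    rate_level = savgol_filter(rate_level, window_length=5, polyorder=2)
    floor, ceil = rate_level.min(), rate_level.max()
    cut = floor + criterion * (ceil - floor)          # assumed criterion
    above = np.where(rate_level >= cut)[0]
    return intensities_db[above[0]] if above.size else None
```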
This proceeds exactly opposite to the threshold calculation. Instead of summing across columns, we sum across rows and keep the column information. This produces a curve with a maximum value, which corresponds to the Best Frequency (BF); in my experience it does not differ much from the characteristic frequency, and it is easier to calculate. This value is important for showing the distribution of BFs between groups, to ensure they are equally represented, since BF can change the inherent bandwidth. It will also be used later for Q-values.
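A minimal sketch of the BF step under the same array assumptions as above:

```python
import numpy as np

def best_frequency(fra_matrix, freqs_khz):
    # Sum across intensity (rows) and take the frequency column with
    # the maximum summed response.
    freq_profile = fra_matrix.sum(axis=0)
    return freqs_khz[int(np.argmax(freq_profile))]
```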
Bandwidth is calculated at each intensity above threshold for a given unit (the threshold having been found in the previous step). Here, we simply go through each intensity row above threshold and grab the first and last sound-evoked cells, as determined from the response-level function: a response crossing the set threshold line is considered an edge. There can be multiple crossings, so we take the first and the last to keep the bandwidth estimate conservative. These become the bounds for the bandwidth calculation, where the bandwidth is the distance in frequency between the two edges.
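A sketch for one intensity row; `evoked_cut` stands in for the response-level criterion described above, and its exact value is an assumption here:

```python
import numpy as np

def bandwidth_khz(row, freqs_khz, evoked_cut):
    crossings = np.where(row >= evoked_cut)[0]
    if crossings.size == 0:
        return 0.0
    lo, hi = crossings[0], crossings[-1]   # first and last crossings
    return freqs_khz[hi] - freqs_khz[lo]   # conservative estimate
```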
The Q-value is a way to account for frequency-specific bandwidth discrepancies; higher frequencies have a narrower initial bandwidth than lower ones, for example. Make sure that frequency is in kHz for this step, just in case (it shouldn't matter, but I like to be consistent). We follow the standard form at a given intensity: Q = BF / bandwidth.
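In code, with both quantities in kHz:

```python
def q_value(bf_khz, bandwidth_khz):
    # Standard Q = BF / BW at a given intensity; undefined for zero BW.
    return bf_khz / bandwidth_khz if bandwidth_khz > 0 else float("nan")
```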
Calculating d-prime takes everything we have produced so far and, in a way, summarizes it for each unit's dB x kHz tuning matrix. In short, it takes the firing properties inside a defined tone-evoked region and compares them to those in a defined non-evoked region. The farther apart these values are, the 'better' the d-prime. In other words, the easier it is to tell these areas apart, the more optimal the information encoding, and the higher the d-prime.
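For reference, the standard two-distribution form of d-prime (the equal-weight pooling of the two variances is an assumption here, not necessarily the pipeline's exact formula) is:

$$
d' = \frac{\mu_{\text{evoked}} - \mu_{\text{non-evoked}}}{\sqrt{\tfrac{1}{2}\left(\sigma_{\text{evoked}}^{2} + \sigma_{\text{non-evoked}}^{2}\right)}}
$$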
d-prime was first calculated by taking the previously calculated threshold for the given unit and assuming that every cell below it in intensity is non-evoked. Then, every cell classified as evoked above that threshold had to be connected to at least one other evoked cell at a higher intensity. However, our evoked criterion of exceeding the baseline-derived cutoff gave less reasonable results with this approach than the per-frequency method below.
d-prime was also calculated using the same smoothed I/O function described above, but applied to each frequency. This gave a threshold for each frequency, such that everything below the threshold in each column was deemed non-evoked, while everything above was deemed evoked, on a per-frequency (column) basis. This yielded more reasonable d-prime results.
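A sketch of the per-frequency split, reusing the assumed 20% criterion from the threshold sketch above:

```python
import numpy as np
from scipy.signal import savgol_filter

def evoked_mask_per_frequency(fra_matrix, criterion=0.2):
    # Smooth each frequency column's I/O function and threshold it
    # separately: evoked above the cut, non-evoked below.
    mask = np.zeros_like(fra_matrix, dtype=bool)
    for col in range(fra_matrix.shape[1]):
        io = savgol_filter(fra_matrix[:, col], window_length=5, polyorder=2)
        cut = io.min() + criterion * (io.max() - io.min())
        mask[:, col] = io >= cut
    return mask
```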
d-prime was then calculated on a per-intensity basis, since this information was being assembled for the bandwidth calculations anyway. For a given intensity, it takes a tuning curve like the one shown above in the bandwidth function, treats all columns above the threshold value as evoked, and treats all those outside as non-evoked. This was consistent with the second method and also produces a d-prime for each intensity, which is interesting.
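A per-intensity sketch using the standard d' form shown earlier; `evoked_cols` is a hypothetical boolean mask of the columns inside the bandwidth edges:

```python
import numpy as np

def d_prime_at_intensity(row, evoked_cols):
    row = np.asarray(row, dtype=float)
    evoked = row[evoked_cols]          # inside the threshold crossings
    non_evoked = row[~evoked_cols]     # outside the crossings
    pooled_var = 0.5 * (evoked.var() + non_evoked.var())
    return (evoked.mean() - non_evoked.mean()) / np.sqrt(pooled_var)
```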
Another metric provided is the precision metric (it could arguably use a more descriptive name). It calculates the area under the best-frequency tuning curve (i.e., across all intensities). To account for tuning curves with similar area but differing numbers of peaks, where we define more peaks as being less precise, we penalize the area by the number of peaks (see the sketch below).
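A minimal sketch; dividing the area by the peak count is an illustrative assumption about the penalty's exact form:

```python
import numpy as np
from scipy.signal import find_peaks

def precision(bf_curve, freqs_khz):
    area = np.trapz(bf_curve, x=freqs_khz)        # area across frequency
    n_peaks = max(len(find_peaks(bf_curve)[0]), 1)
    return area / n_peaks                         # more peaks -> less precise
```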
See the Jupyter tutorial notebook for how the functions are used. For additional hand scoring, use the histogram tuning Python file. Open the code and ONLY edit the save path at the end of the file. You will also need to edit the lists of intensities and frequencies for your own dataset. This analysis requires data to be pickled, as it comes from the unit extractor program LINK HERE. Errors will show up in the code, but it still runs correctly AS LONG AS YOU CHANGE NOTHING ELSE. A tuning curve with PSTHs will appear; pick the threshold and frequency edges in Hz and dB and type them into the command window. If at any point you enter data incorrectly, simply type anything other than a number and press 'enter'; this will restart that tuning curve. Once you are in the notes section, you cannot restart the curve.
To run the program, open an Anaconda terminal and type:
python histogram_tuning_analysis_wfreq.py
Footnotes
[^1]: Guo et al. 2012, Robustness of Cortical Topography across Fields, Laminae, Anesthetic States, and Neurophysiological Signal Types (paper link). Cite the I/O paper here.