NE/PS 212: Projects 1 & 2
Introduction to MATLAB Programming for Research in Psychological & Brain Sciences
Please pick two of these 10 options as your first and second project. I have tried to make them as interesting and varied as possible. The projects are meant for you to explore and analyze the data that is provided and have fun doing it. The class is heterogeneous and so some of you might enjoy doing one project or the other.
You obviously cannot do the same project twice. As before, projects can be done in groups (maximum of 2 people per project).
Ultimately, my hope is that these projects will serve as a scaffold for you when you do data analysis in the real world. The projects are due on Friday the 7th of March and on Friday the 25th of April. Show each other your projects and help each other! The goal is to translate what you learn in class into actual projects and to gain real-world skills.
Generate one or more MATLAB scripts or functions to answer these questions. Ideally, use cells (code sections) to separate the different parts of your code.
You can also learn about MATLAB live scripts which allow you to present a notebook of what you did.
Instructions for submission
1. Everyone should submit a zip file named lastname project1. For example, I would submit “Chandrasekaran Project1.zip”.
2. Inside this zip file should be:
• Your main script, which should be named with just your last name, like “Chand.m”,
• Separate .m files for all your functions, and the data that your project uses,
• A published .html or .pdf of your matlab script.
3. To publish as HTML or PDF, go to the Publish menu → select the down arrow under Publish → Edit Publishing Options → change the output location to the lastname project1 folder. (If publishing as HTML, make sure all images are included; PDF is simpler.)
4. Once all your files are in the folder, zip the folder and upload it via the link on Blackboard. If you don’t know how to zip a folder, see http://rasmussen.libanswers.com/faq/32413
5. Please make sure your code or comments state who you did the project with.
1 Psychometric and Chronometric Curves
In neuroscience, you often need to plot the behavior of animals as they perform various tasks. In a .mat file I have provided a matrix with three columns. The first column lists the stimulus condition, which is the number of red squares in the checkerboard out of 225. The second column lists whether the animal responded red (denoted by a 2) or green (denoted by a 1), and the last column lists the reaction time of the animal. The data come from a monkey performing a red-green reaction time visual discrimination decision-making task with arm movements as the behavioral report. The psychometric curve and chronometric curve (reaction times) can be found in Figure 1 of this paper (https://www.nature.com/articles/s41467-017-00715-0). We are replicating some of these figures with the help of the data here. The data are stored in Project1 MonkeyBehavior.mat.
1. Calculate a signed coherence parameter, given as 100*(R-G)/(R+G), where R and G are the numbers of red and green squares in the checkerboard (R + G = 225). Print the minimum and maximum of this signed coherence parameter. (5 + 5 points)
2. Calculate the percentage of trials on which the animal responded red as a function of the signed coherence of the checkerboard. Plot this in a graph where the x-axis is signed coherence and the y-axis is percentage responded red. (20 points)
3. Calculate the mean reaction time for each value of signed coherence. Plot this in a graph where the x-axis is signed coherence and the y-axis is reaction time. (20 points)
4. Include error bars for the reaction time curve. To calculate error bars, you first need to measure the standard error of the mean (SEM), which is defined as follows (10 points):
$$\mathrm{SEM} = \frac{\mathrm{std}(X)}{\sqrt{N}}$$
where X is the variable of interest (in our case RT), std(X) is its standard deviation, and N is the number of data points. Once you have the SEM, you can use the errorbar function to plot the error bars for the data.
5. Plot a histogram of all the reaction times observed for the animal. Does the reaction time distribution look Gaussian? (10 + 5 points)
6. Plot the median and the mean of RTs as a function of the signed coherence. Which is larger, the mean or the median, and why? (10 + 5 points)
7. Label all axes and include legends and make sure all labels are legible. Provide a title for all the graphs at the top of the figure. (This will need some research). (5 points).
8. All functions need to be commented and code needs appropriate comments. Files should execute with minimal modification. (5 points).
9. Note that we are not looking for inefficient, awful-looking code in which people avoid for loops and manually calculate the percent red or RT for each signed coherence. These kinds of solutions will be docked multiple points. That is not the intent of programming in this class. We expect clean and clearly documented code.
Figure 1: The output I typically expect from Project 1.
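The grouping steps in items 1 to 4 can be sketched as follows. This is a minimal sketch on synthetic data, not the provided dataset: the variable names (`nRed`, `choice`, `rt`), the four red-square counts, and the made-up behavior model are all assumptions for illustration. It groups trials by signed coherence with `unique` and `accumarray` instead of hard-coding each condition.

```matlab
% Synthetic stand-in for the three-column matrix (all values are made up):
% number of red squares, choice (2 = red, 1 = green), reaction time in ms.
rng(1);                                   % reproducible random numbers
levels = [45; 90; 135; 180];              % hypothetical red-square counts
nRed   = levels(randi(4, 500, 1));        % 500 trials
coh    = 100 * (2*nRed - 225) / 225;      % 100*(R-G)/(R+G) with G = 225 - R
choice = 1 + (rand(500, 1) < 1 ./ (1 + exp(-coh / 15)));
rt     = 300 + 200 * exp(-abs(coh) / 30) + 50 * randn(500, 1);

% Group trials by signed coherence; g maps each trial to its coherence level.
[cohLevels, ~, g] = unique(coh);
pctRed = 100 * accumarray(g, double(choice == 2), [], @mean);
meanRT = accumarray(g, rt, [], @mean);
semRT  = accumarray(g, rt, [], @(x) std(x) / sqrt(numel(x)));

% Psychometric curve (left) and chronometric curve with SEM bars (right).
figure;
subplot(1, 2, 1); plot(cohLevels, pctRed, 'o-');
xlabel('Signed coherence (%)'); ylabel('% responded red');
subplot(1, 2, 2); errorbar(cohLevels, meanRT, semRT, 'o-');
xlabel('Signed coherence (%)'); ylabel('Reaction time (ms)');
```

With the real data, the first three assignments would be replaced by the columns of the loaded matrix; the grouping code stays the same regardless of how many coherence levels there are.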
2 Properties of Distributions
Many analyses in neuroscience and psychology involve the properties of distributions. For instance, in classical signal detection theory, the assumption is that stimuli and noise are distributed according to the normal distribution. Similarly, the firing rates of some neurons can be Poisson in nature. When neurons are Poisson, the interspike intervals are exponentially distributed. In contrast, some neurons are more regular than Poisson, in which case the interspike interval distribution follows a gamma distribution. Reaction times typically do not follow a Gaussian distribution and often follow a gamma distribution. Finally, beta distributions are often used to model undergraduate scores in exams and also as the prior in Bayesian statistics. These are not a part of the class but will be a scaffold for the future.
To help you get an understanding of distributions, I am giving you the opportunity to play with datasets from these distributions. Download the dataset from Blackboard. You should get a matrix with 2000 rows and 6 columns. Each column is data generated from a distribution. There are 6 different distributions.
1. Plot the histograms of the six distributions. We want 2 rows and 3 columns. Use 20 bins for the histograms. (10 points)
2. Write a function that calculates the mean and variance of the distributions. (20 points) Remember that the sample mean and the sample variance are given as
Mean:
$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$$
Variance:
$$s_x^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2$$
3. Calculate the sample mean and the sample median for the distributions and produce a bar plot of the means and medians. We want both mean and median for all distributions in the same plot, and for each distribution the mean and the median should be side by side. Do you notice something about the means and the medians of the different distributions? Can you comment on why the medians and means have this pattern for these distributions? (10 + 5 + 5 points)
4. Besides the mean, median, and variance, distributions are also described by their skewness and kurtosis.
Skewness tells you the asymmetry of the distribution about its mean, that is, whether the majority of the data is shifted to the right or left of the mean. A normal distribution is centered on the mean and has zero skewness.
Skewness:
$$\mathrm{Skewness} = \frac{E\left[(x - \mu)^3\right]}{\sigma^3}$$
where $\mu$ is the mean of x, $\sigma$ is its standard deviation, and $E[\cdot]$ denotes the average over the data.
Kurtosis is another quantity that describes the shape of the distribution. It provides insight into the tails of the distribution and tells you whether the distribution is heavy tailed (meaning more extreme values) or narrow tailed (meaning fewer extreme values). A normal distribution has a kurtosis of 3 and is used as a reference distribution.
Kurtosis:
$$\mathrm{Kurtosis} = \frac{E\left[(x - \mu)^4\right]}{\sigma^4}$$
5. Using your kurtosis and skewness measures, provide an intelligent and cohesive comment on the 6 distributions. Do some research on positively and negatively skewed distributions and on leptokurtic and platykurtic distributions and use that as a scaffold for your answers. (15 points)
Write a function that returns the mean, the variance, the skewness, and the kurtosis for a distribution. We expect 4 return values, one for each of the metrics of the distribution. (25 points)
6. Label all axes and include legends and make sure all labels are legible. Provide a title for all the graphs at the top of the figure. (This will need some research.) (5 points)
7. All functions need to be commented and code needs appropriate comments. Files should execute with minimal modification. (5 points)
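The four shape statistics can be written directly from their definitions. The sketch below is illustrative only: it uses normally distributed synthetic data as a stand-in for one column of the dataset, the helper names are made up, and it assumes N−1 normalization for the variance and 1/N central moments for skewness and kurtosis (one common convention; other normalizations exist).

```matlab
rng(2);
x = randn(2000, 1);       % hypothetical stand-in for one data column

% Moment-based statistics written out from their definitions.
sampleMean = @(v) sum(v) / numel(v);
sampleVar  = @(v) sum((v - sampleMean(v)).^2) / (numel(v) - 1);
centralMom = @(v, k) sum((v - sampleMean(v)).^k) / numel(v);   % 1/N moments
skew       = @(v) centralMom(v, 3) / centralMom(v, 2)^1.5;
kurt       = @(v) centralMom(v, 4) / centralMom(v, 2)^2;

% Compare the hand-written mean and variance against the built-ins.
fprintf('mean %.4f (built-in %.4f)\n', sampleMean(x), mean(x));
fprintf('var  %.4f (built-in %.4f)\n', sampleVar(x),  var(x));
fprintf('skewness %.3f, kurtosis %.3f\n', skew(x), kurt(x));
```

For normally distributed data the skewness should come out near 0 and the kurtosis near 3. Anonymous functions keep the sketch in a single script; the project itself asks for a regular function file with four return values.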
3 Variance, covariance and correlation between two sets of data
In neuroscience, you often wish to know whether the firing rates of two neurons are correlated with each other. This is the field of noise and signal correlations (cf. Ruff and Cohen, 2016). To calculate correlations between variables, you first need to define a quantity called covariance.
Remember that in class we talked about the mean and variance, which have the following forms.
Mean:
$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$$
Variance:
$$s_x^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2$$
In the above equation, $s_x$ is the standard deviation, and the square of the standard deviation is the variance. When one needs to understand the relationship between two variables, the relevant quantity is the covariance, which is
$$s_{xy} = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})$$
We will try to implement functions to calculate these quantities using MATLAB.
1. Generate a variable X as a column vector with 100 normally (Gaussian) distributed random numbers with a mean of 3 and a standard deviation of 3. Then generate 3 variables Y1, Y2, Y3 with the following commands. What are the sizes of the vectors you generated? (2 + 2 + 2 + 2 + 2 points)
• Y1 = 3*X + 2*randn(length(X),1);
• Y2 = -3*X + 2*randn(length(X),1);
• Y3 = 4 + 2*randn(length(X),1);
2. Write a function that returns the mean and the variance when given an array as an input. (20 points) . Run the mean and variance function on the data. Show that your implementation provides the same result as built in functions in MATLAB.
3. Do you notice any similarities between the equations for variance and covariance? (10 points) .
4. Plot all 3 variables versus X. We are expecting 1 figure with 3 sub plots. We want 1 row and 3 columns of plots. Look up the subplot command. (10 points)
5. Calculating covariance
(a) Write a function that calculates the covariance between two variables X and Y. (5 points)
(b) Use this function to calculate the covariance between X and each of Y1, Y2, Y3. (5 points)
(c) Does your estimate of covariance match the result from the cov function in MATLAB? Can you rewrite the function so that it provides the same result as the MATLAB function? (10 points)
6. Now write a function that calculates the correlation between X and Y. The correlation between two variables X and Y is given by the equation
$$r_{xy} = \frac{s_{xy}}{s_x s_y}$$
where $s_{xy}$ is the covariance of X and Y, and $s_x$ and $s_y$ are their standard deviations.
Again, examine whether the results of your function match what you get from MATLAB. (20 points)
7. Label all axes and include legends and make sure all labels are legible. Provide a title for all the graphs at the top of the figure. (This will need some research.) (5 points)
8. All functions need to be commented and code needs appropriate comments. (5 points)
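A hand-written covariance and correlation can be checked against the built-ins as sketched below. This is a sketch under the assumption of N−1 normalization (which is what MATLAB's `cov` uses by default); the helper names `covXY` and `corrXY` are made up for illustration. Note that `cov(X, Y)` and `corrcoef(X, Y)` return 2-by-2 matrices, so the value to compare against is the off-diagonal entry.

```matlab
rng(3);
X  = 3 + 3 * randn(100, 1);                % mean 3, standard deviation 3
Y1 = 3 * X + 2 * randn(length(X), 1);      % positively related to X

% Covariance and correlation from their definitions (N-1 normalization).
covXY  = @(a, b) sum((a - mean(a)) .* (b - mean(b))) / (numel(a) - 1);
corrXY = @(a, b) covXY(a, b) / sqrt(covXY(a, a) * covXY(b, b));

C = cov(X, Y1);          % 2x2 matrix: [var(X) cov; cov var(Y1)]
R = corrcoef(X, Y1);     % 2x2 matrix of correlation coefficients

fprintf('covariance:  mine %.4f, built-in %.4f\n', covXY(X, Y1),  C(1, 2));
fprintf('correlation: mine %.4f, built-in %.4f\n', corrXY(X, Y1), R(1, 2));
```

Because `covXY(a, a)` is the variance of `a`, the same helper also makes the similarity between the variance and covariance equations concrete.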
4 Mortgages
Assume you are getting a mortgage from your friendly neighborhood bank for a house priced at V, with a down payment percentage d and a yearly interest rate r, for n years.
1. Install the Datafeed Toolbox, retrieve the 30-year mortgage rate in the United States, and plot this rate. Which week was the rate highest? Mark that point on the plot. Which week was the rate lowest? Mark that point on the plot. Can you identify which weeks these highest and lowest rates correspond to? (15 points)
2. Given V, d, r, and n, derive a formula for the mortgage amount you pay each month. You can define a variable called Principal (P) from V and d. (20 points)
3. Implement this formula in MATLAB, ask the user to input the values of V, d, r, and n, and return the monthly payment. (15 points)
4. Calculate how much you pay in interest and how much you pay in principal each month. Plot the interest and principal for each month that you pay. At the start of the mortgage, what are you paying for the most? How much of your mortgage have you paid off by year n/2? (10 points)
5. How much interest are you paying through the duration of your mortgage? (10 points)
6. Now assume that you received a small windfall amount “S” in year 3 of your mortgage and your bank allows you to pay that towards your mortgage. Assume that from month 37 onwards the mortgage is reduced. What is your new monthly payment? (20 points)
7. Label all axes and include legends and make sure all labels are legible. Provide a title for all the graphs at the top of the figure. (This will need some research.) (5 points)
8. All functions need to be commented and code needs appropriate comments. Files should execute with minimal modification. (5 points)
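One way to check a derived payment formula is to simulate the loan month by month and confirm that the balance reaches zero. The sketch below assumes a standard fixed-rate loan with monthly compounding at r/12, for which the payment is M = P·rm·(1+rm)^N / ((1+rm)^N − 1); the input values are hypothetical and hard-coded here, whereas the project asks for them to be read with `input`.

```matlab
% Hypothetical inputs: price, down payment (%), yearly rate (%), years.
V = 400000; d = 20; r = 6; n = 30;

P  = V * (1 - d / 100);              % principal borrowed
rm = r / 100 / 12;                   % monthly rate (assumes monthly compounding)
N  = 12 * n;                         % number of monthly payments
M  = P * rm * (1 + rm)^N / ((1 + rm)^N - 1);   % fixed monthly payment

% Amortization schedule: split each payment into interest and principal.
interestPart  = zeros(N, 1);
principalPart = zeros(N, 1);
balance = P;
for k = 1:N
    interestPart(k)  = balance * rm;           % interest on what is still owed
    principalPart(k) = M - interestPart(k);    % remainder reduces the balance
    balance = balance - principalPart(k);
end
totalInterest = sum(interestPart);
fprintf('Monthly payment: %.2f, total interest: %.2f\n', M, totalInterest);
```

Plotting `interestPart` and `principalPart` against the month index gives the picture asked for in item 4, and `cumsum(principalPart)` shows how much has been paid off by any given month.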
5 Pattern Separations
In this project, the goal is to perform pattern separations as we performed in the lecture. Assume that there are 3 classes of neurons. We have identified 2 features for 500 neurons in each class. We are reaching out to you to help us with further analysis of this data and classifying new neurons that we have identified the features of. The neuron classes are given in the attached mat file.
The second variable in the file are the new example neurons.
1. Load the neural data and plot the different neurons in a 2-dimensional space. Use different colors to plot the different classes (say red, green, blue) and different markers (x, o, and diamonds). (10 points)
2. Plot the mean feature of the three different neuron classes. (5 points)
3. What is the shape of the clusters? Do you think the clusters are equally dense? Can you provide quantitative support for your conclusions? (3 + 3 + 4 points)
4. Create a function that takes two vectors as inputs and computes the cosine similarity between them. Show how well your function performs by testing it on simple vectors. (20 points)
5. Calculate cosine similarities between the neurons in each class and the mean of the features for each class. (15 points)
6. Calculate mean cosine similarities between the neurons in class 1 and the mean for classes 2 and 3. Repeat for the other two possible combinations. Store all your answers in a 3 x 3 matrix. The diagonal values tell you the average self similarity, whereas the other entries tell you the similarity between the neurons in class i and the neurons in class j. (10 points)
7. Do you think the neuron classes are well separated? Discuss using the similarities calculated. (10 points)
8. Calculate the similarities for each of the new example neurons and identify which class is the most likely for each of them. You can use the mean point for the classes for calculating the similarities. (5 + 5 points)
9. Label all axes and include legends and make sure all labels are legible. Provide a title for all the graphs at the top of the figure. (This will need some research.) (5 points)
10. All functions need to be commented and code needs appropriate comments. Files should execute with minimal modification. (5 points)
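Cosine similarity is the dot product of two vectors divided by the product of their norms. The sketch below shows the calculation and the 3 x 3 class-by-class matrix on synthetic clusters; the cluster centers, the variable names, and the use of unit-variance Gaussian noise are assumptions for illustration and are unrelated to the data in the provided mat file.

```matlab
% Cosine similarity between two vectors of equal length.
cosSim = @(a, b) dot(a, b) / (norm(a) * norm(b));
fprintf('orthogonal: %.2f, parallel: %.2f, opposite: %.2f\n', ...
    cosSim([1 0], [0 1]), cosSim([1 2], [2 4]), cosSim([1 0], [-1 0]));

% Three synthetic classes of 500 neurons with 2 features each (made up).
rng(5);
centers = [5 1; 1 5; -4 -4];
classes = cell(1, 3);
for c = 1:3
    classes{c} = randn(500, 2) + centers(c, :);
end

% S(i, j): mean cosine similarity between neurons in class i and the
% mean feature vector of class j.
S = zeros(3);
for i = 1:3
    for j = 1:3
        mu   = mean(classes{j}, 1);
        sims = (classes{i} * mu') ./ (sqrt(sum(classes{i}.^2, 2)) * norm(mu));
        S(i, j) = mean(sims);
    end
end
disp(S);
```

A new neuron can be assigned to the class whose mean gives the largest similarity, which is the same comparison applied to a single row. Keep in mind that cosine similarity ignores vector length, so two clusters lying along the same direction from the origin would look similar even if they are far apart.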
6 Simulating the Lorenz Attractor model using differential equations (advanced)
This project is designed for those with prior programming experience in MATLAB and/or other languages and some experience in solving differential equations. This project will provide you with the opportunity to model the chaos and uncertainty that manifests in all aspects of life, including neural networks. Also known as the Lorenz butterfly, the Lorenz Attractor is a famous system of differential equations that determines an object’s path for any given starting position on the attractor. There is nothing random about this system! However, for some values of the parameters σ, r, and b, the attractor makes it so that two initial positions that are arbitrarily close (but not the same) will diverge after a number of steps. This is known as deterministic chaos, and reflects the unpredictability that can arise from approximations and forecasting.
If you want to explore the Lorenz Attractor in more detail, please refer to this great interactive model which should help give you a better initial understanding of this network:
http://www.malinc.se/m/Lorenz.php
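The system of differential equations for the attractor is as follows:
$$\frac{dx}{dt} = \sigma (y - x), \qquad \frac{dy}{dt} = x (r - z) - y, \qquad \frac{dz}{dt} = x y - b z$$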
After you have familiarized yourself with the attractor, you will need to declare your parameter values (σ, r, and b) and initial conditions (x, y, and z). Start your model by creating a vector with the classic attractor values...
...and another vector with your initial conditions. You may begin by using x = 0; y = 1; z = 20. After declaring these values, you will also need to create a timestep variable and a timespan array. Aim for 50,000 timesteps.
1. Write a function ‘lorenzattractor’ that uses your parameters to define the Lorenz system of three differential equations. (10 points)
2. Now that you have your differential equations, use the ‘ode45’ function (this may require a bit of research) to solve your system. ode45 will integrate the Lorenz equations you have created over the timespan you declared, starting at your initial conditions. Be sure to control the error tolerance using ‘options’. (30 points)
3. Plot your attractor in 3D! Does your graph match your expectations? How might it be different from other simulations of the Lorenz Attractor? (30 points)
4. Label all axes and include legends and make sure all labels are legible. Provide a title for the graph at the top of the figure. (5 points)
5. Adjust your initial attractor values and re-run. Do you observe any changes? How might they reflect the idea of deterministic chaos and how it relates to our ability to create accurate predictions in neuroscience? (20 points)
6. All functions need to be commented and code needs appropriate comments. Files should execute with minimal modification. (5 points)
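The overall workflow (parameters, initial conditions, `ode45` with tolerances, 3D plot) might look like the sketch below. It assumes the commonly used classic parameter values σ = 10, r = 28, b = 8/3 and the standard form of the Lorenz equations; the timestep of 0.001 and the specific tolerances are arbitrary choices, and the right-hand side is written as an anonymous function here rather than as a separate function file.

```matlab
params = [10, 28, 8/3];          % [sigma, r, b], classic values (assumed)
y0     = [0; 1; 20];             % initial conditions x, y, z
dt     = 0.001;                  % timestep (arbitrary choice)
tspan  = 0:dt:50;                % roughly 50,000 timesteps

% Right-hand side of the Lorenz system; y = [x; y; z], p = [sigma, r, b].
lorenzattractor = @(t, y, p) [ p(1) * (y(2) - y(1));
                               y(1) * (p(2) - y(3)) - y(2);
                               y(1) * y(2) - p(3) * y(3) ];

% Tighten the error tolerances, since small errors grow quickly in a
% chaotic system.
options = odeset('RelTol', 1e-8, 'AbsTol', 1e-10);
[t, Y] = ode45(@(t, y) lorenzattractor(t, y, params), tspan, y0, options);

figure;
plot3(Y(:, 1), Y(:, 2), Y(:, 3));
xlabel('x'); ylabel('y'); zlabel('z');
title('Lorenz attractor'); grid on;
```

Rerunning with a slightly perturbed `y0` (for example, adding 1e-6 to x) and plotting the difference between the two x traces over time is one way to make the divergence of nearby trajectories visible.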
7 Analysis of spectrograms of sounds
This project is suitable for students with a signal processing background or who find the class trivial right now :)! As a psychological researcher, you often spend time designing auditory stimuli for experiments or analyzing the vocalizations of species such as mice, monkeys, or flies. One of the handiest skills is learning to generate sounds, play them, and analyze their structure. Here, we will learn how to generate simple sinusoids, amplitude modulated signals, and frequency modulated signals.
For generation of sounds, assume a sampling frequency of 12 kHz. Waveform amplitudes should be between -1 and +1; otherwise the sounds will clip. Assume a length of 2 seconds.
1. Generate an amplitude modulated sine wave with a carrier frequency (fc) of 250 Hz and a modulating frequency (fm) of 1 Hz and scale it by 0.8. (5 points)
$$y_1(t) = 0.8 \sin(2\pi f_m t) \sin(2\pi f_c t) \tag{11}$$
2. Generate another amplitude modulated sine wave with a carrier frequency (fc) of 500 Hz and a modulating frequency (fm) of 5 Hz and scale it by 0.3. (5 points)
$$y_2(t) = 0.3 \sin(2\pi f_m t) \sin(2\pi f_c t) \tag{12}$$
3. Generate a frequency modulated signal with a carrier frequency (fc) of 800 Hz, a modulating frequency (fm) of 5 Hz, and an amplitude for the frequency modulation of 200. (10 points)
$$y_3(t) = 0.3 \sin\left(2\pi f_c t + 200 \sin(2\pi f_m t)\right) \tag{13}$$
4. Generate a rising chirp signal with a carrier frequency (fc) of 400 Hz and the following equation (5 points)
6. Concatenate all these five signals into a large variable y with a brief 0.25s silent period between them and use the audiowrite command to save to disk. (10 points)
7. Calculate the spectrogram of this concatenated signal y which is a time frequency signal that provides you a glimpse of the frequency content of the signal along with how the signal varies as a function of time. For spectrogram parameters in MATLAB use 1024 as your window size, 512 as your overlap and 1024 points for your FFT, and 12000 as your Fs. (20 points) . MATLAB has a function called spectrogram for this very purpose :)!
8. Comment on the spectrogram of the signal and link it back to the equations you used for generating the signal. Can you explain why you see what you see in the spectrogram? (10 points)
9. Download the Project6 audioFile.mp3 file from Blackboard and use the spectrogram command on it. Use 2048 samples as your window, 1024 samples as overlap, 8192 points for your FFT, and use the Fs retrieved from the file. (20 points)
10. Label all axes and include legends and make sure all labels are legible. Provide a title for all the graphs at the top of the figure. (This will need some research.) (5 points)
11. All functions need to be commented and code needs appropriate comments. Files should execute with minimal modification. (5 points)
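The generation, concatenation, and spectrogram steps can be sketched as below for the three signals whose equations are given above (equations 11 to 13); the chirp signals are left out. This assumes the Signal Processing Toolbox is installed (it provides `spectrogram`), and the temporary output file name is a placeholder.

```matlab
Fs  = 12000;                         % sampling frequency in Hz
dur = 2;                             % duration of each signal in seconds
t   = (0:dur * Fs - 1)' / Fs;        % time axis as a column vector

y1 = 0.8 * sin(2*pi*1*t) .* sin(2*pi*250*t);          % AM, fm = 1, fc = 250
y2 = 0.3 * sin(2*pi*5*t) .* sin(2*pi*500*t);          % AM, fm = 5, fc = 500
y3 = 0.3 * sin(2*pi*800*t + 200 * sin(2*pi*5*t));     % FM, fm = 5, fc = 800

gap = zeros(round(0.25 * Fs), 1);    % 0.25 s of silence between signals
y   = [y1; gap; y2; gap; y3];

outFile = [tempname '.wav'];         % placeholder output location
audiowrite(outFile, y, Fs);

% Window 1024 samples, overlap 512 samples, 1024-point FFT.
[S, F, T] = spectrogram(y, 1024, 512, 1024, Fs);

figure;
imagesc(T, F, 20 * log10(abs(S) + eps));   % power in dB
axis xy; colorbar;
xlabel('Time (s)'); ylabel('Frequency (Hz)');
title('Spectrogram of the concatenated signal');
```

`sound(y, Fs)` plays the result. In the picture, the amplitude modulated signals should appear as narrow bands around their carrier frequencies, while the frequency modulated signal should sweep up and down around 800 Hz.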
8 Simulating Hodgkin Huxley models of the squid giant axon (advanced)
This project is meant for people with a significant amount of programming experience in MATLAB and/or other languages and some experience in solving differential equations using MATLAB. In this project, you get the exciting opportunity to simulate the famous Hodgkin-Huxley model of the squid giant axon. This was one of the earliest demonstrations of the power of using quantitative methods to analyze single neuron spikes. If you want to read more about Hodgkin-Huxley models, an excellent introduction is available at http://www.math.pitt.edu/bdoiron/assets/ermentrout-and-terman-ch-1.pdf. I am assuming the resting potential is at 0 mV, as in the Hodgkin-Huxley experiments. The governing equations for the Hodgkin-Huxley model are as follows.
The constants are as follows:
For the rate constants that depend on membrane voltage assume the following equations
1. Plot the properties of the gating variables m, n, and h as a function of input voltage. Does this match your understanding of how an action potential is generated? (30 points)
2. Simulate an action potential using the Hodgkin-Huxley model. Get it to generate a spike! (30 points)
3. Simulate action potentials using the Hodgkin-Huxley model you programmed, using several different currents from 2 μA to 10 μA. (30 points)
4. Label all axes and include legends and make sure all labels are legible. Provide a title for all the graphs at the top of the figure. (This will need some research.) (5 points)
5. All functions need to be commented and code needs appropriate comments. Files should execute with minimal modification. (5 points)
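A single-compartment simulation might be structured as in the sketch below. Every constant and rate function here is an assumption: they are the textbook Hodgkin-Huxley values with the resting potential shifted to 0 mV, and they should be replaced by the constants and rate equations given in this handout wherever the two differ. The sketch uses simple forward Euler integration with a small timestep; `ode45` is an alternative.

```matlab
% Textbook Hodgkin-Huxley constants, rest at 0 mV (assumed values).
C   = 1;                          % membrane capacitance, uF/cm^2
gNa = 120; gK = 36; gL = 0.3;     % maximal conductances, mS/cm^2
ENa = 115; EK = -12; EL = 10.6;   % reversal potentials, mV relative to rest

% Voltage-dependent rate constants in 1/ms (assumed textbook forms).
% an and am are undefined (0/0) exactly at V = 10 and V = 25.
an = @(V) 0.01 * (10 - V) ./ (exp((10 - V) / 10) - 1);
bn = @(V) 0.125 * exp(-V / 80);
am = @(V) 0.1 * (25 - V) ./ (exp((25 - V) / 10) - 1);
bm = @(V) 4 * exp(-V / 18);
ah = @(V) 0.07 * exp(-V / 20);
bh = @(V) 1 ./ (exp((30 - V) / 10) + 1);

dt = 0.01;  t = 0:dt:50;          % time in ms
I  = 10 * (t >= 5);               % current step of 10 units switched on at 5 ms

% Start at rest with the gating variables at their steady-state values.
V = zeros(size(t));
m = am(0) / (am(0) + bm(0));
n = an(0) / (an(0) + bn(0));
h = ah(0) / (ah(0) + bh(0));

for k = 1:numel(t) - 1
    Iion = gNa * m^3 * h * (V(k) - ENa) + gK * n^4 * (V(k) - EK) ...
         + gL * (V(k) - EL);
    V(k + 1) = V(k) + dt * (I(k) - Iion) / C;
    m = m + dt * (am(V(k)) * (1 - m) - bm(V(k)) * m);
    n = n + dt * (an(V(k)) * (1 - n) - bn(V(k)) * n);
    h = h + dt * (ah(V(k)) * (1 - h) - bh(V(k)) * h);
end

figure;
plot(t, V);
xlabel('Time (ms)'); ylabel('Membrane potential (mV, relative to rest)');
title('Hodgkin-Huxley response to a current step');
```

Wrapping the loop in a function of the injected current makes it straightforward to repeat the simulation for several current amplitudes and overlay the traces.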
Classification and summarizing variance are key parts of neuroscience. This project is for people who enjoy playing with high dimensional data and want to derive intuitions for principal components analysis. I recognize that for many in the class this might not be what has been taught in lecture, but it is an opportunity for you to think about these problems if you are interested in data science and machine learning in general.
9 Temperature data
In this project, we are going to use boxplots, t-tests, ANOVAs, and regression to understand statistical data. These statistical techniques are used often in neuroscience and psychology. The temperature data is just a convenient way to play with these techniques. Download the NOAA climate data from Blackboard for 6 cities: Miami, Boston, NY, San Francisco, Austin, and Bozeman. Perform the following analysis.
1. Use a boxplot to plot the mean temperature over the last century (averaged over the 12 months in a year) for the cities. Make sure you label the x-axis with the correct city labels and the y-axis with temperature. (10 points)
2. Write a function that performs an unpaired t-test with pooled variance. Is the mean temperature (again averaged over 12 months) for NY over the last century different from that of Boston? Compare your results with the output of the built-in ttest2 function, which performs unpaired t-tests. (15 points)
3. Use an ANOVA to compare the mean temperature for the cities. Test for the main effect that there are differences between the cities. (15 points)
4. Use the multcompare function to perform post hoc tests to show that there are differences between the cities. (10 points)
5. Plot the mean temperature as a function of year for the six cities. (10 points)
6. Use a regression analysis to estimate the change in temperature as a function of time. Using the slopes, can you say which city has demonstrated the most rapid change in temperature during the last 100 years? (20 points)
7. Create a plot with the slopes and confidence intervals for the slopes. Create another plot that shows the amount of variance explained by the regression. Identify the two cities with the strongest effect of time on temperature.
8. Label all axes and include legends and make sure all labels are legible. Provide a title for all the graphs at the top of the figure. (This will need some research.) (5 points)
9. All functions need to be commented and code needs appropriate comments. Files should execute with minimal modification. (5 points)
10. BONUS question: For question 2, ensure your function has the option to perform a t-test assuming unequal variance for the two groups. This is called Welch’s t-test. (20 points)
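The pooled-variance t statistic from question 2 can be sketched as follows on two toy samples (made-up numbers, not the NOAA data; the variable names are illustrative). The two-sided p-value is computed with the base MATLAB function `betainc`, using the identity between the t distribution and the regularized incomplete beta function, so the sketch does not depend on the Statistics and Machine Learning Toolbox.

```matlab
a = [1 2 3 4 5];                  % toy sample 1 (made up)
b = [2 3 4 5 6];                  % toy sample 2 (made up)

na = numel(a);  nb = numel(b);
df  = na + nb - 2;                                   % degrees of freedom
sp2 = ((na - 1) * var(a) + (nb - 1) * var(b)) / df;  % pooled variance
tStat = (mean(a) - mean(b)) / sqrt(sp2 * (1/na + 1/nb));

% Two-sided p-value: P(|T| > |tStat|) for a t distribution with df
% degrees of freedom equals I_x(df/2, 1/2) with x = df / (df + tStat^2).
p = betainc(df / (df + tStat^2), df / 2, 0.5);

fprintf('t = %.4f, df = %d, p = %.4f\n', tStat, df, p);
```

If the toolbox is available, `[h, p2, ci, stats] = ttest2(a, b)` assumes equal variances by default, so `stats.tstat` and `p2` can be compared against these values. For the Welch variant in the bonus question, the standard error becomes sqrt(var(a)/na + var(b)/nb) and the degrees of freedom follow the Welch-Satterthwaite formula.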