For this design task, you need to record your own voice at a sampling frequency of 16 kHz (the voice signal bandwidth is 0-8 kHz). The purpose is to determine the pitch period or pitch frequency of your voice. Start by saving the recorded sound file, and then use MATLAB to play the sound. Before proceeding to find the pitch period, listen to your own voice. Make sure to display the sound waveform on the computer screen and show it to your demonstrator. Since the vocal tract changes every 10 to 20 ms, you should divide the voiced speech into non-overlapping frames of 20 ms each (see Figure 7).
To find the pitch period, manually count the number of samples between similar peaks in a frame, and repeat the process for other frames. Dividing the sample count by the sampling frequency gives the pitch period in seconds, and its reciprocal gives the pitch frequency. Finally, discuss your results with your lab demonstrator. Once you have completed the above manual estimate of the pitch period, you can proceed to Levels 2, 3, 4, and 5, one after the other.
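As a rough illustration of the framing step and of how a sample count maps to a pitch period, a MATLAB sketch along the following lines may help. It uses a synthetic 125 Hz tone as a stand-in for the recording. The variable names, the commented file name `myvoice.wav` and the example count of 128 samples are assumptions made for illustration, not values from the task.

```matlab
% Sketch only: 20 ms framing and conversion of a manual peak count to pitch.
fs = 16000;                          % sampling frequency from the task (Hz)
frameLen = round(0.020 * fs);        % 20 ms -> 320 samples per frame

% Stand-in for the recording (assumption): 1 s of a 125 Hz tone.
% With a real file one might instead use: [x, fs] = audioread('myvoice.wav');
t = (0:fs-1)' / fs;
x = sin(2*pi*125*t);

% Split into non-overlapping frames, one frame per column.
numFrames = floor(length(x) / frameLen);
frames = reshape(x(1:numFrames*frameLen), frameLen, numFrames);

% Plot one frame against sample index so that peaks can be counted by eye.
% plot(frames(:, 10)); xlabel('Sample index'); ylabel('Amplitude');

samplesBetweenPeaks = 128;           % hypothetical count read off the plot
pitchPeriod_ms = 1000 * samplesBetweenPeaks / fs;   % 8 ms
pitchFreq_Hz   = fs / samplesBetweenPeaks;          % 125 Hz
fprintf('Period: %.2f ms, frequency: %.1f Hz\n', pitchPeriod_ms, pitchFreq_Hz);
```

Repeating the count on several frames gives a feel for how much the pitch drifts within one utterance.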
All the components needed for this task are already available to students who selected Task 1A. Those who selected simulation Task 1B as their first task can collect the hardware for this second design task (2A) from the lab demonstrators. As before, the use of microcontrollers is not permitted.
• Please note that students are required to build their speaker circuit using the components provided in Task 1A, such as speakers (8 Ω), breadboard and transistors. If the components are lost or damaged, students will need to purchase replacements themselves. Laptops or desktops cannot be used as speakers.
• Upon completion, students will be assessed by demonstrating their design to the lab demonstrators and by having their Experiment Design Log (or journal), submitted via Moodle, evaluated.
• Please ensure that you take photos or record the completion of each level as you move to the next so that you have sufficient implementation details and results to be documented in your design journal or logbook.
• The deadline for completion of this task is 9 am on July 11, 2024. If necessary and if not already used, you can request a penalty-free one-week extension.
• All design tasks in this course must be completed individually. Copying from others is not permitted and may result in a failing grade, reinforcing the importance of academic integrity.
• Note 1: Students are not permitted to use built-in commands for autocorrelation, AMDF or harmonic product spectrum. You must write your own routines.
• Formal feedback will be provided well within two weeks of the relevant submission date through Moodle.
• The assessment will be based on the TLT level selected by the student, as described below:
Level 2 (Pass): For this, you need to complete the two parts:
* Part A: You need to estimate the pitch period of voiced speech (using your previously saved sound file) on a frame-by-frame basis using two functions: (1) the Autocorrelation Function (AF) and (2) the Average Magnitude Difference Function (AMDF). Implement these functions in MATLAB using the correct formulas. Then, write code that automatically estimates the pitch period on a frame-by-frame basis from the results of these functions. Compare what you get with your own manual estimate, and then discuss your findings with your lab demonstrator.
* Part B: The pitch frequency ranges from 50 Hz to 400 Hz (a pitch period of 20 ms down to 2.5 ms), so you should remove frequencies above 400 Hz before estimating the pitch period. To do this, design a lowpass filter with a cut-off frequency of 400 Hz and filter your recorded voice signal with it. Then, use the Autocorrelation Function and the Average Magnitude Difference Function again to estimate the pitch period of this filtered voice signal. Compare this pitch period estimate with your manual one and your previous estimate. Discuss these comparisons, paying special attention to any doubling or halving of the pitch, with your lab demonstrator.
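One possible way to obtain the 400 Hz lowpass filter of Part B without toolbox functions is a windowed-sinc FIR design. The sketch below is only an illustration under stated assumptions: the filter order (400), the Hamming window and the two-tone test signal are choices made here, not requirements of the task.

```matlab
% Sketch only: windowed-sinc FIR lowpass with a 400 Hz cut-off.
fs = 16000;                        % sampling frequency (Hz)
fc = 400;                          % cut-off frequency from Part B (Hz)
M  = 400;                          % filter order (assumption); M+1 taps
m  = (0:M)' - M/2;                 % tap index centred on zero
wc = 2*pi*fc/fs;                   % cut-off in rad/sample

h = (wc/pi) * ones(M+1, 1);        % ideal response at m = 0
nz = (m ~= 0);
h(nz) = sin(wc*m(nz)) ./ (pi*m(nz));          % ideal lowpass (sinc) elsewhere
w = 0.54 - 0.46*cos(2*pi*(0:M)'/M);           % Hamming window written out by hand
h = h .* w;
h = h / sum(h);                    % normalise to unity gain at DC

% Test signal (assumption): one tone inside and one outside the passband.
t = (0:fs-1)' / fs;
x = sin(2*pi*200*t) + sin(2*pi*2000*t);
y = conv(x, h, 'same');            % 'same' keeps the output aligned with x

% Measure the surviving amplitude of each tone away from the edges.
seg = 1001:15000;
amp = @(s, f) 2*abs(mean(s(seg) .* exp(-1j*2*pi*f*t(seg))));
gain200  = amp(y, 200);            % expected to stay close to 1
gain2000 = amp(y, 2000);           % expected to be strongly attenuated
fprintf('Gain at 200 Hz: %.3f, at 2000 Hz: %.5f\n', gain200, gain2000);
```

A longer filter gives a sharper transition around 400 Hz at the cost of more computation, which is one of the trade-offs worth raising with the demonstrator.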
Figure 8 shows example speech waveforms for a female speaker, a male speaker and a child speaker. The fundamental frequency extracted using the AMDF method for each 20 ms frame is also shown.
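To make the frame-level AF and AMDF estimates of Level 2 concrete, here is a minimal sketch for a single 20 ms frame. It assumes the common short-time definitions R(k) = Σ x(n)·x(n+k) and D(k) = mean |x(n) − x(n+k)|, which should be checked against the formulas given in the course. The synthetic 125 Hz test frame, the variable names and the lag limits are assumptions for illustration.

```matlab
% Sketch only: pitch of one frame from hand-written AF and AMDF loops.
fs = 16000;                       % sampling frequency (Hz)
N  = round(0.020 * fs);           % 20 ms frame -> 320 samples
n  = (0:N-1)';
f0True = 125;                     % pitch of the synthetic frame (assumption)
x = sin(2*pi*f0True*n/fs) + 0.5*sin(2*pi*2*f0True*n/fs);

kMin = floor(fs / 400);           % shortest lag searched (400 Hz upper limit)
kMax = floor(N / 2);              % longest lag kept to half a frame (assumption);
                                  % pitches below fs/kMax = 100 Hz would need a longer window
lags = (kMin:kMax)';
R = zeros(size(lags));            % autocorrelation values
D = zeros(size(lags));            % AMDF values
for i = 1:length(lags)
    k = lags(i);
    a = x(1:N-k);                 % x(n)
    b = x(1+k:N);                 % x(n+k)
    R(i) = sum(a .* b);           % large when the frame matches its shifted copy
    D(i) = mean(abs(a - b));      % small when the frame matches its shifted copy
end

[~, iR] = max(R);                 % AF: pick the highest peak in the lag range
[~, iD] = min(D);                 % AMDF: pick the deepest valley in the lag range
f0_af   = fs / lags(iR);
f0_amdf = fs / lags(iD);
fprintf('AF: %.1f Hz, AMDF: %.1f Hz\n', f0_af, f0_amdf);
```

On real speech the highest peak or deepest valley can sit at twice or half the true period, which is the pitch doubling and halving that Part B asks you to look for.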
Level 3 (Credit): Once you’ve completed Level 2, you can move on to Level 3. Here, you’ll calculate the pitch period of one or more voiced speech segments using a frequency-domain technique. Periodicity in the time domain produces impulses in the frequency domain at the fundamental frequency (f0) and its harmonics (2f0, 3f0, 4f0, 5f0, etc.). If the speech signal is periodic or quasi-periodic, the magnitude spectrum will show peaks at multiples of the fundamental frequency f0. You’ll need to apply the Harmonic Product Spectrum (HPS) method in the frequency domain to estimate the pitch period or pitch frequency. Compare your findings with those from the time-domain methods and discuss them with your lab demonstrator.
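The HPS idea can be sketched as follows: the magnitude spectrum is downsampled by 2, 3, ... and the copies are multiplied, so that the harmonics line up and reinforce the bin at f0. This is a minimal illustration, not a reference solution. The synthetic 200 Hz frame, the FFT length, the Hamming window and the number of harmonics K = 4 are all assumptions.

```matlab
% Sketch only: Harmonic Product Spectrum on one 20 ms frame.
fs = 16000;                          % sampling frequency (Hz)
N  = round(0.020 * fs);              % 320 samples
n  = (0:N-1)';
f0True = 200;                        % pitch of the synthetic frame (assumption)
x = zeros(N, 1);
for h = 1:5                          % five harmonics with 1/h amplitudes
    x = x + (1/h) * sin(2*pi*h*f0True*n/fs);
end

w = 0.54 - 0.46*cos(2*pi*n/(N-1));   % Hamming window written out by hand
nfft = 8192;                         % zero-padded FFT, about 1.95 Hz per bin
X = abs(fft(x .* w, nfft));
X = X(1:nfft/2);                     % keep bins 0 .. nfft/2-1

K = 4;                               % number of spectra multiplied (assumption)
L = floor(length(X) / K);            % bins available after downsampling by K
hps = X(1:L);
for k = 2:K
    hps = hps .* X(1:k:k*L);         % element i is the spectrum at bin k*(i-1)
end

% Search only the 50-400 Hz pitch range stated in the task.
bLo = ceil(50 * nfft / fs);
bHi = floor(400 * nfft / fs);
[~, idx] = max(hps(bLo+1 : bHi+1));  % +1 because bin 0 is stored at index 1
f0_hps = (bLo + idx - 1) * fs / nfft;
fprintf('HPS estimate: %.1f Hz\n', f0_hps);
```

The frequency resolution of the estimate is set by the bin spacing fs/nfft, so the result is quantised in a different way from the lag-based time-domain estimates.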
Please note that for the above-mentioned Levels 2 and 3, there are three things to be tried:
* Fundamental: use a vowel recording (a, e, i, o, or u) and check that your design works
* Advanced: use a sentence recording, which is more difficult because it involves energy detection
* Bonus: Add additive white Gaussian noise (AWGN) and test your design as a noise-robust system
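For the Advanced and Bonus items, a simple starting point might look like the sketch below: noise is added at a chosen signal-to-noise ratio and each 20 ms frame is kept or discarded according to its short-time energy. The stand-in signal, the 20 dB SNR, the choice to measure signal power over the active part only, and the threshold of 10% of the maximum frame energy are all assumptions to be tuned on real recordings.

```matlab
% Sketch only: AWGN at a target SNR followed by frame-energy detection.
fs = 16000;
frameLen = round(0.020 * fs);              % 320 samples
t = (0:fs-1)' / fs;

% Stand-in for a sentence (assumption): silence, 0.5 s of a 150 Hz tone, silence.
x = zeros(fs, 1);
active = (t >= 0.25) & (t < 0.75);
x(active) = sin(2*pi*150*t(active));

% Add white Gaussian noise at the requested SNR.
snr_dB = 20;                               % target SNR (assumption)
sigPower   = mean(x(active).^2);           % power of the active part only
noisePower = sigPower / 10^(snr_dB/10);
noise = sqrt(noisePower) * randn(size(x));
y = x + noise;
measuredSNR_dB = 10*log10(sigPower / mean(noise.^2));

% Short-time energy per non-overlapping frame.
numFrames = floor(length(y) / frameLen);
frames = reshape(y(1:numFrames*frameLen), frameLen, numFrames);
E = sum(frames.^2, 1) / frameLen;

% Keep frames whose energy exceeds a fraction of the maximum (assumption).
thr = 0.1 * max(E);
isSpeech = E > thr;
fprintf('%d of %d frames kept, measured SNR %.1f dB\n', ...
        sum(isSpeech), numFrames, measuredSNR_dB);
```

Only the frames flagged in `isSpeech` would then be passed to the pitch estimator; lowering `snr_dB` shows how quickly the energy decision and the pitch estimates degrade.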
Level 4 (Distinction): After finishing Levels 2 and 3, you’re ready to move on to Level 4. In this stage, you need to record your own voice saying the word ’six’, which contains both voiced and unvoiced segments. Your task is to write a MATLAB program that identifies which segments are voiced and which are unvoiced. Then, you’ll calculate the pitch frequency for the voiced segments. After that, you’ll need to write another MATLAB program to create a sound for each pitch frequency and generate random noise for the unvoiced segments. You will then play these sounds through a speaker circuit connected to your computer’s input/output port.
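One way to approach the voiced/unvoiced decision and the resynthesis is sketched below, using short-time energy and zero-crossing rate (ZCR) for the decision and a hand-written AMDF for the pitch. Everything specific here is an assumption: the synthetic noise-tone-noise signal standing in for the word, both thresholds, the use of a plain sine for voiced frames and the output amplitudes.

```matlab
% Sketch only: voiced/unvoiced labelling, pitch per voiced frame, resynthesis.
fs = 16000;
frameLen = round(0.020 * fs);                 % 320 samples
nU = round(0.1 * fs);                         % 0.1 s unvoiced stand-in
nV = round(0.2 * fs);                         % 0.2 s voiced stand-in
tv = (0:nV-1)' / fs;
x = [0.1*randn(nU,1); sin(2*pi*150*tv); 0.1*randn(nU,1)];   % noise, tone, noise

numFrames = floor(length(x) / frameLen);
frames = reshape(x(1:numFrames*frameLen), frameLen, numFrames);

% Features: energy and fraction of adjacent sample pairs that change sign.
E   = mean(frames.^2, 1);
zcr = mean(double(abs(diff(sign(frames), 1, 1)) > 0), 1);
isVoiced = (E > 0.1*max(E)) & (zcr < 0.1);    % both thresholds are assumptions

% Pitch of each voiced frame from the deepest AMDF valley.
lags = (floor(fs/400):floor(frameLen/2))';
f0 = zeros(1, numFrames);
for i = find(isVoiced)
    fr = frames(:, i);
    D = zeros(size(lags));
    for j = 1:length(lags)
        k = lags(j);
        D(j) = mean(abs(fr(1:frameLen-k) - fr(1+k:frameLen)));
    end
    [~, jMin] = min(D);
    f0(i) = fs / lags(jMin);
end

% Resynthesis: a sine at the estimated pitch for voiced frames, noise otherwise.
y = zeros(numFrames*frameLen, 1);
phase = 0;                                    % carried over to avoid clicks
for i = 1:numFrames
    idx = (i-1)*frameLen + (1:frameLen);
    if isVoiced(i)
        ph = phase + 2*pi*f0(i)*(1:frameLen)'/fs;
        y(idx) = 0.5 * sin(ph);
        phase = ph(end);
    else
        y(idx) = 0.05 * randn(frameLen, 1);
    end
end
% sound(y, fs);                               % play through the speaker circuit
```

On a real recording of the word, the stop closure before the final fricative produces low-energy frames, so a third "silence" class may be worth considering.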
Level 5 (High Distinction): Once Levels 2, 3, and 4 are finished, you can move on to Level 5. In this stage, you’ll record another sentence that is fully voiced, meaning it has no unvoiced parts. An example might be "We were away," which is all voiced. Typically, telephone speech does not include frequencies below about 300 Hz, so the speaker’s fundamental frequency is often absent. One theory about how humans perceive pitch suggests that a nonlinearity in our auditory system can recreate the fundamental frequency. In Level 5, we’ll explore this theory through simulation.
Your task is to design a band-pass filter with a lower cut-off frequency of 320 Hz and an upper cut-off frequency of 7 kHz. When you apply this filter to your recorded speech, the output will no longer contain the fundamental frequency, provided it lies below 320 Hz. However, if you run this output through a nonlinear circuit (see diagram in Figure 7), you’ll be able to regenerate the fundamental frequency from the remaining harmonics. Illustrate this concept by writing a MATLAB program to estimate the pitch period at points A, B, and C, as indicated in the diagram below. Discuss your results with your lab demonstrator.
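The missing-fundamental effect can be previewed with a short simulation such as the one below. It is a sketch under several assumptions: points A, B and C are taken to be the input, the band-pass output and the output of the nonlinearity, the nonlinearity is modelled as a simple squarer (the circuit in the figure may differ), the band-pass filter is a windowed-sinc design of order 400, and a synthetic 200 Hz harmonic signal stands in for the recorded sentence. Instead of a full pitch estimate, it only measures the level of the 200 Hz component at each point.

```matlab
% Sketch only: removing and regenerating the fundamental (missing fundamental).
fs = 16000;
f0 = 200;                                   % fundamental of the test signal (assumption)
t  = (0:fs-1)' / fs;

% Point A (assumed): harmonic-rich signal with components at f0 .. 10*f0.
a = zeros(size(t));
for h = 1:10
    a = a + sin(2*pi*h*f0*t) / h;
end

% Band-pass 320 Hz .. 7 kHz as the difference of two windowed-sinc lowpass filters.
M   = 400;                                  % filter order (assumption)
m   = (0:M)' - M/2;
mid = M/2 + 1;                              % index of the centre tap
w   = 0.54 - 0.46*cos(2*pi*(0:M)'/M);       % Hamming window written out by hand
hHi = sin(2*pi*7000/fs*m) ./ (pi*m);  hHi(mid) = 2*7000/fs;   % 0/0 at centre replaced
hLo = sin(2*pi*320/fs*m)  ./ (pi*m);  hLo(mid) = 2*320/fs;
hbp = (hHi - hLo) .* w;

% Point B (assumed): band-pass output, fundamental removed.
b = conv(a, hbp, 'same');

% Point C (assumed): memoryless nonlinearity, here a squarer.
c = b.^2;

% Amplitude of the f0 component at each point, measured away from the edges.
seg = 1001:15000;
ampAt = @(s) 2*abs(mean(s(seg) .* exp(-1j*2*pi*f0*t(seg))));
ampA = ampAt(a);
ampB = ampAt(b);
ampC = ampAt(c);
fprintf('f0 level at A: %.3f, B: %.4f, C: %.3f\n', ampA, ampB, ampC);
```

Squaring creates difference frequencies between neighbouring harmonics, all of which fall at f0, which is why the component reappears at point C. Running the AF, AMDF or HPS estimators on the three signals then completes the comparison asked for in the task.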