0:00

Hello, welcome back to the course on audio signal processing for music applications.

This week, we're talking about the sinusoidal model, and we are considering sinusoids in the frequency domain, in the spectrum. More particularly, we are considering that a sinusoid is a spectral peak from which we can measure the values of the sinusoid.

So this is what we want to go over in this lecture from a programming perspective: developing some code that allows us to measure the values of the sinusoid.

And these are the equations we presented in the theory lecture to measure the frequency, amplitude, and phase of a sinusoid using parabolic interpolation. This first equation fits a parabola to the tip of a peak and gives the refined frequency value as a location from zero to N, where N is the FFT size. Then we can obtain the frequency value in hertz by multiplying by the sampling rate and dividing by capital N. Then we can plug the location value into the parabola and obtain the tip of it, which will be the interpolated amplitude value. And finally, we can also get the phase value by reading the phase spectrum at that frequency location, maybe using a linear interpolation, so that we get a refined phase value.
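The slide itself is not reproduced in this transcript. As a reference, a common way to write parabolic peak interpolation is given below; the notation is an assumption and may differ from the slide. Here X_dB is the magnitude spectrum in decibels, k_p is the bin index of the local maximum, f_s is the sampling rate, and N is the FFT size.

```latex
\hat{k}_p = k_p + \frac{1}{2}\,
  \frac{X_{dB}[k_p-1] - X_{dB}[k_p+1]}
       {X_{dB}[k_p-1] - 2\,X_{dB}[k_p] + X_{dB}[k_p+1]}

\hat{f} = \frac{f_s\,\hat{k}_p}{N}

\hat{a} = X_{dB}[k_p] - \frac{1}{4}\,
  \bigl(X_{dB}[k_p-1] - X_{dB}[k_p+1]\bigr)\,\bigl(\hat{k}_p - k_p\bigr)
```

The first line is the vertex of the parabola through the three samples around the peak, the second converts that fractional bin to hertz, and the third is the height of the parabola at its vertex.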

1:55

And in here I wrote some code to actually do this detection of a spectral peak. So here I first import some of the packages that I need: numpy, matplotlib, and the window function from scipy. Then I append the directory that contains some code I'm going to use, specifically the DFT model, which is our DFT implementation, and the set of utility functions that I have in the SMS tools package. Okay, after I have all these packages, I can start by reading a sound file. I read a sound file that contains a sine wave of 440 hertz, so I know exactly what frequency I should get.

Â 3:15

And t is the threshold that I will be using for the peak detection. Then I get the window, a Hamming window of size M. Then I read from the middle of the sound file, so I get capital M samples from the middle of the sound. Then I compute the spectrum using the DFT, which returns the magnitude and phase, and then I compute the peaks.
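As a rough sketch of this setup, and not the actual test.py from the lecture, the steps can be approximated with numpy alone. Several things here are assumptions: the sine wave is synthesized instead of read from a file, the window size M = 501 is a guess, `np.hamming` stands in for the scipy window, and `np.fft.rfft` stands in for the SMS tools DFT model (which may differ in details such as how the window is centered).

```python
import numpy as np

fs = 44100   # sampling rate in Hz
f0 = 440.0   # frequency of the test sine wave in Hz
M = 501      # window size in samples (assumed value)
N = 512      # FFT size in samples

# Synthesize one second of a sine wave instead of reading a sound file.
x = np.cos(2 * np.pi * f0 * np.arange(fs) / fs)

# Hamming window, normalized so that its samples sum to one.
w = np.hamming(M)
w = w / w.sum()

# Take M samples from the middle of the sound and window them.
start = len(x) // 2
frame = x[start:start + M] * w

# Spectrum of the positive frequencies, zero-padded to N samples.
X = np.fft.rfft(frame, N)
mX = 20 * np.log10(np.abs(X) + 1e-12)   # magnitude in dB
pX = np.unwrap(np.angle(X))             # unwrapped phase in radians
```

With these values `mX` and `pX` each hold N/2 + 1 = 257 bins, and the largest magnitude falls on the bin closest to 440 Hz.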

This peak detection is a function that is declared in the utility functions. So let's go there.

4:06

magnitude spectrum and t, which is the threshold. The magnitude spectrum is the array of amplitude values in decibels. And t is the threshold in decibels that is going to be used to control the minimum value below which we are not going to consider peaks. So the computation is quite simple. We check for three conditions in the whole magnitude spectrum.

4:37

Well, not quite the whole: we are discarding the initial value, value zero, and the last value. So between one and the size minus one, we are looking for the values that are above the threshold t, and the values that are higher than the previous and the next values, so that they are local maxima. And then we are considering the peaks as being the values that fulfill the three criteria: they are above the threshold, they are above the next value, and they are above the previous value.

5:22

And then we add one in order to undo the offset of one that we introduced by skipping the first value. And it returns an array with the locations of those peaks.
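The three conditions can be sketched as follows. This is a minimal illustration written for this transcript, not the code from the SMS tools utility functions; the name `peak_detection` and the example values are hypothetical.

```python
import numpy as np

def peak_detection(mX, t):
    """Return the bin indices of local maxima of mX that are above t (in dB)."""
    inner = mX[1:-1]             # discard the first and the last value
    above = inner > t            # condition 1: above the threshold
    gt_prev = inner > mX[:-2]    # condition 2: above the previous value
    gt_next = inner > mX[2:]     # condition 3: above the next value
    # Indices are relative to inner, so add one to undo the offset.
    return np.nonzero(above & gt_prev & gt_next)[0] + 1

# Small made-up magnitude spectrum in dB.
mX = np.array([-80.0, -60.0, -20.0, -50.0, -70.0, -30.0, -10.0, -40.0, -90.0])
ploc = peak_detection(mX, t=-25.0)   # peak locations (bin indices)
pmag = mX[ploc]                      # magnitudes read at those locations
```

In this example the local maxima at bins 2 and 6 are the only values above the threshold of -25 dB, so those two locations are returned.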

We can go back to our code. So we call the peak detection, it returns the locations, and then to get the magnitudes at those locations, we just read those values from the magnitude spectrum.

5:54

To plot that, we have these lines for plotting, in which we first define the frequency axis, so we're going to be able to see the x-axis in hertz. Then we plot the whole magnitude spectrum, so we have the frequency axis and the magnitude spectrum. And we plot the peak locations on top of that, so we plot the locations and the magnitudes. We plot an x on those locations without any lines, so we just see those locations.
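The conversion from bin indices to hertz that the plot relies on can be sketched like this. The variable names and the example peak location are assumptions for illustration, not taken from the lecture code.

```python
import numpy as np

fs = 44100   # sampling rate in Hz
N = 512      # FFT size in samples

# One frequency value in Hz for each bin of the positive-frequency spectrum.
num_bins = N // 2 + 1
freqaxis = fs * np.arange(num_bins) / float(N)

# A peak location given as a bin index converts to Hz in the same way.
ploc = np.array([5])              # hypothetical detected peak location
peak_hz = fs * ploc / float(N)
```

With matplotlib, the spectrum would then be drawn with `plt.plot(freqaxis, mX)` and the peaks on top of it with `plt.plot(peak_hz, pmag, marker='x', linestyle='')`, which shows the crosses without connecting lines.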

So let's run that. We are in the workspace of SMS tools, and I have this test.py, so I can just run test.

6:46

Okay, and this is the spectrum of the sinusoid. And of course, it has the main lobe of the Hamming window. So let's zoom into that location, into just the peak. And this is the peak of the sinusoid. We can see the cross, and we can see some of the highest samples of the magnitude spectrum.

7:18

Okay, so in here we can see that the cross is at 429 hertz, more or less, or 430 hertz. That's clearly quite far from 440 hertz. Why is that? Well, this is because our FFT size was 512 samples.

7:43

And with 512 samples, we have quite a bit of distance between two consecutive samples. To compute exactly what that distance is, we can just divide the sampling rate, 44,100, by the FFT size, which was 512. And 86, this is the distance in hertz between one sample, which is here at around 428, and the next one, which is here at around 515. So this is the 86 hertz distance between these two.
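Written out, the arithmetic is as follows. The symbols are assumptions: f_s is the sampling rate, N is the FFT size, and Δf is the distance between consecutive spectral samples.

```latex
\Delta f = \frac{f_s}{N} = \frac{44100}{512} \approx 86.13\ \text{Hz}

5\,\Delta f \approx 430.66\ \text{Hz}, \qquad 6\,\Delta f \approx 516.80\ \text{Hz}
```

The 440 Hz sine falls between bins 5 and 6, closer to bin 5, which is consistent with the cross appearing at about 430 Hz.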

8:27

So to improve the resolution, we can increase the FFT size. So here, the N, let's make it four times bigger: let's make it 2,048 samples. This should give us better frequency resolution. Let's try that. So we save that, and let's remember this plot. Now we run again and we zoom into the tip of the spectrum.

9:04

Okay. Yeah, we can see that there are more samples now. But if we zoom in even more, we can see that the peak has still been computed to be around 430 hertz. And clearly the distance between two consecutive samples has been reduced; in fact it should be four times smaller. To compute that, we can just compute 44,100 divided by 2,048, so now the distance is 21 hertz. But still, it's not the frequency we would like to have; it's a little bit far. In fact this 21 hertz resolution is not ideal.
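A short sketch shows why the peak stays near 430 Hz for both FFT sizes. It assumes that the detected peak lands on the bin nearest to the true frequency; the helper name `nearest_bin_hz` is hypothetical.

```python
def nearest_bin_hz(f0, fs, N):
    """Return (bin spacing in Hz, frequency in Hz of the bin nearest to f0)."""
    spacing = fs / float(N)        # distance between consecutive bins
    k = int(round(f0 / spacing))   # index of the nearest bin
    return spacing, k * spacing

spacing_512, peak_512 = nearest_bin_hz(440.0, 44100, 512)
spacing_2048, peak_2048 = nearest_bin_hz(440.0, 44100, 2048)
print(spacing_512, peak_512)     # spacing of about 86.13 Hz
print(spacing_2048, peak_2048)   # spacing of about 21.53 Hz
```

For N = 512 the nearest bin is bin 5 and for N = 2048 it is bin 20, and both sit at about 430.66 Hz. The spacing is four times smaller with the larger FFT, but for this particular frequency the nearest bin does not move, which is why interpolation is still needed.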

So what we're going to do is this parabolic interpolation. In the utility functions file there is this peakInterp function that performs parabolic interpolation on the locations that we have found. So from the magnitude spectrum, the phase spectrum, and the locations that were found, these local maxima, this function computes the interpolated values. From the three highest samples of each peak, that is, the actual local maximum and its left and right values, it performs the parabolic interpolation to refine the frequency and find the center of the parabola. Then it finds the tip of the parabola, the magnitude of it. And then it uses the location to look into the phase spectrum and performs a linear interpolation, because the phase around this value should be quite flat, to find the actual phase value. And it returns these three.
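The interpolation just described can be sketched as below. This is an illustration under stated assumptions, not the peakInterp code from the SMS tools package: the function name `peak_interp` is hypothetical, the test signal is synthesized, the window size M = 501 is a guess, and numpy's FFT and Hamming window replace the lecture's own DFT model and scipy window.

```python
import numpy as np

def peak_interp(mX, pX, ploc):
    """Refine peak locations, magnitudes and phases.

    mX: magnitude spectrum in dB, pX: unwrapped phase spectrum,
    ploc: integer bin indices of local maxima (not the first or last bin).
    """
    val = mX[ploc]         # magnitude at the local maximum
    lval = mX[ploc - 1]    # magnitude one bin to the left
    rval = mX[ploc + 1]    # magnitude one bin to the right
    # Center of the parabola through the three points, in fractional bins.
    iploc = ploc + 0.5 * (lval - rval) / (lval - 2 * val + rval)
    # Height of the parabola at its center.
    ipmag = val - 0.25 * (lval - rval) * (iploc - ploc)
    # Linear interpolation of the phase at the refined location.
    ipphase = np.interp(iploc, np.arange(pX.size), pX)
    return iploc, ipmag, ipphase

# Demo on a synthesized 440 Hz sine wave.
fs, f0, M, N = 44100, 440.0, 501, 2048
x = np.cos(2 * np.pi * f0 * np.arange(fs) / fs)
w = np.hamming(M)
w = w / w.sum()
start = len(x) // 2
X = np.fft.rfft(x[start:start + M] * w, N)
mX = 20 * np.log10(np.abs(X) + 1e-12)
pX = np.unwrap(np.angle(X))

ploc = np.array([np.argmax(mX)])         # single peak: the spectral maximum
iploc, ipmag, ipphase = peak_interp(mX, pX, ploc)
freq_before = fs * ploc / float(N)       # frequency of the raw peak bin
freq_after = fs * iploc / float(N)       # frequency after interpolation
print(freq_before, freq_after)
```

The raw peak sits on a bin roughly 9 Hz away from 440 Hz, while the interpolated location should land much closer. The exact value depends on the window and on the stretch of signal analyzed, so no specific number is claimed here.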

11:27

So we will obtain these by calling the peak interpolation function. Let's copy that too and put it here. Okay, so this is our peak interpolation function. We no longer need to compute the peak magnitudes separately. So now it returns the interpolated location, interpolated magnitude, and interpolated phase, by calling peakInterp and passing it the magnitude, phase, and location values. So now we can plot iploc and ipmag.

12:07

Okay, so let's see if we improve this plot now. We had the peak at 430 hertz, so let's see what we can do now with this interpolation. Let's run test again. It complains about peakInterp, because it's part of the package UF, so I have to specify that it comes from there.

12:49

Okay, so here we see the values we had before. The tip before was this one, around 430, and now the cross is in between these two locations. So it has moved, because it is now at the tip of the parabola, and the cross is closer to 440. Still it's not at 440; in fact it's around 439 or so. So we still have a one hertz difference, but it is definitely much better than what we started with. So that's basically what I wanted to show. With this code we can analyze the spectrum of a sound and identify the spectral peak locations within that spectrum. Of course, if the sound is more complex than a sine wave and the threshold is lower, it will not find just one single peak; it will find many peaks.

13:59

Okay, let's go back to the slides. So we have been using the sinusoidal model and, as part of it, the concept of peak detection. Using some of the Python packages, numpy, scipy, and matplotlib, and some of the functions available in the SMS tools package, we have been able to measure the spectral peaks of a sound. With that, we are now ready to go to the next stage, which we'll talk about in the next programming class: handling more real sounds, and both analyzing and synthesizing these sinusoids. So I hope to see you next class. Bye-bye.