0:52

So, let's go directly to the interface that we have in the sms-tools package, through which we can access all the models, which is this models GUI interface. Here we have the harmonic model as one of the options, but let's start from the DFT, and let's start by analyzing a simple sound, a sound of which we know the fundamental frequency and that is very clear. So this is a sawtooth sound. Let's listen to this sound. [SOUND] Okay, so this is an electronically generated sound, and what we want to do now is first look at a single DFT, so that we can better understand the sound and decide what the appropriate parameters are for analyzing its harmonics.

So the first decision we have to make is what window to use. This being a simple electronic sound, the type of window is honestly not that critical. So let's start with a simple window. For example, let's start with the Hamming window.

2:20

By default here it is 511, but how do we decide the best window size? We went over that in the theory class. The window size, which we also call M, can be computed by multiplying the width of the main lobe of the Hamming window, which is 4 bins, by the sampling rate of the sound, which is 44,100 Hz, and dividing by the fundamental frequency that we have. In this case it is 440 Hz, which is the note A4. So we divide by 440, and the result is basically 401 samples. This is four periods of this particular sound, so let's do that.
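The window-size rule just described can be sketched in a few lines of Python. The function name `window_size` and the choice to round to the nearest integer are assumptions made for illustration; they are not taken from sms-tools.

```python
def window_size(main_lobe_bins, fs, f0):
    """Window length in samples so that the window's main lobe
    (main_lobe_bins wide) fits between harmonics spaced f0 Hz apart.

    Hypothetical helper: M = main_lobe_bins * fs / f0, rounded to the
    nearest integer (the rounding choice is an assumption).
    """
    return int(round(main_lobe_bins * fs / f0))


# Hamming window (main lobe of 4 bins), 44,100 Hz sampling rate, A4 = 440 Hz
M = window_size(4, 44100, 440.0)
print(M)
```

The same call with a different fundamental frequency gives the window size for any other steady, pitched sound.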

Let's put 401 samples as the window size.

3:26

In terms of the FFT size, we want it to be bigger than the window size. Here we can just use a big FFT size, so that we have a lot of zero padding and a smooth spectrum. So let's put, for example, 2,048. We also have to choose where we are going to perform this analysis. This is a one-second sound, so here, 0.2 seconds sounds like a good point at which to take these 401 samples. So let's compute.
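What the interface computes at this step can be approximated with plain NumPy. This is only a sketch under stated assumptions: the input is a synthetic band-limited sawtooth standing in for the sound file, and the zero-phase windowing that an analysis tool may apply is left out, so only the magnitude spectrum is meaningful here.

```python
import numpy as np

fs = 44100      # sampling rate in Hz
f0 = 440.0      # fundamental frequency of the test tone
M = 401         # window size in samples (about four periods)
N = 2048        # FFT size, larger than M so the frame is zero-padded

# Stand-in signal (an assumption, not the lecture's sound file):
# one second of a sawtooth built from the harmonics below fs/2.
t = np.arange(fs) / fs
n_harm = int((fs / 2) // f0)
x = sum(np.sin(2 * np.pi * k * f0 * t) / k for k in range(1, n_harm + 1))

# Take M samples starting at 0.2 seconds and apply a Hamming window.
start = int(0.2 * fs)
frame = x[start:start + M] * np.hamming(M)

# rfft zero-pads the frame to N samples, which smooths the spectrum.
spectrum = np.fft.rfft(frame, N)
mag_db = 20 * np.log10(np.abs(spectrum) + 1e-12)

# Frequency in Hz of every bin, and the location of the strongest peak.
freqs = np.arange(N // 2 + 1) * fs / N
peak_hz = freqs[np.argmax(mag_db)]
print(peak_hz)
```

The strongest bin lands within one bin spacing (fs / N) of the fundamental, and the remaining peaks sit at multiples of it.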

Okay, so these are the analysis results. And the input sound, as we chose,

is four periods of the.

Â 4:58

Okay, so these peaks correspond to the harmonics, and we have harmonics going from 440 Hz up to half of the sampling rate. In fact, if you look at the shape of the sawtooth, it is not a perfect sawtooth, in the sense that it doesn't have the smooth shape of a sawtooth. It has this kind of oscillation here. And this is because we have a finite number of sinusoids. It is not infinite; it is not a continuous waveform but a discrete waveform, and it has a limited number of sinusoids.
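The oscillation mentioned here can be reproduced by summing a finite number of sinusoids. The sketch below assumes the textbook harmonic amplitudes of 1/k for a sawtooth; the function name is hypothetical, and the lecture's sound may have been generated differently.

```python
import numpy as np


def bandlimited_sawtooth(f0, fs, duration, n_harm):
    """Sum the first n_harm harmonics of a sawtooth (amplitudes 1/k).

    The 2/pi factor scales the ideal, infinite sum to the range [-1, 1].
    """
    t = np.arange(int(duration * fs)) / fs
    x = np.zeros_like(t)
    for k in range(1, n_harm + 1):
        x += np.sin(2 * np.pi * k * f0 * t) / k
    return (2 / np.pi) * x


# 50 harmonics of 440 Hz all fit below half of the 44,100 Hz sampling rate.
x = bandlimited_sawtooth(440.0, 44100, 1.0, 50)

# An ideal sawtooth would peak at exactly 1; the truncated sum overshoots.
overshoot = x.max() - 1.0
print(overshoot)
```

The overshoot next to each jump does not go away when more harmonics are added; it only gets narrower, which is why the plotted waveform shows ripples rather than a straight ramp.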

Okay, and then if we do the inverse of that, we obtain the reconstructed waveform, but of course it is a reconstructed waveform with the window that we applied to it. We applied a Hamming window, so this is a windowed sawtooth waveform.

Anyway, this works quite well and we can just analyze

Â 6:35

decide what is going to be a peak or, let's say, a partial of this sound. For the magnitude threshold, we see here that things go pretty far down, so we can just put, for example, -100. Then we can decide the minimum duration of the sinusoids; this being a very stable electronic sound, it really doesn't matter that much. For the number of sinusoids to track, we can just put a big number, for example 100.

Then we can also define the deviation that we allow from one frame to the next, in hertz. It is defined with respect to what it would be at frequency 0, and then it is scaled a little bit as the frequency goes up. This being a very stable electronic sound, it's really not an issue: the stability is going to be so high that this frequency deviation can be very small, and therefore the slope of this deviation, that is, the change of the deviation as the frequency goes higher, can also be very small. So that's not an issue.
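The two deviation parameters described here amount to a frequency-dependent tolerance. The sketch below is an illustration under assumptions: the function names, parameter names, and default values are hypothetical and not taken from the interface.

```python
def max_allowed_deviation(freq_hz, dev_offset_hz, dev_slope):
    """Largest frame-to-frame frequency jump tolerated at freq_hz.

    dev_offset_hz is the tolerance at frequency 0; dev_slope makes the
    tolerance grow linearly with frequency.
    """
    return dev_offset_hz + dev_slope * freq_hz


def can_continue_track(prev_freq_hz, new_freq_hz,
                       dev_offset_hz=10.0, dev_slope=0.001):
    """True if a peak at new_freq_hz may extend a track last seen at
    prev_freq_hz (default values are assumptions for illustration)."""
    limit = max_allowed_deviation(prev_freq_hz, dev_offset_hz, dev_slope)
    return abs(new_freq_hz - prev_freq_hz) <= limit


print(can_continue_track(440.0, 445.0))    # small jump on a low partial
print(can_continue_track(440.0, 460.0))    # large jump on a low partial
print(can_continue_track(8800.0, 8815.0))  # same track, higher partial
```

With a slope of zero the same tolerance applies to every partial; a positive slope lets high partials move more, in hertz, than low ones.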

So let's compute with these values.

Â 7:54

Okay, and this is the result. Here we have the original sound, the complete sound; now we are analyzing the whole sound. And here are the harmonics or, basically, the sinusoids that it found, and the reconstruction. Here they are, in fact, very much the harmonics of the sound, except at the very bottom. If we look here at the very bottom, we see these lines that are in fact not part of the harmonic series. Why is that? In fact, these are side lobes. If we go back to the DFT, here at the very low frequencies we see some side lobes, and this is what the sinusoidal model is catching at the very bottom.

Â 9:25

Okay, and interestingly enough, we see a very different set of sinusoidal tracks. We see many more. What are we seeing here? Well, we are seeing a lot of the side lobes of every single harmonic, because the side lobes are quite high. With a threshold of -100 dB, and with a window size that is large enough to leave space for the side lobes to be visible, they appear in the analysis. Even so, if we play it, [SOUND] it sounds pretty good. It sounds as if we were only resynthesizing the harmonics, and this is because, of course, the side lobes are part of the spectrum. So in terms of reconstruction it's pretty good, even though from an analysis point of view it's not so good, because we are basically seeing the artifacts of the analysis.

Okay, now let's go to the harmonic model, and let's in fact start from these somewhat wrong parameters, parameters that are not the best. So we start from the Hanning window, we use this window size of 600, and we have this -100 threshold.

10:52

In the harmonic model, an additional set of parameters that we have to specify relates to the actual fundamental frequency and the number of harmonics to be detected. In terms of the number of harmonics, we can in fact work it out, given that the fundamental frequency is 440 Hz. We can compute the maximum number of harmonics that will be in the spectrum, which is half of the sampling rate, 22,050, divided by the fundamental frequency, 440. So 50 is in fact the maximum number of harmonics that will be present in this sound.
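The harmonic count follows directly from that division. A minimal sketch, with a hypothetical helper name:

```python
def max_harmonics(fs, f0):
    """Number of harmonics of f0 that fit below half the sampling rate."""
    return int((fs / 2) // f0)


# A4 at 440 Hz with a 44,100 Hz sampling rate: 22,050 / 440 is about 50.1
n_harmonics = max_harmonics(44100, 440.0)
print(n_harmonics)
```

Asking for more harmonics than this number cannot add anything, since higher multiples of the fundamental would lie above half of the sampling rate.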

So we can specify 50 harmonics, and then we have to specify a possible range for the fundamental frequency, to help the two-way mismatch algorithm that is being used here in the detection of the fundamental frequency. For example, we can be fairly flexible, so we can put between 100 and, let's say, 600. We know that this is 440, so we could be more restrictive, but this will be just fine. Let's just compute it with these parameters.

Â 12:35

in the sense that we are now restricting the sinusoids to be harmonics of the fundamental that was found. Even though the window size was large, so that many more peaks were identified and the side lobes were picked up as well, this has now constrained the search to the harmonics. And therefore we only see the harmonics.
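The effect of this constraint can be illustrated with a simplified selection rule: for each multiple of the fundamental, keep the nearest spectral peak only if it is close enough. This is a sketch for illustration, not the detection procedure used by the interface; the function name, the relative tolerance, and the peak values are all hypothetical.

```python
import numpy as np


def select_harmonics(peak_freqs, f0, n_harm, rel_tol=0.02):
    """Keep only the peaks that lie near a multiple of f0.

    Simplified, hypothetical rule: for harmonic number k, take the peak
    closest to k * f0 and accept it if it is within rel_tol * k * f0.
    """
    peak_freqs = np.asarray(peak_freqs, dtype=float)
    if peak_freqs.size == 0:
        return []
    harmonics = []
    for k in range(1, n_harm + 1):
        target = k * f0
        nearest = peak_freqs[np.argmin(np.abs(peak_freqs - target))]
        if abs(nearest - target) <= rel_tol * target:
            harmonics.append(float(nearest))
    return harmonics


# Made-up peak list: three harmonics of 440 Hz mixed with side-lobe peaks.
peaks = [440.2, 520.0, 880.1, 960.0, 1320.3, 1400.0]
print(select_harmonics(peaks, 440.0, 3))
```

The side-lobe peaks are still present in the spectrum; they are simply never chosen, because none of them is near a multiple of the fundamental.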

Of course, now we can go back to the ideal analysis parameters.

13:15

So we can go back to the Hamming window, and let's go back to 400 samples and leave the rest the same. We compute, and now in fact we obtain the same thing. It's the same thing, but now the window size is small, which is sufficient, and if we play the original [SOUND] and if we play the reconstruction, [SOUND] they are identical. So we have basically captured all the relevant information of these sinusoidal components of the sound.

Okay, now let's go to a more real, more natural sound. So let's close all these plot windows. And let's start again from the DFT, but let's look at a violin sound. There is a violin sound here among the sounds of sms-tools, which is a violin playing the note B3, and we can listen to that. [SOUND] Okay, so the pitch that corresponds to the note B3, which is lower than the A4 that we had before, is around 246 Hz.

So, in order to find the best window size, if we start, for example, from a Hamming window, we need to compute 4 × 44,100 divided by the frequency, so 246. This is a lower frequency, therefore four periods of the sound is much larger: 717 samples. So we can put 717 here.

We can keep the same FFT size, and in fact this sound is a little bit longer, so let's put 0.5 as the place to be analyzed. And here is the result. This is not an electronic sound, so the number of periods that we have chosen is still four, but it is much more irregular than with the sawtooth. In fact, here it's even a little bit harder to see the period. The period is like two bumps, so one period would go from here to here, then another, then another. So it's again four periods of the violin sound. The spectrum is a little more complex than that of the sawtooth, but we clearly see the harmonics.

If we zoom a little bit into the part that we see as being relevant, we see the first few peaks, and these are clearly the harmonics of the sound. But we also see a lot of energy or spectral information that doesn't have these nice-looking peaks or shapes corresponding to the window. So instead of a Hamming window, it might be better to take a smoother window that can better discriminate this kind of background residual or noise in the sound. So let's use the Blackman window.

This being smoother, we need more samples. In fact we need six periods, because the main lobe of the Blackman window is six bins wide. So we use the same equation to compute the window size, but multiplied by six, so now we need at least 1,075 samples. So let's put 1,075 samples here, and let's compute in the same way.
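The trade-off between the two windows can be checked numerically. The sketch below is an illustration under assumptions: it uses NumPy's `hamming` and `blackman` windows, a window length of 401 chosen only for the measurement, and a hypothetical helper that measures the main-lobe width (in bins of the window-length DFT) and the highest side lobe relative to the main-lobe peak.

```python
import numpy as np


def main_lobe_and_side_lobes(window, n_fft=65536):
    """Measure a window's main-lobe width and highest side-lobe level.

    Returns (width in bins of size fs/M, side-lobe level in dB relative
    to the main-lobe peak). Heavy zero padding gives a fine frequency grid.
    """
    M = len(window)
    mag = np.abs(np.fft.rfft(window, n_fft))
    # The main lobe ends at the first local minimum of the magnitude.
    first_null = int(np.argmax(np.diff(mag) > 0))
    # Two-sided width, converted from FFT bins to window-length bins.
    width_bins = 2 * first_null * M / n_fft
    side_lobe_db = 20 * np.log10(mag[first_null:].max() / mag[0])
    return width_bins, side_lobe_db


M = 401
ham_width, ham_sll = main_lobe_and_side_lobes(np.hamming(M))
blk_width, blk_sll = main_lobe_and_side_lobes(np.blackman(M))
print("Hamming :", ham_width, ham_sll)
print("Blackman:", blk_width, blk_sll)
```

The measured widths come out close to the 4 and 6 bins used in the window-size formula, and the Blackman side lobes sit well below the Hamming ones, which is the reason the wider window separates the harmonics from the background more cleanly.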

And now we are seeing the harmonics of the sound much better. In fact, let's compare it with the previous one, having zoomed into the same area, so we zoom

Â 19:06

And let's apply the same values. So let's apply the Blackman window, and, if we remember, the size that we put was 1,075, so 1,075. And the FFT size is 2,048, which is okay; it gives good zero padding. The truth is that we don't need the threshold that far down. We need more than in the [INAUDIBLE], because, as we can see here, these higher harmonics are quite a bit down from the first harmonics. So -100 would be here, so let's just leave -100. Then, in terms of the duration of the tracks, this is the idea that if a track doesn't last long enough, we are going to reject it.

In terms of the number of harmonics, we can also compute the maximum possible number of harmonics present in the violin: we just take half of the sampling rate, 22,050, and divide by the fundamental frequency of B3, which is 246 Hz. So at most, if there were harmonics all the way up to half of the sampling rate, there would be 89 partials, 89 harmonics. So let's put 89.

And then for the fundamental frequency we have to help the algorithm again. We know it's 246, so we can just put between 200 and 300. That should be enough. And for these other parameters, again, this is a fairly simple sound. It's not as simple as the sawtooth sound, but I do not think these parameters will matter that much, so let's just compute it like this.

Â 21:39

But let's listen again to the input and the output. [SOUND] So this is the input that we started from, and this is [SOUND] the output. Sounds pretty good. If you listen carefully, there may be some aspect, especially during the attack, that is not quite there. So this is, let's say, a cleaner version of the original sound, or a smoother version, and maybe it is not as bright as the original sound. But clearly the color qualities of the sound are here, and we have been able to capture them.
