0:05

This week, we're talking about quite a few spectrum modeling approaches.

We're combining the sinusoidal and harmonic analysis

that we talked about in the last few weeks with the idea of a residual and

a stochastic approximation of this residual.

So, in this demonstration class, I want to talk about one particular model,

the harmonic plus residual model.

So, the idea is to analyze the harmonics of a sound,

subtract them from the original signal,

and obtain the residual, which can then be combined, of course,

with the harmonics that we have identified.
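As a rough illustration of this idea, here is a minimal time-domain sketch on a made-up signal. It assumes a known fundamental and a signal length that is a whole number of periods, so each harmonic can be estimated by a simple projection. The function name `harmonic_plus_residual` and all values are hypothetical, and the sketch makes no claim about how SMS Tools performs the subtraction.

```python
import math
import random


def harmonic_plus_residual(x, fs, f0, n_harmonics):
    """Split x into a harmonic estimate and a residual (toy, whole-signal version)."""
    n = len(x)
    harmonic = [0.0] * n
    for k in range(1, n_harmonics + 1):
        w = 2.0 * math.pi * k * f0 / fs
        # Project x onto the cosine and sine at the k-th harmonic frequency.
        a = 2.0 / n * sum(x[i] * math.cos(w * i) for i in range(n))
        b = 2.0 / n * sum(x[i] * math.sin(w * i) for i in range(n))
        for i in range(n):
            harmonic[i] += a * math.cos(w * i) + b * math.sin(w * i)
    # The residual is whatever is left after subtracting the harmonics.
    residual = [xi - hi for xi, hi in zip(x, harmonic)]
    return harmonic, residual


def rms(signal):
    return math.sqrt(sum(v * v for v in signal) / len(signal))


# Toy signal (assumed values): three harmonics of 200 Hz plus a little noise.
random.seed(0)
fs, f0, n = 8000, 200.0, 4000  # 4000 samples = exactly 100 periods of f0
amps = [1.0, 0.5, 0.25]
noise = [0.05 * random.gauss(0.0, 1.0) for _ in range(n)]
x = [
    sum(a * math.sin(2.0 * math.pi * (k + 1) * f0 * i / fs) for k, a in enumerate(amps))
    + noise[i]
    for i in range(n)
]

harmonic, residual = harmonic_plus_residual(x, fs, f0, n_harmonics=3)

# Adding the two components back together recovers the original signal.
rebuilt = [h + r for h, r in zip(harmonic, residual)]
max_err = max(abs(a - b) for a, b in zip(x, rebuilt))
print(max_err, rms(residual), rms(noise))
```

In this toy case the residual has almost the same level as the noise that was added, which is the "air" that the harmonics alone do not capture.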

So let's use the SMS Tools that we have been talking about and

developing in the course.

And let's start with the DFT, and let's start with a sound,

with this organ sound that we have in the SMS Tools directory.

Let's listen to that, and we'll listen with headphones

so that we can listen more carefully.

[MUSIC]

Okay, so this is quite a stable tone, and

it's a very traditional, standard sound of an organ.

So in order to analyze it,

we want to be able to distinguish the harmonics of this sound.

So this is a C3, which is around 264 Hertz. So

in order to find a good window size, let's use the Blackman window.

This is a very stable note, so we can take advantage of a longer window and

really try to isolate the harmonics as much as possible.

So that's six bins for the Blackman window, and

then the sampling rate, which is 44100, and

then we divide by the frequency of this note, 264.

So that gives me 1,002 samples that make sense to analyze.

Okay, so let's put 1,003 to make it an odd window.

And, okay, let's use 2,048 as the FFT size.

So, it's quite a bit of zero padding, and that's good.
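The arithmetic above can be written out as a short sketch. The helper name `odd_window_size` is made up for illustration; the numbers are the ones used in the lecture.

```python
def odd_window_size(fs, f0, mainlobe_bins):
    """Odd window length (in samples) close to mainlobe_bins periods of f0."""
    m = int(mainlobe_bins * fs / f0)  # 6 * 44100 / 264 = 1002.27, truncated to 1002
    return m if m % 2 == 1 else m + 1


fs = 44100          # sampling rate in Hz
f0 = 264.0          # approximate fundamental of the organ note in Hz
blackman_bins = 6   # main-lobe width of the Blackman window, in bins

M = odd_window_size(fs, f0, blackman_bins)
N = 2048                      # FFT size chosen in the lecture
zero_padding_factor = N / M   # how many times longer the FFT is than the window
print(M, round(zero_padding_factor, 2))  # prints: 1003 2.04
```

The reasoning behind the formula is that the main lobe of the window, six bins wide for Blackman, should be no wider than the spacing between harmonics, which is the fundamental frequency.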

And so let's compute it.

Okay, so these are the samples that we are analyzing.

We are analyzing six periods of the sound, and it clearly looks very sinusoidal.

And from the magnitude spectrum we see that there are not that many clear harmonics,

even though there is quite a bit of energy in the high frequency range.

3:04

But it doesn't have all these clear sinusoids.

So this is an indication that it's a sound that has quite a bit of

a noise-like, or kind of stochastic, component in it.

Well, the phase looks as expected, and of course, the reconstruction is quite good.

Okay, so this looks like a decent approach.

Of course, we could maybe, given that it's stable,

take a bigger window size.

So let's try that, let's try twice as much,

let's try 2,006 samples.

And then, let's even do a bigger FFT, let's use 4,096. Okay, so

we are doing quite a bit of zero padding.

And, well, here the time is at 2 seconds, which is good;

the sound is quite long, so we're kind of in the middle, so let's do that.

Okay, this looks like twice as many samples;

we are analyzing 12 periods.

And yes, now we are seeing a little bit more in the spectrum.

If you compare the previous one to the current one,

we are seeing the harmonics maybe more defined.

And we are seeing quite a bit of background things.

So maybe now we are seeing slightly higher harmonics than we were seeing

before.

Because we have the longer window, we kind of reduce this background,

because it is not a very coherent type of signal. So

it emphasizes the coherent part, the harmonics, and

reduces this kind of stochastic component.
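To make this effect concrete, here is a small sketch on a synthetic signal, one sinusoid plus white noise, not the organ recording. It measures the window-normalized spectrum at the sinusoid frequency and at a set of noise-only frequencies, for the two window lengths used above. The noise level, the probe frequencies, and all function names are assumptions chosen for illustration.

```python
import cmath
import math
import random


def blackman(m):
    """Symmetric Blackman window of length m."""
    return [
        0.42 - 0.5 * math.cos(2 * math.pi * n / (m - 1))
        + 0.08 * math.cos(4 * math.pi * n / (m - 1))
        for n in range(m)
    ]


def spectrum_mag(frame, w_norm, freq, fs):
    """Magnitude of the windowed spectrum of frame at a single frequency in Hz."""
    step = -2j * math.pi * freq / fs
    acc = sum(wn * xn * cmath.exp(step * n) for n, (wn, xn) in enumerate(zip(w_norm, frame)))
    return abs(acc)


def peak_to_noise_db(x, m, fs, f0, noise_freqs):
    """Level of the sinusoid peak relative to the average noise floor, in dB."""
    w = blackman(m)
    total = sum(w)
    w_norm = [v / total for v in w]  # normalize so a sinusoid peak does not depend on m
    frame = x[:m]
    peak = spectrum_mag(frame, w_norm, f0, fs)
    noise_power = sum(spectrum_mag(frame, w_norm, f, fs) ** 2 for f in noise_freqs) / len(noise_freqs)
    return 20 * math.log10(peak) - 10 * math.log10(noise_power)


random.seed(1)
fs, f0, sigma = 44100, 264.0, 0.1  # sigma is an assumed noise level
x = [math.sin(2 * math.pi * f0 * n / fs) + random.gauss(0.0, sigma) for n in range(2006)]
noise_freqs = [2000.0 + 130.0 * i for i in range(150)]  # probe points away from the sinusoid

short_db = peak_to_noise_db(x, 1003, fs, f0, noise_freqs)
long_db = peak_to_noise_db(x, 2006, fs, f0, noise_freqs)
gain_db = long_db - short_db
print(round(short_db, 1), round(long_db, 1), round(gain_db, 1))
```

In theory, doubling the window length should lower the noise floor by about 3 dB relative to a stable sinusoid, because the sinusoid adds up coherently across the window while the noise does not.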

Okay, so this seems to be a good choice, so let's go to the STFT and

apply the same parameters.

So let's apply the Blackman window, but first let's open the sound, the organ sound.

5:12

Let's use these 2,006

samples that we took, okay, and

let's use 4096 as the FFT size.

Here now we have to choose a hop size. Here I don't think it

matters that much, but let's just use maybe 500.

So about one fourth of the window, to be able to overlap. In fact, the overlap should be even more,

but for efficiency purposes, let's just leave it like that.
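A quick way to reason about the hop size is to look at how much consecutive windows overlap and how evenly the overlapped windows add up. The sketch below is only an illustration: it uses a periodic Blackman window and a hypothetical helper `overlap_add_ripple`, which are assumptions and not necessarily what the course tools do internally.

```python
import math


def periodic_blackman(m):
    """Periodic Blackman window of length m (denominator m instead of m - 1)."""
    return [
        0.42 - 0.5 * math.cos(2 * math.pi * n / m) + 0.08 * math.cos(4 * math.pi * n / m)
        for n in range(m)
    ]


def overlap_add_ripple(m, hop):
    """Relative ripple, (max - min) / mean, of the summed windows in steady state."""
    w = periodic_blackman(m)
    n_frames = 4 * (m // hop + 1)
    total = [0.0] * (hop * (n_frames - 1) + m)
    for j in range(n_frames):
        for n in range(m):
            total[j * hop + n] += w[n]
    steady = total[m:len(total) - m]  # skip the edges, where fewer windows overlap
    return (max(steady) - min(steady)) / (sum(steady) / len(steady))


M, H = 2006, 500                 # window and hop size used in the lecture
overlap_fraction = 1.0 - H / M   # about 0.75, so each sample is covered by about four windows
ripple_lecture = overlap_add_ripple(M, H)
ripple_exact = overlap_add_ripple(2000, 500)  # hop exactly one fourth of the window
print(round(overlap_fraction, 3), ripple_lecture, ripple_exact)
```

With a hop of exactly one fourth of the window, the periodic Blackman windows add up to a constant. With 2,006 and 500 the ratio is only approximately four, so a small ripple remains.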

Okay, let's compute this.

This will take a little bit longer because, of course,

it's a longer sound and it is quite big.

Okay, so this is what we are getting, and of course, the resynthesized sound.

No need to listen to it because it's going to be practically identical.

And now, well, we see the horizontal lines; maybe let's zoom into the lower areas.

Okay, now we can see the harmonics a little bit better;

the harmonics clearly look very well defined.

6:26

But there are quite a few things in between.

Okay, that seems to be a good choice.

Now let's do the harmonic analysis, again using the same parameters.

So we'll go to the organ sound, and let's select the Blackman window.

Let's use 2,006 and

let's use 4096 as the FFT size, and

now we have to choose the parameters to identify the peaks and the harmonics.

The magnitude threshold, okay, minus 90, that looks like a reasonable one.

The duration of the harmonics: here, since we are in a long, stable note,

we can even afford to require a longer track, so let's say, at least,

that they have to last for 0.2 seconds, or it could be even more.

The truth is that there are not that many harmonics, because they kind of disappear,

so I would say that 30 or 40 harmonics should be plenty.

7:29

And we know the fundamental frequency is 264, so

the range has to include that. So 130 and

300; 264 is within that.

So maybe we can even make this higher so that we make sure it finds it correctly.

Okay, that sounds good.

And this is the error for the fundamental frequency detection;

it should be a very clear fundamental, so there should not be a problem with that.

And this deviation, this is how much we will allow

the harmonics to deviate from perfect harmonicity.

Let's leave it like this and see what happens.
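To give a feeling for what a deviation parameter like this can do, here is a minimal sketch. It assumes a simple rule in which the allowed deviation grows in proportion to the harmonic frequency. This rule, the function `match_harmonics`, and the peak values are all hypothetical; the actual rule used by the SMS Tools code may differ.

```python
def match_harmonics(peak_freqs, f0, n_harmonics, dev_slope):
    """Return {harmonic_number: peak_freq} for peaks close enough to k * f0."""
    matched = {}
    for k in range(1, n_harmonics + 1):
        ideal = k * f0
        nearest = min(peak_freqs, key=lambda p: abs(p - ideal))
        # Hypothetical tolerance: proportional to the ideal harmonic frequency.
        tolerance = dev_slope * ideal
        if abs(nearest - ideal) < tolerance:
            matched[k] = nearest
    return matched


# Made-up spectral peaks in Hz: three clean low harmonics and one stray high peak.
peaks = [264.1, 528.3, 791.5, 7960.0]

loose = match_harmonics(peaks, 264.0, 40, dev_slope=0.01)
strict = match_harmonics(peaks, 264.0, 40, dev_slope=0.001)
print(sorted(loose))   # prints: [1, 2, 3, 30]
print(sorted(strict))  # prints: [1, 2, 3]
```

With the looser setting, the stray peak at 7960 Hz is accepted as the 30th harmonic even though it is 40 Hz away from 7920 Hz. With the stricter setting it is rejected, while the clean low harmonics are kept.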

Okay, let's compute it; again, this will take a little bit longer.

8:20

Okay, so we found quite a few harmonics, and

of course, in the attack and the decay it is very unstable,

so maybe we should have rejected those.

But what is interesting is that some harmonics are quite stable, but

some are very unstable.

Let's listen to the resynthesized sound.

[SOUND] Okay,

it sounds good.

Let's compare with the original.

[SOUND] Well, if you pay attention,

clearly, the original has more of this air in the background,

which is not in the synthesized sound because that is mainly the harmonics.

But the truth is that some of these higher harmonics are also very unstable,

so maybe it's not right to consider them as harmonics,

because they are basically maybe tracking some noise part.

So one way to get rid of that is to reduce this deviation and

restrict it even more; for example, it could be 0.001.

And then let's see what happens.

10:11

Okay, now we got rid of quite a lot of these higher,

unstable harmonics or components.

Let's listen to the synthesized sound.

[SOUND] Yeah,

that sounds quite clean and definitely, of course, not as rich as the original sound.

[SOUND] Okay,

now we can go with these same parameters to the harmonic plus

residual model, to subtract these harmonics from the original signal.

So let's use the same sound,

let's use this 2,006 window size,

4096, the threshold was -90,

the minimum duration of a track,

I think we put one second.

11:16

So, okay, so 1 second. Number of harmonics:

now we can definitely put fewer; we can even put just 30.

And here we have to make sure the fundamental is within the range, so

let's say 130 to 350, so that it is within that;

for sure it will find it, no problem. And the error,

I don't think this matters too much. But here, for the deviation, we put 0.001,

I think that's two zeros, okay.

We leave it like that.

Okay, let's now compute this.

11:57

Okay, and this is what we got, and let's see,

so here we see the harmonics it found, the black lines.

The background spectrogram is the residual.

And we can listen to the different components.

So we can listen to, well, just the sinusoids, the harmonics; we already heard that.

Let's listen to the residual.

12:30

Yeah, that's a very clear and

nice sound, which is what was

missing from this harmonic sound.

So clearly, if we put these two sounds together,

they will sound like the original.

[SOUND] Compare with the original.

[SOUND] Yeah, that's

an identity, basically.

So this was a good sound to explore the potential of the harmonic plus residual model.

Of course, if we take time-varying sounds, that may be a little more tricky,

and we will have to tune the parameters a little bit more.

13:28

But with that I think you get an idea of what the harmonic plus residual model, and

the tool within SMS Tools that implements this model, can do.

So I encourage you to play around and, of course, choose other sounds.

Maybe more complex sounds, and sounds that change in time.

So we have talked about the harmonic plus residual model, and

this is one instance of the models that we're talking about this week.

In the next demonstration lecture,

I want to talk about the harmonic plus stochastic model, so that we can model this

residual that we just heard as a stochastic signal and see what it can do.

So I will see you next lecture.

Thank you.
