0:05

This week, we're talking about quite a few spectrum modeling approaches.

We're combining the sinusoidal and harmonic analysis

that we talked about in the last few weeks with the idea of a residual and

a stochastic approximation of this residual.

So, in this demonstration class, I want to talk about one particular model,

the Harmonic plus residual model.

So, the idea is to analyze the harmonics of a sound,

subtract them from the original signal,

and obtain the residual, which can then be combined, of course,

with the harmonics that we have identified.

So let's use the SMS tools that we have been talking about and

developing in the course.

And let's start with the DFT and let's start with a sound,

with this organ sound that we have in the SMS Tools directory.

Let's listen to that and we listen with headphones,

so that we can listen more carefully.

[MUSIC]

Okay, so this is a very stable tone, and

it's a very traditional standard sound of an organ.

So in order to analyze,

we want to be able to distinguish the harmonics of this sound.

So this is a C3, which is around 264 Hertz, so

in order to find what is a good window size let's use the Blackman window.

This is a very stable note so we can take advantage of a longer window, and

really try to isolate the harmonics as much as possible.

So that's six bins for the Blackman window, and

then the sampling rate, which is 44100, and

then we divide by the frequency of this note, 264.

So that gives me 1,002 samples that make sense to analyze,

okay, so let's put 1,003 to make it an odd window.
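
The window-size arithmetic above can be sketched in a few lines of Python. This is only a minimal sketch and not part of SMS Tools: the function name `odd_window_size` is hypothetical, and it assumes a Blackman main lobe of six bins, as stated above.

```python
def odd_window_size(main_lobe_bins, fs, f0):
    """Truncate main_lobe_bins * fs / f0 to an integer and bump it to odd if needed."""
    m = int(main_lobe_bins * fs / f0)  # 6 * 44100 / 264 is about 1002.27, so 1002
    if m % 2 == 0:                     # an odd length gives the window a center sample
        m += 1
    return m


# Values from the demonstration: Blackman window, 44100 Hz, 264 Hz fundamental.
M = odd_window_size(6, 44100, 264)
print(M)
```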

And, okay, let's use 2,048 as the FFT size.

So, it's quite a bit of zero padding, and that's good.

And so let's compute it.
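
As a rough illustration of this computation, here is a sketch using NumPy only. It does not load the organ recording: the test signal is a synthetic stand-in (five harmonics of 264 Hz with assumed amplitudes 1/h), and the variable names are illustrative. It windows 1,003 samples with a Blackman window, zero-pads to 2,048, and looks for the strongest spectral peak.

```python
import numpy as np

fs = 44100      # sampling rate in Hz
f0 = 264.0      # fundamental of the note
M = 1003        # window size in samples
N = 2048        # FFT size, so the frame is zero-padded from M to N

# Synthetic stand-in for the organ tone (assumption): 5 harmonics, amplitude 1/h.
n = np.arange(M)
x = sum(np.cos(2 * np.pi * f0 * h * n / fs) / h for h in range(1, 6))

w = np.blackman(M)
w = w / w.sum()                      # normalize so peak heights reflect amplitudes

X = np.fft.rfft(x * w, N)            # rfft zero-pads the windowed frame to N
mag_db = 20 * np.log10(np.abs(X) + 1e-12)

peak_bin = int(np.argmax(mag_db))
peak_hz = peak_bin * fs / N          # bin index converted to Hz
print(peak_hz)
```

The peak lands on the FFT bin closest to the fundamental; the zero padding only makes the bin grid finer, it does not add resolution.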

Okay, so these are the samples that we are analyzing.

We are analyzing six periods of the sound, and it clearly looks very sinusoidal.

And from the magnitude spectrum we see that there are not that many clear harmonics,

even though there is quite a bit of energy in the high frequency range.

3:04

But it doesn't have all these clear sinusoids.

So this is an indication that it's a sound that has quite a bit of

a noise-like, or kind of stochastic, component in it.

Well, the phase looks as expected, and of course, the reconstruction is quite good.

Okay, so this looks like a decent approach.

Of course, we could maybe, given that it's stable,

we could take a bigger window size.

So let's try that, let's try like twice as much,

let's try 2,006 samples.

And then, let's even do a bigger FFT, let's use 4,096, okay, so

we are doing quite a bit of zero padding.

And well, here the time was at 2 seconds, which is good;

the sound is quite long, so we're kind of in the middle, so let's do that.

Okay, this looks like twice as many samples;

we are analyzing 12 periods.

And yes, now we are seeing a little bit more in the spectrum;

if you compare the previous one to the current one,

we are seeing the harmonics maybe more defined.

And we are seeing quite a bit of background things.

So maybe now we are seeing a little bit higher harmonics than we were seeing

before.

Because we have the longer window, we kind of reduce this background,

because it is not a very coherent type of signal, so

it emphasizes the coherent part, the harmonics, and

reduces this kind of stochastic component.
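
One way to put a number on this effect is sketched below with NumPy. It assumes the stochastic component behaves like white noise, which is a simplification. With the window normalized to unit sum, the peak of a steady sinusoid keeps the same height for any window length, while the expected noise power in each bin is the noise variance times the sum of the squared window samples. The function name `noise_floor_db` is hypothetical.

```python
import numpy as np


def noise_floor_db(M):
    """Expected white-noise power per bin, in dB relative to the noise variance,
    for a Blackman window of length M normalized to unit sum."""
    w = np.blackman(M)
    w = w / w.sum()                     # sinusoid peaks stay at the same height
    return 10 * np.log10(np.sum(w ** 2))


# Doubling the window from 1003 to 2006 samples lowers the noise floor
# relative to the harmonic peaks.
drop_db = noise_floor_db(1003) - noise_floor_db(2006)
print(round(drop_db, 2))
```

Under this assumption the background drops by roughly 3 dB each time the window length doubles, while the harmonic peaks stay where they are.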

Okay, so this seems to be a good choice, so let's go to the STFT and

apply the same parameters.

So let's first open the sound, the organ sound, and apply the Blackman window.

5:12

Let's use these 2,006

samples that we took, okay, and

let's use 4096 as the FFT size.

Here now we have to choose a hop size. Here I don't think it

matters that much, but let's just use maybe 500.

So about one fourth of the window, to be able to overlap; in fact, the overlap should be even more,

but for efficiency purposes, let's just leave it like that.
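
A quick way to check a hop size is to overlap-add the analysis window with itself and see how flat the sum is. The sketch below uses NumPy only; the helper `overlap_add_ripple` is hypothetical and not an SMS Tools function. It compares the hop of 500 used here with a hop of half the window.

```python
import numpy as np


def overlap_add_ripple(M, H, n_frames=40):
    """Relative peak-to-peak ripple of Blackman windows of length M
    overlap-added with hop size H, measured away from the edges."""
    w = np.blackman(M)
    total = np.zeros(H * (n_frames - 1) + M)
    for k in range(n_frames):
        total[k * H : k * H + M] += w
    steady = total[M:-M]               # skip the fade-in and fade-out regions
    return (steady.max() - steady.min()) / steady.mean()


ripple_quarter = overlap_add_ripple(2006, 500)    # hop used in the demonstration
ripple_half = overlap_add_ripple(2006, 1003)      # hop of half the window
print(ripple_quarter, ripple_half)
```

With a hop of about a quarter of the window the sum is nearly constant, while a hop of half the window leaves a clearly visible ripple, which is why the Blackman window needs this much overlap.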

Okay, let's compute this.

This will take a little bit longer, because, of course,

it's a longer sound and it is quite big.

Okay, so this is what we are getting and, of course, the resynthesized sound.

No need to listen to it because it's going to be quite identical.

And now, well, we see the horizontal lines; maybe let's zoom into the lower areas.

Okay, now for sure we can see the harmonics a little bit better;

the harmonics clearly look very well defined.

6:26

But there are quite a few things in between.

Okay, that seems to be a good choice.

Now let's do the harmonic analysis, again, using the same parameters.

So we'll go to the organ, let's set the Blackman.

Let's use 2,006 and

let's use 4096 as the FFT size, and

now we have to choose the parameters to identify the peaks and the harmonics.

The magnitude threshold, okay, minus 90, that looks like a reasonable one.

The duration of the harmonics: here, since we are in a long, stable note,

we can even afford to put a longer type of track, so at least let's say,

that they have to last for 0.2 seconds, or it could be even more.

The truth is that there are not that many, because they kind of disappear,

so I would say that 30 or 40 harmonics should be plenty.

7:29

And we know the fundamental frequency is 264, so

the range has to allow for that, so 130 and

300; 264 is within that.

So maybe we can even make this higher so that we make sure it fits correctly.

Okay, that sounds good.

And this is the error for the fundamental frequency detection;

it should be a very clear fundamental, there should not be a problem with that.

And this deviation, this is how much we will allow

the harmonics to deviate from perfect harmonicity.

Let's leave it like this and see what happens.

Okay, let's compute it; again, this will take a little bit longer.

8:20

Okay, so we found quite a few harmonics, and

of course, in the attack and the decay it is very unstable,

so maybe we should have rejected those.

But what is interesting is that some harmonics are quite stable, but

some are very unstable.

Let's listen to the re-synthesized sound.

[SOUND] Okay,

it sounds good.

Let's compare with the original.

[SOUND] Well, if you pay attention,

clearly, the original has more air in the background

that is not in the synthesized sound, because that is mainly the harmonics.

But the truth is, that also some of these higher harmonics are very unstable,

so maybe it's not right to consider them as harmonics.

Because they are basically maybe tracking some noisy part.

So one way to get rid of that is to reduce this deviation and

restrict it even more; for example, it could be 0.001.

And then let's see what happens.
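
To see why a smaller deviation value drops the unstable upper components, here is a simplified sketch of a harmonic matching rule. It is an assumption made for illustration, not the SMS Tools implementation: each ideal harmonic h * f0 accepts its nearest spectral peak only if the peak lies within a tolerance that grows linearly with frequency. The names `match_harmonics`, `dev_slope` and `base_tol`, and the peak values, are all hypothetical.

```python
import numpy as np


def match_harmonics(peak_freqs, f0, n_harmonics, dev_slope, base_tol=2.0):
    """Return (harmonic number, peak frequency) pairs for peaks that lie
    within base_tol + dev_slope * (h * f0) Hz of the ideal harmonic h * f0."""
    peak_freqs = np.asarray(peak_freqs, dtype=float)
    matched = []
    for h in range(1, n_harmonics + 1):
        target = h * f0
        nearest = peak_freqs[np.argmin(np.abs(peak_freqs - target))]
        tolerance = base_tol + dev_slope * target   # assumed tolerance rule
        if abs(nearest - target) < tolerance:
            matched.append((h, float(nearest)))
    return matched


# Made-up peaks: two close to harmonics of 264 Hz, one 8 Hz off the third
# harmonic, and one unrelated high component.
peaks = [264.0, 529.0, 800.0, 2700.0]
loose = match_harmonics(peaks, 264.0, 10, dev_slope=0.01)
strict = match_harmonics(peaks, 264.0, 10, dev_slope=0.001)
print(len(loose), len(strict))
```

With the looser slope the peak at 800 Hz is still accepted as the third harmonic; with 0.001 it is rejected, so a peak like that would end up in the residual rather than in the harmonic part.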

10:11

Okay, now we got rid of quite a lot of these higher,

unstable harmonics or components.

Let's listen to the synthesized sound.

[SOUND] Yeah,

that sounds quite clean and definitely, of course, not as rich as the original sound.

[SOUND] Okay,

now we can go with these same parameters to the harmonic plus

residual model, to subtract these harmonics from the original signal.

So let's use the same sound,

let's use these 2,006 window size,

4096, the threshold was -90,

the minimum duration of track,

I think we put one second.

11:16

So, okay, so 1 second, number of harmonics,

now definitely we can put less, we can even put just 30.

And here we have to make sure the fundamental is within the range, so

let's say 130 to 350, so that it is within that;

for sure it will find it, no problem. And the error,

I don't think this matters too much. But here, for the deviation, we put 0.001,

I think that's two zeroes, okay.

We leave it like that.

Okay, let's now compute this.

11:57

Okay, and this is what we got, and let's see,

so here we see the harmonics it found.

The black lines are the harmonics, and the background spectrogram is the residual.

And we can listen to the different components.

So we can listen to, well, just the sinusoids, the harmonics; we already heard that.

Let's listen to the residual.

12:30

Yeah, that's a very clear and

nice sound, and that is what was

missing from this harmonic sound.

So clearly, if we put these two sounds together,

they will sound like the original.

[SOUND] Compared with the original.

[SOUND] Yeah, that's

an identity basically.
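
The identity follows from how the residual is defined: it is whatever remains after subtracting the harmonic part, so adding the two back always returns the original. The sketch below shows this on a synthetic signal with NumPy. It is a time-domain simplification that fits the harmonics by least squares over the whole signal, rather than the frame-by-frame analysis used in the demonstration, and the signal, amplitudes, and noise level are all assumed values.

```python
import numpy as np

fs = 44100
f0 = 264.0
t = np.arange(fs // 2) / fs            # half a second of signal
rng = np.random.default_rng(0)

# Synthetic stand-in (assumption): three harmonics plus a little white noise.
amps = [1.0, 0.5, 0.25]
clean = sum(a * np.cos(2 * np.pi * f0 * (k + 1) * t) for k, a in enumerate(amps))
x = clean + 0.01 * rng.standard_normal(t.size)

# Fit a cosine and a sine at each harmonic frequency by least squares.
basis = np.column_stack(
    [f(2 * np.pi * f0 * h * t) for h in (1, 2, 3) for f in (np.cos, np.sin)]
)
coef, *_ = np.linalg.lstsq(basis, x, rcond=None)

harmonic = basis @ coef                # harmonic component
residual = x - harmonic                # what the harmonics do not explain

print(np.allclose(harmonic + residual, x), np.std(residual))
```

In this toy case the residual is essentially the added noise, which plays the role of the "air" heard in the organ residual.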

So this was a good sound to explore the potential of the harmonic plus residual model.

Of course, if we take time-varying sounds, that may be a little more tricky,

and we will have to tune the parameters a little bit more.

13:28

But with that I think you get an idea of what the harmonic plus residual model, and

the tool within SMS Tools that implements this model, can do.

So I encourage you to play around, of course, choose other sounds.

Maybe more complex sounds, and sounds that change in time.

So we have talked about the harmonic plus residual model, and

this is one instance of the models that we're talking about this week.

In the next demonstration lecture,

I want to talk about the harmonic plus stochastic model, so that we can model this

residual that we just heard as a stochastic signal and see what it can do.

So I will see you next lecture.

Thank you.