0:00

Welcome back to the course on audio signal processing for music applications. In the first part of this lecture we presented some properties of the discrete Fourier transform. We now continue with some more properties that will be very useful when using the DFT. In particular, we will talk about energy conservation and decibels, phase unwrapping, zero-padding, and the Fast Fourier Transform together with what we call zero-phase windowing. And finally, we will put it all together with the concept of analysis and synthesis of a sound.

0:42

The property of energy conservation relates to the idea that the energy of a signal can be measured in the same way in the time domain and in the frequency domain, and that the result is the same. So we can compute the energy in either domain.

1:10

The energy is defined as the sum of the squared absolute values of the signal. In the frequency domain, we also take the absolute values squared and sum them, and if we just add a normalization factor, dividing by N, we get the same value. Here we see an example: we have a time-domain signal, we do this energy computation, and we get the value 11.8. If we do the same thing in the frequency domain, squaring the absolute values, summing, and then dividing by N, we get exactly the same value.
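As a minimal sketch of this property, the following Python code computes the energy of a short signal in both domains using only the standard library. The signal values and the helper name `dft` are assumptions made for illustration; they are not the signal from the slide, so the result is not 11.8.

```python
import cmath

def dft(x):
    """Direct DFT: X[k] = sum over n of x[n] * exp(-j*2*pi*k*n/N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

# Hypothetical test signal (not the one shown in the lecture).
x = [1.0, -0.5, 2.0, 0.25, -1.5, 0.75, 0.0, 1.0]
X = dft(x)

# Time domain: sum of squared absolute values.
energy_time = sum(abs(v) ** 2 for v in x)

# Frequency domain: same sum over the spectrum, normalized by N.
energy_freq = sum(abs(v) ** 2 for v in X) / len(X)

print(energy_time, energy_freq)
```

The two printed values should agree up to floating-point rounding.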

1:53

Okay, a concept related to energy is amplitude, which is what we normally use, either in the time domain or in the frequency domain. When we obtain the polar representation of the spectrum of a signal from the DFT, the amplitude is obtained by computing the absolute value, which is a linear measure. However, for the case of sound, a more intuitive representation of the amplitude can be obtained by converting it to decibels, that is, to a logarithmic value. The decibels are defined, as we see here, as 20 times log10 of the absolute value. So from the original time-domain signal, here we can see the absolute value of the spectrum in a linear representation. And what we are now saying is that a more intuitive way to visualize the amplitude in the frequency domain is to use a decibel scale.
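The decibel conversion can be sketched in a few lines of Python. The function name `to_db` and the small `floor` value, which only avoids taking the logarithm of zero, are assumptions for this illustration and do not come from the lecture.

```python
import math

def to_db(magnitudes, floor=1e-12):
    """Convert linear magnitudes to decibels: 20 * log10(|X|).

    `floor` is an assumed safeguard so that a magnitude of exactly
    zero does not cause a math domain error.
    """
    return [20 * math.log10(max(m, floor)) for m in magnitudes]

# Hypothetical linear magnitudes.
linear = [1.0, 0.5, 0.1, 0.01]
db = to_db(linear)
print(db)
```

Halving a magnitude lowers it by about 6 dB and dividing it by ten lowers it by 20 dB, which is why small spectral components remain visible on a decibel scale.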

3:01

Okay, so the spectrum of a signal includes the amplitude, which we now compute in decibels, and also the phase. Phase unwrapping is a way to represent the phase spectrum of the DFT in a way that is easier to visualize and understand. Here we see an original signal, the magnitude spectrum in decibels, and the phase spectrum, computed as the angle of the complex values of the spectrum, and we see clearly that it is a very messy type of visualization. What the unwrapping does is basically smooth that out by adding multiples of two pi whenever there is a discontinuity. Since the computed phase is bounded to a range of two pi, whenever it goes beyond that range it wraps around to the other end. So what we are going to do is unwrap it and let it grow in a natural way. We then get these smoother functions that are much easier to read and interpret.
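A minimal sketch of unwrapping, written in plain Python under the assumption that the true phase changes by less than pi between neighbouring bins. The function `unwrap` below is a hypothetical helper written for this note; NumPy offers a comparable `numpy.unwrap`.

```python
import cmath
import math

def unwrap(phases):
    """Remove jumps larger than pi by accumulating offsets of 2*pi."""
    out = [phases[0]]
    offset = 0.0
    for prev, cur in zip(phases, phases[1:]):
        delta = cur - prev
        if delta > math.pi:        # wrapped from the bottom to the top
            offset -= 2 * math.pi
        elif delta < -math.pi:     # wrapped from the top to the bottom
            offset += 2 * math.pi
        out.append(cur + offset)
    return out

# Hypothetical phase that decreases steadily by 0.9 radians per bin.
true_phase = [-0.9 * k for k in range(12)]

# Computing the angle of a complex value wraps the phase into [-pi, pi].
wrapped = [cmath.phase(cmath.exp(1j * p)) for p in true_phase]

unwrapped = unwrap(wrapped)
print(unwrapped)
```

The wrapped list jumps by almost two pi every few bins, while the unwrapped list recovers the steady slope.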

4:17

Zero-padding means adding zeros at the end of a signal. In the context of the DFT, if we zero-pad in one domain, it produces an interpolated signal in the other domain. Here we see an example of that. We start from a signal x of size eight, okay? This is the signal from which we compute the DFT, so its DFT will also be of size eight, and here we see the absolute value, in this case converted into dB, into decibels, of these eight samples. Now, instead of computing the DFT of these eight samples, we can compute the DFT of these eight samples plus eight samples of zeros, so that the size of the DFT is 16. This is the second plot. By computing a DFT of size 16 of only these eight samples, what we see is a much smoother visualization. The samples of the magnitude spectrum for N = 8 are still exactly here, but apart from those there are interpolated values in between that make the spectrum smoother. And we can go further: if we zero-pad even more, up to N = 32, we get more interpolated values in between, therefore resulting in an even smoother spectrum.
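The interpolation claim can be checked numerically. In this sketch, with an assumed 8-sample signal and a hypothetical `dft` helper, every second bin of the 16-point DFT coincides with a bin of the 8-point DFT, and the bins in between are the interpolated values.

```python
import cmath

def dft(x):
    """Direct DFT: X[k] = sum over n of x[n] * exp(-j*2*pi*k*n/N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

# Hypothetical 8-sample signal (not the one on the slide).
x = [0.0, 1.0, 2.0, 3.0, 3.0, 2.0, 1.0, 0.0]

X8 = dft(x)                  # DFT of size 8
X16 = dft(x + [0.0] * 8)     # same samples plus 8 zeros, DFT of size 16

# Bins 0, 2, 4, ... of the padded spectrum match the original bins.
for k in range(8):
    print(k, abs(X8[k]), abs(X16[2 * k]))
```

Zero-padding adds no new information about the signal; it only samples the same underlying spectrum more densely.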

5:58

Okay, so now let's talk about the Fast Fourier Transform. The DFT can be quite a demanding operation, and the computation can be quite slow if we don't pay attention to implementing it efficiently. The Fast Fourier Transform is an efficient implementation of the DFT equation, and it does that by taking advantage of symmetries. It restricts the size of the input signal to be a power of two, and thanks to that a whole bunch of symmetries appear. So in this example, with eight samples of a signal, we can group the samples so that we take advantage of these symmetries, and then perform the computation on these pairwise groups of samples, therefore having a much more efficient computation.
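One common way to exploit these symmetries is the recursive radix-2 decimation-in-time scheme sketched below. The lecture does not say which FFT variant the slide shows, so treat this as an assumed illustration rather than the course's implementation; the names `fft` and `dft` and the test signal are hypothetical.

```python
import cmath

def dft(x):
    """Direct DFT, used here only as a reference result."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def fft(x):
    """Recursive radix-2 FFT; the length must be a power of two."""
    N = len(x)
    if N < 1 or N & (N - 1):
        raise ValueError("length must be a power of two")
    if N == 1:
        return [complex(x[0])]
    even = fft(x[0::2])   # DFT of the even-indexed samples
    odd = fft(x[1::2])    # DFT of the odd-indexed samples
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * odd[k]
        out[k] = even[k] + t            # first half of the spectrum
        out[k + N // 2] = even[k] - t   # second half reuses the same product
    return out

# Hypothetical 8-sample signal.
x = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 0.25, 1.0]
max_diff = max(abs(a - b) for a, b in zip(fft(x), dft(x)))
print(max_diff)
```

Each level of the recursion halves the problem, and each product `t` serves two output bins, which is where the savings come from.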

7:39

I computed the DFT for different sizes and measured the time that it took. As the size of the DFT increased, the computation time grew very quickly, roughly with the square of the size. For the last one I tried, a size of about 16,000, it took close to two minutes of compute time.

8:01

If I use the FFT implementation that comes with Python, it is of course also efficient because it is implemented in C, but we clearly see the huge difference between the two. For an FFT size of 16,000 samples, the compute time is much less than a millisecond. And this compute time does not grow as steeply as before; it grows much flatter. In fact, it grows in proportion to N log N, which is much lower than the N squared growth of the direct DFT implementation.
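To get a feeling for the two growth rates without relying on timings from a particular machine, the sketch below counts complex multiplications only. This is a rough proxy under stated assumptions: N squared for the direct DFT, (N/2) log2 N for a radix-2 FFT, and 16,384 taken as the power of two closest to the size of about 16,000 mentioned above. The function names are hypothetical.

```python
def direct_dft_multiplies(n):
    """Direct DFT: every one of n outputs uses n complex products."""
    return n * n

def radix2_fft_multiplies(n):
    """Radix-2 FFT: log2(n) stages with n/2 twiddle products each."""
    stages = n.bit_length() - 1   # exact log2 for a power of two
    return (n // 2) * stages

for n in (1024, 4096, 16384):
    d = direct_dft_multiplies(n)
    f = radix2_fft_multiplies(n)
    print(n, d, f, round(d / f))
```

Measured times also depend on the language and on memory access, so these counts indicate the trend rather than the actual speed-up.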

8:51

Okay, so in order to use the FFT, the input signal needs to have a length that is a power of two, but we want to compute the spectrum of a signal of any length. So this is the way we propose to compute the spectrum of a signal: we first do zero-padding, and then we use what we call zero-phase windowing. Let's go through this example. We start from a fragment of a sound, x, that has a given length, let's say 401 samples.

9:25

Now we want to use the FFT, so we need a power of two, and the next power of two is 512. So we add zeros, but this next representation does not add them at the end. It places them by splitting the signal through the middle, and this is what we call zero-phase windowing. The zero sample, which is the center sample of the fragment, goes at the left side of the buffer; that is where time zero is. Then we have the positive-time samples, followed by the zero-padding, which fills the middle of the buffer. And then, at the right side, we have the negative-time samples, the samples that come before the center.
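The buffer layout described here can be sketched as follows, using the 401-sample and 512-point sizes from the example. The function name `zero_phase_buffer` and the ramp used as input are assumptions; the ramp is chosen only so that each sample's original position can be read from its value.

```python
def zero_phase_buffer(x, fft_size):
    """Place the center of x at index 0 and the zero-padding in the middle."""
    M = len(x)
    if fft_size < M:
        raise ValueError("fft_size must be at least len(x)")
    h1 = (M + 1) // 2   # center sample plus the samples after it
    h2 = M // 2         # samples before the center
    buf = [0.0] * fft_size
    buf[:h1] = x[h2:]            # center and positive-time samples on the left
    if h2:
        buf[-h2:] = x[:h2]       # negative-time samples at the right end
    return buf

# Hypothetical fragment: a ramp 1, 2, ..., 401, so the center sample is 201.
x = [float(n + 1) for n in range(401)]
buf = zero_phase_buffer(x, 512)

print(buf[0], buf[200], buf[201], buf[311], buf[312], buf[511])
```

With these sizes the 111 zeros occupy indices 201 through 311, between the positive-time and negative-time halves.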

10:15

So this is the way we pack the signal into what we call the FFT buffer before calling the FFT. If we compute the FFT of that buffer, and then compute the magnitude spectrum in dB and the phase spectrum with unwrapping, we get this visualization, in which we see the symmetry of the magnitude spectrum quite nicely and quite smoothly. For the phase, we see its odd symmetry, and we see a very smooth phase for two reasons: because we did the zero-phase windowing and because we did the unwrapping. With the zero-phase windowing we basically get rid of the phase shift that would occur if we had not centered the samples around zero. And of course, the unwrapping allows us to see this very smooth visualization.
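The effect of centering on the phase can be seen with a small symmetric fragment. In this sketch the 5-sample fragment, the DFT size of 8, and the helper names are assumptions chosen so the arithmetic is easy to follow: padding at the end gives a complex spectrum with a phase slope, while the zero-phase layout gives a purely real spectrum.

```python
import cmath

def dft(x):
    """Direct DFT: X[k] = sum over n of x[n] * exp(-j*2*pi*k*n/N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def zero_phase_buffer(x, fft_size):
    """Center of x at index 0, zeros in the middle, earlier samples at the end."""
    h1, h2 = (len(x) + 1) // 2, len(x) // 2
    buf = [0.0] * fft_size
    buf[:h1] = x[h2:]
    if h2:
        buf[-h2:] = x[:h2]
    return buf

# Hypothetical symmetric fragment, as a symmetric window would be.
w = [1.0, 2.0, 3.0, 2.0, 1.0]

end_padded = w + [0.0] * 3           # zeros appended at the end
centered = zero_phase_buffer(w, 8)   # [3, 2, 1, 0, 0, 0, 1, 2]

imag_end = max(abs(v.imag) for v in dft(end_padded))
imag_centered = max(abs(v.imag) for v in dft(centered))
print(imag_end, imag_centered)
```

Both buffers have the same magnitude spectrum; only the phase differs, because the end-padded version is a time-shifted copy of the centered one.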

Okay, so this is the last part of what I wanted to talk about. We have seen the DFT and we have seen its different properties, so now we can put it all together, doing the analysis and synthesis of a sound with the DFT in what we call an analysis/synthesis type of operation. We start from a signal and compute the FFT, represented in magnitude and phase. Since there are symmetries, we only need to show half of it, the positive side. So this is the positive side of the magnitude spectrum and the positive side of the phase spectrum; the full spectrum is twice as long. Then we can compute the inverse Fourier transform from these and reconstruct the original signal, and it should be exactly the same. If we do things right, the input signal and the output signal should have exactly the same values.
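A minimal round trip can be sketched like this, assuming a real input signal of even length. Only the positive half of the magnitude and phase spectra is kept, the negative half is rebuilt from the conjugate symmetry, and the inverse DFT recovers the signal. The helper names and the signal are hypothetical, and the magnitude is kept linear here rather than in dB.

```python
import cmath

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

# Hypothetical real signal of even length.
x = [0.2, -0.4, 0.9, 0.1, -0.7, 0.3, 0.5, -0.6]
N = len(x)

# Analysis: keep bins 0 .. N/2 as magnitude and phase.
X = dft(x)
half = N // 2 + 1
mag = [abs(v) for v in X[:half]]
phase = [cmath.phase(v) for v in X[:half]]

# Synthesis: rebuild the positive bins, then mirror them as conjugates
# to fill bins N/2+1 .. N-1.
pos = [cmath.rect(m, p) for m, p in zip(mag, phase)]
neg = [pos[k].conjugate() for k in range(N // 2 - 1, 0, -1)]
y = [v.real for v in idft(pos + neg)]

max_error = max(abs(a - b) for a, b in zip(x, y))
print(max_error)
```

The reconstruction error should be at the level of floating-point rounding, which is the sense in which the output is the same as the input.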

12:27

So we have seen this slide before. It is just a slide giving some references and credits. Information on the DFT is available in many places, and on the Fast Fourier Transform too. The sounds come from Freesound. Again, there is the reference to Julius's DFT material on his website, and the standard credits.

12:52

So with this lecture, we complete the presentation of the Fourier transform properties that are of relevance to our audio processing work. In the next lecture, we will take this further and start working with more complex sounds. So I hope to see you next class. Bye-bye.
