Welcome again to a lecture of the course on Audio Signal Processing for Music Applications. Until now, we have been analyzing sounds using sinusoidal representations. They work, but there are many types of sounds that are best described by what we call stochastic models. Signals like the sound of the ocean or the bow noise of a violin fit into this category of stochastic signals, and we will talk about them today.
We will first introduce the concept of stochastic signals. Then we will see how to model them, that is, what a model of a stochastic signal is, and more specifically how to deal with sounds using these models: how to approximate a particular sound from a stochastic perspective. Finally, we will describe a system that can perform analysis and synthesis of sounds using these models. The stochastic model is complementary to the models that we have covered until now. In fact, in the following lecture we will combine the stochastic model with the sinusoidal-based models, and we will be able to take advantage of the best of both types of models.
What is a stochastic signal? Well, a stochastic signal cannot be described in a deterministic way. It can only be described probabilistically. The field of statistical signal processing deals with this type of signal, and it is quite an advanced topic. Here we will take a very broad approach, which is sufficient for our needs. In statistical signal processing, we use the laws of probability to describe these stochastic signals, and we talk about the mean, the variance, and the probability distribution of particular signals.
There are some mathematical functions that are used to analyze this type of signal and capture some of its characteristics. One example is the autocorrelation function. We have already seen this function before. The autocorrelation function allows us to measure the periodicity of a signal, or the degree of repeating patterns in a particular signal.
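To make the autocorrelation idea concrete, here is a minimal sketch that compares a sinusoid with white noise. The helper name `normalized_autocorrelation`, the signal lengths and the lag range are choices made for this illustration, not something taken from the lecture.

```python
import numpy as np

def normalized_autocorrelation(x):
    """Autocorrelation of x for lags 0..len(x)-1, scaled so that lag 0 equals 1."""
    x = x - np.mean(x)
    full = np.correlate(x, x, mode="full")   # lags -(N-1) .. (N-1)
    acf = full[len(x) - 1:]                  # keep the non-negative lags
    return acf / acf[0]

rng = np.random.default_rng(0)
n = np.arange(2048)
sine = np.sin(2 * np.pi * n / 64)            # periodic, period of 64 samples
noise = rng.standard_normal(n.size)          # white noise

acf_sine = normalized_autocorrelation(sine)
acf_noise = normalized_autocorrelation(noise)

# Largest value away from lag 0 (lags 16..511, an arbitrary range for the demo)
peak_sine = np.max(acf_sine[16:512])
peak_lag_sine = int(np.argmax(acf_sine[16:512])) + 16
peak_noise = np.max(np.abs(acf_noise[16:512]))

print("sine :", peak_sine, "at lag", peak_lag_sine)
print("noise:", peak_noise)
```

The sinusoid shows a strong peak at a lag equal to its period, while the noise stays close to zero at every lag other than zero.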
We used the autocorrelation function for detecting the fundamental frequency. It can also be used to measure how stochastic a signal is. If there are no repetitions, the signal is going to be close to a stochastic signal. So the lower the autocorrelation values away from lag zero, the closer the signal is to a stochastic signal, okay?
Another mathematical function that we can use is what is called the power spectral density. We have also seen a similar version of that. It is basically the DFT, but with a major difference: it is the DFT taken to the limit. We take the squared absolute value of the DFT, normalized by the size N of the DFT, and we let N go to infinity. If that converges to a function, that function is our power spectral density. And that happens for quite a few signals.
Many models have been proposed to deal with this type of stochastic signal. We will use a very general model, expressed by this equation, which is in fact the convolution of two signals. So as a stochastic model we will consider the idea that a signal can be expressed as the convolution of white noise with the impulse response of a filter that approximates our signal. By taking this convolution, we are assuming that the signal we are dealing with is well represented by that impulse response.
If we look at the same equation from a spectral point of view, we can understand a few more things. A convolution in the time domain is, in the frequency domain, the product of the two spectra: the spectrum of the white noise multiplied by the spectrum of the impulse response of the filter. And if we express it in polar coordinates, we can write it as the product of the two magnitude spectra, multiplied by the exponential e to the j times the sum of the two phase spectra, okay? So that is the product of these two spectra.
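The equations referred to above are on the slides and not in this transcript, so the following is a reconstruction from the spoken description. The symbols are assumptions: u is the white noise, h the impulse response of the filter, y the modeled signal, m the convolution index and k the frequency index.

```latex
% Time domain: the model is white noise convolved with an impulse response
y[n] = u[n] * h[n] = \sum_{m} u[m]\, h[n-m]

% Frequency domain: the convolution becomes a product of spectra,
% written in polar form as a product of magnitudes and a sum of phases
Y[k] = U[k]\, H[k] = |U[k]|\,|H[k]|\; e^{\,j\left(\angle U[k] + \angle H[k]\right)}
```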
And if we consider that this is a stochastic signal, we can say that the magnitude spectrum of white noise is a flat line, as we will see later. It is a constant, so we can take it out of the equation. So we can replace the impulse response of the filter with the magnitude spectrum of the input signal, or rather an approximated version of it, which could be the frequency response of a filter. It could also be some other type of function, as long as it approximates the magnitude spectrum of the input signal. And as the phase of the model, we use the phase of white noise, because the phase of a stochastic signal is not so relevant. Therefore we can simply replace the phase representation of our signal with random numbers, the random numbers of the white noise. Okay, so this is a good way to express this stochastic model: we take an approximation of the magnitude spectrum of our signal, and we take random phases to model the phase spectrum.
So, this is an example. We start from a fragment of a sound, for example of an ocean sound, and let's listen to that, [SOUND] okay? We take just one frame of that, and then we compute the magnitude and phase spectrum of this ocean sound. The red plot is the magnitude spectrum of our input signal, and the cyan function is the phases of the input signal. The black line on top of the magnitude spectrum is the approximation of the spectrum. We will talk about different ways to compute it, but it is basically a smooth approximation of the magnitude spectrum. And the black line in the phase spectrum is basically random numbers, okay, and we claim that these random numbers are an approximation, or a model, of the random numbers that are in fact in the ocean phases. So we are basically saying that the phase spectrum of the ocean sound is just random numbers and can be approximated with any random number sequence.
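The claim about the phases can be tried out on a single frame. This is a sketch under stated assumptions: a frame of synthetic white noise stands in for the ocean fragment, and the exact magnitude spectrum is kept (rather than a smooth approximation) so that only the effect of the phase replacement is visible. The variable names are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 512
frame = rng.standard_normal(N)          # stand-in for one frame of a noisy sound

X = np.fft.rfft(frame)                  # positive-frequency half of the spectrum
mag = np.abs(X)

# Random phases in [0, 2*pi). The DC and Nyquist bins are kept real so that
# the inverse transform is a real signal with exactly the same magnitudes.
phase = rng.uniform(0.0, 2.0 * np.pi, mag.size)
phase[0] = 0.0
phase[-1] = 0.0

Y = mag * np.exp(1j * phase)
y = np.fft.irfft(Y, n=N)                # new frame: same magnitudes, random phases

mag_y = np.abs(np.fft.rfft(y))
print("magnitude spectra equal:", np.allclose(mag_y, mag))
print("waveforms equal:        ", np.allclose(y, frame))
```

The two frames share the same magnitude spectrum while their waveforms differ sample by sample, which is the situation the plots describe.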
Okay, and then if we take the inverse Fourier transform of these two black lines, the approximation of the magnitude and the random phases, we get this output signal. We are claiming that perceptually this signal is going to be similar to the first one. Of course, by looking at it, that might not seem to be the case, because it clearly has a different shape. But given that we are talking about stochastic signals, the details of the shape are not relevant. What is important is the statistical properties, and so we will be able to check whether this type of approximation works.
So the main analysis issue for this stochastic model is the approximation of the time-varying magnitude spectrum of the input signal. We will have to compute this approximation at every frame. A common approach for obtaining a filter that approximates the spectral characteristics of a sound is to use linear predictive coding. With LPC, linear predictive coding, we can obtain a set of filter coefficients a sub k, and the frequency response of the resulting filter approximates the spectrum of the input sound.
So here we see the signal x, and the idea is that the approximation of this signal is defined, according to this LPC model, as a linear combination of past samples. Okay, so it is defined as the sum from k = 1 to K of a sub k multiplied by x of n minus k, which are the previous samples. This is basically the expression of an IIR filter, an infinite impulse response filter, in which each sample is a linear combination of previous samples. The goal of LPC is then to find these coefficients a sub k, the ones that best approximate x, that is, that generate a similar signal x hat. So we define an error function that is the sum of the squared difference between the original signal and this approximated signal. We sum originally from minus infinity to infinity; of course, we will then narrow this down to finite-length signals.
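Written out, the LPC prediction and its error measure described above look as follows. The name E for the error is an assumption; the transcript does not name it.

```latex
% Prediction of the current sample from the K previous samples
\hat{x}[n] = \sum_{k=1}^{K} a_k\, x[n-k]

% Squared error between the original and the predicted signal
E = \sum_{n=-\infty}^{\infty} \left( x[n] - \hat{x}[n] \right)^{2}
```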
But with this error measure, basically we can try to identify the a sub k coefficients that minimize this error. We will not talk about how to actually implement that, but this is a very common approach for obtaining these coefficients, and therefore for doing what we call the LPC approximation.
So if we start from a sound, for example a voice sound like this soprano sound that you can listen to, [SOUND] in fact this is the type of sound that is commonly approximated with an LPC model. What it does is obtain this black line that we see in the bottom plot. In the bottom plot, we see the magnitude spectrum of this fragment of the voice, and the black line is the magnitude spectrum of the LPC filter that approximates the signal. As we can see, it approximates a very common characteristic of the voice, the formants, which are resonances. An IIR filter is a way to approximate the resonances of a signal quite well, and so LPC works quite well for these types of signals.
But LPC does not work so well for many other types of signals. So here we present a simpler approximation that is just based on low-pass filtering, and we show it by implementing the low-pass filtering using the DFT. We start from a signal a[k], we take the DFT of that, and we low-pass filter it. Low-pass filtering basically means that we cut the spectrum and keep only its lower part. Then we can take the inverse DFT of that, and we get another signal, a-tilde, which is a smooth approximation of the original a[k] signal. Then we might need to extend the signal in order to generate the same number of samples, or the same sampling rate, as the signal that we started with. In order to do that, we take the DFT of a-tilde, zero-pad it to extend it to a longer FFT size, and then take the inverse DFT of that.
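The low-pass approximation described above can be sketched with a single helper that truncates or zero-pads the DFT. This is only an illustration under assumptions: the function name `fourier_resample`, the sizes 256 and 32, and the synthetic curve standing in for a magnitude spectrum are all made up here, and the Nyquist bin is handled in the simplest possible way.

```python
import numpy as np

def fourier_resample(a, num):
    """Resample the real sequence a to num samples by cutting or zero-padding its DFT."""
    A = np.fft.rfft(a)
    n_bins = num // 2 + 1
    B = np.zeros(n_bins, dtype=complex)
    keep = min(A.size, n_bins)
    B[:keep] = A[:keep]                  # low-pass: keep only the lower bins
    # Rescale so that amplitudes are preserved when the length changes
    return np.fft.irfft(B, n=num) * (num / a.size)

rng = np.random.default_rng(2)
N = 256
k = np.arange(N)
smooth_curve = -40 + 20 * np.cos(2 * np.pi * k / N)   # underlying smooth shape (dB-like)
a = smooth_curve + 3 * rng.standard_normal(N)         # a[k]: noisy curve to approximate

a_tilde = fourier_resample(a, 32)        # compact, smooth approximation (32 values)
b = fourier_resample(a_tilde, N)         # extended back to the original length

err_a = np.mean(np.abs(a - smooth_curve))
err_b = np.mean(np.abs(b - smooth_curve))
print(a_tilde.size, b.size, err_a, err_b)
```

Because the DFT treats the sequence as periodic, a curve whose two ends do not match will show edge effects with this approach; the synthetic curve here was chosen to be periodic to keep the example clean.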
So then b[k] has the same length as a[k], whereas a-tilde, being just an approximation, has fewer samples. That is good, because it means that we have an approximation with a small number of samples. These a-tilde values are basically the coefficients of the approximation. And this is going to be the approach that we will use in our implementations.
So now let's talk about the synthesis part of the stochastic model. If we approximate a sound using LPC, or with any other type of filter design approach, we can synthesize a signal from the obtained filter coefficients by filtering white noise. This equation, which we have already seen before, is the implementation of an IIR filter in which we are filtering white noise. We are filtering the signal u with a series of coefficients a sub k, which are the coefficients of the filter. The implementation of this equation can be done in different ways. For example, these two block diagrams are two different structures that are used to implement this type of filtering. The top one is called the direct form structure and the bottom one is the lattice structure.
But if we obtain an approximation using the low-pass filtering approach that we mentioned, we can synthesize the sound directly by computing the inverse DFT. Here we start from our approximation of the spectrum of the original signal, which is basically the smooth version of it, what we called the low-pass filter approximation. Then we just take random phases, the phases of white noise, and we take the inverse DFT, and that is basically going to be a filtering operation on white noise. Okay, so we start from the smooth approximation of the magnitude spectrum and the random phases, and then we take the inverse DFT. This will be the method that we will use in our examples. So now let's put it all together into an analysis/synthesis system using this stochastic model.
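For the filter-based route mentioned above (LPC coefficients used to filter white noise), the lecture does not show an implementation, so the following is only one possible sketch. It assumes a plain least-squares fit for the coefficients and a direct sample-by-sample recursion for the filter; the function names and the synthetic test signal are hypothetical.

```python
import numpy as np

def lpc_least_squares(x, order):
    """Coefficients a_1..a_K minimizing sum_n (x[n] - sum_k a_k x[n-k])^2 over the frame."""
    # Column k-1 holds x[n-k] for n = order .. len(x)-1
    past = np.column_stack([x[order - k: len(x) - k] for k in range(1, order + 1)])
    target = x[order:]
    a, *_ = np.linalg.lstsq(past, target, rcond=None)
    return a

def all_pole_filter(u, a):
    """Filter u with the recursion y[n] = u[n] + sum_k a[k-1] * y[n-k]."""
    order = len(a)
    y = np.zeros(len(u))
    for n in range(len(u)):
        prev = y[max(0, n - order):n][::-1]       # y[n-1], y[n-2], ...
        y[n] = u[n] + np.dot(a[:len(prev)], prev)
    return y

rng = np.random.default_rng(3)

# Synthetic "input sound": white noise shaped by a known resonant filter
true_a = np.array([1.5, -0.9])
x = all_pole_filter(rng.standard_normal(4096), true_a)

# Analysis: estimate the coefficients from the signal
a_hat = lpc_least_squares(x, order=2)

# Synthesis: filter a new white noise sequence with the estimated coefficients
y = all_pole_filter(rng.standard_normal(4096), a_hat)
print("estimated coefficients:", a_hat)
```

The output y is a different noise realization from x, but it is shaped by nearly the same resonance, which is all the stochastic model asks for.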
Here we see the block diagram that we will be implementing. We start from the signal x of n, hopefully a stochastic signal. We compute the FFT. We take the absolute value. And then we do this stochastic approximation, which is again the idea of low-pass filtering, that is, approximating the magnitude spectrum with a smooth curve.
And then we can do the synthesis. We take this stochastic approximation, which might have to be zero-padded, and so interpolated, to a longer spectrum size, and we generate random numbers for the phase spectrum. We take the inverse FFT of that, and that returns a fragment of a sound. Then we can just do an overlap-add, in exactly the same way that we did for the sinusoidal modeling. Here also we will have to take care of some smoothing windows so that the overlap-add works correctly, but with these we can reconstruct a version of the original signal.
So, let's listen to an example. Okay, this is the ocean sound that we played before. The first plot is the magnitude spectrogram, the absolute value of the spectrum, of this whole sound, computed with a particular window, FFT size and hop size. The stochastic approximation is basically a visualization of the coefficients, which are much fewer. In fact, here we applied a compression factor, so that several samples of our magnitude spectrum are reduced to one. That is the idea of the approximation. And then we can synthesize by combining this magnitude spectrum with random numbers. So let's listen to the synthesized result [NOISE]. If you do an A/B comparison with the original ocean, it sounds different, but it clearly sounds like an ocean sound. So for stochastic signals, it may not be relevant to reproduce the exact characteristics of the sound, but rather its general characteristics, and this is what this approach does.
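The block diagram described above can be sketched end to end as follows. This is an illustration under assumptions, not the course implementation: block averaging and linear interpolation stand in for the low-pass resampling, a Hann window with 50% overlap is used without gain normalization (so the output level differs from the input), and the function names, the frame size and the `factor` parameter are made up here.

```python
import numpy as np

def smooth_envelope(mag_db, n_coef):
    """Reduce a dB magnitude spectrum to n_coef values by averaging adjacent bins."""
    return np.array([chunk.mean() for chunk in np.array_split(mag_db, n_coef)])

def expand_envelope(env, n_bins):
    """Interpolate the envelope values back to n_bins spectral bins."""
    return np.interp(np.linspace(0, len(env) - 1, n_bins), np.arange(len(env)), env)

def stochastic_analysis_synthesis(x, frame_size=512, factor=0.2, seed=0):
    rng = np.random.default_rng(seed)
    hop = frame_size // 2
    n_bins = frame_size // 2 + 1
    n_coef = max(2, int(n_bins * factor))
    window = np.hanning(frame_size)
    y = np.zeros(len(x))
    envelopes = []
    for start in range(0, len(x) - frame_size + 1, hop):
        # Analysis: FFT -> absolute value (in dB) -> smooth, compact envelope
        spectrum = np.fft.rfft(x[start:start + frame_size] * window)
        mag_db = 20 * np.log10(np.maximum(np.abs(spectrum), 1e-10))
        env = smooth_envelope(mag_db, n_coef)
        envelopes.append(env)
        # Synthesis: expand the envelope, attach random phases, inverse FFT
        mag = 10 ** (expand_envelope(env, n_bins) / 20)
        phase = rng.uniform(0, 2 * np.pi, n_bins)
        frame = np.fft.irfft(mag * np.exp(1j * phase), n=frame_size)
        # Overlap-add with a smoothing window
        y[start:start + frame_size] += window * frame
    return y, np.array(envelopes)

# Demo input: low-pass colored noise as a stand-in for a recorded sound
rng = np.random.default_rng(42)
x = np.convolve(rng.standard_normal(16384), np.ones(8) / 8, mode="same")
y, envelopes = stochastic_analysis_synthesis(x)

# Compare the energy in the lowest and highest quarter of the output spectrum
power = np.abs(np.fft.rfft(y)) ** 2
quarter = power.size // 4
low_energy, high_energy = power[:quarter].sum(), power[-quarter:].sum()
print(envelopes.shape, low_energy / high_energy)
```

The output keeps the overall spectral shape of the input (more energy at low frequencies) while being a different waveform, and each frame is described by roughly one fifth as many values as its magnitude spectrum has bins.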
So, the field of statistical signal processing is quite an advanced topic, as I mentioned, and most of the references are quite complicated. If you start by looking at these Wikipedia pages, you can find links to and descriptions of all these more complex views of stochastic processes and statistical signal processing, so feel free to go there and check all these topics.
And that's all. In this lecture we talked about the stochastic model. The goal was to introduce a strategy with which to model some sounds, or parts of sounds, that cannot be well represented with sinusoids. In the next lecture, we will see how we can combine these stochastic models with the other models we have been discussing, the sinusoidal-based models. So I will see you in the next lecture, bye bye.