Application version | 1.0 |
Trad4 version | 3.2 |
Document version | 1.0 |
Author | schevans |
Date | 26-06-2010 |
This is a concurrent model of the discrete Fourier transform. This allows us to decompose a particular sample into its component frequencies. It's the inverse of additive_synth, and their interaction is shown below.
The principle of the discrete Fourier transform is that, given the base frequency of the sample, we calculate the amplitude of each harmonic using a process called correlation in a way that allows us to reconstruct the sample using additive synthesis.
Each harmonic is tested for correlation by multiplying a pure sine wave of that harmonic's frequency by the sample we're analysing, point by point, and summing the products. This is not particularly intuitive, so for further reading please see Steven W. Smith's excellent DSP book, available for free online and in all good bookshops. Correlation is described here, but you should probably also read most of Chapter 8.
We have, from the bottom, the source object into which the sample is loaded, then the correlators, and lastly the monitor, which writes out the frequency domain report.
From the above diagram we can see how the correlators run concurrently. Each correlator is tasked with finding the correlation to a specific harmonic. The first correlator looks for the base frequency f1 (the frequency of the underlying sample). The next correlator looks for f2 - twice the base frequency, and so on.
We start with analysing samples from additive_synth. The advantage here is we know how they were synthesised, so we can compare the results of the transform with the inputs to the synthesiser.
When a sine wave such as sine_f1_h16.wav from additive_synth is run through the transform, we see that it is composed of a single wave of the fundamental frequency, as expected:
Recall the square wave from additive_synth was given by:
if ( id % 2 != 0 ) // Is odd id
    level = 1.0 / id;
else // Is even id
    level = 0.0;
Which gives:
id | correlation |
1 | 1.0 |
2 | 0.0 |
3 | 0.333 |
4 | 0.0 |
5 | 0.2 |
6 | 0.0 |
7 | 0.143 |
... | ... |
And this is what we see when the square wave is analysed:
Likewise the triangle wave, which is given by:
if ( id % 2 != 0 ) // Is odd id
    level = pow( -1.0, ( id - 1 ) / 2 ) * ( 1.0 / (id*id) );
else // Is even id
    level = 0.0;
Which gives:
id | correlation |
1 | 1.0 |
2 | 0.0 |
3 | -0.1111 |
4 | 0.0 |
5 | 0.04 |
6 | 0.0 |
7 | -0.0204 |
... | ... |
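The sign term in the triangle rule relies on integer division: ( id - 1 ) / 2 evaluates to 0, 1, 2, 3 for ids 1, 3, 5, 7, so the sign alternates between odd harmonics. The sketch below restates the rule as a self-contained function, with the hypothetical name `triangle_level`, and swaps the pow call for an explicit parity test to make the alternation visible. It is a restatement for checking the arithmetic, not the additive_synth source.

```c
#include <math.h>

/* Level of harmonic `id` in the triangle wave. */
double triangle_level(int id)
{
    if (id % 2 == 0)    /* Even ids contribute nothing. */
        return 0.0;

    /* Integer division: (id - 1) / 2 is 0, 1, 2, 3 for id 1, 3, 5, 7.
     * An even result gives a positive sign, an odd result a negative one. */
    double sign = (((id - 1) / 2) % 2 == 0) ? 1.0 : -1.0;

    return sign / ((double)id * (double)id);
}
```

Evaluating it for id 3, 5 and 7 gives -1/9 (about -0.1111), 1/25 (0.04) and -1/49 (about -0.0204).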
Which is again what we see when the sample is analysed:
Up until now we've been analysing output samples from additive_synth where we know their composition. In this next example we're going to analyse a sample from a Yamaha CS1x.
We're first going to analyse the samples, then plug the resultant frequency domain report back into additive_synth and see if we can re-synthesise the sample.
For each of the original samples discussed below, two files are included: a recording a few seconds long of the original sample, and a short "_single" version which contains just a single cycle of the waveform. These can sound quite different. For example, if the original sample had a low frequency oscillator (LFO) applied, this causes a warble in the sound which won't be captured by listening to the single waveform.
These single waveforms are very short audio files, so when you open them in an editor like Audacity you may not see or hear anything. To see anything, zoom in on the start of the file. To hear anything, play the waveform in a loop, which is done in Audacity by holding down Shift when clicking Play.
The first CS1x sample we'll analyse is Clarinet. The original full length sample is here, and the single waveform here. It looks like this:
The synthetic sample is available here.
From this we can see it's pretty close. So why aren't they closer? The likely answer is that the original sample contains some phase-shifted components which we would pick up if we were looking for cosine correlation, but we are only looking for correlation with the main sinusoidal components (because they make up the bulk of an audible tone). Scope for further work here.
The next CS1x sample we'll analyse is ChrchOrg. The original full length sample is here, and the single waveform here. It looks like this:
The frequency domain report is shown below.
The synthetic sample is available here.
This is also close, and sounds pretty similar too.
To run the application:
1) Download and unpack the distribution
2) cd into trad4_v3_2/fourier_transform:
$ cd trad4_v3_2/fourier_transform
3) Source fourier_transform.conf:
fourier_transform$ . ./fourier_transform.conf
4) Start fourier_transform:
fourier_transform$ fourier_transform
To increase or decrease the number of threads used (the default is 4), set NUM_THREADS and re-start the application:
$ export NUM_THREADS=64
$ fourier_transform
To load a different waveform, use the load_waveform.pl command. It takes two arguments: the .wav file to analyse and the fundamental frequency of the sample in Hz, which you must know in advance. For example:
fourier_transform$ load_waveform.pl input/sine_f1_h16.wav 1.0
Or (where 261.626Hz equals middle C):
fourier_transform$ load_waveform.pl input/hammond_888000000.wav 261.626
This can be done while fourier_transform is running, i.e. when the environment variable BATCH_MODE is unset.