This digital audio primer provides step-by-step instructions on setting up a development platform for audio effects processing.
Audacity is audio editing software that facilitates the recording, playback, importing and exporting of audio files. It is free, easy to use, and available for Windows, Mac, and Linux operating systems. When recording audio data, Audacity lets the user set the four parameters that must be known in order to import the data into Csound: file format, sampling rate, sample size, and number of channels.
1. Getting Audacity
Go to the Audacity home page - http://audacity.sourceforge.net/
Download and install Audacity version 1.2.1 for the appropriate
operating system. Tutorials, detailed instructions, an FAQ, and online help are
all available from the same page.
2. Set the recording preferences
Select preferences from the file menu
Set the number of channels
Set the sampling rate
Set sample size
Set file format
3. Record your track – detailed instructions are available from
the help menu
4. Export your track as a wave file - you now have an unprocessed
wave file ready to be imported into Csound
Select “Export as WAV…”
Like Audacity, Csound is multi-platform freeware. Csound renders audio data according to the instructions in two interdependent and complementary text files: the orchestra file (.orc) and the score file (.sco). Conveniently, the two files can be combined into one Csound file (.csd). The complete list of commands with detailed syntax is available from the Public Csound Manual - http://www.lakewoodsound.com/csound/hypertext/manual.htm
1. Getting Csound
Go to the Csound FrontPage - http://mitpress.mit.edu/e-books/csound/frontpage.html
Download and install Csound version 4.23 for the appropriate operating
system. Tutorials, detailed instructions, an FAQ, and online help are all
available from the same page.
2. Running Csound
Running the executable file created in the installation process
will bring up the following command window:
- Set the path of the orchestra file (.orc) or Csound file (.csd)
in the first text box
- Set the path of the score file (.sco) in the second text box
- Set the path of the output file in the third text box
- Select the file format in the ‘Format’ panel
- Set the sample size in the ‘Size’ panel
- Make sure the file format and sample size match the recording
preferences set in Audacity
- Click the ‘Render’ button to execute the procedure
3. The Orchestra File
The orchestra file is divided into two separate sections – the
header section and the instrument section.
a. The Header Section
This section defines four parameters – sample rate (sr), control
rate (kr), the sampling rate divided by the control rate (ksmps) and the number
of channels (nchnls). The sampling rate
and number of channels must match the recording preferences set in
Audacity. The following code fragment
is the header from an orchestra file:
sr = 48000
kr = 4800
ksmps = 10
nchnls = 2
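The relationship among the three rate values in the header can be checked with a few lines of Python. This is only a sketch: the function name `check_header` is hypothetical and is not part of Csound.

```python
def check_header(sr, kr, ksmps):
    """Return True when ksmps equals sr divided by kr, as the header requires."""
    return sr == kr * ksmps


# Values taken from the header fragment above.
print(check_header(sr=48000, kr=4800, ksmps=10))  # True
```

A header that fails this check is inconsistent and should be corrected before rendering.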
b. The Instrument Section
The instrument section is composed, appropriately, of a series of instruments. However, these are not instruments in the traditional sense. Each instrument is a function made up of interconnecting modules (opcodes) that generate or modify signals based on specified parameters. These opcodes are processed top to bottom, left to right. The following code reads the left and right channels of a stereo track called ‘audio_file.wav’ and plays them aloud:
instr 1
ar1, ar2 soundin "audio_file.wav"
outc ar1, ar2
endin
This can be parsed line by line to determine exactly how it works.
instr 1
This line declares that a new instrument, referenced by the number 1, is defined by the code that follows.
ar1, ar2 soundin "audio_file.wav"
The ‘soundin’ opcode reads the two channels of the audio file ‘audio_file.wav’
and stores them in two separate signals, ‘ar1’ and ‘ar2’. These two signals can now be processed by
other opcodes.
outc ar1, ar2
The ‘outc’ opcode sends the two signals to the two output channels simultaneously.
endin
The ‘endin’ command marks the end of instrument 1.
4. The Score File
The score file is divided into two separate sections – function
tables and notes.
a. Function Tables
Function tables (f-tables) use mathematical function-drawing
subroutines (GENS) built into Csound to calculate function values based on
specified parameters. Functions are
defined using the following syntax:
f p1 p2 p3 p4…
p1 = Unique function table identification number
p2 = Initialization time expressed in seconds
p3 = Table size
p4 = GEN routine called to create the function table
Parameters numbered 5 or higher depend on p4, that is, on the type of GEN routine. Certain opcodes call upon functions based on their identification number.
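f 1 0 4096 10 1
The above code generates a sine wave with 4096 points, starting at time 0 and referenced by the number 1. GEN routine 10 builds a waveform from harmonically related sine partials; the single p5 value of 1 requests only the fundamental, which yields a pure sine wave.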
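To make the idea of a function table concrete, the following Python sketch computes the kind of data such a table holds: one cycle of a sine wave sampled at 4096 equally spaced points. The function name `sine_table` is hypothetical, and the sketch ignores details that Csound's own routine handles, such as normalization and the extra guard point.

```python
import math


def sine_table(size):
    """Return one cycle of a sine wave sampled at `size` equally spaced points."""
    return [math.sin(2 * math.pi * i / size) for i in range(size)]


table = sine_table(4096)
print(len(table))              # 4096
print(round(table[1024], 6))   # 1.0, a quarter of the way through the cycle
```

An opcode that reads this table repeatedly at a given rate produces a sine tone at the corresponding frequency.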
b. Notes
Notes are individual statements that designate a specific
instrument be made active for a certain duration based on parameters. The syntax for a note statement is:
i p1 p2 p3…
p1 = Instrument number
p2 = Start time (sec)
p3 = Duration (sec)
Parameters numbered 4 or higher are set by the user. They can be referenced by the corresponding
instrument opcodes.
i 1 3 5
The note above will play instrument 1, beginning at 3 seconds, for
5 seconds.
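The field layout of a note statement can be illustrated with a short Python sketch. It assumes the fields are separated by whitespace, as in the example above, and the function name `parse_note` is hypothetical.

```python
def parse_note(statement):
    """Split an i-statement into instrument, start, duration and extra fields."""
    fields = statement.split()
    if fields[0] != "i":
        raise ValueError("not a note statement")
    values = [float(f) for f in fields[1:]]
    return {
        "instrument": int(values[0]),
        "start": values[1],
        "duration": values[2],
        "extra": values[3:],  # p4 and higher, defined by the user
    }


note = parse_note("i 1 3 5")
print(note["start"] + note["duration"])  # 8.0, the time at which the note ends
```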
The above description of Csound and its components is by no means comprehensive. It is meant to orient readers to the basics of how the language works so that they may begin to decipher the code behind the audio effects.
5. The Csound (.csd) File
The orchestra and score files can be combined into one file using
the following html-style syntax:
<CsoundSynthesizer> marks the beginning of the Csound file
</CsoundSynthesizer> marks the end of the Csound file
<CsInstruments> marks the beginning of the orchestra file
</CsInstruments> marks the end of the orchestra file
<CsScore> marks the beginning of the score file
</CsScore> marks the end of the score file
The text placed between the designated markers is identical to the
independent versions of the orchestra and score files.
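As an illustration of how the markers nest, the Python sketch below wraps orchestra and score text in the markers listed above. The function name `build_csd` is hypothetical, and the score line `i 1 0 5` is an assumed example that plays instrument 1 for 5 seconds from time 0.

```python
def build_csd(orchestra, score):
    """Wrap orchestra and score text in the .csd markers described above."""
    return "\n".join([
        "<CsoundSynthesizer>",
        "<CsInstruments>",
        orchestra.strip(),
        "</CsInstruments>",
        "<CsScore>",
        score.strip(),
        "</CsScore>",
        "</CsoundSynthesizer>",
        "",
    ])


orchestra = """
sr = 48000
kr = 4800
ksmps = 10
nchnls = 2

instr 1
ar1, ar2 soundin "audio_file.wav"
outc ar1, ar2
endin
"""

score = """
i 1 0 5
"""

print(build_csd(orchestra, score))
```

The printed text can be saved with a .csd extension and supplied to Csound in place of the separate orchestra and score files.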
A wave file is composed of a header followed by data divided into chunks of different types. Chunks come in several formats with different functions, but only two fall within the scope of this project: the format chunk and the data chunk. These are also the two chunks required in every wave file.
The header serves only to identify the file. It consists of the characters “RIFF”, designating the Resource Interchange File Format, followed by the size of the rest of the file (the total file size minus 8 bytes), followed by “WAVE”, identifying this RIFF file as a wave file.
The format chunk begins with the characters "fmt " (the fourth character is a space) and contains the following descriptive parameters, listed with their size and variable type:
1. Chunk size (4 bytes, unsigned long)
The remaining size of the chunk (16 for uncompressed data)
2. Format tag (2 bytes, unsigned short)
For uncompressed data this field is 1
3. Number of channels (2 bytes, unsigned short)
The number of channels
4. Samples per second (4 bytes, unsigned long)
The sampling frequency in Hz
5. Average bytes per second (4 bytes, unsigned long)
Used by audio players to determine buffering requirements
6. Block alignment (2 bytes, unsigned short)
The storage in bytes required for one time step, i.e. the number of channels times the sample width in bytes
7. Bits per sample (2 bytes, unsigned short)
The bits used to represent each sample
The data chunk contains the actual audio data. It begins with the characters "data", which identify the chunk, followed by the chunk size in bytes. The samples themselves then follow, written according to the specifications in the format chunk. The number of samples per channel is the chunk size divided by the block alignment value in the format chunk.
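The layout described above can be explored with the following Python sketch, which builds a tiny wave file in memory using the standard `wave` module and then decodes the header, format chunk and data chunk by hand with `struct`. The function names `make_test_wav` and `parse_wav` are hypothetical, and the sketch assumes an uncompressed file in which both chunks are present.

```python
import io
import struct
import wave


def make_test_wav():
    """Build a tiny 16-bit stereo wave file in memory (4 frames of silence)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as w:
        w.setnchannels(2)
        w.setsampwidth(2)            # bytes per sample, i.e. 16 bits
        w.setframerate(44100)
        w.writeframes(b"\x00" * 16)  # 4 frames * 2 channels * 2 bytes
    return buffer.getvalue()


def parse_wav(data):
    """Read the RIFF header, then walk the chunks and decode 'fmt ' and 'data'."""
    riff, riff_size, wave_id = struct.unpack("<4sI4s", data[:12])
    if riff != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    info = {}
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, chunk_size = struct.unpack("<4sI", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + chunk_size]
        if chunk_id == b"fmt ":
            (info["format_tag"], info["channels"], info["sample_rate"],
             info["byte_rate"], info["block_align"],
             info["bits_per_sample"]) = struct.unpack("<HHIIHH", body[:16])
        elif chunk_id == b"data":
            info["data_bytes"] = chunk_size
        pos += 8 + chunk_size + (chunk_size % 2)  # chunks are padded to even length
    # Samples per channel: data chunk size divided by the block alignment.
    info["frames"] = info["data_bytes"] // info["block_align"]
    return info


print(parse_wav(make_test_wav()))
```

The format string `<HHIIHH` mirrors the field sizes in the list above: two 2-byte values, two 4-byte values, then two more 2-byte values, all little-endian.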
Although the wave file format does not directly affect the user, it defines three parameters relevant to both Audacity and Csound: sampling rate, sample size, and number of channels. Importing wave files with unknown specifications into Csound can cause compatibility errors. To account for this, Csound has built-in opcodes for identifying wave file parameters. The Csound reference manual (http://www.lakewoodsound.com/csound/hypertext/manual.htm) details their use.
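The same parameters can also be inspected outside Csound before a file is imported. The Python sketch below uses the standard `wave` module for this purpose; it is independent of Csound's own opcodes, and the function names `wav_parameters` and `matches_header` are hypothetical.

```python
import io
import wave


def wav_parameters(source):
    """Return (channels, sample rate in Hz, bits per sample) for a wave file.

    `source` may be a file name or a binary file-like object.
    """
    with wave.open(source, "rb") as w:
        return w.getnchannels(), w.getframerate(), w.getsampwidth() * 8


def matches_header(source, sr, nchnls):
    """Check a wave file against the sr and nchnls values of a Csound header."""
    channels, rate, _bits = wav_parameters(source)
    return channels == nchnls and rate == sr


# Demonstration on an in-memory file, so the sketch runs without audio on disk.
demo = io.BytesIO()
with wave.open(demo, "wb") as w:
    w.setnchannels(2)
    w.setsampwidth(2)
    w.setframerate(48000)
    w.writeframes(b"\x00" * 8)
demo.seek(0)
print(matches_header(demo, sr=48000, nchnls=2))  # True
```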
An analog-to-digital conversion is the process of quantizing a continuously varying analog signal (voltage). The sound card of the recording device performs the conversion, and the end result is a discrete digital signal. A numerical approximation of the analog signal, known as a sample, is taken at regular time intervals. The sampling rate is the number of samples taken per second, measured in Hertz (Hz). This rate is chosen according to the Nyquist limit, which states that a digitally sampled sound can exactly reproduce any analog sound whose frequency is less than half the sampling rate. The highest frequency audible to human beings is approximately 20,000 Hz, so sampling at a rate above 40,000 Hz ensures that all frequencies within the range of human hearing can be digitally represented. The numerical value of each sample depends on the number of bits per sample (sample size). Given a sample size s, the value of any sample must be an integer within the following range:
$[-2^{s-1},\ 2^{s-1} - 1]$
Compact disc quality is defined as a sampling rate of 44,100 Hz and a sample size of 16 bits. A 2-second, single-channel sound clip at CD quality is therefore stored as 88,200 separate integers, each between -32768 and 32767.
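The arithmetic behind these figures can be reproduced with a short Python sketch. The function names `sample_range` and `sample_count` are hypothetical.

```python
def sample_range(bits):
    """Return the smallest and largest integer a signed sample of `bits` bits can hold."""
    return -2 ** (bits - 1), 2 ** (bits - 1) - 1


def sample_count(seconds, rate, channels=1):
    """Return the number of stored integers for a clip of the given length."""
    return seconds * rate * channels


print(sample_range(16))           # (-32768, 32767)
print(sample_count(2, 44100))     # 88200
print(sample_count(2, 44100, 2))  # 176400 for a stereo clip
```

Each additional channel multiplies the number of stored integers, which is why a stereo clip of the same length holds twice as many.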
Mathematically, the digital audio signal is defined using two related functions, the input signal x[n] and the output signal y[n]. The integer variable ‘n’ refers to the sample number, which ranges from 0 to 88,199 in this case. The functions x[n] and y[n] refer to the sample value, between -32768 and 32767. The output signal function y[n] is generated by processing the input signal x[n].
For instance, the following output signal is twice the input signal:
y[n] = 2 x[n]
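In code, this relationship is a single pass over the samples. The Python sketch below applies it to a short list of 16-bit values. The clamping step is an assumption added here, since doubling a large sample would otherwise fall outside the representable range, and the function name `apply_gain` is hypothetical.

```python
def apply_gain(x, gain=2, bits=16):
    """Compute y[n] = gain * x[n], clamping each result to the signed sample range."""
    low, high = -2 ** (bits - 1), 2 ** (bits - 1) - 1
    return [max(low, min(high, gain * sample)) for sample in x]


x = [0, 100, -250, 20000]
print(apply_gain(x))  # [0, 200, -500, 32767]
```

The last value shows the clamp at work: 2 × 20000 exceeds 32767, so the output is held at the maximum.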
The digital-to-analog conversion reconstructs the analog signal from its digital representation. Again the sound card handles the process, this time on the playback device. The smooth analog signal is recreated by playing ‘connect the dots’ with the discrete values of the digital signal. Mathematically, the signal is reconstructed based on principles of Fourier theory, which essentially states that any periodic signal can be represented as the sum of a series of sine waves. In the diagram below, the dark blue dots are discrete sample integer values. The smooth light blue curve is the reconstructed analog signal.
Audacity home page. (2004). Retrieved 3/04 from http://audacity.sourceforge.net/
Boulanger, Richard. (2000). The Csound book. Cambridge, MA: MIT Press.
Bourke, Paul. (2001). Wave file sound format. Retrieved 3/04 from http://astronomy.swin.edu.au/~pbourke/dataformats/wave/
Schindler, Allan. (1998). Eastman Csound tutorial. Eastman School of Music. Retrieved 1/04 from http://www.esm.rochester.edu/onlinedocs/allan.cs/
Vercoe, Barry. (1992). The public Csound reference manual, version 4.16. MIT Press. Retrieved 2/04 from http://www.lakewoodsound.com/csound/hypertext/manual.htm
Zolzer, Udo. (2002). Digital audio effects. West Sussex, England: Baffins Lane.