<html>
Audio File, Format Conversion, and I/O Utilities

Roger Dannenberg
18 June 97
Revised 6 July 97
Revised 7 May 00 with multiple interface support and inner architecture
    documentation

This document describes a set of portable C utilities for digital audio
input and output to and from files and audio interfaces.

The goals are to be able to read and write sound files in a variety of
formats and to play and record audio. This code is intended for use
in Nyquist, Aura, Amulet, and other systems, and should be portable
to virtually any computer system that supports C and has a file system.


Overview:

There is one main data type: snd_type is a pointer to a descriptor (a
snd_node) for an audio stream, which is either being read from or written
to a file or audio interface. The descriptor contains a structure that
describes the sample format, sample rate, number of channels, etc.

Routines exist to initialize sound transfer (snd_open()), perform transfers
(snd_read(), snd_write()) and to finalize a transfer (snd_close()).  Other
routines allow you to transfer data to/from buffers and to convert formats.
Sample rate conversion is not currently supported, but would be a welcome
addition.

typedef struct {
    long channels;	/* number of channels */
    long mode;		/* ADPCM, PCM, ULAW, ALAW, FLOAT, UPCM */
    long bits;		/* bits per sample */
    double srate;	/* sample rate */
} format_node;


typedef struct {
    short device; 	/* file, audio, or memory */
    short write_flag;	/* SND_READ, SND_WRITE, SND_OVERWRITE */
    union {
        struct {
            char filename[258];	/* file name */
            int file;		/* OS file number */
            long header; /* None, AIFF, IRCAM, NEXT, WAVE */
            long byte_offset;	/* file offset of first sample */
            long end_offset; /* byte_offset of last byte + 1 */
        } file;
        struct {
            long buffer_max;	/* size of buffer memory */
            char *buffer;	/* memory buffer */
            long buffer_len;	/* length of data in buffer */
            long buffer_pos;	/* current location in buffer */
        } mem;
        struct {
            char devicename[258]; /* (optional) to specify device */
            void *descriptor;
            long protocol;	/* SND_REALTIME or SND_COMPUTEAHEAD */
            double latency;	/* app + os worst case latency (seconds) */
            double granularity;	/* expected period of app computation (s) */
            /* note: pass 0.0 for default latency and granularity */
        } audio;
    } u;
    format_node format;	/* sample format: channels, mode, bits, srate */
} snd_node, *snd_type;

The meanings of fields are as follows:
        device: one of SND_DEVICE_FILE (data to/from file), SND_DEVICE_AUDIO
(data to/from audio I/O device), SND_DEVICE_MEM (data to/from in-memory
buffer), or SND_DEVICE_NONE (records that snd_open failed).
        write_flag: one of SND_WRITE (create a file and write to it),
SND_READ (read from a file), SND_OVERWRITE (overwrite some samples within a
file, leaving the header and other samples untouched).
        format: contains number of channels, mode, number of bits per
sample, and sample rate.  mode is SND_MODE_ADPCM (adaptive differential
pulse code modulation), SND_MODE_PCM (pulse code modulation, i.e. simple
linear encoding), SND_MODE_ULAW (Mu-Law), SND_MODE_ALAW (A-Law),
SND_MODE_FLOAT (float), or SND_MODE_UPCM (unsigned PCM).

These are fields for SND_DEVICE_FILE:
        filename: string name for file.
        file: the file number or handle returned by the operating system.
        header: the type and format of header, one of SND_HEAD_NONE (no
header), SND_HEAD_AIFF, SND_HEAD_IRCAM, SND_HEAD_NEXT (Sun and NeXT format),
or SND_HEAD_WAVE.
        byte_offset: the byte offset in the file. After opening the file,
this is the offset of the first sample.  This value is updated after each
read or write.
        end_offset: offset of the byte just beyond the last byte of the file.
        
These are fields for SND_DEVICE_AUDIO:
        devicename: string name for device (to select among multiple 
devices). This may be set to the empty string (devicename[0] = 0) to 
indicate the default audio device, or it may be set to a name obtained
from snd_devicename().
        descriptor: a field to store system-dependent data.
        protocol: SND_REALTIME (use this if you are trying to compute ahead
by a constant amount, especially for low-latency output) or SND_COMPUTEAHEAD
(use this if you want to keep output buffers as full as possible, which will
cause greater compute-ahead).
        latency: (minimum) amount to be kept in buffer(s). This should be at
least as great as the longest computational delay of the application PLUS
the worst case latency for scheduling the application to run.
        granularity: expected period of the periodic computation that
generates samples. Also, granularity indicates the largest number of
samples that will be written with snd_write or read with snd_read by the
application.
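
As a rough illustration of how latency and granularity (both given in
seconds) relate to buffer sizes, the sketch below converts a duration to a
whole number of frames at a given sample rate. The helper name
seconds_to_frames is hypothetical and not part of the library; how the
library actually sizes its buffers is system-dependent.

```c
#include <assert.h>

/* Hypothetical helper (not part of the snd library): number of whole
 * frames needed to cover `seconds` of audio at `srate` frames per second.
 * The result is rounded up so the buffer never holds less than requested. */
long seconds_to_frames(double seconds, double srate)
{
    double exact;
    long frames;
    assert(seconds >= 0.0 && srate > 0.0);
    exact = seconds * srate;
    frames = (long) exact;              /* truncates toward zero */
    if ((double) frames < exact) {
        frames++;                       /* round a partial frame up */
    }
    return frames;
}
```

For example, a latency of 0.25 s at 44100 Hz corresponds to 11025 frames.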

The following fields are for SND_DEVICE_MEM (in-memory data):
        buffer_max: the size of the buffer memory (in bytes)
        buffer: the memory buffer address
        buffer_len: the length of data in the buffer
        buffer_pos: the current location of the input/output in memory.


Routine descriptions:

int snd_open(snd_type snd, long *flags);

To open a file, fill in fields of a snd_type and call snd_open.  If there is
header information in the file or device characteristics for the audio
interface, fields of snd are filled in.  The flags parameter tells which
fields were specified by the snd_open process.  E.g. if you open a raw file,
there is no header info, so the format will be as specified in snd.  On the
other hand, if you open an AIFF file, the file will specify the sample rate,
channels, bits, etc., so all these values will be written into snd, and bits
will be set in flags to indicate the information was picked up from the
file.

Returns SND_SUCCESS iff successful. If not successful, attempts to open a
file will place the return code from open() into the u.file.file field.

Before calling snd_open, all general fields and fields corresponding to the
device (e.g. u.file for SND_DEVICE_FILE) should be set, with the following
exceptions: u.file.header (for SND_WRITE), byte_offset, end_offset,
descriptor.

NOTE: do not call snd_open for SND_DEVICE_MEM, just fill in the fields.
u.mem.buffer_len is the write pointer (snd_write() data goes here), and
u.mem.buffer_pos is the read pointer (snd_read() data comes from here).

NOTE 2: for SND_DEVICE_MEM, you can set write_flag to SND_WRITE, write data
into the buffer, then set write_flag to SND_READ and read the buffer. Use
snd_reset() before reading the buffer again.
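
The read and write pointers described in the two notes above can be pictured
with the following minimal sketch. It is an assumption-laden stand-in, not
the library's code: the mem_demo type and the mem_demo_* functions are
hypothetical, and they count bytes, whereas snd_read() and snd_write()
count frames.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the u.mem fields described above; not the library's type. */
typedef struct {
    long buffer_max;   /* size of buffer memory (bytes) */
    char *buffer;      /* memory buffer */
    long buffer_len;   /* write pointer: length of data in buffer */
    long buffer_pos;   /* read pointer: current location in buffer */
} mem_demo;

/* Append up to n bytes at buffer_len; returns the bytes actually stored. */
long mem_demo_write(mem_demo *m, const char *src, long n)
{
    long room = m->buffer_max - m->buffer_len;
    if (n > room) n = room;
    if (n < 0) n = 0;
    memcpy(m->buffer + m->buffer_len, src, (size_t) n);
    m->buffer_len += n;
    return n;
}

/* Copy up to n bytes starting at buffer_pos; returns the bytes read. */
long mem_demo_read(mem_demo *m, char *dst, long n)
{
    long avail = m->buffer_len - m->buffer_pos;
    if (n > avail) n = avail;
    if (n < 0) n = 0;
    memcpy(dst, m->buffer + m->buffer_pos, (size_t) n);
    m->buffer_pos += n;
    return n;
}

/* Rewind only the read pointer, so the same data can be read again. */
void mem_demo_rewind(mem_demo *m)
{
    m->buffer_pos = 0;
}
```

Writing never moves buffer_pos and reading never moves buffer_len, which is
why the buffer can be filled once and then read several times.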

int snd_close(snd_type snd);

Closes a file or audio device.  There is no need to call snd_close for
SND_DEVICE_MEM, but doing so is not an error.

Returns SND_SUCCESS iff successful.

int snd_seek(snd_type snd, double skip);

After opening a file for reading or overwriting, you can seek ahead to a
specific time point by calling snd_seek.  The skip parameter is in seconds.

Returns SND_SUCCESS iff successful.
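
To make the arithmetic behind a seek in seconds concrete, here is a minimal
sketch that maps a time to a byte offset on a frame boundary. The function
seek_offset is hypothetical, and rounding to the nearest frame is an
assumption; the library may truncate instead.

```c
#include <assert.h>

/* Hypothetical helper (not part of the snd library): byte offset of the
 * frame nearest to `skip` seconds, given the offset of the first sample,
 * the sample rate, and the number of bytes in one frame. */
long seek_offset(long first_sample_offset, double skip,
                 double srate, long bytes_per_frame)
{
    long frame;
    assert(skip >= 0.0 && srate > 0.0 && bytes_per_frame > 0);
    frame = (long) (skip * srate + 0.5);   /* round to nearest frame */
    return first_sample_offset + frame * bytes_per_frame;
}
```

With a 44-byte header and 16-bit stereo data at 44100 Hz (4 bytes per
frame), a skip of 1.0 second lands at byte 44 + 44100 * 4 = 176444.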

int snd_reset(snd_type snd);

Resets non-file buffers.  If snd has SND_DEVICE_AUDIO, then the sample
buffers are flushed. This might be a good idea before reading samples after
a long pause that would cause buffers to overflow and contain old data, or
before writing samples if you want the samples to play immediately,
overriding anything already in the buffers.

If snd has SND_DEVICE_MEM and SND_READ, then the buffer read pointer
(buffer_pos) is reset to zero.  If SND_WRITE is set, then the buffer read
pointer (buffer_pos) and write pointer (buffer_len) are reset to zero.

If snd has SND_DEVICE_FILE, nothing happens.

Returns SND_SUCCESS iff successful.

long snd_read(snd_type snd, void *buffer, long length);

Read up to length frames into buffer. 

Returns the number of frames actually read.

int snd_write(snd_type snd, void *buffer, long length);

Writes length frames from buffer to file or device.  

Returns number of frames actually written.

long snd_convert(snd_type snd1, void *buffer1, 
        snd_type snd2, void *buffer2, long length);

To read from a source and write to a sink, you may have to convert formats.
This routine provides simple format conversions according to what is
specified in snd1 and snd2.  The number of frames to convert is given
by length. Data in buffer2 are converted and written to buffer1.
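
As an illustration of the kind of simple conversion involved, the sketch
below converts between signed 16-bit PCM and float samples. The function
names are hypothetical and are not the library's routines; the scale factor
of 32768, the clipping behavior, and a 16-bit short are all assumptions.

```c
#include <assert.h>

/* Hypothetical: convert n signed 16-bit PCM samples to floats in [-1, 1).
 * Assumes `short` is 16 bits wide. */
void pcm16_to_float(const short *src, float *dst, long n)
{
    long i;
    for (i = 0; i < n; i++) {
        dst[i] = (float) src[i] / 32768.0f;
    }
}

/* Hypothetical: convert n floats back to signed 16-bit PCM, rounding to
 * the nearest integer and clipping values outside the 16-bit range. */
void float_to_pcm16(const float *src, short *dst, long n)
{
    long i;
    for (i = 0; i < n; i++) {
        float x = src[i] * 32768.0f;
        if (x > 32767.0f) x = 32767.0f;     /* clip positive overflow */
        if (x < -32768.0f) x = -32768.0f;   /* clip negative overflow */
        dst[i] = (short) (x < 0.0f ? x - 0.5f : x + 0.5f);
    }
}
```

Because the scale factor is a power of two, any 16-bit sample survives a
round trip through float unchanged.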

long snd_poll(snd_type snd);

The standard way to play files is to put something in the event loop that
refills an output buffer managed by the device driver. This routine allows
you to ask whether there is space to output more samples. If SND_REALTIME is
selected, the number returned by snd_poll will grow fairly smoothly at the
data rate, i.e. if the data rate is 8KB/s, then the result of snd_poll will
increase by 8 bytes per millisecond.  On the other hand, if SND_COMPUTEAHEAD
is selected, then snd_poll will return zero until a sample buffer becomes
available, at which time the value returned will be the entire buffer size.
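
The data-rate figure in the SND_REALTIME case follows directly from the
stream format. A minimal sketch of that arithmetic is shown here; the
function bytes_per_ms is hypothetical and only restates the example above,
taking 8KB/s to mean 8000 bytes per second.

```c
#include <assert.h>

/* Hypothetical helper (not part of the snd library): bytes consumed by
 * the output device per millisecond for a given stream format. */
double bytes_per_ms(double srate, long channels, long bytes_per_sample)
{
    assert(srate > 0.0 && channels > 0 && bytes_per_sample > 0);
    return srate * (double) channels * (double) bytes_per_sample / 1000.0;
}
```

For 8000 Hz mono with one byte per sample this gives 8 bytes per
millisecond; for 48000 Hz 16-bit stereo it gives 192.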

Note: some low-level functions are implemented for conversion from buffers
of floats to various representations and from these representations back to
floats.  See snd.h for their declarations.

int snd_flush(snd_type snd);

When the device is SND_DEVICE_AUDIO, writes are buffered. After the last
write, call snd_flush() to transfer samples from the buffer to the output
device. snd_flush() returns immediately, but it only returns SND_SUCCESS
after the data has been output to the audio device.  Since calling
snd_close() will terminate output, the proper way to finish audio output
is to call snd_flush() repeatedly until it returns SND_SUCCESS. Then call
snd_close() to close the audio device and free buffers.

If snd_flush is called on any open snd_type other than a SND_DEVICE_AUDIO
opened for output, it returns SND_SUCCESS.  Results are undefined if snd_flush
is called on a non-open snd_type.
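
The "call repeatedly until success" pattern for finishing output can be
sketched as follows. To keep the example self-contained it does not call
the library: DEMO_SUCCESS, flush_fn, drain_output, and fake_flush are all
hypothetical stand-ins, with fake_flush playing the role of snd_flush() on
a device that needs a few polls to drain.

```c
#include <assert.h>

#define DEMO_SUCCESS 0  /* stand-in for SND_SUCCESS; real value is in snd.h */
#define DEMO_PENDING 1  /* stand-in for any non-success return code */

/* A flush-like function: returns DEMO_SUCCESS once output has drained. */
typedef int (*flush_fn)(void *stream);

/* Call flush(stream) until it reports success, giving up after max_polls
 * calls. Returns the number of calls made, or -1 if it never succeeded. */
long drain_output(flush_fn flush, void *stream, long max_polls)
{
    long polls;
    for (polls = 1; polls <= max_polls; polls++) {
        if (flush(stream) == DEMO_SUCCESS) {
            return polls;   /* now it is safe to close the device */
        }
        /* a real application would return to its event loop here */
    }
    return -1;
}

/* Fake device: *stream holds the number of polls left before it drains. */
int fake_flush(void *stream)
{
    int *remaining = (int *) stream;
    if (*remaining > 0) {
        (*remaining)--;
    }
    return *remaining == 0 ? DEMO_SUCCESS : DEMO_PENDING;
}
```

The poll limit is a design choice for the sketch, so that a device that
never drains cannot hang the caller.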

long snd_bytes_per_frame(snd_type snd);

Calculates the number of bytes in a frame (a frame has one sample per
channel; sound files are stored as a sequence of frames).
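
A minimal sketch of this calculation, assuming every sample occupies a whole
number of bytes, is given here. The names format_demo and
demo_bytes_per_frame are hypothetical; the library's own routine may treat
packed modes such as ADPCM differently.

```c
#include <assert.h>

/* Same layout as the format_node shown earlier; renamed to make clear
 * that this is a stand-in and not the library's type. */
typedef struct {
    long channels;  /* number of channels */
    long mode;      /* encoding mode (unused in this sketch) */
    long bits;      /* bits per sample */
    double srate;   /* sample rate */
} format_demo;

/* Hypothetical: bytes in one frame, i.e. one sample for each channel,
 * with each sample rounded up to a whole number of bytes. */
long demo_bytes_per_frame(const format_demo *f)
{
    assert(f != 0 && f->channels > 0 && f->bits > 0);
    return f->channels * ((f->bits + 7) / 8);
}
```

For 16-bit stereo this yields 4 bytes per frame, and for 8-bit mono it
yields 1.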

char *snd_mode_to_string(long mode);

Returns a string describing the mode (SND_MODE_PCM, etc.).

char *snd_devicename(int n);

Returns a string describing the n-th audio device. Returns NULL if
n is greater than or equal to the number of audio devices. Available devices
are numbered, starting with the default device at n=0. Before opening an
audio device, an application can use this to enumerate all possible
devices, select one (e.g. by presenting a list to the user), and then
copy the string into the devicename field of the snd_type structure.
If the devicename field is the empty string, device 0 will be opened.

It is easy to construct a higher-level function to play a file, e.g.

aio_node player;

aio_play_init(&player, "mysound.wav");
playing = TRUE;

Then, in the polling loop:

if (playing) {
        if (!aio_play_poll(&player)) playing = FALSE;
}

Examples: see convert.c for examples of:
        Printing information about a sound file
        Converting sound file formats
        Playing audio from a file
        Reading audio from audio input

To compile convert.c under NT: add all the .c files to a console application
project and add these libraries to the Object/library modules list under the
Link tab in the Project Settings dialog box:
        winmm.lib
        ws2_32.lib

Inner architecture description



Audio buffer and time management

</html>
