page.title=Audio Terminology
@jd:body
This glossary of audio-related terminology includes widely used generic terms
and Android-specific terms.
Generic Terms
Generic audio-related terms have conventional meanings.
Digital Audio
Digital audio terms relate to handling sound using audio signals encoded
in digital form. For details, refer to
Digital Audio.
- acoustics
-
Study of the mechanical properties of sound, such as how the physical
placement of transducers (speakers, microphones, etc.) on a device affects
perceived audio quality.
- attenuation
-
Multiplicative factor less than or equal to 1.0, applied to an audio signal
to decrease the signal level. Compare to gain.
- audiophile
-
Person concerned with a superior music reproduction experience, especially
willing to make substantial tradeoffs (expense, component size, room design,
etc.) for sound quality. For details, refer to
audiophile.
- bits per sample or bit depth
-
Number of bits of information per sample.
- channel
-
Single stream of audio information, usually corresponding to one location of
recording or playback.
- downmixing
-
Decrease the number of channels, such as from stereo to mono or from 5.1 to
stereo. Accomplished by dropping channels, mixing channels, or more advanced
signal processing. Simple mixing without attenuation or limiting has the
potential for overflow and clipping. Compare to upmixing.
- DSD
-
Direct Stream Digital. Proprietary audio encoding based on
pulse-density
modulation. While Pulse Code Modulation (PCM) encodes a waveform as a
sequence of individual audio samples of multiple bits, DSD encodes a waveform as
a sequence of bits at a very high sample rate (without the concept of samples).
Both PCM and DSD represent multiple channels by independent sequences. DSD is
better suited to content distribution than as an internal representation for
processing as it can be difficult to apply traditional digital signal processing
(DSP) algorithms to DSD. DSD is used in Super Audio CD (SACD) and in DSD over
PCM (DoP) for USB. For details, refer to Direct Stream
Digital.
- duck
-
Temporarily reduce the volume of a stream when another stream becomes active.
For example, if music is playing when a notification arrives, the music ducks
while the notification plays. Compare to mute.
- FIFO
-
First In, First Out. Hardware module or software data structure that implements
First In, First Out
queueing of data. In an audio context, the data stored in the queue are
typically audio frames. FIFO can be implemented by a
circular buffer.
- frame
-
Set of samples, one per channel, at a point in time.
- frames per buffer
-
Number of frames handed from one module to the next at one time. The audio HAL
interface uses the concept of frames per buffer.
- gain
-
Multiplicative factor greater than or equal to 1.0, applied to an audio signal
to increase the signal level. Compare to attenuation.
- HD audio
-
High-Definition audio. Synonym for high-resolution audio (but different from
Intel High Definition Audio).
- Hz
-
Hertz. Unit for sample rate or frame rate.
- high-resolution audio
-
Representation with greater bit depth and sample rate than CDs (stereo 16-bit
PCM at 44.1 kHz) and without lossy data compression. Equivalent to HD audio.
For details, refer to
high-resolution
audio.
- latency
-
Time delay as a signal passes through a system.
- lossless
-
A lossless data
compression algorithm that preserves bit accuracy across encoding and
decoding, where the result of decoding previously encoded data is equivalent
to the original data. Examples of lossless audio content distribution formats
include CDs, PCM within
WAV, and
FLAC.
The authoring process may reduce the bit depth or sample rate from that of the
masters; distribution
formats that preserve the resolution and bit accuracy of masters are the subject
of high-resolution audio.
- lossy
-
A lossy data
compression algorithm that attempts to preserve the most important features
of media across encoding and decoding where the result of decoding previously
encoded data is perceptually similar to the original data but not identical.
Examples of lossy audio compression algorithms include MP3 and AAC. As analog
values are from a continuous domain and digital values are discrete, ADC and DAC
are lossy conversions with respect to amplitude. See also transparency.
- mono
-
One channel.
- multichannel
-
See surround sound. In strict terms, stereo is more than one
channel and could be considered multichannel; however, such usage is confusing
and thus avoided.
- mute
-
Temporarily force volume to be zero, independent of the usual volume controls.
- overrun
-
Audible glitch caused by
failure to accept supplied data in sufficient time. For details, refer to
buffer underrun.
Compare to underrun.
- panning
-
Direct a signal to a desired position within a stereo or multichannel field.
- PCM
-
Pulse Code Modulation. Most common low-level encoding of digital audio. The
audio signal is sampled at a regular interval, called the sample rate, then
quantized to discrete values within a particular range depending on the bit
depth. For example, for 16-bit PCM the sample values are integers between
-32768 and +32767.
- ramp
-
Gradually increase or decrease the level of a particular audio parameter, such
as the volume or the strength of an effect. A volume ramp is commonly applied
when pausing and resuming music to avoid a hard audible transition.
- sample
-
Number representing the audio value for a single channel at a point in time.
- sample rate or frame rate
-
Number of frames per second. While frame rate is more accurate,
sample rate is conventionally used to mean frame rate.
- sonification
-
Use of sound to express feedback or information, such as touch sounds and
keyboard sounds.
- stereo
-
Two channels.
- stereo widening
-
Effect applied to a stereo signal to make another stereo signal that sounds
fuller and richer. The effect can also be applied to a mono signal, where it is
a type of upmixing.
- surround sound
-
Techniques for increasing the ability of a listener to perceive sound position
beyond stereo left and right.
- transparency
-
Ideal result of lossy data compression. Lossy data conversion is transparent if
it is perceptually indistinguishable from the original by a human subject. For
details, refer to
Transparency.
- underrun
-
Audible glitch caused by
failure to supply needed data in sufficient time. For details, refer to
buffer underrun.
Compare to overrun.
- upmixing
-
Increase the number of channels, such as from mono to stereo or from stereo to
surround sound. Accomplished by duplication, panning, or more advanced signal
processing. Compare to downmixing.
- virtualizer
-
Effect that attempts to spatialize audio channels, such as trying to simulate
more speakers or give the illusion that sound sources have position.
- volume
-
Loudness, the subjective strength of an audio signal.
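Several of the digital audio terms above (PCM, gain, attenuation, downmixing,
and clipping) can be illustrated with a small sketch. The Python below is
illustrative only and is not taken from any Android source: the function names
are hypothetical, and scaling float samples by 32768 is just one common
convention for quantizing to 16-bit PCM.

```python
INT16_MIN, INT16_MAX = -32768, 32767  # range of 16-bit PCM sample values


def clamp16(value):
    """Limit a value to the 16-bit PCM range (hard clipping)."""
    return max(INT16_MIN, min(INT16_MAX, value))


def float_to_pcm16(x):
    """Quantize a float sample in [-1.0, 1.0] to a 16-bit PCM integer.

    Assumption: scale by 32768 and clip, so +1.0 maps to +32767.
    """
    return clamp16(int(round(x * 32768)))


def scale(samples, factor):
    """Apply a multiplicative factor to each sample.

    A factor >= 1.0 acts as gain, a factor <= 1.0 acts as attenuation.
    """
    return [clamp16(int(s * factor)) for s in samples]


def downmix_stereo_to_mono(frames, attenuation=0.5):
    """Mix each (left, right) frame into a single mono sample.

    Attenuating the sum keeps it within range; with attenuation=1.0
    two loud channels can overflow and are clipped by clamp16.
    """
    return [clamp16(int((left + right) * attenuation)) for left, right in frames]
```

With these definitions, `downmix_stereo_to_mono([(30000, 30000)])` stays in
range at 30000, while the same call with `attenuation=1.0` clips to 32767,
which is the overflow case the downmixing entry warns about.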
Inter-device interconnect
Inter-device interconnection technologies connect audio and video components
between devices and are readily visible at the external connectors. The HAL
implementer and end user should be aware of these terms.
- Bluetooth
-
Short range wireless technology. For details on the audio-related
Bluetooth profiles
and
Bluetooth protocols,
refer to A2DP for
music, SCO for telephony, and Audio/Video Remote Control Profile (AVRCP).
- DisplayPort
-
Digital display interface by the Video Electronics Standards Association (VESA).
- HDMI
-
High-Definition Multimedia Interface. Interface for transferring audio and
video data. For mobile devices, a micro-HDMI (type D) or MHL connector is used.
- Intel HDA
-
Intel High Definition Audio (do not confuse with generic high-definition
audio or high-resolution audio). Specification for a front-panel
connector. For details, refer to
Intel High
Definition Audio.
- MHL
-
Mobile High-Definition Link. Mobile audio/video interface, often over micro-USB
connector.
- phone connector
-
Mini or sub-mini component that connects a device to wired headphones, headset,
or line-level amplifier.
- SlimPort
-
Adapter from micro-USB to HDMI.
- S/PDIF
-
Sony/Philips Digital Interface Format. Interconnect for uncompressed PCM. For
details, refer to S/PDIF.
- Thunderbolt
-
Multimedia interface that competes with USB and HDMI for connecting to high-end
peripherals. For details, refer to Thunderbolt.
- USB
-
Universal Serial Bus. For details, refer to
USB.
Intra-device interconnect
Intra-device interconnection technologies connect internal audio components
within a given device and are not visible without disassembling the device. The
HAL implementer may need to be aware of these, but not the end user. For details
on intra-device interconnections, refer to the following articles:
Audio Signal Path
Audio signal path terms relate to the signal path that audio data follows from
an application to the transducer or vice versa.
- ADC
-
Analog-to-digital converter. Module that converts an analog signal (continuous
in time and amplitude) to a digital signal (discrete in time and amplitude).
Conceptually, an ADC consists of a periodic sample-and-hold followed by a
quantizer, although it does not have to be implemented that way. An ADC is
usually preceded by a low-pass filter to remove any high frequency components
that are not representable using the desired sample rate. For details, refer to
Analog-to-digital
converter.
- AP
-
Application processor. Main general-purpose computer on a mobile device.
- codec
-
Coder-decoder. Module that encodes and/or decodes an audio signal from one
representation to another (typically analog to PCM or PCM to analog). In strict
terms, codec is reserved for modules that both encode and decode but
can be used loosely to refer to only one of these. For details, refer to
Audio codec.
- DAC
-
Digital-to-analog converter. Module that converts a digital signal (discrete in
time and amplitude) to an analog signal (continuous in time and amplitude).
Often followed by a low-pass filter to remove high-frequency components
introduced by digital quantization. For details, refer to
Digital-to-analog
converter.
- DSP
-
Digital Signal Processor. Optional component typically located after the
application processor (for output) or before the application processor (for
input). Primary purpose is to off-load the application processor and provide
signal processing features at a lower power cost.
- PDM
-
Pulse-density modulation. Form of modulation used to represent an analog signal
by a digital signal, where the relative density of 1s versus 0s indicates the
signal level. Commonly used by digital-to-analog converters. For details, refer
to Pulse-density
modulation.
- PWM
-
Pulse-width modulation. Form of modulation used to represent an analog signal by
a digital signal, where the relative width of a digital pulse indicates the
signal level. Commonly used by analog-to-digital converters. For details, refer
to Pulse-width
modulation.
- transducer
-
Converts variations in physical real-world quantities to electrical signals, or vice versa. In
audio, the physical quantity is sound pressure, and the transducers are the
loudspeaker and microphone. For details, refer to
Transducer.
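The PDM entry above says that the relative density of 1s versus 0s indicates
the signal level. The following sketch shows that idea with a first-order
delta-sigma modulator, which is one common way to produce a PDM bitstream.
It is an assumption chosen for illustration; the function names are
hypothetical and the source does not prescribe this method.

```python
def pdm_encode(samples):
    """Encode samples in [-1.0, 1.0] as a list of 0/1 bits.

    First-order delta-sigma modulation: accumulate the input, emit a 1
    (worth +1.0) when the accumulator is non-negative and a 0 (worth -1.0)
    otherwise, then subtract the emitted value so the error stays bounded.
    """
    bits = []
    acc = 0.0
    for x in samples:
        acc += x
        if acc >= 0.0:
            bits.append(1)
            acc -= 1.0
        else:
            bits.append(0)
            acc += 1.0
    return bits


def ones_density(bits):
    """Fraction of bits that are 1; approximates (level + 1) / 2."""
    return sum(bits) / len(bits)
```

For a constant input of 0.5, the density of 1s settles near 0.75, and for
silence (0.0) the bits alternate, giving a density of 0.5.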
Sample Rate Conversion
Sample rate conversion terms relate to the process of converting from one
sampling rate to another.
- downsample
- Resample, where sink sample rate < source sample rate.
- Nyquist frequency
-
Maximum frequency component that can be represented by a discretized signal,
equal to 1/2 of the sample rate. For example, the human hearing range extends to
approximately 20 kHz, so a digital audio signal must have a sample rate of at
least 40 kHz to represent that range. In practice, sample rates of 44.1 kHz and
48 kHz are commonly used, with Nyquist frequencies of 22.05 kHz and 24 kHz
respectively. For details, refer to
Nyquist frequency
and
Hearing range.
- resampler
- Synonym for sample rate converter.
- resampling
- Process of converting sample rate.
- sample rate converter
- Module that resamples.
- sink
- Output of a resampler.
- source
- Input to a resampler.
- upsample
- Resample, where sink sample rate > source sample rate.
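The sample rate conversion terms above can be tied together in a short
sketch. It computes the Nyquist frequency and resamples by linear
interpolation. This is a deliberately naive method chosen for brevity: it is
not how AudioResampler is implemented, the function names are hypothetical,
and a real downsampler would low-pass filter first to avoid aliasing.

```python
def nyquist_hz(sample_rate_hz):
    """Nyquist frequency: half of the sample rate."""
    return sample_rate_hz / 2.0


def resample_linear(source, source_rate, sink_rate):
    """Convert a mono sample list from source_rate to sink_rate.

    Upsamples when sink_rate > source_rate and downsamples when
    sink_rate < source_rate. No anti-aliasing filter is applied.
    """
    if not source:
        return []
    n_out = int(len(source) * sink_rate / source_rate)
    sink = []
    for i in range(n_out):
        pos = i * source_rate / sink_rate   # position in source samples
        j = int(pos)                        # sample at or before pos
        frac = pos - j                      # distance past that sample
        nxt = source[min(j + 1, len(source) - 1)]  # hold last sample at the end
        sink.append(source[j] * (1.0 - frac) + nxt * frac)
    return sink
```

For example, `nyquist_hz(44100)` gives 22050.0, matching the figure in the
Nyquist frequency entry, and doubling the rate of `[0.0, 1.0, 2.0, 3.0]`
fills in the midpoints between neighboring samples.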
Android-Specific Terms
Android-specific terms include terms used only in the Android audio framework
and generic terms that have special meaning within Android.
- ALSA
-
Advanced Linux Sound Architecture. An audio framework for Linux that has also
influenced other systems. For a generic definition, refer to
ALSA.
In Android, ALSA refers to the kernel audio framework and drivers and not to the
user-mode API. See also tinyalsa.
- audio device
-
Audio I/O endpoint backed by a HAL implementation.
- AudioEffect
-
API and implementation framework for output (post-processing) effects and input
(pre-processing) effects. The API is defined at
android.media.audiofx.AudioEffect.
- AudioFlinger
-
Android sound server implementation. AudioFlinger runs within the mediaserver
process. For a generic definition, refer to
Sound server.
- audio focus
-
Set of APIs for managing audio interactions across multiple independent apps.
For details, see Managing Audio Focus and the focus-related methods and constants of
android.media.AudioManager.
- AudioMixer
-
Module in AudioFlinger responsible for combining multiple tracks and applying
attenuation (volume) and effects. For a generic definition, refer to
Audio mixing (recorded music) (discusses a mixer as a hardware device or software application, rather
than a software module within a system).
- audio policy
-
Service responsible for all actions that require a policy decision to be made
first, such as opening a new I/O stream, re-routing after a change, and stream
volume management.
- AudioRecord
-
Primary low-level client API for receiving data from an audio input device such
as a microphone. The data is usually in PCM format. The API is defined at
android.media.AudioRecord.
- AudioResampler
-
Module in AudioFlinger responsible for sample rate conversion.
- audio source
-
An enumeration of constants that indicates the desired use case for capturing
audio input. For details, see audio source. As of API level 21,
audio attributes are preferred.
- AudioTrack
-
Primary low-level client API for sending data to an audio output device such as
a speaker. The data is usually in PCM format. The API is defined at
android.media.AudioTrack.
- audio_utils
-
Audio utility library for features such as PCM format conversion, WAV file I/O,
and
non-blocking FIFO, which is
largely independent of the Android platform.
- client
-
Usually an application or app client. However, an AudioFlinger client can be a
thread running within the mediaserver system process, such as when playing media
decoded by a MediaPlayer object.
- HAL
-
Hardware Abstraction Layer. HAL is a generic term in Android; in audio, it is a
layer between AudioFlinger and the kernel device driver with a C API (which
replaces the C++ libaudio).
- FastCapture
-
Thread within AudioFlinger that sends audio data to lower latency fast tracks
and drives the input device when configured for reduced latency.
- FastMixer
-
Thread within AudioFlinger that receives and mixes audio data from lower latency
fast tracks and drives the primary output device when configured for reduced
latency.
- fast track
-
AudioTrack or AudioRecord client with lower latency but fewer features on some
devices and routes.
- MediaPlayer
-
Higher-level client API than AudioTrack. Plays encoded content or content that
includes multimedia audio and video tracks.
- media.log
-
AudioFlinger debugging feature available in custom builds only. Used for logging
audio events to a circular buffer where they can then be retroactively dumped
when needed.
- mediaserver
-
Android system process that contains media-related services, including
AudioFlinger.
- NBAIO
-
Non-blocking audio input/output. Abstraction for AudioFlinger ports. The term
can be misleading as some implementations of the NBAIO API support blocking. The
key implementations of NBAIO are for different types of pipes.
- normal mixer
-
Thread within AudioFlinger that services most full-featured AudioTrack clients.
Directly drives an output device or feeds its sub-mix into FastMixer via a pipe.
- OpenSL ES
-
Audio API standard by
The Khronos Group. Android versions since
API level 9 support a native audio API that is based on a subset of
OpenSL ES 1.0.1.
- silent mode
-
User-settable feature to mute the phone ringer and notifications without
affecting media playback (music, videos, games) or alarms.
- SoundPool
-
Higher-level client API than AudioTrack. Plays sampled audio clips. Useful for
triggering UI feedback, game sounds, etc. The API is defined at
android.media.SoundPool.
- Stagefright
-
See Media.
- StateQueue
-
Module within AudioFlinger responsible for synchronizing state among threads.
Whereas NBAIO is used to pass data, StateQueue is used to pass control
information.
- strategy
-
Group of stream types with similar behavior. Used by the audio policy service.
- stream type
-
Enumeration that expresses a use case for audio output. The audio policy
implementation uses the stream type, along with other parameters, to determine
volume and routing decisions. For a list of stream types, see
android.media.AudioManager.
- tee sink
-
See Audio Debugging.
- tinyalsa
-
Small user-mode API above the ALSA kernel, released under a BSD license.
Recommended for HAL implementations.
- ToneGenerator
-
Higher-level client API than AudioTrack. Plays dual-tone multi-frequency (DTMF)
signals. For details, refer to
Dual-tone
multi-frequency signaling and the API definition at
android.media.ToneGenerator.
- track
-
Audio stream. Controlled by the AudioTrack or AudioRecord API.
- volume attenuation curve
-
Device-specific mapping from a generic volume index to a specific attenuation
factor for a given output.
- volume index
-
Unitless integer that expresses the desired relative volume of a stream. The
volume-related APIs of
android.media.AudioManager
operate in volume indices rather than absolute attenuation factors.
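The relationship between a volume index and a volume attenuation curve, as
defined above, can be sketched as a lookup with interpolation. Everything in
this example is hypothetical: the curve points, the use of decibels, and the
function names are assumptions for illustration and do not come from any real
device or from the Android audio policy implementation.

```python
# Hypothetical curve: (volume index, attenuation in dB), sorted by index.
# Real curves are device-specific; these values are made up.
CURVE = [(0, -60.0), (5, -30.0), (10, -12.0), (15, 0.0)]


def index_to_db(index, curve=CURVE):
    """Map a unitless volume index to decibels by linear interpolation."""
    # Clamp the index to the range covered by the curve.
    index = max(curve[0][0], min(curve[-1][0], index))
    for (i0, db0), (i1, db1) in zip(curve, curve[1:]):
        if i0 <= index <= i1:
            t = (index - i0) / (i1 - i0)
            return db0 + t * (db1 - db0)


def db_to_factor(db):
    """Convert decibels to a multiplicative amplitude factor."""
    return 10.0 ** (db / 20.0)


def index_to_attenuation(index):
    """Attenuation factor (<= 1.0) for a given volume index."""
    return db_to_factor(index_to_db(index))
```

In this sketch the maximum index (15) maps to 0 dB, a factor of 1.0, and
index 0 maps to -60 dB, a factor of about 0.001. A client would pass only the
index, and the device-specific curve determines the resulting factor.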