Basics of Sound and Sound Systems


Click on a topic of interest to view information!
dBW / dB / dB SPL / dB PWL / FUNDAMENTALS / HARMONICS / HEARING
INTENSITY / INPUT TRANSDUCERS / OUTPUT TRANSDUCERS / PHASE
PITCH / SOUND / SOUND SYSTEM / SPEED OF SOUND / RMS

A MODEL OF A SOUND SYSTEM

Sound systems amplify sound by converting the sound waves (physical, or kinetic, energy) into electrical energy, increasing the power of the electrical energy by electronic means, and then converting the more powerful electrical energy back into sound.

Devices that convert energy from one form into another are called transducers. Devices that change one or more aspects of the audio signal are called signal processors.

The input transducer (such as a microphone or a guitar pickup) converts sound into a fluctuating electrical current that is a precise representation of the sound. This fluctuating current is referred to as an audio signal.

The signal processing alters one or more characteristics of the audio signal. In the simplest case, it increases the power of the signal (such a signal processor is called an amplifier). In practical sound systems, this block of the diagram represents a multitude of devices-- preamplifiers, mixers, effects units, equalizers, amplifiers, et cetera.

The output transducer (a speaker or headphones) converts the amplified and processed electrical signal (audio signal) back into sound.

INPUT TRANSDUCERS

Input transducers, as mentioned before, convert sound into audio signals. Here are some types of input transducers commonly found in sound reinforcement systems:

  • Air Pressure or Velocity Microphones--
    convert sound waves traveling in air into an audio signal traveling in the microphone cable (see Input Devices for exactly how they do this).
  • Contact Pickups--
    convert sound waves in a dense medium (wood, metal, skin) into an audio signal. Sometimes used on acoustic stringed instruments such as guitars, mandolins, violins, etc.
  • Magnetic Pickups--
    convert fluctuating waves of induced magnetism into an audio signal. Found on electric stringed instruments (electric guitars, etc).
  • Tape Heads--
    convert fluctuating magnetic fields (imprinted on magnetic recording tape (i.e. cassette)) into an audio signal.
  • Phonograph pickups (cartridges)--
    convert physical movement of a stylus (needle) into an audio signal.
  • Laser Pickups--
    convert imprinted patterns on a compact disc or Mini-Disc into a digital data stream that is then translated by a digital-to-analog converter into an analog signal.
  • Optical Pickups--
    convert variations in the density or transparent area of a photographic film into an audio signal. Used for most motion picture sound tracks.

OUTPUT TRANSDUCERS

Output transducers, as mentioned before, convert audio signals back into sound. The following is a list of commonly-found output transducers:

  • Woofer Loudspeakers--
    designed specifically to reproduce low frequencies (usually below 500Hz). Woofers sometimes are used to reproduce both low and some mid frequencies. Typically, they are cone-type drivers measuring from eight to eighteen inches in diameter.
  • Midrange Loudspeakers--
    designed specifically to reproduce middle frequencies.
  • Tweeter Loudspeakers--
    designed to reproduce the highest frequencies.
  • Full-range Loudspeakers--
    integrated systems incorporating woofer and tweeter drivers in a single enclosure. As the name implies, they are designed to reproduce the full audio range (more or less).
  • Subwoofer Loudspeakers--
    used to extend the low frequency range of full-range systems to include frequencies down to 20 or 30Hz.
  • Supertweeter Loudspeakers--
    used to extend the range of full-range systems in the highest frequencies.
  • Monitor Loudspeakers--
    full-range loudspeakers that are pointed at the performer on stage, rather than out to the audience. They are used to return a portion of the program to the performer, to help him or her stay in tune and in time, and are usually referred to as "foldback."
  • Headphones--
    full-range transducers designed to fit snugly on the ears. Some designs block out ambient (external) sound, while others do not.

A PRACTICAL MODEL OF A SOUND SYSTEM

The illustration above shows a simple, practical sound system that might be used in a lecture hall, media center, or similar space.

The system can be conceptually analyzed as having three sections: (a) the input transducers, (b) signal processing, and (c) the output transducers:

A] Input Transducers--
three microphones convert the sound they pick up from the orators into audio signals that travel down the cables to the signal processing equipment.
B] Signal Processing--
the three microphones are connected to individual inputs on a mixing console. The console serves the following functions:
1] Preamplification-- the console's microphone input section amplifies the level of the audio signal from each microphone, bringing it up to line level.
2] Equalization-- the console provides the means to adjust the tonal balance of each microphone individually. This allows the console operator to achieve a more pleasing or more intelligible sound quality.
3] Mixing-- the console adds the equalized signals of the microphones together to produce a single line-level output signal. The output of the console is connected to a power amplifier. The power amplifier boosts the console's line level (0.1 to 100 milliwatts) output signal to a level suitable to drive the loudspeaker (0.5 to 500 watts).
C] Output Transducer--
the loudspeaker converts the power amplifier output signal back into sound. The level of the sound is much higher than that of the three orators speaking unaided.

There is another less obvious, but equally important aspect of the sound system: the environment. When the sound output of the loudspeaker propagates into the hall, it is altered by the acoustical characteristics of the space.

The room may have little effect on the clarity of the sound if, for example, the room is "dead" or nonreverberant. If the room is highly reverberant, and the sound system is not designed and installed to deal with the acoustics of the space, the effect on the sound may be so severe as to render the sound system useless.

The environment is an integral part of the sound system, and its effects must be considered when the system is installed.

Every sound system, no matter how large, is merely an extension of this basic model. The same principles that apply to this simple model also apply to large-scale concert reinforcement systems.

Large concert systems may comprise twenty stage microphones, twenty keyboard inputs, many drum microphones, perhaps even a twenty-four track 3324 digital audio tape backup-- but they all follow the same principle: kinetic energy (in the air) is transformed into electrical energy, which is then extensively manipulated and often split to different destinations, which may transform the electrical energy back into kinetic energy or may record it.


It should be said here that we believe anyone working in a technical field-- sound, for instance-- should have a good background in what sound is, how it works, and what affects it. In other words, a good background in physics is a good idea. This section will try not to be too technical; if it is, check with your local physics teacher to learn more.

SOUND

All sounds are created by causing a medium to vibrate-- be it wood, strings, or vocal cords. Sound is carried through a medium by causing adjacent particles to vibrate similarly; the air particles adjacent to a guitar's strings are displaced and "bump" into the next adjacent air particles. This continues until, eventually, air particles in our ears "bump" into the tiny hairs located in the inner ear.

The most popular analogy to sound is that of a rock being dropped into a pond. The ripples, originating from the point source of the rock, spread out in all directions. As with sound, these ripples lose intensity as the distance from the point source increases. Additionally, these waves trace the same shape as a graph of a sound wave-- something like a sinusoidal curve.

The distance from a particular point of one wave (be it sound or mechanical) to the same point of the next wave is called the wavelength. Wavelengths of audible sound range from under an inch (at the highest frequencies) to over fifty feet (at the lowest). In a given room, if the distance between two sides of the room is a multiple of a wavelength, that wavelength may be emphasized, which can have either a positive or negative effect. Regardless, we must know how to control it.

However, in sound, one rarely discusses wavelength. Instead, we count the number of complete cycles these waves can propagate during a specific amount of time, usually one second. This is known as the frequency. Frequency is measured in cycles-per-second, termed "Hertz," abbreviated "Hz". Sounds that vibrate many times per second are known as "high-frequency" sounds, and those which vibrate fewer times per second are known as "low-frequency" sounds.

The time it takes to complete one cycle is called the period of the wave and is expressed with the symbol T. Thus, T = 1/f.

Wavelength is usually represented by the Greek letter lambda, frequency by f, and velocity by v. Velocity is the product of wavelength and frequency, so we get the equation

v = (lambda) * f.
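As a quick numerical sketch of the equation above (illustrative only; the speed value assumes air at roughly room temperature, and the function name is our own), rearranging v = (lambda) * f gives lambda = v / f:

```python
SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees Celsius (assumed value)

def wavelength(frequency_hz):
    """Rearrange v = lambda * f into lambda = v / f; returns wavelength in meters."""
    return SPEED_OF_SOUND / frequency_hz

# Wavelengths shrink as frequency rises:
print(wavelength(20))      # about 17.15 m -- a 20 Hz bass tone is huge
print(wavelength(1000))    # about 0.343 m
print(wavelength(20000))   # about 0.017 m -- under two centimeters
```

Running the numbers this way makes it obvious why low frequencies interact so strongly with room dimensions while high frequencies do not.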

PHASE

Since a cycle can begin at any point on a waveform, it is possible to have two wave generators producing waves of the same frequency and amplitude whose instantaneous amplitudes nevertheless differ at any one point in time. These waves are said to be out of phase with respect to each other. Phase is measured in degrees, and a cycle can be divided into 360 degrees; usually the sine curve is used as an example-- it begins at 0 degrees with 0 amplitude, increases to a positive maximum (the positive peak amplitude) at 90 degrees, decreases to zero again at 180 degrees, decreases to a negative maximum (the negative peak amplitude) at 270 degrees, and returns to 0 amplitude at 360 degrees.

Similar waveforms can be added by summing their signed amplitudes at each instant of time. When two waveforms that are completely in phase (0 degrees phase difference) and of the same frequency, shape, and peak amplitude are added, the resulting waveform is of the same frequency, phase, and shape, but has twice the original peak amplitude. If two waves are the same as the ones just described, except that they are completely out of phase (out-of-polarity with respect to each other; phase difference of 180 degrees), they will cancel each other out when added, resulting in a straight line of zero amplitude. If the second wave is only partially out of phase, it would interfere constructively at points where the amplitudes of the two waves have the same sign (both positive or both negative), resulting in a greater amplitude in the combined wave than in the first wave; and it would interfere destructively at points where the signs of the two wave amplitudes are opposing, resulting in a lesser amplitude at those points in time. The waves can be said to be in phase, or correlated, at points where the signs are the same and out-of-phase, or uncorrelated, at points where the signs are opposing.
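The summing behavior described above is easy to verify numerically. The following sketch (illustrative only; the helper names are our own) samples two sine waves and adds them, showing the doubled peak at 0 degrees phase difference and the complete cancellation at 180 degrees:

```python
import math

def sine_wave(freq_hz, phase_deg, n=1000):
    """Sample one second of a unit-amplitude sine wave at n evenly spaced points."""
    phase = math.radians(phase_deg)
    return [math.sin(2 * math.pi * freq_hz * t / n + phase) for t in range(n)]

def add_waves(a, b):
    """Sum two waveforms sample by sample."""
    return [x + y for x, y in zip(a, b)]

w = sine_wave(5, 0)
doubled = add_waves(w, sine_wave(5, 0))      # in phase: peak amplitude doubles
cancelled = add_waves(w, sine_wave(5, 180))  # 180 degrees out: total cancellation

print(max(doubled))                    # very nearly 2.0
print(max(abs(x) for x in cancelled))  # very nearly 0.0
```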

Phase shift is a term that describes the amount of lead or lag in one wave with respect to another. It results from a time delay in the transmission of one of the waves. The number of degrees of phase shift introduced by a time delay can be computed by the formula:

(phase shift) = change-in-t * f * 360 degrees, where change-in-t is the time delay in seconds.
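To make the formula concrete, here is a one-line helper (a sketch; the function name is our own) showing how much more strongly a fixed delay shifts the phase of a high frequency than a low one:

```python
def phase_shift_degrees(delay_seconds, frequency_hz):
    """Phase shift produced by a time delay: delta-t * f * 360 degrees."""
    return delay_seconds * frequency_hz * 360.0

# The same half-millisecond delay is a full polarity reversal at 1 kHz...
print(phase_shift_degrees(0.0005, 1000))  # 180 degrees
# ...but only a slight lag at 100 Hz.
print(phase_shift_degrees(0.0005, 100))   # 18 degrees
```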

THE SPEED OF SOUND

Since sound is dependent upon vibration, it can travel through anything except a vacuum. It travels through some materials faster than others; sound travels about four times faster in water than in air, and about ten times slower in rubber than in air. The speed of sound is a very important quantity to know when dealing with large-scale sound reinforcement systems, such as those used in arenas, outdoors, or over extremely long distances. In air at 0 degrees Celsius and 1 atm (atmosphere, a unit of pressure), sound travels at a speed of 331 m/s. Temperature can affect the speed of sound in any medium, but most drastically in gases. In air, the speed increases approximately 0.60 m/s for each degree Celsius increase:
v = (331 + 0.60T) m/s, where T is the temperature in degrees Celsius.
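The temperature formula above can be sketched as a one-line helper (illustrative; the function name is our own):

```python
def speed_of_sound_air(temp_celsius):
    """Speed of sound in air, in m/s, from v = (331 + 0.60T)."""
    return 331.0 + 0.60 * temp_celsius

print(speed_of_sound_air(0))   # 331.0 m/s at freezing
print(speed_of_sound_air(20))  # 343.0 m/s at room temperature
print(speed_of_sound_air(35))  # 352.0 m/s at a hot outdoor show
```

The 20-odd m/s swing between a cold and a hot venue is enough to matter when time-aligning distant loudspeaker clusters.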

The speed of sound is virtually constant at all frequencies, but sound will travel slightly faster in humid air than in dry air. Humid air also absorbs more high frequencies than low frequencies, so in humid conditions the sound engineer will need to boost the high frequency portion of the program.

HOW YOU HEAR

The ear is a nonlinear device and, as a result, it produces harmonic distortion when subjected to sound waves above a certain loudness. Harmonic distortion is the production of waveform harmonics that did not exist in the original signal. The ear can cause a loud 1kHz tone to be heard as a combination of tones at 1kHz, 2kHz, 3kHz, and so on. Although the ear may receive the overtone structure (all of the harmonics) of a violin, at loud listening levels it will produce additional harmonics of its own, thus changing the perceived timbre of the instrument. This means that sound monitored at very loud levels may sound quite different when played back at low levels.

In addition to being nonlinear with respect to amplitude, the ear's frequency response changes with the loudness of the perceived signal. The loudness compensation switch found on many hi-fi preamplifiers is an attempt to compensate for the decrease in the ear's sensitivity to low-frequency sounds at low levels. The Fletcher-Munson equal-loudness contours indicate the average ear's sensitivity to different frequencies at different levels: each curve indicates the sound pressure levels required to produce the same perceived loudness at different frequencies. Thus, to equal the loudness of a 1.5kHz tone at a level of 110 dB SPL, a 40Hz tone has to be 2dB greater in sound pressure level, while a 10kHz tone must be 8dB greater than the 1.5kHz tone to be perceived as equally loud. Thus, if a piece of music is monitored at a sound pressure level of 110dB SPL and sounds well-balanced, it will sound both bass and treble deficient when played back at a level of 50dB SPL. 85dB SPL can be considered the optimum monitoring level for mixdowns.

The loudness of a tone can also affect the pitch that the ear perceives. For example, if the intensity of a 100Hz tone is increased from 40 to 100dB SPL, the ear will perceive a pitch decrease of about 10%. At 500Hz, the pitch changes about 2% for the same increase in sound pressure level. This is one reason that musicians find it hard to tune their instruments while listening through headphones. The headphones are often producing higher SPLs than might be expected.

As a result of the nonlinearity of the ear, tones can interact with each other rather than being perceived separately. Three types of interaction effects occur: beats, combination tones, and masking.

*Beats: Two tones that differ only slightly in frequency and have approximately the same amplitude will produce beats at the ear equal to the difference between the two frequencies. The phenomenon of beats can be used as an aid in tuning instruments, because the beats slow down and stop as the two notes approach perfect tune; a piano tuner will even slightly off-tune an instrument by listening to the beat relationships. These beats are the result of the ear's inability to separate closely pitched notes.
*Combination Tones: Combination tones result when two loud tones differ by more than 50Hz. The ear will produce an additional set of tones equal to both the sum and the difference of the two original tones, and also equal to the sums and differences of their harmonics. The formulae for computing the tones are: difference tone frequencies = (m x f1) - (n x f2); sum tone frequencies = (m x f1) + (n x f2), where m and n are positive integers (m = n = 1 gives the first-order tones). The difference tones can be easily heard when they are below the frequency of both of the original tones. For example, 2000 and 2500Hz produce a difference tone of 500Hz.
*Masking: Masking is the phenomenon by which loud signals prevent the ear from hearing softer sounds. The greatest masking effect occurs when the frequency of the sound and the frequency of the masking noise are close to each other. For example, a 4kHz tone will mask a softer 3.5kHz tone, but will have little effect on the audibility of a quiet 1000Hz tone. The masking phenomenon is one of the main reasons that stereo placement and equalization are so important in a mixdown. An instrument that sounds fine by itself can be completely hidden or changed in character by louder instruments with a similar timbre.
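The beat and first-order combination-tone arithmetic above can be sketched in a few lines (illustrative only; the function names are our own):

```python
def beat_frequency(f1_hz, f2_hz):
    """Beat rate heard when two nearly identical tones sound together."""
    return abs(f1_hz - f2_hz)

def first_order_combination_tones(f1_hz, f2_hz):
    """Sum and difference tones for m = n = 1."""
    return f1_hz + f2_hz, abs(f1_hz - f2_hz)

print(beat_frequency(440, 443))                   # 3 beats per second
print(first_order_combination_tones(2500, 2000))  # (4500, 500)
```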

 

Although one ear is not able to discern the direction from which a sound originates, two ears can. This ability of two ears to localize a sound source within an acoustic space is called binaural localization. The effect results from three cues received by the ears: interaural intensity differences, interaural arrival-time differences, and the effects of the pinnae (outer ears).

Middle- to higher-frequency sounds originating from the right side will reach the right ear at a higher intensity level than the left ear, causing an interaural intensity difference. This occurs because the head casts an acoustic block or shadow, allowing only reflected sound from surrounding surfaces to reach the left ear. Since the reflected sound travels farther and loses energy at each reflection, the intensity of sound perceived by the left ear is reduced, with the resulting signal being perceived as originating from the right.

This effect is relatively insignificant at lower frequencies, where wavelengths are large compared to the diameter of the head and easily bend around its acoustic shadow. A different method of localization, known as interaural arrival-time differences, is employed at lower frequencies. In our example, time differences occur because the acoustic path length to the left ear is slightly longer than that to the right ear. The sound pressure will thus be sensed by the left ear at a later time than by the right ear. This method of localization, in combination with interaural intensity differences, gives us lateral location cues over the entire frequency spectrum.

The intensity and delay cues allow us to perceive the angle from which a sound originates, but not whether the sound originates from the front, behind, or below. The pinna, however, makes use of two ridges that reflect the incident sound into the ear. These ridges introduce time delays between the direct sound (which reaches the entrance of the ear canal) and the sound reflected from the ridges (which varies according to source location).

PITCH

The pitch of a sound refers to whether it is high, like the sound of a piccolo or violin, or low, like the sound of a bass drum or string bass. The physical quantity that determines pitch is the frequency: the lower the frequency, the lower the pitch. The human ear responds to frequencies in the range from about 20Hz to about 20,000Hz. This is called the audible range. These limits vary somewhat from one individual to another. One general trend is that as people age, they become less able to hear high frequencies, so the high-frequency limit may fall to 10,000Hz or less.

Sound waves whose frequencies are outside the audible range may reach the ear, but we are not generally aware of them. Frequencies above 20,000Hz are called ultrasonic. Many animals can hear ultrasonic frequencies; dogs, for example, can hear sounds as high as 50,000Hz and bats can detect frequencies as high as 100,000Hz.

Sound waves whose frequencies fall below the audible range are called infrasonic, or occasionally subsonic. Sources of infrasonic waves are earthquakes, thunder, volcanoes, and waves produced by vibrating heavy machinery.

The pitch of the sound also factors into the way the ear hears. The ear has difficulty in associating a point origin to a low-frequency sound, but is quite accurate in placing the origin of high-frequencies. This is because high frequencies have wavelengths shorter than the distance between the ears; sounds above 1000Hz cannot reach both ears at the same time and at the same intensity, so one ear is favored and provides the information as to the direction in the horizontal plane. The ear is less successful in responding to directions in the vertical plane.

FUNDAMENTALS AND HARMONICS

The initial vibration of a sound source is called the fundamental, and thus the initial frequency is known as the fundamental frequency. The subsequent vibrations, which are exact multiples of the fundamental frequency, are called the harmonics. So, a note on a musical instrument with a fundamental frequency of 100Hz will have a second harmonic at 200Hz, a third harmonic at 300Hz, and so on.

The term octave denotes the interval between any two frequencies whose ratio is 2:1. Thus, an octave separates the fundamental from the second harmonic in the above example: 200Hz:100Hz. At the upper end of the frequency spectrum the same ratio still applies, although the frequencies are greater: an octave still separates 2000Hz from 1000Hz. Two notes separated by an octave are said to be "in tune." Thus, an octave on the piano keyboard-- spanning eight white keys (thirteen keys in all, counting both black and white)-- is also an octave frequency-wise.

Whether the harmonics diminish in intensity or retain much of their energy depends on how the source is initially vibrated and subsequently damped. It is the strength of the harmonics which distinguishes the quality (or timbre) of musical instruments and makes it possible for humans to identify two different instruments playing the same note. Cool, huh.

INTENSITY

Like pitch, loudness is a sensation in the consciousness of a human being. It, too, is related to a physically measurable quantity, the intensity of the wave. Intensity is defined as the energy transported by a wave per unit time across unit area. Since energy per unit time is power, intensity has units of power per unit area, or watts per square meter (W/m^2). The intensity depends on the amplitude of the wave (it is proportional to the square of the amplitude). [The amplitude of the wave is the distance between the extremes of the vibration.]

The human ear can detect sounds with an intensity as low as 10^-12 W/m^2 and as high as 1 W/m^2 (and even higher, although above this it is painful). This is an incredibly wide range of intensity, spanning a factor of 10^12 from lowest to highest. Presumably because of this wide range, what we perceive as loudness is not directly proportional to the intensity. True, the greater the intensity, the louder the sound. But to produce a sound that sounds about twice as loud requires a sound wave that has about ten times the intensity. For example, a sound wave of intensity 10^-9 W/m^2 sounds to an average human being about twice as loud as one whose intensity is 10^-10 W/m^2; and an intensity of 10^-2 W/m^2 sounds about twice as loud as 10^-3 W/m^2 and four times as loud as 10^-4 W/m^2.

Because of this relationship between the subjective sensation of loudness and the physically measurable quantity intensity, it is usual to specify sound intensity using a logarithmic scale. The unit on this scale is the decibel (dB). The intensity level, b, of any sound is defined in terms of its intensity, I, as follows:

b(dB) = 10 log (I/I0).
I0 is the intensity of some reference level. It is usually taken as the minimum intensity audible to an average person, the "threshold of hearing," which is 1.0x10^-12 W/m^2. Notice that the intensity level at the threshold of hearing is 0dB; that is, b = 10 log (10^-12/10^-12) = 10 log 1 = 0. Notice, too, that an increase of intensity by a factor of ten corresponds to a level increase of 10dB. Common loudness levels and intensity levels for common sounds follow.
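The "10 log" definition can be checked numerically. This is an illustrative sketch (the helper name is our own), using the threshold of hearing as the reference:

```python
import math

I0 = 1.0e-12  # W/m^2, the threshold-of-hearing reference intensity

def intensity_level_db(intensity_w_per_m2):
    """Intensity level b = 10 log (I/I0), in dB."""
    return 10 * math.log10(intensity_w_per_m2 / I0)

print(intensity_level_db(1.0e-12))  # 0 dB: the threshold of hearing
print(intensity_level_db(1.0e-11))  # 10 dB: ten times the intensity
print(intensity_level_db(1.0))      # 120 dB: approaching the threshold of pain
```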

It may be noted that the previous description of decibels is simply a brief overview, something that one would find in a generic physics text. A more in-depth look at decibels is needed for use in sound reinforcement.

dB IN GENERAL

The dB always describes a ratio of two quantities. Remember that. It's not really important for you to grasp the logarithm concept just now (but if you do, that's cool)... it's simply important that you realize that a decibel describes the ratio of two powers, not the power values themselves. To demonstrate this, let's plug in some real values in the dB equations that follow.

dB SPL

This is one of the more common forms of the decibel. It measures sound pressure level (SPL): the sound pressure is the force exerted by a sound wave per unit area at a particular location relative to the sound source. When a dB describes a sound pressure level ratio, a "20 log" equation is used:
dB SPL = 20 log (p1/p0),
where p0 and p1 are the sound pressures, measured in dynes per square centimeter or newtons per square meter.

This equation tells us that if one SPL is twice another, it is 6dB greater; if it is ten times another, it is 20dB greater, and so forth.
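Plugging those ratios into the "20 log" equation confirms the rule of thumb (a sketch; the function name is our own):

```python
import math

def spl_difference_db(p1, p0):
    """Level difference between two sound pressures: 20 log (p1/p0)."""
    return 20 * math.log10(p1 / p0)

print(spl_difference_db(2, 1))   # about 6.02 dB for a doubling of pressure
print(spl_difference_db(10, 1))  # 20.0 dB for ten times the pressure
```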

How do we perceive SPL? It turns out that a sound which is 3dB higher in level than another is barely perceived to be louder; a sound which is 10dB higher in level is perceived to be about twice as loud. Loudness, by the way, is a subjective quantity, and is also greatly influenced by frequency and absolute sound level.

SPL has an absolute reference value (p0); generally 0 dB SPL is defined as the threshold of hearing in the ear's most sensitive range, between 1kHz and 4kHz. It represents a pressure level of 0.0002 dynes/cm^2, which is the same as 0.00002 newtons/m^2. It is really best to compare SPLs with each other, as in the following chart.

dBW

We have explained that the dBm is a measure of electrical power, a ratio referenced to one milliwatt. dBm is handy when dealing with the minuscule power output of microphones (in the millionths of a watt) and the modest levels in signal processors (in the milliwatts). One magazine wished to express larger power numbers without larger dB values... for example, the multi-hundred watt output of large power amplifiers. For this reason, that magazine established another dB power reference, the dBW:
dBW = 10 log (p1/p0),
where the reference p0 is one watt; that is, 0 dBW is one watt. Therefore a 100 watt power amplifier is a 20 dBW amplifier (10 log (100/1) = 10 log 100 = 10 x 2 = 20dB). A 1000 watt amplifier is a 30 dBW amplifier, and so forth.
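The same arithmetic in code form (a sketch; the function name is our own):

```python
import math

def watts_to_dbw(power_watts):
    """dBW = 10 log (P / 1 watt)."""
    return 10 * math.log10(power_watts / 1.0)

print(watts_to_dbw(1))     # 0.0 dBW
print(watts_to_dbw(100))   # 20.0 dBW
print(watts_to_dbw(1000))  # 30.0 dBW
```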

dB PWL

Acoustic power is expressed in acoustic watts, and can be described with a dB term, dB PWL. This term shares the same "10 log" equation as other power ratios:
dB PWL = 10 log (p1/p0), where p0 and p1 are acoustic powers in watts.

Acoustic power and dB PWL come into play when calculating the reverb time of an enclosed space, or the efficiency of a loudspeaker system, but they are seldom seen on specification sheets and seldom used by the average sound system operator. It is much more common to use dB SPL because the sound pressure is more directly related to perceived loudness (and is easily measured).

Incidentally, there is no set relationship between dB PWL and dBW; the former expresses acoustic power, the latter electrical power. If a loudspeaker is fed 20 dBW, it might generate as little as 10 dB PWL. In English... feed 100 watts into a loudspeaker, it might generate as little as 10 watts of acoustic power. This would indicate a conversion efficiency of ten percent, which is high for a cone loudspeaker in a vented box!

RMS

"RMS" is an abbreviation for a term known as "Root Mean Square." This is a mathmetical expression used in audio to describe the level of a signal. RMS is particularly useful in describing the enegry of a complex waveform or a sine wave. It is not the peak level, nor the average, but rather it is obtained by squaring all the instantaneous voltages along a waveform, averaging the squared values, and taking a square root of the number.

The rms value of a periodic function, such as the sine curve, is .707 times the peak value of the wave.
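The square-average-root procedure, and the 0.707 figure for a sine wave, can both be verified directly (a sketch; the function name is our own):

```python
import math

def rms(samples):
    """Square each sample, average the squares, then take the square root."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

# One full cycle of a sine wave with a peak value of 1.0:
n = 10000
sine_cycle = [math.sin(2 * math.pi * t / n) for t in range(n)]

print(rms(sine_cycle))  # about 0.7071 -- i.e., 0.707 times the peak value
```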

Why is the rms value of a signal used? For one thing, the rms value correlates well with the real work being done by the amplifier. When so-called "program" or "music" power ratings are employed, the actual work being done is subjective-- it depends largely on the nature of the program source. The rms value of any program will pretty much reflect the energy content of that source. There is just one minor problem: the term "rms power" is meaningless.

Why? Power is the product of voltage and current. Typically, in a power amp, one measures the rms value of the output voltage and multiplies it by the rms value of the output current. The result is the average power, not the rms value of the power waveform, so "rms power" is not a mathematically valid term. The intent of an rms power rating is valid, but not the term itself. Manufacturers are still driving amplifiers with sine wave test signals and connecting the amp outputs to dummy loads. They measure the rms value of the sine wave output voltage and compute the power from that voltage and the load resistance or impedance. Those who wish to be technically correct list this rating as "continuous average sine wave power," rather than "rms power."

RMS values are not the exclusive domain of power amplifiers. In most (but not all) cases, when you see a voltage listed for input sensitivity on a preamplifier or line amp, it is the rms voltage. For example, you may recall that 0 dBm is 1 milliwatt, which equals 0.775 volts rms across a 600 ohm circuit, and 0 dBV is 1 volt rms.
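The 0 dBm reference quoted above checks out with a little arithmetic (a sketch; the function name is our own):

```python
def power_into_load(v_rms, load_ohms):
    """Average power dissipated in a resistive load: P = V^2 / R."""
    return v_rms ** 2 / load_ohms

# 0.775 V rms across a 600 ohm circuit dissipates very nearly one milliwatt (0 dBm):
print(power_into_load(0.775, 600))  # about 0.001 W
```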

 



Comments, questions, and additions should be addressed via e-mail to Cliff Pidlubny.

Not responsible for typographical errors. Information collected from many sources.