Audio mixing (recorded music)

In sound recording and reproduction, audio mixing is the process of optimizing and combining multitrack recordings into a final mono, stereo or surround sound product. As the separate tracks are combined, their relative levels are adjusted and balanced, and processes such as equalization and compression are commonly applied to individual tracks, groups of tracks, and the overall mix. In stereo and surround sound mixing, the placement of the tracks within the stereo (or surround) field is also adjusted and balanced.[1]: 11, 325, 468 Audio mixing techniques and approaches vary widely and have a significant influence on the final product.[2]
Audio mixing techniques largely depend on music genres and the quality of sound recordings involved.[3] The process is generally carried out by a mixing engineer, though sometimes the record producer or recording artist may assist. After mixing, a mastering engineer prepares the final product for production.
Audio mixing may be performed on a mixing console or in a digital audio workstation.
History
In the late 19th century, Thomas Edison and Emile Berliner developed the first recording machines. The recording and reproduction process itself was completely mechanical, with few or no electrical parts. Edison's phonograph cylinder system used a small horn terminated in a stretched, flexible diaphragm attached to a stylus, which cut a groove of varying depth into the malleable tin foil of the cylinder. Emile Berliner's gramophone system recorded music by inscribing spiraling lateral cuts onto a disc.[4]
Electronic recording became more widely used during the 1920s. It was based on the principles of electromagnetic transduction. The possibility for a microphone to be connected remotely to a recording machine meant that microphones could be positioned in more suitable places. The process was improved when outputs of the microphones could be mixed before being fed to the disc cutter, allowing greater flexibility in the balance.[5]
Before the introduction of multitrack recording, all sounds and effects that were to be part of a recording were mixed simultaneously during a live performance. If the recorded mix was not satisfactory, or if one musician made a mistake, the selection had to be performed again until the desired balance and performance were obtained. The introduction of multitrack recording changed the recording process into one that generally involves three stages: recording, overdubbing, and mixing.[6]
Modern mixing emerged with the introduction of commercial multi-track tape machines, most notably when 8-track recorders were introduced during the 1960s. The ability to record sounds into separate channels made it possible for recording studios to combine and treat these sounds not only during recording, but afterward during a separate mixing process.[7]
The introduction of the cassette-based Portastudio in 1979 offered multi-track recording and mixing technology that did not require the specialized equipment and expense of commercial recording studios. Bruce Springsteen recorded his 1982 album Nebraska with one, and the Eurythmics topped the charts in 1983 with the song "Sweet Dreams (Are Made of This)", recorded by band member Dave Stewart on a makeshift 8-track recorder. In the mid-to-late 1990s, computers replaced tape-based recording for most home studios, with the Power Macintosh proving popular.[8] In the mid-1980s, many professional recording studios began to use digital audio workstations (DAWs) to accomplish recording and mixing previously done with multitrack tape recorders, mixing consoles, and outboard gear.
Equipment
Mixing consoles

A mixer (mixing console, mixing desk, mixing board, or software mixer) is the operational heart of the mixing process.[9] Mixers offer a multitude of inputs, each fed by a track from a multitrack recorder. Mixers typically have 2 main outputs (in the case of two-channel stereo mixing) or 8 (in the case of surround).
Mixers offer three main functionalities.[9][10]
- Summing signals together, which is normally done by a dedicated summing amplifier or, in the case of a digital mixer, by a simple algorithm.
- Routing of source signals to internal buses or external processing units and effects.
- On-board processors with equalizers and compressors.
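The summing function can be illustrated with a minimal sketch, assuming each track is a list of floating-point samples of equal length and each fader is expressed in decibels. The names `db_to_gain` and `sum_tracks` are hypothetical and are not taken from any particular mixer or DAW.

```python
def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def sum_tracks(tracks, fader_db):
    """Scale each mono track by its fader setting and add them sample by sample."""
    gains = [db_to_gain(db) for db in fader_db]
    length = len(tracks[0])  # assumption: all tracks have the same length
    return [
        sum(gain * track[n] for gain, track in zip(gains, tracks))
        for n in range(length)
    ]


# Two short illustrative tracks: the vocal at unity gain, the guitar 6 dB lower.
vocal = [0.5, -0.5, 0.25, 0.0]
guitar = [0.2, 0.2, -0.2, -0.2]
bus = sum_tracks([vocal, guitar], fader_db=[0.0, -6.0])
```

A real summing stage would also handle headroom and routing to several buses; this sketch only shows the weighted addition itself.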
Mixing consoles can be large and intimidating due to the exceptional number of controls. However, because many of these controls are duplicated (e.g. per input channel), much of the console can be learned by studying one small part of it. The controls on a mixing console typically fall into one of two categories: processing and configuration. Processing controls are used to manipulate the sound. These vary in complexity from simple level controls to sophisticated outboard reverberation units. Configuration controls deal with the signal routing from the input to the output of the console through the various processes.[11]
Digital audio workstations (DAWs) can perform many mixing functions in addition to other processing. An audio control surface gives a DAW the same user interface as a mixing console.[11]
Outboard and plugin-based processing
Outboard audio processing units (analog) and software-based audio plug-ins (digital) are used on each track or group to perform various processing techniques. These processes, such as equalization, compression, sidechaining, stereo imaging, and saturation, are used to make each element as audible and sonically appealing as possible. The mix engineer also uses such techniques to balance the space within the final mix, removing unnecessary frequencies and volume spikes to minimize interference or clashing between elements.
Processes that affect signal volume or level
- Faders – The process of attenuating (lowering) the level of a signal. This is by far the most basic audio process, appearing on virtually every effect unit and mixer.[11]: 177 Utilizing controlled fades is the most basic step of audio mixing, allowing more volume for prominent elements and less for secondary elements.
- Boost – The process of amplifying a signal. Boosting is usually done using extremely slight amounts of amplification, enough to raise a signal without pushing it to the point of distortion. However, when using audio tape as opposed to recording on to a computer, sometimes a signal will be deliberately overdriven very hard to achieve an intense yet soft, 'rounded off' style of distortion known as tape saturation. Distortion from clipping (overdriving) a digital signal will simply result in blasts of apparent white noise, and is almost universally regarded as unpalatable. Volume control units typically feature the ability to both boost and attenuate a signal.[11]: 177
- Panning – The process of altering the balance of an audio signal between the left and right channels of a stereo signal. The pan of a signal may be modified via a simple two-way pan control or an auto panner that continuously modulates and changes the pan of a signal.[1]: 49, 344 Panning is often used in the mixing process to arrange the track elements, simulating the placement of live bands.
- Compressors – The process of reducing the dynamic range, that is, the difference between the loudest and quietest parts of a signal. This is done by reducing the signal level once it exceeds a user-adjustable threshold. The ratio, which sets how strongly the signal above the threshold is reduced, is usually also controllable, as is the time it takes for the reduction to engage (attack) and to let go (release). Most compressors also have a makeup gain control, used to apply a boost after the gain reduction is applied to compensate for the quieter signal. Compression has many uses in the mixing process, from evening out vocal volume to enhancing drums.[11]: 175
- Limiters – Compression with a ratio of 10:1 or higher is known as limiting: instead of applying gentle reduction to audio above the threshold, limiters forcibly flatten it, allowing virtually no signal above the threshold. Many limiting units also have built-in compressors that reduce the amount of audio actually passing the threshold. Many limiters also use digital algorithms to soften the harsh sound of limited audio, morphing the wave instead of completely decapitating it (removing part of the waveform entirely can cause intense distortion and vastly altered tones). Softer limiters are used with generous amounts of compression to create a more consistently loud track with less volume fluctuation, while harder limiters can be used as distortion effects or as emergency safeties to protect large speaker systems from blowing out. Many analog amplifiers are fitted with their own basic limiters to prevent the high-voltage circuitry from overloading and blowing out.[11]: 176
- Dynamic expansion – Dynamic expansion is essentially compression with an inverted threshold: any signal below a certain threshold is dynamically reduced, while signals above the threshold remain untouched.[11]: 176 Expansion is most commonly used to give volume to certain elements of recordings, e.g. the bass drum and snare drum. Expanders can also be set up so that when a signal drops below a set threshold, the gain is reduced until the output signal is forced below a certain level, and held at that level until the input rises above the threshold. This application of expansion is called gating.
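The threshold and ratio behaviour of compressors, limiters and expanders can be sketched as a static gain calculation in decibels. This is a minimal illustration under stated assumptions: a hard knee, no attack or release smoothing, and no makeup gain. The function names are hypothetical and do not describe any specific unit.

```python
def compressor_gain_db(level_db, threshold_db, ratio):
    """Gain change (dB) applied by a hard-knee compressor at a given input level."""
    if level_db <= threshold_db:
        return 0.0  # below the threshold the signal is left untouched
    over = level_db - threshold_db
    # Above the threshold, the output rises only 1/ratio dB per input dB.
    return over / ratio - over


def expander_gain_db(level_db, threshold_db, ratio):
    """Gain change (dB) applied by a hard-knee downward expander."""
    if level_db >= threshold_db:
        return 0.0  # above the threshold the signal is left untouched
    under = threshold_db - level_db
    # Below the threshold, the output falls `ratio` dB per input dB.
    return under - under * ratio


# A signal 10 dB over a -20 dB threshold, compressed at 4:1 and limited at 10:1.
gentle = compressor_gain_db(-10.0, threshold_db=-20.0, ratio=4.0)
limited = compressor_gain_db(-10.0, threshold_db=-20.0, ratio=10.0)
# A signal 10 dB under a -40 dB threshold, expanded at 1:2.
expanded = expander_gain_db(-50.0, threshold_db=-40.0, ratio=2.0)
```

With these settings the 4:1 compressor turns the signal down by 7.5 dB and the 10:1 limiter by 9 dB, while the expander pushes the quiet signal a further 10 dB down. A very high expander ratio approximates the gating behaviour described above.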
Processes that affect frequencies
The frequency response of a signal represents the amount (volume) of every frequency in the human hearing range, which on average spans 20 Hz to 20,000 Hz (20 kHz). A variety of processes are commonly used to edit frequency response in various ways.
- Equalization – Equalization is a broad term for any device that can alter parts of a signal's frequency response. Some EQs use a grid of faders or knobs which can be arranged to shape each frequency, whereas others use bands that can target and subsequently boost or cut a selectable range of frequencies.[11]: 178
- Filters – Filters attenuate part of the audio spectrum. There are various types of filters. A high-pass filter (low-cut) is used to remove unneeded bass from a sound source. A low-pass filter (high-cut) is used to remove unneeded treble. These are most often used as a way to declutter a given mix to improve the clarity of the individual elements. A band-pass filter is a combination of high- and low-pass filters, also known as a telephone filter (because a sound lacking in high and low frequencies resembles the quality of sound over a telephone).[12]
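The high-pass and low-pass filters described above can be approximated with a first-order (one-pole) design. This is only a sketch of the idea, assuming samples in a plain list and a gentle 6 dB per octave slope; practical mixing filters are usually steeper, and the function names here are hypothetical.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """First-order low-pass (high-cut): attenuates content above the cutoff."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, y = [], 0.0
    for x in samples:
        y = (1.0 - a) * x + a * y  # blend the new input with the previous output
        out.append(y)
    return out


def one_pole_highpass(samples, cutoff_hz, sample_rate):
    """First-order high-pass (low-cut): the input minus its low-passed copy."""
    low = one_pole_lowpass(samples, cutoff_hz, sample_rate)
    return [x - l for x, l in zip(samples, low)]


# A constant (0 Hz) signal passes the low-pass but is removed by the high-pass.
dc = [1.0] * 2000
lp_dc = one_pole_lowpass(dc, cutoff_hz=100.0, sample_rate=48000)
hp_dc = one_pole_highpass(dc, cutoff_hz=100.0, sample_rate=48000)
```

Feeding the output of `one_pole_highpass` into `one_pole_lowpass` with a higher cutoff gives a crude band-pass of the kind described as a telephone filter.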
Processes that affect time
- Reverbs – Reverbs are used to simulate acoustic reflections in a real room, adding a sense of space and depth to otherwise dry recordings. Another use is to distinguish among auditory objects; all sound having one reverberant character will be categorized together by human hearing in a process called auditory streaming. This is an important technique in creating the illusion of layered sound from in front of the speaker to behind it.[11]: 181 Before the advent of electronic reverb and echo processing, physical means were used to generate the effects. An echo chamber, a large reverberant room, could be equipped with a speaker and microphones. Signals were then sent to the speaker and the reverberation generated in the room was picked up by the two microphones.[12]
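A full reverb is beyond a short example, but one of its classic building blocks, the feedback comb filter (a single recirculating echo), shows how delayed copies of a signal create a decaying tail. This is a simplified stand-in rather than a complete reverb algorithm, and the function name and parameters are hypothetical.

```python
def feedback_comb(samples, delay_samples, feedback, wet=0.5):
    """Mix the dry signal with a recirculating echo from a circular delay line."""
    buffer = [0.0] * delay_samples  # holds the last `delay_samples` values
    out, idx = [], 0
    for x in samples:
        delayed = buffer[idx]                 # what was written delay_samples ago
        buffer[idx] = x + feedback * delayed  # feed part of the echo back in
        idx = (idx + 1) % delay_samples
        out.append((1.0 - wet) * x + wet * delayed)
    return out


# A single click produces echoes every 3 samples, each half as loud as the last.
impulse = [1.0] + [0.0] * 9
tail = feedback_comb(impulse, delay_samples=3, feedback=0.5)
```

Artificial reverbs typically combine several such delay lines with different lengths so that the individual echoes blur into a smooth decay.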
Processes that affect space
- Panning – While panning is a process that affects levels, it also can be considered a process that affects space since it is used to give the impression of a source coming from a particular direction. Panning allows the engineer to place the sound within the stereo or surround field, giving the illusion of a sound's origin having a physical position.[1]
- Pseudostereo creates a stereo-like sound image from monophonic sources. This way the apparent source width or the degree of listener envelopment is increased. A number of pseudostereo recording and mixing techniques are known from the viewpoint of audio engineers[13][14] and researchers.[15][16]
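Placing a mono source in the stereo field can be sketched with a constant-power pan law, one common choice among several. The sine/cosine law used here is an assumption, not something specified by the sources above, and `constant_power_pan` is a hypothetical name.

```python
import math


def constant_power_pan(sample, pan):
    """Return (left, right) for pan in [-1.0 (hard left), +1.0 (hard right)]."""
    angle = (pan + 1.0) * math.pi / 4.0  # map the pan position to 0..pi/2
    return sample * math.cos(angle), sample * math.sin(angle)


hard_left = constant_power_pan(1.0, -1.0)
center = constant_power_pan(1.0, 0.0)
hard_right = constant_power_pan(1.0, 1.0)
```

With this law a centered source is about 3 dB down in each channel, so the combined acoustic power stays roughly constant as the source moves across the field.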
Downmixing
Downmixing converts a program with a multiple-channel configuration into a program with fewer channels. Common examples include downmixing from 5.1 surround sound to stereo,[a] and stereo to mono. Because these scenarios are common, downmixes are routinely checked during the production process to ensure stereo and mono compatibility.
The alternative channel configuration can be explicitly authored during the production process with multiple channel configurations provided for distribution. For example, on DVD-Audio or Super Audio CD, a separate stereo mix can be included along with the surround mix.[17] Alternatively, the program can be automatically downmixed by the end consumer's audio system. For example, a DVD player or sound card may downmix a surround sound program to stereo for playback through two speakers.[18][19]
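An automatic downmix of the kind described in note [a] can be sketched per sample as follows. The -3 dB (about 0.707) coefficient for the center and surround channels is a commonly used value but is an assumption here, as are the function names; actual players and formats may use other coefficients.

```python
import math


def downmix_51_to_stereo(l, r, c, lfe, ls, rs, include_lfe=False):
    """Fold one 5.1 sample frame down to a (left, right) stereo pair."""
    k = math.sqrt(0.5)       # assumed -3 dB coefficient
    left = l + k * c + k * ls   # center goes equally to both sides
    right = r + k * c + k * rs  # each surround joins its own front channel
    if include_lfe:          # the LFE channel is often simply discarded
        left += k * lfe
        right += k * lfe
    return left, right


def downmix_stereo_to_mono(left, right):
    """Average the two channels into one mono sample."""
    return 0.5 * (left + right)


# A center-only sound lands equally in both stereo channels.
stereo = downmix_51_to_stereo(0.0, 0.0, 1.0, 0.0, 0.0, 0.0)
# Content that is out of phase between left and right cancels in mono.
cancelled = downmix_stereo_to_mono(0.5, -0.5)
```

The second example illustrates why mono compatibility is checked during production: out-of-phase stereo content can disappear entirely when the channels are summed.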
Mixing in surround sound
Any console with a sufficient number of mix busses can be used to create a 5.1 surround sound mix, but this may be frustrating if the console is not specifically designed to facilitate signal routing, panning, and processing in a surround sound environment. Whether working in an analog hardware, digital hardware, or DAW mixing environment, the ability to pan mono or stereo sources, place effects in the 5.1 soundscape, and monitor multiple output formats without difficulty can make the difference between a successful mix and a compromised one.[20] Mixing in surround is very similar to mixing in stereo except that there are more speakers, placed to surround the listener. In addition to the horizontal panoramic options available in stereo, mixing in surround lets the mix engineer pan sources within a much wider and more enveloping environment. In a surround mix, sounds can appear to originate from many more directions, or from almost any direction, depending on the number of speakers used, their placement, and how the audio is processed.
There are two common ways to approach mixing in surround. Naturally, these approaches can be combined in any way the mix engineer sees fit.
- Expanded Stereo – With this approach, the mix will still sound very much like an ordinary stereo mix. Most of the sources, such as the instruments of a band, backing vocals, and so on, are panned between the left and right speakers.[b] Lead sources such as the main vocal are sent to the center speaker. Additionally, reverb and delay effects will often be sent to the rear speakers to create a more realistic sense of being in an acoustic space. For the case of mixing a live recording that was performed in front of an audience, signals recorded by microphones aimed at, or placed among the audience are sent to the rear speakers to make the listener feel as if they are a part of the audience.
- Complete Surround/All speakers treated equally – Instead of following the traditional ways of mixing in stereo, this much more liberal approach lets the mix engineer do anything they want. Instruments can appear to originate from anywhere, or even spin around the listener. When done appropriately and with taste, interesting sonic experiences can be achieved.
Recently, a third approach to mixing in surround was developed by surround mix engineer Unne Liljeblad.
- Multi Stereo Surround (MSS)[21] – This approach treats the speakers in a surround sound system as a multitude of stereo pairs. For example, a stereo recording of a piano, created using two microphones in an ORTF configuration, might have its left channel sent to the left-rear speaker and its right channel sent to the center speaker. The piano might also be sent to a reverb having its left and right outputs sent to the left-front speaker and right-rear speaker, respectively. Thus, multiple clean stereo recordings surround the listener without the smearing comb-filtering effects that often occur when the same or similar sources are sent to multiple speakers.
Mixing in 3D sound
An extension to surround sound is 3D sound, used by formats such as Dolby Atmos. Known as object-based sound, this enables additional speakers to represent height channels, with as many as 64 unique speaker feeds.[22][23] This has application in concert recordings, movies and videogames, and nightclub events.[24]
Notes
- ^ The left and right surround channels are blended with the left and right front channels. The center channel is blended equally with the left and right channels. The LFE channel is either mixed with the front signals or not used.
- ^ Lower levels of these sources may also be sent to the rear speakers in order to create a wider stereo image.
References
- ^ a b c Huber, David Miles; Runstein, Robert E. (2001). Modern recording techniques (5th ed.). Focal Press. ISBN 0-240-80456-2.
- ^ Strong, Jeff (2009). Home Recording For Musicians For Dummies (Third ed.). Indianapolis, Indiana: Wiley Publishing, Inc. p. 249.
- ^ Hepworth-Sawyer, Russ (2009). From Demo to Delivery. The production process. Oxford, United Kingdom: Focal Press. p. 109.
- ^ Rumsey, Francis; McCormick, Tim (2009). Sound and Recording (6th ed.). Oxford, United Kingdom: Elsevier Inc. p. 168. ISBN 978-0-240-52163-3.
- ^ Rumsey, Francis; McCormick, Tim (2009). Sound and Recording (6th ed.). Oxford, United Kingdom: Elsevier Inc. p. 169. ISBN 978-0-240-52163-3.
- ^ Huber, David Miles (2001). Modern Recording Techniques. Focal Press. p. 321. ISBN 978-0240804569.
- ^ "The emergence of multitrack recording". Retrieved June 17, 2018.
- ^ "Studio Recording Software: Personal And Project Audio Adventures". studiorecordingsoftware101.com. 2008. Archived from the original on February 8, 2011. Retrieved March 20, 2010.
- ^ a b White, Paul (2003). Creative Recording (2nd ed.). Sanctuary Publishing. p. 335. ISBN 978-1-86074-456-3.
- ^ Izhaki, Roey (2008). Mixing Audio. Focal Press. p. 566. ISBN 978-0-240-52068-1.
- ^ a b c d e f g h i Holman, Tomlinson (2010). Sound for Film and Television (3rd ed.). Oxford, United Kingdom: Elsevier Inc. ISBN 978-0-240-81330-1.
- ^ a b Rumsey, Francis; McCormick, Tim (2009). Sound and Recording (6th ed.). Oxford, United Kingdom: Elsevier Inc. p. 390. ISBN 978-0-240-52163-3.
- ^ Levitin, Daniel J. (2004). "Instrument (and vocal) recording tips and tricks". In Greenbaum, Ken; Barzel, Ronen (eds.). Audio Anecdotes. Natick: A K Peters. pp. 147–158.
- ^ Cabrera, Andrés (2011). "Pseudo-Stereo Techniques. Csound Implementations". CSound Journal. 2011 (14): Paper number 3. Retrieved 1 June 2018.
- ^ Faller, Christof (2005). Pseudostereophony Revisited (PDF). Audio Engineering Society Convention 118. Barcelona. Retrieved 1 June 2018.
- ^ Ziemer, Tim (2017). "Source Width in Music Production. Methods in Stereo, Ambisonics, and Wave Field Synthesis". In Schneider, Albrecht (ed.). Studies in Musical Acoustics and Psychoacoustics. Current Research in Systematic Musicology. Vol. 4. Cham: Springer. pp. 299–340. doi:10.1007/978-3-319-47292-8_10. ISBN 978-3-319-47292-8.
- ^ Bartlett, Bruce; Bartlett, Jenny (2009). Practical Recording Techniques (5th ed.). Oxford, United Kingdom: Focal Press. p. 484. ISBN 978-0-240-81144-4.
- ^ "What Is Downmixing? Part 1: Stereo (LoRo)". TVTechnology. 6 March 2007.
- ^ Thornton, Mike (March 2012). "Podcast Follow Up - Surround Mixdown Formats". Pro Tools Expert.
- ^ Huber, David Miles; Runstein, Robert (2010). Modern Recording Techniques (7th ed.). Oxford, United Kingdom: Focal Press. p. 559. ISBN 978-0-240-81069-0.
- ^ "Surround Sound Mixing". www.mix-engineer.com. Retrieved 2010-01-12.
- ^ "Dolby Atmos for Home". www.dolby.com.
- ^ Hidalgo, Jason (April 26, 2012). "Dolby's Atmos technology gives new meaning to surround sound, death from above". Engadget. Retrieved 2012-06-01.
- ^ Authoring for Dolby Atmos Cinema Sound Manual (PDF) (Third ed.). Dolby Laboratories, Inc. 2014. pp. 69–103. Archived from the original (PDF) on 10 July 2015. Retrieved 7 December 2014.
