Equalisation - between hertz and sounds
Mar 25, 2003 - by Franck ERNOULD. Translated by Athahualpa.
This article deals with the connections between Hertz and sound, as well as the patterns and corrections that apply to the latter.
Any psychoacoustic consideration is essentially subjective and often needs to be adjusted according to the listener. The terms, adjectives and figures quoted in this article are by no means meant to be universal or inflexible; they simply reflect the author’s viewpoint. If they trigger discussion, experimenting and learning, their objective will have been fulfilled…
|Welcome to flat country|
The response curve of every single electroacoustic device, whether it’s electronic (amplifier, console, converter…) or a transducer (microphone, loudspeaker…), should be as flat as possible. The human ear, however, is far from being a linear receptor. After eons of evolution, this organ has become most sensitive to the frequencies closest to the human voice, especially a baby’s. For this reason, our most developed perception range lies around 1 to 2 kHz. Above and below these frequencies, our sensitivity drops, by an amount that depends on the listening level. The flattest “real objective level/perceived subjective level” curve is reached around 85 dB SPL. This is, by the way, the standard reference level for cinema mixing: the mixing engineer calibrates his installation so that he measures 85 dB SPL from where he’s seated!
At lower listening levels, the low and high frequencies are not so well detected. Those famous “loudness” correctors found in a variety of hi-fi systems serve the purpose of compensating for this shortcoming of the human ear. At higher listening levels, the curve becomes distorted, causing the ear to tire faster and to analyze dynamics less accurately. That’s why a high-pitched sound which seems inoffensive at low volume might become aggressive when the level is raised.
|Flavors and colors|
Let’s do some more foreplay before getting into the real thing: every single speaker out there, whatever the make, model or price, has an acoustic fingerprint of its own. There’s no such thing as a device featuring a perfectly flat response curve; that’s subjective blabber! Let’s not forget that the trend nowadays is towards systems that give priority to enhanced bass and nice treble (Genelec or FAR style). Hi-fi speakers also tend to be disguised as professional models by oversizing their components.
Listening to sounds through ‘colored’ speakers is like trying to judge the colors of a painting while wearing tinted sunglasses. Unlike the eye, however, our ear gets used to the coloring and ends up compensating for it. This explains how some experienced sound engineers produce excellent mixes on monitors they trust, which a less experienced listener would describe as cheap sounding.
To make the task even more difficult, these monitors are often placed inside a control room that has its own acoustic properties, although its influence is very limited due to the short distance separating the sound engineer from the speakers. The amplifier also makes its own contribution to the final sound of the system, so basically a speaker will behave differently depending on what type of amp it is plugged into. This problem can be bypassed by using active speakers.
|Six frequency domains|
After this introduction, which was intended to make the reader aware of all the distorting factors that come between the sine wave and your ear, we can get down to the relation between sound and Hertz. The next step is a rough characterization of the audible spectrum, ranging from 20 Hz to 20 kHz:
1) Sub-bass: between 20 and 60 Hz
In this range the sound is primarily felt rather than heard, unless a large amount of electric energy is spent in the process. These frequencies give an overall impression of powerful sound. If they are exaggerated, the sound tends to become indistinct and confusing, and the life of the speaker cones will be shortened dramatically! In Dolby Digital or DTS-style mixing, this range is widely exploited and has become a trademark of action-packed films, adding their characteristic spectacular dimension, which justifies a separate channel.
2) Bass: between 60 and 200 Hz
Most of the rhythmic energy is to be found in this region. The usual bass range, for instance, goes from E1 to E4, which translates into roughly 80 Hz to 640 Hz.
3) Bass-medium: between 200 Hz and 1.5 kHz
The first harmonic frequencies of most instruments lie here. If too many corrections are applied to this region, the sounds tend to become too nasal and tiring.
4) High-medium: between 1.5 and 4 kHz
All the important harmonics are to be found here, especially in sung music. This is where corrections have to be made in order to improve the intelligibility of the lyrics.
5) High: between 4 and 10 kHz
This is the realm of clarity and sound definition. Corrections made to an instrument or voice around 5 kHz will increase its presence. Nevertheless, too much emphasis in this region will cause a whistling effect and eventually add aggressiveness to the overall sound.
6) Ultra-high: between 10 and 20 kHz
Although wise manipulation in this area will enhance brilliance and clarity, too much of it will amplify whistle, wide spectrum noises and other acoustic pollutants.
Dividing high and low end into two subunits might seem a bit artificial. On a professional console, you’d generally have four correctors, the most sophisticated (parametric) affecting bass and high mediums. In a typical home studio console, usually there’s only one semi-parametric corrector to alter the mediums region in an approximate manner.
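As a quick illustration, the six ranges listed above can be written down as a small lookup table. This is only a sketch: the name `band_of` and the decision to assign a boundary frequency (60 Hz, 200 Hz…) to the upper band are assumptions made for the example, not something the article specifies.

```python
# Band edges in Hz, copied from the six ranges listed in the article.
BANDS = [
    (20, 60, "sub-bass"),
    (60, 200, "bass"),
    (200, 1500, "bass-medium"),
    (1500, 4000, "high-medium"),
    (4000, 10000, "high"),
    (10000, 20000, "ultra-high"),
]


def band_of(freq_hz):
    """Return the band name for freq_hz, or None outside 20 Hz to 20 kHz."""
    for low, high, name in BANDS:
        # Assumption: a boundary frequency belongs to the upper band.
        if low <= freq_hz < high:
            return name
    # The very top of the spectrum is still counted as ultra-high.
    if freq_hz == 20000:
        return "ultra-high"
    return None


for f in (40, 96, 440, 3000, 5000, 15000):
    print(f, "Hz ->", band_of(f))
```

Such a table is only as meaningful as the band limits themselves, which, as stated at the beginning, reflect the author’s viewpoint.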
|Odd and even|
Now let’s talk music. For the same note (let’s take the A of a tuning fork, also called A3, with a frequency of 440 Hz), we can easily distinguish whether the sound is coming from a piano, a violin or a human voice. Why? “By their timbre!” you’d answer. According to Fourier, any sound can be broken down into a sum of sinusoids: a fundamental (440 Hz in this case) and plenty of harmonics, whose frequencies are multiples of the former. The even harmonics (multiples of 2, 4, 6, 8, or 880, 1760… Hz) are perceived as pleasant by the ear (2, 4 and 8 are octaves and 6 is a fifth above the root), while the odd harmonics (multiples of 3, 5, 7, or 1320, 2200… Hz) are deemed uncomfortable (these are usually dissonant intervals: major and minor thirds, seconds…). Of course, in reality, these harmonics are very rarely found in isolation; usually they are orbiting around the fundamentals of chords consisting of four, five or more notes.
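The harmonic series of the 440 Hz A can be listed in a few lines. The sketch below only multiplies the fundamental by whole numbers, as described above; the function name `harmonics` and the choice of eight harmonics are illustrative assumptions.

```python
def harmonics(fundamental_hz, count):
    """Return the first `count` harmonics; harmonic 1 is the fundamental."""
    return [fundamental_hz * n for n in range(1, count + 1)]


series = harmonics(440, 8)

# Split by harmonic number, leaving the fundamental (n = 1) aside.
even = [f for n, f in enumerate(series, start=1) if n % 2 == 0]
odd = [f for n, f in enumerate(series, start=1) if n % 2 == 1 and n > 1]

print(even)  # [880, 1760, 2640, 3520]
print(odd)   # [1320, 2200, 3080]
```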
A sound having a strong fundamental and few harmonics will be… . It’ll seem empty if the higher-ranking harmonics are strong compared to the fundamental. It is only by estimating the relative amount of each harmonic (up to 20240 Hz, or the 46th harmonic) that our ear can tell a trumpet from a flute. Other experiments, however, have shown that, deprived of its percussive attack, the sound of a piano or guitar becomes unrecognizable, even though this process does not alter its timbre in the slightest. The sound of an electric guitar played using the volume pedal is quite different from the original “flavor” of that instrument played conventionally. This goes to prove that, during mixing, there’s more than just frequencies to define a given sound: attack plays a very important role. The personality of an instrument can also result from a small number of harmonics whose particular combination is used as a reference by the ear to recognize it.
Formants are another clue for identifying a sound. Let’s imagine the same mouthpiece mounted on a clarinet and a saxophone: the base pattern is the same, but it is the nature of the instrument, or its formants, that will add color to the base pattern by attenuating some harmonics while enhancing others. If we replace the mouthpiece by the vocal cords and the sax by the larynx, it is their formants that will allow us to identify the voice of the speaker (somewhere between 300 and 3000 Hz).
This principle of enhancing the characteristic frequencies of an instrument while dampening the others is used extensively during mixing, particularly if several different instruments make up the song. So how about percussion instruments and their wide spectrum without a well-defined fundamental? As a matter of fact, their frequency band is so wide that you can find treble on a kick drum and bass on a hi-hat. However, when you need to bring out the characteristic sound of each of these components, you’ll focus on a narrow region, for example the frequencies that define the “clang” of a cymbal.
Let’s look at a very special type of instrument: a voice singing lyrics! In this case we have a mixture of established sounds (the vowels) and transition sounds (the consonants). The latter range from medium-sounding ones like T, B or K, through diffuse or mild ones like F, Z, J, to hard ones like S, CH. Each vowel has its own spectral signature or formant. Consonants are closer to wide-range noises: a “chhhhh” is practically white noise!
Different harmonic structures are specific to different languages, which is why the same tune sung in German, English or French will have an identical melody, but everything else will differ: the position and type of consonants and vowels, tonic accent, etc. That’s why laying down vocal tracks in different languages requires different approaches when it comes to mixing; hence the well-known fact that it is easier to mix down a song in English than in German or French, the latter being the most difficult of the three.
The following lyrics illustrate the huge difference between the same song, with the same meaning, in English and in French:
“Show me the way to the next whisky bar / Oh, don’t ask why / For if we don’t find the next whisky bar / I tell you we must die”
“Dis-nous ou trouver le prochain beau p’tit bar / D’mande pas pourquoi / Car si on trouve pas le prochain beau p’tit bar / J’te jure qu’on en crevera”
This is the original version of “Alabama Song”, written in English by Bertolt Brecht and then translated into French by Boris Vian. Although the meter and subject are the same, the predominance of a certain consonant type is obvious: Ch, x, k, b, f, t against D, n, p, J. The same applies to vowels: o-ou, ou-ei, ou-I, ai, eu against I, e, o, ain, e, a… No wonder Jim Morrison is easier to mix than Catherine Sauvage.
Do you find decibels to be an abstract concept? Here are two simple definitions:
A 1 dB correction is the smallest audible change that an average human ear can detect.
If we consider a passing audio signal, voltage doubles (or halves) every time you add (or subtract) 6 dB. For instance: a given sound has a rich component around 3 kHz and it modulates at –5 dB. If you exaggerate and add +18 dB, you’ll be multiplying the voltage corresponding to this component by 8 (6 + 6 + 6 dB equals 2 x 2 x 2), going from 125 mV to 1 V for example.
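The “6 dB doubles the voltage” rule is a rounded form of the usual relation ratio = 10^(dB/20). The sketch below (function names are illustrative assumptions) shows how close the rule of thumb is: +6 dB is a factor of about 1.995 rather than exactly 2, so +18 dB gives roughly 7.94 rather than exactly 8.

```python
import math


def db_to_voltage_ratio(db):
    """Voltage ratio corresponding to a gain expressed in dB."""
    return 10 ** (db / 20)


def voltage_ratio_to_db(ratio):
    """Gain in dB corresponding to a voltage ratio."""
    return 20 * math.log10(ratio)


print(round(db_to_voltage_ratio(6), 3))       # 1.995
print(round(db_to_voltage_ratio(18), 2))      # 7.94
print(round(125 * db_to_voltage_ratio(18)))   # 993 (mV), i.e. about 1 V
print(round(voltage_ratio_to_db(8), 2))       # 18.06
```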
How do you determine in which range to find the most characteristic frequency of a sound you’re mixing? Easy: turn the gain of your parametric equalizer all the way up and then sweep the entire available frequency range until you stumble upon the “body” of your sound; it will be so audible you won’t miss it. Turn the gain back to 0 dB and proceed to the desired correction…
Keep in mind that the harder you hit an instrument, the more sonic energy it releases and the more its spectral pattern is crowded with treble harmonics. This phenomenon applies to strings, percussion, mouthpieces, and even vocal cords. You can even try it yourself: if you record the same melody in a soft voice and then very loudly, and match their levels on the VU-meter, you’ll end up needing far more overall compression and manipulation of the high-mid range in order to be able to use the soft version on a playback.
|Nothing but a guitar chord|
Let’s look at an E major chord played on the guitar. It’s made of the following notes: E1, B1, E2, G#2, B2, E3. The following considerations take into account only the first six harmonics of each note of the chord. Their frequency does not go beyond 1980 Hz, and even though ‘perfect notes’ are predominant within the chord (E, G#, B), the harmonics tend to pull towards F# and D# a bit. The guitar is a rather bass-mid instrument: if it wasn’t for the pick, metal strings, octave strings on a 12-string, or effects on an electric version and so on, we wouldn’t perceive it as an instrument sounding beyond the sixth harmonic…
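The 1980 Hz ceiling can be checked with a short calculation. The sketch below makes several assumptions that are not in the article: equal temperament, the article’s octave numbering in which A3 is 440 Hz, and the standard open E major voicing in which the G sharp sits between E2 and B2. The names `note_freq` and `chord_harmonics` are illustrative.

```python
# Semitone distance from A within the same octave (octaves start at C).
SEMITONES_FROM_A = {"E": -5, "G#": -1, "B": 2}


def note_freq(name, octave):
    """Equal-tempered frequency, assuming the article's numbering (A3 = 440 Hz)."""
    steps = SEMITONES_FROM_A[name] + 12 * (octave - 3)
    return 440.0 * 2 ** (steps / 12)


# Open E major chord, lowest string first (assumed voicing).
CHORD = [("E", 1), ("B", 1), ("E", 2), ("G#", 2), ("B", 2), ("E", 3)]


def chord_harmonics(chord, count=6):
    """Map each note to the list of its first `count` harmonics in Hz."""
    return {
        f"{name}{octave}": [note_freq(name, octave) * n for n in range(1, count + 1)]
        for name, octave in chord
    }


table = chord_harmonics(CHORD)
for note, freqs in table.items():
    print(note, [round(f) for f in freqs])

highest = max(max(freqs) for freqs in table.values())
print(round(highest))  # 1978, the sixth harmonic of the top E
```

With the top E rounded to 330 Hz, the same calculation gives the 1980 Hz quoted in the text.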
|Sampling and corrections|
Sampling can be a tricky business. Let’s take a real piano as an example: the spectral range of each note is slightly different from the preceding or following one, so if you sample a C2 and then you play from C1 to C3, all the sampler does is transpose the harmonic information defining C2. The ear recognizes the melody because of the fundamental, but the sampled pattern won’t trick the ear: it is used to a “real piano sound”, where the harmonics of a C3 are different from those of a C1.
The noise produced by the hammer also has an impact on the overall sound of the instrument. Each time the hammer hits a string, the subsequent shock, which covers a large bandwidth, excites the whole structure of the piano. This means that for each note, a characteristic formant exists, made up of a complicated array of harmonics resulting from the internal resonance. So whether you play a C1 or a C3, this part of the sound merely resulting from the mechanical resonance will remain unchanged.
The sampled C2 mentioned previously certainly contains this component, but unlike the natural notes, which all share exactly the same resonance pattern, the sampled one will shift it up or down when transposed, creating a new combination of harmonics which the ear identifies as ‘unnatural’. To avoid this problem, most modern sampled pianos contain a vast number of samples.
The same principle applies to a guitar, with the characteristic sound of the pick plucking the strings, or to an electric bass, whose body has its resonance peak around 200 Hz. In both cases the pickups turn these ‘extra formants’ into an intrinsic component of the instrument’s sound. So, in order to equalize a real bass, you’ll need to focus on the band around 200 Hz. If you sample a bass playing a G (around 96 Hz), on the other hand, its resonance will enhance harmonic 2, so if you then play this harmonically colored note at other pitches, the resonance will be transposed too! At this point there’s no way of correcting the instrument as easily as in the ‘real bass’ example.
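To see how far the resonance travels, here is a rough sketch. It assumes a sampler that transposes by simple resampling, so that every frequency in the sample, the fixed 200 Hz body resonance included, is scaled by the same equal-tempered ratio. The function names are illustrative, not taken from any particular sampler.

```python
def transpose_ratio(semitones):
    """Frequency ratio for an equal-tempered transposition."""
    return 2 ** (semitones / 12)


def shifted_resonance(resonance_hz, semitones):
    """Where a resonance baked into the sample ends up after transposition."""
    return resonance_hz * transpose_ratio(semitones)


# Sampled G at about 96 Hz, with the body resonance near 200 Hz.
for semitones in (-12, -5, 0, 5, 12):
    note = 96 * transpose_ratio(semitones)
    peak = shifted_resonance(200, semitones)
    print(f"{semitones:+3d} semitones: note {note:6.1f} Hz, resonance {peak:6.1f} Hz")
```

On a real bass the peak would stay near 200 Hz for every note, so a single equalizer setting is enough. With the transposed sample it moves from about 100 Hz to 400 Hz over two octaves, which is why one fixed correction no longer works.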