“The Science of Vocal Pedagogy”
THE PROCESS OF SOUND ANALYSIS*
An analysis of sound demands an interpretation of the laws that govern the propagation of sound through material media. Sound may be defined as “any vibratory disturbance in a material medium which is capable of producing an auditory sensation in the normal ear.”1 The vibratory motion of any sounding body is transmitted to the ear through any medium, such as air, solids, and liquids. Such a motion creates energy, which is conveyed from particle to particle until it reaches the ear or is dissipated within the medium. In the propagation of sound, all particles within the medium are disturbed in the direction in which the sound is propagated. Such a disturbance is called a longitudinal wave (Fig. 51).
One must remember that air as a gaseous medium possesses elasticity. Each particle has the ability to return to its original position after imparting the energy within the sound wave to its neighboring air particle within the medium. The prong of the tuning fork, as it moves to the right (Fig. 51), compresses the air particles. This compression wave travels through the medium longitudinally. As the prong of the fork swings to the left, pressure is decreased behind the surrounding particles, and they are drawn back in a rarefaction phase. Thus, all sound depends upon the magnitude of the pressure changes emitted by a vibrating body. The rate at which a vibrator oscillates is controlled by its mass, weight, and length and the force exerted upon it. The number of oscillations during each second determines the number of compressions and rarefactions each second. This movement is expressed in terms of frequency; that is, A 440 equals 440 compressions and rarefactions per second, or 440 cycles per second (cps).
Fig. 51
Simple Sounds
The wave forms which follow show graphically the periodic disturbances occurring in the medium through which the wave motion is propagated. These curves do not constitute a picture of the actual to and fro motions of particles; they provide a convenient means of displaying the properties of sound wave components in terms of time and air pressure.
Fig. 52 illustrates the variation in air pressure (14.2 pounds per square inch) created by two complete oscillations of a tuning fork. Air pressure is indicated at the left of the diagram. Amplitude is indicated by the range of pressure variation above and below the zero line; time is indicated in centiseconds on the bottom line.
A sinusoid is a graphic representation of the pressure and displacement characteristics in time within a uniform vibration. It is usually used to portray sine waves, simple wave forms without overtones that are emitted by a vibrator exhibiting uniform periodic oscillation, such as a tuning fork, a pendulum, or a wheel (Fig. 53).
The graphic representations of these vibrating bodies are called sinusoids or sine waves because they are trigonometric functions of right angles and are identical to the movement of a point on the circumference of a uniformly revolving circle when that movement is projected on the diameter of the circle by means of perpendiculars.2
Fig. 52. Sound Waves from a Perfect Tuning Fork
Although the pressure changes in this illustration appear to be quite small, they represent an extremely loud sound. Source: Martin Joos, Acoustic Phonetics, Monograph No. 23, Language suppl., Linguistic Society of America, 24, 2 (April-June 1948).
Observe in Fig. 54 that as the point b moves steadily around the circle AA’BB’, the point C oscillates back and forth upon the diameter AB in a uniform, exactly repetitive motion. This motion is identical to the movement of the pendulum and tuning fork. It is called simple harmonic motion.
The mathematical derivation of the word sine or sinusoidal is as follows: The sine of an angle of a right triangle is defined as the ratio between the side opposite that angle and the hypotenuse. Since the radius of the circle forms the hypotenuse of each of the triangles that have been formed and is, therefore, the same in all of them, the value of the respective sines of the successive angles about the center may be expressed by the lengths of the side opposite those angles, or of the perpendiculars themselves. Therefore, the sine is the perpendicular.
Fig. 53. Examples of Simple Wave Forms
Fig. 54. Examples of the Movement of the Perpendicular (Sine) Along the Diameter
Fig. 55A. Example of a Sine Wave
Fig. 55B. Example of a Sine Wave
To graphically display the sine waves formed by the simple harmonic motion of a wheel, the length of each perpendicular bc of Fig. 54 and the velocity of the moving wheel must be considered. The perpendiculars are arranged equidistantly upon a horizontal axis that represents time in centiseconds; perpendiculars above the line represent the compression phase, and those below the line represent the rarefaction phase (Fig. 55A).
In Fig. 55B the heights of the perpendiculars, or sines, at the points of rotation of C, spaced at intervals of thirty degrees, are extended laterally for display. The vertical lines suggest time in centiseconds.
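The construction of Figs. 54 and 55 can be reproduced numerically. The following is a minimal sketch, assuming a circle of unit radius and the thirty-degree spacing mentioned above; the function name is illustrative and does not come from the text.

```python
import math

def perpendicular_heights(radius=1.0, step_degrees=30):
    """Heights of the perpendiculars dropped from a point moving
    uniformly around a circle, sampled every `step_degrees`."""
    heights = []
    for angle in range(0, 360 + step_degrees, step_degrees):
        # sine = opposite side / hypotenuse, and the hypotenuse is the radius
        heights.append(radius * math.sin(math.radians(angle)))
    return heights

heights = perpendicular_heights()
for angle, h in zip(range(0, 361, 30), heights):
    # positive values correspond to compression, negative to rarefaction
    print(f"{angle:3d} degrees -> {h:+.3f}")
```

Plotting these heights at equal spacing along a horizontal time axis gives one cycle of the sine wave.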
The employment of the mechanics of simple harmonic motion by transferring the motion of C in Fig. 54 to a source outside of the circle is illustrated in Fig. 56. The piston-cam shaft relationship within the modern gasoline motor and the piston-drive wheel relationship of the steam locomotive are examples of such mechanical transfer of simple harmonic motion.
Fig. 56. Example of Mechanics of Simple Harmonic Motion
Complex Sounds
Any complex sound wave is the sum of its sinusoids, or to state it differently, any complex sound with a repetitive wave form is usefully described as a series of pure tones. This fact is based upon a mathematical theorem named for its discoverer, J. B. J. Fourier (1768-1830): Every wave form, no matter what its nature may be, can be reproduced by superimposing a sufficient number of simple harmonic waves; that is, every complex wave can be built by piling up pure tone waves.
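In modern notation, the theorem for a wave that repeats with fundamental frequency $f_0$ is usually written as shown below. The symbols $A_n$ (the amplitude of the $n$th partial) and $\varphi_n$ (its phase) are introduced here for illustration and do not appear in the original text; the constant term is omitted because the pressure is measured from the zero line.

```latex
p(t) = \sum_{n=1}^{\infty} A_n \sin\left(2\pi n f_0 t + \varphi_n\right)
```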
The components of a complex sound structure are called partials and harmonics. A partial is defined as:
A component of sound sensation which may be distinguished as a simple tone that cannot be further analyzed by the ear and which contributes to the timbre of the complex sound. The frequency of a partial may be either higher or lower than the basic frequency, and it may or may not be an integral multiple or submultiple of the basic frequency. If the frequency is not a multiple or submultiple, the partial is inharmonic. . . . [An harmonic is defined as] a partial whose frequency is an integral multiple of the fundamental, i.e., fundamental A, 110 cps; second harmonic, 220 cps; third harmonic, 330 cps, etc.3
The Spectrum of Complex Sounds. The spectrum of sound is analogous to the spectrum of optics. When white light passes through a prism, the light breaks up into many colors. In acoustics, the term spectrum is used to describe the many simple sounds—the sinusoids—that make up a complex sound.
Fig. 57 shows the fundamentals and upper partials of the tones emitted by three different instruments. Such charts are known as sound spectra, and the length of the vertical lines indicates the relative strength of the several harmonics.
Timbre is that attribute of auditory sensation in terms of which a listener can judge that two sounds similarly presented and having the same loudness and pitch are dissimilar. Timbre depends primarily upon the spectrum of the stimulus, but it also depends upon the wave form, the sound pressure, the frequency location of the spectrum and the temporal characteristics of the stimulus.4
Not only is the ear able to detect the difference in sound characteristics, but the visual representation of the wave forms of each sound source reveals different sound characteristics. In Fig. 58 recorded wave forms of three musical sources have the same frequency and amplitude. The differences among the examples result from changes in the wave form which are determined by the number, frequency, and distribution of the partials within it.
Fig. 57. Spectrogram of Tuning Fork (left), Clarinet, and Cornet
Any given frequency has its harmonic series. “A harmonic series of sounds is one in which each basic frequency in the series is an integral multiple of the fundamental frequency.”5
The first sixteen harmonics, based upon C-65.4 as a fundamental, are shown in Fig. 59; a corresponding harmonic series may be set up on a fundamental of any frequency. The hundreds of partials in the complete spectra extend into the ultra-audio region. The spectra of complex sounds consist of simple multiples of the fundamental frequencies, as shown in the figure below.
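The arithmetic behind such a series is simple enough to sketch in a few lines. The example below assumes only the definition quoted above (each harmonic is an integral multiple of the fundamental) and the fundamental of 65.4 cps; the function name is illustrative.

```python
def harmonic_series(fundamental, count=16):
    """Frequencies of the first `count` harmonics (the fundamental is harmonic 1)."""
    return [n * fundamental for n in range(1, count + 1)]

series = harmonic_series(65.4)
for n, freq in enumerate(series, start=1):
    print(f"harmonic {n:2d}: {freq:7.1f} cps")
```

Calling `harmonic_series(110, 3)` gives the 110, 220, and 330 cps of the earlier example on A.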
Fig. 58. Wave Forms of Tuning Fork (top), Clarinet, and Cornet
Each vibrates at a frequency of 440 and at approximately the same intensity. From Musical Acoustics by Charles A. Culver, Copyright 1956, McGraw-Hill Book Co. Used by permission.
Fig. 59. The Harmonic Series
The inharmonic partials of the tempered scale shown above (5, 7, 9, 10, 11, 13, 15) are not exact multiples of the fundamental, and when sounded with the natural scale, they give the effect of roughness. The beats caused by the difference in frequency give a vibrato effect to the sound rather than smoothness. Inharmonic partials are usually among the higher partials and may be small or large in amplitude.
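One way to see how far a harmonic lies from the tempered scale is to express the distance in cents (1,200 cents to the octave, 100 to each tempered semitone). The sketch below is an illustration added here, not a calculation taken from the source; it measures each of the first sixteen harmonics against the nearest equal-tempered pitch above the same fundamental.

```python
import math

def cents_from_tempered(harmonic_number):
    """Distance in cents between a harmonic and the nearest equal-tempered pitch."""
    cents_above_fundamental = 1200 * math.log2(harmonic_number)
    nearest_tempered = 100 * round(cents_above_fundamental / 100)
    return cents_above_fundamental - nearest_tempered

deviations = {n: cents_from_tempered(n) for n in range(1, 17)}
for n, cents in deviations.items():
    print(f"harmonic {n:2d}: {cents:+6.1f} cents")
```

Under these assumptions the octave harmonics (2, 4, 8, 16) land exactly on tempered pitches, whereas harmonics 7, 11, and 13 fall roughly 30 to 50 cents away from any tempered pitch.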
PHASE, REINFORCEMENT, AND INTERFERENCE
When two or more sinusoidal waves moving in the same direction cross the zero line at exactly the same point in time, they are exactly repetitious and are in phase with each other.
The phase of a periodic quantity, for a particular value of the independent variable,* is the fractional part of a period through which the independent variable has last advanced through zero from a negative to a positive direction.6
In Fig. 60, A represents the superimposition of two simple harmonic motions of equal period but of different amplitudes. Here the vibratory motions, W1 and W2, are in the same phase, crest over crest, trough over trough. The vibrations now reinforce one another, and the resultant (indicated by the solid line W) has an amplitude that is equal to the sum of the amplitudes of waves b and d. This characteristic of a wave crest in concurrence with another wave crest is known as reinforcement.
Fig. 60. Superimposition of Two Simple Harmonic Motions
From Musical Acoustics by Charles A. Culver, Copyright 1956, McGraw-Hill Book Co. Used by permission.
B represents the superimposition of two simple harmonic motions (dotted lines) of equal period, which vary in amplitude but are in opposite phase, crest over trough and trough over crest. The constituent vibrations now pull in opposite directions and so partially neutralize one another; the amplitude of the resultant (represented by the solid line) is equal to the difference of the amplitudes of W1 and W2. This characteristic of a wave trough in opposition to a wave crest is known as interference. C indicates the superimposition of two wave systems of equal amplitude but of opposite phase. W1 subtracted from W2 is zero, an example of complete interference.
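The three cases of Fig. 60 can be checked numerically. The following is a minimal sketch, assuming amplitudes of 3 and 1 for cases A and B and of 2 and 2 for case C (the figure itself gives no numbers); it adds the two motions point by point over one cycle and reports the peak of the resultant.

```python
import math

def resultant_peak(amp1, amp2, phase_difference, samples=1000):
    """Peak of the sum of two sinusoids of equal period, sampled over one cycle."""
    peak = 0.0
    for k in range(samples):
        angle = 2 * math.pi * k / samples
        total = amp1 * math.sin(angle) + amp2 * math.sin(angle + phase_difference)
        peak = max(peak, abs(total))
    return peak

in_phase = resultant_peak(3.0, 1.0, 0.0)        # case A: crest over crest
opposed = resultant_peak(3.0, 1.0, math.pi)     # case B: crest over trough
cancelled = resultant_peak(2.0, 2.0, math.pi)   # case C: equal amplitudes
print(in_phase, opposed, cancelled)
```

The first result is the sum of the two amplitudes, the second is their difference, and the third is zero apart from rounding error.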
Fig. 61 illustrates the manner in which complex waves are formed by the superimposition of two or more sine waves of unequal period that are in phase at every half centisecond.7 Each point in the resultant wave C may be easily determined by using dividers; ab plus cd equals ef; likewise, gh minus jk equals mn. (jk is subtracted because it lies below the zero line.) This point-by-point relationship can be summarized: C is the sum of A and B.
Fig. 61. A Complex Wave as a Sum of Sinusoids
Here A represents a 100~ sinusoid, B represents a 300~ sinusoid, and C represents their sum, a complex or nonsinusoidal wave. Source: Martin Joos, Acoustic Phonetics, Monograph No. 23, Language suppl., Linguistic Society of America, 24, 2 (April-June 1948).
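The point-by-point addition performed with dividers can also be carried out in code. The sketch below assumes relative amplitudes of 1.0 and 0.5 for the two components, since the caption states only their frequencies; all names are illustrative.

```python
import math

def sinusoid(frequency, amplitude, t):
    """Ordinate of a sine wave at time t (seconds)."""
    return amplitude * math.sin(2 * math.pi * frequency * t)

def complex_wave(t):
    a = sinusoid(100, 1.0, t)   # wave A: 100 cps (amplitude assumed)
    b = sinusoid(300, 0.5, t)   # wave B: 300 cps (amplitude assumed)
    return a + b                # wave C: ordinate of A plus ordinate of B

# Sample one centisecond (one period of the 100-cps component) in 20 steps.
for step in range(21):
    t = step * 0.0005
    print(f"t = {t * 100:.2f} cs   C = {complex_wave(t):+.3f}")
```

Because 300 is a whole-number multiple of 100, the resultant repeats exactly once every centisecond.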
Joos further illustrates that a complex wave does not necessarily contain any fundamental component. In Fig. 62, D contains a fourth, a third, and a second harmonic but no fundamental. The fundamental, however, can be easily determined as the point of exact repetition or the point of greatest reinforcement of A, B, and C. In this illustration all three waves cross the median line while moving in the same direction at one centisecond. A is 200 cps, B is 300 cps, and C is 400 cps. The fundamental is 100 cps at zero amplitude.
Fig. 62. A Complex Wave May Have No Fundamental Component
Source: Martin Joos, Acoustic Phonetics, Monograph No. 23, Language suppl., Linguistic Society of America, 24, 2 (April-June 1948).
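The absent fundamental of Fig. 62 can be recovered arithmetically: it is the largest frequency of which every component is a whole-number multiple. The sketch below is an illustration, assuming the components are given as whole numbers of cycles per second; the function name is not from the source.

```python
from functools import reduce
from math import gcd

def implied_fundamental(partials_cps):
    """Largest frequency of which every partial is a whole-number multiple."""
    return reduce(gcd, partials_cps)

components = [200, 300, 400]          # waves A, B, and C of Fig. 62
fundamental = implied_fundamental(components)
period_cs = 100 / fundamental         # repetition period in centiseconds
print(fundamental, period_cs)
```

For these components the result is 100 cps, and the combined wave repeats every centisecond even though no 100-cps sinusoid is present.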
Even a square wave (Fig. 63) may be analyzed as consisting of sinusoids, but an infinite series of them is needed to create a square wave. Only the first four are shown here (A, B, C, and D). They are the fundamental and odd harmonics three, five, and seven. If more than four harmonics had been used, the resemblance of T and S would have been still closer.
Fig. 63. Even a Square Wave Is a Sum of Sinusoids
But it takes an infinite set (series) of sinusoids to add up to a square wave. Four of the components are shown here as A, B, C, D. By themselves these four add up to form the wave T; if more than four had been used, the resemblance of T to S would have been still closer. Source: Martin Joos, Acoustic Phonetics, Monograph No. 23, Language suppl., Linguistic Society of America, 24, 2 (April-June 1948).
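How quickly the sum approaches the square wave can be seen in a short calculation. The sketch below uses the standard Fourier series of a unit square wave, in which the nth odd harmonic has amplitude 1/n; that amplitude rule and the fundamental of 100 cps are assumptions supplied here, not values read from Fig. 63.

```python
import math

def square_wave_partial_sum(t, fundamental, terms):
    """Sum of the first `terms` odd harmonics of a unit square wave."""
    total = 0.0
    for k in range(terms):
        n = 2 * k + 1                       # odd harmonics 1, 3, 5, 7, ...
        total += math.sin(2 * math.pi * n * fundamental * t) / n
    return 4 / math.pi * total

f0 = 100.0                                  # assumed fundamental, cps
quarter_period = 1 / (4 * f0)               # the ideal square wave equals 1 here
four_terms = square_wave_partial_sum(quarter_period, f0, 4)
many_terms = square_wave_partial_sum(quarter_period, f0, 100)
print(round(four_terms, 3), round(many_terms, 3))
```

With four components the sum at the middle of the positive half-cycle is still several percent short of the square wave; with a hundred components it is within one percent.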
Joos holds that the rectangular wave of Fig. 64 crudely resembles the glottal tone, “especially of the soprano voice.”8 This wave is shown here to illustrate that it also can be analyzed into sinusoids. Fig. 64 shows a rectangular wave with the positive part (compression) lasting one-tenth of the period, the negative part (rarefaction) nine-tenths. Its components have the amplitude shown in Fig. 64.
For comparison with the wave displayed in Fig. 64 the glottal wave form in Fig. 67 of this chapter illustrates the latest contemporary theory of laryngeal vibrations advanced by the Swedish acoustician Gunnar Fant.9 (Note that the conformation of this wave form is different from that of Fig. 64. The variation is the result of the length of time the glottis is open.)
Fig. 64. A Rectangular Wave and Its Components
A rectangular wave can also be analyzed into sinusoids, but the analysis is not shown here as it was in Fig. 63. Instead it is represented by the table printed below. This rectangular wave resembles the glottal tone, especially of a soprano voice. Source: Martin Joos, Acoustic Phonetics, Monograph No. 23, Language suppl., Linguistic Society of America, 24, 2 (April-June 1948).
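The relative strengths of the components of such a wave can be estimated from the standard Fourier result for a rectangular pulse train. The sketch below is an illustration under that assumption, with the positive part occupying one-tenth of the period; its values are computed here and are not copied from the table in Fig. 64.

```python
import math

def rectangular_harmonic_amplitude(n, duty=0.1, height=1.0):
    """Amplitude of the n-th harmonic of a rectangular wave whose positive
    part occupies the fraction `duty` of each period."""
    return abs(2 * height / (n * math.pi) * math.sin(n * math.pi * duty))

first = rectangular_harmonic_amplitude(1)
for n in range(1, 13):
    relative = rectangular_harmonic_amplitude(n) / first
    print(f"harmonic {n:2d}: {relative:.3f} of the fundamental")
```

In this idealized model the lower harmonics fall off only gradually, and the tenth harmonic vanishes because the positive part lasts exactly one-tenth of the period.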
Free Vibration
“Free vibration is the vibration of a freely elastic system in its own natural period after all driving forces have been removed from the system.”10 A weight fastened to the end of a spring when displaced from its position of rest and then released will oscillate vertically, and because of its elasticity the spring will overshoot its position of rest on both the ascent and descent. Such oscillations will continue until the energy, created by the weight’s displacement, has been dissipated by the friction of the spring and the force of gravity. All motion will stop at the weight’s original point of rest.
When the system is undisturbed by outside forces, the number of up and down oscillations of the weight per second is called the natural frequency, and the time taken by one oscillation is the natural period. The frequency depends upon the mass of the weight and the tension of the spring.
A pendulum that is raised from its point of rest and permitted to swing will pass its point of rest in each oscillation for some time and will swing in its own free period without the aid of additional driving force. But eventually the force of gravity and friction of the air will bring it to its point of rest. The frequency (number of swings per second) depends only upon the length of the pendulum.
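Both dependencies can be written as formulas. The sketch below uses the standard expressions for an ideal weight on a spring and for a simple pendulum swinging through small arcs; it assumes that the "tension" of the spring corresponds to its stiffness, and the numerical values are chosen only for illustration.

```python
import math

def spring_frequency(mass_kg, stiffness_n_per_m):
    """Natural frequency (cps) of a weight on an ideal spring."""
    return math.sqrt(stiffness_n_per_m / mass_kg) / (2 * math.pi)

def pendulum_frequency(length_m, gravity=9.80665):
    """Natural frequency (cps) of a simple pendulum swinging through small arcs."""
    return math.sqrt(gravity / length_m) / (2 * math.pi)

print(spring_frequency(0.5, 200.0))     # heavier weight or softer spring -> lower
print(pendulum_frequency(1.0))          # longer pendulum -> lower
print(pendulum_frequency(4.0))          # four times the length, half the frequency
```

Neither formula involves the size of the initial displacement, which is why each system can be said to have a natural frequency of its own.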
A tuning fork will vibrate for several seconds but will soon stop. Every elastic system has its own natural period of vibration. If it is set into motion and no other force is applied, it will vibrate only in its particular frequency.
Maintained Vibration
Maintained vibration may be defined as that type in which repeated impulses are given to the vibrator so that it continues to vibrate in its own natural frequency. The pendulum swinging in free vibration may be made to swing indefinitely when a small force is added to it precisely as it changes direction. This additional force exerted by the escape mechanism balances the factors which normally would cause the pendulum to cease swinging.
Leaping upon a springboard will force it to vibrate in the natural frequency and the movement will build up until it reaches great amplitude. If the application of the force is not properly timed, little movement will result.
If this principle of timing the applied force is translated into the realm of sound, and a pulsating force is applied to an elastic system so that the period of the pulsating force corresponds to the natural period of the system, the sound is amplified considerably. When one blows over the lip of a bottle, the hiss resulting has many overtones within it. One overtone just matches the natural frequency of the bottle cavity and causes the air within the bottle to oscillate in and out with regularity and the bottle note sounds loud and clear practically without hiss. A resonator which improves the efficiency of the generated sound at its own frequency is reinforcing the vibrated sound; the phenomenon is called reinforcement. The tone from the bottle will continue to sound so long as the air within it is activated.
Forced Vibration
If the frequencies of the vibration and that of the resonance system do not coincide, one of two things may happen, depending upon the relationship of the vibrator to the resonator, the materials used, and the size of the resonator and vibrator.
1. The vibrating source may change its natural frequency so that it vibrates more or less in the period of the system. Most wind instruments fall into this group, the pitch of the note depending not on the natural frequency of the reed but upon the natural frequency of the air column to which it is coupled. The vibrator serves to agitate the air column within the resonating tube. The resonator alone determines the pitch of the note. This means that the natural frequency of the resonating tube must be varied for every note. In most wind instruments this change is made by lengthening the tube by depressing valves or closing finger holes which alter the natural frequency of the tube.
2. The vibrating source may compel the system to vibrate at a frequency related to its own, regardless of the natural frequency of the resonator. If the stem of a vibrating tuning fork is held on a flat surface, that surface will vibrate as long as the fork is held there, and it will vibrate in the frequency of the fork. The diaphragm of a loudspeaker is forced to vibrate with the frequency of the electric current. As soon as the current is stopped the vibration will cease. All methods of recording and reproducing sound depend upon forced vibration. The ear drum is forced into vibration by the propagated sound wave. When a stringed instrument is played, the sound comes mainly from the body of the instrument and partly from the contained air. The strings of these vibrating instruments of a particular mass, length, and tension give the pitch of the note and are then forced to vibrate.
The resonance of the human voice is representative of forced vibration; to understand it, one must know something about the selectivity of resonance and how the partials of the tone are affected.
Resonance occurs when a resonator is in tune with its vibrator—when compression from the sound source coincides with compression from the resonator and when the rarefaction from the sound source coincides with rarefaction of the resonator. The characteristic tendency of a resonator is to amplify or reinforce those tones with which it is compatible and to dampen or eliminate those tones with which it is not compatible. Thus, the quality of a vocalized tone depends upon those partials passed (reinforced) by the resonating system.
The resonator itself is usually a cavity or a sounding board; both respond well to certain frequencies. Such a resonator makes possible a more effective transfer of energy from the vibrator producing the sound to the surrounding air, with greater energy than if the resonator were not present. [However, resonators do not add energy; they make a more efficient transfer of vibrations from the vibrating source. The efficiency of transfer gives the illusion of amplification of the sound.]
The selectivity of a sounding board is fixed. Its size, mass and surface determine the amplification of sound. The selectivity of cavities, however, varies as the orifice, cavity area and surface vary.11
Cavity Resonators
A single resonator is able to respond to either sympathetic or forced vibration.
A resonator vibrating in tune with the frequency of its generator is in sympathetic vibration. If the generated sound is withdrawn, the resonator will continue to vibrate in its own natural free period.
A resonator that is not in tune with its generator but is forced to vibrate by the generated sound is in forced vibration. When the generated sound has been removed the resonator continues to vibrate for a very short period, which is determined by the damping factor. (This period represents the time rate of amplitude decay.)
When a resonator is compelled to vibrate, forced vibration affects both the resonator and the generator. The resultant frequency is somewhere between the natural frequencies of both resonator and generator. The influence of the generator upon the resonator is stronger than that of the resonator upon the generator.
Resonators of brass and other metals may be sharply tuned to respond to only a few frequencies. A Helmholtz resonator—spherical and of brass—is such a resonator. Soft-walled resonators, which are fibrous or flesh, can respond to many different frequencies and are able to reproduce many different gradations of tone quality.
THE DISSIPATION OF SOUND ENERGY—DAMPING
When a vibrator imparts its energy to an elastic medium the energy appears in the vibration of that medium. A loss of energy will steadily decrease the amplitude of the vibrations. Some energy is expended in overcoming the resistance of the air to the vibrating material, and some is dissipated into the air as sound waves. As the energy is lost, the vibrations cease to be heard. The time rate at which the energy is dissipated in a vibrating body is known as damping. This time rate varies considerably.
When a tuning fork is struck and held in the air it will vibrate for a long time. The vibrating fork loses its energy slowly and therefore is lightly damped. If it is placed against a solid object, the energy of the vibrating body is dissipated rapidly in setting the solid object in motion, and it is heavily damped. A heavily damped vibrator will amplify the original vibrations to a greater extent than will a lightly damped vibrator.
Damping and Cavity Resonance
A sharply tuned resonator is selective; that is, it will respond only to a few frequencies within the complex tone emitted by the vibrator. A sharply tuned resonator is lightly damped and requires a longer period for its vibrations to build up and die away, but the sharply tuned resonator will amplify the vibrations emitted by the generator to a greater extent than will the heavily damped resonator.
A resonator that is not sharply tuned is not selective and responds to many frequencies emitted by the vibrator with little or no amplification of them. Such a resonator permits the vibrations to build up and dissipate rapidly and is heavily damped. When the resonator is so heavily damped that the sound ceases the moment that it is energized the resonator is said to be critically damped. For this reason an acoustic system can be sharply tuned only by reducing the damping, and light damping can be obtained only at the expense of selectivity.12 The human resonance cavities are almost critically damped. That is, vibrations of air cease immediately and also start vibrating with equal promptness. The cavities of the vocal tract are responsive to a very wide range of frequencies.
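The contrast between a sharply tuned and a heavily damped resonator can be illustrated with the textbook model of a single driven resonator. The sketch below is an assumption-laden illustration rather than a model of any real cavity: the quality factor `q_factor` (high for light damping, low for heavy damping) and the natural frequency of 500 cps are introduced here and do not appear in the text.

```python
import math

def resonator_gain(drive_cps, natural_cps, q_factor):
    """Steady-state amplitude gain of a simple driven resonator,
    relative to its response to a very slow driving force."""
    r = drive_cps / natural_cps
    return 1 / math.sqrt((1 - r * r) ** 2 + (r / q_factor) ** 2)

natural = 500.0                         # assumed natural frequency, cps
for label, q in (("lightly damped", 50.0), ("heavily damped", 2.0)):
    on_tune = resonator_gain(natural, natural, q)
    off_tune = resonator_gain(1.1 * natural, natural, q)
    print(f"{label}: gain {on_tune:.1f} in tune, {off_tune:.2f} when 10% sharp")
```

In this model the lightly damped resonator amplifies far more when it is in tune but loses most of that advantage a short distance from its natural frequency, while the heavily damped resonator responds weakly but almost equally to both tones.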
In the human resonating system damping depends upon the size of the cavity. As the oral and pharyngeal cavities are enlarged, walls of soft flesh become more taut, damping decreases and vowels become brighter. The veiled-tone effect used for singing piano passages is accomplished by enlarging the cavity and relaxing the walls to permit them to become soft and flaccid, thus causing the cavity to become less responsive to the high partials within the laryngeal and cavity tones. (See “Closed and Open Tones,” p. 89.)
In considering the variability of cavity resonators, one should remember that the effects obtained with the spherical brass Helmholtz resonator (Fig. 65) are analogous to those obtained with the head, mouth, pharynx, and nasal cavities. These cavities are selective according to the following laws:
1. The larger the cavity of the resonator, the lower the frequency to which it will resonate, provided the dimensions of the aperture and the neck remain constant.
Fig. 65. Helmholtz Resonators
The pair of resonators on the left have the same size aperture and length of neck. The size of the cavity varies. The resonators in the middle have the same size cavity and length of neck, but the caliber of the aperture varies. The resonators on the right have the same size cavity and caliber of aperture. The variation is in the length of neck.
2. The larger the aperture of the cavity, the higher the frequency to which it will resonate, provided the cavity and length of neck remain constant.
3. The longer the neck of the aperture, the lower the frequency to which it will resonate, provided the dimensions of the aperture and cavity remain constant.
4. The softer the texture of the cavity walls, the more the cavity emphasizes low overtones.
Generally, the longer and narrower the neck of a cavity resonator, the lower the pitch to which it will respond, and because damping is reduced to a minimum, the more selective its tuning will be. Such a resonator will respond at its maximum effect to a very small deviation in frequency. Conversely, the shorter and wider the neck, the higher the frequency to which it will respond, and because damping is heavy, the wider the range of frequencies to which the resonator will respond significantly.
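The first three laws agree with the usual approximate formula for a rigid-walled Helmholtz resonator, in which the frequency varies with the square root of the aperture area divided by the product of cavity volume and neck length. The sketch below applies that formula with illustrative dimensions; it ignores the end correction normally added to the neck length and does not attempt to model the fourth law (soft walls).

```python
import math

def helmholtz_frequency(cavity_volume, aperture_area, neck_length,
                        speed_of_sound=343.0):
    """Approximate resonance (cps) of a rigid-walled cavity with a neck.
    SI units; `neck_length` is treated as the effective length, so the
    usual end correction is ignored in this sketch."""
    return (speed_of_sound / (2 * math.pi)) * math.sqrt(
        aperture_area / (cavity_volume * neck_length))

base = helmholtz_frequency(1.0e-3, 3.0e-4, 0.05)          # 1 litre, 3 cm^2, 5 cm
larger_cavity = helmholtz_frequency(2.0e-3, 3.0e-4, 0.05)
larger_aperture = helmholtz_frequency(1.0e-3, 6.0e-4, 0.05)
longer_neck = helmholtz_frequency(1.0e-3, 3.0e-4, 0.10)
print(base, larger_cavity, larger_aperture, longer_neck)
```

Doubling the cavity volume or the neck length lowers the computed frequency, and doubling the aperture area raises it, as laws 1 through 3 state.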
These laws bear directly on lip-rounding in singing; the presence or absence of neck (lips) and aperture (mouth opening) directly affects the accurate reproduction of any of the basic vowels in singing.
The quality or timbre of a vocalized sound is determined by the number, intensity, and distribution of the partials which compose it. Such a relationship depends upon the following:
1. The nature of the laryngeal vibration.
2. The changes made in that sound as it passes through the resonating system.
The combination of these two factors causes some partials to be reinforced, others to be weakened. Awareness of instrumental differences and ability to identify the voice of an individual depend upon the constant fluctuation of the overtone structure within either sound.
Speech forms in song are produced by constantly shifting the cavities through articulation, thereby causing variations in the overtone structure which are recognized as vowels.
Nature of Laryngeal Vibrations
Every vocalized tone is a complex one in which the high frequencies (harmonics) are simple multiples of the lowest or fundamental frequency. If the fundamental is 440 cps, its second harmonic will be 440 multiplied by two or 880 cps; the third harmonic will be 440 multiplied by three or 1,320 cps, and so on. Since a complex sound is composed of many simple sounds, the vocal spectrum can become quite involved, as it is in the examples of vocal acoustic spectra in Figs. 66 and 67.
Fant’s theory of laryngeal vibrations is the one most frequently accepted by leading voice scientists and phoneticians. His statement regarding “the voice source” follows:
The primary source of energy for the production of voiced sounds is the contraction of the respiratory muscles resulting in an over pressure in the lungs and thus in an air flow that is periodically varied in magnitude owing to the opening and closing of the valve folds over each fundamental voice period. The acoustic function of these folds should not be regarded as an analogy to vibrating membranes.
All sound waves are pressure variations, and the varying pressures resulting from bursts of air escaping through the glottis are the glottal sound waves, not the vibration of the mechanical folds themselves. Actually, pressure variations cause a modulation of the respiratory air stream but do not generate sound oscillations of a significant magnitude by a direct conversion of mechanical vibration to sound.13
Fig. 66. Harmonic Analysis of Sung Sounds Showing the Vowel Spectra [i], [ɑ], [u].
The wave forms in Fig. 67 are calculated wave forms and spectrum envelopes of the voice source. Curves I and II have been derived from area measures taken from the Bell Telephone Laboratory film of the vocal cord. Curve III has been derived from the glottis area versus time pictures given by Chiba and Kajiyama. The wave marked o is a wave adapted by Gunnar Fant.
Fig. 67. Glottal Wave Form Suggested by Fant
Source: Gunnar M. C. Fant, Acoustic Theory of Speech Production (The Hague, Netherlands: Mouton & Co., 1960).
The wave forms in Fig. 69 are photographs taken from a dual beam oscilloscope of laryngeal sounds recorded by two microphones.14 The first, a probe microphone, was placed within the vestibule near the level of the vocal folds (top wave). The second, a condenser microphone, was placed at a position six inches in front of the mouth (bottom wave). The probe microphone tube was cut at one-centimeter intervals, and six vowels were recorded and photographed at each interval (Fig. 69).
These wave conformations at the laryngeal level within the vestibule greatly resemble the glottal wave forms in Fig. 67. Note the complexity of the phonated sound after it has passed through the resonating cavities of the pharynx and mouth. Note also that each of these waves seems to be a compression wave with no rarefaction phase. This wave form can be described as an addition of sine waves, each of lesser amplitude, emitted by the air stream of the glottis. Since these glottal wave forms are the sum of their sinusoids, apparently each of these sine waves has contributed to the glottal wave conformation by reinforcement, interference, and phase.
Fig. 68. A Study of the Function of the Primary Resonating Area
Source: Kenneth L. Davis, Jr., “A Study of the Function of the Primary Resonating Areas and Their Relation to the Third Formant in the Singing Tone” (Mus. D. dissertation, School of Music, Indiana University, 1964).
Fig. 69. Wave Forms of Glottal Sounds
A, Vowel [i], Probe Microphone—16 cm; B, Vowel [e], Probe Microphone—10 cm; C, Vowel [ɑ], Probe Microphone—16 cm; D, Vowel [ɑ], Probe Microphone—10 cm; E, Vowel [u], Probe Microphone—10 cm; F, Vowel [u], Probe Microphone—10 cm. Source: Kenneth L. Davis, Jr., “A Study of the Function of the Primary Resonating Areas and Their Relation to the Third Formant in the Singing Tone” (Mus. D. dissertation, School of Music, Indiana University, 1964).
Such a glottal tone can be described by its spectrum which may be schematized as in Fig. 70. This laryngeal spectrum does not conform to any particular sound; the ear would interpret it as some kind of buzz. The length of each vertical line indicates the strength of each overtone.
Fig. 70. An Explanation of Formants
A, the wave shape of a pulse train; B, a spectrum of a train of short pulses; C, the frequency response of a simple resonator; D and E, the wave shape and the spectrum, respectively, of a sound wave produced when a series of pulses, like those in A, are applied to a resonator whose frequency response is shown in C. Source: The Speech Chain by P. B. Denes and E. N. Pinson, published by Bell Telephone Laboratories, Inc. (1963).
FORMANTS AND THE HUMAN RESONATING SYSTEM
The Creation of Vowel Formants
The human resonating system, described physiologically on page 73, is a series of air-filled cavities which act as resonators. Each cavity has its own natural frequency of vibration and will respond with a sinusoidal tone when it is excited by the same frequency emitted by the vibrator.
As the sound passes through the resonating cavities of the throat and mouth, the profile of the spectrum changes, since each cavity resonates to some of the tones in the spectrum more readily than to others and each adds its own characteristics to such tones. This reinforcement gives the partials greater energy at the point of cavity resonance. These points of greater energy are called formants.
In passing through the resonating system of the throat and mouth, the partials in the harmonic sequence do not change from their original location in the tonal spectrum; rather, some are strengthened and reinforced by cavity resonance, while others are weakened or damped out (Fig. 71).
Fig. 71. Vowel Sound [i]
The laryngeal tonal spectrum is altered by cavity resonances creating formants.
The values of the natural frequencies of the resonating cavities within the vocal tract are determined by their shape; as a result, as the shape of the tract is altered the amplitudes of the partials within the spectrum will be greater at different frequencies. Thus, every configuration of the total vocal tract has its own set of characteristic formant frequencies which gives to the laryngeal sound a particular vowel quality.
The resonance frequency of any cavity is not necessarily equal to the frequency of any partial of the spectrum. The frequencies of the formants need not be the same as those of the partials, but they may coincide. The formant frequencies are determined by the configuration of the total vocal tract as a series of resonators while the partials within the spectrum are determined by the vocal folds. The vocal tract and the vocal folds can change independently of each other.
When the cavities of the throat and mouth remain fixed, a laryngeal sound of lower pitch may be passed through the system, and the vowel characteristic will remain the same because the energy within each formant has not varied. Only the fundamental will be lower since it is determined by the frequency of the vibration of the vocal folds (Fig. 72).
Fig. 72. Octave Drop, Bass Voice Singing [ε] on Two Pitches, C-523 to C-261
Fig. 73. The Wave Shape and Corresponding Spectra of the Vowel [ɑ]
A, Pronounced at a Vocal Fold Frequency of 90 cps; B, pronounced at a Vocal Fold Frequency of 150 cps. Source: The Speech Chain by P. B. Denes and E. N. Pinson, published by Bell Telephone Laboratories, Inc. (1963).
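The independence of formants and partials described above can be illustrated with a short calculation. This is a sketch under stated assumptions: the two vocal fold frequencies (90 and 150 cps) are those named in Fig. 73, but the formant values of 700 and 1,100 cps are hypothetical stand-ins for an [ɑ]-like tract and are not taken from the source. The function name is likewise invented for the example.

```python
def nearest_partial(fundamental, formant):
    """Return (harmonic number, frequency) of the partial closest to a formant frequency."""
    k = max(1, round(formant / fundamental))
    return k, k * fundamental

# Hypothetical formant frequencies for a fixed [ɑ]-like tract shape (assumed values).
ASSUMED_FORMANTS = (700.0, 1100.0)

for fundamental in (90.0, 150.0):  # vocal fold frequencies named in Fig. 73
    for formant in ASSUMED_FORMANTS:
        k, freq = nearest_partial(fundamental, formant)
        print(f"f0 {fundamental:5.0f} cps: formant {formant:6.0f} cps "
              f"is served by partial {k} at {freq:6.0f} cps")
```

In this sketch the formant frequencies stay where the tract shape puts them, while the harmonic number and exact frequency of the partial that falls nearest each formant change with the pitch. No partial need coincide exactly with a formant.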
Cavity-Coupling Defined
A coupled system is composed of a generator and any number of resonators that could vibrate independently if they were not joined together. If any part of the coupled system is set into vibration, another part of the system will be forced to vibrate. This second resonator modifies the vibration of the first. If a third resonator is added, it will exert a periodic force upon the other two, thus modifying the total system.15
Whether a system is tightly coupled or loosely coupled depends upon the degree of constriction at the orifices which join such a system.
A loosely coupled system is one in which the influence exerted by one part of the system upon another is small. In such a system each resonator tends to vibrate near its own natural frequency. Such a condition is evidenced in the tense vowels [i], [ē], [o], and [u], when both back and front orifices are small and the cavities are divided into clearly defined resonating areas (Fig. 74A).
Fig. 74A. A Closed Vowel, [u], Loosely Coupled
Fig. 74B. An Open Vowel, [a], Tightly Coupled
A tightly coupled system is one in which a strong influence is exerted upon one part of the system by another. This system, displaying the characteristics of a single resonating system, is observable in all vowels except the high frontals as they migrate from closed to open position in both crescendo and pitch change. As the back orifice is enlarged, the cavity-coupling tends to become a single or tightly coupled system rather than a loosely coupled system; [a] is the most tightly coupled of all phonemes since it is made with little or no tongue stricture (central orifice) and an almost neutral lip position (front orifice) (Fig. 74B). The physiological analysis of each vowel sound is given in Chapter Ten.
Origin of the Formant in a Coupled System
The experiments of Paget and Russell16 have suggested that the mouth and the pharynx must be considered a double resonator. This concept, first called multiple resonance by Wheatstone in 1834, was investigated further by Helmholtz17 when he concluded that the vowels [ɑ] as in calm, [ͻ] as in more, [u] as in who, and [ɒ] as in not resulted from single resonances but that the vowels [æ] as in hat, [ε] as in men, and [i] as in eat resulted from double resonances—that is, from two separate notes. One is produced in the cavity behind the tongue, and the other is caused by a constriction at the mid-point of the tongue and the hard palate as in a bottle with a narrow neck.
In 1890 and 1930 Paget recorded the fact that R. J. Lloyd suggested that every vowel derives one chief resonance from the anterior part of its articulation and another from the posterior or pharyngeal part. Paget made many models of double resonance cavities to determine the nature of cavity coupling and finally concluded:
By this time the principle of vowel formation was becoming clear, that there must be, in effect, two resonating cavities, each producing a separate resonance; provided these resonances are correct, neither the exact shape, cross section or length of the cavities are material. The two cavities behave like two Helmholtz resonators joined together in series.18
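Paget's comparison can be given a rough numerical form. The sketch below uses the standard textbook formula for a single, isolated Helmholtz resonator, f = (c / 2π) · √(A / (V · L)), and ignores end corrections at the neck; it does not model the interaction of two resonators joined in series. The dimensions in the example are hypothetical and do not represent any measured vocal tract.

```python
import math

def helmholtz_frequency(neck_area, neck_length, volume, speed_of_sound=343.0):
    """Resonance frequency (cps) of one idealized Helmholtz resonator, SI units.

    neck_area in m^2, neck_length in m, volume in m^3; end correction is ignored.
    """
    return (speed_of_sound / (2.0 * math.pi)) * math.sqrt(neck_area / (volume * neck_length))

# Hypothetical cavity: 1 cm^2 neck opening, 2 cm neck length, 50 cm^3 volume.
small_cavity = helmholtz_frequency(1e-4, 0.02, 50e-6)
# Same neck, doubled volume: the resonance falls.
large_cavity = helmholtz_frequency(1e-4, 0.02, 100e-6)

print(f"smaller cavity: {small_cavity:.0f} cps")
print(f"larger cavity:  {large_cavity:.0f} cps")
```

The formula depends only on the neck area, neck length, and enclosed volume, which is consistent with Paget's remark that the exact shape of the cavity is not material so long as the resonance is correct.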
D. C. Miller in 1916 confirmed Helmholtz’s views that some of the vowels are the result of double resonance.19 G. O. Russell made a radiographic study in 1928 of the physiological causes of vowel quality differences.20 This study revealed that, as the tongue moves into position to form a vowel sound, the pharynx and the oral cavity are altered. In this manner, the vocal tract may be regarded as a coupled resonator joined by a resistance point or neck.
Dunn’s statement, “The vocal tract may be thought of as a series of cylindrical sections, with acoustical mass and compliance uniformly distributed along each section,”21 is illustrated in Fig. 75 in which the oral and pharyngeal cavities, with their connecting passage, are compared with joined cylindrical sections. Each change in the size and shape of the vocal tract corresponds to a change in the acoustical system. Spectrograms reveal cavity resonances of three formants for each vowel.
Fig. 75. Vocal Tract Configurations Showing Cavity Relationships and Spectra for the Vowels [i], [ͻ], and [U]
Many scholars have regarded the vocal cavities as double resonators, but Crandall22 is probably the first to explain theoretically the relationships between the two characteristic frequencies of a vowel and the shape and size of the vocal cavity on the basis of the double resonator theory.
According to Crandall, each sung vocal tone has ten to fifteen prominent formants; only the first two are needed to analyze the vowel sound. In just which area of the phonatory system these formants originate has long been a question among physicists. Most of them have assumed that the lowest (first) formant developed in the larger pharyngeal area, and the higher (second) formant, in the smaller oral cavity.
Most voice scientists tend to oversimplify the function of the coupled resonating system in the human voice. Studies by Fant23 have substantiated Dunn’s earlier conclusion24 that for the frontal vowels the first formant originates in the pharynx and the second formant originates in the oral cavity.
For the vowel [ɑ], the first formant depends equally upon the front and back cavity, while the second formant depends more upon the front than upon the back cavity.
For the vowels [o] and [u], the first formant depends more upon the lip section of the front cavity than on the tongue section. The second formants of these two vowels are more dependent upon the front cavity than the back. The third formant of all the vowels depends upon the area in front of the tongue constriction or medial orifice.
In the case of the very open vowels a division of the cavity system above the larynx into separate parts loses some of its significance. The more open a vowel is, the less well the separate parts of the vocal cavities act as separate resonators.25
In studying complex sounds, one needs to know the intensity and frequency of each component or partial within a given sound at a specific point of time.
The spectrograph or sonograph26 is an analyzing instrument that (a) provides an instantaneous record of the composition of any selected sound at any certain instant and (b) isolates and records only those frequencies that are essential to the recognition and understanding of the sung sound.
Fig. 76. Spectrogram Displaying Harmonic Analysis from 0 to 8,000 cps of the Vowel [e]
The six formants are caused by various cavity formations in the phonatory tract. The undulated pattern is caused by the vibrato.
When narrow pass-band filtering is used, the resulting spectrogram consists of a series of parallel lines, and each line represents an overtone or partial. Analysis can be made of complex tones up to 8,000 cps (Fig. 76).
When a complex vocalized sound is passed through a sonograph and broad-band filtering is used, the parallel lines burned by the stylus upon the paper are much broader and will reveal a concentration of energy among certain groups of partials in the tonal spectrum (Figs. 77-80).
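The difference between the two filter settings can be sketched by counting how many partials fall inside one pass band. The bandwidths used below (45 cps for the narrow band and 300 cps for the broad band) are assumed values for illustration; the source does not state the settings of the instrument. The function name and the 100 cps fundamental are also hypothetical.

```python
import math

def partials_inside(fundamental, center, bandwidth):
    """Count the partials of `fundamental` lying within a pass band centered at `center`."""
    low = center - bandwidth / 2.0
    high = center + bandwidth / 2.0
    first = max(1, math.ceil(low / fundamental))  # lowest harmonic number inside the band
    last = math.floor(high / fundamental)         # highest harmonic number inside the band
    return max(0, last - first + 1)

# Assumed settings: a tone sung at 100 cps, examined around 1,000 cps.
narrow = partials_inside(100.0, 1000.0, 45.0)
broad = partials_inside(100.0, 1000.0, 300.0)

print(f"narrow band (45 cps): {narrow} partial(s) in the band")
print(f"broad band (300 cps): {broad} partial(s) in the band")
```

When the band is narrower than the spacing of the partials, each partial is traced separately; when it is wider, neighboring partials merge and the trace shows where energy is concentrated rather than the individual partials.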
PHYSIOLOGICAL CHANGE AND FORMANT MOVEMENT
The formants of women have a higher frequency than those of men, and the formants of children are higher than those of both men and women. However, the ratio of frequency change between each phoneme is the same for men, women, and children.
When a phoneme is altered during the singing of a single pitch, the physiological cause for formant movement is indicated on the formant charts (Figs. 77-80).
First Formant Movement
In singing the high frontal vowels [i], [ē], and [I] the first formant is lowered by forming a firm tongue occlusion which creates a secure inner orifice. If this inner orifice disintegrates, the first formant is raised (Fig. 77).
To sing the vowel [i] near the basic vowel position, within the stable vowel pitch range (high voice F1 350, F2 2,500; low voice F1 300, F2 2,100), a singer needs to establish a firm occlusion of the tongue blade to the alveolar ridge. In controlling the position of the first formant for the vowel [i], the firmness of the inner orifice is more important than spreading the lips or increasing the size of the pharyngeal cavity.
In the middle and low frontal vowels the first formant movement depends upon the volume of the pharyngeal cavity and lip-spreading.
To sing the vowels [e], [ε], [æ], [a], [ɑ], within the suggested frequency areas, the first formant is raised by progressively lowering the tongue and jaw positions as indicated in Chapter Nine. This action creates a change in the coupled system by decreasing the volume of the pharyngeal cavity and increasing the volume of the oral cavity (Fig. 78).
In the back vowels the first formant movement depends upon lip-rounding. To sing the vowels [ͻ], [o], [U], and [u] within the suggested frequency areas, the first formant is lowered by progressively increasing the lip-rounding (Fig. 78).
Fig. 77A. Spectrogram Showing Rise in Frequency of Formant One
Fig. 77B. Formant Chart Showing Rise in Frequency of Formant One
When tongue occlusion is lessened in passing from a closed tense to a less tense [i], positions on the formant chart indicate vowel [i] migration caused by this action.
Fig. 78A. Spectrogram Showing Formant Movements of Frontal Vowels
Fig. 78B. Formant Chart Showing Formant Movements of Frontal Vowels
Fig. 78C. Spectrogram Showing Formant Movement of Back Vowels
Fig. 78D. Formant Chart Showing Formant Movements of Back Vowels
Fig. 79A. Spectrogram Showing the Effect of Lip-Rounding and Tongue-Backing upon the Frontal Vowels
Fig. 79B. Formant Chart Showing the Effect of Lip-Rounding and Tongue-Backing upon the Frontal Vowels
The Second Formant Movement
Increasing the volume of the oral cavity lowers the second formant, and decreasing it raises the second formant. To increase the volume, the singer lowers either the tongue or the jaw. The basic vowel and quality alternate positions established in this book are formed by moving both tongue and jaw to attain maximum sonority of a particular cavity adjustment for singing (Fig. 78A).
Lip-rounding lowers the second formant, and lip-spreading raises it. Tongue-backing lowers the second formant, and tongue-fronting raises it.
The obvious formant movement when the vowel [i] migrates to the French vowel [y] is caused by a change from lip-spreading for [i] to lip-rounding for [y]. As the vowel [y] migrates to the vowel [u], the formant alterations are achieved by tongue-backing and lip-rounding. The same lip and tongue movements cause the formant alterations of [e] to [ø] to [ô]; [ε] to [œ] to [ͻ] (Figs. 77A, 77B, 79A, and 79B). “A direct relation exists between the second formant lowering and the front cavity lengthening.”27
Pitch Change and Vowel Formant Movement
Vowel formants, characteristic concentrations of energy within the tonal spectrum, are found in limited regions of frequency within which they must remain. If the pitch to be sung is changed so much as to go above the regions of formant recognition, the resultant sound will be heard as some other vowel.28 The alteration of the coupled system which accompanies such a change is described in the kinesiologic analysis of each phoneme in Chapter Ten. (The areas for vowel stability are shown in Fig. 102, p. 230.)
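The limit described here can be illustrated with a small calculation. This sketch is an assumption-laden simplification: it treats the first formant as a single frequency (the value of 350 cps given earlier for the high-voice [i]) rather than as the region charted in Fig. 102, and the function name and the sung pitches are chosen only for the example.

```python
def partials_at_or_below(fundamental, formant):
    """Number of partials of the sung pitch lying at or below a formant frequency."""
    return int(formant // fundamental)

FIRST_FORMANT_I = 350.0  # high-voice F1 for [i] as quoted in the text; used as a single point

for fundamental in (175.0, 262.0, 440.0):  # hypothetical sung pitches in cps
    count = partials_at_or_below(fundamental, FIRST_FORMANT_I)
    if count == 0:
        print(f"{fundamental:5.0f} cps: the fundamental lies above the first formant")
    else:
        print(f"{fundamental:5.0f} cps: {count} partial(s) lie at or below the first formant")
```

Once the fundamental itself rises above the formant frequency, no partial remains at or below it to be reinforced there, which is one way of picturing why the vowel can no longer be heard in its original form.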
Fig. 80. Spectrograms of the Basic and Quality Alternate Vowels as Sung by Male Voice and Female Voice
Intensity and Vowel Formant Movement
All frontal vowels may be sung with more extreme lip-spreading and with an extreme frontal tongue position at pianissimo and piano levels when they are sung within their limited pitch areas (Fig. 102, p. 230). As intensity is increased above the mezzo piano level, so must the size of the cavity and orifice increase. The cavity coupling is changed somewhat by moving the point of constriction of the tongue and palate slightly downward and backward, thereby creating a larger frontal cavity and less constriction at the inner orifice. This action causes the first formant to rise and the second formant to lower, and the ear hears this cavity alteration as a migration of each frontal vowel to or toward a phoneme directly below it, depending upon the increase of intensity and the lowering of the mandible (Fig. 22 and Records 1-4, [i] Band 7, [a] Band 13).
This same action occurs with the singing of the back vowels. An increase in the volume of the cavities and the separation of the point of constriction at the inner orifice cause these vowels to migrate toward the neutral vowels [ʌ] and [U] (Fig. 103 and Records 1-4, [ͻ] Band 14, [u] Band 19).
___________________
* Some teachers and singers who seek answers to vocal problems may consider the physical area a subject of erudition. Those who regard it thus do so only because they fail to realize that every physical law herein presented may be interpreted demonstrably as a function within the singing act.
A singer’s only product is sound and his thorough understanding of the laws of sound will permit him to form more meaningful concepts that are directly related to the adjustment of his resonating system as he sings. For the teacher, such understanding will provide a diagnostic implement that will lend his judgment stability. Both singer and teacher must ultimately translate these laws of sound into the psychophysical area of disciplined sensation.
* Since phase is a consideration of comparison only when more than one wave is present, the term independent variable represents the “reference” wave of energy with which all other waves are compared.