When I was giving a semiotic definition of language, I introduced the notion of the phoneme, which I defined as a diacritic used to distinguish different signs. According to this definition, a phoneme is the first term of the binary relation ‘diacritic of’, whose second term is a sign. The notion of the phoneme was motivated by the analysis of natural languages as a very large sign system. It was shown that no natural language can do without a limited number of diacritics necessary for the production of an unlimited number of signs.
The notion of the phoneme was introduced without an analysis of its relation to the notion of sound. At that stage of our discussion, it was enough to justify the notion of the phoneme by characterizing it merely as a diacritic without regard to the most difficult problem of identity, which is central to modern phonology.
It is now time to tackle the problem of identity in phonology. I will start with an analysis of speech sounds that does not presuppose the notion of the phoneme. I will try to show, step by step, how the solution to the problem of identity of speech sounds calls for the notion of the phoneme. In this way we will get new insights into the nature of objects denoted by the term phoneme. A new definition of the phoneme will be given, and it will reflect a deeper understanding of this intriguing notion. The new definition will supplement the definition of the phoneme as a diacritic.
Let us now turn to the analysis of speech.
Phoneticians recognize that speakers segment speech into a series of discrete elements called phonetic segments, speech sounds, or phones. A phonetic study of a language provides an inventory and description of the phonetic segments that occur in it.
Any phonetic transcription is an abstraction from the physical reality of speech. Although speakers segment speech into discrete elements, speech is actually a continuous flow. That is why the same speech sound is pronounced differently on different occasions. Compare the words tea and too. The initial consonants in these words are denoted by the same letter t, but they are very different. Try to pronounce only the initial consonants. In one case, the articulation of the sound t takes place in a vocal tract that is already prepared to make the front vowel i; in the other, the vocal tract is already prepared for the back vowel u. Because of the different shapes of the vocal tract, the two articulations produce different acoustic results.
Take acoustic records of the pronunciation of the words tea and too on speech spectrograms. You will not be able to find a point of an abrupt change from the consonant t to the vowel i or from t to u. It will be impossible to draw a line between the two successive sounds and say that everything before is a pure consonant and everything after is a pure vowel.
Because the sound flow is a continuum, there is a great intrinsic variability among speech sounds. Speech sounds are modified by their environments. That is a basic universal assumption concerning the nature of speech sounds, which I call the Principle of Speech Sound Variability.
The question is, How do we know which sounds are the same in a given language? What does it mean to say that two speech sounds are the same? What does it mean, for example, to say that the word tight has the same initial and final consonant? These two consonants are not the same: the initial consonant is aspirated, while the final one is not aspirated. In order to represent this difference, two symbols are needed: th for the aspirated consonant and t- for the unaspirated consonant.
Actually, speakers of a given language classify sounds as the same or different unconsciously. One might assume that this classification is based on a natural physical similarity between sounds: two sounds are the same if they belong to the same sound type, and they are different if they belong to different sound types. But this assumption would conflict with the phenomenon that speakers of different languages may classify similar sets of sounds in quite different ways. Consider the following set of sounds:

(1) ph p- th t- kh k-

The superscripts h and - designate the aspirated and nonaspirated character of the sounds, respectively.
Speakers of English classify these sounds as three sound types: p, t, and k, while speakers of Eastern Armenian classify all six sounds as different.1 Similarly, speakers of Old Greek would distinguish six sounds here. These different ways of classification are reflected in the alphabets of these languages: while English has only three letters for this set of sounds, Eastern Armenian and Old Greek have six letters corresponding to the distinction of the aspirated and nonaspirated stops.
It should be noted that in spite of all their imperfections, the alphabets of various languages more or less faithfully reflect classifications of sounds as the same or different, made unconsciously by speakers of these languages.
The phenomenon that sounds belonging to similar sound sets may be classified differently by speakers of different languages, I call phonic relativity. This phenomenon needs to be explained. To explain phonic relativity is to answer the question, What determines identity of speech sounds in a given language?
The problem of the identity of speech sounds is one of the most profound problems of theoretical linguistics. It is especially complicated, involving a series of difficulties that cannot be overcome by conventional methods of abstraction. That is why it cannot so far be considered fully resolved.
In what follows I will present my own solution of the problem of the identity of speech sounds; I then will compare it with the solutions proposed by others and explain why I consider them inadequate.
To explain phonic relativity, we must search for some fundamental property or properties of language from which this phenomenon could be deduced.
In our search we must rely on the Principle of Differentiation of Signs. This principle holds that two different linguistic signs must be differentiated by different sequences of diacritics. For example, din and bin differ by alternating sounds d and b, din and den differ by alternating sounds i and e, din and dim differ by alternating sounds n and m. In order to demonstrate that a given set of sounds can be used to distinguish different signs, we have to fix a set of words that are distinguished from one another by alternative choices of one sound from this set. Consider the following two sets of words:
These sets of words show that the sounds ph, th, kh before the vowel u and the sounds p-, t-, k- after the vowel i are used to distinguish different words.
Let me now introduce the notion of the position of sounds. The position of a sound is its place in a sound sequence determined by its environment. Thus, in pin the position of p is its place before the vowel i, the position of n is its place after the vowel i, the position of i is its place between p and n. So in (2) we have two positions for the stops: before the vowel u in the first set of words and after the vowel i in the second set of words:
In (3) we have two ordered sets of sounds, ph:th:kh and p-:t-:k-. The relations that order the sets of sounds in each position I call concrete distinctive oppositions, and the terms of these relations I call concrete phonemes. Using this terminology, we can say that in (3) we have two ordered sets of concrete phonemes: ph, th, kh and p-, t-, k-.
The essential difference between concrete speech sounds viewed as merely physical and speech sounds viewed as concrete phonemes is that concrete phonemes are terms of concrete relations. The label concrete distinctive oppositions characterizes a specific semiotic function of these relations as relevant differences between signs.
Various properties of concrete distinctive oppositions I will call concrete distinctive features, and I will describe them in articulatory terms. Thus, in (3) the distinctive oppositions can be characterized as follows:
In this diagram the symbol L means ‘labial’, A means ‘alveolar’, V means ‘velar’, and the superscripts mean that the concrete distinctive features in the two positions are different because of different environments.
Any set of concrete phonemes ordered by concrete distinctive oppositions I call a paradigmatic class of concrete phonemes.
Having introduced new notions, I must restate the problem of the identity of speech sounds in terms of these notions. Now the problem has to be restated as follows: What determines the identity of concrete phonemes that occur in different positions?
Since concrete phonemes are relational entities, an identity of concrete phonemes must be a counterpart of an identity of concrete distinctive oppositions by which paradigmatic classes of concrete phonemes are ordered. Therefore, the problem of the identity of concrete phonemes amounts to the problem, What determines the identity of the structure (isomorphism) of paradigmatic classes of concrete phonemes?
In coming to grips with this problem, we discover the Law of Phonemic Identity:
Two paradigmatic classes of concrete phonemes Ki and Kj are identical if their relational structure can be put into a one-one correspondence, so that to each concrete phoneme x of Ki, there corresponds a concrete phoneme y of Kj, and to each concrete distinctive opposition r of Ki, there corresponds a concrete distinctive opposition s of Kj and vice versa. There is a one-one correspondence between concrete phonemes x and y, and between concrete distinctive oppositions r and s, if the difference between x and y and between r and s is reducible solely to the effect of positional variation.
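The Law of Phonemic Identity can be given a minimal formal sketch. The following is my own illustrative formalization, not the author's notation: each paradigmatic class is represented as an ordered list of concrete phonemes, and identity of classes amounts to the existence of a one-one pairing, with each residual difference attributed to positional variation.

```python
# A hedged sketch of the Law of Phonemic Identity: two paradigmatic classes
# of concrete phonemes are identical when their members (and hence their
# distinctive oppositions) stand in one-one correspondence.

def identify_classes(class_i, class_j):
    """Return the one-one pairing of concrete phonemes, or None if the two
    paradigmatic classes cannot be put into one-one correspondence."""
    if len(class_i) != len(class_j):
        return None  # no one-one correspondence is possible
    # Pair members in their paradigmatic order; each pair (x, y) is assumed
    # to differ only by the effect of positional variation.
    return list(zip(class_i, class_j))

# English stops: aspirated before u, unaspirated after i.
pairing = identify_classes(["ph", "th", "kh"], ["p-", "t-", "k-"])
```

The sketch deliberately leaves the judgment "reducible solely to positional variation" outside the code: that judgment is the empirical content of the law, and the pairing merely records its consequence.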
Let us now try to explain the phenomenon of phonic relativity by the Law of Phonemic Identity.
Let us turn back to (1). Why is the set of six sounds presented in (1) classified as three sounds in English but as six sounds in Eastern Armenian or Old Greek? Because in English ph and p-, th and t-, kh and k- are concrete phonemes belonging to two paradigms, and the differences between the terms of these pairs are reducible to the effect of positional variation: the stops are aspirated in syllable-initial position and nonaspirated elsewhere. In other words, the difference between the aspirated and nonaspirated stops in English is irrelevant from the point of view of their distinctive function. In Eastern Armenian and Old Greek, however, all six sounds are concrete phonemes that occur in the same positions, that is, belong to the same paradigm, so that the distinction between the aspirated and nonaspirated stops is relevant in these languages. Here is an example demonstrating the relevance of this distinction in Armenian:
(5) phajt ‘stick’ : pajt ‘horseshoe’
    thoy ‘let’ : toy ‘line’
    khujr ‘sister’ : kujr ‘blind’
The foregoing shows that the problem of identity of speech sounds is solved by splitting the concept of the speech sounds into two different concepts: the speech sound proper and the concrete phoneme. The identity of speech sounds is determined by their physical analysis, while the identity of concrete phonemes is based on the Law of Phonemic Identity.
Concrete phonemes are physical entities, but the identity of concrete phonemes is logically independent of their physical properties. To demonstrate this, we have to show that the Law of Phonemic Identity predicts situations where different concrete phonemes are identical with respect to their physical properties and, conversely, where two identical concrete phonemes are completely different with respect to their physical properties. One of the possible situations predicted by this law is this: Assume concrete phonemes A, B, and C in position Pi and concrete phonemes B, C, and D in position Pj, so that the difference between A of Pi and B of Pj, between B of Pi and C of Pj, and between C of Pi and D of Pj is conditioned solely by the differences between the positional environments. Under the Law of Phonemic Identity, there is a one-one correspondence between the concrete phonemes, as shown by the following diagram:
The sign ↔ in the diagram means ‘one-one correspondence’. This hypothetical situation predicted by the law lays bare the essential semiotic properties of concrete phonemes and concrete distinctive oppositions, so that they can be seen in an ideal, pure form. What we see is that two different phonemes, B in position Pi and B in position Pj (or C in Pi and C in Pj), are identical with respect to their physical properties, and, conversely, two identical concrete phonemes, A in Pi and B in Pj (or B in Pi and C in Pj, or C in Pi and D in Pj), are completely different with respect to their physical properties. By the same token, two different distinctive oppositions, B:C in Pi and B:C in Pj, are identical with respect to their physical properties, and, conversely, two identical distinctive oppositions, A:B in Pi and B:C in Pj (or B:C in Pi and C:D in Pj), are completely different with respect to their physical properties.
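The shifted paradigm just described can be sketched in a few lines of code. This is an illustrative toy (the labels A, B, C, D are the text's hypothetical phonemes): functional identity follows the one-one correspondence between positions, not the physical label.

```python
# Hypothetical shifted paradigm: A, B, C in position Pi; B, C, D in position Pj.
# Functional identity is fixed by the correspondence, not by physical likeness.

def correspondence(pi_class, pj_class):
    # Pair the concrete phonemes of the two positions in paradigmatic order.
    return dict(zip(pi_class, pj_class))

pairs = correspondence(["A", "B", "C"], ["B", "C", "D"])

# A in Pi is functionally identical with B in Pj although they differ physically,
functionally_identical = pairs["A"] == "B"
# while B in Pi and B in Pj are physically alike but functionally distinct.
functionally_distinct = pairs["B"] != "B"
```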
This abstract hypothetical situation may have counterparts in concrete languages. Consider the following example from Danish.
The Danish sounds t and d occur in syllable-initial positions, for example, in tag ‘roof’ and dag ‘day’. In syllable-final positions, however, d and ð occur, rather than t and d, for example, in had ‘hat’ and hað ‘hate’. The distribution of the sounds can be presented by the following diagram:
How must we classify these sounds?
From a physical point of view, there are three different sounds here: t, d, and ð. From a functional point of view, however, we must suggest a different classification. Since from a functional point of view sounds are always concrete phonemes, that is, terms of some concrete distinctive opposition, the first question we must ask is this: What are the concrete distinctive oppositions whose terms are the above sounds as concrete phonemes?
In order to answer this question, we must bear in mind that concrete distinctive oppositions are characterized by some set of concrete distinctive features, in terms of some acoustic-articulatory properties.
Let us start with the syllable-initial position. The difference between t and d is that the first sound is tense and the second sound is lax. So, the opposition t:d in syllable-initial position is characterized by the opposition of the phonetic features tense:lax. The sound t possesses the phonetic feature tense, and the sound d possesses the phonetic feature lax, and these features characterize the concrete distinctive opposition of both sounds as concrete phonemes.
Let us now consider the sounds in the syllable-final position. What concrete distinctive features characterize the opposition d:ð in the syllable-final position?
The sound d is lax, and the sound ð is fricative. So, the opposition d:ð is characterized by the opposition of the phonetic features lax:fricative.
Let us compare the opposition tense:lax in the syllable-initial position with the opposition lax:fricative in the syllable-final position. We discover an identity of structure between the two oppositions: namely, the relation of the sound d to the sound ð in the syllable-final position is the same as the relation of the sound t to the sound d in the syllable-initial position. The point is that in the syllable-final position, ð can be considered lax with respect to d. As a matter of fact, although d is lax with respect to tense t, d must be considered tense with respect to ð.
We see that from a functional point of view, the distinctive feature lax in the syllable-final position is the same as the distinctive feature tense in the syllable-initial position, and the distinctive feature fricative in the syllable-final position is the same as the distinctive feature lax in the syllable-initial position.
If this analysis of the functional identity of the distinctive features is correct, we arrive at the following conclusion: from a functional point of view, d in the syllable-final position is identical with t in the syllable-initial position, and ð in the syllable-final position is identical with d in the syllable-initial position. The functional identity of the concrete phonemes is a counterpart of the functional identity of their distinctive features (that is, a counterpart of the functional identity of the distinctive oppositions in the syllable-final and the syllable-initial positions).
The results of our analysis can be represented by the following diagram:
The arrows in the diagram mean the functional identity of the respective concrete phonemes.
We can present a similar diagram for the concrete distinctive features of the concrete phonemes presented in (8):
The symbol T means ‘tense’, the symbol L means ‘lax’, and the symbol F means ‘fricative’. The arrows in the diagram mean the functional identity of the respective concrete distinctive features.
We see that the difference between t and d and between d and ð in the syllable-initial and the syllable-final positions is reducible to the effect of the physical variations of these concrete phonemes in these two positions. By the same token, the difference between the distinctive feature tense and the distinctive feature lax and between the distinctive feature lax and the distinctive feature fricative in the syllable-initial and the syllable-final positions is reducible to positional variations.
Here is one more example, which lays bare the complete logical independence between the functional and physical identity of concrete phonemes.
Consider the shift in the degree of the openness of vowels in Danish, as seen in the following diagram:
There are four contrastive degrees of vowel openness in Danish. The four front unrounded vowels are normally realized (indicated in the diagram as before n) as i, e, ɛ, a. Before r, however, there occurs a uniform shift by one degree of openness, yielding the paradigmatic class e, ɛ, a, ɑ. While this shift modifies the physical characteristics of each vowel, the relations between the vowels remain constant. The vowel i in the position before n and the vowel e in the position before r are functionally identical, because they are the highest vowels in their respective positions. The vowel e before n and the vowel e before r are functionally not identical, because the first e has the second degree of openness, while the second e has the first degree of openness. We arrive at the conclusion that the differences between concrete phonemes that are in one-one correspondence are reducible solely to the effects of the physical variations of the concrete phonemes in the positions before n and r.
An analysis of all the relations between vowels in this diagram shows a complete logical independence between the functional and physical identity of concrete phonemes.
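The vowel example can be sketched as rank-based identity. In this toy formalization (mine, not the author's; I write the lowest vowel before r as ɑ simply to keep the four symbols distinct), functional identity is identity of rank, i.e. of degree of openness, within each position's paradigmatic class, not identity of physical vowel quality.

```python
# Danish front unrounded vowels: functional identity = same degree of openness
# (same rank) within the paradigmatic class of each position.

before_n = ["i", "e", "ɛ", "a"]   # degrees of openness 1-4 before n
before_r = ["e", "ɛ", "a", "ɑ"]   # shifted down one degree before r

def functionally_identical(v_before_n, v_before_r):
    # Two concrete phonemes are functionally identical when they occupy the
    # same rank in their respective paradigmatic classes.
    return before_n.index(v_before_n) == before_r.index(v_before_r)

# i before n and e before r are both the highest vowels in their positions:
same = functionally_identical("i", "e")
# e before n and e before r are physically alike but occupy different ranks:
different = functionally_identical("e", "e")
```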
The discussion of the problem of the identity of sounds has shown that the notion of the identity of sounds must be split into two logically independent notions: the functional identity of sounds as concrete phonemes and the physical identity of sounds as sounds proper. By the same token, the notion of the identity of phonetic features must be split into two logically independent notions: the functional identity of phonetic features as concrete distinctive features and the physical identity of phonetic features as phonetic features proper.
In accordance with the distinction between the functional and physical identity of sounds, we distinguish two types of objects: classes of functionally identical concrete phonemes and classes of physically identical sounds. Both classes of sounds are logically independent.
Just as we must distinguish between the functional and the physical identity of sounds, so we must distinguish between the functional and the physical identity of phonetic features. In accordance with the distinction between the functional and physical identity of phonetic features, we must distinguish between two types of objects: classes of functionally identical phonetic features as concrete distinctive features, and classes of physically identical phonetic features as phonetic features proper. Both classes of phonetic features are logically independent.
Having introduced important distinctions between concrete phonological and phonetic objects, properties, and relations, and classes of these objects, properties, and relations, I must now introduce terms corresponding to these distinctions.
A class of concrete phonemes I will call an abstract phoneme.
A class of concrete distinctive oppositions will be called an abstract distinctive opposition.
A class of concrete speech sounds will be called an abstract speech sound, or sound type.
A class of concrete phonetic features will be called an abstract phonetic feature.
As was said above, distinctive features are characteristics of distinctive oppositions in terms of articulatory or acoustic labels. Of course, these labels should be understood not in a physical but in a functional sense.
Our system of phonology has two levels: physical and functional. It can be represented by the following diagram:
It is convenient to treat a class of identical concrete phonemes as occurrences of one and the same phoneme in different positions. In order to do that, we must use a special type of abstraction, which can be called an identifying idealization. It is interesting to compare our approach with what the Russian mathematician Markov writes on the concept of the abstract letter:
The possibility of establishing identity between letters allows us, by way of abstraction by identification, to set up the notion of an abstract letter. The application of that abstraction consists in a given case in that we actually speak of two identical letters as of one and the same letter. For example, instead of saying that in the word ‘identical’ two letters enter, which are identical with ‘i’, we say the letter ‘i’ enters in the word ‘identical’ twice. Here we have set up the concept of an abstract letter ‘i’ and we consider concrete letters, identical to ‘i’, as representatives of this one abstract letter. Abstract letters are letters considered with a precision up to identity. (Markov, 1954: 7-8)
Under identifying idealization, I will treat identical concrete phonemes as occurrences of one and the same abstract phoneme—or simply of one and the same phoneme. For example, the word tɪtɪleɪt ‘titillate’ will be treated as consisting of three occurrences of the phoneme t, two occurrences of the phoneme ɪ, one occurrence of the phoneme l, and one occurrence of the phoneme eɪ.
By the same token, I will treat identical concrete distinctive oppositions or distinctive features as occurrences of one and the same abstract distinctive opposition or abstract distinctive feature—or simply of one and the same distinctive opposition or distinctive feature.
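Identifying idealization, as abstraction by identification in Markov's sense, can be sketched as counting occurrences through an identification table. The table below is an assumed toy mapping, not the author's: it identifies the aspirated and unaspirated concrete phonemes th and t- as occurrences of one abstract phoneme t.

```python
# Abstraction by identification: concrete phonemes are counted as occurrences
# of one and the same abstract phoneme.
from collections import Counter

# Assumed toy identification table mapping concrete phonemes to abstract ones.
identify = {"th": "t", "t-": "t", "ɪ": "ɪ", "l": "l", "eɪ": "eɪ"}

# A rough segmentation of 'titillate' into concrete phonemes (illustrative).
word = ["th", "ɪ", "t-", "ɪ", "l", "eɪ", "t-"]

occurrences = Counter(identify[seg] for seg in word)
# three occurrences of t, two of ɪ, one of l, one of eɪ
```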
A question arises: How must we denote classes of identical concrete phonemes? Consider, for instance, the example from Danish in (8). There we have two classes of identical concrete phonemes: t,d and d,ð. How must we designate these classes?
There are two methods of designating any class of elements: either we invent special symbols to designate classes of elements, or we choose any element in a given class and regard it as representing this class.
In our case, we can either invent new symbols to designate a given class of identical concrete phonemes, say ‘t’ or ‘d’, or regard one of the sounds as the representative of this class. The second method is more convenient. We choose a concrete phoneme minimally dependent on its environment as representing a given class of concrete phonemes. Since in our example the concrete phonemes in the syllable-initial position are less dependent on their environment than those in the syllable-final position, we regard the concrete phonemes t and d in the syllable-initial position as representing the classes of concrete phonemes.
Analogous reasoning applies to the concrete distinctive features tense (T), lax (L), and fricative (F) in the Danish example (9). There we have two classes of functionally identical phonetic features: [T,L] and [L,F]. To denote these classes, we can either invent new symbols or regard the concrete distinctive features T and L in position P1 as representatives of the respective classes of concrete distinctive features.
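The representative-element method can be sketched for the Danish stops. In this toy encoding (mine), a class of functionally identical concrete phonemes is a set of (position, sound) pairs, and the class is named by its member in the syllable-initial position, the position least dependent on the environment.

```python
# Classes of functionally identical concrete phonemes in the Danish example:
# each class is named by its syllable-initial representative.

classes = [
    {("initial", "t"), ("final", "d")},   # the abstract phoneme 't'
    {("initial", "d"), ("final", "ð")},   # the abstract phoneme 'd'
]

def representative(cls):
    # Choose the concrete phoneme occurring in the syllable-initial position,
    # since it is minimally dependent on its environment.
    return next(sound for pos, sound in cls if pos == "initial")

names = [representative(c) for c in classes]
```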
Let us compare the definition of the phoneme as a diacritic and the definition of the phoneme as a class of identical concrete phonemes, that is, concrete speech sounds that function as terms of distinctive oppositions. These definitions complement each other. The first definition abstracts from the speech flow as a physical continuum with variable sounds; it defines the phoneme without regard to its realization in the speech flow. The second definition is an operational definition, which establishes the relation of the phoneme as a diacritic to speech sounds.
Our discussion has shown that speech sounds possess a dual nature—physical and functional. Every speech sound splits into two objects: a physical object and a functional object. Both types of objects are logically independent. Functional and physical identity are complementary and at the same time mutually exclusive concepts. A sound as a member of a class of physically identical objects is a sound proper, but as a member of a class of functionally identical objects, it is a fundamentally new object—a phoneme. As a unity of contradictory objects—functional and physical—a speech sound is a combined object: a sound/diacritic. This combined object I call the phoneme.
What is the logical relation between the concept of the phoneme and the concept of the sound?
These are heterogeneous concepts that belong to different abstraction levels: functional and physical. Between the phoneme and the sound there can exist neither a relation of class membership nor a relation of class inclusion, since the phoneme and the sound belong to basically different abstraction levels. The relation between the phoneme and the sound is similar not to the relationship between the general notion of the table and individual tables, but to the relation between such notions as, let us say, the notion of commodity to the notion of product. Between the commodity and the product there is neither a relation of class membership nor a relation of class inclusion. The commodity and the product relate to each other as notions that characterize the dual nature of an object. There is an analogous relation between the wave and the particle in physics.
The notion of the sound/diacritic reminds one of such notions as the wave/particle in physics, the commodity as a unity of use-value and exchange-value, that is, as the use-value/exchange-value in economics, etc. I use the metaphorical term centaur concepts to denote this type of notion, because the structure of these notions is reminiscent of centaurs, the fabulous creatures of Greek mythology, half men and half horses.
The requirement of the strict distinction between the functional and physical identity as logically independent notions I call the two-level principle. The theory based on this principle I call the two-level theory of phonology.
Besides the strict distinction between functional and physical identity, we should also distinguish between and not confuse the functional and physical segmentation of sounds. Research in the field of spectrographic analysis of speech sounds has demonstrated that there exists a natural segmentation of the speech flow into sounds independent of functional segmentation. Thus, G. Fant argues that linguistic criteria are irrelevant for determining the division of the speech flow into sounds. He writes:
A basic problem in speech analysis is the degree of divisibility of the speech wave. The most common approach has been to start with the linguistic criteria in terms of a phonemic transcription and to impose this as a basis for division. By a systematic comparison of the sound patterns of different contexts it is possible to make general statements as to what sound features are typical for a particular phoneme. Such studies are necessary, but in order to avoid ambiguities in the labeling of the successive observable sound units, such an investigation should be preceded by an initial process of segmentation and description of the speech wave on the basis of its physical structure, and in terms of phonetic rather than phonemic units. There is no need for extensive investigation of this type in order to establish an objective basis for dealing with phonetic problems. . . . Detailed studies of this type lead to a description of the speech wave as a succession of sound units with fairly distinctly defined boundaries. (Fant, 1960: 21-22)
Some specialists in acoustic phonetics disagree with Fant and claim that the speech flow is continuous; they argue that there are no distinct boundaries between speech sounds: sounds blend into one another, creating transitions from one sound to another. From the standpoint of phonology, the important fact is that speakers segment the speech flow into discrete units, no matter what its acoustic properties are. The perception of speakers is what matters for phonology. If we accept the claim that the speech flow is continuous, then we have to explain the mechanism of perception that makes speakers segment speech flow into discrete units. Obviously, the problem needs further investigation. The starting point for phonology is the segmentation of the speech flow performed by the speakers, which in fact nobody questions. I apply the term physical segmentation of the speech flow to the segmentation performed by speakers. The functional segmentation of the speech flow contrasts with the physical segmentation of the speech flow. The aim of phonology is to discover the conditions of the functional segmentation of the speech flow.
A concrete phoneme is a minimal part of a linguistic sign that is a term of a distinctive opposition. A segmentation of a linguistic sign into a sequence of concrete phonemes may coincide with its segmentation into a sequence of sounds; in other words, a functional segmentation may coincide with a physical segmentation. But depending on the structure of a language, a minimal part of a linguistic sign may consist of two or more sounds. A sequence of two sounds may be interpreted as one or two concrete phonemes.
Two basic conditions can be stated that determine the functional interpretation of a sequence of two sounds:
CONDITION 1. Given an opposition of sequences of two sounds X͡Y:XY, such that X͡Y is cohesive and XY noncohesive, X͡Y constitutes one concrete phoneme and XY constitutes two concrete phonemes.
Here are some examples of the opposition cohesive:noncohesive. A comparison of the Polish words czy [t͡ši] ‘if’ and trzy [t|ši] ‘three’ shows that the only difference between these words is the homogeneous articulatory movement in t͡š in the former and its absence in the latter. We encounter the same opposition in the following pairs of Polish words:
(1) Czech [t͡šex] ‘Czech’ : trzech [t|šex] ‘of three’
    czysta [t͡šista] ‘clean’ : trzysta [t|šista] ‘three hundred’
    paczy [pat͡ši] ‘it warps’ : patrzy [pat|ši] ‘it looks’
    oczyma [ot͡šima] ‘through the eyes’ : otrzyma [ot|šima] ‘will obtain’
In these examples physical cohesiveness and noncohesiveness are used as distinctive features. Only the opposition of cohesiveness and noncohesiveness is phonologically relevant. If this opposition did not exist, we could not use the physical cohesiveness of a sound sequence XY as a basis for concluding that these two sounds constitute an instance of a single phoneme; nor could we use the physical noncohesiveness of X|Y as evidence that X and Y are instances of two different phonemes ‘X’ and ‘Y’. By doing so we would substitute phonetic criteria for phonological ones, which is unacceptable.
The opposition between physical cohesiveness and noncohesiveness can also be called the opposition of strong and weak cohesion between sounds or the opposition of homogeneity and nonhomogeneity of sound sequences. This opposition is frequently created by the absence and presence of a morphemic or word boundary. Consider the following word pairs from Polish:
|(2)||podrzeć [pod͡žeć] : podżegacz [pod-žegatš]|
| |‘to tear up’ : ‘instigator’|
| |ocaleć [ot͡saleć] : odsadzać [ot-sadzać]|
| |‘to remain whole’ : ‘to drive back’|
| |dzwon [d͡zvon] : podzwrotnikowy [pod-zvrotńikovi]|
| |‘bell’ : ‘subtropical’|
The sound sequences dž, ts, dz in the words podrzeć, ocaleć, dzwon, owing to the strong cohesion between the elements d and ž, t and s, d and z, constitute instances of the phonemes ǯ, c, and ʒ. The sound sequences dž, ts, dz in the words podżegacz, odsadzać, podzwrotnikowy, owing to the weak cohesion between the elements d and ž, t and s, d and z, constitute instances of the phonemic sequences dž, ts, and dz.
In English the sound sequence tš occurs inside morphemes, but t and š can also be divided by a morpheme boundary or word boundary. Compare cheap, rich, butcher with court-ship, night-shift, nut-shell. That creates an opposition cohesiveness:noncohesiveness. Hence, the sound sequence tš inside morphemes must be interpreted as a single phoneme č.
CONDITION 2. If in the sound sequence XY either X or Y is not interchangeable with other sounds or zero, then XY is an instance of a single phoneme ‘Z’. If both X and Y are interchangeable with other sounds or zero, then X is an instance of the phoneme ‘X’ and Y is an instance of the phoneme ‘Y’.
Let us clarify the two conditions using a concrete example, the sound sequence tš. According to these rules, this sequence can have the following interpretation depending on the various languages it occurs in.
If the sequence tš is semiotically indivisible, i.e., if in a given language neither t nor š can be substituted for, then the sequence tš is an instance of the phoneme ‘č’. Such a case is purely hypothetical, since, to my knowledge, there exists no natural language in which both elements of the sequence would be noninterchangeable with other sounds or zero. However, there are languages in which t is not interchangeable with other sounds or zero, as, for instance, in the Spanish word chino [tšino] ‘Chinese’. Eliminating š, we obtain tino [tino] ‘tact’. The sound t, however, cannot be eliminated, since in Spanish words of the type šino are not admissible. That means that in Spanish tš is an instance of the single phoneme ‘č’.
It should be emphasized that the degree of articulatory or acoustic unity of tš in English, German, or Spanish has absolutely no significance for determining whether tš constitutes an instance of one or two phonemes. What matters is only phonological interchangeability. That applies to any sound sequence. For instance, such sound sequences as st, ps, bl, au, rs, etc. could be interpreted as instances of single phonemes in one language but instances of two phonemes in another language.
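The commutation test behind Condition 2 can be sketched computationally. The following is a minimal sketch, not a procedure from the text: it assumes, for illustration, that a language can be approximated by a finite set of admissible word shapes; the tiny Spanish-like lexicon and the sound inventory below are invented toy data.

```python
# A toy commutation test for Condition 2 (illustrative sketch only).

def interchangeable(word, index, sounds, lexicon):
    """True if the sound at `index` commutes with another sound or with zero."""
    for alt in sounds:
        candidate = word[:index] + alt + word[index + 1:]
        if candidate != word and candidate in lexicon:
            return True
    # commutation with zero: delete the sound
    return word[:index] + word[index + 1:] in lexicon

def interpret_sequence(word, i, sounds, lexicon):
    """Condition 2: one phoneme unless both members of the sequence commute."""
    x_ok = interchangeable(word, i, sounds, lexicon)
    y_ok = interchangeable(word, i + 1, sounds, lexicon)
    return "two phonemes" if (x_ok and y_ok) else "one phoneme"

# Spanish-like sample: tino exists (š commutes with zero), but nothing can
# replace t (words of the type šino are inadmissible), so tš is one phoneme.
spanish_words = {"tšino", "tino"}
print(interpret_sequence("tšino", 0, "ptksšmn", spanish_words))  # → one phoneme
```

On this toy lexicon the sequence tš comes out monophonematic, mirroring the Spanish discussion above; a noncommutable member blocks the biphonematic reading.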
There is also a condition for biphonematic interpretation of one sound. It can be formulated as follows:
CONDITION 3. Given a sound sequence XY in a position P1, and a sound Z in a position P2, if XY is interpreted as two phonemes ‘X’ and ‘Y’, and if the difference between XY and Z is reducible solely to the effect of positional variation, then Z must be considered a realization of the phoneme sequence ‘X’‘Y’.
For example, in many Polish dialects nasalized vowels occur only before fricative consonants, and the sequences vowel + nasal consonant only before stops, before vowels, and at the end of the word. Since the sequences vowel + nasal consonant are interpreted as sequences of two phonemes, and since the difference between the sequences vowel + nasal consonant and the nasal vowels is reducible solely to the effect of positional variation, the nasal vowels must be interpreted as realizations of the sequences vowel + nasal consonant.
In certain positions in the speech flow, some sounds do not occur. For instance, in English words the voiced stops b, d, and g do not occur in the position after s. Consider the English words spill [spɪl], still [stɪl], and scold [skould]. Since b, d, and g do not occur in the positions after s, words of the type sbɪl, sdɪl, and sgould are impossible in English.
The nonoccurrence of some sounds in certain positions gives rise to a problem: can counterparts x1, x2, . . . , xn of sounds y1, y2, . . . , yn, not occurring in given positions, be considered functionally equivalent to sounds x1, x2, . . . , xn in other positions where sounds y1, y2, . . . , yn do occur? In our case: can counterparts p, t, k of sounds b, d, g, not occurring in positions after s, be considered functionally equivalent to p, t, k in other positions where b, d, g do occur?
To answer these questions, we must base our analysis of the given situation upon the concept of phonological opposition. In positions where the sounds y1, y2, . . . , yn occur, the sounds x1, x2, . . . , xn are members of the phonological oppositions x1:y1, x2:y2, . . . , xn:yn, and in positions where sounds y1, y2, . . . , yn do not occur, sounds x1, x2, . . . , xn cannot be members of these oppositions. The phonetic feature distinguishing sounds x1, x2, . . . , xn from sounds y1, y2, . . . , yn must be redundant if sounds x1, x2, . . . , xn do not participate in phonological oppositions x1:y1, x2:y2, . . . , xn:yn. Therefore, sounds x1, x2, . . . , xn in positions where sounds y1, y2, . . . , yn do not occur cannot be functionally equivalent to sounds x1, x2, . . . , xn in positions where sounds y1, y2, . . . , yn do occur.
In our English example, p,t,k in spill, still, and scold do not participate in the phonological oppositions p:b,t:d,k:g, because b,d,g do not occur in the position after s; therefore, the voiceless phonetic feature distinguishing p,t,k from b,d,g is redundant here.
p,t,k in spill, still, and scold cannot be considered functionally equivalent to p,t,k in pill [pɪl], till [tɪl], and cold [kould] because p,t,k in initial positions before vowels participate in the phonological oppositions p:b,t:d,k:g (compare pill: bill, till:dill, cold:gold), and so their voiceless phonetic feature serves as a distinctive feature contrasting with the voiced phonetic feature.
The voiceless phonetic feature of p,t,k in initial positions before vowels is distinctive, while the same phonetic feature of p,t,k in the position after s is not distinctive. In the position after s we have a merger of phonemes that are in opposition into one phonological unit. This merger is called neutralization, and the resulting new phonological unit is called an archiphoneme. The neutralization of p:b,t:d,k:g results in the archiphonemes P,T,K, or <p/b>, <t/d>, <k/g> in another notation.
In our English example, the nonvoiced stops p,t,k will be denoted by p,t,k, the voiced stops will be denoted by b,d,g and the archiphonemes will be denoted by P,T,K. The words pill, till, and cold will be transcribed pɪl, tɪl, and kould, and the words spill, still, and scold will be transcribed sPɪl, sTɪl, and sKould.
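The transcription convention just described can be rendered as a small sketch. The phone-list representation below is my simplification for illustration, not a notation from the text:

```python
# Sketch: rewrite English voiceless stops after s as archiphonemes P, T, K,
# reflecting the neutralization of p:b, t:d, k:g in that position.

ARCHIPHONEME = {"p": "P", "t": "T", "k": "K"}

def transcribe(phones):
    """Replace a stop immediately after s with its archiphoneme symbol."""
    out = []
    for i, ph in enumerate(phones):
        if ph in ARCHIPHONEME and i > 0 and phones[i - 1] == "s":
            out.append(ARCHIPHONEME[ph])
        else:
            out.append(ph)
    return "".join(out)

print(transcribe(list("spɪl")))  # → sPɪl
print(transcribe(list("pɪl")))   # → pɪl
```

The rule is purely positional: the same phone p is emitted as p before a vowel but as P after s, exactly because the voicing opposition is unavailable there.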
The concepts of neutralization and the archiphoneme presuppose each other and can in no way be disassociated from one another.
As another example of neutralization, I will take a neutralization of voiced: voiceless consonants in Russian. Consider the Russian words kust [kust] ‘bush’ and gust [gust] ‘thick’; here we encounter the opposition k:g. The distinctive features that characterize this opposition are voicelessness and voicedness. But at the end of words this opposition is neutralized. Thus, in the words luk ‘bow’ and lug ‘meadow’ the otherwise contrasting phonemes k and g merge into a single archiphoneme K (or <k/g> in another notation), and as a result these words merge into a single homonymous sign luK.
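The word-final merger can be sketched in the same style. Extending the k:g case to the other Russian stop pairs is my assumption for illustration, not a claim from the text:

```python
# Sketch: Russian word-final neutralization of the voiced:voiceless opposition.
# Word-finally the opposed phonemes merge into an archiphoneme, so luk 'bow'
# and lug 'meadow' share one phonological shape, luK.

FINAL_MERGER = {"k": "K", "g": "K", "t": "T", "d": "T", "p": "P", "b": "P"}

def phonological_shape(word):
    """Replace a word-final stop with its archiphoneme symbol."""
    if word and word[-1] in FINAL_MERGER:
        return word[:-1] + FINAL_MERGER[word[-1]]
    return word

print(phonological_shape("luk"), phonological_shape("lug"))  # → luK luK
```

The two distinct words collapse onto a single output, which is precisely the homonymy luK discussed above.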
In our Russian example, the archiphoneme K in luK might seem to presuppose a phonetically conditioned alternation g:k in such forms as luga ‘of the meadow’ and luk ‘meadow’. Actually, the nondistinctiveness of certain phonetic features of sounds in positions where their counterparts do not occur is completely independent of the existence or nonexistence of phonetically conditioned alternations. Thus, the voiceless phonetic feature of p,t,k in the English words spill, still, and scold is nondistinctive, although these sounds do not alternate with b,d,g in other positions.
It goes without saying that the notion of phonetically conditioned alternations of phonemes cannot be dispensed with in phonology, because these alternations account for homonyms resulting from the merger of the phonemic shapes of words. But in the interest of a revealing account of this notion and that of distinctiveness, we should strictly distinguish between and not confuse these notions.
Let us now consider alternative solutions to our problem.
Linguists who include physical similarity of sounds in the definition of the phoneme consider any sort of restriction on the occurrence of sounds in particular positions to be a case of defective distribution, which does not affect the physical constitution of phonemes that occur in these positions. Therefore, these phonemes are considered identical with physically similar phonemes that occur in other positions. Thus, adherents of this approach would view p,t,k in English spɪl, stɪl, and skould as equivalent to p,t,k in pɪl, tɪl, and kould.
What should we think of this approach? It is true that the nonoccurrence of sounds in any position does not affect the physical constitution of sounds that occur in these positions. But since physical similarity is irrelevant for determining the functional equivalence of sounds, this approach is unacceptable.
It should be noted that we can speak of defective distribution of phonemes only in those cases where the sounds that do not occur in certain positions have no counterparts that do occur in these positions. For instance, in Russian r and r’ do not occur in initial position before n: Russian words cannot begin with the phoneme combinations rn or r’n. Since r and r’ have no counterparts in this position, the nonoccurrence of r and r’ in this position should be considered a case of defective distribution of the phonemes r and r’.
As was shown above, every phoneme is characterized by a set of distinctive features. Since phonemes are functional segments ordered into linear sequences, the sets of distinctive features characterizing phonemes are also ordered into linear sequences.
The assumption that phonemes are characterized by linear sequences of sets (or bundles) of distinctive features lies at the basis of modern phonology, no matter how widely particular phonological theories differ from one another. This assumption has recently been challenged by some experimental phoneticians. Here are some of their arguments against the assumption that distinctive features are tied to linearly ordered functional segments of the speech flow.
Consider duration. If duration functions as a distinctive feature, phonology includes it among other distinctive features of a functional segment. For example, in English duration serves as a functional cue distinguishing between long and short vowel phonemes, and the opposition of the distinctive features short:long must be considered a segmental property of phonemes. However, studies in experimental phonetics have shown that duration has many other linguistic functions that are not restricted to a single segment. It has been found, for example, that in English under certain conditions the phonological distinctive feature voiced does not correspond to the phonetic feature voiced. Perceptual tests with synthetic stimuli have shown that vowel duration is a sufficient cue for determining the perception of voicing in a final consonant: if you synthesize a sequence such as jus, with a voiceless s, and lengthen the duration of the vowel, listeners will begin to hear juz, even though there is no voicing present in the fricative (for a recent review of the experiments, see Wardrip-Fruin, 1982). Similarly, it has been discovered that the tense:lax (fortis:lenis) distinction of stop sounds in German is not exclusively associated with the consonants themselves that presumably carry the distinctive feature of fortis and lenis, but that the distinction between words containing a fortis or lenis stop sound is characterized by a different distribution of the durations of the consonant and the preceding vowel. Thus, in the analysis of German word pairs such as baten:baden and Laken:lagen, the duration of the vowel and stop sequence remains approximately constant at the expense of its different distribution between the vowel and the consonant: in words such as baten, the vowel is shorter and the consonant is longer; while in words such as baden, the relationship is reversed—a shorter consonant follows a longer vowel (Kohler, 1981). 
Modern literature in experimental phonetics abounds in examples that seem to contradict the notion of the distinctive feature as a segmental property of the speech flow.
These findings of experimental phonetics have induced some linguists, in particular phoneticians, to question the validity of the phonological notion of the distinctive feature. Ilse Lehiste, in her recent paper on the experimental study of duration, writes:
One of my longstanding complaints and criticisms of most current linguistic theories is the fact that they ignore the temporal aspects of spoken language almost completely. If duration enters into phonological theory at all, it gets segmentalized: [+long] may be included among the distinctive features of a segment. And this is where linguistic theory stops—implying that duration can have only a segmental function, i.e., that all duration can do is differentiate between short and long segments.
Those phonologists who have some acquaintance with experimental phonetics have devoted considerable attention and effort to the study of temporal aspects of spoken language; unfortunately this seems to have had little or no impact on the theoreticians, who continue to manipulate segmental distinctive features to the exclusion of anything larger than a segment. I have said it before, and I will say it again: phonologists ignore phonetics at their own peril. The peril is that they may operate in a fictitious abstract sphere that has no connection with reality. In this abstract sphere, linguistic constructs are timeless. In the real world, spoken language unfolds itself in time. (Lehiste, 1984: 96)
Lehiste, like many other phoneticians, rejects the phonological notion of the distinctive feature, because she fails to see the fundamental difference between the functional and physical levels of the speech flow. Consider the above example concerning the sequence jus. True, if we synthesize the sequence jus, with a voiceless s, and lengthen the duration of the vowel, listeners will begin to hear juz, even though there is no voicing in the fricative. That is an interesting phenomenon. But does it undermine the notion of the distinctive feature as a segmental property? From a phonological point of view, the essential thing is the perception of the opposition voiced:voiceless rather than the acoustic properties that are involved in the perception. The essential thing is that although in the above experiment the sound s does not change, it is perceived as z when the preceding vowel is lengthened. What matters is that on the functional level we have the opposition s: z. This opposition is a phonological phenomenon that is no less real than the phonetic fact that acoustically the phoneme z is represented by the voiceless sound s plus the length of the preceding vowel.
Similarly, the discovery that in German the tense:lax distinction is associated with the length of the vowel that precedes the consonant does not undermine the phonological notion of the distinctive features tense:lax. What matters from a phonological point of view is not the distribution of vowel duration in words such as baten:baden but the perception of the consonants as the members of the distinctive oppositions tense:lax.
In accordance with the notion of the functional level of the speech flow, we can hypothesize that speech sounds as phonemes are perceived in a special processing area of the brain, different from that used for the perception of other types of sounds. This processing area of the brain can be called the area of functional perception. Functional perception transforms continuous suprasegmental acoustic phenomena into discrete segmental functional properties.
Phonological distinctive features are no less real than the phonetic phenomena that serve as cues to distinctive features.
The recent findings of experimental phonetics throw new light on the nature of speech sounds. If correctly interpreted, these findings lead to fresh insights into the duality of the speech flow. We understand better the relation between the functional and physical levels of the speech flow: what serves as a segmental property of speech sounds at the functional level is their suprasegmental property at the physical level.
In section 2.1 it was shown that phoneme and sound are heterogeneous concepts that characterize different levels of the speech flow: functional and physical. There is neither a relation of class membership nor a relation of class inclusion between the sound and the phoneme, because these concepts belong to basically different abstraction levels. The notion of the phoneme characterizes the dual nature of the segments of the speech flow, and therefore the phoneme constitutes a complex notion, the sound/diacritic. This complex notion is reminiscent of such notions as the wave/particle in physics and the use-value/exchange-value in economics. I suggested the metaphorical term centaur concepts to denote this type of notion because the structure of these notions is reminiscent of centaurs, the fabulous creatures of Greek mythology, half men and half horses.
In section 2.1, evidence for the dual nature of the segments of the speech flow was based on the discovery of the contradictory consequences from a pair of assumptions that both are taken as a characterization of essential properties of speech sounds. The two assumptions are:
1. Speech sounds are physical elements.
2. Speech sounds are elements whose function is to differentiate between linguistic signs in accordance with the Principle of Semiotic Relevance.
If the two assumptions are considered essential for the characterization of the basic properties of speech sounds, then we have to accept all consequences from these assumptions. Since the consequences are contradictory, we face what in my book on theoretical phonology I called the antinomies of the paradigmatic and syntagmatic identification of phonemes (Shaumyan, 1968: 37-44).
From assumption 1 it follows that different sounds cannot be identical, while assumption 2 predicts that different sounds can be identical.
By the same token, it follows from assumption 1 that a sequence of two sounds cannot constitute one segment, while assumption 2 predicts that a sequence of two sounds can constitute one segment.
In order to solve these antinomies, we have to regard speech sounds as complex objects having a dual nature: functional and physical.
In addition to the two antinomies, we discover a third, more general antinomy, which I have called the antinomy of transposition (Shaumyan, 1968: 31-37).
The antinomy of transposition is generated by the following two assumptions, which both characterize the essential properties of the phoneme:
Assumption 1: Phonemes are elements whose function is to differentiate between signs.
Assumption 2: Phonemes are acoustic elements.
Let us examine the consequences that can be drawn from these assumptions.
If assumption 1 is valid, then the acoustic substance of phonemes can be transposed into other forms of physical substance—graphic, chromatic, tactile. Any phoneme and any set of distinctive features can be presented not only as acoustic elements but as graphic, chromatic, or tactile symbols, as well. In order to see that, let us perform the following mental experiment. We will transpose phonemes into circles of identical dimension but different color: let us say, in English, the vowel æ into a blue circle, the vowel e into a brown circle, the consonant k into a green circle, the consonant n into a red circle, the consonant t into a yellow circle. The words cat, ten, neck, net, can, tan can then be represented as chains consisting of combinations of the differently colored circles, as shown in the following table:
|(3)||cat : green, blue, yellow|
| |ten : yellow, brown, red|
| |neck : red, brown, green|
| |net : red, brown, yellow|
| |can : green, blue, red|
| |tan : yellow, blue, red|
Hence, from assumption 1 it follows that phonemes can be transposed from acoustic substance into other forms of physical substance.
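The mental experiment above can be sketched as a literal mapping from phonemes to colored circles. The broad phonemic shapes assigned to the six words below are my simplification for illustration:

```python
# Sketch: transposing phonemes into colored circles, following the text's
# color assignments (æ=blue, e=brown, k=green, n=red, t=yellow).

COLOR = {"æ": "blue", "e": "brown", "k": "green", "n": "red", "t": "yellow"}

WORDS = {"cat": "kæt", "ten": "ten", "neck": "nek",
         "net": "net", "can": "kæn", "tan": "tæn"}

for word, phonemes in WORDS.items():
    chain = " + ".join(COLOR[ph] for ph in phonemes)
    print(f"{word}: {chain}")
# cat: green + blue + yellow
# ten: yellow + brown + red
# (and so on for the remaining words)
```

Nothing in the mapping depends on the circles being colors rather than sounds: any one-to-one substitution of physical substance preserves the distinctive oppositions.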
Let us turn now to assumption 2. If it is true that phonemes are acoustic elements, then it follows that they cannot be transposed into other forms of physical substance, since in that case they would cease to be themselves, i.e., acoustic elements.
Here we encounter an evident antinomy: both assumption 1 and assumption 2 are valid in respect to modern phonology; yet assumption 1 implies that phonemes can be transposed into other forms of physical substance, while assumption 2 implies a direct contradiction, i.e., that phonemes cannot be transposed into other forms of physical substance.
This contradiction constitutes an inherent theoretical difficulty, which can be termed the antinomy of transposition.
The antinomy of transposition may evoke the following objection. According to assumptions 1 and 2, a phoneme is, by definition, at the same time an element whose function is to differentiate between signifiants and an acoustic element. And since this definition is adequate for natural languages, the property of the differentiation between signifiants and the property of being an acoustic element are equally essential for the phoneme, and the bond between these two properties must be considered indispensable within the limits of natural languages. Therefore, we are not justified in deducing from assumption 1 that the phoneme can be transposed from acoustic substance into other forms of physical substance.
This objection can be answered as follows. If we regard definitions as convenient compressed descriptions of directly observed data, then, since in natural languages phonemes are always sound elements, we are not justified in separating the functional properties of the phoneme from its acoustic properties. But the subject matter of science comprises not only empirical data, not only what is but also what in principle can be: hence, if a mental experiment arrives at what can be, we disclose the essence of the studied subject. We regard the definition of the phoneme not as a convenient compressed description of an empirical fact but as a hypothesis about facts that are possible in principle. At the level of abstract possibility, the question arises whether the communicative function of a natural language would be violated if its acoustic substance were transposed into other forms of physical substance. Obviously, no such violation would occur. We are, therefore, justified in transposing phonemes, by means of mental experiment, from acoustic substances into other forms of physical substance. The results of the mental experiment contradict, however, the interpretation of the acoustic properties as the essential properties of the phoneme, since if the acoustic properties are essential properties of the phoneme, the phoneme cannot be transposed from acoustic substance into any other form of physical substance.
In order to resolve the antinomy of transposition, we must posit that although the phoneme and the sound are indissoluble, the phoneme is logically independent of acoustic substance. Saussure aptly compared linguistics with political economy, since both sciences are concerned with the notion of value. In his view, language is basically a system of pure values, and the phoneme is a particular instance of linguistic value. Let us continue Saussure’s analogy of linguistics with political economy. A phoneme is the distinctive value of some acoustic substance. Speech sounds have distinctive values in the same way as commodities have exchange-values. Both the speech sound and the commodity have a dual character: the speech sound is a unity of the sound proper and the diacritic, while the commodity is a unity of an exchange-value and a use-value. The phoneme is logically independent of acoustic substance in the same way as the exchange-value of the commodity is independent of use-value, that is, of its physical properties. Here is how Marx characterizes the relation between the exchange-value of a commodity and its use-value, that is, its physical properties:
The objectivity of commodities as values differs from Dame Quickly in the sense that ‘a man knows not where to have it.’ Not an atom of matter enters into the objectivity of commodities as values; in this it is the direct opposite of the coarsely sensuous objectivity of commodities as physical objects. We may twist and turn a single commodity as we wish; it remains impossible to grasp it as a thing possessing value. (Marx, 1977: 138)
Comparing speech sounds with commodities, we can say that just as not an atom of matter enters into the objectivity of commodities as values, so not an atom of acoustic substance enters into the objectivity of speech sounds as values. (In this quotation Marx uses the term value in the sense of ‘exchange-value’.)
The key to understanding the dual character of speech sounds is the concept of the unity of opposites. This concept, known essentially since Nicolaus Cusanus as coincidentia oppositorum, a way of reasoning that has deeply influenced the musical thinking of Johann Sebastian Bach through his art of counterpoint in the fugue, lies at the very base of Hegel’s dialectics. Hegel, in turn, has exerted a major influence on all modern philosophy, including that of Karl Marx. Hegel’s notion of dialectics can be seen, essentially, as an ongoing conflict between opposites that constitute a unity.
A striking example of “the unity of opposites” in modern physics is the dual nature of light—or more generally of electromagnetic radiation. Radiation produces many paradoxical situations. For instance, when there are two sources of light, the intensity of the light at some place will not necessarily be the sum of the radiation from the two sources; it may be more or less. This situation is explained as the interference of the waves that emanate from the two sources: where two crests of the waves coincide, we have “more light” than the sum of the two; but we have “less light” where there is a coincidence of a crest and a trough. On the other hand, when ultraviolet light is shone on the surface of some metals, it can dislodge electrons from the surface. This situation, called the “photoelectric effect,” is explained as a collision of light particles with electrons. The inescapable conclusion is that light must, therefore, consist of both particles and waves. Particles and waves constitute the unity of opposites in the phenomenon of electromagnetic radiation.
The discovery of the dual nature of electromagnetic radiation—reasonably well understood by the time of the 1927 International Conference on Physics held in Copenhagen, Denmark, and featuring such notable scientists as Niels Bohr, Max Planck, and Albert Einstein—created major conceptual difficulties unknown in classical Newtonian physics. In order to cope with these difficulties, Bohr introduced the famous “Complementarity Principle” into modern physics. The Complementarity Principle treats the particle picture and the wave picture as two complementary descriptions of the same phenomenon; each of them has a limited range of application, and both are needed to provide a complete description of the phenomenon under investigation.
The notion of complementarity dominates the thinking of modern physics. Bohr suggested that the Complementarity Principle is a methodological postulate that is valid outside of physics, as well. The notion of complementarity, therefore, is quite similar to the notion of the “unity of opposites” in Hegel’s dialectics. It is quite certain that Bohr was well aware of the similarity.
Now, to take this discussion one step further, we must observe that the Complementarity Principle, as a methodological postulate, is very useful indeed, not only in physics, where it was first recognized, but in other sciences, as well. But in applying this postulate, one must beware of applying it trivially. The chief reason why the Complementarity Principle has become so famous is that the essential alternative pictures of reality seem to conflict with one another at first.
Now in my linguistic research I have discovered a nontrivial application of the Complementarity Principle most helpful in the understanding of linguistic phenomena.
In the final section of my book Problems in Theoretical Phonology (Shaumyan, 1968) I discuss the concepts of my phonological theory from the point of view of the Complementarity Principle.
Speech sounds have a dual character—physical and functional. Functional and physical identity are complementary in the sense of the unity of opposites; that is, they are complementary and at the same time mutually exclusive and contradictory concepts.
Consider, for instance, the Danish consonants t and d in syllable-initial and syllable-final positions. Physically these consonants are not identical; the t is “voiceless” and the d is “voiced.” But their physical nonidentity “conflicts” with their functional properties: in Danish the syllable-initial t is functionally identical with the syllable-final d sound.
Consider the physical segmentation of the flow of speech. The sequence —ts—consists, both in English and in German, of two physical segments, a t-like sound followed by an s-like sound, as in the English word cats and the German word zehn, the number ‘ten’. But from a functional point of view, -ts- constitutes one segment in German (frequently spelled with the letter z) but always two separate ones in English. Thus, just like the physical and functional identity, physical and functional segmentation are complementary and at the same time mutually exclusive and contradictory concepts.
A sound as a member of a class of physically identical objects is a sound proper, but as a member of a class of functionally identical objects, it is a fundamentally new object, a diacritic. A unity of contradictory objects—functional and physical—a “speech sound” is a combined object: the best way to characterize it would be to call it a sound/diacritic. A sound/diacritic is thus a unity of opposites similar to such unities of opposites as the wave/particle in physics and the exchange-value/use-value in economics.
In order to denote this type of notion I have introduced a new term. The new term, mentioned above, is centaur concepts. I regard the phoneme as a centaur concept, as a sound/diacritic, which constitutes a radical departure from various current concepts of this notion.
The three phonological antinomies were first stated in my book on theoretical phonology (Shaumyan, 1968). These antinomies were criticized by Kortlandt (1972: 29-33) and Fischer-Jørgensen (1975: 343-45). Both linguists advance similar arguments against the phonological antinomies. I will focus on Kortlandt’s criticism, because it is more detailed.
Criticizing the antinomy of transposition, Kortlandt writes:
From the model outlined here two statements evolve (Šaumjan, 1968: 35):
(1) Phonemes are elements whose function is to differentiate between signifiants.
(2) Phonemes are acoustic elements.
The first statement leads Saumjan to the following conclusion: If it is true that the function of phonemes is to differentiate between signifiants then it follows that there exists an inherent possibility of transposing the acoustic substance into other forms of physical substance—graphic, chromatic, tactile. Any system of distinctive features and phonemes can be presented not only as acoustic properties but as graphic, chromatic or tactile symbols as well. However, “if it is true that phonemes are acoustic elements it follows that they cannot be transposed into other forms of physical substance since in that case they would cease to be themselves, i.e. acoustic elements” (1968: 36). According to Saumjan, the resulting contradiction, which he calls the ‘antinomy of transposition,’ constitutes an inherent theoretical difficulty in Trubetzkoy’s model of the phoneme.
The reasoning is clearly incorrect. If we substitute ‘green table’ for ‘phonemes,’ ‘thing’ for ‘elements,’ and ‘colour’ for ‘function,’ we obtain something like this:
(1) A green table is a thing whose color is green.
(2) A green table is a table.
If it is true that the colour of a green table is green then it follows that there exists an inherent possibility of transposing its table-ness into other forms of thing-ness. However, if it is true that a green table is a table it follows that it cannot be transposed into other things since in that case it would cease to be a table.
Analogy is a bad argument and I am no supporter of the kind of debating exhibited in the preceding paragraph, but it certainly shows that a bit of superficial logic does not make up for the lack of explicitness with regard to the underlying assumptions. Saumjan’s reasoning would hold true if the first statement were reversible, but that is clearly not the case if the second statement holds. (Kortlandt, 1972: 29)
As we can see from this quotation, Kortlandt’s refutation of the antinomy of transposition is based on his claim that the first statement (about the differentiating function of the phoneme) is irreversible (i.e., the reverse statement “Elements whose function is to differentiate between signifiants are phonemes” is not true).
Kortlandt’s argument seems irresistible, and yet it is wrong. The point is that, contrary to Kortlandt’s claim, the reverse statement is true, because the essence of a phoneme is in its distinctive function. Any phoneme represented by some acoustic element remains the same phoneme no matter into which other physical substance it is transposed. The antinomy of transposition is a conflict between the physical nonidentity of two physical substances and their functional identity. In order to solve the antinomy of transposition, we have to recognize the logical independence of the functional and physical identity of speech sounds and introduce the centaur concept sound/diacritic.
I do not mind the use of analogies in reasoning, but Kortlandt’s analogy between a phoneme and a green table is pure nonsense. A green table is a thing whose color is green. But from this statement it does not follow that a green table can be transposed into some other green thing, say, into a green apple, because green color is not an essential property either of a table or of an apple. Herein lies the difference between the green color of a table and the distinctive function of a phoneme: the distinctive function is an essential property of a phoneme, and therefore a phoneme can be transposed from its acoustic substratum into other physical substances, as long as those substances serve the same distinctive function. Green color, by contrast, is not an essential property of a table, and therefore a green table has nothing in common with a green apple or a green crocodile.
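The solution to the antinomy of transposition can be pictured in a small sketch. The following toy model is my own illustration, not part of the text: it treats a phoneme purely as a point in a system of distinctions, so that any injective mapping of the inventory into another physical substance preserves the system. The inventories and "substances" below are invented for the demonstration.

```python
# Toy model (my own illustration): a phoneme treated purely as a term in a
# system of distinctions. Any one-to-one realization in a physical substance
# preserves the diacritic function, whatever the substance is.

# A hypothetical three-phoneme inventory, identified only by arbitrary labels.
phonemes = ["p1", "p2", "p3"]

# Three hypothetical physical "substances": acoustic, graphic, tactile.
acoustic = {"p1": "[p]", "p2": "[t]", "p3": "[k]"}
graphic  = {"p1": "<p>", "p2": "<t>", "p3": "<k>"}
tactile  = {"p1": "dot-1", "p2": "dot-2", "p3": "dot-3"}

def preserves_distinctions(realization):
    """A realization keeps the diacritic function iff it maps distinct
    phonemes to distinct physical tokens (i.e., it is injective)."""
    return len({realization[p] for p in phonemes}) == len(phonemes)

# All three substances realize the same functional system of distinctions,
# even though no physical token is shared between them.
assert all(preserves_distinctions(r) for r in (acoustic, graphic, tactile))
```

The point of the sketch is that the check never inspects the physical tokens themselves, only whether they remain distinct, which is exactly the sense in which the phoneme survives transposition.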
As regards the antinomies of the paradigmatic and syntagmatic identification of phonemes, Kortlandt agrees with them but makes certain reservations. Thus, with respect to the paradigmatic identification of phonemes, he writes:
There is, however, one possible identification which Šaumjan does not take into consideration, though it violates neither the functional nor the physical properties of the phoneme. He writes:
if, in accordance to statement 1, phonemes possess a function of differentiation between signifiants, then phonemes which occur in different positions can be altered in respect to their phonation as sharply as desired as long as they do not get confused with one another. (1968: 41)
According to this view, one could, strictly speaking, regard any pair of sounds as variants of one and the same phoneme provided only that they are in complementary distribution: thus [q] in position P1 can be identified with [ć] in position P2, and subsequently [k] and [k’] can be identified in accordance with their acoustic properties. It follows that Šaumjan’s antinomy cannot be logically derived from his statements 1 and 2 alone, but that it rests upon an additional assumption concerning the mutual relations between phonemes as well. This does not diminish the value of his argument because such an assumption is explicitly present in Trubetzkoy’s work. (Kortlandt, 1972: 32)
Kortlandt’s claim that the antinomy of the paradigmatic identification of phonemes rests upon an additional assumption concerning mutual relations between phonemes is wrong. That is not an additional assumption but a direct consequence of statement 1 characterizing phonemes as diacritic elements. The alleged additional assumption is not present in Trubetzkoy’s work. As a matter of fact, Trubetzkoy was unaware of phonological antinomies, and therefore he failed to see the logical independence of the functional and physical identity of phonemes. Thus, in determining conditions under which two speech sounds can be considered realizations of two different phonemes or variants of a single phoneme, he lumped together functional and physical criteria. One of his basic rules is formulated as follows:
Rule III. If two sounds of a given language, related acoustically or articulatorily, never occur in the same environment, they are to be considered combinatory variants of the same phoneme.
This rule, as well as some other crucial statements in Trubetzkoy’s work, which will be discussed below, clearly show that he failed to discover phonological antinomies and was unaware of the logical independence of functional and physical identity of speech sounds, as well as of the logical independence of functional and physical segmentation of the speech flow.
In discussing the antinomy of the syntagmatic segmentation, Kortlandt claims that this antinomy rests upon an additional assumption that there exists a natural segmentation of the speech flow into sounds that does not coincide with the syntagmatic segmentation. Kortlandt fails to see that this is not an additional assumption but a characterization of the speech flow that follows directly from the conflict between the consequences of the two initial statements defining the essential properties of phonemes.
Now that I have set forth the two-level theory of the phoneme, it will be useful to consider it in a broad methodological context of the logic of science. I intend to relate the two-level theory of phonology to the Complementarity Principle, which is of considerable significance in the comprehension of certain fundamental epistemological situations that possess an analogous character although they arise in various and at first glance unconnected areas of knowledge.
The discovery of the Complementarity Principle is the achievement of the outstanding Danish physicist Niels Bohr. First, Bohr formulated the Complementarity Principle as a purely physical principle; later, however, he extended its validity also to other areas of knowledge, above all to biology and psychology. At the present time the Complementarity Principle is interpreted as a general methodological principle that characterizes a definite epistemological situation.
The essence of the Complementarity Principle is described by Bohr as follows:
In order to characterize the relation between phenomena observed under different experimental conditions, one has introduced the term complementarity to emphasize that such phenomena together exhaust all definable information about the atomic objects. Far from containing any arbitrary renunciation of customary physical explanation, the notion of complementarity refers directly to our position as observers in a domain of experience where unambiguous application of the concept used in the description of phenomena depends essentially on the conditions of observation. (Bohr, 1958: 99)
In illustrating the Complementarity Principle, we can turn to the problem of the nature of light. Contemporary physics teaches that light has a dual nature and, consequently, that diffusion of light cannot be described by a single theory. To describe it, we must resort, writes the Polish physicist L. Infeld, to two theories, the corpuscular theory and the wave theory (Infeld, 1950: 108).
The corpuscular theory and the wave theory cannot be reduced to one another. They are at once mutually exclusive and mutually complementary. This paradoxical epistemological situation where the examined phenomenon can be exhaustively described only by means of mutually exclusive and at the same time mutually complementary theories necessitates the creation of syncretic dual concepts; in the given case, such a dual concept is the concept of corpuscle/wave. Other examples of syncretic dual concepts are time/space and mass/energy.
If we return from physics to phonology, a linguist who observes the sounds of language must inevitably admit that as an observer of the sound properties of language, he is forced to utilize two kinds of experimental methods: on the one hand, the sounds of language can be subjected to experimental investigation in respect to their acoustic nature or to the physiological conditions of their formation, and on the other hand, it is possible to introduce different kinds of experiments with linguistic informants with a view to establishing objective phonemic contrasts present in a given language.
These two kinds of experimental methods of investigation can be called the physical and the semiotic experimental methods of investigation in phonology. The specific epistemological character of these experimental methods consists in the fact that their results cannot be united into a single picture. Their results are mutually exclusive and, at the same time, mutually complementary. As proof we can examine the problem of the identity of the sounds of language. For instance, investigation of the consonants t and d using the physical experimental methods discloses an essential difference between these consonants: t is voiceless and tense, and d is voiced and lax. But as was shown above, in Danish the syllable-initial t is functionally identical to the syllable-final d. We see that the results of the physical and semiotic methods of investigation of the Danish consonants t and d cannot be united into a single picture. If we attempted to do that, we would encounter an irreconcilable contradiction: we would have to admit that the consonants t and d are both identical and nonidentical. In order to avoid this contradiction, as well as other analogous contradictions present in the observation of other sounds of natural languages, we have to admit that the pictures of the identity of the language sounds obtained by the physical and the semiotic experimental methods of investigation are mutually exclusive and, at the same time, mutually complementary. Hence, we encounter an epistemological situation that is analogous to epistemological situations encountered in physics and in other sciences. All such situations are embraced by the Complementarity Principle.
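The Danish t/d case can be restated as two independent equivalence relations over positional sounds. The sketch below is my own construction, with drastically simplified feature labels (the particular feature values assigned to final d are a hypothetical simplification); only the fact that initial t and final d are functionally identical comes from the text.

```python
# Toy sketch (my own construction): physical identity and functional identity
# of speech sounds as two logically independent relations. Feature values are
# simplified illustrations, not a phonetic description of Danish.

# Physical description of positional sounds: (sound, position) -> features.
physical = {
    ("t", "initial"): ("voiceless", "tense"),
    ("d", "initial"): ("voiced", "lax"),
    ("d", "final"):   ("voiceless", "lax"),   # hypothetical simplification
}

# Functional classification: which diacritic (phoneme) each positional sound
# realizes, per the text's claim about Danish neutralization.
functional = {
    ("t", "initial"): "T",
    ("d", "initial"): "D",
    ("d", "final"):   "T",   # final d realizes the same diacritic as initial t
}

# Physically non-identical ...
assert physical[("t", "initial")] != physical[("d", "final")]
# ... yet functionally identical: the two pictures cannot be merged into one.
assert functional[("t", "initial")] == functional[("d", "final")]
```

The two dictionaries cannot be collapsed into one consistent classification, which is the contradiction the Complementarity Principle is invoked to absorb.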
The sounds of language possess a dual nature, as does light (on another plane, of course).
As was said above, I suggest the term centaur concepts to cover a class of syncretic concepts in various sciences, such as time/space, mass/energy, use-value/exchange-value, sound/diacritic, etc.
The Complementarity Principle is generally recognized in contemporary physics and logic of science. It is an independent discovery of Bohr’s, but there is a striking similarity between this principle and the principle of the unity of opposites, which constitutes the heart of the dialectical method proposed by Hegel and widely used by Marx in his works. Bohr was aware of the similarity of the two principles, and he used the term dialectics to characterize the general methodological and epistemological significance of the Complementarity Principle. Thus, he wrote:
The complementarity mode of description does indeed not involve any arbitrary renunciation of customary demands of explanation but, on the contrary, aims at an appropriate dialectic expression for the actual conditions of analysis and synthesis in atomic physics. . . . The epistemological lesson we have received from the new development in physical science, where the problems enable a comparatively concise formulation of principles, may also suggest lines of approach in other domains of knowledge where the situation is of essentially less accessible character. An example is offered in biology where mechanistic and vitalistic arguments are used in a typically complementary manner. In sociology, too, such dialectics may often be useful, particularly in problems confronting us in the study and comparison of human cultures, where we have to cope with the element of complacency inherent in every national culture and manifesting itself in prejudices which obviously cannot be appreciated from the standpoint of other nations.
Recognition of the complementary relationship is not least required in psychology, where the conditions for analysis and synthesis of experience exhibit a striking analogy with the situation in atomic physics. (Bohr, 1948: 317-18)
The functional view of sounds is significant to the extent to which it increases our understanding of natural languages. I will describe the revolutionary break in linguistics brought about by the functional view of sounds. I mean the birth of Saussure’s theory of the vowel system in the Indo-European parent language.
Indo-European had a vowel alternation e : o : ø, which appeared in diphthongs as ei : oi : i, eu : ou : u, and er : or : r.
Examples from Greek:
(1)

| Present Tense | Perfect                               | Aorist (Past tense) |
| ------------- | ------------------------------------- | ------------------- |
| ‘I fly’       | ‘I have flown’                        | ‘I flew’            |
| ‘I persuade’  | ‘I am persuaded’                      | ‘I persuaded’       |
| ‘I see’       | ‘I see’ (Perf. in the sense of Pres.) | ‘I saw’             |
In addition, one finds in Indo-European a different kind of alternation, namely, ā : ō : ă. Examples from Greek (Doric dialect) and Latin:

(2)

|       | ā                | ō              | ă                               |
| ----- | ---------------- | -------------- | ------------------------------- |
| Greek | phāmí ‘I speak’  | phōnā́ ‘voice’  | phatós ‘said’ (past participle) |
| Latin |                  | dōnum ‘gift’   | datus ‘given’                   |
|       | stāre ‘to stand’ |                | status ‘standing’               |
Ferdinand de Saussure argued that the alternation ā : ō : ă is not different from the alternation e : o : ø, since from a functional point of view the long vowels ā and ō must be interpreted as combinations of the short vowels e and o with a hypothetical sound A.
Saussure realized that if the long vowels in these alternations were interpreted as a combination of the short vowel with A, the two kinds of alternations, which before had looked entirely different, would become quite the same:
(3)

ei : oi : i
eu : ou : u
eA : oA : A
Under this hypothesis, the examples in (2) can be interpreted as follows:
(4)

|       | ā              | ō               | ă               |
| ----- | -------------- | --------------- | --------------- |
| Greek | phāmí = pheAmí | phōnā́ = phoAnā́  | phatós = phAtós |
| Latin |                | dōnum = doAnum  | datus = dAtus   |
|       | stāre = steAre |                 | status = stAtus |
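Saussure's reinterpretation amounts to a simple rewriting of the root vowel. The sketch below is my own toy encoding of the pattern in (3) and (4), using simplified transliterated roots; it is not a historically complete rule, only a demonstration that the mapping collapses the two alternation types into one.

```python
# Toy encoding (my own sketch) of Saussure's hypothesis: the root vowels of
# the alternation ā : ō : ă are rewritten as e / o / zero plus the
# hypothetical coefficient A, giving the same shape as ei : oi : i.

# Rewriting of the root vowel: a long vowel hides a short vowel + A;
# short a is A standing alone.
rewrite = {"ā": "eA", "ō": "oA", "a": "A"}

def decompose(root):
    """Replace the root vowel of a simplified root by its short-vowel + A value."""
    for vowel, replacement in rewrite.items():
        if vowel in root:
            return root.replace(vowel, replacement, 1)
    return root

# The Latin roots from table (4), in simplified transliteration:
assert decompose("stā") == "steA"   # stāre  -> steAre
assert decompose("dō")  == "doA"    # dōnum  -> doAnum
assert decompose("da")  == "dA"     # datus  -> dAtus

# The bare alternation ā : ō : a now parallels ei : oi : i — a short vowel
# e / o / zero followed by a fixed second element:
assert [decompose(v) for v in ("ā", "ō", "a")] == ["eA", "oA", "A"]
```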
Saussure advanced his hypothesis in 1879. It was to mark a turning point in the history of linguistics, although the functional and structural point of view it represented was too alien to his contemporaries for it to meet with any general understanding. Ferdinand de Saussure was twenty-one when he published his famous Mémoire sur le système primitif des voyelles dans les langues indo-européennes (1879). This work was at least fifty years ahead of the linguistics of its time. Only in 1927, after the Hittite language had been deciphered, were the facts predicted by Saussure’s theory discovered. The Polish linguist Jerzy Kuryłowicz showed that the Hittite laryngeal ḫ corresponds to the phoneme A hypothesized by Saussure (Kuryłowicz, 1977). Here are some facts that confirm Saussure’s theory:
| pāscunt | pāskem | paḫsanzi | ‘they feed, support’ |
“Often it is only after immense intellectual effort, which may have continued over centuries, that humanity at last succeeds in achieving knowledge of a concept in its pure form, in stripping off the irrelevant accretions which veil it from the eyes of the mind” (Frege, 1884: vii). These words of the great German logician apply to phonology no less than to any other branch of knowledge. More than a half-century has passed since the concept of the phoneme was introduced into the science of language, but even at the present time the problem of the definition of the phoneme cannot be considered to have been fully resolved.
Wherein lies the essence of the problem that concerns the definition of the phoneme?
Every definition of a concept is arbitrary, since every scholar can define any concept in a way that is to his advantage. From this point of view, the definitions of concepts are not concerned with the essence of the matter, and arguments concerning such definitions can hold only terminological interest. Although the definitions of concepts are in themselves arbitrary, we can regard any definition of a concept as a statement that has an explanatory function. In this case the definition of the concept answers the question “What is the nature of x?”: for example, “What is the nature of light?”, “What is the nature of meaning?”, “What is the nature of truth?”, etc. Since the definitions of concepts that are based on the formula “What is the nature of x?” require on the part of the scholar a deep penetration into some sphere of reality, and are at the same time formulated not in the form of single, isolated statements but in the form of an entire system of statements, such definitions can be called theories, as well; for example, the “definition of light” can be called the “theory of light,” the “definition of meaning” the “theory of meaning,” the “definition of truth” the “theory of truth.” The aforementioned words of Frege do not apply to all definitions of concepts; they apply only to definitions of concepts based on the formula “What is the nature of x?”, because this type of concept definition in particular presents fundamental problems that are rooted in the process of human knowledge. The history of science shows that the struggle for progress in any field of knowledge often takes on a form of a conflict of pro and con definitions of concepts based on the formula “What is the nature of x?” Conflicts of this type have existed in connection with the concept of the phoneme over the past fifty years. 
The question is, then, which definition of the phoneme based on the formula “What is the nature of x?”, i.e., which phoneme theory, reflects most closely the linguistic reality?
Let us now turn to a comparison of the two-level theory of the phoneme and the distinctive features presented in this book with alternative theories.
The notion of the phoneme as a class of functionally equivalent concrete phonemes is a novel concept that marks a decisive advance in understanding the essence of the phenomenon that different phoneme theories have tried to explain.
The notion of the phoneme as a class of sounds is not new. Rather, it is very old; and the notion of the phoneme as a class of sounds is popular in current linguistic literature, too. What is missing is the understanding of the principle on which the classes of sounds are abstracted. As a matter of fact, all existing phonological theories that operate with the notion of the phoneme as a class of sounds base their abstraction on the principle of physical similarity of sounds, which makes these theories worthless.
N. S. Trubetzkoy severely criticized all phonological theories that abstracted classes of sounds on the basis of their physical similarity. His aversion to these theories was so strong that he tried to define the notion of the phoneme without regard to the class concept. He defined the phoneme as follows:
Phonological units that, from the standpoint of a given language, cannot be analyzed into still smaller successive distinctive units are phonemes. (Trubetzkoy, 1969: 35)
This definition of the phoneme is based on the notion of the phonological unit, which is defined as follows:
By (directly or indirectly) phonological or distinctive opposition we thus understand any phonic opposition capable of differentiating lexical meaning in a given language. Each member of such an opposition is a phonological (or distinctive) unit. (Trubetzkoy, 1969: 33-34)
Trubetzkoy’s definition of the phoneme is significant in that it is based on the notion of the distinctive opposition. At the same time, this definition disregards the notion of class. The reasons for this disregard are psychologically understandable. Trubetzkoy ignores the class concept for fear that it will open the door to the treatment of the phoneme as a phonetic rather than a phonological concept, that is, for fear that the phoneme will be treated as if it belonged to the same level as the sound. Of course, this fear was justified; that is confirmed by the subsequent development of phonology. Thus, the concept of the phoneme as a class of physically related sounds dominated American descriptive linguistics and was very popular elsewhere, too. The following definition is typical:
The phoneme is “a class of sounds which: (1) are phonetically similar and (2) show certain characteristic patterns of distribution in the language or dialect under consideration.” (Gleason, 1955: 261)
As was shown above, linguists who include physical similarity of sounds in the definition of the phoneme as a class of sounds do not accept the notion of neutralization and consider any sort of restriction on the occurrence of sounds in particular positions to be a case of defective distribution. This view is, of course, consistent with the notion of the phoneme as a class of physically similar sounds, but then one may wonder whether the linguists who accept this notion of the phoneme understand why they need the term phoneme. The term phoneme makes sense only as a label for a specific functional concept radically different from the ordinary notion of the sound, or else this term is worthless as an aid to understanding linguistic reality: it functions solely as an honorific for what is called the sound in ordinary phonetics.
Trubetzkoy was unaware that the solution to the problem lies in the contrast of two kinds of classes of sounds: 1) physical classes of sounds and 2) functional classes of sounds as members of concrete distinctive oppositions, that is, as concrete phonemes. This contrast, rooted in the dual nature of speech sounds, had yet to be discovered.
In spite of his insistence on a consistent functional approach, Trubetzkoy was not consistent in many respects himself. Thus, his rules for the determination of phonemes are not free from confusion of the phonological point of view with the phonetic approach. Here are some examples of this confusion.
In defining the conditions under which two speech sounds are to be considered realizations of two different phonemes, and those under which they are to be considered phonetic variants of a single phoneme, Trubetzkoy formulated four rules. Rule III reads:
If two sounds of a given language, related acoustically or articulatorily, never occur in the same environment, they are to be considered combinatory variants of the same phoneme. (Trubetzkoy, 1969: 49)
Clearly, this rule is false: whether two sounds of a given language are related acoustically or articulatorily or not has no bearing on whether they are variants of the same phoneme or not. We know that two identical sounds can belong to different phonemes, and, vice versa, two acoustically unrelated sounds can be variants of the same phoneme.
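The distributional half of Rule III, that two sounds "never occur in the same environment," can be formalized as a simple check on occurrence data. The toy sketch below is my own formalization, with invented sounds and environments; as argued above, passing this check concerns distribution only and by itself settles nothing about phonemic identity.

```python
# Toy formalization (sounds and environments are invented): Trubetzkoy's
# distributional condition checks only whether two sounds ever share an
# environment, i.e., whether they are in complementary distribution.

# Hypothetical occurrence data: sound -> set of environments in which it occurs.
occurrences = {
    "x": {"#_V", "V_V"},   # e.g., word-initially and between vowels
    "y": {"V_#"},          # e.g., only word-finally after a vowel
}

def complementary(a, b, occ):
    """True iff sounds a and b never occur in the same environment."""
    return not (occ[a] & occ[b])

# x and y are in complementary distribution; whether they are phonetically
# similar plays no role in the check, which is the point of the criticism.
assert complementary("x", "y", occurrences)
```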
In formulating rules for distinguishing between a single phoneme and a combination of phonemes, Trubetzkoy proposes the following rule:
Rule II.—A combination of sounds can be interpreted as the realization of a single phoneme only if it is produced by a homogeneous articulatory movement or by the progressive dissolution of an articulatory complex. (Trubetzkoy, 1969: 56)
Under this rule, such combinations of sounds as ks or st can never be interpreted as a single phoneme, because these combinations of sounds are not produced by a homogeneous articulatory movement.
Clearly, this rule is false: whether a combination of sounds is produced by a homogeneous movement or not has no bearing on whether this combination must be interpreted as two phonemes or one phoneme. We know that ks or st can be interpreted as either two phonemes or one phoneme, depending on the system of phonological oppositions in a given language.
Let us now turn to the notion of the distinctive feature. It is generally assumed that there is a one-one correspondence between distinctive features and phonetic features (acoustic or articulatory) that correlate with them. In other words, an isomorphism is assumed between a set of distinctive features and a set of phonetic features that correspond to the distinctive features. In accordance with this assumption, the distinctive features are expressed in the same terms as the phonetic features that are supposed to correspond to the distinctive features.2
As was shown above, the view that there is a one-one correspondence between distinctive features and acoustic features conflicts with the semiotic nature of phonological opposition. The distinctive feature does not correspond to elementary units on the phonetic level. As a matter of fact, the distinctive feature is a class of functionally equivalent phonetic features. This insight, which is a logical consequence of a purely theoretical deductive analysis, finds confirmation in experimental phonetics, provided the experimental data are interpreted correctly.
For example, it has been known for decades that in English the vowel preceding a voiceless consonant is shorter than the same vowel preceding a voiced consonant. Perceptual tests with synthetic stimuli have shown that vowel duration is a sufficient cue for the perception of voicing in a final consonant: if a sequence with a voiceless consonant is synthesized, such as jus, and the duration of the vowel is lengthened, listeners will begin to hear juz. Thus, the duration of the vowel contributes to the perception of a phonetic feature, namely, voicing in the adjacent consonant. This experiment shows that the distinctive feature ‘voicing’ may have two phonetic variants, ‘+voicing’ and ‘−voicing’, depending on the environment. The situation in which the phonetic feature ‘−voicing’ is perceived as ‘−voicing’ in one type of environment and as ‘+voicing’ in another is analogous to the situation in which a gray circle is perceived as white when placed inside a black circle and as black when placed inside a white circle.
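The jus/juz result can be pictured as a simple threshold decision. The sketch below is my own toy model: the 200 ms threshold and the duration values are invented for illustration and are not taken from the experimental literature.

```python
# Toy model of the perceptual experiment (the 200 ms threshold is invented):
# listeners report a voiced final consonant when the preceding vowel is
# long enough, even though the consonant itself is synthesized as voiceless.

def perceived_coda(vowel_duration_ms, threshold_ms=200):
    """Classify the perceived voicing of a final consonant from vowel duration."""
    return "voiced" if vowel_duration_ms >= threshold_ms else "voiceless"

# A synthesized sequence with a short vowel is heard with a voiceless coda ...
assert perceived_coda(120) == "voiceless"   # heard as jus
# ... but lengthening the vowel flips the percept.
assert perceived_coda(260) == "voiced"      # heard as juz
```

The design point is that one acoustic dimension (duration) realizes the distinctive feature of a neighboring segment, which is what makes the distinctive feature a class of phonetic features rather than a single phonetic property.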
Trubetzkoy also gives an alternative definition of the phoneme as “the sum of the phonologically relevant properties of a sound” (Trubetzkoy, 1969: 36), which is similar to the definition formulated earlier by R. Jakobson. In other words, the phoneme is defined as a bundle of distinctive features (Bloomfield, 1933).
Trubetzkoy’s alternative definition of the phoneme is incorrect, because it confounds an analysis of an object into its components with an analysis of a class of objects into its characteristic properties. A phoneme is a minimal component of a sound-form. For example, the sound-form bed consists of three minimal components, that is, the phonemes b, e, and d. But distinctive features are not components of a phoneme. Rather, they are characteristic properties of a class of functionally equivalent sounds. Thus, such distinctive features as ‘labial’, ‘stop’, and ‘voiced’ are not components of the phoneme b; rather, they are characteristic properties of the class of sounds denoted by the symbol b.
The identification of phonemes with bundles of distinctive features had unfortunate consequences: it induced some linguists to treat the phoneme as a fictitious entity. The following claim is typical:
It is not the phoneme, but rather the distinctive feature which is the basic unit of phonology: this is the only unit which has a real existence. (Martinet, 1965: 69)
As a matter of fact, the basic unit of phonology is the phoneme. Distinctive features are not units of phonology. They are only functional characteristics of classes of sounds that are phonemes.
In the foregoing sections I discussed the notions of the phoneme and the distinctive feature. Every phoneme is characterized by a set of distinctive features. In a phonological system there are as many different sets of distinctive features as there are different phonemes.
I turn now to an examination of sequences of phonemes in a language.
All languages have rigid constraints on sequences of phonemes. Sequences that are admissible in one language may be inadmissible in another language. For instance, the sequences of phonemes rt and nr occur at the beginning of a word in Russian but do not occur in this position in English. Both English and Russian allow pr but not rp as initial consonantal clusters. In initial position, the cluster of three consonants str is allowed, but all other orderings of these three phonemes are inadmissible: srt, rst, rts, tsr, and trs are impossible. In every language the occurring sequences of phonemes represent only a very small percentage of the theoretically possible sequences.
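The str example can be checked mechanically against a cluster inventory. The following toy sketch is my own illustration: the set of admissible initial clusters is a small invented subset, not a full phonotactic grammar of English.

```python
# Toy phonotactic check (my own simplified rule set): of the six orderings of
# the phonemes s, t, r, only "str" is an admissible word-initial cluster.

from itertools import permutations

# Hypothetical inventory of admissible initial three-consonant clusters
# (a small illustrative subset, not a complete list).
ADMISSIBLE_INITIAL = {"str", "spr", "skr", "spl", "skw"}

def admissible_initial(cluster):
    """True iff the cluster may begin a word under the toy rule set."""
    return cluster in ADMISSIBLE_INITIAL

# Generate all 6 orderings of s, t, r and filter by admissibility.
clusters = {"".join(p) for p in permutations("str")}
allowed = {c for c in clusters if admissible_initial(c)}
assert allowed == {"str"}   # srt, rst, rts, tsr, trs are all ruled out
```

The ratio of one admissible ordering out of six mirrors the text's observation that occurring sequences are a very small fraction of the theoretically possible ones.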
Sequential units resulting from the fact that some phonemes contract relations with one another are called syllables. The structure of the syllable depends on the relations that can be contracted by the phonemes. These relations are governed by special rules.
The syllable is a minimal sequential phonological unit. There are larger phonological units. Every phonological word is characterized by what Trubetzkoy called the culminative function. According to Trubetzkoy, the culminative function is represented by accents that signalize how many full words the speech flow is composed of. An accent sets off one and only one syllable of the word at the expense of and in contrast to the other syllables of the same word. Thus, the culminative function makes one syllable the center of the word, and all its other syllables are made the satellites of the central syllable.
Trubetzkoy attributed the culminative function only to word accents. But this function can also be attributed to vowel features as characterizing the segment of the speech flow constituting syllable nuclei. Just as the accents signalize how many full words the speech flow is composed of, the vowel features of the segments of a word signalize how many syllables a word is composed of. The vowel feature sets off one and only one segment of a syllable at the expense of and in contrast to the other segments constituting the syllable. Thus, the vowel feature makes one segment the center of a syllable, and the other segments of the syllable are made satellites of its central segment.
From the point of view of the culminative function, there is a complete analogy between the structure of the syllable and the phonological structure of the word.
The main consequence of the above characteristic of the culminative function is that the central and noncentral phonemes of a phonological unit can be in opposition only to phonemes of the same type; that is, vowels can be in opposition only to vowels, and consonants only to consonants. In other words, vowels and consonants can never occur in the same position. This conclusion is opposed to a fairly widespread view that vowels and consonants can occur in identical phonological positions.
Our theoretical conclusion can be confirmed by an experiment based on the following operational definition of the notion ‘phonological opposition’:
(1) OPERATIONAL DEFINITION. A given phonological unit X, which is a part of a larger phonological unit Y, is in phonological opposition to a phonological unit Z if the substitution of Z for X does not destroy the structure of Y.
Our theoretical conclusion can be confirmed by the following experiment. Consider, for example, the words paw [pɔ] and eat [it]. Let us assume the following opposition:

(2)

p : i
ɔ : t
On this assumption, we must be able to freely substitute p for i and, vice versa, i for p, or ɔ for t and, vice versa, t for ɔ, since by definition any phonemes that oppose each other in identical positions can be substituted for each other. As a result of our assumption, we will get impossible syllabic structures: either iɔ, which at best can be counted as two syllables, not one syllable, or pt, which is no syllable at all. These substitutions violate operational definition (1) because they destroy the structure of the respective syllables.
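The substitution test of operational definition (1) can be sketched as a well-formedness check on syllable structure. The model below is my own minimal formalization: syllables are triples (onset, nucleus, coda), and the segment inventories are restricted to the four phonemes of the example.

```python
# Minimal formalization (my own sketch) of the substitution experiment:
# a substitution is licit only if the result is still a well-formed syllable.
# Syllables are modeled as (onset, nucleus, coda); Ø is represented by None.

VOWELS = {"ɔ", "i"}
CONSONANTS = {"p", "t"}

def is_valid_syllable(onset, nucleus, coda):
    """Minimal well-formedness: the nucleus must be a vowel; the onset and
    coda, if present, must be consonants."""
    if nucleus not in VOWELS:
        return False
    return all(seg is None or seg in CONSONANTS for seg in (onset, coda))

# paw [pɔ] = (p, ɔ, Ø);  eat [it] = (Ø, i, t)
assert is_valid_syllable("p", "ɔ", None)
assert is_valid_syllable(None, "i", "t")

# Substituting i for p, as if they stood in opposition, destroys the syllable:
assert not is_valid_syllable("i", "ɔ", None)   # *iɔ is not one syllable
# Substituting t for ɔ likewise destroys it:
assert not is_valid_syllable("p", "t", None)   # *pt is no syllable at all
```

The check fails exactly where the text says it must: vowels and consonants cannot be exchanged because they occupy different phonological positions within the syllable.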
Since the phonological positions must be defined with respect to the structure of the syllable, the analysis of (2) is clearly incorrect. The correct analysis must be
where the symbol Ø signifies an empty position.
It is clear from the foregoing that vowels and consonants can never occur in identical phonological positions. We should not confound the notion of the phonological position with the place in a phoneme string. For example, pɔ and it are phoneme strings with an underlying syllable structure. If we take these words merely as phoneme strings, then the consonant p in pɔ correlates with the vowel i in it as the phonemes that occupy the first place in both strings. But if we approach both strings from the point of view of their syllable structure, then, as was shown above, the vowels and consonants in these strings do not constitute oppositions, because they occur in different phonological positions.
There is parallelism between the relation of central and noncentral phonemes in a syllable and the relation of central (stressed) and noncentral (unstressed) syllables in a phonological word. The central and noncentral syllables of phonological words can be in opposition to syllables of the same type; that is, central syllables can be in opposition only to central syllables, and noncentral syllables can be in opposition only to noncentral syllables. Consider the word rampage, which can be stressed either on the first or on the second syllable: ˈræmpeɪdʒ or ræmˈpeɪdʒ.
The variants of the word rampage are opposed as follows:
In addition to the culminative function, the stress may have the distinctive function, which must be considered its secondary function. The distinctive function of the stress presupposes the culminative function, but the reverse is not true: the stress may not, and usually does not, have the distinctive function. Here is an example of words that are in opposition with respect to the culminative and the distinctive functions: tórment and tormént. This opposition can be represented by the following diagram:
One might wonder why we could not oppose the first and second syllables of ˈræmpeɪdʒ to the first and second syllables of ræmˈpeɪdʒ as follows:
These substitutions would violate operational definition (1), because they would destroy the normal stress patterns of words. As a result of our substitutions, we would get impossible stress patterns: either ræm-peɪdʒ, a word without a stress at all, or ˈræm-ˈpeɪdʒ, with two equivalent stresses on the first and the second syllables, which amounts to no stress at all; the two substitutions have the same result.
Similar considerations apply to the words tórment and tormént. The correct opposition of these words is presented in (5). If we assume the opposition
then, by substituting the respective syllables, we will violate the operational definition, because we will obtain an incorrect stress pattern of words: either a word pattern without any stress on its two syllables, or a word pattern with two equivalent stresses on its syllables, which amounts to no stress at all.
The syllable and the phonological word differ in that the phonological word is a sign—that is, it has some meaning—while the syllable is not a sign; it does not have any meaning. Of course, there are languages, such as Vietnamese, that have only monosyllabic words. But also in this case syllables taken by themselves have no meaning: Vietnamese syllables have meaning only insofar as they are used as phonological words.
The syllable is a universal phonological unit: there are no languages lacking syllables. But the phonological word is not a universal phonological unit: there exist languages in which the stress does not fall on separate words.
For instance, French is quite interesting in this respect. L. V. Ščerba compares the function of stress in Russian and in French:
In Russian the speech flow is separated into words due to the fact that every word possesses verbal stress: my čitáem knígu naúčnogo soderžánija ‘we are reading a scientific book’; this sentence possesses five stresses and five words. It is, of course, true that in Russian there exist some unstressed words, the so-called enclitics, and especially proclitics; yet this does not change the basic fact, since the number of such unstressed words is small; and since such words possess, as a rule, the character of movable prefixes and suffixes; moreover, the word which is affected by enclitics always preserves stress in its usual place. Therefore we talk in Russian about the ‘verbal stress.’ No analogy to this exists in French: there stress relates not to individual words but to a group of words, which represent a complete meaningful unit. The stress falls on the last syllable of the final word of the group unless the penult contains the so-called e muet (the only apparent exception to this is the pronoun le which can stand in the terminal position of such a group but remains under stress: donne-le!). The remaining words of the group remain unstressed, as can be seen from these examples: un grand mouchoir de soie ‘a large silk handkerchief’; je viendrai vous voir ‘I will come to see you’; quelques minutes après ‘a few minutes later’; en lisant le journal ‘while reading the newspaper’; aussi vite qu’il pouvait ‘as fast as he could.’ (Ščerba, 1948: 82)
These examples show that the number of stresses does not necessarily signalize the number of independent words in a given speech flow; in languages such as French, stresses signalize only the number of separate word groups in a given speech flow.
While in some languages stresses can correspond to units larger than separate words, namely, to word groups, in other languages they can also correspond to units smaller than words. Let us take, for instance, the German word Wachsfigurenkabinett ‘cabinet for wax figures’. In this word there occur three stresses: a primary one on the first syllable and two secondary ones on the syllables -nett and -gur-. This example shows that in German, stress signalizes not only the number of separate words in a given speech flow but also the number of parts in a composite word. Comparing the function of stress in German with the function of stress in Russian, A. Martinet writes:
In languages such as German the situation is clear: every element of the composite word preserves the stress which characterizes it as an individual word; the second syllable of the word Figur ‘figure’ always preserves its stress, independently of whether the word Figur constitutes an autonomous member of a sentence or a component of a composite word. Quite different is the situation in such languages as Russian where all components of the composite word, with the exception of one, lose their proper stress: the word nos ‘nose’ loses its stress and the timbre of its vowel when it becomes a component of the composite word nosorog ‘rhinoceros’; in the German equivalent of this word, Nashorn, on the contrary, every one of the components preserves its proper stress; there occurs only the subordination of the stress of the component -horn to the stress of the component Nas-. Thus the accentual unit in Russian is the word, and in German the lexeme. (Martinet, 1949: 89)3
The largest phonological unit is the phonological sentence characterized by intonation. Intonation also characterizes word groups as parts of a sentence. The sentence and its parts characterized by intonation can be called intonational units. It should be noted that intonation has not only the culminative function, signaling how many sentences the speech flow consists of, but also the distinctive function. For example, the opposition between rising and falling intonation is used to differentiate between subordinate and main clauses.4
The analysis of phonological units in 2.10.1 allows us to posit the following four universals:
(8) LAW OF PHONOLOGICAL OPPOSITION:
No language can have the opposition of vowels and consonants: vowels can be in opposition only to vowels, and consonants can be in opposition only to consonants.
(9) LAW OF PHONOLOGICAL CONTRAST:
Every language has the contrast of vowels and consonants.
(10) LAW OF PHONOLOGICAL FUNCTIONS:
In every language the distinctive function is obligatory for consonants and optional for vowels, since the culminative function is the only essential phonological function of vowels.
(11) LAW OF MINIMAL VOCALISM:
For every language, a possible minimal vocalism consists either of two vowels having the distinctive function or of one vowel having no distinctive function.
Recall that, by the definition in 1.1, the term opposition designates the paradigmatic relation, and the term contrast designates the syntagmatic relation.
The above phonological laws are not inductive laws. Rather, they are consequences of the analysis of the semiotic properties of phonological systems. It follows from the Law of Minimal Vocalism that there can be languages having a monovocalic phonological system.
The Law of Minimal Vocalism is based on the following assumptions. Since, under the Law of Phonological Opposition, vowels can be in opposition only to vowels, never to consonants, and since any opposition presupposes at least two phonemes, any vocalism having the distinctive function must consist of at least two vowels. As to the culminative function, it is based on contrast, and since, under the Law of Phonological Contrast, every language has the contrast of vowels and consonants, a minimal vocalism restricted to the culminative function alone can be confined to a single vowel.
Minimal phonological vocalism confined to a single vowel is a theoretical possibility supported by an analysis of the semiotic properties of the phonological system. But it is an empirical question whether at present there exist languages having a vocalism confined to a single vowel. Some linguists have claimed that Kabardian and Abaza, North Caucasian languages, are phonologically unique in that they have no vocalic opposition and are confined to a single vocalic phoneme. This claim has been the subject of controversy (see Genko, 1955; Kuipers, 1960; Allen, 1965; Szemerényi, 1964 and 1977; Lomtatidze, 1967; Halle, 1970; Kumaxov, 1973; Colarusso, 1975).
By the same token, Gamkrelidze and Mačavariani have considered the Proto-Kartvelian phonological system to be monovocalic. Thus, Gamkrelidze, in an article that summarizes the results of the massive phonological research on the vocalism in Kartvelian languages, done by him and Mačavariani (1965), writes as follows:
We may envision an earlier stage of Common Kartvelian with no phonemic contrasts between vowels, assigning the two vowels *e and *a as allophones to one original vowel, which later split into different phonemic units according to the character of its allophones. (Gamkrelidze, 1966: 80)
I intend to examine closely the hypotheses of monovocalism in Kartvelian and North Caucasian languages in a separate work.
There are already a great number of publications on the phonological structure of the syllable. Among the earlier studies, the most important is the work of Kuryłowicz (1948). Important recent contributions include the works of Kahn (1976), Vennemann (1978), Halle and Vergnaud (1979), and Clements and Keyser (1983), among others. A critical survey of these works is, however, outside the scope of this investigation. I shall examine the phonological structure of the syllable only from the standpoint of the semiotic approach. As a point of departure, I will take some notions about the syllable on which there has been a convergence of opinion in the more recent literature. Then I will apply a semiotic analysis to these notions and will trace out its important consequences, which throw new light on the phonological nature of vowels.
The syllable consists of three parts: 1) the onset, 2) the nucleus, and 3) the coda. For example, in the syllable bed, b is the onset, e is the nucleus, and d is the coda. The nucleus and coda combined are called the core. This analysis of the syllable can be represented by the following constituency tree:
The phonemes that function as nuclei are called vowels. The phonemes that do not function as nuclei are called consonants. A syllable can be reduced to its nucleus, but never to its onset or coda. In other words, a syllable may consist solely of a vowel but never solely of a consonant or a consonant cluster.
A syllable without a coda is called an open syllable, and a syllable with a coda is called a closed syllable.
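The constituency analysis just described can be made concrete with a small data structure. This is a sketch only; the field names mirror the terms of the text (onset, core, nucleus, coda), and the open/closed distinction falls out of whether the coda is empty.

```python
# Sketch of the constituency analysis: syllable = onset + core,
# core = nucleus + coda. Field names follow the text's terminology.
from dataclasses import dataclass

@dataclass
class Core:
    nucleus: str       # the vowel, the obligatory center of the syllable
    coda: str = ""     # may be empty, giving an open syllable

@dataclass
class Syllable:
    onset: str
    core: Core

    @property
    def is_open(self):
        """A syllable without a coda is open; with a coda, closed."""
        return self.core.coda == ""

bed = Syllable(onset="b", core=Core(nucleus="e", coda="d"))
print(bed.is_open)  # False: bed is a closed syllable
paw = Syllable(onset="p", core=Core(nucleus="ɔ"))
print(paw.is_open)  # True: pɔ is an open syllable
```

Note that the hierarchy of the tree is preserved: onset and core sit at the first level, nucleus and coda inside the core at the second.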
The above constituency tree specifies a hierarchy of the components of the syllable. The onset and the core belong to the first level of the hierarchy, and the nucleus and the coda to the second level.
In positing this hierarchy, we are able to explain some important properties of the syllable.
In many languages closed syllables either do not occur at all (as in Old Russian) or occur very rarely (as in Georgian or Italian), while there are no languages in which all syllables begin with vowels, that is, lack onsets. In languages where closed syllables occur, their occurrence presupposes the occurrence of open syllables. This can be explained by the fact that the opposition of the components onset:core is basic, because these components belong to the first level of the above hierarchy. This basic opposition must occur in every language. On the other hand, the opposition of the components nucleus:coda is not basic, because these components belong to the second level of the hierarchy. Therefore, in some languages this opposition can be reduced to the nucleus.
Other important phenomena explained by the above hierarchy are the quantity and the intonation of the syllable. There is an intimate relationship between the quantity or the intonation of the syllable and its core to the exclusion of its onset. The quantity of a syllable is determined by the core, that is, by the nucleus plus the coda, so that the equivalence holds:
(13) long nucleus = short nucleus + coda
In many languages that have syllables whose core consists only of a short vowel, this vowel cannot be stressed: stress must pass onto a neighboring vowel. Such syllables are called light. A syllable whose core consists of a long vowel, two short vowels, a short vowel plus a consonant, or combinations of these is called heavy. The stress placement may depend on the distinction of light and heavy syllables. For instance, in Latin, stress is placed on the penultimate syllable of a word if it is heavy, and on the antepenultimate syllable if the penultimate syllable is light. Compare: 1) rédigo ‘I drive back’, 2) redḗgi ‘I have driven back’, and 3) redáctum ‘driven back’. Stress is placed on the penultimate syllables in 2) and 3) and on the antepenultimate syllable in 1) in accordance with the rule of stress placement.
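The Latin rule is simple enough to state as a procedure. The sketch below encodes syllable weight via equivalence (13): a syllable is heavy if its nucleus is long or it has a coda. The tuple representation of syllables is my own simplification.

```python
# Sketch of the Latin stress rule: stress the penult if it is heavy,
# otherwise the antepenult. A syllable is a (nucleus_is_long, coda) pair;
# the weight computation follows equivalence (13).

def is_heavy(syllable):
    """Heavy = long nucleus, or short nucleus + coda (equivalence (13))."""
    nucleus_is_long, coda = syllable
    return nucleus_is_long or coda != ""

def stressed_index(syllables):
    """Index of the stressed syllable in a Latin word of three or more
    syllables: the penult if heavy, else the antepenult."""
    penult = len(syllables) - 2
    return penult if is_heavy(syllables[penult]) else penult - 1

# ré-di-go: penult -di- is light (short vowel, no coda) -> antepenult
redigo = [(False, ""), (False, ""), (False, "")]
print(stressed_index(redigo))  # 0
# re-dḗ-gi: penult has a long vowel -> penult
redegi = [(False, ""), (True, ""), (False, "")]
print(stressed_index(redegi))  # 1
# re-dác-tum: penult is closed by a consonant -> penult
redactum = [(False, ""), (False, "k"), (False, "m")]
print(stressed_index(redactum))  # 1
```

The three results reproduce the stress placements of rédigo, redḗgi, and redáctum cited above.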
The intonation of a syllable extends onto the nucleus and a certain part of the coda. For example, in Lithuanian ver̃kti ‘to cry’, consisting of two syllables ver̃k-ti, the rising intonation extends onto the e+r̃ of the syllable ver̃k.
The basic problem in defining the syllable is the definition of the syllable boundary. The definition of the syllable must be based on the syllable boundary as a primitive notion. Taking the notion of the syllable boundary as primitive, we can define the syllable as follows:
(14) The syllable is a maximal string of phonemes between two syllable boundaries.
If we accept this definition of the syllable, we have to turn to the definition of the syllable boundary. To define the syllable boundary, I will introduce the notion of the interlude. I use the term interlude to denote a string of one or more consonants between two vowels—an intervocalic string of consonants.
An interlude may have either a binary or a unary structure. A binary interlude has two components: the left part, which constitutes the coda of the preceding syllable, and the right component, which constitutes the onset of the succeeding syllable. A unary interlude must be conceived of as a binary interlude reduced to its right component, which constitutes the onset of the succeeding syllable. It follows that the onset component is the constitutive component of the interlude: the coda component of the interlude presupposes the onset component, but the onset component does not presuppose the coda component.
The basic assumption in defining the syllable boundary is that there is an intimate relationship between word structure and syllable structure. Ideally, the same sequential constraints that operate at the beginning of a word should be operative at the beginning of a syllable; and the same sequential constraints that operate at the end of a word should be operative at the end of a syllable.
The syllable boundary is defined by the following principles:
1) The interlude constitutes the onset of the next syllable, unless the preceding syllable cannot be kept open because its vowel does not occur in word-final position. If that is the case, then as many consonants as necessary to provide the syllable with an admissible coda must be detached from the onset of the next syllable and transferred to the preceding syllable.
To illustrate the first principle, let us take words such as eastern and tester. The word eastern can be syllabified as i$stərn, where $ denotes the syllable boundary. But the word tester cannot be syllabified as te$stər, because this syllabification violates a sequential constraint of English by which the short vowels e, u, o, æ are disallowed in word-final position. Since te$stər contains the vowel e, which does not occur word-finally, it must be resyllabified to yield tes$tər.
2) If the interlude does not occur in word-initial position, then as many consonants as necessary to reduce it to the admissible word-initial shape must be detached from it and transferred to the preceding syllable as coda.
These are the two main principles that define the syllable boundary. In addition, there are a few special principles of the syllable boundary which there is no need to discuss in this very general outline of phonology.
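The two principles can be sketched as a procedure for dividing an interlude. The inventories below (vowels, vowels barred from word-final position, admissible word-initial onsets) are tiny illustrative fragments chosen to cover the eastern/tester example, not a description of English.

```python
# Sketch of the two syllable-boundary principles. The inventories are
# illustrative assumptions, sufficient only for the eastern/tester example.

NONFINAL_VOWELS = {"e"}                # short vowels that cannot end a word
INITIAL_ONSETS = {"", "t", "s", "st"}  # admissible word-initial clusters

def split_interlude(interlude, preceding_vowel):
    """Divide an intervocalic consonant string into (coda, onset):
    the coda closes the preceding syllable, the onset opens the next."""
    coda, onset = "", interlude
    # Principle 2: the onset must be an admissible word-initial cluster;
    # detach consonants to the coda until it is.
    while onset not in INITIAL_ONSETS:
        coda, onset = coda + onset[0], onset[1:]
    # Principle 1: if the preceding vowel cannot be word-final, the
    # preceding syllable cannot stay open; give it a coda.
    if preceding_vowel in NONFINAL_VOWELS and coda == "" and onset:
        coda, onset = onset[0], onset[1:]
    return coda, onset

# eastern: i may end a word and st is a good onset -> i$stərn
print(split_interlude("st", "i"))  # ('', 'st')
# tester: e cannot end a word -> one consonant is transferred -> tes$tər
print(split_interlude("st", "e"))  # ('s', 't')
```

The design mirrors the text's claim that the onset is the constitutive component of the interlude: the coda receives consonants only when one of the two principles forces it to.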
In 10.1 it was shown that vowels differ from consonants in that vowels have the culminative function and consonants do not. A vowel constitutes the center, the nucleus, of a syllable, while consonants are in marginal positions; consonants are satellites of vowels. This phonological definition of vowels and consonants contrasts with the phonetic definition of vowels as sounds characterized by voice modified by various shapes of the oral cavity, and of consonants as sounds produced by the closure of air passages.
The proposed phonological definition of vowels and consonants is not generally accepted in the current phonological literature. While some linguists recognize that in many cases the roles played by phonemes in the syllable may be a useful basis for a functional distinction between vowels and consonants, they deny that this approach can be used for a universal phonological definition of vowels and consonants. For example, Martinet admits that it is mostly expedient to distinguish between phonological systems of vowels and consonants. He writes:
What is expected of consonants and vowels is not that they should appear in the same contexts, that is they should be in opposition, but that they should follow one another in the chain of speech; in other words, we expect them to be in contrast. (Martinet, 1960: 72)
But at the same time he makes the following reservation:
This does not mean that certain sounds cannot, according to the context, function as the syllabic peak, which is normal for a vowel, or as the flanking unit of this peak, which is normal for a consonant. [i] in many languages is a syllabic peak before a consonant and the adjunct of such a peak before a vowel: e.g. French vite and viens. [l] is a syllabic peak, i.e. a vowel, in the English battle or Czech vlk ‘wolf,’ but a consonant in English lake or Czech léto ‘year.’ In these circumstances there is no point in distinguishing two phonemes, one vocalic and the other consonantal. (Martinet, 1960: 72-73; emphasis added)
The fact that consonants can sometimes be used as syllabic nuclei and vowels as satellites of syllabic nuclei seems to be evidence that a phonological definition of vowels and consonants based upon their function in the syllable cannot be universally valid. And yet, if correctly interpreted and understood, it does not undermine the universal validity of this definition. It is true that one and the same phoneme may function sometimes as a syllable nucleus and sometimes as a nonsyllabic phoneme in the same language. But we must distinguish between the primary and secondary functions of a phoneme. Thus, the primary function of vowels is to serve as syllable nuclei, while their secondary function is to serve as consonants. Conversely, the primary function of consonants is to serve as satellites of syllable nuclei, while their secondary function is to serve as syllable nuclei.
The distinction between the primary and secondary functions of vowels and consonants is based on their range. By the range of vowels and consonants I mean their distribution within a syllable. If the range of a phoneme is greater when it serves as a syllable nucleus than when it serves as a satellite, then the primary function of the phoneme is to be a syllable nucleus, and its secondary function is to be a satellite. Conversely, if the range of a phoneme is greater when it serves as a satellite than when it serves as a syllable nucleus, then the primary function of the phoneme is to be a satellite, and its secondary function is to be a syllable nucleus.
It is to be noted that the notion of the range of the phoneme has nothing in common with the statistical notion of frequency. The range of a phoneme is defined solely by its distributional possibilities. For example, Czech r and l as satellites occur in syllable-initial and syllable-final positions, while as syllable nuclei they occur only between consonants. Therefore, their primary function is to be satellites, while their secondary function is to be syllable nuclei. The French i as a syllable nucleus occurs between syllable-initial and syllable-final consonants, between zero onset and syllable-final consonants, and between syllable-initial consonants and zero coda, while as a satellite it occurs only before vowels. Therefore, the primary function of the French i is to be a syllable nucleus, and its secondary function is to be a satellite.
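The range criterion amounts to comparing two sets of distributional possibilities. The sketch below encodes the Czech and French examples from the text; the position labels are my own shorthand for the frames described there.

```python
# Sketch of the range criterion: the primary function of a phoneme is the
# one in which its distribution within the syllable is wider. Position
# labels are shorthand for the frames described in the text.

def primary_function(nucleus_positions, satellite_positions):
    """Compare the two ranges (sets of syllable positions) and return
    the phoneme's primary function."""
    if len(nucleus_positions) > len(satellite_positions):
        return "syllable nucleus"
    return "satellite"

# Czech l: satellite syllable-initially and syllable-finally;
# nucleus only between consonants.
czech_l = primary_function(
    nucleus_positions={"C_C"},
    satellite_positions={"syllable-initial", "syllable-final"},
)
print(czech_l)  # satellite

# French i: nucleus in three frames (C_C, Ø_C, C_Ø); satellite only
# before vowels.
french_i = primary_function(
    nucleus_positions={"C_C", "Ø_C", "C_Ø"},
    satellite_positions={"before vowel"},
)
print(french_i)  # syllable nucleus
```

As the text stresses, this comparison is purely distributional; no frequency counts enter into it.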
The distinction between the primary and secondary functions of vowels and consonants in the syllable throws new light on the problem recently raised by Clements and Keyser (1983). They claim that in some cases long clusters of consonants may contain consonants that are not members of any syllable. Such consonants they call extrasyllabic. They characterize extrasyllabic consonants as follows:
An extrasyllabic consonant is one which is not a member of any syllable. Typically, such consonants are separated from neighboring consonants by short neutral or voiceless vowels and are historically susceptible to processes which either eliminate them or incorporate them into well-formed syllables by means of processes such as vowel epenthesis, sonorant vocalization and metathesis. English has several such examples. The usual pronunciation of knish in Cambridge, Massachusetts, for example, inserts a short, voiceless schwa after the k. However, this is not a full schwa as is evidenced by its near minimal contrast with the word canoe. Other examples of extrasyllabic consonants in English include the initial consonants of the names Pnin, Knievel, Zbiegniew, Khmer Rouge, Dvořák, Phnom Penh, Dmitri and Gdansk, in usual pronunciations, not to speak of the b in common renderings of the name of the former Iranian minister Ghotbzadeh. (Clements and Keyser, 1983: 39-40)
Here is an example of extrasyllabic sonorants from Klamath (an American Indian language of Oregon), which Clements and Keyser represent in the following diagrams (Clements and Keyser, 1983: 121):
Clements and Keyser have introduced the notion ‘extrasyllabic consonant’ in order to handle complex consonant clusters that are not parsable by the rules of their theory of the syllable. The difficulty lies not in the complexity as such but in the fact that such consonant clusters are irregular in a given language and can be parsed as part of a syllable neither by the rules of the theory of Clements and Keyser nor by the rules of any other viable theory of the syllable.
We can understand the motivation for the notion ‘extrasyllabic consonant’, but this notion does not seem to resolve the difficulty. Although this solution does away with irregular consonant clusters, a new difficulty arises: extra-syllabic consonants as such are heterogeneous elements, and by being heterogeneous elements they interfere with the sequential structure of the phonological word and make it irregular. This irregularity seems to be as odd as the irregularity of consonant clusters.
The answer to our problem is in the distinction between the primary and secondary functions of phonemes. Relying on this distinction, we can treat what Clements and Keyser call extrasyllabic consonants as syllable nuclei with or without satellites, depending on their number. Thus, the function of the second sonorant l in the Klamath word ὠillGa and the sonorant n in the Klamath word nἰəpk̓a are counterparts of the function of the sonorant l in the Czech word vlk ‘wolf’. In this case, consonants have taken on the function of vowels, that is, the function of the syllable nucleus, as their secondary function. Similarly, in the above examples from English, such as Gdansk or Zbigniew, the consonants g and z have taken on the function of the syllable nucleus. Since this function is more unusual for g and z than for the sonorants r, l, m, and n, an ultra-short schwa is inserted to support the pronunciation of g and z in this unusual position.
If we compare the solution to the problem of irregular consonant clusters using the notion ‘extrasyllabic consonant’ and using the distinction between the primary and secondary functions of the phoneme, we can see that the odds are in favor of the latter solution. The extrasyllabic consonant is an ad hoc notion introduced especially to solve the problem of irregular consonant clusters, while the primary and secondary functions of phonemes are notions introduced on independent grounds to explain the structure of the syllable. The latter notions are an instance of the even broader notions of primary and secondary functions of semiotic units, which are a cornerstone of semiotics and must be a cornerstone of any adequate linguistic theory. In addition, while the introduction of the notion ‘extrasyllabic consonant’ purports to solve the problem of irregular consonant clusters, it generates a new difficulty, since, as was pointed out above, extrasyllabic elements are heterogeneous elements that make the structure of a phonological word irregular. Relying on the distinction between the primary and secondary functions of phonemes, we solve the problem of irregular consonant clusters not by introducing new notions but by treating some consonants as syllable nuclei that are consonantal, rather than vocalic.
We are now ready to consider the notion ‘prosodic feature’. Prosodic features can be defined as elements that characterize units of speech flow whose duration differs from the duration of phonemes. These units are usually larger than phonemes: a syllable, or a phonological word consisting of one or more syllables; they can, however, be smaller than the phoneme, as when the syllabic nucleus splits into two consecutive units called morae.
In essence, prosodic features are subdivided into two groups: accentual and nonaccentual.
Accent, or stress, should be defined as the setting off of one syllable within a bisyllabic or a polysyllabic word. The basic function of stress is the so-called culminative (crest-forming) function, which consists in the fact that the accents signalize the number of independent words within the given speech flow. Since every self-contained word usually possesses only one stress, it follows that the number of stresses in any given speech flow determines the number of self-contained words, as well.
With respect to the place it occupies in the word, stress can be either bound or free. A stress is called a bound stress if in the given language it always falls in one and the same place (in Czech, for instance, the stress falls always on the first syllable, in Turkish always on the last syllable), or if its place is determined strictly by the phonemic structure of the word (in Latin, for instance, where the stress can fall on either the penult or the antepenult, its place is determined by the phonemic structure of the penult). A stress is called a free stress if it can occupy various places independently of the phonemic structure of the word; consequently, it possesses, aside from the culminative function, also the function of word differentiation (for instance, in Russian, where the stress is free, the words píli ‘they drank’ and pilí ‘saw (imperative)’ differ phonologically only with respect to their place of stress).
According to the manner of the setting off of the stressed syllable, we differentiate the following types of stress:
(a) strong, or dynamic, stress (the stressed syllable is set off by greater tenseness of its articulation);
(b) quantitative stress (the stressed syllable is set off by increased lengthening of the articulation of the vowel); and
(c) tonal, or musical, stress (the stressed syllable is set off by a change in the tone pitch).
Dynamic stress, which is characteristic of Russian, is rather widespread among the various languages of the world.
Quantitative stress is seldom encountered; it is found, for instance, in modern Greek.
Tonal stress, which is used by various languages of the world, can have several variations. In Lithuanian there exists an opposition of a rising and a falling intonation. In Swedish and Norwegian there exists an opposition of a simple and a complex stress; a simple stress consists of a falling or rising pitch (the direction of the tone movement is immaterial, since it changes from dialect to dialect), while a complex stress consists of a falling-rising pitch.
Prosodic features that in one and the same word are relevant to more than one syllable are called nonaccentual. For example, in the African language Lonkundo there exists a contrast between the words lòcòlò ‘palm fruit’ and lòcóló ‘invocation’ (the symbol ` represents a low-register tone, the symbol ´ a high-register tone). Nonaccentual prosodic features do not have the culminative function; their only purpose is word differentiation.
In order to simplify the description of prosodic features, it is useful to introduce the concept of the mora. The mora is the minimal segment of the speech flow that can be a carrier of a prosodic feature. In respect to the concept of the mora, a long vowel phoneme that, let us say, has a rising pitch can be regarded as a succession of two morae, the first of which is a carrier of a sharp tone of low register and the second, a carrier of a sharp tone of high register. On the other hand, if the long vowel has a falling pitch, it can be regarded as a succession of two morae, the first of which is a carrier of a sharp tone of high register and the second, a carrier of a sharp tone of low register. For instance, if we take the Lithuanian word nósis ‘nose’ (with a falling intonation) and the word takas ‘footprint, track’ (with a rising intonation), we can regard each of these words as possessing three morae. In the word nósis the first mora is set off; in the word takas, the second. A simplification of the description can be attained by reducing the twofold characteristic of intonation (quality of intonation and place of intonation) to a single characteristic (place of intonation), because the application of the concept of the mora makes the quality of intonation phonologically redundant.
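The reduction from intonation quality to intonation place can be sketched directly. The representation below (morae as vowel-register pairs) is my own illustration of the analysis in the text.

```python
# Sketch of the mora analysis: a long vowel with rising pitch is two morae,
# low then high; with falling pitch, high then low. The representation of
# morae as (vowel, register) pairs is an illustrative assumption.

def morae(long_vowel, pitch):
    """Decompose a long vowel into two morae tagged with tone register."""
    if pitch == "rising":
        return [(long_vowel, "low"), (long_vowel, "high")]
    if pitch == "falling":
        return [(long_vowel, "high"), (long_vowel, "low")]
    raise ValueError("pitch must be 'rising' or 'falling'")

def prominent_mora_index(pitch):
    """The place of intonation: which mora carries the high register.
    This single characteristic makes intonation quality redundant."""
    registers = [reg for _, reg in morae("V", pitch)]
    return registers.index("high")

# Lithuanian nósis (falling): the first mora is set off;
# takas (rising): the second.
print(prominent_mora_index("falling"))  # 0
print(prominent_mora_index("rising"))   # 1
```

Once the place of the prominent mora is recorded, the rising/falling label can be recovered from it, which is the simplification the text describes.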
The above discussion presents, in short, the classification of prosodic elements in modern phonology. It is unnecessary to go into further detail here.
Prosodic features have a purely relational nature; they are independent of definite phonic substance. The essential thing is that we distinguish between prosodic and nonprosodic elements, not because the substance of prosodic and nonprosodic elements is objectively different but because we get to them through two different analyses of the speech flow. Imagine a language that would allow only syllables of the type mã or ba, that is, fully nasalized or fully nonnasalized syllables. If we assume at the same time that in this language every word could have only one nasalized syllable, it becomes apparent that the function of nasality would be basically identical to the function of stress in such languages as, let us say, Russian, English, or German. This mental experiment is a logical consequence of the following statement, which I formulate as the Principle of the Transposition of Phonological Structure:
Any phonological structure can be transposed from one phonic substance into another as long as the phonological relations characterizing this structure remain intact.
The Principle of the Transposition of Phonological Structure predicts that one and the same phonic substance can be used in one language as a prosodic feature and in another language as a nonprosodic feature. Consider the glottal stop. In Arabic the glottal stop is a phoneme of this language; in a few American Indian languages it is a distinctive feature characterizing glottalized phonemes. But in Lettish or Danish it is no longer a phoneme or a distinctive feature but, as it were, a kind of accent, that is, a prosodic feature.
In conclusion, a few words about nonaccentual prosodic features. One may wonder whether nonaccentual prosodic features could be treated simply as distinctive features. Thus, if we treated low-register and high-register tones as distinctive features, then we could treat ò and ó as two different phonemes, like i and u or k and g. This approach is certainly possible. However, it is consistent to regard different tones as prosodic features, because tones affect segments of the speech flow that do not necessarily coincide with phonemes: a given vowel phoneme may be pronounced with a sequence of tones, and a given tone may affect a number of vowels or fractions of vowels.
Let us turn to the phonological theory called generative phonology. The bases of generative phonology were elaborated by Morris Halle and Noam Chomsky between 1955 and 1968, when their book The Sound Pattern of English was published.
Generative phonology constitutes the phonological component of generative-transformational grammar. The phonological component consists of two levels: the level of systematic phonemics and the level of systematic phonetics. The level of systematic phonemics roughly corresponds to the morphophonemic level of non-transformational grammar, and the level of systematic phonetics roughly corresponds to the level of phonetics. The level of systematic phonemics consists of systematic phonemic representations of underlying forms, which are converted by phonological rules into systematic phonetic representations, which constitute the level of systematic phonetics.
The basic claim of generative-transformational grammar is that the phonemic level does not exist, and therefore—and this is a revolution—the old phonology must be rejected and replaced by a radically new discipline: generative phonology. Let us see whether this revolution is justified. In order to do so, we have to examine the arguments against the existence of the phonemic level.
In his book on Russian phonology, Morris Halle advanced an argument disposing of the phonemic level. He writes:
In the Russian example discussed the morphophonemic representation and the rule concerning the distribution of voicing suffice to account for all observed facts. Phonemic representation, therefore, constitutes an additional level of representation made necessary only by the attempt to satisfy Condition (3a). If Condition (3a) can be dispensed with, then there is also no need for the ‘phonemic’ representation. (Halle, 1959: 21)
Here is how Halle presents Condition (3a):
A phonological description must include instructions for inferring (deriving) the proper phonological representation of any speech event, without recourse to information not contained in the physical signal.
He adds in a footnote:
This requirement has played a particularly important role in the development of American linguistics. ‘For a notation to be phonemic we require a bi-unique, one-one relation rather than a many-one relation (between representation and utterance—M.H.).’ C. F. Hockett, Review of A. Martinet’s Phonology as Functional Phonetics, Language, 27, 340 (1951).
Halle makes the following comments about Condition (3a):
Condition (3a) is concerned with procedures that are essentially analytical. Analytical procedures of this kind are well known in all sciences. Qualitative and quantitative chemistry, electrical circuit analysis, botanical and zoological taxonomy, medical diagnosis are examples of disciplines concerned with discovering the appropriate theoretical representations (i.e., chemical formula, configuration of circuit elements, classification within the taxonomical framework, names of disease, respectively) of different complexes of observable data. Theoretical constructs are never introduced because of considerations that have to do with analytic procedures. Thus, for instance, it is inconceivable that chemistry would establish substances that can be identified by visual inspection as a category distinct from substances that require more elaborate techniques for their identification. Yet this is precisely the import of Condition (3a), for it sets up a distinction between phonemes and morphophonemes for the sole reason that the former can be identified on the basis of acoustic information alone, whereas the latter require other information as well.
So important a deviation from a standard scientific practice can only be justified if it were shown that phonology differs from other sciences in such a way as to warrant the departure. This, however, has never been demonstrated. Quite to the contrary, it has been common to stress the essential similarity between the problems of phonology and those of other sciences. The conclusion, therefore, imposes itself that Condition (3a) is an unwarranted complication which has no place in a scientific description of language.
I agree with Halle’s criticism of Condition (3a). One may, however, wonder whether the phonemic representation would satisfy Condition (3a). As a matter of fact, the phonemic representation, if it is to be linguistically significant, should run counter to Condition (3a) rather than satisfy it. The two-level theory of phonology is not based on analytical procedures. It is based on the Principle of Semiotic Relevance.
An important consequence of the Principle of Semiotic Relevance is that phonemic representation should stand in a many-many relation to phonetic representation; that is, one phonemic symbol may correspond to several phonetic symbols, and one phonetic symbol to several phonemic symbols. Another important consequence of the Principle of Semiotic Relevance is the necessity of strictly distinguishing between distinctive and acoustic features: the former belong in phonemic representation, the latter in phonetic representation. Just as phonemes and sounds should stand in many-many relations, so distinctive features and acoustic features should stand in many-many relations as well.
Let us now turn to the Russian example considered by Halle. He presents the Russian words mok li ‘was (he) getting wet?’ and mok by ‘were (he) getting wet?’, žeč li ‘should one burn’ and žeč by ‘were one to burn’ in three types of transcription, as shown below:
      I          II         III
   mok l’i    mok l’i    mok l’i
   mok bi     mog bi     mog bi
   žeč l’i    žeč l’i    žeč l’i
   žeč bi     žeč bi     žeǯ bi
Column I represents a morphophonemic representation, column II a phonemic representation, and column III a phonetic representation.
In Russian, voicing is distinctive for all obstruents except c, č, and x, which do not possess distinctive voiced counterparts. These three obstruents are voiceless unless followed by a voiced obstruent, in which case they are voiced. At the end of the word, however, that is true of all Russian obstruents: they are voiceless, unless the following word begins with a voiced obstruent, in which case they are voiced.
The forms in column III can be deduced from the forms in column I by the rule
(2) obstruent → voiced in the context:____voiced obstruent.
But if grammar is to generate the forms in column II, it cannot have the one general rule (2); instead it must have two rules, (3a) and (3b), the first linking the morphophonemic representation with the phonemic representation and the second linking the phonemic representation with the phonetic representation:

(3a) obstruent other than c, č, x → voiced in the context:____voiced obstruent.
(3b) c, č, x → voiced in the context:____voiced obstruent.
Thus, assuming that there is a phonemic level, we cannot state the significant generalization that voicing assimilation applies uniformly to all obstruents. It is for difficulties of this type that Halle rejects a phonemic level.
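Halle's point can be made concrete with a sketch of the single rule (2) applied directly to the morphophonemic forms. The segment inventory and voicing pairs below are illustrative assumptions, not a description of the full Russian system:

```python
# Sketch of rule (2): obstruent -> voiced before a voiced obstruent.
# The voicing pairs are an illustrative fragment of the Russian inventory.
VOICED_OF = {"p": "b", "t": "d", "k": "g", "s": "z", "c": "ʒ", "č": "ǯ", "x": "ɣ"}
VOICED_OBSTRUENTS = set(VOICED_OF.values())

def apply_rule_2(segments):
    """Voice any obstruent immediately followed by a voiced obstruent."""
    out = list(segments)
    for i in range(len(out) - 1):
        if out[i] in VOICED_OF and out[i + 1] in VOICED_OBSTRUENTS:
            out[i] = VOICED_OF[out[i]]
    return out

def derive(morphophonemic):
    """Map a column I form ('#' marks the word boundary) to its column III form."""
    segs = [s for s in morphophonemic if s != "#"]  # the rule applies across the boundary
    return "".join(apply_rule_2(segs))

print(derive(["m", "o", "k", "#", "b", "i"]))   # mogbi: k voiced before b
print(derive(["ž", "e", "č", "#", "b", "i"]))   # žeǯbi: č voiced before b
print(derive(["m", "o", "k", "#", "l'", "i"]))  # mokl'i: no voiced obstruent follows
```

A grammar that had to generate the phonemic column as an intermediate step would instead need two such functions, one excluding c, č, x and one restricted to them, which is exactly the lost generalization at issue.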
What should we think about this example? To evaluate it properly, we have to make necessary corrections in the phonemic representation of the words. If we acknowledge the distinctive property of phonemes, we can see that the phonemic transcriptions mok l’i and mok bi are incorrect, since they do not take into account that at the end of the word all Russian obstruents can be only voiceless unless followed by a voiced obstruent, in which case they can be only voiced. The correct transcriptions are moK l’i and moK bi, because in these words the voicelessness of k is redundant and the voicing of g is redundant (symbol K denotes an archiphoneme restricted with respect to voicelessness: voicing, whose instances are sounds k and g).
Let us now compare the distinctive and redundant features of the phonemes presented in columns I and II.
1) In column I voicelessness is a distinctive feature of k, since k contrasts with g, and a redundant feature of č, since č has no voiced counterpart.
2) In column II voicelessness is redundant in K and č, and voicing is redundant in g. Therefore, they belong in the same class of obstruents with respect to the redundant features voiceless:voiced.
Accordingly, the correct phonological representation must be
(4) moK l’i    moK bi    žeč l’i    žeč bi
Rule (2) applied to the correct phonological representation will give the output in column III. In this view rule (2) is phonemic rather than morphophonemic.
We conclude that rule (2) can be considered morphophonemic or phonemic, depending on whether the morphophonemic or the phonemic level is chosen as the starting point of phonological theory.
The question arises, Which level must we choose as the starting point of phonological theory?
The goal of phonology is the study of the system of distinctive oppositions. Therefore, the starting point of this theory is the phonemic level as an ideal system of distinctive oppositions whose properties are determined by the Principle of Semiotic Relevance. The ideal system of distinctive oppositions is not based on analytic procedures but is postulated as a theoretical construct from which possible systems of distinctive oppositions are deduced.
The goal of generative phonology is the study of the system of alternations. Therefore, the starting point of this theory is the morphophonemic level postulated as a theoretical construct.
A system of distinctive oppositions is an essential part of any human language. A system of alternations is a significant but not essential part of a human language. We cannot imagine a human language without a system of distinctive oppositions, but we can imagine one without a system of alternations.
Since both a system of distinctive oppositions and a system of alternations constitute the expression plane of a language, we must seek to integrate studies of these systems into a single phonological theory.
Is it possible to realize this integration in the framework of generative phonology?
Generative phonology deals with processes of mapping the morphophonemic level onto the phonetic level. To describe these processes, we do not need an intermediary level; the phonemic level is redundant in generative phonology. In view of that, it is difficult to see how the study of alternations can be integrated with the study of distinctive oppositions on the basis of generative phonology.
It is not clear how distinctive oppositions can be defined on the basis of alternations, but it is natural to define alternations on the basis of distinctive oppositions. Alternations are nothing other than a subset of distinctive oppositions that hold between allomorphs of morphemes.
In this view, the morphophonemic level, that is, the level of alternations, is a sublevel of the phonemic level, which is the level of distinctive oppositions. Hence, our problem should be solved within a phonological theory whose starting point is the phonemic level.
Since the morphophonemic level is a sublevel of the phonemic level, it is partially subordinated to the phonemic level. Therefore, the essential step towards the creation of an integrated phonological theory is to determine the nature of this subordination.
Our first observation is that we must distinguish phonetically conditioned and nonphonetically conditioned alternations. Nonphonetically conditioned alternations are a projection of diachronic processes onto the synchronic structure of a language. Consider the alternations wiodę:wiedziesz ‘I lead:you lead’, biorę:bierzesz ‘I take:you take’, niosę:niesiesz ‘I carry:you carry’ in Polish. The vocalic alternation o:e in these words is not phonetically conditioned, as it was in Old Polish. In Old Polish this alternation was conditioned by nonpalatalized anterior lingual consonants. But in Modern Polish this conditioning has been lost, because at a later stage of the development of Polish a new e appeared before nonpalatalized anterior lingual consonants as a result of the change of the reduced vowel ь into e. Thus, the phonetically conditioned alternation o:e of Old Polish has been projected into Modern Polish as a morphologically conditioned alternation e:o. In Modern Polish the change of e into o is not conditioned by nonpalatalized anterior lingual consonants, since e occurs before these consonants (for instance, in sen ‘sleep’); it is conditioned by the verbal suffixes -ę (first person singular present) and -ą (third person plural present). Here the suffixes -ę and -ą can be viewed as operators that, when applied to stems ending in nonpalatalized anterior lingual consonants, imply the change of e into o.
We can posit the relation of implication between phonemic alternations and morphemes that condition phonemic alternations.
Abstracting from morphologically conditioned phonemic changes, we can posit ideal forms of morphemes underlying allomorphs. In our case the ideal forms of morphemes will be: wied-, bier-, nies-. They are ideal because they do not coincide with any of their allomorphs.
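The operator view of the suffixes can be sketched in code. This is only an illustrative toy: the membership of the class of nonpalatalized anterior lingual consonants, the stem spellings, and the segmentation of wiedziesz are drastic simplifications, not claims about Polish grammar.

```python
# Toy sketch of a morphologically conditioned alternation: the suffixes -ę
# and -ą act as operators implying the change e -> o in the stem. The
# consonant class below is an illustrative assumption, not an inventory.
NONPALATALIZED_ANTERIOR = set("tdsznrł")

def attach(stem, suffix):
    """Attach a verbal suffix; -ę and -ą trigger the e -> o alternation."""
    if suffix in ("ę", "ą") and stem[-1] in NONPALATALIZED_ANTERIOR:
        stem = stem.replace("e", "o", 1)  # alternate the stem vowel
    return stem + suffix

print(attach("wied", "ę"))      # wiodę  'I lead'
print(attach("bier", "ę"))      # biorę  'I take'
print(attach("nies", "ę"))      # niosę  'I carry'
print(attach("wiedzi", "esz"))  # wiedziesz  'you lead': no alternation
```

The point the sketch preserves is that the condition on the rule is the identity of the suffix, not any neighboring phoneme.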
Besides morphologically conditioned alternations, there are phonetically conditioned alternations. For instance, we have the alternation d: T in the Russian words sadu:saT ‘to the garden:garden’. This alternation is phonetically conditioned, since voiced obstruents automatically lose their voicing at the absolute end of words.
It should be noted that whereas morphologically conditioned alternations hold between the allomorphs of a morpheme, phonetically conditioned alternations hold between variants of the same allomorph. Thus, in our example the alternating parts of the Russian words, that is, sad- and saT-, should be considered variants of the same allomorph sad- rather than two different allomorphs.
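By contrast, a phonetically conditioned alternation can be stated without any reference to morphemes. A minimal sketch, with an illustrative partial devoicing table (the text writes the devoiced variant of d as T):

```python
# Sketch of a phonetically conditioned alternation: voiced obstruents
# automatically lose their voicing at the absolute end of a word. The
# pairing table is an illustrative fragment, not a complete inventory.
DEVOICE = {"d": "t", "b": "p", "g": "k", "z": "s", "ž": "š"}

def word_final_variant(allomorph):
    """Return the pronounced variant of an allomorph in absolute final position."""
    last = allomorph[-1]
    return allomorph[:-1] + DEVOICE.get(last, last)

print(word_final_variant("sad"))   # sat, the variant written saT in the text
print(word_final_variant("sadu"))  # sadu, the nonfinal d is unaffected
```

Because the condition is purely positional, sad- and saT- are variants of one allomorph, not two allomorphs.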
We conclude that alternations are grounded in the synchronic structure of the phonemic level. Therefore, we must posit the following constraint on possible rules in the integrated phonological theory:
No rules should be formulated in phonetic terms if they do not correspond to synchronic processes in the structure of the phonemic level.
I call this constraint the Condition of Synchronic Motivation. In accordance with this condition, we cannot formulate the alternation e:o in the above Polish example as a phonetic rule of the change of e into o before anterior lingual non-palatalized consonants, since in Modern Polish vowel e occurs in this position.
The Condition of Synchronic Motivation is an effective bar to substituting diachrony for synchrony in phonological studies.
In constructing a phonological theory, we can choose between two approaches: either the morphophonemic level or the phonemic level hypothesized as theoretical constructs can be taken as a starting point of the theory. The morphophonemic level is the starting point of generative phonology; the phonemic level is the starting point of the two-level theory of phonology.
Since the morphophonemic level and the phonemic level are interrelated, we must seek to integrate the study of the two levels in a single phonological theory.
Since the phonemic level is redundant in generative phonology, it would be difficult to realize this integration on the basis of generative phonology.
The study of the phonemic and the morphophonemic levels can be naturally integrated within a theory that takes the phonemic level as its starting point.
Integrated phonological theory must have two basic goals: 1) the exploration of systems of distinctive oppositions as autonomous objects; and 2) the exploration of systems of alternations as objects partially subordinated to systems of distinctive oppositions.
Why should systems of distinctive oppositions be considered autonomous objects? Because they are completely independent of systems of alternations. Although systems of alternations play an important part in natural languages, they are not absolutely essential: we can imagine a language without a system of alternations, but no natural language can exist without a system of distinctive oppositions.
The basic methodological constraint on possible rules in integrated phonological theory is the Condition of Synchronic Motivation. This condition involves a consistent differentiation between phonetically conditioned and non-phonetically conditioned rules. Phonetically conditioned rules cover variations of sounds occurring in the speech flow as instances of phonemes and phonetically conditioned alternations of nonrestricted (or free) and restricted phonemes. Nonphonetically conditioned rules cover nonphonetically conditioned alternations (morphophonemic rules) and syllable structure (syllable structure rules).
The strict separation of functional and physical equivalence of sounds and acoustic features, functional and physical segmentation of the speech flow, and phonetically and nonphonetically conditioned alternations should be considered several of the universal constraints on the construction of phonological models. The study of universal constraints on the construction of phonological models is a basic methodological goal of an integrated theory of phonology.
In constructing phonological models, we may face the possibility of alternative choices; for instance, we may have to make a choice between a binary feature system (Jakobson, 1968; Chomsky and Halle, 1968) and a multivalued feature system (Ladefoged, 1971). The evaluation of different possibilities in constructing phonological models should be based partly on the abstract study of the properties of semiotic systems and partly on empirical evidence from linguistic typology, linguistic change, and comparative language acquisition studies by psycholinguists.
I have examined Halle’s arguments against the existence of the phonemic level and have demonstrated that they are false. I have shown that the system of phonemic oppositions is an essential aspect of any phonological system, while morphophonological alternations are optional, in principle. Therefore, the phonemic level is a basic and autonomous level of any phonological system, while the morphophonemic level, which is dispensable in principle, is subordinated to the phonemic level.
Phonology is an indispensable part of linguistics as a discipline that studies the phonemic level of language.
Granted that the claim about the nonexistence of the phonemic level is false, and granted that phonology is an indispensable part of linguistics, one may wonder, however, whether generative phonology could be considered valuable not as a replacement for phonology but as a modern version of morphophonology. Let us now consider whether this more modest claim of generative phonology might be acceptable.
The fundamental error of generative phonology is disregard of the properties of language as a sign system that, as was shown above, are characterized by the Principle of Semiotic Relevance and the Principle of Synchronic Stratification. The complete lack of understanding of the sign nature of language has led to disastrous consequences—to a confusion of synchrony and diachrony and a confusion of the phonemic level with the morphophonemic and phonetic levels.
Every state of a given language contains a series of layers that reflect different stages of its history. With respect to these layers, we can speak about the diachronic stratification of the state of a language. In the same way, the geological layers of the surface of the Earth reflect different stages of its history. But from a functional viewpoint, any language is synchronically stratified, and the synchronic stratification of a language sharply differs from its diachronic stratification. Moreover, these two stratifications, as is well known, conflict with each other. The diachronic stratification of a language is the object of internal reconstruction, which is part of the history of language. But it is one thing to study the diachronic stratification of a language as part of its history, and it is another thing to confound the diachronic stratification of a language with its synchronic stratification. And that is what generative phonology does.
In order to justify a confusion of synchrony with diachrony, Chomsky and Halle advanced a proposition that phonological rules are extrinsically ordered. This proposition is known as the Ordering Hypothesis. Chomsky and Halle say:
The Hypothesis that rules are ordered . . . seems to us to be one of the best-supported assumptions of linguistic theory. (Chomsky and Halle, 1968: 342)
Chomsky and Halle’s claim that the Ordering Hypothesis is one of the best-supported assumptions of linguistic theory is false. Any extrinsic ordering of rules is arbitrary and therefore unacceptable. True, language is a hierarchical system, but there must be an inner dependence between the rules of any natural hierarchy, rather than an extrinsic ordering. The order of rules may reflect the order of phenomena in time, but time order belongs to diachrony rather than synchrony.
Rule ordering is a formal device that creates a vicious circle in synchrony: the generation of phonological objects is justified by the Ordering Hypothesis, which in its turn is justified by the fact that ordered rules may generate phonological objects. That is a case of an arbitrary application of formal machinery to empirical facts, as discussed above in the section on the generativist notion of language (chap. 1, sec. 7). True, the generative model can generate phonological objects; but, as was shown above, from the fact that a mathematical design works, one cannot conclude that language works in the same way. The generative model is based on logical necessity, but logical necessity does not necessarily conform to empirical necessity. It may be in conflict with empirical necessity. And such is the case with respect to the formal machinery of generative phonology based on arbitrary rule ordering.
Generative phonology confounds functional description with internal reconstruction. Functional description belongs to synchrony, and internal reconstruction belongs to diachrony, but generative phonology lumps these two things together under the heading ‘linguistic competence’. The rules of generative phonology pretend to characterize the linguistic competence of the speaker-hearer. It is clear, however, that the abstract underlying structures that are proposed as an input of these rules cannot characterize a language either as a tool of communication or as a tool of cognition. The rules of generative phonology have nothing to do with the competence of the speaker-hearer, for whom what matters is the functional rather than the genetic dependencies between the elements of his language.
The confusion of synchrony and diachrony involves other grave conceptual difficulties, of which I will speak later. Let me first discuss some examples of the confusion of synchrony and diachrony in generative phonology:
Consider the following alternations of the front vowels in English before the -ity suffix: iy:e (serene:serenity), ay:i (divine:divinity), ey:æ (sane:sanity).
Chomsky and Halle propose the following underlying phonemic representations of the morphemes: |serēn|, |divīn|, |sǣn|.
These abstract forms coincide with historical reconstructions that are retained by the orthography.
Chomsky and Halle propose three rules to produce the current phonetic forms: 1) a laxing rule, which applies before the -ity suffix; 2) a vowel shift rule, which changes ī to ǣ, ē to ī, and ǣ to ē; and 3) a diphthongization rule, by which ǣ becomes æy, ī becomes iy, and ē becomes ey. The derivations for səriyn and sərɛnɪtɪ are given below:
   serēn         serēn+iti
   serēn         seren+iti     (laxing before -ity)
   serīn         seren+iti     (vowel shift)
   seriyn        seren+iti     (diphthongization)
It is clear that Chomsky and Halle are doing an internal reconstruction, but they present it as if it were a synchronic phenomenon. The above three rules describe diachronic phonetic changes but are presented as if they describe synchronic processes characterizing the speaker-hearer’s competence.
Chomsky and Halle confound synchrony with diachrony. This confusion is possible because the rules are ordered. But ordering the rules makes sense if it corresponds to a succession of changes in time; otherwise it is arbitrary. In our case, the ordered rules can be reasonably interpreted as describing a succession of phonetic changes in time; from a synchronic point of view, however, this ordering is arbitrary.
Two different notions are confused here: a diachronic notion of the phonetic change and a synchronic notion of the phonemic alternation. Phonemic alternations, rather than phonetic changes, belong to synchrony.
From a synchronic point of view, only one rule can be proposed here: a rule of phonemic alternations of iy with e, ey with æ, ay with i before the -ity suffix.
The confusion of phonetic changes with phonemic alternations is a consequence of the Ordering Hypothesis. Eliminate the rule ordering, and this confusion will be impossible.
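The single synchronic rule can be stated as a plain table of phonemic alternations, with no ordering at all. In the sketch below the transcriptions and the (onset, nucleus, coda) segmentation of stems are simplified assumptions:

```python
# Sketch of the single synchronic rule: the phonemic alternations iy ~ e,
# ey ~ æ, ay ~ i conditioned by the -ity suffix. Transcriptions simplified.
ALTERNATE = {"iy": "e", "ey": "æ", "ay": "i"}

def with_ity(onset, nucleus, coda):
    """Form the -ity derivative: the stem nucleus alternates, nothing else."""
    return onset + ALTERNATE[nucleus] + coda + "ity"

print(with_ity("sər", "iy", "n"))  # serene  ~ serenity
print(with_ity("dəv", "ay", "n"))  # divine  ~ divinity
print(with_ity("s", "ey", "n"))    # sane    ~ sanity
```

No diachronic derivation through ǣ-like intermediate stages is needed: the alternation is stated directly between the coexisting phonemic shapes.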
Let us now turn to another example. According to Chomsky (1964:88-90), the following phonological rules are valid in English:

(1) [k,t] → [s] in the context: ____ + [i,y]
(2) [s,z] + [i,y] → [š,ž] in the context: ____Vowel
Thus, according to rule (1) we have
(10) opaque → opacity
     logic → logicism
     democrat → democracy
     pirate → piracy
According to rule (2) we have
(11) race → racial
     express → expression

(12) erase → erasure
     enclose → enclosure
     revise → revision
If those phonological rules are regarded as unordered rules, having the form ‘morpheme X realizes phoneme Y in the context of Z ____ W’, then they should be supplemented by the rule

(13) [k,t] + [i,y] → [š] in the context: ____Vowel
to explain such facts as logician, delicious (cf. delicacy), relate → relation, ignite → ignition, etc. However, rule (3) can be dispensed with if the first two rules are ordered in such a way that rule (2) is applied to the result of the application of rule (1).
A grammar containing rules (1) and (2), applied in the order named, will give the following derivations:
(14) lajik+yɨn    prezident+i    prezident+i+æl
     lajis+yɨn    prezidens+i    prezidens+i+æl    (according to (1))
     lajišɨn      prezidens+i    prezidenš+æl      (according to (2))
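The force of the ordering can be checked mechanically. In the sketch below the transcriptions and segment classes are simplified assumptions, and boundary symbols are omitted:

```python
# Sketch of the ordering argument: applying rule (1) before rule (2)
# derives [š] in logician with no extra rule; the reverse order fails.
VOWELS = set("aeiouɨæə")

def rule_1(segs):
    """(1) k, t -> s before a following [i, y] (simplified)."""
    out = list(segs)
    for i in range(len(out) - 1):
        if out[i] in ("k", "t") and out[i + 1] in ("i", "y"):
            out[i] = "s"
    return out

def rule_2(segs):
    """(2) s,z + [i,y] -> š,ž before a vowel; the [i,y] is absorbed."""
    pal = {"s": "š", "z": "ž"}
    out, i = [], 0
    while i < len(segs):
        if (segs[i] in pal and i + 2 < len(segs)
                and segs[i + 1] in ("i", "y") and segs[i + 2] in VOWELS):
            out.append(pal[segs[i]])
            i += 2  # skip the absorbed i/y
        else:
            out.append(segs[i])
            i += 1
    return out

underlying = list("lajikyɨn")               # logician, boundaries omitted
print("".join(rule_2(rule_1(underlying))))  # (1) then (2): the attested lajišɨn
print("".join(rule_1(rule_2(underlying))))  # (2) then (1): lajisyɨn, š never derived
```

Run in the order (1)-(2), the grammar needs no rule (3); run unordered or reversed, it does.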
A grammar containing rule (3) obviously lacks the generalization found in a grammar containing only rules (1) and (2) with ordered application. Moreover, it can be proved that a grammar containing rule (3) lacks other generalizations as well. Thus, alongside rules (1) and (2) there is also the following rule:
(15) z → s in the context: ____ + iv

e.g., abuse → abusive.
Now consider such forms as persuade → persuasive → persuasion, corrode → corrosive → corrosion, etc.
In a grammar that does not envisage ordered application of rules, these correspondences must be accounted for by means of the following two new rules, independent of rules (1), (2), (3), and (4):
(16) d → s in the context: ____ + iv
(17) d + [i,y] → ž in the context: ____Vowel
However, if the rules are applied in a definite order, then rules (5) and (6) are superfluous. Generalizing rule (1) to apply to d,t instead of t, we get the following derivation for persuasive:
perswēd+iv
perswēz+iv    (according to (1))
perswēsiv     (according to (4))
and for persuasion:
perswēd+yɨn
perswēz+yɨn    (according to (1))
perswēžɨn      (according to (2))
Such is the reasoning of Chomsky. This example is a vivid illustration of the importance of ordering phonological rules. However, in considering the above example, we come across the following difficulty.
In formulating rule (1), Chomsky obviously simplified it, because it is easy to find exceptions, such as shake → shaky, might → mighty. If it were only a matter of didactic exposition, these exceptions might be disregarded, because, even though rule (1) is simplified, the example is very illustrative. However, careful consideration of the ordered rules used by Chomsky suggests that there must be a cardinal difference between them, a difference that is not recognized in generative phonology.
Comparing rules (1) and (2), we find that they refer to cardinally different processes. Such cases as shake → shaky, might → mighty do not come under rule (1), because it is formulated as if it had to do with the phonological conditions of the transition:

(21) [k,t] → [s]
Actually, this transition occurs under morphophonological rather than phonological conditions. It does not occur in the context of definite phonemes; it occurs in the context of definite suffixes—suffixes of abstract nouns in this case: -ity/-y, -ism.
This formulation is also a simplification, because to formulate rule (1) exactly, it would be necessary to consider the entire class of affixes in whose context the above phonological process takes place. However, the heart of the matter is as follows. Chomsky formulates this rule as if the phonological process occurred under definite phonological conditions, whereas we maintain that it occurs not under phonological but under definite morphological conditions, i.e., in the context of a definite class of affixes.
It is different with rule (2). The transition
(22) [s,z] + [i,y] → [š,ž] in the context: ____Vowel
takes place not under morphological but under phonological conditions, i.e., in the context of the adjacent vowel.
Thus, when formulating phonological rules in generative phonology, two kinds of phonological processes should be distinguished: 1) phonological processes occurring under definite morphological conditions, i.e., in the context of a definite class of affixes; and 2) phonological processes occurring under definite phonological conditions, i.e., in the context of definite distinctive features or phonemes.
Generative phonology does not differentiate between these two kinds of phonological processes. However, such differentiation is of fundamental importance.
Rules pertaining to the first kind of phonological processes will be called morphophonological rules (M-rules), and those pertaining to the second kind of processes will be called phonological rules (P-rules).
Let us now consider how to reformulate the above rules in the light of differentiating between the morphophonological and phonological levels of phonological processes.
To account for the facts given by Chomsky, it is sufficient to have two morphophonological rules and one phonological rule.
The morphophonological rules are
Consistent differentiation between morphophonological and phonological rules is one of the essential aspects of treating phonological processes in synchronic linguistics.5
The above example illustrates a confusion of the phonemic and morphophonemic levels in generative phonology, which is a second consequence of the Ordering Hypothesis. Once rule ordering is eliminated, this confusion becomes impossible.
Finally, as a result of the confusion of synchrony and diachrony, generative phonology does not distinguish between distinctive functions of sounds and their physical properties. In other words, generative phonology confounds not only the phonemic level with the morphophonemic level, but also the phonemic level with the phonetic level.
From a formal point of view, the rules of generative phonology are of an algorithmic type; that is, these rules are instructions for converting a set of initial sequences of symbols into new sequences of symbols in a finite number of steps. These rules Chomsky calls rewriting rules. Rewriting rules, like any other mathematical formalism, can have cognitive value if they are combined with correct empirical hypotheses about the nature of reality. Otherwise, they are no more than a game with symbols.
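What such an algorithmic rule system looks like can be sketched as follows; the particular rules and the input string are invented for illustration. Each rewriting rule converts one sequence of symbols into another, and a derivation is a finite number of such steps, terminating when no rule applies:

```python
# Minimal rewriting system: each rule maps one substring of symbols to
# another; a derivation applies rules one step at a time until none applies.
# The rules and input below are illustrative assumptions only.

def derive(string, rules, max_steps=100):
    """Repeatedly apply the first applicable rule, recording every step."""
    steps = [string]
    for _ in range(max_steps):
        for lhs, rhs in rules:
            if lhs in string:
                string = string.replace(lhs, rhs, 1)  # one rewrite per step
                steps.append(string)
                break
        else:
            break  # no rule applies: the derivation terminates
    return steps

rules = [("k+i", "s+i"), ("s+i", "ši")]
print(derive("elektrik+ity", rules))
```

Nothing in this formalism, taken by itself, distinguishes morphological from phonological conditioning or synchronic from diachronic processes; as the text argues, such distinctions must come from empirical hypotheses added to the formalism, not from the formalism itself.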
Generative phonology seriously distorts linguistic reality. What is the cause of the errors and fallacies of generative phonology? The cause is the fetishistic approach to the notion of the formal rule. Generative phonology aims at constructing a mathematically consistent system of formal rules. But mathematical consistency does not guarantee a correct description of reality. A consistent system of formal rules must be combined with a correct hypothesis about reality; otherwise, the system may be inadequate as a description of reality. Generative phonology presents a good example of the tragic consequences of mathematical fetishism.
A question arises: Why did generative phonology have success in some quarters?
The answer is this: Generative phonology started with a severe critique of phonology, and this critique was justified in many respects. At the same time, a novel approach was emphasized which required the construction of formal systems analogous to formal systems in other sciences. Chomsky is a master of polemic, and his successful polemic against an inferior type of phonology, accompanied by a promise of new linguistic vistas, played a decisive role in the promotion of generative phonology.
There is also another side of the story.
Generative phonology is a theory that is dressed in mathematical garb. For a certain sort of mind, the glamour of mathematical garb has a powerful appeal regardless of the ideas underneath it. But not every mathematical formalism is easy to master. The formalism of generative phonology has the advantage of simplicity. Rewriting rules are a type of formalism that presupposes no knowledge of mathematics and can be readily mastered by anybody who is able to enjoy manipulating symbols.
The price of this simplicity is high: strip the mathematical dress off the body of ideas, and you will see that the naked body lacks substance. Nevertheless, generative phonology is a seductive game with symbols. This game with symbols, like any other game, is an exciting pastime. It can be emotionally very satisfying.
A question arises: Can generative phonology turn from a game with symbols into an activity having significant cognitive value?
Yes, it can, if it gives up the Ordering Hypothesis and if it stops confounding synchrony with diachrony and the phonemic level with the morphophonemic and phonetic levels. But then it will lose all of its attractions and will become ordinary phonology.