In accordance with the distinction of two semiotic strata in a natural language—the system of signs and the system of diacritic elements—linguistics consists of two parts: phonology and grammar. Phonology is the study of the system of diacritic elements, and grammar is the study of the system of signs.
A minimal sign is called a morpheme. For example, the word speaker consists of two morphemes, speak- and -er. The word above consists of one morpheme.
Every combination of morphemes is called a syntagm. It follows from this definition that a word consisting of two or more morphemes is a syntagm; a combination of words is also a syntagm.
Morphemes, words, and groups of words can be called syntagmatic units. Besides syntagmatic units, a language has units of a different kind. If we approach language as an instrument of communication, we discover a new type of units, which can be called functional units.
The basic functional unit is the sentence. The communicative function of language is to transmit messages from the speaker to the hearer—in more general terms, from the sender to the receiver—and messages are transmitted by means of sentences. A word or a group of words does not transmit a message unless it functions as a unit of communication—a sentence.
The next basic functional unit is the predicate.
In terms of the notions sentence and representation, the predicate can be defined as the functional unit that represents a sentence. Here are some examples illustrating this definition. Consider the Russian sentence
|(1)||Ivan kupil knigu.|
|‘John bought the book’.|
We can delete the noun knigu ‘the book’ and get the correct sentence
Or we can delete the noun Ivan ‘John’ and get the correct sentence
|‘Bought the book’.|
Finally, we can delete both nouns, and again we will get a correct sentence:
In all these sentences, the constant element is the verb kupil, and this element is what is left after all the deletions because it represents the complete sentence; the verb kupil is the predicate of these sentences. True, if we take the English equivalent of the Russian sentence, John bought the book, we can delete only the book, but we cannot delete John; neither bought the book nor bought is a sentence in English. But that is because the rules of English grammar restrict the deletion of subjects. Still, in those cases when the deletion of the subject is possible, it is the verb that serves to represent the whole sentence. Compare the imperative sentence
|(5)||You come here, Jack, and you go over there, Mary.|
and its equivalent with the deleted subjects
|(6)||Come here, Jack, and go over there, Mary!|
We can delete all the other words and get finally
|(7)||Come and go!|
which syntactically represents the whole sentence. Since the predicate represents the sentence, it can be identical with the sentence in those cases when the sentence is complete without any other functional units. For instance, the Russian sentence
|‘It is warm’.|
has only a predicate teplo, which is identical with the whole sentence. Similarly, the English sentence
has only a predicate fire, which is identical with the whole sentence.
The next functional units are terms. The terms are the complements of the predicate: they denote the participants of a situation denoted by the predicate. For example, in the sentence
the predicate slept denotes the situation of sleeping, and the term John denotes a participant of this situation—the subject of the sentence. In
|(11)||John gave money to Bill.|
the predicate gave denotes the situation of giving, and the terms John, money, and Bill denote the participants of the situation—the subject John, the direct object money, and the indirect object Bill. As sentence (8) shows, a sentence can have no terms. But although a sentence can be without terms, every sentence must have a predicate.
Besides predicates and terms, a sentence may contain predicate modifiers and term modifiers. In languages that have adjectives and adverbs, adjectives serve as modifiers of terms, and adverbs usually serve as modifiers of predicates. For example, in the English sentence
|(12)||The red car was moving slowly.|
the adjective red is the modifier of the term car, and the adverb slowly is the modifier of the predicate was moving.
There are other functional units, but I will not discuss them here.
Functional units and syntagmatic units clearly belong to different levels of language. There is no one-one correspondence between them. Functional units can be realized not only by single words but by other syntagmatic units, as well:
1) A functional unit can be realized by a group of words:
|(13)||The weight of the snow on the roof caused the shed to collapse.|
In this sentence the first term is realized by the group the weight of the snow on the roof, and the predicate by the group caused to collapse.
2) A functional unit can be realized by a morpheme that constitutes a component of an incorporating complex. So, in Chukchee a lexical morpheme may be incorporated into a verb stem. This morpheme must be viewed as an optional affix that has the function of a term denoting an object. For example, such is the Chukchee lexical morpheme kora ‘deer’, incorporated in the verb kora-janratgat ‘separated deer’ in the sentence
|‘The hosts separated the deer’.|
3) A functional unit is realized by a morpheme that is a component of a verbal form. There is no incorporation here, because morphemes of this type are mandatory components of verbal forms. For example, in Georgian the conjugation of transitive verbs involves the realization of a term denoting a subject and a term denoting an object:
|(15)||mo-kl-a ‘he killed him’|
|mo-kal-i ‘you killed him’|
|mo-v-kal-i ‘I killed him’|
The realization of functional units by morphemes of incorporating complexes and morphemes constituting mandatory components of verbal forms is common in Paleoasiatic, Caucasian, Semitic, and Bantu languages.
Functional units and the morphological units that realize them constitute different levels of linguistic structure, which should strictly be distinguished from each other. In accordance with this distinction, we must distinguish two types of syntactic connections: 1) syntactic connections between functional units and 2) syntactic connections between units that realize functional units, that is, syntactic connections between syntagmatic units. In current linguistic literature, these two different types of connections are very often confounded with each other. Consider, for instance, such very common statements as: 1) the predicate agrees with the subject; 2) the predicate governs the object. In the light of the foregoing, these statements are clearly incorrect. Predicate, subject, and object are functional units, while agreement and government are formal connections between syntagmatic units. Therefore, the correct statements will be, on the one hand: 1) the predicate has a subject relation with the term, 2) the predicate has an object relation with the term; and, on the other hand: 1) the verb agrees with the noun, 2) the verb governs the noun.1
We can think of the syntactic structure of a sentence as something independent of the way it is represented in terms of syntagmatic units. In this way we come up with two levels of grammar, which I call genotype grammar and phenotype grammar. Genotype grammar comprises functional units—predicates, terms, modifiers—and abstract operator-operand relations between these units. Phenotype grammar comprises syntagmatic units—morphemes and words—and connections between them in terms of linear order and their morphological properties. Such notions as agreement and government clearly belong in phenotype grammar.
The rules of genotype grammar are invariant with respect to various possibilities of their realization by phenotype grammar. The terms genotype and phenotype are borrowed from biology, where genotype means a definite set of genes that is invariant with respect to its different manifestations called phenotype.
The distinction between the genotype and phenotype levels is of paramount importance for linguistic theory, because this distinction puts us on the road to the solution of the basic question of linguistic theory formulated above: What factors contribute to the similarities and differences between natural languages?
In order to see the significance of genotype grammar for formulating universal rules that are invariant with respect to their different manifestations, consider passivization rules.
Take an active-passive pair in English:
|(16)||a.||Columbus discovered America.|
|b.||America was discovered by Columbus.|
In generative-transformational grammar, the relation between active sentences and passive sentences is treated by means of passive transformations. There are quite a few proposals as to how passive transformations should be stated. These proposals differ greatly in detail. But despite great differences in detail, advocates of generative-transformational grammar agree that: 1) passive in English applies to strings in which a noun phrase, a verb, and a noun phrase occur in this order:
and 2) passivization involves postposing of the preverbal noun phrase and preposing of the postverbal noun phrase:
Passive transformations clearly are rules stated in terms of linear order of words. One may wonder whether universal rules of passivization can be stated in terms of passive transformations. As a matter of fact, that cannot be done, for the following reasons.
For one thing, since different languages have different word order, a statement of passivization in terms of the transformational approach will require a distinct rule for each language where the order of the relevant words is different. Take, for example, the following active-passive pairs in Malagasy:
|(19)||a.||Nahita ny vehivavy ny mpianatra.|
saw the woman the student
‘The student saw the woman’.
|b.||Nohitan’ny mpianatra ny vehivavy.|
seen (passive) the student the woman
‘The woman was seen by the student’.
In Malagasy the verb is in the initial position, and the subject is normally in the final position:
In this formula of the active sentence, the noun phrase 1 is the subject and the noun phrase 2 is the object. Passivization involves the reversal of the order of the noun phrases:
In terms of linear order of words, the rule of passivization in Malagasy is quite distinct from the rule of passivization in English.
There are languages where word order is irrelevant for the statement of the rule of passivization. Consider the following active-passive pair in Russian:
|(22)||a.||Kolumb otkryl Ameriku.|
Columbus (subject) discovered America (object)
‘Columbus discovered America’.
|b.||Kolumb-om byla otkryta Amerika.|
Columbus-by (instr.) was discovered America (subject)
‘America was discovered by Columbus’.
In Russian, active and passive sentences can have the same word order, because word order is irrelevant for passivization. Here passivization involves, besides passive marks on the verb, the change of case markings: the noun that is in the accusative case in the active sentence is in the nominative case in the passive sentence; the noun that is in the nominative case in the active sentence is in the instrumental case in the passive sentence.
It is clear that the universal rules cannot be stated in terms of operations on strings of words, that is, in terms of syntagmatic units. Word order, case markings, and verbal morphology are language-specific, and therefore they are irrelevant for the characterization of a language-independent universal rule of passivization.
In order to state a universal rule of passivization, we have to use the notions of the genotype level. On this level we face functional units rather than words and combinations of words. For our immediate purpose, we must treat the basic functional unit, the predicate, as a binary relation whose terms denote an agent and a patient. The rule of passivization must be stated as the operation of conversion on the binary relation. Conversion is an operation with the help of which, from a relation R, we form a new relation called the converse of R and denoted by Ř. The relation Ř holds between X and Y if, and only if, R holds between Y and X. The converse of the relation >, for example, is the relation <, since for any X and Y the formulae
|(23)||X < Y||and||Y > X|
are equivalent.
The passive predicate is the converse of the active predicate. The conversion involves the exchange of positions between the term denoting an agent and the term denoting a patient. In the active construction, the primary term denotes an agent, and the secondary term denotes a patient. In the passive construction, the secondary term becomes the primary term, and the primary term becomes the secondary term. It should be noted that position, as a relational notion, is independent of the linear order of words. So, in the above example from Russian (22), although the primary term and the secondary term have exchanged their positions in the passive construction, their linear order has remained intact.
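The operation of conversion can be illustrated with a minimal sketch. It assumes, purely for illustration, that a binary relation is modeled as a set of ordered pairs; the names `converse`, `discovered`, and `was_discovered_by` are hypothetical and are not part of the theory's notation.

```python
def converse(relation):
    """Return the converse of a binary relation given as a set of (x, y) pairs.

    The converse holds between x and y exactly when the original relation
    holds between y and x.
    """
    return {(y, x) for (x, y) in relation}


# Assumed toy model: each pair is (primary term, secondary term).
discovered = {("Columbus", "America")}
was_discovered_by = converse(discovered)

# The arithmetic example from the text: the converse of > is <.
numbers = range(4)
greater_than = {(x, y) for x in numbers for y in numbers if x > y}
less_than = {(x, y) for x in numbers for y in numbers if x < y}
```

In this sketch the pair in `was_discovered_by` has America in the primary position and Columbus in the secondary one, and nothing in the model refers to word order, case marking, or verbal morphology.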
The statement of the rule of passivization as the conversion of active predicates is language-independent and therefore is universal. This rule of genotype grammar predicts the phenomena on the phenotype level of different languages. If the secondary term of an active sentence is the primary term of the corresponding passive, then it should stand in the same position in the passive sentence as do the primary terms of active sentences in languages where word order is not free. And if the primary term of an active sentence is the secondary term of the corresponding passive, then it should stand in the same position in the passive sentence as do the secondary terms of active sentences in languages where word order is not free. This prediction is confirmed by the facts of various languages of the world where word order is not free. In the above examples from English and Malagasy (16, 19), the secondary terms of the active sentences stand in the same positions in the passive sentences as do the primary terms of the active sentences, and the primary terms of the active sentences stand in the same positions in the passive sentences as do the secondary terms of the active sentences.
Another prediction concerns the morphological marker of the converse relation. Since conversion is an operation on the predicate, the passive predicate must be characterized by some morphological marker, while the morphological markers on the terms of the predicate are optional. This prediction is confirmed by the languages of the world. No matter how different are the morphological markers of the verbs that serve as manifestations of passive predicates, one thing remains constant: the mandatory use of some morphological marker of the passive predicate.
The foregoing shows the crucial significance of the distinction between the genotype and phenotype levels. Generative-transformational grammar ignores the genotype level and states the rules of grammar in terms of the phenotype level. As a result, generative-transformational grammar fails to provide cross-linguistically viable functional notions for the study of language universals.
In the foregoing section I gave tentative definitions of some functional units of language. Now we are ready to study these notions in a more systematic way.
Any speech act involves a communication and three basic items connected with it—the sender, the receiver, and the external situation. According to whether the communication is oriented towards one of these three items, it can have respectively an expressive, vocative, or representational function.
I leave the expressive and vocative functions for the present and focus on the representational function of the communication. With respect to this function, the following postulate can be advanced:
If we abstract from everything in the language used that is irrelevant to the representational function of the communication, we have to recognize as essential only three classes of linguistic expressions:
a) the names of objects;
b) the names of situations;
c) the means for constructing the names of objects and the names of situations.
I call this postulate the Principle of Representational Relevance.
Names of objects are called terms. For example, the following expressions are used as terms in English: a car, a gray car, a small gray car, a small gray car he bought yesterday. Terms should not be confused with nouns. A noun is a morphological concept, whereas a term is a functional concept. Languages without word classes lack nouns but still have terms.
Sentences serve as the names of situations.
The means for constructing the names of objects and the names of situations are expressions called operators.
An operator is any kind of linguistic device that acts on one or more expressions called its operands to form an expression called its resultant. For example, in the English expression
|(1)||The hunter killed the bear.|
the word killed is an operator that acts on its operands the hunter and the bear; in gray car the expression gray is an operator that acts on its operand car. If an operator has one operand, it is called a one-place operator; if an operator has n operands, it is called an n-place operator.
It is important to notice that in accordance with the definition of the operator as a linguistic device, instances of an operator do not have to be only concrete expressions, such as words or morphemes. For instance, a predicate may be represented by intonation. So, in the following verse from a poem by the Russian poet A. Blok:
|(2)||Noč’. Ulica. Fonar’. Apteka.|
|‘Night. Street. Lantern. Pharmacy’.|
we have four sentences. In each of these sentences the intonation serves as an operator that acts on a term to form a sentence.
Another example of an operator that is not a concrete expression is truncation. For instance, bel ‘is white’ in the Russian sentence
|‘The snow is white’.|
is the resultant of truncation of the suffix -yj in the word bel-yj ‘white’. Here truncation serves as an operator that acts on the adjective bel-yj ‘white’ to form the predicate bel ‘is white’.
Let us focus on the operation of the combination of the operator with its operands. According to the definition of this operation in ordinary logic, an n-place operator combines with its operands in one step. This definition treats all operands as if they had an equally close connection with their operator. But usually an operator is more closely connected with one operand than another. For example, a transitive verb is more closely connected with the secondary term than with the primary term. Thus, in the above example (1), the transitive predicate killed is more closely connected with the bear than with the hunter.
Why are killed and the bear more closely connected than killed and the hunter? Because a combination of a transitive predicate with a direct object is equivalent to an intransitive predicate. That is why in some languages this combination can be replaced by an intransitive predicate. For example, in Russian lovit’ rybu ‘to catch fish’ can be replaced by the intransitive verb rybačit’ with the same meaning. And, vice versa, an intransitive verb can be replaced by a transitive verb with a direct object. For example, to dine may be replaced by to eat dinner. There is also other evidence of a close connection between a transitive verb and its direct object. Thus, nouns derived from intransitive verbs are oriented towards the subjects of the action (genetivus subjectivus), while nouns derived from transitive verbs tend to be oriented towards the objects of the action (genetivus objectivus). Compare the dog barks: the barking of the dog versus they abducted the woman: the abduction of the woman. The ambiguity of expressions such as the shooting of the hunters must be explained by the fact that although the verb to shoot is transitive, it can also be used as an intransitive verb; we can say the hunters shoot without specifying the object. Compare: the boy reads the book: the boy reads. The orientation of nouns derived from transitive verbs towards the object of the action is a universal tendency observed in typologically very different language groups.
To do justice to this phenomenon, we must redefine the combination of an n-place operator with its operands as a series of binary operations: an n-place operator is applied to its first operand, then the resultant to the second operand, and so on. According to the new definition, an n-place operator combines with its operands in n steps, rather than in one step as in ordinary logic. For example, any transitive predicate, which is a two-place operator, must be applied to the secondary term, then the resultant to the primary term. Thus, in the expression (1), the transitive predicate killed must be applied first to the bear, then to the hunter:
|(4)||((killed the bear) the hunter)|
The new binary operation is called application.
The above informal explanation of the notion of application must now be presented as a formal statement called the Applicative Principle:
An n-place operator can always be represented as a one-place operator that yields an (n-1)-place operator as its resultant.2
Examples of representing an n-place operator as a one-place operator: The two-place operator killed is represented as a one-place operator that is applied to its operand the bear and yields the resultant (killed the bear). The resultant is a (2-1)-place operator, that is, a one-place operator, which is applied to another term, the hunter, and yields the resultant ((killed the bear) the hunter). The new resultant is a (1-1)-place operator, that is, a zero-place operator, which is a sentence. This sentence is an abstract representation of the sentence
The hunter killed the bear.
The three-place operator gave is represented as a one-place operator that is applied to its operand Mary and yields the resultant (gave Mary). The resultant is a (3-1)-place operator, that is, a two-place operator. This two-place operator is represented, in its turn, as a one-place operator that is applied to another term, money, and yields the resultant ((gave Mary) money), which is a (2-1)-place operator, that is, a one-place operator. The latter is applied to the term John and yields a (1-1)-place operator (((gave Mary) money) John), which is an abstract representation of the sentence John gave Mary money.
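The step-by-step application described in these examples can be sketched in code. The sketch assumes that expressions are modeled as nested pairs `(operator, operand)` and that an operator with places still to fill is modeled as a function of one argument; `curry_operator` is a hypothetical helper name introduced only for this illustration.

```python
def curry_operator(name, places):
    """Represent an n-place operator as a chain of one-place operators.

    Each application consumes one operand. The resultant is an operator with
    one place fewer, or the finished expression once no places remain.
    """
    def build(expression, remaining):
        if remaining == 0:
            return expression  # zero-place operator: a complete sentence
        return lambda operand: build((expression, operand), remaining - 1)

    return build(name, places)


killed = curry_operator("killed", 2)
killed_the_bear = killed("the bear")        # a one-place operator
sentence_1 = killed_the_bear("the hunter")  # ((killed the bear) the hunter)

gave = curry_operator("gave", 3)
sentence_2 = gave("Mary")("money")("John")  # (((gave Mary) money) John)
```

Each intermediate resultant, such as `killed_the_bear`, is itself an operator that can be applied further, which is the content of the Applicative Principle as stated above.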
On the basis of the Applicative Principle, I define the formal concepts one-place predicate, two-place predicate, and three-place predicate and the formal concepts primary term, secondary term, and tertiary term.
DEFINITION 1. If X is an operator that acts on a term Y to form a sentence Z, then X is a one-place predicate and Y is a primary term.
DEFINITION 2. If X is an operator that acts on a term Y to form a one-place predicate Z, then X is a two-place predicate and Y is a secondary term.
DEFINITION 3. If X is an operator that acts on a term Y to form a two-place predicate Z, then X is a three-place predicate and Y is a tertiary term.
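Definitions 1 to 3 are recursive: each predicate type is defined by what it yields when it acts on a term. A minimal sketch follows, under the assumption that a category is either the label `"term"`, the label `"sentence"`, or a pair `(operand category, resultant category)`. These encodings and the names `predicate_places` and `TERM_NAMES` are illustrative assumptions, not notation from the text.

```python
TERM, SENTENCE = "term", "sentence"

# Assumed encoding: an operator category is a pair (operand, resultant).
ONE_PLACE = (TERM, SENTENCE)      # Definition 1
TWO_PLACE = (TERM, ONE_PLACE)     # Definition 2
THREE_PLACE = (TERM, TWO_PLACE)   # Definition 3

# The term an n-place predicate acts on first, by Definitions 1-3.
TERM_NAMES = {1: "primary", 2: "secondary", 3: "tertiary"}


def predicate_places(category):
    """Count the terms an operator must act on before a sentence results.

    Returns None when the category is not a predicate in this sense.
    """
    places = 0
    while isinstance(category, tuple):
        operand, resultant = category
        if operand != TERM:
            return None
        places += 1
        category = resultant
    return places if category == SENTENCE and places > 0 else None
```

For instance, `TERM_NAMES[predicate_places(TWO_PLACE)]` gives `"secondary"`, matching Definition 2, while an operator from terms to terms is not counted as a predicate at all.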
The opposition of a primary and a secondary term constitutes the nucleus of a sentence. These terms I call nuclear.
An applicative tree (henceforth AT) is a network of operators and operands combined by application. The sentence He knocked down his enemy can be presented by the following applicative tree:
In an AT, operators are represented by double lines, and operands are represented by single lines. An AT presents the relation operator: operand independently of the linear word order, as can be seen from the following example:
ATs (7) and (6) are equivalent from the relational point of view.
Any AT can be replaced by an equivalent linear formula with brackets. In the linear notation, by a convention, an operator must precede its operand, and both are put inside brackets.
|(8)||(((DOWN KNOCKED) (HIS ENEMY)) HE)|
|(9)||(UNFORTUNATELY ((SOUNDLY SLEPT) JOHN))|
Formula (8) replaces AT (5). Formula (9) replaces ATs (6) and (7), since it is invariant under the changes of word order.
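The replacement of a tree by a linear formula can be sketched as follows. The sketch assumes that a tree is stored as nested pairs `(operator, operand)` with words at the leaves; the function name `linear_formula` is hypothetical. Because such a structure records only the operator-operand relation and no linear positions, the same formula results whatever the surface word order of the sentence.

```python
def linear_formula(node):
    """Render a nested (operator, operand) pair as a bracketed formula.

    By the convention in the text, the operator precedes its operand and
    both are enclosed in brackets.
    """
    if isinstance(node, str):
        return node.upper()
    operator, operand = node
    return "(" + linear_formula(operator) + " " + linear_formula(operand) + ")"


# Assumed encodings of the structures behind formulae (8) and (9).
knocked_down = ((("down", "knocked"), ("his", "enemy")), "he")
slept_soundly = ("unfortunately", (("soundly", "slept"), "John"))

formula_8 = linear_formula(knocked_down)
formula_9 = linear_formula(slept_soundly)
```

Under these assumptions `formula_8` and `formula_9` reproduce the bracketed strings given in (8) and (9).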
In a linear formula, the brackets can be left out in accordance with the principle of leftward grouping. Applying this convention to the above linear formulae, we get
|(10)||((DOWN KNOCKED) (HIS ENEMY)) HE|
|(11)||UNFORTUNATELY ((SOUNDLY SLEPT) JOHN)|
Applicative structure represented by an applicative tree has two facets: part-whole relations called constituency relations and dependency relations. Operators and operands are interconnected by constituency and dependency relations.
An operator and its operand are in part-whole relation to the resultant of the operator. Therefore, a network of operators and operands presented by an applicative tree is at the same time a network of part-whole relations.
On the other hand, a network of operators and operands is also a network of heads and their dependents, because either the operator is the head and its operand is dependent, or, vice versa, the operator is dependent and its operand is the head.
I will first consider constituency.
Constituency is a part-whole relation that is defined in two steps. We first define immediate constituents and then give a definition of constituents based on the definition of immediate constituents:
DEFINITION OF IMMEDIATE CONSTITUENTS:
If expression A is an operator, expression B is its operand, and expression C is the resultant of the application of A to B, then expressions A and B are immediate constituents of expression C.
Examples: In AT (5) in the preceding section the operator down and its operand knocked are immediate constituents of the resultant (down knocked); the operator his and its operand enemy are immediate constituents of the resultant (his enemy); the resultant (down knocked) is in its turn the operator of the resultant (his enemy), and both of these expressions are immediate constituents of the resultant ((down knocked) (his enemy)), which in its turn is the operator of he; ((down knocked) (his enemy)) and he are immediate constituents of the sentence (((down knocked) (his enemy)) he).
In AT (6) and AT (7) in the preceding section, soundly and slept are immediate constituents of (soundly slept); (soundly slept) and John are immediate constituents of ((soundly slept) John); unfortunately and ((soundly slept) John) are immediate constituents of (unfortunately ((soundly slept) John)). (In representing immediate constituents, we place, by a convention, an operator before its operand.)
DEFINITION OF CONSTITUENTS:
If there exists a sequence of expressions $x_1, x_2, \ldots, x_n$ such that $x_i$ is an immediate constituent of $x_{i+1}$ (for $i = 1, 2, \ldots, n-1$), then $x_1$ is a constituent of $x_n$.
Examples: Every word W in any sentence S is a constituent of S, because, for any sentence S, there exists a sequence of expressions $x_1, x_2, \ldots, x_n$ such that $x_1$ is the word W and $x_i$ is an immediate constituent of $x_{i+1}$ (for $i = 1, \ldots, n-1$). Thus, in the sentence presented in AT (5) in the preceding section, the word knocked is a constituent of the sentence He knocked down his enemy, because there exists a sequence of expressions knocked, (down knocked), ((down knocked) (his enemy)), (((down knocked) (his enemy)) he), and every member of this sequence is an immediate constituent of the member that follows it. And every member of this sequence is also a constituent of the sentence He knocked down his enemy, because it is at the same time a member of the subsequence that satisfies the above definition of the constituent. For instance, (down knocked) is a constituent of the sentence in question because (down knocked) is a member of a subsequence (down knocked), ((down knocked) (his enemy)), (((down knocked) (his enemy)) he), and every member of this subsequence is an immediate constituent of the member that follows it.
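The two-step definition can be sketched as a pair of functions, assuming once more that expressions are modeled as nested pairs `(operator, operand)` with words at the leaves. The function names are hypothetical. Constituency is obtained by following chains of immediate-constituent steps, with no reference to linear order.

```python
def immediate_constituents(expression):
    """Return the operator and operand of a resultant; a word has none."""
    return list(expression) if isinstance(expression, tuple) else []


def constituents(expression):
    """Collect every expression linked to `expression` by a chain of
    immediate-constituent steps."""
    found = []
    for part in immediate_constituents(expression):
        found.append(part)
        found.extend(constituents(part))
    return found


# Assumed encoding of (((down knocked) (his enemy)) he).
sentence = ((("down", "knocked"), ("his", "enemy")), "he")
parts = constituents(sentence)
```

Here `parts` contains both the word `"knocked"` and the group `("down", "knocked")`, in line with the examples above.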
Note that I have defined immediate constituents and constituents independently of linear word order. In current linguistic literature, and in particular in generative-transformational grammar, the definition of immediate constituents includes the requirement that immediate constituents must be adjacent elements in a linear string. While in genotype grammar constituency is viewed as independent of linear word order, generative-transformational grammar confounds constituency with linear word order. That leads to serious difficulties, which will be discussed below.3
It is possible to give a precise definition of the dependency relation on the basis of the part-whole relation between an operator, its operand, and the resultant of the operator.
DEFINITION OF DEPENDENCY:
Let expression C be a resultant of operator A applied to operand B. Either A is the head and B is its dependent, or B is the head and A is its dependent: if expression C belongs in the same category as operand B, then B is the head and A is its dependent; if expression C belongs in a different category from that of B, then A is the head and B is its dependent.
If operand B is the head and operator A is its dependent, then A is called the modifier of the head.
If operator A is the head and operand B is its dependent, then B is called the complement of the head.
For example, in the sentence
|(1)||Bill bought new books.|
the operator new is the modifier, and the operand books is its head, because the resultant (new books) belongs in the same category as books (both are terms); the operator bought is the head, and the operand (new books) is its complement, because the resultant (bought (new books)) belongs in a different category from that of (new books): (bought (new books)) is a predicate, but (new books) is a term; finally, the operator (bought (new books)) is the head, and the operand Bill is its complement, because the resultant ((bought (new books)) Bill) belongs in a different category from that of Bill: the former is a sentence, and the latter is a term.
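The definition of dependency amounts to a comparison of categories, which can be sketched as follows. The sketch assumes that an operator's category is encoded as a pair `(operand category, resultant category)`; the labels and the function `apply_operator` are illustrative assumptions rather than notation from the text.

```python
TERM, SENTENCE = "term", "sentence"

# Assumed category encodings: (operand category, resultant category).
TERM_MODIFIER = (TERM, TERM)    # e.g. new
ONE_PLACE = (TERM, SENTENCE)    # e.g. (bought (new books))
TWO_PLACE = (TERM, ONE_PLACE)   # e.g. bought


def apply_operator(operator_category, operand_category):
    """Return (resultant category, which part is the head, kind of dependent).

    If the resultant has the same category as the operand, the operand is
    the head and the operator is its modifier. Otherwise the operator is
    the head and the operand is its complement.
    """
    expected, resultant = operator_category
    if expected != operand_category:
        raise ValueError("operator cannot act on an operand of this category")
    if resultant == operand_category:
        return resultant, "operand", "modifier"
    return resultant, "operator", "complement"


# Bill bought new books, analyzed as ((bought (new books)) Bill).
new_books = apply_operator(TERM_MODIFIER, TERM)
bought_new_books = apply_operator(TWO_PLACE, new_books[0])
whole_sentence = apply_operator(bought_new_books[0], TERM)
```

The three results mirror the analysis given above: new is a modifier of books, (new books) is a complement of bought, and Bill is a complement of (bought (new books)).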
The concepts ‘head’ and ‘dependent’ given in the above definitions are more general than those given in current linguistic literature. In so-called dependency grammar, dependency relations are defined only for the smallest constituents of a sentence, that is, for words. For example, the sentence
|(2)||A little boy looked intently at the picture.|
will be represented in dependency grammar by the following tree:
Dependency grammar is unable to represent dependency relations between functional units, while genotype grammar does represent these relations. So the above sentence will be represented in genotype grammar by the following dependency tree:
Starting from the notion of applicative structure, we are able to give a rigorous definition of the dependency relation presented above. Dependency is not a primitive concept: it must be defined. But dependency grammar is not able to give a satisfactory definition of this concept.
Richard A. Hudson in his recent paper gives a tentative solution to the problem of defining dependency (Hudson, 1980: 188-91). Using the term modifier as a synonym of the term dependent, he defines heads and modifiers in terms of the concepts ‘frame’ and ‘slot’. According to Hudson, any filler of a slot in a frame is a modifier. But the concepts ‘frame’ and ‘slot’ cannot be taken as primitive, either; they also have to be defined. Rather than give a definition of these concepts, Hudson gives a list of heads and modifiers.
No list of heads and modifiers can replace a definition of these concepts.
Taking the notion of applicative structure as a starting point, we are able to solve the problem of defining dependency. We give a rigorous definition of heads and dependents and draw an important distinction between two kinds of dependents: modifiers and complements.
Contemporary linguistic theory recognizes that there are two models of representation of the syntactic structure of a sentence: constituency representation and dependency representation. Generative-transformational grammar uses constituency representation, while some other grammatical theories favor dependency representation.
In current linguistic literature, controversy about the superiority of one type of syntactic representation over another approaches the intensity of a civil war. But as a matter of fact, there is no likelihood of forming a consistent description of grammatical structure using only one of the two possible models of representation. It seems that one must sometimes use constituency representation and sometimes dependency representation, while at times either will do. The situation is rather like that in physics, where the phenomena of light are explained by two theories that complement each other: the wave theory of light and the quantum theory of light. Separately, neither of them fully explains the phenomena of light, but together they do.
We are faced with a fundamental problem: Is it possible to combine constituency representation and dependency representation to form an integrated representation of syntactic structure? And if it is possible, will the integrated representation of syntactic structure lead to new significant insights into linguistic reality?
My answer to both questions is affirmative. It is possible to form an integrated representation of syntactic structure that will lead to new significant insights into linguistic reality. In the foregoing sections it was shown how that can be done. Starting from the notion of applicative structure, I defined constituency and dependency as complementary notions that are reduced to the relations between an operator, its operand, and the resultant of an operator.
With the integration of the constituency and dependency models into the applicative model, the controversy over the superiority of one type of model over the other must come to an end. As a matter of fact, we cannot dispense with either model; both of them are necessary as complementary pictures of linguistic reality.
As was shown above, the level of functional units is universal, while the level of morphological units that realize the functional units is language-specific. The level of functional units I call the genotype level, and the level of syntagmatic units that realize functional units I call the phenotype level. Functional units are irreducible to any other type of units; they are the ultimate units of the genotype level.
Neither words nor combinations of words belong to syntax proper. As a matter of fact, these are morphological units that realize functional units—the true units of syntax.
In order to set out the crucial significance of functional units as true units of syntax, I introduce the term syntaxeme. I suggest calling the functional unit the syntaxeme.
The construction of a sentence starts with a predicate frame. By a predicate frame I mean a combination of a predicate with an appropriate number of terms functioning as operands of the predicate.
The minimal sentence consists of two syntaxemes, one of which is the predicate, which normally designates a state of affairs or an event, while the other is the primary term, which refers to a participant whose role, whether active or passive, is emphasized by its choice as the primary term. The primary term contrasts with other terms as a pivot, that is, as a term having a privileged position in a sentence. In current linguistic literature, the primary term is called the subject. I will show below that the term subject is inappropriate, because it refers also to other notions that are incompatible with the notion of the pivot.
The primary term may be represented by a pronominal morpheme such as I in I love or by a morpheme that is a component of a verbal form, as in Latin amo ‘I love’. The primary term may be represented by a word or a group of words, as in John left or Poor John, whom I met yesterday, is very sick. The primary term may be represented by a combination of a word and a pronominal morpheme or by a combination of a word and a morpheme that is a component of a verbal form, as in French l’homme il marche ‘The man is marching’ or in Latin Caesar venit ‘Caesar came’. Semantically, the primary term may denote an agent, as in John walks slowly; a patient, as in John was arrested; or a beneficiary, as in John was given the books. It can also have some other meanings, which I will not discuss here.
According to the language concerned, the primary term may either have a case marker, such as nominative in Latin or Russian, or be marked by a position with respect to the predicate. For example, in English and French the primary term precedes the predicate, as in
|(1)||a.||Tom beats Dick.|
|b.||Dick beats Tom.|
In (1a) Tom is the primary term and Dick is the secondary term; but in (1b), vice versa, Dick is the primary term and Tom is the secondary term.
The corresponding Russian sentences will be
|(2)||a.||Tom b’et Dik-a.|
|b.||Tom-a b’et Dik.|
As these sentences show, the position of the primary term with respect to the predicate is irrelevant in Russian: in (2a) the predicate b’et is preceded by the primary term Tom; in (2b), however, the predicate b’et is preceded by the secondary term Tom-a. What matters is not the position but case markers: in both (2a) and (2b) the primary term is in the nominative case, which has a zero case marker, and the secondary term is in the accusative case, which has the case marker -a.
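The two marking strategies just described can be sketched in Python. This is only a toy illustration under stated assumptions: the token representation and the function names `primary_by_position` and `primary_by_case` are hypothetical, and the suffix `-a` stands in for the accusative marker of examples (2a) and (2b) only.

```python
# Hypothetical toy representation: a sentence is a list of (form, role) pairs,
# where role is either "term" or "predicate". All names are illustrative.

def primary_by_position(tokens):
    """Positional marking (as in English): the primary term is the term
    that precedes the predicate."""
    for form, role in tokens:
        if role == "predicate":
            break  # no term was found before the predicate
        if role == "term":
            return form
    return None


def primary_by_case(tokens, accusative_suffix="-a"):
    """Case marking (toy Russian): the primary term is the term with the zero
    nominative marker, i.e. the term lacking the accusative suffix."""
    for form, role in tokens:
        if role == "term" and not form.endswith(accusative_suffix):
            return form
    return None


english_1a = [("Tom", "term"), ("beats", "predicate"), ("Dick", "term")]
english_1b = [("Dick", "term"), ("beats", "predicate"), ("Tom", "term")]
russian_2a = [("Tom", "term"), ("b'et", "predicate"), ("Dik-a", "term")]
russian_2b = [("Tom-a", "term"), ("b'et", "predicate"), ("Dik", "term")]
```

With these definitions, `primary_by_position` picks Tom in (1a) and Dick in (1b), while `primary_by_case` picks Tom in (2a) and Dik in (2b) regardless of word order.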
An important property of the primary term with respect to the secondary term and other terms is its indispensability. The secondary term or the tertiary term can be eliminated from a sentence, which will still remain a complete sentence. But that is normally not true of the primary term. For instance,
|(3)||a.||Peter sells fruit (for a living).|
|b.||Peter sells (for a living).|
|c.||* Sells fruit (for a living).|
After we eliminated the secondary term fruit from (3a), we got sentence (3b), which is normal. But the elimination of the primary term Peter has led to an inadmissible sentence (3c).
The predicate frame predicate: primary term can be reduced to a structure consisting of only one unit, the predicate. In this case the predicate becomes syntactically equivalent to the whole sentence that it represents. In many languages, such as Russian or Latin, sentences with a predicate and a zero term are quite normal. Compare Latin pluit ‘it rains’ or ningit ‘it snows’, Russian žarko ‘it is hot’.
According to a fairly widespread view, sentences with terms are the result of the expansion of basic sentences consisting solely of a predicate; so, sentences without terms are regarded as basic, and sentences with terms are regarded as derived from the sentences containing predicates only. But, as a matter of fact, the contrary is true: the structure of the sentences containing predicates only is a reduced structure based on the full structure predicate: primary term. This claim is based on the Principle of Maximum Differentiation discussed below: a complete sentence is a structure that permits the maximal differentiation of the members of the predicate frame. The opposition of the predicate and the primary term is eliminated in the sentence consisting of the predicate only. Still, according to its syntactic role, the predicate represents the complete sentence. A predicate can represent a complete sentence, but the reverse is not true: a complete sentence cannot be regarded as the representation of a predicate.
According to the above definition of the two-place and three-place predicates and the secondary and tertiary terms, a combination of the two-place predicate with a term is syntactically equivalent to the one-place predicate, and a combination of the three-place predicate with a term is syntactically equivalent to the two-place predicate. An obvious consequence of these definitions is that predicates are more closely connected with tertiary and secondary terms than with the primary term. The construction of a sentence containing more than one term goes on by incorporation of terms into predicates.
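The incorporation of terms into predicates can be pictured as curried function application. The following Python sketch is an illustration only; the helper `two_place` and the string output are assumptions made for the example, not part of the formalism.

```python
def two_place(verb):
    """Build a curried two-place predicate. It takes the secondary term first
    and returns a one-place predicate that still awaits the primary term."""
    def with_secondary(secondary):
        def with_primary(primary):
            # The sentence is produced only once both terms are supplied.
            return f"{primary} {verb} {secondary}"
        return with_primary
    return with_secondary


beats = two_place("beats")       # two-place predicate
beats_dick = beats("Dick")       # predicate + secondary term: a one-place predicate
sentence = beats_dick("Tom")     # one-place predicate + primary term: a sentence
```

Here `beats_dick` behaves exactly like a one-place predicate, which mirrors the claim that the secondary term is bound to the predicate more closely than the primary term.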
The closer connection between predicates and secondary and tertiary terms is manifested by the contraction of these predicates and their terms into simple predicates. Compare
|(4)||form a circle round → encircle|
|give courage → encourage|
|he watches weight → he is a weightwatcher|
|he held her in a warm embrace → he warmly embraced her|
Predicates and terms can be expanded by various modifiers; we must distinguish modifiers of terms and modifiers of predicates.
An elementary term is normally represented by a noun. A modifier of a term is normally represented by an adjective but can also be represented by words that belong to other lexical classes. As was said above, the primary function of a verb is to be a predicate, but one of its secondary functions is to replace an adjective whose primary function is to be a modifier of a term. The primary function of a noun is to be a term, but one of its secondary functions is to serve as a modifier of a term. The primary function of a sentence is to be an independent unit of communication, but one of its secondary functions is to serve as a modifier of a term. Compare the following expansions of the elementary term represented by the noun:
|(5)||the old father|
|the old father of John|
|the old father of John, who returned yesterday|
The modifier of a predicate is normally represented by an adverb, whose primary function is to be a modifier of a predicate. It can, however, be represented by other words, one of whose secondary functions is to serve as a modifier of a predicate. Compare
|(6)||He walked slowly.|
|He walked in a hurry.|
An important syntaxeme is the modifier of a sentence. A modifier of a sentence is represented by an adverb or an adverbial phrase. Examples of modifiers of sentences:
|(7)||Probably, John will come back tomorrow.|
|Unfortunately, it is too late.|
|Last summer I visited Mexico.|
If we transpose a sentence into a term, we can embed one sentence into another sentence. We have already discussed one case of embedded sentences; that is when a transposed sentence serves as a modifier of a term. But a sentence can also serve as a term of another sentence. For example, in
|(8)||I know that John has left.|
we find a sentence John has left that was transposed by means of the conjunction that into the secondary term of sentence (8).
In current linguistic literature, the term clause is sometimes used to denote a sentence that either is a part of another sentence or is combined with another sentence using a coordinating conjunction, such as and, but, etc.:
|(9)||John went to Boston, and Peter went to New York.|
|Boris likes smoking, but I don’t like it.|
Sentences can be coordinated and subordinated. The difference between the coordination and the subordination of sentences is this: coordinated sentences can interchange their positions within the sentence whose components they are, while the main sentence and the subordinated sentence cannot be interchanged without completely changing the meaning of the sentence whose components they are (the result may make no sense at all). So, if we take the above coordinated sentences (9), we can interchange their positions as follows:
|(10)||Peter went to New York, and John went to Boston.|
|I don’t like smoking, but Boris likes it.|
As an example of subordinated sentences, consider
|(11)||I did it, because they asked me to do it.|
If we interchange the positions of the components of this sentence, we will get a sentence with a completely different meaning:
|(12)||They asked me to do it, because I did it.|
As a matter of fact, the clause because they asked me to do it is a modifier of the clause I did it. If we take the above sentence (8) and try to interchange the position of its components, we will get complete nonsense:
|(13)||*John has left that I know.|
According to our definition of dependency, given an operator A and an operand B, A is the head and B is its dependent, if A transposes B from one category into another. If an operator transposes its operand from one category into another, I call the operator a transposer and the operand, a transponend; and the process I call transposition.
Examples of transposition in English: The operator of applied to the term table transposes table from the category of terms into the category of modifiers of terms. The operator is applied to table transposes table from the category of terms into the category of predicates. The operator that applied to the sentence John left Paris yesterday transposes this sentence from the category of sentences into the category of terms (compare I know John and I know that John left Paris yesterday).
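As a rough illustration, a transposer can be modeled as a function that changes the category of its operand while leaving the operand otherwise intact. In the Python sketch below, the class `Expr`, the helper `make_transposer`, and the category labels are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    form: str       # surface form of the expression
    category: str   # "term", "term_modifier", "predicate", or "sentence"


def make_transposer(marker, source, target):
    """Return an operator that transposes a transponend of category `source`
    into category `target`."""
    def transpose(operand):
        if operand.category != source:
            raise ValueError(f"{marker!r} expects a {source}, got {operand.category}")
        return Expr(f"{marker} {operand.form}", target)
    return transpose


# The three English transposers mentioned in the text.
of = make_transposer("of", "term", "term_modifier")
is_ = make_transposer("is", "term", "predicate")
that = make_transposer("that", "sentence", "term")

table = Expr("table", "term")
clause = that(Expr("John left Paris yesterday", "sentence"))
```

The result `clause` has the category of a term, so it can occupy the same position as John in I know John.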
Neither constituency grammar nor dependency grammar has the means to handle transposition. But understanding the phenomenon of transposition is of paramount importance to linguistic theory. This phenomenon has far-reaching implications in both synchrony and diachrony.
Now I consider transposition more closely. The concept ‘transposition’ is different from the concept ‘transformation’ as used in generative-transformational grammar. In generative-transformational grammar, the term transformation refers to operations that effect changes in preestablished structures through deletion, substitution, or permutation of constituents. Transpositions require no deletion, substitution, or permutation; they are purely relational operations transferring expressions from one category into another. It should be pointed out that applicative grammar does not require transformations in the sense of generative-transformational grammar.
The concept ‘transposition’ makes it possible to reveal important relations between syntactic functions and words that have these functions. Word categories have a dual nature: on the one hand, they have some very general lexical meaning, and on the other hand, they have some definite syntactic function. So, nouns denote objects and at the same time function as terms in a sentence; adjectives denote fixed properties and at the same time function as modifiers of terms in a sentence; verbs denote changing properties and at the same time function as predicates of sentences; adverbs denote properties of properties and at the same time function as modifiers of predicates in sentences. These are inherent syntactic functions of the word categories, because these functions correlate with the lexical meanings of the word categories.
Since words are transposed from one category into another, they are at the same time transposed from one syntactic function into another. Thus, we come up with a classification of the syntactic functions in terms of ‘primary’ and ‘secondary’. Primary syntactic functions are inherent syntactic functions of the word categories. Secondary syntactic functions are those that are acquired by words when they are transposed from their basic category into some other category.
Examples: The noun milk may function as a term in a sentence, and that is its primary function; by applying the operator of, we transpose this word into of milk, and now it functions as a modifier of a term, which is its secondary function; by applying the operator is to milk, we get is milk, which functions as a predicate of a sentence, and that is another secondary function of milk.
A sentence can also have primary and secondary syntactic functions. The primary function of a sentence is the function of being an independent unit of communication. Its secondary functions are functions of a term or a modifier. When a sentence is transposed into a clause, it receives a secondary function. Compare, for instance, I know John and I know that John has left.
There are languages, such as Chinese, that do not distinguish between lexical classes. We must abandon the terms verb and noun when describing these languages. Still, we can discover predicative and nonpredicative functions in these languages, just as we can in languages that have lexical classes. These functions can be based on different modes of combination of the expressions that constitute one and the same lexical class.
The interplay of primary and secondary syntactic functions of words is of fundamental importance for diachrony: when a word A receives a secondary function in a certain syntactic environment, there is a universal tendency to replace the word A with a word (or group of words) B, for which this function is a primary function. Owing to this tendency, nouns that have a secondary function of adverbs are replaced by adverbs (for instance, Romance adverbs in -mente are ancient instrumental); adjectives that have a secondary function of terms are replaced by nouns, etc.
The phenomenon of transposition is crucial for the typological classification of languages of the world. For instance, the so-called inflexional languages (such as Latin or Russian) and analytic languages (such as English or French) have different types of transposers (flexions, on one hand, and prepositions, on the other hand). A classification of the languages of the world from the point of view of transposition must answer the fundamental question of universal grammar and language typology: In what way do languages differ with respect to the transposers they use?
The distinction of primary and secondary syntactic functions implies a view of syntactic processes that is different from the view of syntactic processes advocated by generative-transformational grammar.
Generative-transformational grammar regards syntactic nominal groups as a result of nominalization of sentences. In many instances this view is justified. There are, however, instances when it runs into difficulties. Consider, for example, the nominal structure ‘adjective + noun’: the blue sky is not the result of the transformation of the sky is blue. In languages where the formal category of the adjective occurs, the primary syntactic function of the adjective is attributive, and the secondary is predicative.
As a matter of fact, from the standpoint of derivation, the nominal group is no less fundamental than the sentence. The nominal group is fundamental for the syntactic contrast adjective: noun, and the sentence is fundamental for the syntactic contrast noun: verb. The syntactic contrast between the adjective and the noun, on the one hand, and between the noun and the verb, on the other hand, is based on the semiotic Principle of Maximum Differentiation. The nominal group is characteristic of the maximum differentiation between the noun and the adjective, and the sentence is characteristic of the maximum differentiation between the noun and the verb. The syntactic contrast between the noun and the adjective is neutralized in the predicative position, and the syntactic contrast between the noun and the verb is neutralized in the attributive position. That is why the sky is blue must be regarded as derived from the blue sky rather than vice versa: is blue is a secondary function of blue. By contrast, the moon shines must be regarded as primary with respect to the shining moon: the participle shining is a secondary function of the verb.
In accordance with the Principle of Maximum Differentiation, both nominal groups and sentences are fundamental from the standpoint of the direction of derivation. Therefore, under some conditions nominal groups are derived from sentences, and under other conditions sentences are derived from nominal groups.
A process related to functional transposition is what I call functional superposition. Given a syntactic unit, superposition puts a secondary syntactic function over its primary one so as to combine them into a new bistratal syncretic function. For example, if by using the suffix -tion, we change the verb to instruct into a verbal noun instruction, we have a case of transposition: the noun instruction has no syntactic function whatsoever of a verb. If, on the other hand, by using the suffix -ing, we change the verb to instruct into a different kind of noun, the so-called gerund instructing, we will have a case of functional superposition: the verbal noun retains the syntactic functions of the verb to instruct. Thus, it can take an object in the accusative (on instructing him) and an adverb (He suggested our immediately instructing them). The suffix -ing I call a superposer, and the verb to instruct, with respect to the suffix -ing, I call a superponend of -ing. The suffix -ing superposes noun functions onto the verb functions of to instruct so as to combine them into a new syncretic verbal-nominal function.
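The contrast between transposition and superposition can be sketched by treating the syntactic functions of a unit as a set. This is a deliberately crude Python model; the function names `transpose` and `superpose` and the labels "verb" and "noun" are assumptions made for illustration.

```python
def transpose(functions, new_function):
    """Transposition: the unit moves into a new category, so the new function
    replaces whatever functions the unit had before."""
    return frozenset({new_function})


def superpose(functions, new_function):
    """Superposition: the secondary function is laid over the primary one,
    giving a syncretic unit that keeps both."""
    return frozenset(functions) | {new_function}


instruct = frozenset({"verb"})
instruction = transpose(instruct, "noun")   # effect of the suffix -tion
instructing = superpose(instruct, "noun")   # effect of the suffix -ing
```

In this model `instruction` has only the noun function, whereas `instructing` has both the noun and the verb functions, which is why the gerund can still take an object and an adverb.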
Other examples of superposition: In Russian, the characteristic syntactic function of the instrumental case is to be an adverbial, but in addition, it can take on the function of the direct object, as in Ivan upravljaet zavodom ‘John manages a factory’. This sentence can be passivized—Zavod upravljaetsja Ivanom ‘The factory is managed by John’—because the instrumental in the active functions as an accusative case.
The characteristic function of the accusative case is to be the direct object, but in addition, it can take on the function of an adverbial, as in the Russian sentence On rabotal celyj den’ ‘He worked all day long’. In this sentence the accusative celyj den’ can be replaced by an adverbial of time, such as utrom ‘in the morning’, večerom ‘in the evening’, etc. Compare: On rabotal večerom ‘He worked in the evening’. This syntactic behavior of the accusative shows that it functions as an adverbial.
The characteristic function of the Russian dative is the role of an indirect object, but there is a large class of predicates that superpose the function of the subject onto it. For example, in:
|(14)||Vidja, čto proisxodit, emu stydno za svoego brata.|
|seeing what happens him-Dat. ashamed for his-Refl. brother|
|‘Seeing what is happening, he feels ashamed for his brother’.|
In (14) the dative emu has its characteristic function of the indirect object, but in addition, the predicate stydno superposes onto it the function of the subject, so that emu has three properties of a subject: 1) it controls Equi into a participle construction vidja, čto proisxodit; 2) it serves as an antecedent of the reflexive svoego; and 3) it precedes the predicate stydno (although Russian has a free word order, subjects normally precede predicates, while direct and indirect objects follow them).
As will be shown below, the notion of superposition is of paramount importance for understanding the structure of ergative languages.4
Now we can introduce the generalized concept of valence called the valence of an operator.
The valence of an operator is defined as the number of operands that the operator can be combined with. Accordingly, operators can be univalent, bivalent, trivalent, etc.
I call the valence of an operator the generalized concept of valence, since the ordinary concept of valence usually relates to predicates alone, and predicates are only a special class of operators.
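A minimal Python sketch of the generalized notion follows, assuming that an operator is modeled as a function and its operands as the function's parameters. The helper `valence` and the sample operators are hypothetical names introduced for this example.

```python
import inspect


def valence(operator):
    """Quantitative valence: the number of operands the operator combines with,
    read off here from the parameter count of a Python function."""
    return len(inspect.signature(operator).parameters)


VALENCE_NAMES = {1: "univalent", 2: "bivalent", 3: "trivalent"}


# Predicates are one special class of operators ...
def walks(primary):
    return f"{primary} walks"


def beats(primary, secondary):
    return f"{primary} beats {secondary}"


# ... but operators that are not predicates have a valence as well.
def of(term):
    return f"of {term}"
```

The point of the sketch is that `valence` applies uniformly to `walks`, `beats`, and `of`, although only the first two are predicates.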
By applying valence-changing rules, we can increase, decrease, or reorient the valence of a predicate. Here are some examples of the application of these rules:
To fall and to rise are one-place predicates. By changing the root vowels of these predicates, we derive two-place predicates from them: to fell and to raise. Compare
|(1)||a.||The tree has fallen.|
|b.||Someone has felled the tree.|
|(2)||a.||An arm rose.|
|b.||Someone has raised an arm.|
The Russian predicate rugat’ is a two-place predicate like its English counterpart to scold. By applying the suffix -sja to rugat’, we get the one-place predicate rugat’-sja. Compare the following Russian sentences and their English counterparts:
|(3)||a.||On rugaet rebenka.|
|b.||He scolds a child.|
|(4)||a.||On rugaet-sja.|
|b.||He scolds.|
Notice that although in (4b) scolds is used only with one term, that does not mean a decrease of its valence. A true decrease of the valence of a predicate takes place when some formal device is used, such as the suffix -sja in Russian. English does not have formal devices for decreasing the valence of predicates. When to scold is used as a one-place predicate, as in the above example, this use must be characterized as an incompletely realized two-place valence.
Compare now the following sentences:
|(5)||a.||John killed Mary.|
|b.||Mary was killed by John.|
These two sentences have an equivalent meaning and an equal number of terms. Both killed and was killed are two-place predicates. But what is the difference between these predicates? The predicate was killed is the converse of killed, and it involves a permutation of terms: in the sentence (5a) the primary term denotes an agent (John), and the secondary term denotes a patient (Mary); and, as a result of the permutation of these terms, in (5b) the primary term denotes a patient (Mary), and the secondary term denotes an agent (John).
Conversion neither increases nor decreases the valence of a predicate; it is a relational change of valence.
The notion of conversion was defined above for two-place operators. But conversion can be applied to many-place operators. Consider the three-place predicate to load in the following sentence:
|(6)||They loaded goods on trucks.|
where they is a primary term, goods is a secondary term, and trucks is a tertiary term.
We can apply conversion to loaded and permutate the secondary and tertiary terms. We get
|(7)||They loaded trucks with goods.|
As a result of conversion, goods became a tertiary term and trucks became a secondary term.
Now we can generalize the notion of conversion as an operator applied to many-place relations.
DEFINITION OF CONVERSION:
The many-place relation $\check{R}^n$, called the converse of the many-place relation $R^n$, holds between $x_1, \ldots, x_l, \ldots, x_i, \ldots, x_n$ if, and only if, the relation $R^n$ holds between $x_1, \ldots, x_i, \ldots, x_l, \ldots, x_n$.
In accordance with this definition, the conversion of any $n$-place relation $R^n$ involves a permutation of any two terms:
|(8)||$\check{R}^n x_1 \ldots x_l \ldots x_i \ldots x_n \equiv R^n x_1 \ldots x_i \ldots x_l \ldots x_n$|
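The definition can be illustrated with a small Python sketch in which a relation is a Boolean function and conversion swaps two argument positions. The helper `converse`, the toy relations, and the argument order (primary term first) are assumptions made for the example.

```python
def converse(relation, l, i):
    """Return the converse of an n-place relation, obtained by permuting
    the arguments at 0-based positions l and i."""
    def converted(*args):
        swapped = list(args)
        swapped[l], swapped[i] = swapped[i], swapped[l]
        return relation(*swapped)
    return converted


def killed(primary, secondary):
    """Toy two-place relation: holds only for (John, Mary)."""
    return (primary, secondary) == ("John", "Mary")


def loaded_on(primary, secondary, tertiary):
    """Toy three-place relation: holds only for (They, goods, trucks)."""
    return (primary, secondary, tertiary) == ("They", "goods", "trucks")


was_killed_by = converse(killed, 0, 1)      # permutes primary and secondary terms
loaded_with = converse(loaded_on, 1, 2)     # permutes secondary and tertiary terms
```

Under these assumptions, `was_killed_by("Mary", "John")` holds exactly when `killed("John", "Mary")` does, and `loaded_with("They", "trucks", "goods")` holds exactly when `loaded_on("They", "goods", "trucks")` does. The number of arguments is unchanged in both cases.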
The valence of an operator was defined above as the number of operands with which the operator can be combined. Now we can generalize the notion of valence by including in it the relational structure of the operator defined by the rules of conversion of operators. I will distinguish two types of valence: 1) quantitative valence, the number of operands with which an operator can be combined; and 2) relational valence, an orientation of an operator with respect to its operands determined by the rules of conversion.
Classes of operators generated by the valence-changing rules I call valence classes of operators. In the above examples we find the following valence classes: 1) to fall, to fell; 2) to rise, to raise; 3) rugaet, rugaet-sja; 4) killed, was killed.
Now we are ready to consider the notion of voice. It is helpful to define this notion with respect to the notion of valence.
I define voice as follows:
Voice is a grammatical category that characterizes predicates with respect to their quantitative and relational valence.
I regard this definition of voice as a generalization that covers various current notions of voice.
Starting from this definition of voice, we can develop a calculus of possible voices. The calculus will not tell us about the actual number of voices in every language; it will tell us how many voices a language can have. Clearly, no language has all possible voices. Languages differ from one another, in particular, by different sets of voices. That makes it possible to write a new chapter of language typology—a typology of voices.
The above definition of the voice has important consequences, which will be considered in the sections that follow.
We must distinguish two types of languages:
1) languages that have transitive constructions that are in opposition to intransitive constructions, and
2) languages that distinguish not between transitive and intransitive but rather between active and inactive (stative) constructions.
The first type is subdivided into two subtypes: a) accusative languages, such as Russian, Latin, or English, and b) ergative languages, such as Dyirbal. The second type is represented by many Amerindian languages, such as Dakota or Tlingit (these are called languages with the active system) (Klimov, 1973: 214-26).
The terms accusative languages and ergative languages are widely accepted conventional labels characterizing languages that do not necessarily have morphological cases; rather, they have syntactic counterparts of relations denoted by morphological cases. Thus, although Russian has morphological cases and English does not (short of a distinction between the nominative and accusative in personal pronouns: I-me, he-him, etc.), both Russian and English are called accusative languages, because word order and prepositions in English can express the same relations as morphological cases in Russian. For similar reasons, both Basque and Tongan are called ergative languages, although Basque has morphological cases and Tongan does not.
The transitive construction is a sentence with a two-place predicate in which either the primary term denotes an agent and the secondary term a patient (in accusative languages), or the primary term denotes a patient and the secondary term an agent (in ergative languages).
The intransitive construction is a sentence with a one-place predicate and a primary term only, which does not differentiate between the agent and the patient: it may denote either an agent or a patient.
An example of synonymous transitive constructions in Dyirbal (an ergative language) and English (an accusative language) (Dixon, 1972):
|(1)||a)||Ŋuma yabu+ŋgu bura+l.|
|b)||Mother saw father.|
The Dyirbal transitive bura+l corresponds to English saw. Dyirbal ŋuma corresponds to English father. Both denote a patient, but it is a primary term in Dyirbal, while it is a secondary term in English. Dyirbal yabu+ŋgu corresponds to English mother. Both denote an agent, but it is a secondary term in Dyirbal, while it is a primary term in English.
Examples of nondifferentiation between the agent and the patient in English intransitive constructions:
|(2)||a) Peter sells well.|
|(the primary term denotes an agent)|
|b) These books sell well.|
|(the primary term denotes a patient)|
|c) Automobiles are sold.|
|(the primary term denotes a patient)|
|d) Charles and Peter are fighting over Mary.|
|(Charles and Peter are both agents and patients)|
The notions ‘agent’ and ‘patient’ are taken as primitive grammatical concepts.
To define the transitive and intransitive constructions, some linguists use the notions ‘subject’ and ‘direct object’ (for example, Perlmutter and Postal, 1984: 94-95). In terms of these notions, a sentence is transitive if it has both a subject and a direct object, and the sentence is intransitive if it has only a subject.
This characterization runs into difficulties.
First, since subject denotes both the topic and the agent, while direct object denotes a patient, which is a part of the comment, these notions can be used to characterize only the transitive construction in accusative languages: subject corresponds to the primary term denoting an agent, and direct object corresponds to the secondary term denoting a patient. These notions cannot be used to characterize the transitive constructions in ergative languages. Thus, in Dyirbal the primary term and the secondary term in a transitive construction each coincide partly with subject and partly with direct object: the primary term shares the property of topic with subject and the property of patient with direct object, and the secondary term shares the property of nontopic with direct object and the property of agent with subject.
The notion of subject also meets with difficulties in some accusative languages. Thus, Schachter (1976, 1977) has shown that in Philippine languages the notion of subject splits into two notions—the topic and the agent (actor in Schachter’s terminology)—which are independent of each other: there are separate markers for the agent and for the topic. From the point of view of Philippine languages, other accusative languages merge the syntactic properties of the topic and the agent within a single sentence part—the subject, while from the point of view of other accusative languages, Philippine languages divide the syntactic properties of the subject between the topic and the agent.
Subject and direct object are not universal notions, but they are valid concepts in most accusative languages and must be defined in terms of two classes of notions: 1) primary term, secondary term; 2) agent, patient.
Second, the use of the notion ‘subject’ to characterize the primary term in the intransitive construction conceals its fundamental property of nondifferentiating between agents and patients.
The foregoing shows that subject and direct object are complex notions that cannot be used as valid universal constructs. They must be replaced by more fundamental, truly universal notions: primary and secondary terms, on the one hand, and agent and patient, on the other.
The transitive and intransitive constructions in accusative languages can be represented as follows:
The transitive and intransitive constructions in ergative languages can be represented as follows:
These tables show that transitive constructions in the ergative and accusative languages are mirror images of each other.
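The mirror-image relation can be made explicit in a minimal Python sketch. The dictionary names and the `mirror` helper are illustrative assumptions, not part of the formalism of applicative grammar. The role assignments follow the prose of this section: in accusative languages the primary term denotes the agent and the secondary term the patient; in ergative languages such as Dyirbal the assignment is reversed; and the primary term of the intransitive construction does not differentiate between agents and patients.

```python
# Term-to-meaning mapping for each construction, as stated in the prose.
# All names here are illustrative assumptions, not the author's notation.
ACCUSATIVE = {
    "transitive": {"primary": "agent", "secondary": "patient"},
    "intransitive": {"primary": "agent or patient"},
}
ERGATIVE = {
    "transitive": {"primary": "patient", "secondary": "agent"},
    "intransitive": {"primary": "agent or patient"},
}

def mirror(frame):
    """Swap the meanings of the primary and secondary terms."""
    return {"primary": frame["secondary"], "secondary": frame["primary"]}

# The transitive constructions are mirror images of each other,
# while the intransitive constructions coincide.
print(mirror(ACCUSATIVE["transitive"]) == ERGATIVE["transitive"])
print(ACCUSATIVE["intransitive"] == ERGATIVE["intransitive"])
```

Both comparisons hold under this encoding, which is only a restatement of the claims in the text, not an independent argument for them.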
The opposition primary term: secondary term is central to the syntactic organization of any natural language. The range of primary terms is greater than the range of secondary terms: primary terms occur with both intransitive and transitive verbs, while secondary terms occur only with transitive verbs. Primary terms occurring with intransitive verbs must be construed as units resulting from a neutralization of the opposition primary term: secondary term that is associated with transitive verbs.
The syntactic opposition primary term: secondary term belongs in a class of relations characterized by the Markedness Law:
Given two semiotic units A and B which are members of a binary opposition A:B, if the range of A is wider than the range of B, then the set of the relevant features of A is narrower than the set of relevant features of B, which has a plus relevant feature. (The range of a semiotic unit is the sum of its syntactic positions. The term relevant feature means here an essential feature that is part of the definition of a semiotic unit.)
The opposition A:B is the markedness relation between A and B, where A is the unmarked term and B is the marked term of this opposition.
The Markedness Law characterizes an important property of natural languages and other semiotic systems. It makes a significant empirical claim that mutatis mutandis is analogous to the law concerning the relation between mass and energy in physics.
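As a rough illustration of how the Markedness Law picks out the unmarked member of an opposition, consider the following Python sketch. It assumes that "wider range" can be modeled as a proper-superset relation between sets of syntactic positions; that modeling choice, the `RANGE` table, and the helper name `unmarked_member` are assumptions made for the example. The ranges themselves are those given above for primary and secondary terms.

```python
# Ranges (sets of syntactic positions) as given in the text: primary terms
# occur with intransitive and transitive verbs, secondary terms only with
# transitive verbs. RANGE and unmarked_member are hypothetical names.
RANGE = {
    "primary term": {"intransitive", "transitive"},
    "secondary term": {"transitive"},
}

def unmarked_member(a, b, ranges=RANGE):
    """Return the member of the opposition a:b with the strictly wider range."""
    if ranges[a] > ranges[b]:      # proper superset: a has the wider range
        return a
    if ranges[b] > ranges[a]:      # proper superset: b has the wider range
        return b
    return None                    # ranges not comparable: no verdict here

print(unmarked_member("primary term", "secondary term"))
```

Under these assumptions the primary term comes out as the unmarked member, and the secondary term as the marked one.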
In languages with case morphology, primary terms are denoted by the nominative case in accusative languages and by the absolutive case in ergative languages. Secondary terms are denoted by the accusative case in accusative languages and by the ergative case (or its equivalents) in ergative languages.
The relation between primary and secondary terms is characterized by the Dominance Law:
The marked term in a sentence cannot occur without the unmarked term, while the unmarked term can occur without the marked term.
A corollary of the Dominance Law is that the unmarked term of a sentence is its central, its independent term. By contrast, the marked term of a sentence is its marginal, its dependent term. The unmarked term is the only obligatory term of a sentence.
The Dominance Law explains why the marked term is omissible and the unmarked term is nonomissible in a clause representing the opposition unmarked term:marked term. The marked term is omissible because the unmarked term does not entail the occurrence of the marked term. And the unmarked term is nonomissible because the marked term entails the occurrence of the unmarked term; that is, the marked term cannot occur without the unmarked term. As a consequence of this law, languages manifest the following general tendency: ergatives can, but absolutives cannot, be eliminated in transitive constructions in ergative languages; accusatives can, but nominatives cannot, be eliminated in transitive constructions in accusative languages.
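The Dominance Law amounts to a one-way implication, which can be sketched in Python as follows. The function name and the representation of a clause as a set of term labels are assumptions made for illustration only.

```python
def satisfies_dominance(terms):
    """A clause may contain the marked (secondary) term only if it also
    contains the unmarked (primary) term."""
    return "secondary" not in terms or "primary" in terms

# Omitting the marked term leaves a clause that satisfies the law ...
print(satisfies_dominance({"primary", "secondary"}))
print(satisfies_dominance({"primary"}))
# ... but omitting the unmarked term does not.
print(satisfies_dominance({"secondary"}))
```

The check encodes only the implication stated in the law. It does not by itself express the corollary that the unmarked term is obligatory in every sentence.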
As a consequence of the Markedness Law, we can establish a hierarchy of syntactic terms which I will call the Applicative Hierarchy:
|(5)||[ Primary term > Secondary term >(Tertiary term)] > Oblique term|
The brackets and parentheses have a special meaning in (5). The brackets embrace terms as members of the predicate frame: they are the operands of the predicate. The parentheses indicate that the tertiary term is marginal with respect to the primary and secondary terms, which are central to the predicate frame. The tertiary term is marked with respect to the secondary term, because the range of the secondary term is greater than the range of the tertiary term: the secondary term occurs in both transitive and ditransitive constructions, while the tertiary term occurs only in ditransitive ones. And oblique terms are marked with respect to the predicate frames, because the range of the predicate frames is greater than the range of oblique terms: the predicate frame occurs in every clause, while oblique terms occur only in some clauses. The oblique term is a term transposed into an adverbial that serves as an operator modifying the predicate frame.
Some languages have syntactic constructions that seem to be counterexamples to the claim that secondary terms occur only in transitive constructions and tertiary terms occur only in ditransitive constructions. Consider these Russian impersonal sentences:
|(6)||a.||Menja tošnit.|
‘I feel nauseated’.
|b.||Mne xolodno.|
‘I am cold’.
In (6a) menja is the accusative of the pronoun ja ‘I.’ Accusative in Russian, as in other languages, indicates a secondary term, and therefore one may wonder whether menja is a secondary term that occurs here without a primary term. True, the inherent syntactic function of accusative is to be the secondary term of a sentence. But in impersonal sentences it can take on the function of the primary term, which is superposed onto its primary function. Thus, it may be equivalent to the primary term of a subordinate clause, which can be changed into a participial construction by eliminating the primary term:
|(7)||Kogda ja smotrju na eto, menja tošnit. |
‘When I look at this, I feel nauseated’.
|→ Smotrja na eto, menja tošnit.|
‘Looking at this, I feel nauseated’.
Similarly, in (6b) the dative mne, whose inherent syntactic function is to be a tertiary term, has taken on the function of a primary term, which is reflected in its syntactic behavior (like that of menja in (6a)): it can induce the change of a corresponding subordinate clause into a participial construction by eliminating an equivalent primary term.
The terms of a clause are mapped onto grammatical meanings characterizing the participants of the situation denoted by the clause. The primary and secondary terms are mapped onto agent and patient, which are central grammatical meanings in a clause. The oblique terms can be mapped onto grammatical meanings characterizing the spatial opposition where: whence: whither: which way (location, source, goal, instrument) and other grammatical meanings that are rooted in spatial meanings, such as time, beneficiary, recipient, etc. The important thing to notice is that agent and patient are universal grammatical meanings associated with the primary and the secondary terms, while other grammatical meanings vary from language to language: every language has its own set of concrete grammatical meanings associated with oblique terms.
The tertiary term has an intermediary place between central terms and oblique terms. On the one hand, it has something in common with central terms: it may alternate with the primary and the secondary terms, in contrast with oblique terms, which never alternate with the central terms. On the other hand, like oblique terms, it is mapped onto concrete meanings—mostly onto beneficiary or recipient, but also onto instrument, goal, and others.
The grammatical meanings of terms (such as grammatical agent, grammatical patient, etc.) in the sense of applicative grammar are not to be equated with Fillmorean case roles (Fillmore, 1968, 1977), the thematic relations of Gruber (1965), or the theta roles of Chomsky (1981, 1982) and Marantz (1984). For example, Marantz (1984: 129) assigns different roles to the object of the preposition by in passive constructions. Consider
|(8)||a.||Hortense was passed by Elmer. (agent)|
|b.||Elmer was seen by everyone who entered. (experiencer)|
|c.||The intersection was approached by five cars at once. (theme)|
|d.||The porcupine crate was received by Elmer’s firm. (recipient)|
Applicative grammar treats all objects of by in (8a-d) as grammatical agents. Marantz assigns different roles to these terms because he lumps together grammatical and lexical meanings. Lexical and grammatical meanings must be strictly distinguished. Grammatical meanings are obligatory meanings that are imposed by the structure of language, while lexical meanings are variables depending on the context. The grammatical meaning ‘agent’ assigned to a term is a formal meaning that treats an object denoted by the term as an agent no matter whether or not it is a real agent. Thus, the objects denoted by the terms in (8b–d) are not real agents, but linguistically they are treated as if they were real agents. Since lexical meanings are closer to reality, a conflict often arises between the lexical and grammatical meanings of a term. We can observe these conflicts in (8b–d), while in (8a) the lexical meaning of the term agrees with its grammatical meaning.
Every word has a number of meanings: some of them are lexical meanings, and others are grammatical meanings. Although from a structural standpoint the grammatical meanings are the most important, they are the least conspicuous. To dispel any illusions, we must understand that the grammatical meanings of a word are not directly accessible; they are blended with the lexical meanings. The blend of grammatical meaning and lexical meaning constitutes a heterogeneous object.
The Russian linguist Aleksandr Peškovskij, who was far ahead of his time, wrote about the heterogeneity of the meaning of the word as follows:
We warn the reader against the antigrammatical hypnotism that comes from the material parts of words. For us, material and grammatical meanings are like forces applied to one and the same point (a word) but acting sometimes in the same direction, sometimes in intersecting directions, and sometimes in exactly opposite directions. And here we must be prepared to see that the force of the material meaning, just like the stream of a river carrying away an object, will be obvious, while the force of the formal meaning, just like the wind blowing against the stream and holding back the same object, will require special methods of analysis. (Peškovskij, 1934:71)
From a formal, grammatical standpoint, The boy was seen by him behaves exactly in the same way as The boy was killed by him. Grammatically, by him is agent and the boy is patient in both sentences. The predicate was seen implies an agent and a patient in the same way as the predicate was killed. The difference between the two predicates lies in their lexical meaning. Both was seen and was killed imply a grammatical agent as the meaning of the oblique term by him. But the lexical meaning of was killed is in keeping with the grammatical notion of agent, while the meaning of was seen conflicts with this notion. Likewise, in the sentence Her younger sibling hates her, sibling is an agent from a grammatical point of view and an experiencer from a lexical point of view. The lexical notion of the experiencer and the grammatical notion of the agent conflict in a construction with the verb hate.
True, there is an interaction between lexical and grammatical meanings. So, the lexical meaning of the verb visit restricts the use of passive constructions with this verb: we can say John visited Rome, but Rome was visited by John cannot be used unless we mean to achieve a comic effect. But that in no way compromises the fundamental distinction between grammatical and lexical meanings.
The grammatical meaning ‘agent’ can be separated from lexical meanings by means of a thought experiment. If we replace the lexical morphemes of a word with dummy morphemes, we obtain the grammatical structure of a sentence in its pure form.
Here is an example of such an experiment (Fries, 1952: 71):
|(9)||a)||Woggles ugged diggles.|
|b)||Uggs woggled diggs.|
|c)||Woggs diggled uggles.|
|d)||A woggle ugged a diggle.|
|e)||An ugg woggles diggs.|
|f)||A diggled woggle ugged a woggled diggle.|
All of these sentences clearly are transitive active constructions, owing to the specific word order and nominal and verbal morphemes. It is clear that the primary terms in these sentences mean ‘agent’, while the secondary terms mean ‘patient’. Now we can relate passive constructions to all of these sentences:
|(10)||a)||Diggles were ugged by woggles.|
|b)||Diggs were woggled by uggs. etc.|
It is clear that the preposition by introduces a term meaning ‘agent’ in these sentences.
Now let us substitute a lexical morpheme for a dummy root in a verb. If we substitute the morpheme hate for a dummy verbal root, we will get sentences such as
|(11)||Woggles hated diggles.|
We can relate a passive construction to (11):
|(12)||Diggles were hated by woggles.|
From the viewpoint of the lexical meaning of hate, the primary term woggles in (11) and the oblique term in by woggles in (12) mean ‘experiencer’. But this meaning has nothing to do with the grammatical meaning of these terms (‘agent’), which remains invariant under various substitutions of lexical verbal roots whose meaning may often conflict with the grammatical meaning of the terms.
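The substitution experiment can be mimicked in a small Python sketch. The three helper functions are hypothetical and deliberately crude: they handle only plural nouns and verbs whose past participle coincides with the past tense form, which is all that examples (9a), (10a), (11), and (12) require. The point of the sketch is that the function assigning grammatical meanings never consults the verb root.

```python
def active(primary, verb_past, secondary):
    """Build a transitive active sentence from two plural nouns and a verb."""
    return f"{primary.capitalize()} {verb_past} {secondary}."

def passive(primary, verb_past, secondary):
    # The active secondary term becomes the passive primary term; the active
    # primary term is introduced by the preposition "by".
    return f"{secondary.capitalize()} were {verb_past} by {primary}."

def grammatical_meanings(primary, secondary):
    # Assigned by the construction alone; the verb root is never consulted.
    return {primary: "agent", secondary: "patient"}

for verb in ("ugged", "hated"):          # a dummy root, then a lexical root
    print(active("woggles", verb, "diggles"))
    print(passive("woggles", verb, "diggles"))
    print(grammatical_meanings("woggles", "diggles"))
```

Replacing the dummy root ugg- with the lexical root hate changes the sentences but leaves the assigned grammatical meanings untouched, which is the invariance the experiment is meant to show.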
Lexical meanings are the meanings of morphemes that constitute word stems, while grammatical meanings are the meanings of inflexional morphemes, prepositions, conjunctions, and other devices, such as word order. Most current American works on grammar disregard the fundamental opposition grammatical meaning: lexical meaning and confound these notions. Recently, Foley and Van Valin have proposed the notions of actor and undergoer, which they define as “generalized semantic relations between a predicate and its arguments” (Foley and Van Valin, 1984: 29). ‘Actor’ and ‘undergoer’ are abstract notions that roughly correspond to the notions of ‘grammatical agent’ and ‘grammatical patient’ in the sense of applicative grammar. Still, Foley and Van Valin present these abstract notions as purely empirical generalizations, without defining the basis of their generalization. The work lacks a distinction between grammatical and lexical meanings, which is a necessary basis for the above and all other abstractions in grammar. We arrive at grammatical notions by separating, by abstracting, grammatical meanings from lexical meanings.
Rejecting the use of the notions ‘agent’ and ‘patient’ in formulating syntactic rules in view of the alleged vagueness of these notions, some linguists, among them Chomsky, Perlmutter, and Marantz, insist on positing a distinct grammatical level opposed to the level of semantic roles and dependencies. They adduce the dichotomy syntax versus semantics and advocate an autonomous syntax independent of semantics.
The dichotomy syntax versus semantics is false, because signs cannot be separated from meaning. The correct dichotomy is grammar versus lexicon rather than syntax versus semantics. Although grammar and lexicon interact, the grammatical structure of a sentence is relatively independent of lexical morphemes, and linguistic theory must do justice to this fundamental empirical fact. That means that linguistic theory must reject any kind of confusion of grammar with lexicon and must advocate the notion of autonomous grammar, in the sense of autonomy from lexicon rather than semantics.
A clear distinction between grammatical and lexical phenomena is a necessary condition for precision, for avoiding vagueness in semantic analysis. Semantic notions such as ‘agent’ seem to be imprecise and vague because of the confusion of grammar and lexicon. As a grammatical notion, the notion of agent is clear and precise insofar as it correlates with structural grammatical markers.
Let us turn to the second type of sentence construction—to the active system characterized by the opposition active constructions: stative constructions.
Here is how Boas and Deloria (1941) characterize this opposition in their Dakota Grammar:
There is a fundamental distinction between verbs expressing states and those expressing actions. The two groups may be designated as neutral and active. The language has a marked tendency to give a strong preponderance to the concept of state. All our adjectives are included in this group, which embraces also almost all verbs that result in a state. Thus a stem like to “sever” is not active but expresses the concept of “to be in severed condition,” the active verb being derived from this stem. The same is true of the concept “to scrape,” the stem of which means “to be in a scraped condition.” Other verbs which we class as active but which take no object, like “to tremble,” are conceived in the same way, the stem meaning “to be a-tremble.” Active verbs include terms that relate exclusively to animate beings, either as actors or as objects acted upon, such as words of going and coming, sounds uttered by animals and man, mental activities and those expressing actions that can affect only living beings (like “to kill,” “to wound,” etc.). There seem to be not more than 12 active words that would not be covered by this definition. (Boas and Deloria, 1941: 1)
Active languages have special morphological markers for a distinction between active and stative predicates and between terms denoting agents and nonagents. There are three morphological types of active and inactive constructions: 1) constructions with morphological markers only for predicates; 2) constructions with morphological markers for both predicates and terms; and 3) constructions with morphological markers only for terms. These morphological types of active and inactive constructions can be presented in the following diagram (Klimov, 1977: 58):
The important thing to notice is that the opposition of one-place and two-place predicates is not essential for active and inactive constructions. These constructions typically consist of predicates with a single term, which by definition is a primary term. If a secondary term occurs, it is always optional.
In terms of morphological case markers and the notions of subject and direct object, the ergative system in languages such as Basque or Avar has traditionally been viewed as the system of case markers whereby the intransitive subject is identified with the transitive direct object morphologically, while the transitive subject receives a unique case marker. The ergative system contrasts with the accusative system in languages such as Russian, Latin, or German, in which the intransitive and transitive subjects are treated morphologically alike, while the transitive direct object receives the unique case marker. That can be illustrated by the following examples from Basque and Latin:
(1) Basque (Lafitte, 1962):
a) Martin ethorri da.
Martin-Abs. came Aux.-3-Sg.
‘Martin came’.
b) Martin-ek hourra igorri du.
Martin-Erg. child-Abs. sent Aux.-3-Sg.
‘Martin sent the child’.
(In Basque the ergative case marker is -(e)k; the absolutive case has a zero case marker. The auxiliary da is used when there is a third person singular intransitive subject, and the auxiliary du is used when there is a third person singular transitive subject and a third person singular transitive direct object.)
(2) Latin:
a) Venator necavit lupum.
hunter-Nom. kill-Perf. wolf-Acc.
‘The hunter has killed the wolf’.
b) Venator venit.
‘The hunter has come’.
Such facts are well known, but in the recent decade it has been discovered that in many ergative languages some syntactic rules apply to the absolutives in intransitives, and the ergatives in the transitive constructions, in the same way as they do to the intransitive and transitive subjects in accusative languages. That can be illustrated as follows:
In the syntax of accusative languages, there is a rule called Equi-NP Deletion, which deletes the subject of the embedded clause if it is the same as the subject of the matrix clause. Examples:
a) John wants to dance. (from *John wants [John dance])
b) John wants to see Mary. (from *John wants [John see Mary])
c) John can walk. (from *John can [John walk])
There is a syntactic rule called Subject Raising, which promotes the subject of the embedded clause into the matrix clause (the term raising has a metaphorical meaning: in a syntactic tree diagram, the embedded clause is located lower than the matrix clause). Examples:
a) Peter happens to love Nancy. (from It happens that Peter loves Nancy)
b) Jerry seems to be working. (from It seems that Jerry is working)
There is a syntactic rule called Conjunction Reduction, which operates by reducing two sentences to one or some other process. Examples:
a) Harry and Bill are working. (from Harry is working and Bill is working)
b) Ralph sold his old car and bought a new one. (from Ralph sold his old car, and he bought a new one)
It has been observed that in many ergative languages, Equi-NP Deletion, Subject Raising, Conjunction Reduction, and some other syntactic rules apply to a particular class of noun phrases—to a particular class of terms—namely, absolutives in the intransitive constructions and ergatives in the transitive constructions. Here is an example of the application of Equi-NP Deletion in Basque (Anderson, 1976: 12):
|(6)||a.||dantzatzerat joan da|
dance-infin-to go he is
‘He has gone to dance’.
|b.||txakurraren hiltzera joan nintzen.|
dog-def-gen kill-infin-to go I-was
‘I went to kill the dog’.
|c.||ikhusterat joan da.|
see-infin-to go he-is
‘Heᵢ has gone to see himⱼ’.
*‘Heᵢ has gone for himⱼ to see himᵢ’.
The operation of Equi-NP Deletion does not depend on the transitivity of the verb in the matrix clause; the rule is controlled by both transitive verbs such as want and intransitive verbs such as go. It should be noted, however, that it is always ergatives and never absolutives that are deleted in the embedded clause.
Here is an example of Subject Raising in Tongan (Anderson, 1976: 13):
|(7)||a.||‘Oku lava ke hū ‘a Mele ki hono fale.|
pres possible tns enter abs Mary to his house
‘It is possible for Mary to enter his house’.
|b.||‘Oku lava ‘a Mele ‘o hū ki hono fale.|
pres possible abs Mary tns enter to his house
‘Mary can enter his house’.
In (7b) the subject ‘a Mele has been raised from the embedded clause. The rule also applies to transitive embedded clauses:
|(8)||a.||‘Oku lava ke taa’i ‘e Siale ‘a e fefine.|
pres possible tns hit erg Charlie abs def woman
‘It is possible for Charlie to hit the woman’.
|b.||‘Oku lava ‘e Siale ‘o taa’i ‘a e fefine.|
pres possible erg Charlie tns hit abs def woman
‘Charlie can hit the woman’.
The syntactic behavior of the ergative ‘e Siale is similar to the syntactic behavior of subjects in English that are raised out of embedded clauses. Ergatives thus can be raised out of the complements of lava ‘be possible’ regardless of transitivity. The syntactic behavior of absolutives is similar to the syntactic behavior of direct objects in corresponding English sentences: like direct objects, absolutives cannot be raised out of embedded sentences:
(9) *‘Oku lava ‘a e fefine ‘o taa’i ‘e Siale.
pres possible abs def woman tns hit erg Charlie
‘The woman can be hit (by Charlie)’.
If the ergatives in (7), (8), and (9) are interpreted as subjects and absolutives as direct objects, then Subject Raising applies in Tongan in the same sense as in English.
On the basis of this observation, a number of linguists (among them, Anderson, 1976; Comrie, 1978, 1979; Dixon, 1979) claim that absolutives in intransitive constructions and ergatives in transitive ones constitute exactly the same syntactic class denoted by the term subject in accusative languages, and to fail to recognize that as such is to miss a generalization. Because of this generalization, languages such as Basque or Tongan are considered to be morphologically ergative but syntactically accusative.
The linguists who make this claim must be given credit for unearthing the facts of syntax that call for explanation. No matter whether or not we agree with their argument, we must recognize it and contend with its full force and subtlety. Let us consider this argument in detail.
In one of the most important contributions to the study of ergativity, Stephen R. Anderson argues that, on the basis of rules such as Equi-NP Deletion and Subject Raising, embedded intransitive and transitive subjects are no more distinguished in Basque, an ergative language, than in English, an accusative language, and that subjects and direct objects are discriminated in both languages alike. Anderson concludes: “Rules such as those we have been considering, when investigated in virtually any ergative language, point unambiguously in the direction we have indicated. They show, that is, that from a syntactic point of view these languages are organized in the same way as are accusative languages, and that the basically syntactic notion of ‘subject’ has essentially the same reference in both language types” (Anderson, 1976: 16). Anderson admits that Dyirbal is different from accusative languages with respect to its syntax, but he regards that as an insignificant anomaly. He writes: “Dyirbal, which as noted differs fundamentally from the usual type, is in fact the exception which proves the rule” (Anderson, 1976: 23).
The fact that the same rules, such as Equi-NP Deletion and Subject Raising, apply both to subjects in accusative languages and to ergatives in ergative languages calls for explanation. Anderson and other linguists who share the same views must be given credit for showing that this fact engenders a problem. The formulation of a new problem is often more essential than its solution. Given the failure of a solution to a new significant problem, one can never return to old ideas in order to find a new solution; one has to search for new concepts, which marks the real advance in science. The claim that the syntactic organization of most ergative languages follows the pattern of accusative languages may be questioned, but the arguments that have been advanced for this claim must be surmounted within a fresh conceptual framework. That will be not a return to an old idea but an advance to a new idea giving a new significance to the old concept of ergativity.
It cannot be denied that in most ergative languages, with respect to the application of Equi and Subject Raising, ergatives are similar to transitive subjects in accusative languages. But does this similarity justify the generalization that in ergative languages the NPs to which Equi and Subject Raising apply belong to the class of subjects?
To answer this question, we must bear in mind that the subject is a cluster concept, that is, a concept that is characterized by a set of properties rather than by a single property. The application of Equi and Subject Raising is not a sufficient criterion for determining the class of subjects. Among other criteria, there is at least one that is crucial for characterizing the class of subjects. I mean the fundamental Criterion of the Nonomissibility of the Subject. A nonsubject can be eliminated from a sentence, which will still remain a complete sentence. But that is not normally true of the subject. For instance:
|(10)||a.||John paints a landscape.|
|b.||John paints.|
|c.||* Paints a landscape.|
The Criterion of the Nonomissibility of the subject is so important that some linguists consider it a single essential feature for the formal characterization of the subject (Martinet, 1975: 219-24). This criterion is high on Keenan’s Subject Properties List (Keenan, 1976: 313; Keenan uses the term indispensability instead of nonomissibility).
Omissibility should not be confused with ellipsis. Ellipsis is a rule of eliminating syntactic units in specific contexts, and the opposition omission: ellipsis is one of the important aspects of the syntactic structure of any natural language.
When we apply a rule of ellipsis, we can always recover the term that was dropped by ellipsis; but an omitted term cannot be recovered: if in John ate clams we omit clams, we get John ate, and clams cannot be recovered because, starting from John ate, we do not know which noun was omitted.
Every language has specific rules of ellipsis. Thus, in Latin the rule of ellipsis requires the use of predicates without personal pronouns. In Latin we normally say Amo ‘I love’. That is a complete sentence with a predicate and a subject that is implied by the context. In Latin we may say Ego amo only in case we want to place a stylistic stress on the personal pronoun. In Russian, the rules of ellipsis are directly opposite to the rules of ellipsis in Latin: the normal Russian sentence corresponding to Latin Amo is Ja ljublju, without the ellipsis of the personal pronoun; but if we want to place a stylistic stress, we use ellipsis and say Ljublju. The Russian Ljublju is as complete a sentence as Latin Amo. Both sentences have a subject and a predicate.
Mel’čuk characterizes the Criterion of Nonomissibility as follows:
Deletability (called also dispensability; see Van Valin 1977: 690) is a powerful and reliable test for the privileged status of any NP: if there is in the language only one type of NP which cannot be omitted from the surface-syntactic structure of the sentence without affecting the grammaticality of the latter or its independence from the linguistic content, then this NP is syntactically privileged. Note that in English it is GS [grammatical subject] (and only GS) that possesses the property of non-deletability among all types of NP. To put it differently, if a grammatical sentence in English includes only one NP it must be GS. (Imperative sentences like Read this book!, etc. do not contradict the last statement.) Based on such cases as Wash yourself/yourselves!, Everybody stand up! and the like, a GS—you—is postulated in their surface-syntactic structures, where this GS cannot be omitted. It does not appear in the actual sentence following some rules of ellipsis. As for pseudo-imperative sentences of the type Fuck you, bastard!, these are explained away in the penetrating essay by Quang (1971). (Mel’čuk, 1983: 235-36)
The Criterion of the Nonomissibility of the Subject excludes the possibility of languages where subjects could be eliminated from sentences. Yet precisely such is the case with ergative languages, if we identify ergatives with transitive subjects and absolutives with intransitive subjects in intransitive constructions and with transitive objects in transitive constructions. In many ergative languages, we can normally eliminate ergatives, but we cannot eliminate absolutives from transitive constructions. Here is an example from Tongan (Churchward, 1953: 69):
|(11)||a.||‘Oku taki au ‘e Siale.|
‘Charlie leads me’.
|b.||‘Oku taki au.|
‘Leads me (I am led)’.
‘e Siale in (11a) is an ergative. It is omitted in (11b), which is a normal way of expressing in Tongan what we express in English by means of a passive verb (Tongan does not have passive).
Notice that in accusative languages the opposition subject: direct object is normally correlated with the opposition active voice: passive voice, while ergative languages normally do not have the opposition active voice: passive voice. This fact has significant consequences. In order to compensate for the lack of the passive, ergative languages use the omission of ergatives as a normal syntactic procedure that corresponds to passivization in accusative languages (an absolutive in a construction with an omitted ergative corresponds to a subject in a passive construction in an accusative language), or use focus rules that make it possible to impart prominence to any member of a sentence (in this case either an absolutive or an ergative may correspond to a subject in an accusative language). Here is an example of the application of focus rules in Tongan (Churchward, 1953: 67):
|(12)||a.||Na’e tamate’i ‘e Tēvita ‘a Kōlaiate.|
‘David killed Goliath’.
|b.||Na’e tamate’i ‘a Kōlaiate ‘e Tēvita.|
‘Goliath was killed by David’.
Sentence (12a) corresponds to David killed Goliath in English, while (12b) corresponds to Goliath was killed by David. In the first case, the ergative ‘e Tēvita corresponds to the subject David in the active construction, while in the second case, the absolutive ‘a Kōlaiate corresponds to the subject Goliath in the passive. The focus rule gives prominence to the noun that immediately follows the verb, that is, to ‘e Tēvita in (12a) and to ‘a Kōlaiate in (12b).
In Tongan, as in many other ergative languages, we are faced with a serious difficulty resulting from the following contradiction: if the class of subjects is characterized by the application of Equi and Subject Raising, then ergatives are subjects in transitive constructions and absolutives are subjects in intransitive constructions; but if the class of subjects is characterized by the Criterion of the Nonomissibility of the Subject, then only absolutives can be subjects in transitive constructions. Since we cannot dispense with either of these criteria, this creates a contradiction in defining the essential properties of the subject.
One might question whether we cannot dispense with either criterion. What if we choose to define subject in terms of one and disregard the other? The answer is that no essential criterion can be dispensed with in a theoretically adequate definition, because any theoretically adequate definition must include all essential features of the defined concept. We could dispense with one of these criteria only if we considered one of them inessential, that is, relating to an accidental feature of subject. But, as is well known, nonomissibility is a nonaccidental, permanent, essential feature of subject: subject is nonomissible because it is a syntactically distinguished, central, highly privileged term of a sentence in a given language. On the other hand, the behavior of subject with respect to Equi-NP Deletion and Subject Raising is also essential; therefore, we cannot dispense with this criterion, either. The two criteria are essential in defining the notion of subject, but at the same time they contradict one another when the notion of ergative is equated with the notion of subject. This contradiction I call the paradox of ergativity.
To solve this paradox, we must recognize that ergative and absolutive cannot be defined in terms of subject and object, but, rather, these are distinct primitive syntactic functions.
Since the terms ergative and absolutive are already used for the designation of morphological cases, I introduce special symbols with superscripts which will be used when ambiguity might arise as to whether syntactic functions or morphological cases are meant: ERGF means the syntactic function ‘ergative’, while ERGC means the morphological case ‘ergative’. Similarly for ABSF and ABSC.
The syntactic functions ‘absolutive’ and ‘ergative’ should be strictly distinguished from the morphological cases ‘absolutive’ and ‘ergative’. First, some languages, such as Abkhaz or Mayan languages, are case-less, but they have the syntactic functions ‘absolutive’ and ‘ergative’. Second, the syntactic function ‘ergative’ can be denoted not only by the ergative case but also by other oblique cases and coding devices (including word order). Of course, we must establish operational definitions of the syntactic functions ‘absolutive’ and ‘ergative’. An instance of such an operational definition is presented in section 10.1.
The syntactic functions ‘ergative’ and ‘absolutive’ must be regarded as primitives independent of the syntactic functions ‘subject’ and ‘object’.
We can now formulate the Correspondence Hypothesis:
The morphological opposition of case markings ERGC:ABSC corresponds to the syntactic opposition ERGF:ABSF, which is independent of the syntactic opposition subject:object in accusative languages.
The symbols ERGC and ABSC are generalized designations of case markings. So, ERGC may designate not only an ergative case morpheme but any oblique case morpheme, say a dative or instrumental, or a class of morphemes that are in a complementary distribution, such as case markings of ergative.
Let us now compare the two particular classes of terms to which the syntactic rules in question apply: 1) the intransitive and transitive subjects in accusative languages, and 2) the absolutives in intransitive clauses and ergatives in transitive clauses in ergative languages. The first class is homogeneous with respect to nonomissibility (both intransitive and transitive subjects are nonomissible terms), but the second class is heterogeneous with respect to this property of terms: the absolutives are nonomissible, while ergatives are omissible terms. The heterogeneity of the class of terms to which the syntactic rules in question apply in ergative languages is an anomaly that calls for an explanation. We face the problem: How to resolve the contradiction that the ergative, which is the omissible term of a clause, is treated under the rules in question as if it were the nonomissible term?
In order to solve our problem, let us now consider more closely the syntactic oppositions ergative: absolutive and subject: object. Both of these oppositions can be neutralized. Thus, ergatives and absolutives contrast only as arguments of two-place predicates. The point of neutralization is the NP position in a one-place predicate where only an absolutive occurs.
The question arises, What is the meaning of the syntactic functions ergative and absolutive?
Ergative means ‘agent’, which we will symbolize by A. Absolutive, contrasting with ergative, means ‘patient’—henceforth symbolized by P. Since, in the point of neutralization, an absolutive replaces the opposition ergative: absolutive, it can function either as an ergative or as an absolutive, contrasting with ergative; that is, semantically it may mean either ‘agent’ (the meaning of an ergative) or ‘patient’ (the meaning of an absolutive contrasting with an ergative).
The absolutive is a neutral-negative (unmarked) member of the syntactic opposition ergative: absolutive, and the ergative is a positive (marked) member of this opposition. That can be represented by the following diagram:
The meaning of subject and object is defined in terms of A and P as follows:
Object means P; subject contrasting with object means A. In the point of neutralization, subject replaces the opposition subject: object. Therefore, it can function either as subject contrasting with object or as object; that is, semantically, it may mean either ‘agent’ (the meaning of subject contrasting with object) or ‘patient’ (the meaning of object).
The subject is a neutral-negative (unmarked) member of the syntactic opposition subject: object, and the object is a positive (marked) member of this opposition. That can be represented by the following diagram:
We come up with the opposition unmarked term:marked term. On the basis of this opposition, we establish the following correspondence between cases in ergative and accusative constructions:
Examples of the neutralization of syntactic oppositions in English (an accusative language):
|(16)||a.||John opened the door.|
|c.||The doors open.|
|d.||The doors are opened.|
In (16a), which is a transitive construction, the transitive subject John is an agent, and the transitive object the door is a patient. In the intransitive constructions, a subject denotes either an agent, as in (16b), or a patient, as in (16c) and (16d).
Examples of neutralization of syntactic oppositions in Tongan (an ergative language):
|(17)||a.||Na’e inu ‘a e kava ‘e Sione.|
Past drink Abs. the kava Erg. John
‘John drank the kava’.
|b.||Na’e inu ‘a Sione.|
Past drink Abs. John
|c.||Na’e lea ‘a Tolu.|
Past speak Abs. Tolu
|d.||Na’e ‘uheina ‘a e ngoué.|
‘The garden was rained upon’.
In (17a) the ergative ‘e Sione denotes an agent, and the absolutive ‘a e kava denotes a patient. In (17b) the transitive inu is used as an intransitive verb; therefore, here we have the absolutive ‘a Sione instead of the ergative ‘e Sione. In (17c) the absolutive ‘a Tolu denotes an agent. In (17d) the absolutive ‘a e ngoué denotes a patient.
Speaking of the neutralization of syntactic oppositions in ergative languages, we should not confuse ergative languages with active (or agentive) languages. Some linguists consider active languages to be a variety of ergative languages. This view is incorrect: as was shown in section 8 of this chapter, active languages are polarly opposed to both ergative and accusative languages. Sapir distinguished the active construction (typified by Dakota) from both the ergative construction (typified by Chinook) and the accusative construction (typified by Paiute) (Sapir, 1917: 86). The ergative and accusative constructions are both based upon a verbal opposition transitive:intransitive, while for the active construction the basis of verbal classification is not the opposition transitive:intransitive (which is absent here) but rather a classification of verbs as active and inactive (stative). In a series of publications, G. A. Klimov has demonstrated the radical distinctness of the active construction from the ergative and accusative constructions and has provided a typology of the active constructions (Klimov, 1972, 1973, 1974, 1977). The notion of the active construction as radically distinct from the ergative construction is shared by a number of contemporary linguists (see, for example, Aronson, 1977; Dik, 1978; Kibrik, 1979). In active languages, the verbal classification as active and inactive correlates with the formal opposition of terms active (agent):inactive (patient). Since this opposition is valid for both two-place and one-place predicates (a one-place predicate can be combined with a noun in an active or in an inactive case), the NP position in a one-place predicate cannot be considered a point of the neutralization of the opposition active:inactive. So, the notion of syntactic neutralization is inapplicable to active constructions.
Therefore, the discussion below of the consequences of syntactic neutralization in ergative languages does not apply to active languages, where syntactic neutralization is absent.
The markedness relation between absolutive and ergative, and between subject and object, is defined by the Markedness Law (given on page 122):
Given two semiotic units A and B which are members of a binary opposition A:B, if the range of A is wider than the range of B, then the set of the relevant features of A is narrower than the set of the relevant features of B, which has a plus relevant feature.
The binary opposition A:B characterized by the Markedness Law is called the markedness relation between A and B; A is called the unmarked term and B, the marked term of this opposition.
Under the Markedness Law, absolutives and subjects are unmarked terms, and ergatives and objects are marked ones, because absolutives and subjects occur with both one-place and two-place predicates, while ergatives and objects occur only with two-place predicates.
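The step from range to markedness can be sketched in a few lines of Python. This is only an illustration under a stated assumption: the range of a term is modelled as the set of predicate arities it occurs with, following the preceding paragraph. The names `RANGE` and `unmarked_member` are hypothetical and belong to no formal apparatus of the theory.

```python
# Range of each syntactic term, modelled (as an assumption) as the set of
# predicate arities the term occurs with: 1 = one-place, 2 = two-place.
RANGE = {
    "absolutive": {1, 2},
    "ergative": {2},
    "subject": {1, 2},
    "object": {2},
}


def unmarked_member(a, b):
    """Return the unmarked member of the opposition a:b.

    Following the Markedness Law, the member with the strictly wider
    range is unmarked. Returns None when neither range properly
    contains the other, since the law then gives no verdict.
    """
    if RANGE[a] > RANGE[b]:  # proper superset: a has the wider range
        return a
    if RANGE[b] > RANGE[a]:
        return b
    return None


print(unmarked_member("ergative", "absolutive"))  # absolutive
print(unmarked_member("subject", "object"))       # subject
```

The comparison uses proper set inclusion rather than set size, so that two terms with unrelated ranges are not ranked against each other.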
The relation of markedness in a sentence is characterized by the Dominance Law (given on page 123):
The marked term in a sentence cannot occur without the unmarked term, while the unmarked term can occur without the marked term.
A corollary of the Dominance Law is that the unmarked term of a sentence is its central, its independent term. By contrast, the marked term of a sentence is its marginal, its dependent term. I will call the unmarked term the primary term and the marked term the secondary term.
The Dominance Law explains why the marked term is omissible and the unmarked term is nonomissible in a clause representing the opposition unmarked term:marked term. The marked term is omissible because the unmarked term does not presuppose the occurrence of the marked term. And the unmarked term is nonomissible because the marked term presupposes the occurrence of the unmarked term; that is, it cannot occur without the unmarked term.
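As a minimal sketch, the Dominance Law can be read as a well-formedness check on the set of terms present in a clause. The function name `obeys_dominance_law` and the labels `"primary"` and `"secondary"` are illustrative assumptions; the check idealizes away the ellipsis rules and semantic constraints that account for apparent deviations.

```python
def obeys_dominance_law(terms):
    """Check a clause against the Dominance Law.

    `terms` is the set of terms present in the clause, drawn from
    {"primary", "secondary"}. A secondary (marked) term may occur only
    if the primary (unmarked) term occurs as well.
    """
    return "secondary" not in terms or "primary" in terms


# Tongan (11a): ergative (secondary) and absolutive (primary) both present.
print(obeys_dominance_law({"primary", "secondary"}))  # True
# Tongan (11b): the ergative is omitted, the absolutive is kept.
print(obeys_dominance_law({"primary"}))               # True
# The absolutive omitted while the ergative stays: excluded by the law.
print(obeys_dominance_law({"secondary"}))             # False
```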
It is to be noted that the Dominance Law makes an empirical claim that must be validated by empirical research. But this law, like any other linguistic law, is an idealization of linguistic reality in the same sense as physical laws are idealizations of physical reality. Just as in physics empirical research discovers empirically explicable deviations from physical laws, so in linguistics empirical research has to discover empirically explicable deviations from linguistic laws. Empirically explicable deviations from a law should not be confused with real counterexamples that undermine the law. Thus, in (10) and (11) I gave some examples of the omissibility and nonomissibility of terms that can be explained by the Dominance Law. But it is easy to find apparent counterexamples to this law. For example, one could produce a sentence such as John weighs 150 pounds as such a counterexample. But here we have a semantically explicable deviation from the law rather than a counterexample. If we analyze such sentences, we discover a semantic constraint on the omissibility of the direct object: If the meaning of the transitive predicate is incomplete without the meaning of the direct object, then the direct object cannot be omitted. This constraint has nothing to do with the syntactic structure of a sentence; it belongs in the realm of semantics. There may be found other apparent counterexamples to the law that actually are deviations explicable by rules of ellipsis or some other clearly defined constraints.
We can now formulate a law that I call the Law of Duality:
The marked term of an ergative construction corresponds to the unmarked term of an accusative construction, and the unmarked term of an ergative construction corresponds to the marked term of an accusative construction; and, vice versa, the marked term of an accusative construction corresponds to the unmarked term of an ergative construction, and the unmarked term of an accusative construction corresponds to the marked term of an ergative construction.
An accusative construction and an ergative construction will be called duals of each other.
The Law of Duality means that accusative and ergative constructions relate to each other as mirror images. The marked and unmarked terms in accusative and ergative constructions are polar categories, like, for example, positive and negative electric charges; a correspondence of unmarked terms to marked terms and of marked terms to unmarked terms can be compared to what physicists call ‘charge conjugation’, a change of all plus charges to minus and all minus charges to plus.
The proposed Law of Duality also reminds one of laws of duality in projective geometry and mathematical logic. For example, in logic duals are formed by changing alternation to conjunction in a formula and vice versa.
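The mirror-image relation can be checked mechanically. The sketch below assumes that terms of the two constructions correspond when they share a meaning (agent or patient) in the two-place predicate, which is one reading of the law as stated above. The tables `MARKEDNESS` and `MEANING` and the function `counterpart` are hypothetical names introduced for illustration.

```python
# Markedness of the terms in each construction, as stated in the text.
MARKEDNESS = {
    "accusative": {"subject": "unmarked", "object": "marked"},
    "ergative": {"absolutive": "unmarked", "ergative": "marked"},
}

# Meanings of the terms when they contrast in a two-place predicate.
MEANING = {
    "subject": "agent",
    "object": "patient",
    "ergative": "agent",
    "absolutive": "patient",
}


def counterpart(term, source, target):
    """Map a term of the `source` construction onto the term of the
    `target` construction that has the same meaning (assumed criterion)."""
    if term not in MARKEDNESS[source]:
        raise ValueError(f"{term!r} is not a term of the {source} construction")
    return next(c for c in MARKEDNESS[target] if MEANING[c] == MEANING[term])


for term in MARKEDNESS["ergative"]:
    other = counterpart(term, "ergative", "accusative")
    print(term, MARKEDNESS["ergative"][term], "->",
          other, MARKEDNESS["accusative"][other])
# absolutive unmarked -> object marked
# ergative marked -> subject unmarked
```

Under this reading, every term changes its markedness value when mapped onto its counterpart, which is what the comparison with charge conjugation expresses.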
The Law of Duality is valid in phonology, as well. Consider, for instance, the opposition d:t in Russian and the opposition d:t in Danish. On the surface these two oppositions are the same. But, as a matter of fact, the Russian d:t is a case of the opposition voiced:voiceless, and the Danish d:t is a case of the opposition lax:tense.
In Danish the neutralization of the opposition d:t results in d, which can represent either d or t. So, d is a neutral-negative (unmarked) member of the opposition d:t, and t is a positive (marked) member of this opposition. That can be represented by the following diagram:
In Russian the neutralization of the opposition d:t results in t, which can represent either d or t. So, t is a neutral-negative (unmarked) member of the opposition d:t, and d is a positive (marked) member of this opposition. That can be represented by the following diagram:
We come up with the opposition unmarked term:marked term in phonology. On the basis of this opposition, we establish the following correspondence between members of the oppositions lax:tense and voiced:voiceless.
Now we can apply the Law of Duality in phonology:
The marked term of the opposition lax:tense corresponds to the unmarked term of the opposition voiced: voiceless, and the unmarked term of the opposition lax:tense corresponds to the marked term of the opposition voiced:voiceless; and, vice versa, the marked term of the opposition voiced:voiceless corresponds to the unmarked term of the opposition lax:tense, and the unmarked term of the opposition voiced:voiceless corresponds to the marked term of the opposition lax:tense.
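The phonological case can be sketched in the same spirit. The fragment assumes, as in the two preceding paragraphs, that the member appearing in the position of neutralization is the unmarked one. `NEUTRALIZES_TO` and `markedness` are illustrative names only.

```python
# Outcome of neutralizing d:t in each language, as described in the text.
NEUTRALIZES_TO = {"Danish": "d", "Russian": "t"}


def markedness(language):
    """Return the unmarked and marked members of the opposition d:t.

    The member that appears in the position of neutralization is taken
    to be unmarked; the other member is marked.
    """
    unmarked = NEUTRALIZES_TO[language]
    marked = "t" if unmarked == "d" else "d"
    return {"unmarked": unmarked, "marked": marked}


danish = markedness("Danish")    # opposition lax:tense
russian = markedness("Russian")  # opposition voiced:voiceless

# Duality: the marked member in one language is the unmarked one in the other.
print(danish["marked"] == russian["unmarked"])  # True
print(danish["unmarked"] == russian["marked"])  # True
```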
Let us now turn to our main problem, which, in the light of the Markedness Law, can be restated as follows: How can we resolve the contradiction that ergative, which is the secondary (marked) term of a clause, is treated under the above rules as if it were the primary (unmarked) term of a clause?
In order to resolve this contradiction, we must rely on the notion of functional superposition introduced in section 6.3 of this chapter. Functional superposition means that any syntactic unit has its own characteristic syntactic function, but in addition, it can take on the function of any other syntactic unit, so that this function is superposed onto the characteristic function. The notion of functional superposition throws light on our problem. The important thing to notice is that only primary terms can appear in the intransitive clauses. An identification of a term of the transitive clause with the primary term of the intransitive clause involves a superposition of the function of the primary term in the intransitive clause onto the function of the given term of the transitive clause. Three possibilities are open: 1) only the primary term of a transitive clause can be identified with the primary term of an intransitive clause (no superposition); 2) only the secondary term of a transitive clause can be identified with the primary term of an intransitive clause (a superposition of the function of the primary term); or 3) both the primary and the secondary terms of a transitive sentence can be identified with the primary term of an intransitive sentence (no superposition or the superposition of the function of the primary term of an intransitive clause onto the secondary term of the transitive clause).
Accusative languages realize only the first possibility: both intransitive subject and transitive subject are primary terms. But all three possibilities are realized in ergative languages: 1) the syntactic rules in question are stated with reference to only absolutives in intransitive and transitive clauses (Dyirbal); 2) the syntactic rules in question are stated with reference to absolutives in intransitive clauses and ergatives in transitive clauses (Basque); and 3) the syntactic rules in question are stated with reference to absolutives in intransitive clauses and to absolutives and ergatives in transitive clauses (Archi, a Daghestan language; Kibrik, 1979:71-72).
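The three possibilities and their distribution over the languages just named can be tabulated in a short sketch. English stands in here for accusative languages. The table `IDENTIFIED_TERMS` and both function names are hypothetical, and the data simply restate the preceding paragraph.

```python
# For each language: which terms of the transitive clause the syntactic
# rules identify with the primary term of the intransitive clause.
IDENTIFIED_TERMS = {
    "English": {"primary"},             # transitive subject
    "Dyirbal": {"primary"},             # absolutive only
    "Basque": {"secondary"},            # ergative
    "Archi": {"primary", "secondary"},  # absolutive and ergative
}


def possibility(terms):
    """Return which of the three possibilities (1, 2, or 3) is realized."""
    if terms == {"primary"}:
        return 1
    if terms == {"secondary"}:
        return 2
    if terms == {"primary", "secondary"}:
        return 3
    raise ValueError("expected a non-empty subset of {'primary', 'secondary'}")


def involves_superposition(terms):
    """Superposition occurs whenever the secondary term is identified
    with the primary term of the intransitive clause."""
    return "secondary" in terms


for language, terms in IDENTIFIED_TERMS.items():
    print(language, possibility(terms), involves_superposition(terms))
# English 1 False
# Dyirbal 1 False
# Basque 2 True
# Archi 3 True
```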
The notion of functional superposition in the ergative construction should not be confused with the notion of the pivot introduced by Dixon (1979). These notions have nothing in common. Rather, they are opposed to each other. While the notion of functional superposition, characterizing a syncretic property of syntactic units, reveals the radical distinctness of the syntactic structure of the ergative construction from the syntactic structure of the accusative construction, pivot is nothing but a descriptive term that glosses over this distinctness. Dixon uses symbol S to denote an intransitive subject, symbol A to denote a transitive subject, and symbol O to denote a transitive object. Syntactic rules may treat S and A in the same way, or they may treat S and O in the same way: “we refer to S/A and S/O pivots respectively” (Dixon, 1979:132). Dixon writes: “Many languages which have an ergative morphology do not have ergative syntax; instead syntactic rules seem to operate on an ‘accusative’ principle treating S and A in the same way” (Dixon, 1979:63). Referring to Anderson (1976), Dixon relegates Basque and most other ergative languages (except Dyirbal and a few others) to a class of languages that have ergative morphology but accusative syntax. Thus, in spite of a different terminology, Dixon shares the same view as Anderson and other linguists who claim that the syntactic structure of ergative constructions in most ergative languages is identical with the syntactic structure of accusative constructions.
Why does functional superposition occur in ergative constructions and not in accusative constructions? This fact can be explained by a semantic hypothesis advanced on independent grounds. According to this hypothesis based on the semiotic principle of iconicity, the sequence agent-patient is more natural than the sequence patient-agent, because the first sequence is an image of a natural hierarchy according to which the agent is the starting point and the patient is the end point of an action. The semantic hierarchy agent-patient coincides with the syntactic hierarchy transitive subject-direct object in accusative languages, because transitive subject denotes agent and direct object denotes patient. But this semantic hierarchy contradicts the syntactic hierarchy absolutive-ergative, because absolutive, being syntactically a primary term, denotes ‘patient’, which is semantically a secondary term, and ergative, being syntactically a secondary term, denotes ‘agent’, which is semantically a primary term. Hence, under the pressure of the semantic hierarchy agent-patient, functional superposition assigns to the ergative the role of a syntactically primary term.
I propose the following definition of the notions ‘accusative construction’ and ‘ergative construction’:
1. The accusative construction and ergative construction are two representations of the abstract transitive/intransitive clause pattern: primary term + transitive predicate + secondary term/primary term + intransitive predicate.
2. Primary term is represented by subject in the accusative construction and by absolutive in the ergative construction. Secondary term is represented by direct object in the accusative construction and by ergative in the ergative construction.
3. The abstract clause pattern and its representations are characterized by the Markedness Law and the Dominance Law.
4. There is a correlation between the accusative and ergative constructions characterized by the Law of Duality.
5. The primary and secondary terms may exchange their functions, so that the function of the primary term is superposed onto the secondary term, and the function of the secondary term is superposed onto the primary term.
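Points 1 and 2 of this definition amount to a mapping from an abstract pattern to its two representations. A minimal sketch follows, with the hypothetical names `REPRESENTATION` and `realize`. The order of items in the returned list mirrors the order in which point 1 states the pattern and makes no claim about word order in any language.

```python
# How the abstract terms are represented in each construction (point 2).
REPRESENTATION = {
    "accusative": {"primary": "subject", "secondary": "direct object"},
    "ergative": {"primary": "absolutive", "secondary": "ergative"},
}


def realize(construction, transitive):
    """Spell out the abstract clause pattern for one construction."""
    terms = REPRESENTATION[construction]
    if transitive:
        return [terms["primary"], "transitive predicate", terms["secondary"]]
    return [terms["primary"], "intransitive predicate"]


print(realize("accusative", transitive=True))
# ['subject', 'transitive predicate', 'direct object']
print(realize("ergative", transitive=True))
# ['absolutive', 'transitive predicate', 'ergative']
print(realize("ergative", transitive=False))
# ['absolutive', 'intransitive predicate']
```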
In order to make the rules Equi and Subject Raising valid for both accusative and ergative languages, we have to replace them with more abstract rules: Equi-Primary Term Deletion and Primary Term Raising. The generalizations expressed by these abstract rules solve the problem raised by Anderson. Contrary to his claim, there are neither subjects nor direct objects in ergative languages: ergative and absolutive are distinct primitive syntactic functions. What ergative and accusative constructions have in common is that they are different realizations of the abstract construction primary term:secondary term. The new abstract rules represent a correct generalization that cannot be captured in terms of the subject and direct object.
The rules Equi-Primary Term Deletion and Primary Term Raising lay bare the parallelism between the syntactic structure of the ergative construction in Dyirbal and the syntactic structure of the accusative construction, but a sharp difference between the syntactic structures of the ergative construction in Basque and similar ergative languages and the accusative construction. Since in Dyirbal the rules in question apply only to absolutives both in intransitive and in transitive constructions, the only difference between the ergative construction in Dyirbal and an accusative construction, say, in English, boils down to a mirror-image semantic interpretation: while primary terms in Dyirbal ergative constructions denote patients and secondary terms, agents, in any accusative construction, quite the reverse, primary terms denote agents and secondary terms, patients. Let us consider an example of the ergative construction from Dyirbal (Dixon, 1972):
|(19)||Yabu||ŋuma+ŋgu||bura+n.|
|mother||father+Agent||see+Past|
‘Mother was seen by father’.
Compare (19) with the English sentence
We see that the syntactic function of yabu denoting a patient in (19) is a counterpart of the syntactic function of father denoting an agent in (20), while the syntactic function of ŋuma+ŋgu denoting an agent in (19) is a counterpart of the syntactic function of mother denoting a patient in (20).
Now, if we turn to a comparison of Basque and English, we discover a sharp difference between the syntactic structures of accusative and ergative constructions. Consider some examples from Basque discussed by Anderson (1976:12; see (6) above):
‘He has gone to dance’.
‘I went to kill the dog’.
The deleted term from the embedded clause in the underlying structure has to be absolutive in (21) and ergative in (22).
In (22) the syntactic structure of the embedded ergative clause is very different from the syntactic structure of the embedded accusative clause in English, because the deleted ergative in the former has a syntactic behavior that is sharply distinct from the syntactic behavior of the deleted subject in the latter. While subject is an intrinsically primary term, ergative is an intrinsically secondary term. In (22) ergative functions as a primary term, but the function of the primary term is superposed onto ergative, whose characteristic intrinsic syntactic function is to serve as a secondary term. Since in (22) ergative functions as a primary term, the function of the secondary term is superposed onto absolutive, whose characteristic function is to serve as a primary term. The important thing to notice is that in transitive clauses similar to (22), neither ergative nor absolutive loses its characteristic function, and, consequently, absolutive can never be deleted in these clauses, while ergative can, in accordance with the Dominance Law, which holds that secondary terms presuppose primary terms, while primary terms do not presuppose secondary terms.
In summary, there is parallelism and at the same time a sharp difference between the syntactic structures of the ergative and accusative constructions. In order to do justice to both the parallelism and the difference, we have to state our generalizations and rules in terms of the abstract notions ‘primary term’ and ‘secondary term’. The rules Equi-Primary Term Deletion and Primary Term Raising capture what ergative and accusative constructions have in common and at the same time reflect the laws and principles characterizing a profound difference between the two types of syntactic constructions.
Thus, we have come to a conclusion that is diametrically opposite to Anderson’s view on ergativity. He claims that the majority of ergative languages, with the exception of Dyirbal and a few similar languages, have the same syntactic structure as accusative languages. We claim, on the contrary, that the syntactic structure of the majority of ergative languages differs sharply from the syntactic structure of accusative languages, with the exception of Dyirbal, whose syntactic structure exhibits parallelism with the syntactic structure of accusative languages, while not being identical with it.
The theory of ergativity advanced above I call the Integrated Theory of Ergativity, because, rather than opposing morphological ergativity and syntactic ergativity, this theory integrates the two notions of ergativity into a single notion of ergativity.
Now, after I have shown that the Integrated Theory of Ergativity resolves the fundamental problems posed by ergative constructions, I will put it to a further test by tracing out its consequences that bear upon linguistic typology. I will show that this theory is able to explain some further linguistic phenomena.
So long as everything proceeds according to his prior expectations, a linguist has no opportunity to improve on his linguistic theory. Improvements on a linguistic theory result from the search for explanations of anomalous facts.
The statement about the importance of anomalous facts for improving on linguistic theories needs to be qualified. Not all anomalies are equally important for a linguistic theory. For instance, irregular plurals in English, such as mice from mouse, are anomalous, but they are not crucial for a theory of English grammar: these facts belong in the lexicon. Only if significant anomalies can be demonstrated will there be a genuine theoretical issue to face.
A fact that is a significant anomaly for a given linguistic theory I call a linguistic phenomenon.
It follows from the definition of the linguistic phenomenon that this concept is relative to a given theory. A fact that is anomalous from the standpoint of one theory may be regular from the standpoint of another theory.
To explain a linguistic phenomenon is to subsume it under a conceptual framework from whose point of view it ceases to be anomalous and is considered regular.
In testing a linguistic theory, it is important to find out whether this theory can make sense of linguistic phenomena for which there is no way of accounting using the currently accepted theories and, in addition, of all those phenomena that contravene these theories.
Let us now consider further consequences of the Integrated Theory of Ergativity. These consequences will be presented under the following headings:
1. Ergativity as a grammatical category
2. Accessibility to relative clause formation
3. Voices in ergative languages
4. Split ergativity
5. The class of ergative languages
6. The practical results anticipated
One fundamental consequence of this theory is that only those ergative processes can be considered formal ergative processes that correlate with ergative morphology.
I propose a broad definition of morphology that includes any coding device of a language. Under this definition, word order is part of morphology. The Abkhaz and Mayan languages are case-less, but since they have coding devices for marking case relations, these coding devices are covered by my definition of morphology.
Ergative expressions can be found in languages that do not have ergative morphology; that is, they are not distinguished by coding devices (Moravcsik, 1978). For example, as far as nominalizations are concerned, Russian, an accusative language, has ergative expressions: genitive functions as absolutive, and instrumental functions as ergative (Comrie, 1978:375-76). In French and Turkish, both accusative languages, there are causative constructions that are formed on ergative principles (Comrie, 1976:262-63); in French there are anti-passive constructions (Postal, 1977).
Do ergative expressions not distinguished by coding devices belong to a distinct grammatical category, that is, to a distinct grammatical class?
A language is a sign system. And in accordance with the Principle of Semiotic Relevance, two different grammatical meanings are distinct if they correlate with different signs, that is, with different coding devices. Consequently, two classes of expressions belong to different grammatical categories if the difference between their grammatical meanings correlates with different coding devices. For lack of this correlation, the two classes belong to the same grammatical category.
In studying natural languages, one may discover various linguistic relations. But if given linguistic relations are not distinguished from one another by at least one distinct coding rule, then they are variants of the same linguistic relation.
Ergative expressions can constitute a distinct grammatical category in a given language only if they are distinguished from other classes of expressions by at least one distinct coding rule.
In order to make my case concrete, I will consider ergative expressions in Russian. It is claimed that “as far as nominalizations are concerned, Russian has in effect an ergative system” (Comrie, 1978:376). This claim is based on the following data.
In Russian, passive constructions can be nominalized. For example, we may have
(1) a. Gorod razrušen vragom.
city has-been-destroyed enemy-by
‘The city has been destroyed by the enemy’.
b. razrušenie goroda vragom
destruction city-of enemy-by
‘the city’s destruction by the enemy’
(1b) is the nominalization of (1a). In (1b) the genitive goroda denotes a patient, the instrumental vragom denotes an agent, and the verbal noun razrušenie corresponds to a transitive predicate. This nominal construction correlates with a nominal construction in which a verbal noun corresponds to an intransitive predicate and genitive denotes an agent, for example:
(2) ‘the enemy’s arrival’
If we compare (1b) with (2), we can see that the patient in (1b) and the agent in (2) stand in the genitive (functioning as an absolutive), while the agent in (1b) stands in the instrumental (functioning as an ergative). Therefore, we can conclude that in Russian nominalizations involve ergativity.
Does ergativity constitute a distinct formal category in Russian nominal constructions?
Consider the following example of nominalization in Russian:
(3) a. Ivan prenebregaet zanjatijami.
John neglects studies
‘John neglects his studies’.
b. prenebreženie Ivana zanjatijami
neglect John(Gen) studies(Instr)
‘John’s neglect of his studies’
The surface structure of (3b) is the same as the surface structure of (1b), but the instrumental zanjatijami denotes a patient rather than an agent, and the genitive Ivana denotes an agent rather than a patient. In this instance of nominalization, the instrumental zanjatijami functions as an object, and the genitive Ivana functions as a subject.
It is not difficult to find more examples of nominalization in which instrumentals denote patients rather than agents and genitives denote agents rather than patients. This type of nominalization occurs in a large class of verbs that take an object in the instrumental, such as rukovodit’ ‘to guide’, upravljat’ ‘to manage’, torgovat’ ‘to sell’, etc.
All these examples show that Russian does not use any coding devices to make ergativity a distinct formal category in nominal constructions. True, ergativity differs from other relations denoted by the instrumental in Russian nominal constructions. But since ergativity is not distinguished from other relations in the opposition instrumental:genitive by at least one coding rule, ergativity does not constitute a distinct formal category and is simply a member of the class of relations denoted by the instrumental in Russian nominal constructions.
One may object to the above analysis of the ergative pattern and the meaning of the instrumental in Russian nominal constructions by pointing out that in Dyirbal and other Australian languages, the instrumental is used as an equivalent of both the ergative and the instrumental in other ergative languages. Why, one might ask, do I consider Dyirbal to have the grammatical category ‘ergative’ and deny that Russian has this grammatical category?
My answer is that any syntactic pattern must be considered in its relationship to the overall system of the language to which it belongs. The syntactic patterns with the instrumental are very different in Dyirbal and in Russian. True, the instrumental merges with the ergative in Dyirbal. But two instrumentals, one with the meaning ‘agent’ and another with the meaning ‘instrument’, can contrast within the same sentence in Dyirbal, which is impossible in Russian.
Consider the Dyirbal sentence
A similar sentence with agent-instrumental and instrument-instrumental is ungrammatical in Russian. These two instrumentals are in complementary distribution in Russian, while they contrast in Dyirbal. Besides, sentences with agent-instrumentals are basic, that is, unmarked, in Dyirbal, while in Russian, sentences with agent-instrumentals are passive constructions, that is, non-basic, marked constructions. Actually, Russian nominal constructions with agent-instrumentals are analogues of Russian passive constructions.
The above consequence is of paramount importance for typological research: with respect to ergativity, only those syntactic processes are typologically significant that are reflected by morphological processes.
Here are some phenomena that are typologically significant for the study of ergative processes: relativization, split ergativity, extraction rules (so called because they extract a constituent from its position and move it to some other position; the term extraction rules covers WH-Question, relativization, and focus), antipassives, and possessives.
The important thing to note is that the ergative processes connected with these phenomena have no counterparts in accusative languages; they characterize only different types of ergative languages.
In treating ergativity as a grammatical category, we come across the following question: Is ergativity identical with agentivity?
Under the definition of the grammatical category proposed above, ergativity is identical with agentivity if we define the meaning ‘agent’ as a class of meanings characterized by the same coding devices as the syntactic function ‘ergative’.
The claim that agent is a grammatical category in ergative languages is opposed to the currently prevailing view that the notion ‘agent’ is a nonformal, purely semantic concept. Thus, Comrie writes:
I explicitly reject the identification of ergativity and agentivity, [. . .] despite some similarities between ergativity and agentivity, evidence from a wide range of ergative languages points against this identification. (Comrie, 1978:356)
To support his view, Comrie quotes examples, such as the following sentences from Basque (Comrie, 1978:357):
(5) Herra-k z-erabiltza.
hatred-Erg. you-move
‘Hatred inspires you’.
(6) Ur-handia-k d-erabilka eihara.
river-Erg. it-move mill-Abs.
‘The river works the mill’.
Such examples show that agentivity is denied a formal status in ergative languages because of the confusion of the lexical and grammatical meanings of nouns in the ergative case.
Lexical meanings are meanings of morphemes that constitute word stems, while grammatical meanings are meanings of inflexional morphemes, prepositions, conjunctions, and other formal devices, such as word order. Lexical meanings are not necessarily congruous with grammatical meanings they are combined with. There may be a conflict between the lexical and grammatical meanings of a word. For example, the grammatical meaning of any noun is ‘thing’, but the lexical meaning of a noun may conflict with its grammatical meaning. Thus, the lexical meanings of the words table or dog are congruous with their grammatical meanings, but the lexical meanings of rotation (process) or redness (property) conflict with their grammatical meaning ‘thing’. The grammatical meaning of verbs is ‘process’. In verbs such as to give or to walk, the lexical meanings refer to different actions, and therefore they are congruous with the grammatical meaning of the verbs. But consider the verbs to father or to house. Here the lexical meanings conflict with the grammatical meaning of the verbs. Lexical meanings are closer to reality than are grammatical meanings. The differences between word classes are based not upon the nature of elements of reality words refer to, but upon the way of their presentation. Thus, a noun is a name of anything presented as a thing; a verb is a name of anything presented as a process. If we confuse lexical and grammatical meanings, we will be unable to distinguish not only between main classes of words but also between any grammatical categories. A case in point is the grammatical category of agentivity.
From a grammatical point of view, any noun in the ergative case means ‘agent’, no matter what its lexical meaning is (that is, the meaning of the stem of the given noun). In Comrie’s examples, the lexical meanings of herra-k in (5) and of ur-handia-k in (6) conflict with the meaning of the ergative case, which is a grammatical meaning. The ergative case has nothing to do with the objects of reality that the lexical meanings of nouns refer to. It has nothing to do with real agents; rather, the ergative case is a formal mode of presentation of anything as an agent, no matter whether it is a real agent or not. Contrary to the current view, the agent is a formal notion in ergative languages. This claim is based on a strict distinction between lexical and grammatical meanings.
While in ergative languages the agent is a grammatical category, it does not have a formal status in accusative languages. In these languages the agent is a variant of the meaning of the nominative, instrumental, or some other cases, or prepositional phrases. For example, we can use the term agent when speaking of passive constructions, but only in the sense of a variant of some more general grammatical category, because there are no distinct coding devices that separate the meaning ‘agent’ from other related meanings, such as the meaning ‘instrument’. Thus, in English, by introduces an agent in the passive but can have other meanings, as well. Compare: written by Hemingway and taken by force, earned by writing, etc.
Constraints on the extractability of ergatives pose serious problems with respect to the Keenan-Comrie Accessibility Hierarchy (Keenan and Comrie, 1977). Recent investigations have revealed that processes such as relative clause formation are sensitive to the following hierarchy of grammatical relations:
(7) Subject > Direct object > Indirect object > Oblique NP > Possessor > Object of comparison
where > means ‘more accessible than’.
The positions on the Accessibility Hierarchy are to be understood as specifying a set of possible relativizations that a language can make: relativizations that apply at some point of the hierarchy must apply at any higher point. The Accessibility Hierarchy predicts, for instance, that there is no language that can relativize direct objects and not subjects, or that can relativize possessors and subjects but not direct objects and oblique NPs.
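The prediction just stated can be made concrete with a small sketch. It is only an illustration under stated assumptions: the function name `conforms` and the lowercase position labels are hypothetical, and the check encodes nothing beyond the requirement that the relativizable positions form an unbroken segment starting at the top of hierarchy (7).

```python
# Hierarchy (7), from most to least accessible.
HIERARCHY = [
    "subject",
    "direct object",
    "indirect object",
    "oblique NP",
    "possessor",
    "object of comparison",
]


def conforms(relativizable):
    """Return True if the set of relativizable positions is compatible
    with the hierarchy: once one position is inaccessible, no lower
    position may be accessible."""
    unknown = set(relativizable) - set(HIERARCHY)
    if unknown:
        raise ValueError(f"unknown positions: {sorted(unknown)}")
    seen_gap = False
    for position in HIERARCHY:
        if position in relativizable:
            if seen_gap:
                return False  # accessible position below an inaccessible one
        else:
            seen_gap = True
    return True


# The two excluded language types mentioned in the text:
objects_without_subjects = {"direct object"}
possessors_without_objects = {"subject", "possessor"}

print(conforms(objects_without_subjects))            # excluded pattern
print(conforms(possessors_without_objects))          # excluded pattern
print(conforms({"subject", "direct object"}))        # permitted pattern
```

Both excluded patterns fail the check because an accessible position lies below an inaccessible one.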
The Accessibility Hierarchy excludes the possibility of languages where subjects were less accessible to relativization than were objects. Yet precisely such is the case with Mayan languages if the notion ‘ergative construction’ is defined on the basis of subject, as is done by the authors of the Accessibility Hierarchy, whose stance is representative of the views on ergativity. To see that, let us turn to Comrie’s definition of the notions ‘accusative construction’ and ‘ergative construction’ (Comrie, 1978:343-50; 1979:221-23).
In speaking about the arguments of one-place and two-place predicates, Comrie uses the symbol S to refer to the argument of a one-place predicate, and the symbols A (typically agent) and P (typically patient) to refer to the arguments of a two-place predicate. Where a predicate has the argument P, it is called a transitive predicate. All other predicates, whether one-place, two-place, or more than two-place, are called intransitive. An intransitive predicate can and usually does have an S, but it cannot have an A.
Using the three primitives S, A, and P, Comrie characterizes syntactic ergativity and syntactic accusativity (nominativity) as follows:
In treating ergativity from a syntactic viewpoint, we are looking for syntactic phenomena in languages which treat S and P alike, and differently from A. Syntactic nominativity likewise means syntactic phenomena where S and A are treated alike, and differently from P. This distinction is connected with the general problem of subject identification: if in a language S and A are regularly identified, that is, if the language is consistently or overwhelmingly nominative-accusative, then we are justified in using the term subject to group together S and A; if in a language S and P are regularly identified (consistent or overwhelming ergative-absolutive system), then we would be justified in using the term subject rather to refer to S and P, that is, in particular, to refer to P, rather than A, of the transitive construction. (Comrie, 1978:343)
In accordance with this characterization, Comrie arrives at the same conclusion as Anderson: he considers morphologically ergative languages, such as Basque or Tongan, to be syntactically accusative, because these languages treat S and A alike and differently from P; he considers Dyirbal syntactically to be ergative, because this language treats S and P alike and differently from A.
The weakness of such characterization is that the key notion ‘to treat syntactically alike’ is not analyzed adequately. What does it mean to say that Basque treats S and A alike? If it means only the application of the rules Equi-NP Deletion and Subject Raising, then, yes, Basque treats S and A alike and therefore must be regarded, according to this criterion, as a syntactically accusative language. But there is more to the syntax of the ergative construction than these rules. If we consider the markedness opposition and syntactic laws and phenomena associated with this key relation, then we conclude that Basque does not treat S and A alike. As a matter of fact, Comrie’s characterization of ergativity runs into the same difficulties as Anderson’s claim that we discussed in the previous section. But let us put aside these difficulties for now and turn to the Accessibility Hierarchy. Our question is, Can the Accessibility Hierarchy be regarded as a universal law? In order to answer this question, let us consider the facts of Mayan languages.
In Mayan languages, Equi-NP Deletion and other syntactic rules apply to ergatives in much the same way as they do in such languages as Basque or Tongan. Here is an example of Equi-NP Deletion in Quiche (Larsen and Norman, 1979:349):
‘The child began to walk’.
Since Mayan ergatives meet Comrie’s criteria of subjecthood, they must be considered subjects, and Mayan languages must be regarded as morphologically ergative but syntactically accusative languages.
Granted that Mayan ergatives must be defined as subjects, the Accessibility Hierarchy predicts that if Mayan languages allow relativization on absolutives, they must allow it also on ergatives. But, contrary to this prediction, in the Mayan languages of the Kanjobalan, Mamean, and Quichean subgroups, ergative NPs cannot as a rule be relativized (or questioned or focused), while absolutive NPs can. In order for an ergative NP to undergo relativization, it must be converted into a derived absolutive and the verb intransitivized through the addition of a special intransitivizing suffix. Here is an example of this process in Aguacatec (Larsen and Norman, 1979:358):
(9) Ja ø-ø-b’iy yaaj xna7n.
asp. 3sB-3sA-HIT MAN WOMAN
‘The man hit the woman’.
(10) a. Na7 m-ø-b’iy-oon xna7n.
WHO dep.asp.-3sB-HIT-suffix WOMAN
‘Who hit the woman?’
b. Ja ø-w-il yaaj ye m-ø-b’iy-oon xna7n.
asp. 3sB-1sA-SEE MAN THE dep.asp.-3sB-HIT-suffix WOMAN
‘I saw the man who hit the woman’.
c. Yaaj m-ø-b’iy-oon xna7n.
MAN dep.asp.-3sB-HIT-suffix WOMAN
‘It was the man who hit the woman’.
Here -oon is the intransitivizing suffix used to circumvent the constraints on extraction of ergatives (the term extraction rules is a cover term for relativization rules, focus rules, WH-Question).
The features of Mayan languages under discussion closely conform to those of the Dyirbal language, but while the Dyirbal absolutive meets Comrie’s criteria of subjecthood, the Mayan absolutive does not.
Dyirbal does not allow relativization on ergatives; instead, the verb of the relative clause is intransitivized by adding the suffix -ŋa-y, and the ergative is replaced by the absolutive case (Dixon, 1972:100). For instance, consider the Dyirbal sentence
MOTHER(ABS) FATHER+ERG SEE+PAST
‘Father saw mother’.
In sentence (11) the ergative is marked by -ŋgu. In order to be embedded into another sentence as a relative clause, sentence (11) must be antipassivized and ergative ŋuma+ŋgu replaced by absolutive ŋuma+ø. We may get, for example, the sentence
We see that the facts of Mayan languages present strong evidence against the Accessibility Hierarchy. Does that mean that the Accessibility Hierarchy must be abandoned as a universal law? I do not think so. The trouble with the Accessibility Hierarchy is that it is formulated as a universal law in nonuniversal terms, such as subject, direct object, etc. To solve the difficulty, it is necessary to abandon nonuniversal concepts, such as subject and direct object, and to replace them with really universal concepts. The key to the solution of this difficulty is provided by applicative grammar.
From the point of view of applicative grammar, the Accessibility Hierarchy is a particular instance of the Applicative Hierarchy established on independent grounds as a consequence of the Markedness Law (sec. 8 of this chapter):
[Primary term > Secondary term >(Tertiary term)] > Oblique term
The Applicative Hierarchy is interpreted in accusative languages as
[Subject > Direct object > (Indirect object)] > Oblique term
and in ergative languages as
[Absolutive > Ergative > (Indirect object)] > Oblique term
We see that the confusion of ergatives with subjects is inconsistent with the Accessibility Hierarchy, which creates an irresoluble difficulty. The treatment of ergatives and subjects as different syntactic functions, on the other hand, leads to a deeper understanding of the Accessibility Hierarchy, which results in its restatement on an abstract level in keeping with the true basic syntactic universals: primary, secondary, and tertiary terms.
The revised Accessibility Hierarchy accounts both for the facts that motivated the original Accessibility Hierarchy and for the facts that have been shown to contravene it. The revised Accessibility Hierarchy excludes the possibility of languages where primary terms are less accessible to relativization than secondary terms. And this requirement applies both to accusative languages, where primary terms are interpreted as subjects and secondary terms as direct objects, and to ergative languages, where primary terms are interpreted as absolutives and secondary terms as ergatives. All the facts that support the original Accessibility Hierarchy support also the revised Accessibility Hierarchy. But, besides, the revised Accessibility Hierarchy is supported by the facts, like the above examples from Aguacatec, which contravene the original Accessibility Hierarchy. That is a significant result, which shows the importance of the abstract concepts of applicative grammar.
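The contrast between the original and the revised hierarchy can be sketched as follows. This is an illustrative simplification, not a formalization taken from applicative grammar: the names `is_top_segment`, `TO_TERM`, and `check_revised` are hypothetical, only transitive clauses are considered, and the Aguacatec facts are reduced to the single observation reported above, namely that absolutives can be relativized directly while ergatives cannot.

```python
def is_top_segment(ranking, accessible):
    """True if the accessible positions form an unbroken segment
    starting at the top of the ranking."""
    seen_gap = False
    for position in ranking:
        if position in accessible:
            if seen_gap:
                return False
        else:
            seen_gap = True
    return True


# Original hierarchy (truncated) and the revised, abstract hierarchy.
ORIGINAL = ["subject", "direct object", "indirect object", "oblique"]
REVISED = ["primary", "secondary", "tertiary", "oblique"]

# Interpretation of primary and secondary terms by language type.
TO_TERM = {
    "accusative": {"subject": "primary", "direct object": "secondary"},
    "ergative": {"absolutive": "primary", "ergative": "secondary"},
}


def check_revised(language_type, accessible_functions):
    """Map language-specific functions to abstract terms, then test
    them against the revised hierarchy."""
    terms = {TO_TERM[language_type][f] for f in accessible_functions}
    return is_top_segment(REVISED, terms)


# If the ergative is equated with the subject, the directly relativizable
# absolutive of a transitive clause counts as a direct object:
aguacatec_original = is_top_segment(ORIGINAL, {"direct object"})

# If the absolutive is the primary term, the same facts conform:
aguacatec_revised = check_revised("ergative", {"absolutive"})

print(aguacatec_original, aguacatec_revised)
```

On these assumptions the same Aguacatec pattern violates the original hierarchy and satisfies the revised one, while the accusative predictions are unchanged.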
In conclusion, I want to dispel a possible misunderstanding of the concepts I have introduced. It was said above that in order to save the Accessibility Hierarchy, it is necessary to abandon the nonuniversal concepts ‘subject’ and ‘direct object’ and replace them with the universal concepts ‘primary term’ and ‘secondary term’. The important thing to notice is that I suggest replacing one set of concepts with another set of concepts rather than one set of terms with another set of terms. The new terms primary term and secondary term designate a very different set of concepts from the concepts designated by the terms subject and direct object. One might argue that we could save the Accessibility Hierarchy by equating subject with absolutive and object with ergative. But this suggestion would obscure the essential difference between the three sets of concepts:
1) primary term:secondary term,
2) subject:direct object, and
3) absolutive:ergative.
No matter which terminology we use, we must distinguish between these three very different sets of concepts. The second and third sets of concepts are different interpretations (in accusative and ergative languages) of the first truly universal set of syntactic concepts.
One important consequence of the Law of Duality is that the opposition of voices in ergative languages is a mirror image of the opposition of voices in accusative languages: the basic voice in ergative languages corresponds to the derived voice in accusative languages, and the derived voice in ergative languages corresponds to the basic voice in accusative languages.
Since in accusative languages the basic voice is active and the derived voice is passive, that means that pure ergative languages cannot have a passive voice in the sense of accusative languages. Rather, pure ergative languages can have a voice that is converse in its effect to the passive of accusative languages—the so-called antipassive.
A split ergative language can have the passive voice only as a part of its accusative subsystem.
What is called the passive voice in ergative languages by Comrie and some other linguists cannot be regarded as the true passive from a syntactic point of view. Rather, it is a construction resulting from demotion of ergative. Thus, Comrie quotes the following sentence as an example of passive in Basque (Comrie, 1978:370):
True, (13) could be translated into a passive clause in English: The child was sent. But the possibility of this translation has nothing to do with the syntactic structure of (13). Since in any transitive ergative clause absolutive means patient and ergative means agent, the demotion of ergative automatically involves the topicalization of absolutive. The crucial difference between the demotion of ergative and passivization is that passivization topicalizes the patient by means of the conversion of the predicate (John sent the child: The child was sent by John), while the demotion of ergative topicalizes the patient without any change of the predicate (cf. a detailed discussion of clauses with demoted ergatives in Basque in Tchekhoff, 1978:88-93). I suggest calling the constructions with demoted ergatives quasi-passive constructions.
One might argue that the word passive should be used with reference to any construction involving the demotion of ‘agent’. This use of the word passive would, of course, cover the constructions both with converted predicates and with predicates that remain unchanged. However, the question of how the word passive should be used is a purely terminological issue and involves nothing of substance. Granted that we accept the broader use of the word passive, the important thing is to distinguish between and not to lump together two very different types of passive constructions: 1) passive constructions with converted predicates involving the demotion of primary terms and the promotion of secondary terms to the position of primary terms; and 2) passive constructions with only the demotion of secondary terms denoting ‘agents’. The real issue is: Can the two types of passive constructions occur in both accusative and ergative languages? The answer is no. The first type can occur only in accusative languages, the second type only in ergative languages.
Why is the first type possible only in accusative languages? Because, in accordance with the Law of Duality, a counterpart of the first type in ergative languages is its mirror-image, that is, antipassive, construction.
Why is the second type possible only in ergative languages? Because the second type involves the demotion of the secondary term denoting ‘agent’. But in accordance with the Law of Duality, the secondary term of a clause denotes ‘agent’ in ergative languages and ‘patient’ in accusative languages. Therefore, while the demotion of the secondary term in an ergative construction makes it ‘passive’, the demotion of the secondary term in an accusative construction does not make it ‘passive’: the accusative construction remains active.
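The reasoning of the last two paragraphs can be restated as a small sketch. The table `ROLE` and the function `demote_secondary` are hypothetical names introduced only for illustration; the sketch assumes nothing beyond the role assignments stated above and uses the label ‘quasi-passive’ proposed earlier for constructions with demoted ergatives.

```python
# Meaning of primary and secondary terms in a transitive clause,
# as stated by the Law of Duality.
ROLE = {
    "accusative": {"primary": "agent", "secondary": "patient"},
    "ergative": {"primary": "patient", "secondary": "agent"},
}


def demote_secondary(language_type):
    """Classify the construction that results from demoting the
    secondary term without converting the predicate."""
    demoted_role = ROLE[language_type]["secondary"]
    # Only the demotion of an agent yields a passive-like construction.
    return "quasi-passive" if demoted_role == "agent" else "active"


def is_mirror_image(a, b):
    """True if the role assignments of two language types are swapped."""
    return (ROLE[a]["primary"] == ROLE[b]["secondary"]
            and ROLE[a]["secondary"] == ROLE[b]["primary"])


print(demote_secondary("ergative"))
print(demote_secondary("accusative"))
print(is_mirror_image("ergative", "accusative"))
```

The sketch only restates the deduction; whether actual ergative languages behave this way is the empirical question raised in the text.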
The above claims are deductive consequences of the Law of Duality. This law is subject to disconfirmation if counterexamples are found that cannot be explained as deviations motivated by special empirical conditions. The empirical study of voices in ergative languages in order to confirm or disconfirm the Law of Duality is one of the fascinating outcomes of the proposed theory of ergativity.
Ergative languages tend to exhibit various splits in case markings. These splits can be explained as conditioned by the properties of ergative constructions.
It is well known that in many ergative languages the ergative construction is confined to the past tense or the perfect aspect. How can we explain the correlation between ergative constructions and the tense/aspect?
Since in the ergative construction the primary term denotes patient, that means that the ergative construction presents the action from the point of view of the patient; therefore, the ergative construction focuses on the effect of the action. In the accusative construction the primary term denotes agent, which means that the accusative construction presents the action from the point of view of the agent; therefore, the accusative construction focuses on an action that has not yet been accomplished. Focusing on the effect of an action tends to correlate it with the past tense and the perfect aspect, while focusing on an action that has not yet been accomplished correlates it with the present, the future, the imperfect, and the durative aspects.
Similar explanations of the split in case marking conditioned by tense/aspect have already been proposed by other linguists (Regamay, 1954:373; Dixon, 1979:38). What, however, has passed unnoticed is that accusative languages present a counterpart of this split. Accusative languages tend to restrict the use of the passive constructions to the past tense and the perfect. For example, in Old Russian the use of passive constructions was unrestricted. In Modern Russian, however, the passive voice is confined to the past participles in the perfective aspect. Other types of passive construction have been replaced by constructions with reflexive verbs, which are used as a substitute for passive constructions. The explanation of this phenomenon suggests itself immediately if we accept the view that the passive construction is conceptually related to the ergative construction. Like the ergative construction, the passive construction presents the action from the point of view of the patient, and therefore it tends to correlate with the past tense and the perfective aspect.
Another major split in case marking is that involving personal pronouns. In most ergative languages, nouns and personal pronouns tend to have different patterns of case markings. For example, in Australian languages, pronouns usually have accusative case markings, while nouns have ergative case markings (Dixon, 1976). In Caucasian languages, nouns have ergative case markings, while pronouns mostly have an identical form for transitive agents and for patients. The noun/pronoun split in ergative languages can be explained on the same basis as the tense/aspect split. Since the ergative construction presents the action from the point of view of the patient, it cannot be used in situations where the action should be presented from the point of view of the agent, as in the case where we use personal pronouns (Blake, 1977).
A different type of split is analyzed by Peter Hook (1984). In terms of applicative grammar, this split can be characterized as follows. In some languages, such as Kashmiri or Sumerian, the fundamental syntactic opposition is primary term:secondary term, characterized by morphological markers. Thus, in Kashmiri the primary term is indicated by the second person suffix -akh (or its allomorphs -kh and -h), and the secondary term is indicated by the second person suffix -ath (or -th). Depending on the ergative or nonergative tense/aspect, the opposition primary term:secondary term splits as follows: in the ergative tense/aspect, the primary term is absolutive and the secondary term is ergative, but in the nonergative tense/aspect, the primary term is subject and the secondary term is direct object. If the transitive construction is in the ergative tense/aspect, the primary term means patient and the secondary term means agent; if the transitive construction is in the nonergative tense/aspect, the primary term means agent and the secondary term means patient. In order to characterize this split, Hook uses the term superabsolutive corresponding to the primary term that means patient in the ergative tense/aspect and agent in the nonergative tense/aspect, and the term antiabsolutive corresponding to the secondary term that means agent in the ergative tense/aspect and patient in the nonergative tense/aspect. From the standpoint of applicative grammar, the above split is a very special instance of the Law of Duality:
The marked term of the syntactic construction with the ergative tense/aspect corresponds to the unmarked term of the syntactic construction with the nonergative tense/aspect, and the unmarked term of the syntactic construction with the ergative tense/aspect corresponds to the marked term of the syntactic construction with the nonergative tense/aspect; and, vice versa, the marked term of the syntactic construction with the nonergative tense/aspect corresponds to the unmarked term of the syntactic construction with the ergative tense/aspect, and the unmarked term of the syntactic construction with the nonergative tense/aspect corresponds to the marked term of the syntactic construction with the ergative tense/aspect.
The definition of the ergative construction proposed here provides a uniform basis for the explanation of all splits in case marking in ergative languages.
Languages that are represented in current linguistic literature as ergative may be found to be nonergative in the light of the definition of the ergative construction. Thus, Georgian is generally represented as an ergative language (or, more precisely, as a split ergative language). For various reasons, some linguists have questioned whether Georgian is ergative at all (for example, Aronson, 1970). In the light of the Law of Markedness and the Dominance Law, we can characterize Georgian as an ergative language that has undergone a process of the reversal of markedness. Since in Georgian the ergative case has replaced the absolutive in intransitive clauses, the range of the ergative case became greater than the range of the absolutive, and, as a result, the two cases exchanged their places in the markedness opposition: the ergative turned into an unmarked case, and the absolutive into a marked case. With the exception of some traces of ergativity, contemporary Georgian must be considered an accusative language.
A revision of the class of ergative languages in the light of the proposed definitions of the notions ‘ergative construction’ and ‘ergative language’ may lead to exclusion of some other languages from this class.
In conclusion, I will say a few words about the practical results anticipated.
The first benefit that can be expected from the proposed theory of ergativity is that it will give an adequate understanding of the already accumulated vast amount of facts on the morphology, syntax, and semantics of ergative languages and will set guidelines for fruitful future field work in this domain.
The last few years have seen a significant increase in the amount of data on ergative languages, in particular on their syntax, but no generally accepted solution to the problem of ergativity has yet evolved. One might argue that the proper way to solve this problem is to increase field research in this area. There is no doubt that further field research on ergative languages is of paramount importance to a deeper understanding of ergativity. But field research cannot be fruitful without an understanding of already accumulated data and a realization of what to look for. In accordance with the proposed theory of ergativity, among the other tasks of future field research in this area, the following can be considered urgent:
1) collection of empirical data concerning different processes in ergative languages that in current linguistic literature are lumped together under the name passivization;
2) collection of empirical data concerning the application of the rules Equi-NP Deletion and Subject Raising to ergative constructions. The empirical data so far collected are clearly inadequate. The current view that in most ergative languages these rules apply to absolutives in intransitive clauses and to ergatives in transitive clauses is based on inadequate empirical data. Data from some Caucasian languages present evidence that these rules can be applied freely both to absolutives and ergatives, on the one hand, and to two absolutives, on the other.
3) collection of empirical data concerning the occurrence of ergatives in intransitive clauses. Ergatives occur in intransitive clauses either as a result of special conditions that are to be investigated, or as a result of the reversal of markedness, as in the case of Georgian.
4) collection of data concerning the use of instrumental and other oblique cases in ergative constructions. It may happen that a theoretical scrutiny of these data will discover that some allegedly ergative constructions are really varieties of the accusative construction.
There are some other important tasks of field research in ergative languages, but I will not discuss them here.
The second benefit that I anticipate is that the proposed research will call for a serious overhaul of existing theories of universal grammar.
In contrast to existing theories of ergativity, which oppose morphological, syntactic, and semantic ergativity and tend to emphasize one of these aspects at the expense of others, the Correspondence Hypothesis underlying the proposed research views ergativity as a unitary phenomenon that presupposes an isomorphism of morphological, syntactic, and semantic levels of ergativity.
The oppositions absolutive:ergative and subject:object are syntactic oppositions independent of each other. Ergative constructions cannot be defined by accusative constructions, nor can accusative constructions be defined by ergative constructions; rather, both these types of constructions must be defined with respect to a more abstract syntactic level underlying both ergative and accusative syntax. This abstract level is a fundamental component of applicative grammar.
The Law of Duality reveals the interrelation of ergative and accusative constructions in a more general semiotic framework of the opposition of markedness, which is valid not only in syntax but in phonology and semiotics, as well.
One important consequence of the Correspondence Hypothesis is that the morphology of an ergative language corresponds to some of its essential syntactic properties. Only those syntactic properties can be called ergative and have typological significance that have a counterpart in morphology.
With respect to the necessity of an overhaul of the existing theories of universal grammar, the following points are especially important:
a) Since, as was shown above, the notions of subject and direct object cannot be applied to the description of ergative languages, they cannot be considered universal. Therefore, they must be abandoned as primitive syntactic functions of universal grammar and replaced by the concepts of ‘primary term’ and ‘secondary term’, which are defined on the basis of the primitive concepts ‘operator’ and ‘operand’.
b) The new theory of ergativity calls for a reformulation of the Accessibility Hierarchy in terms of the concepts ‘primary term’ and ‘secondary term’.
c) Contrary to the common approach to syntax, which disregards or at least underestimates morphological data, the new theory of ergativity calls for a careful study of morphological data.
d) The only theory of universal grammar that at present provides an adequate theoretical framework for the new theory of ergativity is applicative grammar.
So far I have taken the notion of the passive voice for granted. In the present section I will give a theoretical analysis of this notion. I will be concerned with difficulties posed by passive constructions and various approaches to these difficulties. As a result of the theoretical analysis, a new theory of passivization will emerge, which will be presented informally here. A formalized theory of passivization will be given later, following the formal description of applicative grammar.
The theory of passivization presented in this book contains much of what was developed in my joint work with Jean-Pierre Desclés and Zlatka Guentcheva (Desclés, Guentcheva, Shaumyan, 1985, 1986) but also introduces new notions and goes beyond what we did in a number of ways.
Although the relation between active and passive seems to be simple, in defining passive we face difficulties, which will be discussed here.
Consider first the sentences in English
|(1)||a. John closed the door.|
|b. The door was closed by John.|
|c. The door was closed.|
(1a) is an active sentence; (1b) and (1c) are corresponding passive sentences.
(1a) consists of three parts: the term John denoting an agent, the active predicate closed, and the term the door denoting a nonagent. (1b) consists of three parts: the term the door denoting a nonagent, the passive predicate was closed, and the term by John denoting an agent. (1c) has two parts: the nonagent the door and the passive predicate was closed—the term by John, denoting an agent, is missing.
From now on, I will call sentences such as (1b) long passive sentences, and sentences such as (1c) short passive sentences.
We observe that sentences (1a) and (1b) correlate with each other. They form an opposition active:passive. The correlation between the members of this opposition can be characterized as follows:
The passive predicate was closed is the converse of the active predicate closed.
The term John, which precedes the active predicate, corresponds to the term by John, which succeeds the passive predicate. The term the door, which succeeds the active predicate, corresponds to the term the door, which precedes the passive predicate.
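The correlation just described can be pictured as an operation on a two-place relation. What follows is only a minimal sketch in Python, not part of the formal apparatus of the theory: it assumes that a two-place predicate can be modeled as a set of (primary term, secondary term) pairs, and the names converse, closed, and was_closed are illustrative.

```python
from typing import Set, Tuple

# Assumption: a two-place predicate is modeled as the set of
# (primary term, secondary term) pairs for which it holds.
Relation = Set[Tuple[str, str]]


def converse(relation: Relation) -> Relation:
    """Return the converse relation: the two terms exchange places."""
    return {(second, first) for (first, second) in relation}


# (1a) John closed the door: John is the primary term, the door the secondary.
closed: Relation = {("John", "the door")}

# (1b) The door was closed by John: the same pair, mirrored.
was_closed = converse(closed)

print(was_closed)  # {('the door', 'John')}
```

Applying converse a second time gives back the original set of pairs, which is one way of seeing that the relation between the agent and the nonagent is not altered by the conversion.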
In addition to the opposition active:passive, English has another opposition: long passive:short passive.
The meaning of passive predicates in short passives is ambiguous. Thus, the predicate was closed in (1c) either may imply an unspecified agent or may be simply a synonym of was not open without implying any agent. An implication of an unspecified agent is not a formal feature of the passive predicate but depends solely on the context or an opposition with adjectives. Take, for example, the sentence
(2) Today the shop is open, but yesterday it was closed all day long.
In (2) was closed is simply a synonym of was not open. Unlike was closed, the predicate was opened usually implies an unspecified agent, only because there is the predicate was open. While opened is a member of the opposition opened:open, there is no corresponding opposition for closed.
The ambiguity of passive predicates in short passives can be observed in other languages. For example, in the Russian short passive
|(3)||Ego dom vsegda otkryt dlja každogo.|
|‘His house is always open for everybody’.|
the passive predicate otkryt has the meaning of the adjective ‘open’.
In Latin the passive movetur means either ‘he (she, it) moves’ or ‘he (she, it) is moved’. Only the second meaning is passive proper; the first one is the meaning of the so-called middle voice (Lyons, 1968:375).
Although some languages, such as Uto-Aztecan ones, have special markers for designating unspecified agents in short passives (Langacker, 1976; Langacker and Munro, 1975), most languages of the world do not have these markers. Nevertheless, the unspecified agent must be considered an integral part of the short passive, because the short passive is a member of the opposition short passive:active. For example, The door was closed correlates with a set of active sentences: John closed the door, The boy closed the door, She closed the door, and so on. If we abstract from concrete agents in these sentences, we get the notion of the unspecified agent, which must be assigned to the predicate was closed of The door was closed. In most languages the unspecified agent is a zero term—a ‘silent term’—of the predicates of short passives. From a functional point of view, a short passive is a mirror image of an abstract sentence based on the corresponding set of active sentences. Thus, if we abstract from the concrete agents in a set of sentences corresponding to The door was closed, we get [unspecified agent] closed the door, whose mirror image is The door was closed [unspecified agent].
The passive meaning involving the unspecified agent is the inherent function of predicates in short passives. But certain contexts may superpose special functions onto predicates in short passives, as in the superposition of the function of an adjectival predicate onto was closed in (2) or onto the Russian otkryt in (3), or the function of the middle voice, as in the Latin movetur.
The passive predicate of a short passive construction is clearly a one-place predicate, which results from the application of the two-place converse predicate to the zero term denoting an unspecified agent. But what is the passive predicate of a long passive construction? Is the passive predicate of a long passive construction a one-place predicate or a two-place predicate?
If we compare (1b) with (1a), we see that from a functional point of view (1b) is a mirror image of (1a): The door in (1b) is a mirror-image counterpart of the door in (1a), and by John in (1b) is a mirror-image counterpart of John in (1a). Actually, by John functions as a secondary term of (1b). But can we conclude from this fact that by John is a regular secondary term like secondary terms in active sentences? No, we cannot, because normally by-phrases are used as oblique complements. Here are some examples:
|(4)||a.||He stood by the window.|
|b.||John entered the house by the back door.|
|c.||He walked by me without noticing me.|
|d.||Be here by this time tomorrow.|
|e.||He did not play by the rules.|
|f.||He led the blind man by the hand.|
|g.||He earns money by writing.|
|h.||Cats sleep by day and hunt by night.|
|i.||He is French by birth.|
|j.||He did it all by himself.|
The function of by-phrases in long passive constructions sharply differs from their function as oblique complements. To see that, let us take the sentence
(5) Mary was killed by John by the seashore.
In this long passive construction, by John is directly opposed to Mary as a secondary term, and it is opposed to by the seashore as a nucleus term to a marginal, oblique term. We can replace by the seashore with other oblique noun phrases, but that cannot be done with by John. The possible replacements are shown in the following example:
(6) shows clearly that by John functions as a nuclear secondary term. There is a striking difference between the meanings of the preposition by in by John and in by the seashore. In by John the preposition by has lost its concrete meaning and has become a marker introducing a secondary term.
We observe a conflict between the normal function of by-phrases, which is the role of oblique complements, and the function of by-phrases in long passive constructions, which is the role of secondary terms. To resolve the conflict, we must use the notion of functional superposition, which was introduced in section 6.3 of this chapter.
In the light of the notion of functional superposition, we must distinguish between an inherent syntactic function of by-phrases and a syntactic function superposed onto the inherent function. Since the superposed function is special with respect to the inherent function of a syntactic unit, the range of the inherent function is wider than the range of the superposed function. Therefore, the distinction between the inherent and superposed functions of a syntactic unit must be based on the Markedness Law introduced in section 8 of this chapter. As is shown by empirical material, by-phrases are normally used as oblique complements—the range of by-phrases used as oblique complements is wider than the range of by-phrases used as secondary terms of long passive constructions. Consequently, the role of oblique complements is an inherent function of by-phrases, and the role of the secondary terms of long passive constructions is their superposed function. Although as a part of a long passive construction a by-phrase takes the function of a secondary term, it remains an oblique complement—it modifies the passive predicate, which remains a one-place predicate; the passive predicate in long passive constructions is a one-place predicate having a superposed function of a two-place predicate. Accordingly, the components of long passives in English must be described as follows:
primary term + one-place passive predicate + by-phrase modifier of one-place passive predicate
This structure undergoes a functional shift under the pressure of the opposition active construction:long passive construction. Since the long passive construction is a mirror image of the active construction, the passive predicate and its by-phrase modifier take on the functions of a two-place predicate and its secondary term, respectively. Prepositional phrases in active constructions may undergo a similar shift. Compare
|(7)||I||II|
|a.||to sit on the bench||a.||to agree on the terms|
|b.||to fly over the bridge||b.||to argue over the plan|
|c.||to be at the office||c.||to look at the picture|
In both I and II we have one-place predicates. But in II the prepositions have lost their concrete meaning and have become markers introducing a secondary term. Therefore, the prepositional phrases in II have taken the function of secondary terms, which has been superposed onto their inherent function of oblique complements. At the same time, the one-place predicates in II have taken the function of two-place predicates, which is superposed onto their inherent function.
So far I have used examples from English, but a similar analysis of the structure of long passives is supported by diverse languages having long passives. The secondary term of a long passive construction is never an inherently secondary term; it is a prepositional phrase or a noun in an oblique case whose inherent function is the role of an oblique complement.
Here is an example from Latin:
|(8)||a.||Magister discipulum laudat.|
teacher (Nom) student (Acc) praises
‘The teacher praises the student’.
|b.||Discipulus laudatur a magistro.|
student (Nom) is praised from teacher (Abl)
‘The student is praised by the teacher’.
(8a) is an active sentence, and (8b) is a long passive sentence. In (8b) the phrase a magistro consists of the preposition a and the noun magistro in the ablative. In (8b) a magistro has the superposed function of the secondary term, but its inherent function is to be a predicate modifier that means a point of departure; the inherent meaning of a magistro is ‘from the teacher’.
In Russian the inherent function of the instrumental is to be a predicate modifier that means an instrument. But the instrumental is also used in long passive constructions, where it takes the superposed function of the secondary term.
What is the meaning of passive predicates? Is the meaning of passive predicates the same in long passive constructions and in short passive constructions?
The above examples of long passive constructions show clearly that their passive predicates are converses of corresponding active predicates.
To characterize the meaning of passive predicates in short passive constructions, let me first introduce the notion of derelativization. This notion, used under different names, has been useful in contemporary logic, and it may turn out to be useful in linguistics, as well. Here is how Quine characterizes derelativization:
Commonly the key word of a relative term is used also derelativized, as an absolute term to this effect: it is true of anything x if and only if the relative term is true of x with respect to at least one thing. Thus anyone is a brother if and only if there is someone of whom he is a brother. Where the relative term is a transitive verb, the corresponding absolute term is the same verb used intransitively.
Relative terms also combine with singular terms by application, to give absolute general terms of a composite kind. Thus the relative term ‘brother of’ gives not only the absolute general term ‘brother’ but also the absolute general term ‘brother of Abel.’ Similarly the relative term ‘loves’ gives not only the absolute general term ‘loves’ (intransitive) but also the absolute general term ‘loves Mabel.’ Again the relative term ‘at’ gives the absolute general term ‘at Macy’s.’ (Quine, 1960:106)
Using the notion of derelativization, we can characterize the meaning of the passive predicate in short passive constructions as the derelativized converse of the active predicate. When the passive predicate is modified by a by-phrase in English or an analogous phrase in other languages, we get long passives in which the passive predicate takes the function of a regular converse predicate, and the phrase modifying the passive predicate takes the function of the secondary term of the converse predicate. The relation between the passive predicate of a short passive and the passive predicate of a long passive—say, between was closed in The door was closed and was closed in The door was closed by John—is similar to the relation between captain in He was a captain and captain in He was the captain of the team. Derelativization explains why passive predicates in short passive constructions have ambiguous meaning: since passive predicates are one-place predicates, their converse meaning must be supported either by the context or, as in long passive constructions, by agent modifiers; without this support, the converse meaning is lost.
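Quine's characterization lends itself to a small illustration. The sketch below, in Python, rests on an assumption that is not made in the text: a relative term is modeled as a set of pairs, and derelativization as the set of first members of those pairs. The function name derelativize and the sample pairs are illustrative only.

```python
from typing import Set, Tuple

# Assumption: a relative (two-place) term is a set of pairs (x, y),
# read as "the term is true of x with respect to y".
Relation = Set[Tuple[str, str]]


def derelativize(relation: Relation) -> Set[str]:
    """Absolute term: true of x if and only if the relative term is
    true of x with respect to at least one thing."""
    return {x for (x, _other) in relation}


# Quine's example: anyone is a brother iff he is a brother of someone.
brother_of: Relation = {("Cain", "Abel")}  # illustrative pair
brother = derelativize(brother_of)

# Converse of 'closed', given here directly as (nonagent, agent) pairs.
was_closed_by: Relation = {("the door", "John"), ("the gate", "the boy")}

# Short passive predicate: the agent position is no longer expressed.
was_closed = derelativize(was_closed_by)
```

In this picture the one-place was_closed holds of the door whoever the agent was, which corresponds to abstracting from the concrete agents John and the boy.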
The fact that secondary terms in long passive constructions are not inherent secondary terms but predicate modifiers that have only the function of secondary terms calls for an explanation. To explain this fact, we can advance the following hypothesis:
The inherent function of passive is the conversion of the active predicate and the derelativization of the converse predicate, which involves the superposition of the function of the secondary term onto the predicate modifier.
This hypothesis is supported by the following facts:
1. In some languages, such as Classical Arabic or Uto-Aztecan languages, long passives are not permitted—only short passives are used. In languages that have short and long passives, short passives are by far more common than long passives. For example, in English approximately 80 percent of all passives are short (Quirk, Greenbaum, Leech, 1972:807).
Why are long passives either not permitted or much less common than short passives? Because the function of long passives is less important than that of short passives. One of the basic communicative needs is a removal of the agent (denoted by the primary term in the active construction) when the agent is either unknown or unimportant. This need is met by impersonal actives and short passives. Short passives differ from impersonal actives in that they not only remove agents but also topicalize nonagents by changing active predicates into their derelativized converses, and secondary terms (denoting nonagents) into primary terms. In long passives, the agent is mentioned, but the nonagent is topicalized. Compare
|(9)||a.||A golf ball struck the senator.|
|b.||The senator was struck by a golf ball.|
The long passive (9b) does not remove the agent but topicalizes the non-agent. If we have to tell what happened to the senator, we choose (9b) in order to topicalize the senator.
|(10)||a.||Faulkner wrote “The Sound and the Fury.”|
|b.||“The Sound and the Fury” was written by Faulkner.|
If we have to tell who was the author of The Sound and the Fury, we choose (10b), because The Sound and the Fury must be the topic.
In passive constructions, topicalization of the nonagent is a consequence of the conversion of the active predicate. But topicalization of the nonagent is not necessarily connected with passive constructions. Just as the removal of the agent is not only the function of short passive constructions but also the function of active impersonal constructions, so the topicalization of the nonagent not only is the function of passive constructions but also may be an independent phenomenon that occurs in active constructions. The important thing is to distinguish between two types of topicalization of the nonagent: 1) topicalization of the nonagent by the conversion of the active predicate and 2) regular topicalization of the nonagent, which occurs in active constructions.
Regular topicalization of the nonagent is an important cross-linguistic phenomenon. Here is an example from English:
|(11)||a.||John likes wine.|
|b.||Wine, John likes.|
2. Languages, such as English, that have long passives originally had merely short passives. Thus, in Old and Middle English passive sentences, the agent could be marked by the dative case or the prepositions on, among, at, between, betwixt, by, for, from, mid, of, through, to, with (Visser, 1963-73:1988-2000). Noun phrases in the dative case (Old English only) or with these prepositions were mere adverbial modifiers of the passive predicates. It sometimes was not clear whether these noun phrases had an agentive or an instrumental reading. Long passives were introduced into English only in the sixteenth century, when the variety of the agent markers was eliminated and the preposition by became the standard marker of the agent in passive sentences.
The above facts support the hypothesis about the inherent and the superposed functions of passive. Let us now turn to further questions concerning the relation of active and passive constructions.
Active is the basic voice, and passive is the derived voice. Passive predicates are derived from active predicates. That can be seen from linguistic data across languages. Thus, in many languages, such as English, Russian, French, Bulgarian, Armenian, Uto-Aztecan languages, etc., the passive predicate consists of BE+past participle or Reflexivization affix+active predicate. Various morphological processes of passive derivation are described in Keenan, 1975. In very rare cases, such as in Mandarin Chinese, the passive predicate does not differ morphologically from the active predicate, but in these cases passivization is characterized by syntactic symbolic processes. So, in Mandarin Chinese passivization is characterized by the particle bèi and a change of word order.
There are many cross-linguistic constraints on the formation of passive constructions. For example, many of the world’s languages, probably most, have the following constraint on active sentences: the subject of declarative clauses cannot be referential-indefinite. In order not to violate this categorial constraint, the speaker must resort to a special marked sentence type, the existential-presentative construction (such as English there is a . . . , French il y a un . . . , etc.). Languages of this type are, for example, Swahili, Bemba, Rwanda (Bantu), Chinese, Sherpa (Sino-Tibetan), Bikol (Austronesian), Ute (Uto-Aztecan), Krio (Creole), all Creoles, and many others (Givón, 1979:26-27). For example, in Krio (an English-based Creole), one finds the following distribution (Givón, 1979:27):
|(12)||a)||Ge wan man na di yad we de -ask fɔ yu.|
have one man in the yard REL PROG -ask for you
‘There is a man in the yard who is asking for you’.
|b)||*Wan man na di yad de -ask fɔ yu.|
one man in the yard PROG -ask for you
‘A man in the yard is asking for you.’
|c)||Di man na di yad de -ask fɔ yu.|
the man in the yard PROG -ask for you
‘The man in the yard is asking for you’.
In English, the analogues of (12b) occur with extremely low frequency: “About 10% of the subjects of main-declarative-affirmative-active sentences (non-presentative) are indefinite, as against 90% definite” (Givón, 1979:28, 51-73).
The distribution of definiteness in active and passive constructions is not at all the same. In a transitive active construction, the primary term (denoting an agent) is very often determined, and the object may or may not be determined.
|(13)||a)||John bought a/the book.|
|b)||The boy bought the book.|
|c)||?A boy bought the book.|
In a passive construction, the primary term (denoting a patient) is generally determined, while the agent, when it is overt, is often undetermined but may be determined in some contexts (Givón, 1979:63):
|(14)||a)||?A book was bought by John.|
|b)||He was beaten to death a minute later by an enraged wino.|
If the secondary term in an active sentence is determined, we can associate a corresponding passive:
|(15)||a)||The boy broke the cup.|
|b)||The cup was broken by the boy.|
If the secondary term in an active sentence is not determined, we cannot always and unconditionally associate a corresponding passive:
|(16)||a)||The boy broke a cup.|
|b)||?A cup was broken by the boy.|
Because of various constraints on their formation, passive constructions have a narrower range than active constructions. Therefore, under the Markedness Law, active is the unmarked and passive the marked member of the opposition active:passive.
Long passive constructions have passive predicates that function as converses of active predicates in corresponding active constructions. Under the definition of conversion in section 7 above, there must be an equivalence of long passive constructions and corresponding active constructions with respect to their meaning. Equivalence is not identity; it does not mean that long passive constructions can be freely substituted for active constructions—it means only that there are some essential properties of the meaning of long passive constructions and the meaning of corresponding active constructions that are invariant under the operation of the conversion of predicates. Here are some examples:
|(17)||a.||John approached Mary.|
|b.||Mary was approached by John.|
(17a) and (17b) are not identical—they are equivalent: the relation between the agent John and the nonagent Mary is invariant under the conversion of the predicate approached. The meaning of (17a) and (17b) is different with respect to topicalization: the topic is John in (17a) and Mary in (17b).
Differences in topicalization always involve differences in definiteness/indefiniteness. If the sentence The man stole an automobile is part of a text, we can conclude that the man refers to a person introduced before, while stole an automobile conveys new information. The indefinite article is used with the word automobile because this word refers to a part of the new information. No matter whether we have an active or a passive construction, the topic is used with the definite article in most cases, because the topic almost always refers to given information. Therefore, the passive sentences An automobile was stolen by the man and The automobile was stolen by a man differ in meaning from each other and from The man stole an automobile, although all three of the sentences have the same underlying invariant—the relation (denoted by the predicates) between the agent man and the nonagent automobile.
The foregoing shows that the differences in definiteness/indefiniteness of the agent and nonagent in active and passive constructions are logical consequences of the differences in their topicalization.
Another logical consequence of the differences in topicalization of the agent and nonagent is that passive predicates tend to denote a state resulting from a past action. Thus, it is well known that in contemporary English many passive phrases are ambiguous as to whether they refer to a process or a state. The letter was written may mean that at a certain moment somebody wrote the letter, or that the letter had already been written. In Russian, passive predicates denoting a process have fallen into disuse—as a rule, only passive predicates denoting the state resulting from past action are used. So, phrases such as Kniga pročitana ‘The book has been read’ (pročitana denotes the state resulting from past action) are quite common, while phrases such as Kniga čitaema ‘The book is read’ (čitaema denotes the process) are unacceptable in contemporary Russian. This tendency is explained by the fact that in passive constructions the action is presented from the point of view of the nonagent, so that we focus on the effect of the process rather than on the process itself. The process relates to the agent rather than to the nonagent. Since the active construction presents the action from the point of view of the agent, and the passive construction from the point of view of the nonagent, it is natural that the active construction serves to express the process, and the passive construction serves to express the effect of the process.
There may be other differences in meaning between active and corresponding passive constructions, but no matter what these differences are, there is always a certain relation between the agent and nonagent, which is invariant under the conversion of an active into a passive predicate.
Many languages passivize not only the secondary term but the tertiary term, as well. For example, we find in English
|(18)||a.||The money was given to John. (the secondary term passivized)|
|b.||John was given the money. (the tertiary term passivized)|
In English, the passivization of tertiary terms is not typical; it is an isolated phenomenon for only a very limited number of predicates. But in some other languages, such as Malagasy, the passivization of tertiary terms is common. In these languages, tertiary terms that are passivized may have a variety of meanings (addressee, beneficiary, instrument, place, time, etc.). Here are some examples from Malagasy (Keenan, 1972):
|(19)||a.||Nividy ny vary ho an’ny ankizy ny vehivavy.|
bought the rice for the children the woman
‘The woman bought the rice for the children’.
|b.||Novidyn’ny vehivavy ho an’ny ankizy ny vary.|
bought by the woman for the children the rice
‘The rice was bought for the children by the woman’.
|c.||Nividianan’ny vehivavy ny vary ny ankizy.|
bought by the woman the rice the children.
‘The children were bought the rice by the woman’.
|(20)||a.||Nividy ny vary amin’ny vola ny vehivavy.|
bought the rice with the money the woman
‘The woman bought the rice with the money’.
|b.||Nividianan’ny vehivavy ny vary ny vola.|
bought by the woman the rice the money
‘The money was used by the woman to buy the rice’.
In spite of the variety of examples of the passivization of tertiary terms in different languages, this type of passivization is subordinate to the passivization of secondary terms. This claim is supported by the following law:
CONDITION ON THE PASSIVIZATION OF TERTIARY TERMS:
Any language that passivizes tertiary terms must passivize secondary terms, but the reverse is not true: the passivization of secondary terms does not presuppose the passivization of tertiary terms.
The passivization of tertiary terms is subordinate to the passivization of secondary terms, and the latter is independent of the former, since the passivization of tertiary terms presupposes the passivization of secondary terms, while the passivization of secondary terms does not presuppose the passivization of tertiary terms.
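The condition is a one-way implication, and its logical shape can be made explicit with a short check. The following Python sketch is illustrative only: the class PassiveProfile and the function satisfies_condition are hypothetical names, the English and Malagasy entries merely restate what examples (18) to (20) show, and the two "hypothetical" entries are invented to display the permitted and the excluded pattern.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PassiveProfile:
    language: str
    passivizes_secondary: bool
    passivizes_tertiary: bool


def satisfies_condition(profile: PassiveProfile) -> bool:
    """Tertiary passivization implies secondary passivization;
    nothing is required in the other direction."""
    return profile.passivizes_secondary or not profile.passivizes_tertiary


profiles = [
    PassiveProfile("English", True, True),    # examples (18a) and (18b)
    PassiveProfile("Malagasy", True, True),   # examples (19) and (20)
    # Invented profiles, for illustration only:
    PassiveProfile("hypothetical A", True, False),   # permitted
    PassiveProfile("hypothetical B", False, True),   # excluded by the condition
]

violations = [p.language for p in profiles if not satisfies_condition(p)]
print(violations)  # ['hypothetical B']
```

Only the pattern in which tertiary terms passivize while secondary terms do not is ruled out; a language with no passive at all satisfies the condition trivially.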
Now we are ready for a definition of short and long passive constructions.
The short passive construction is defined by the following conditions:
1) A one-place passive predicate is derived from a two- or three-place predicate in two steps:
i) formation of the converse of the active predicate;
ii) derelativization of the converse predicate by applying it to a zero term denoting an unspecified agent.
2) The passive predicate is applied to a noun phrase that serves as the primary term of the short passive construction.
3) The primary term of the short passive construction denotes a non-agent and is a functional counterpart of the secondary or tertiary term of the corresponding active construction.
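The two-step derivation of the short passive predicate can be sketched with curried functions. This is only an illustration under stated assumptions: the names `bought`, `converse`, `derelativize`, and `ZERO_AGENT`, as well as the dictionary that records which term is agent and which is patient, are introduced here and are not part of the theory's own notation. A two-place predicate is assumed to apply first to its secondary term and then to its primary term.

```python
ZERO_AGENT = "∅"  # zero term denoting an unspecified agent (illustrative symbol)


def bought(secondary):
    """Active two-place predicate: takes the secondary term, then the primary term."""
    def one_place(primary):
        return {"predicate": "buy", "agent": primary, "patient": secondary}
    return one_place


def converse(predicate):
    """Step i: form the converse, which takes the two terms in the opposite order."""
    return lambda first: lambda second: predicate(second)(first)


def derelativize(converse_predicate):
    """Step ii: apply the converse to the zero term, leaving a one-place predicate."""
    return converse_predicate(ZERO_AGENT)


# Active construction: the primary term denotes the agent.
active = bought("the rice")("the woman")

# Short passive: the one-place passive predicate applies to a non-agent primary term.
was_bought = derelativize(converse(bought))
short_passive = was_bought("the rice")
print(active)
print(short_passive)
```

In the result, the primary term of the short passive (`the rice`) fills the patient role that the secondary term fills in the active construction, and the agent is left unspecified.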
The long passive construction is defined by the following conditions:
1) The passive predicate is derived in two steps, as in the short passive construction.
2) The passive predicate first is modified by a noun phrase that serves as an oblique term and then is applied to a noun phrase that serves as the primary term.
3) The primary term of the long passive construction denotes a non-agent and is a functional counterpart of the secondary or tertiary term of the corresponding active construction.
4) The function of the secondary term is superposed onto the oblique term, and the function of the two- or three-place converse predicate is superposed onto the passive predicate. As a result of the superposition, the oblique term of the long passive construction turns into a functional counterpart of the primary term of the corresponding active construction.
5) The passive predicate takes on the function of a two- or three-place converse predicate.
The above definition of short and long passive constructions seems to have internal contradictions. Thus:
1) Passive predicates are characterized as having the same form in short and long passive constructions. But while the predicate of a short passive construction is defined as a one-place predicate obtained by applying a converse predicate to a zero term denoting an unspecified agent, the predicate of a long passive construction is defined as a one-place predicate that functions as a two- or three-place converse predicate.
2) The agent term in the long passive construction is defined as an oblique term that functions as a secondary term.
These contradictions show that the passive predicate and the agent in a long passive construction must be viewed as centaurs. The notion of the centaur was introduced in section 1 of chapter 2 in connection with the analysis of the notion of the phoneme. Just as the phoneme is a unity of the sound and the diacritic, the passive predicate in the long passive construction is a unity of the form of a one-place predicate and the function of a two- or three-place predicate, and the agent in the long passive construction is a unity of the form of an oblique term and the function of a secondary term.
Actually, the paradoxical structure of the passive is the result of functional superpositions: in long passive constructions, the function of the two- or three-place converse predicate is superposed onto the one-place passive predicate, and the function of a secondary term is superposed onto an oblique term. These superpositions can be explained as a realization of a potential of natural languages for developing symmetrical constructions. Symmetrical constructions must satisfy two conditions: they must have converse predicates, and the converse predicates must be applied to the same number of terms. Now, to develop symmetrical counterparts of active constructions, natural languages use short passive constructions with an oblique term used as a modifier of the passive predicate. As was said above, in Old and Middle English passive sentences, the agent could be marked by the dative case or the prepositions on, at, among, between, betwixt, by, for, from, mid, of, through, to, with. The passive sentences with an agent expressed by the dative case or a variety of prepositions were short passives rather than long passives proper. Here the meaning of the agent term was rooted in the various concrete meanings of the dative and prepositions, and so indicated a sort of instrument or a source of the action rather than the agent in the strictest sense of the word. These short passives became long passives only when, by the process of grammaticalization, the preposition by replaced all the means of expressing the agent and became a grammatical element superposing the function of the secondary term onto the oblique term.
It should be noted that the potential of symmetrical expressions is not necessarily realized in every language. Actually, passive, and especially long passive, constructions are a linguistic luxury, and many languages get by without them. Some languages may have verbal adjectives with passive meaning, but the sentences having these verbal adjectives cannot be considered passive constructions proper. Such was the case in the early stages of the development of Indo-European languages. In languages having passive constructions, the use of passive constructions, and especially of long passives, is connected with the need for a more abstract means of expression.
In some languages, passive predicates can be derived from intransitive active predicates. Consider the following example from German:
|(21)||a.||Die Kinder tanzten.|
‘The children danced’.
|b.||Es wurde von den Kindern getanzt.|
*‘By the children it was danced’.
This example seems to contradict the explanation of passive constructions in terms of conversion. Nevertheless, the difficulty raised by this and similar examples from German and some other languages can be solved by hypothesizing a zero dummy term as follows.
Let us compare (21a) and (21b). The predicates tanzten and wurde getanzt are oriented in opposite directions. The predicate tanzten is directed from the primary term, while wurde getanzt is directed to the primary term. The reversal of the orientation of the predicates is what conversion is all about. Therefore, the relation between the terms of (21a) and (21b) can be rendered by a proportion:
(22) von Kindern : die Kinder = es : x
By substituting a zero dummy term ∅▵ for x, we get
(23) von Kindern : die Kinder = es : ∅▵
We must be cautious in hypothesizing abstract entities. But in our case the hypothesis of zero dummy secondary terms in active intransitive sentences is justified. It is well motivated by directly observable empirical properties of sentences and the operation of conversion.
By introducing a hypothetical zero dummy secondary term into active intransitive sentences, we establish a well-motivated analogy between the derivation of passive predicates from active one-place predicates and the derivation of passive predicates from active two-place predicates.
One might argue that impersonal passive constructions do not satisfy the conditions implied by the definition of the passive. For these conditions rest upon the assumption that passive sentences are derived from active transitive sentences. It is difficult to imagine that such sentences as Latin Pugnabatur ‘It was fought’ should be derived from active transitive sentences.
To meet this argument, we must take into account the Principle of Polymorphism, which can be stated as follows:
There are different ways of constructing the same sentence, and not all ways are possible for every sentence.
For example, the sentence
(24) Boris was called by Peter.
can be constructed in two ways: 1) we apply the passive predicate was called first to Peter and then to Boris; or 2) we derive (24) from the following sentence:
(25) Peter called Boris.
Not every passive sentence can be constructed by derivation from an active sentence. Such, for example, is the case with impersonal passives having zero terms, such as (21b) or Latin Pugnabatur ‘It was fought’. The crucial property of passive constructions is not that we can derive them from active constructions but that we can obtain them by applying passive predicates to some terms as their operands. The important thing to notice is that, while it is not always possible to derive a passive construction from an active one, we can always derive a passive predicate from an active predicate.
Generative-transformational grammar and relational grammar admit only one way of characterizing passive constructions: by pointing out how they are derived from active constructions. Thus, they characterize (24) as necessarily obtained from (25). No wonder generative-transformational grammar and relational grammar run into difficulty when they try to characterize impersonal passive constructions that cannot be obtained from active constructions. To solve this difficulty, they have to introduce hypothetical entities that hardly can be justified on empirical grounds.
One of the advantages of the linguistic theory advocated in this book is that it rests on the Principle of Polymorphism, which makes it possible to give a realistic characterization of passive constructions.
One might argue against conversion by pointing out that some languages can sometimes have passive constructions with a patient that is not promoted to the position of the primary term, as in the following example from Polish:
(26) a. (they) built school
‘They built the school’.
b. by-them was-built school
‘The school was built by them’.
In (26a) the noun szkolę is in the accusative case, and the predicate zbudowali is active in the past tense. In (26b) the impersonal form zbudowano has a special suffix -o. The noun szkolę is in the accusative case.
The impersonal passive construction (26b) differs from the active construction (26a) in that the active predicate zbudowali was changed into the impersonal passive zbudowano. As to the noun szkolę, it remained unchanged.
Do this and similar examples from Welsh, Finnish, and some other languages contravene our definition of the passive construction? To answer this question, we must bear in mind that according to the definition of the two-place predicate, an application of a two-place predicate to a term gives an expression that is equivalent to a one-place predicate. That means that in (26a), zbudowali szkolę is equivalent to a one-place predicate, and the whole sentence (26a) can be regarded alternatively either as a transitive sentence with the transitive predicate zbudowali and the patient szkolę or as an intransitive sentence with the intransitive predicate zbudowali szkolę. For the purposes of passivization, the second alternative was chosen by Polish. The possibility of this choice can be explained by the fact that the crucial function of passivization is demotion or suppression of the agent, rather than promotion of the patient to the position of the primary term; passive constructions are used when the agent is either unknown or unimportant. Granted that (26a) is regarded as an intransitive sentence, we can explain (26b) as we explained above the passivization of regular intransitive sentences.
As was shown in section 10.3, pure ergative languages cannot have a passive voice. What is sometimes called passive in ergative languages is actually a construction resulting from the demotion of ergative. Instead of passive, pure ergative languages can have a so-called antipassive.
While accusative languages have the opposition active : passive, pure ergative languages can have the opposition ergative : antipassive.
Here is an example of an antipassive construction from Dyirbal (Dixon, 1972):
|father||see+Antipassive+Past||mother|
‘Father saw mother’.
(27) is derived from
|mother||father+Agent||see+Past|
‘Mother was seen by father’.
by: 1) deriving the antipassive predicate buṛal+ŋa+ŋyu from the ergative predicate buṛa+n (ŋa is a phonetically conditioned variant of the antipassive suffix -ŋay); 2) changing the original secondary term ŋuma+ngu in the ergative case into the primary term ŋuma in the absolutive case; and 3) changing the original primary term yabu in the absolutive case into the secondary term yabu+gu in the dative case.
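The three steps of this derivation can be traced in a small sketch. It covers only the forms cited in the text; the lookup tables, the function names, and the representation of a construction as a dictionary are illustrative assumptions and make no claim about Dyirbal morphology beyond this one example.

```python
# Case suffixes and the antipassive form, limited to the forms cited in the text.
CASE_SUFFIX = {"absolutive": "", "ergative": "+ngu", "dative": "+gu"}
ANTIPASSIVE_FORM = {"buṛa+n": "buṛal+ŋa+ŋyu"}


def mark(stem: str, case: str) -> str:
    """Attach the case suffix to a noun stem."""
    return stem + CASE_SUFFIX[case]


def antipassivize(construction: dict) -> dict:
    """Derive the antipassive construction from an ergative construction."""
    primary_stem, _ = construction["primary"]
    secondary_stem, _ = construction["secondary"]
    return {
        # Step 1: derive the antipassive predicate from the ergative predicate.
        "predicate": ANTIPASSIVE_FORM[construction["predicate"]],
        # Step 2: the old secondary term becomes the primary term (absolutive).
        "primary": (secondary_stem, "absolutive"),
        # Step 3: the old primary term becomes the secondary term (dative).
        "secondary": (primary_stem, "dative"),
    }


ergative = {
    "predicate": "buṛa+n",
    "primary": ("yabu", "absolutive"),
    "secondary": ("ŋuma", "ergative"),
}
antipassive = antipassivize(ergative)
print(mark(*antipassive["primary"]), antipassive["predicate"],
      mark(*antipassive["secondary"]))
```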
Note that although I translated (27) by an English active construction, this translation is a crude device, for want of a better one, used to show the contrast between the passive and the antipassive constructions. Although an antipassive construction reminds one superficially of an active construction, there is a fundamental difference between them. As was said above, the active construction is fundamental and normal, while the antipassive construction is derived and stylistically marked.
It should be noted that most ergative languages lack antipassive constructions; they have only ergative constructions. This fact has a counterpart in accusative languages: some of these languages lack passive constructions. The only difference between ergative and accusative languages in this respect is that the lack of antipassive constructions is a much more common phenomenon than the lack of passive constructions.
Why are antipassive constructions less common in ergative languages than are passive constructions in accusative languages? Antipassive constructions are used to topicalize the agent; therefore, they make sense when in corresponding ergative constructions the topic is the nonagent rather than the agent, as can be seen in Dyirbal and in some other ergative languages; these languages have the antipassive. But, as was shown in section 10, in most ergative languages the function of the primary term is superposed onto the secondary term, which results in the topicalization of the secondary term by changing word order: now the topicalized secondary term precedes and the primary term follows the ergative predicate. Since the secondary term denotes an agent, ergative languages that topicalize the agent by changing word order do not need the antipassive. Neither do they need the passive, because they can get an analogue of passive constructions simply by demoting the agent without the conversion of the ergative predicate, as was shown in section 10.3.
There are two types of antipassive constructions: short antipassive constructions and long antipassive constructions. Antipassive constructions, which are mirror images of passive constructions, are defined as follows:
The short antipassive construction meets the following conditions:
1) A one-place antipassive predicate is derived from a two- or three-place ergative predicate in two steps:
i) formation of the converse of the ergative predicate;
ii) derelativization of the converse of the ergative predicate.
2) The antipassive predicate is applied to a noun phrase that serves as the primary term of the short antipassive construction.
3) The primary term of the short antipassive construction denotes an agent and is a functional counterpart of the secondary term of the corresponding ergative construction.
The long antipassive construction meets the following conditions:
1) The antipassive predicate is derived in two steps, as in the short antipassive construction.
2) The antipassive predicate is first modified by a noun phrase that serves as an oblique term and is then applied to a noun phrase that serves as a primary term.
3) The primary term of the long antipassive construction denotes an agent and is a functional counterpart of the secondary term of the corresponding ergative construction.
4) The function of the secondary term is superposed onto the oblique term; the function of the two- or three-place converse predicate is superposed onto the antipassive predicate. As a result of the superposition, the oblique term of the long antipassive construction turns into a functional counterpart of the primary term of the corresponding ergative construction.
To appreciate the implication of the abstract theory of passivization presented in section 11, one must compare it with alternative theories of passivization. I will consider the theories of passivization in generative-transformational and relational grammar and the demotion theory of passivization.
Generative-transformational grammar defines the rules of passivization in terms of word order. For example, the rule of passivization in English is characterized as follows: the passive transformation has as input a string of the form
(1) NP1 + V + NP2
(where NP denotes a noun phrase and V denotes a verb), interchanges the two NPs, puts the verb in the passive form, and marks NP1 by the preposition by, yielding the string of the form
(2) NP2 + Vpass + by + NP1
as (3) illustrates:
(3) John ate the banana → The banana was eaten by John.
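The transformation from (1) to (2) amounts to a rewriting of a linear string of constituents, which the following sketch makes explicit. The function name and the small `PASSIVE_FORM` lookup are hypothetical conveniences; the lookup stands in for whatever morphological rule supplies the passive verb form.

```python
# Hypothetical lookup supplying the passive form of the verb.
PASSIVE_FORM = {"ate": "was eaten"}


def passive_transformation(constituents: list[str]) -> list[str]:
    """Rewrite NP1 + V + NP2 as NP2 + Vpass + by + NP1."""
    np1, verb, np2 = constituents  # the rule presupposes exactly this linear order
    return [np2, PASSIVE_FORM[verb], "by", np1]


active = ["John", "ate", "the banana"]
passive = passive_transformation(active)
print(" ".join(passive))
```

Note that the rule is stated entirely over positions in the string: an input with a different constituent order would need a different rule.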
The definition of the rules of passivization in terms of word order raises a series of problems.
In the first place, as was shown in section 1 of this chapter, there are languages, such as Russian, in which active and passive sentences may have the same word order, because word order is irrelevant for passivization.
Second, this approach runs into difficulties even when word order is relevant for passivization. Since different languages have different word order, we have to formulate distinct rules for each language where the order of the relevant words is different. Therefore, we have to treat functionally identical rules in two languages as distinct rules; as a result, we will miss the essential features of passivization and focus on its superficial aspects.
Third, granted that the transformational approach precludes the formulation of the universal rules of passivization, one might expect that, at least within the limits imposed by a highly specific nature of word order in various languages, this method could be valuable for understanding specific features of passivization in individual languages. But transformational theory falls short even of these modest expectations. Generative-transformational grammar rests upon the Autonomous Syntax Hypothesis, which claims that grammatical morphemes are for the most part meaningless and are inserted for purely formal purposes. In our case, the grammatical morphemes marking the passive constructions—by, be, and the perfect participial inflection—are considered meaningless entities with a purely formal function. Generative-transformational grammar claims that passive constructions can be produced only by deriving them from active sentences. But this claim is false. As a matter of fact, English passives frequently lack a by-phrase, because a by-phrase is not an intrinsic part of the English passive construction. Counterparts of English by-phrases, that is, agentive phrases, are not permitted at all in many languages. As a matter of fact, passive constructions with an agentive phrase are based on passive constructions without an agentive phrase, rather than vice versa. That can be stated as the following law:
If a language has passive constructions with agentive phrases, it must have passive constructions without agentive phrases, but the reverse is not true: if a language has passive constructions without agentive phrases, it may or may not have passive constructions with agentive phrases.
It is wrong to treat the preposition by as a tool of transformation of active constructions into passive constructions. The correct approach is to treat by-phrases as normal modifiers of intransitive passive predicates. The preposition by is a binary operator whose first operand is a term and whose second operand is a predicate. By-phrases are constructed in two steps: first, we apply by to a term, and then we apply the result of this application to a predicate. That means that by is a transposer of a term into a modifier of a predicate.
Passive constructions are not necessarily derived from active constructions. Rather, they are produced by the successive accretion of smaller components. That is possible because every component has its own meaning.
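The treatment of by as a binary operator, and the accretion of a long passive from smaller components, can be sketched as follows. The dictionary representation of a sentence and the names `by` and `was_eaten` are illustrative assumptions; the point is only the order of application: first by is applied to a term, and then the result is applied to a predicate.

```python
def by(term):
    """Apply `by` to a term; the result is a modifier of predicates."""
    def modifier(predicate):
        def modified_predicate(primary):
            sentence = dict(predicate(primary))  # copy the unmodified result
            sentence["modifiers"] = sentence.get("modifiers", []) + [("by", term)]
            return sentence
        return modified_predicate
    return modifier


def was_eaten(primary):
    """Intransitive (one-place) passive predicate."""
    return {"predicate": "was eaten", "primary": primary}


# Passive without an agentive phrase: the predicate applies directly to a term.
short_passive = was_eaten("the banana")

# Passive with an agentive phrase: by + term yields a modifier,
# and modifier + predicate yields a new one-place predicate.
by_john = by("John")
was_eaten_by_john = by_john(was_eaten)
long_passive = was_eaten_by_john("the banana")
print(short_passive)
print(long_passive)
```

The construction with the by-phrase is built on top of the construction without it, and neither is obtained from an active sentence.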
Relational grammar (henceforth RG) is a model of syntactic description suggested and developed by Perlmutter and Postal since about 1972. Perlmutter and Postal characterize RG as a new linguistic theory that is directly opposed to generative-transformational grammar (henceforth GTG). True, RG has many attractions in comparison with the standard version of GTG. But at the same time, RG shares with GTG many essential features, which gives us good reason to regard RG as a new type of GTG.
The basic feature that RG shares with GTG is the notion of an abstract underlying syntactic structure. This abstract structure is called deep structure in GTG and initial stratum in RG. True, there are significant technical differences between these notions, but with respect to their essence, these notions are alike: both are fictitious entities from which the empirical structure (called surface structure in GTG and final stratum in RG) is derived.
The derivation of surface structure from deep structure is described in GTG by means of a dynamic meta-language—in terms of transformations. The derivation of final stratum from initial stratum is described in RG by means of a static meta-language—in terms of static grammatical relations between successive strata, starting with initial stratum and ending with final stratum. But we should not be misled by the differences between the two meta-languages. The essential point is that both GTG and RG operate with a fictitious abstract structure from which they derive the empirical structure, no matter what they call all these things.
What really sets RG apart from classic GTG is the claim that grammatical relations such as ‘subject of’, ‘direct object of’, ‘indirect object of’, and others are needed to achieve three goals of linguistic theory: 1) to formulate linguistic universals; 2) to characterize the class of grammatical constructions found in natural languages; and 3) to construct adequate and insightful grammars of individual languages.
Another basic claim of RG is that grammatical relations cannot be defined in terms of other notions, such as word order, phrase structure configurations, or case markings. Rather, they must be taken as primitives of linguistic theory.
RG correctly criticizes classic GTG for its failure to provide cross-linguistically viable notions of grammatical relations to achieve the goals of linguistic theory; GTG is unable to do that because it states transformations in terms of the linear order of constituents.
One must be in sympathy with these important proposals of RG. But at the same time, RG is committed to the framework of multilevel structure in the spirit of classic GTG, which has grave consequences.
Before discussing the RG theory of passivization, I want to make a few terminological comments.
RG calls such notions as ‘subject of’, ‘object of’, etc. grammatical relations. These are binary relations. The question arises, What are the members of these binary relations? RG does not give a direct answer to this question. Neither can one find an unequivocal answer. Thus, we can regard the members of these relations as noun phrases and sentences. For example, given a sentence S and two noun phrases A and B, we can define the relations between them as follows: A is the subject of S, B is the direct object of S. Or we can regard the members of these relations as noun phrases and predicates. For instance, given a sentence S, a binary predicate P, and two noun phrases A and B, we can define the relations between them as follows: A is the subject of P, B is the direct object of P.
Under both interpretations, subject-of and direct-object-of are relations in a very trivial sense. By adding the preposition of to the words subject and direct object, we get the relational terms subject-of and direct-object-of. But these relational terms do not characterize the essence of the notions of ‘subject’ and ‘direct object’.
By adding the preposition of to any noun, we can produce a variety of relational terms: house-of, heart-of, leg-of, tail-of, etc.
Obviously the notions of subject, direct object, and indirect object cannot be understood as relations in an interesting, nontrivial sense, which would make it possible to draw an essential distinction between these notions and other syntactic notions.
With respect to these notions, the key word must be function (taken in a nonmathematical sense as a synonym of the word role) rather than relation. The essential feature that distinguishes them from other syntactic relations is the notion of syntactic function. Subject, direct object, and indirect object are basic functional units of the sentence, and as such they are essentially distinct from all other syntactic entities.
We have to look for a nontrivial notion of relation elsewhere. If we define a relation Rn (where n = 2, 3, ...) as an entity that combines n elements, called the members of Rn, into a whole, then the basic relational syntactic notions are the notions of binary and ternary predicates, which connect two or three noun phrases of a sentence. Subject, direct object, and indirect object are the members of these relations. Binary and ternary predicates are relations in an interesting, nontrivial sense, because, as was shown above, by treating predicates as relations we can develop a rich variety of revealing relation-changing operations.
Since binary and ternary predicates are special instances of operators, the fundamental notion of linguistic relation in a nontrivial, significant sense must be ‘binary operator of’ and ‘ternary operator of’.
As was shown above, subject, direct object, and indirect object are not valid universal notions and must be replaced by the theoretical constructs primary term, secondary term, and tertiary term.
Let us now turn to the RG theory of passivization (Perlmutter, Postal, 1977).
RG starts with a basic universal assumption about the nature of clause structure. This assumption is stated informally as follows:
1) A clause consists of a network of grammatical relations; among them are ‘subject of’, ‘direct object of’, and ‘indirect object of’.
On the basis of this assumption, RG states the universals of passivization as follows:
2) A direct object of an active clause is the superficial subject of the corresponding passive clause.
3) The subject of an active clause is neither the superficial subject nor the superficial direct object of the corresponding passive.
2) and 3) together have the following consequence:
4) A passive clause is an intransitive clause.
2), 3), and 4) are called universals of passivization. Consider the following English active-passive pair:
|(4)||a.||Louise reviewed that book.|
|b.||That book was reviewed by Louise.|
The simplified network of grammatical relations for (4a) is
where p means predicate, 1 means subject, and 2 means object.
The simplified diagram for (4b) is
Here two horizontal curves show that the structure of the passive consists of two strata: 1) an initial stratum (indicated by the upper curve) and 2) a final stratum (indicated by the lower curve). RG claims that in any human language, every possible passive clause has a noun phrase that is an object in the initial stratum and a subject in the final stratum. (4b) includes (4a) as its initial stratum and in this way represents the correspondence stated in 2). The symbol Cho means chômeur; this term denotes a noun phrase that is neither a superficial subject nor a superficial direct object nor an oblique.
The universals of passivization 2), 3), and 4) are based on the following laws (which are stated here in nontechnical terms) (Perlmutter, Postal, 1978):
The Oblique Law. Any noun phrase that bears an oblique relation to a clause must bear it in the initial stratum.
In nontechnical terms, the Oblique Law means that a subject, direct object, or indirect object cannot be converted into an oblique.
The Stratal Uniqueness Law. No stratum can contain more than one subject, one direct object, or one indirect object.
Let us now turn to a special condition that follows from the Oblique Law and the Stratal Uniqueness Law.
What relation does Louise bear in the second stratum of (6)? Since that book is the subject in this stratum, it follows from the Stratal Uniqueness Law that Louise cannot be the subject in the second stratum. RG claims that the relation borne by Louise in the second stratum is an additional primitive relation called the chômeur relation. The term chômeur is a French word meaning ‘unemployed’ or ‘idle’. A noun phrase that is a chômeur in a given stratum is a subject, direct object, or indirect object in a higher stratum (the term chômeur reflects an earlier conception of RG, when chômeur was a nominal that lost its grammatical relation).
The Chômeur Condition. If some noun phrase NPa bears a given term relation in a given stratum Si, and some other noun phrase NPb bears the same relation in the following stratum Si+1, then NPa bears a chômeur relation in Si+1.
So, since Louise in (6) is the subject in the first stratum, and that book is the subject in the second, the Chômeur Condition stipulates that Louise be the chômeur in the second stratum.
The Chômeur Condition follows from the above laws. So, Louise cannot be an oblique in the second stratum, because it is not an oblique in the first stratum (that follows from the Oblique Law), and it cannot be a subject in the second stratum, because no stratum can contain more than one subject, direct object, or indirect object (that follows from the Stratal Uniqueness Law).
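The interaction of the Stratal Uniqueness Law and the Chômeur Condition can be illustrated with a small sketch in which a stratum is a mapping from noun phrases to relations. The labels "1", "2", and "Cho" follow the notation used above; the function names and the dictionary representation are assumptions made for illustration and are not RG's own formalism.

```python
TERM_RELATIONS = {"1", "2", "3"}  # subject, direct object, indirect object


def obeys_stratal_uniqueness(stratum: dict) -> bool:
    """No stratum may contain more than one 1, one 2, or one 3."""
    borne = [relation for relation in stratum.values() if relation in TERM_RELATIONS]
    return len(borne) == len(set(borne))


def next_stratum(stratum: dict, advancee: str, new_relation: str) -> dict:
    """Give `advancee` a new term relation; whoever bore it becomes a chômeur."""
    following = dict(stratum)
    for noun_phrase, relation in stratum.items():
        if relation == new_relation and noun_phrase != advancee:
            following[noun_phrase] = "Cho"  # the Chômeur Condition
    following[advancee] = new_relation
    return following


initial = {"Louise": "1", "that book": "2"}
final = next_stratum(initial, "that book", "1")
print(final)
```

Because the former subject is reassigned to the chômeur relation, the second stratum still contains only one subject.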
The Final 1 Law. Every basic clause contains a final stratum subject.
This law claims that there is no basic clause without a subject. The subject may not appear on the surface, as in the following examples:
|(7)||a.||Kiss one salamander, and people say you are a pervert.|
|b.||Try and tickle yourself.|
|c.||John went home and then called Betty.|
Many languages have basic clauses that do not appear to have subjects. Take Russian, for instance:
‘I feel nauseated’.
‘There’s Pierre’.
‘Here he is’.
RG claims that these sentences have a dummy subject. The dummy may appear on the surface, as in French
(10) Il pleut.
or in its English counterpart
(11) It is raining.
There are also dummy objects, as in the following sentences:
|(12)||a. Terry made it clear that he would resign.|
|b. I hate it very much for you to scream like that.|
To define a class of possible sentences with dummy terms, the following law is advanced:
The Nuclear Dummy Law. Only subjects and objects can be dummy terms.
This law predicts, among other things, that chômeurs cannot be dummy terms. RG recognizes the following hierarchy of grammatical relations:
RG recognizes a class of rules called advancements and a class of structures produced by these rules. An advancement is a rule that moves a noun phrase up the hierarchy. A noun phrase undergoing an advancement is called an advancee.
The rules of passivization belong to the class of advancements. Here are some examples:
|(14)||a.||Harriet gave a new bowling ball to Ted.|
|b.||Harriet gave Ted a new bowling ball.|
|c.||Ted was given a new bowling ball by Harriet.|
In (14a) Ted bears the initial 3-relation to the clause. In (14c) Ted is an advancee, because it is advanced from the initial 3-relation first to the 2-relation, as in (14b), and then through passivization to the 1-relation. It can be seen from (14c) that a single noun phrase can undergo more than one advancement.
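The successive advancements of Ted in (14) can be traced in a short sketch. The ordering of the hierarchy used here (1 above 2 above 3) is an assumption inferred from the description of (14); the names `HIERARCHY`, `is_advancement`, and `advancements` are illustrative.

```python
HIERARCHY = ["1", "2", "3"]  # assumed ordering, highest relation first


def is_advancement(old: str, new: str) -> bool:
    """A change of relation is an advancement if it moves up the hierarchy."""
    return HIERARCHY.index(new) < HIERARCHY.index(old)


def advancements(history: list[str]) -> list[tuple[str, str]]:
    """List every step in a relation history that is an advancement."""
    return [
        (old, new)
        for old, new in zip(history, history[1:])
        if is_advancement(old, new)
    ]


# Relations borne by Ted in (14a), (14b), and (14c), respectively.
ted = ["3", "2", "1"]
print(advancements(ted))
```

The history contains two advancements, which corresponds to the observation that a single noun phrase can undergo more than one advancement.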
What can be said about the RG theory of passivization?
This theory marks essential progress in comparison with the theory of passivization of GTG. The critical point is that RG explains passivization in terms of relational, or, better, functional, units (subject, direct object, indirect object) rather than in terms of the linear order of constituents.
Another significant advantage of this theory is that it builds on a body of universal laws. These laws are important not because they are definitive but because they have heuristic power: they direct linguists towards fruitful empirical research and raise significant theoretical issues.
RG has already stimulated interesting empirical research on a variety of typologically different languages and has given impetus to a discussion of some intriguing problems of linguistic theory. It should be noted, however, that the results of RG are far from conclusive. It must be given credit for raising novel theoretical problems rather than for solving them. As a matter of fact, the RG theory of passivization meets substantial difficulties when it is applied to accusative languages, and it breaks down with respect to ergative languages.
The difficulty with RG is that it conflates subject with the agent and object with the patient in transitive sentences. But if we take subject and object as purely syntactic terms, subject may be a patient and object an agent. Subject as a purely syntactic term is what I call the primary term. Object as a purely syntactic term is what I call the secondary term. RG is unable to accept the notion of transitive subject as a patient and the notion of object as an agent. Therefore, RG interprets the ergative case as a transitive subject, and the absolutive case in transitive sentences as a transitive object, although the reverse is true: in ergative languages the ergative case denotes a transitive object rather than a transitive subject, and the absolutive denotes a transitive subject rather than a transitive object.
Since RG conflates subject with the agent and direct object with the patient, it conflates the notion of the active construction, characteristic of accusative languages, with the notion of the ergative construction, characteristic of ergative languages. This conflation entails a conflation of the notion of passive voice with the notion of antipassive voice. RG regards antipassive constructions in ergative languages simply as passive constructions.
The label antipassive is sometimes used in RG, but in a completely different sense from that established for ergative languages. The term antipassive is proposed to be a label for constructions that are produced as a result of converting the direct object into a chômeur which can be omitted (Postal, 1977). This definition is applied to ergative and accusative languages. According to this definition, we obtain the following correspondences in accusative languages:
|(15)||Active constructions||Antipassive constructions|
|The woman sewed the dress.||The woman sewed on the dress.|
|||The woman sewed.|
|The hunter shot the bear.||The hunter shot at the bear.|
|||The hunter shot.|
The notion of antipassive in RG clearly has nothing in common with the current notion of the antipassive construction used to characterize specific constructions of ergative languages that are the reverses of the passive constructions in accusative languages.
The RG theory of passivization is rooted in the syntax of accusative languages, where the transitive subject denotes the agent and the direct object denotes the patient. Therefore, RG rules for passive constructions can work only for accusative languages, but even there some serious difficulties arise.
Consider the impersonal passives.
RG correctly regards passives and impersonal passives as the same phenomenon. Using the concept of the dummy term, RG correctly states that impersonal passives involve an advancement of 2 to 1. The dummy subject in impersonal passives can be represented either by zero or by a pronoun.
Examples of a dummy subject represented by zero:
|(16)||Hier wurde den ganzen Abend getanzt.|
|‘It was danced here all evening’.|
|‘Here it was worked’.|
Example of a dummy subject represented by a pronoun from German:
|(18)||Es wurde getanzt.|
|‘It was danced (There was dancing)’.|
But how about impersonal passives with a direct object, as in the following examples from Polish:
|(19)||Stefana poslano do doma.|
|Stephen sent+PASSIVE to home|
|‘Stephen was sent home.’|
and Welsh (Xolodovič, 1974: 367):
|(20)||Llodwid Jones gan Williams.|
|killed + PASSIVE Jones by Williams|
|‘Jones was killed by Williams’.|
Stefana in (19) and Jones in (20) are direct objects rather than subjects. The direct object is marked in Polish by the accusative case suffix, and in Welsh by the syntactic environment.
These examples illustrate passive constructions without an advancement of 2 to 1. They contravene the claim of RG that any passive construction must involve an advancement of 2 to 1.
To solve the difficulty, RG resorts to such an ad hoc contrivance as a claim that although Stefana and Jones in the above examples are marked as direct objects, they must be interpreted as chômeurs from the relational point of view, because here a dummy allegedly is inserted as 2, putting 2 en chômage, and then is advanced from 2 to 1 (Perlmutter and Postal, 1984).
As a matter of fact, a satisfactory explanation of the above impersonal passive constructions can be given only by reference to universals that follow from the definitions of predicates and terms based on the Applicative Principle. These universals in terms of RG are:
1) A transitive predicate plus a direct object is syntactically equivalent to an intransitive predicate.
2) A ditransitive predicate plus an indirect object is syntactically equivalent to a transitive predicate.
In view of universal 2), impersonal passive constructions with a direct object can be regarded simply as impersonal passive constructions without a direct object. Accordingly, rules that apply to impersonal passive constructions without a direct object apply to impersonal passive constructions with a direct object.
RG has raised an interesting question: What syntactic type of intransitive predicates can be passivized?
It has been observed cross-linguistically that some intransitive predicates can never be passivized, while other predicates can. For example, the following intransitive predicates can never be passivized cross-linguistically (Perlmutter, 1978):
1. predicates expressed by adjectives in English: predicates describing sizes, shapes, weights, colors, smells, states of mind, etc.;
2. predicates whose initial nuclear term is semantically a patient, such as burn, fall, float, tremble, roll, flow, soar, etc.;
3. predicates of existing and happening: exist, happen, occur, vanish, etc.;
4. nonvoluntary emission of stimuli that impinge on the senses (light, noise, smell, etc.): shine, sparkle, jingle, smell, stink, etc.;
5. aspectual predicates: begin, start, continue, end, etc.;
6. duratives: last, remain, stay, survive, etc.
And here is the list of intransitive predicates whose passivization is possible cross-linguistically:
1. predicates describing willed or volitional acts: work, play, speak, quarrel, walk, knock, etc.;
2. certain involuntary bodily processes: cough, snore, weep, etc.
The question is: How is the semantic difference between these two classes characterized syntactically? Can we state a syntactic hypothesis predicting a cross-linguistic possibility of the passivization of intransitive predicates?
RG answers affirmatively to these questions. It advances a syntactic hypothesis called the Unaccusative Hypothesis, meant to predict the cross-linguistic possibility of the passivization of intransitive predicates.
The Unaccusative Hypothesis is stated as follows:
Certain intransitive clauses have an initial 2 but not an initial 1.
This hypothesis means that certain intransitive clauses have an underlying structure with a direct object that corresponds to the subject of the surface structure. For example, the underlying structure of
(21) Gorillas exist.
is
(22) Exist gorillas.
Gorillas in (22) is the direct object of exist, and it corresponds to gorillas in (21), which is the subject of exist in (21).
This correspondence is presented by the following relational network:
Gorillas is initial 2 but final 1.
The advancement in (23) is called unaccusative. The following terminology facilitates the discussion:
A transitive stratum contains a 1-arc and a 2-arc.
An unaccusative stratum contains a 2-arc but no 1-arc.
An unergative stratum contains a 1-arc but no 2-arc.
In current terminology, a transitive stratum contains a subject and an object; an unaccusative stratum contains a direct object but not a subject; an unergative stratum contains a subject but no object.
The Final 1 Law predicts that clauses with final unaccusative strata will not be well formed in any language. Taken together with certain other proposed linguistic universals, it has the following consequence:
Every clause with an unaccusative stratum involves an advancement to 1.
Under the Unaccusative Hypothesis, then, a class of intransitive clauses with an initial unaccusative stratum contrasts with a class of intransitive clauses with an initial unergative stratum. Impersonal passive clauses can be obtained from the second class of clauses. This syntactic class of clauses is semantically characterized by the list given above of the intransitive predicates whose passivization is possible cross-linguistically.
Is the Unaccusative Hypothesis correct?
What empirical facts support this hypothesis?
The fact that the class of intransitive clauses that can be passivized follows from the Unaccusative Hypothesis cannot be regarded as supporting this hypothesis: it is a basic principle of logic that true statements can follow from false statements. In view of this principle of logic, the truth of the Unaccusative Hypothesis cannot be confirmed by the prediction of the facts it was constructed to explain. The Unaccusative Hypothesis, like any other scientific hypothesis, can be considered plausible only if we can find facts that follow from it on independent grounds.
The trouble with the Unaccusative Hypothesis is that it cannot be tested on independent grounds.
The Unaccusative Hypothesis is unacceptable not in the sense that there are counterexamples to it; rather, it is unacceptable because it is consistent with any conceivable set of empirical data. It is always possible to assign an initial unaccusative stratum to intransitive clauses that cannot be passivized and an initial unergative stratum to intransitive clauses that can be passivized. It is therefore impossible to construct even potential counterexamples to the hypothesis, and only those hypotheses to which potential counterexamples can be constructed have empirical import.
The Unaccusative Hypothesis assumes an abstract stratum, called the unaccusative stratum, which contains an object and no subject, and it assumes an advancement of the object to the position of the subject. But there is no way of constructing counterexamples to these assumptions. These assumptions are gratuitous, because they make no empirical claims.
As an alternative to the Unaccusative Hypothesis, I propose a hypothesis that is free of assumptions to which counterexamples cannot be constructed. I call this hypothesis the syntactic neutralization hypothesis.
The Syntactic Neutralization Hypothesis claims that any intransitive clause is a result of the neutralization of transitive clauses containing a binary predicate applied to two terms, one of which becomes a primary term and the other a secondary term. The terms in a transitive clause are always assigned a definite meaning: in accusative languages, the primary term denotes an agent and the secondary term a patient. In ergative languages the reverse is true: the primary term denotes a patient, and the secondary term denotes an agent. But in both accusative and ergative languages, the primary term in an intransitive clause may denote either an agent or a patient, because the primary term in an intransitive clause represents the opposition primary term: secondary term and so can be identified with either member of this opposition.
The Syntactic Neutralization Hypothesis predicts that the primary term in an intransitive clause may denote either an agent or a patient, and in some languages, called active languages, this prediction is reflected in different case markings for terms denoting agents and terms denoting patients in intransitive clauses. In accordance with this prediction, intransitive clauses can be divided into two classes: a class of intransitive clauses that cannot be passivized, and a class of intransitive clauses that can be passivized cross-linguistically.
Both the Unaccusative Hypothesis and the Syntactic Neutralization Hypothesis predict the same facts, but the latter has two advantages over the former: 1) it is free of the assumptions of an underlying stratum, that is, a deep structure, which cannot be tested empirically; and 2) it can be tested on independent grounds, because it is a particular case of the Neutralization Hypothesis, which stands out as one of the pillars of linguistic theory and semiotics.
Let us now turn to the problems connected with the passivization of indirect objects.
Consider the sentence
(24) Peter gave money to Nancy.
In this sentence Peter is the subject, money is the direct object, and Nancy is the indirect object.
RG claims that the indirect object Nancy cannot be passivized by advancing it to 1. Rather, it is passivized in two steps: first, Nancy is advanced to 2, and money is put en chômage, which yields
(25) Peter gave Nancy money.
This sentence can be represented by the following stratal diagram:
In (26) Nancy is the direct object and money is a chômeur.
The second step is advancing Nancy to 1 and putting Peter en chômage, which yields
(27) Nancy was given money by Peter.
This sentence can be represented by the following diagram:
As these examples show, RG claims that indirect objects can be promoted to subjects only via promotion to direct object and putting the direct object en chômage.
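The two-step derivation that RG assumes can be traced mechanically. The following Python sketch is only an illustration: the dictionary encoding of a stratum, the label suffix `-cho` for a chômeur, and the function name `advance` are assumptions made here, not RG notation.

```python
def advance(stratum, nominal, target):
    """Advance `nominal` to relation `target`.

    Whichever nominal bore `target` before the advancement is put
    en chômage, written here as '<target>-cho' (hypothetical label).
    """
    result = {}
    for name, relation in stratum.items():
        if name == nominal:
            result[name] = target
        elif relation == target:
            result[name] = f"{target}-cho"
        else:
            result[name] = relation
    return result


# (24) Peter gave money to Nancy.
# 1 = subject, 2 = direct object, 3 = indirect object.
initial = {"Peter": "1", "money": "2", "Nancy": "3"}

# Step 1: Nancy advances from 3 to 2, as in "Peter gave Nancy money."
after_3_to_2 = advance(initial, "Nancy", "2")

# Step 2: Nancy advances from 2 to 1, as in "Nancy was given money by Peter."
after_2_to_1 = advance(after_3_to_2, "Nancy", "1")

print(after_3_to_2)
print(after_2_to_1)
```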
There are two groups of counterexamples to this claim.
In some languages, such as Dutch, an indirect object can be promoted to direct object but cannot be promoted to subject (Dik, 1978: 125):
|(29)||a.||Jan gaf het boek aan Piet.|
John gave the book to Peter
‘John gave the book to Peter’.
|b.||Jan gaf Piet het boek.|
John gave Peter the book
‘John gave Peter the book’.
|c.||* Piet werd het boek gegeven door Jan.|
Peter was the book given by John
‘Peter was given the book by John’.
On the other hand, in some languages an indirect object can be promoted to subject, although it cannot be promoted to direct object. Here is an example from Achenese (Lawler, 1977):
|(30)||a.||Gopnyan ka gi-bre buku nyan ki-kamo.|
he perf. he-give book that to-us
‘He gave that book to us’.
|b.||Buku nyan ka gi-bre ki-kamo le-gopnyan.|
book that perf. he-give to-us by-him
‘That book was given to us by him’.
|c.||Kamo ka gi-bre buku nyan le-gopnyan.|
we perf. he-give book that by-him
‘We were given that book by him’.
|d.||*gopnyan ka gi-bre kamo buku nyan.|
he perf. he-give us book that
‘He gave us that book’.
Another group of counterexamples conflicts with the claim of RG that promotion of an indirect object to direct object puts the initial direct object en chômage. So, in some dialects of the English language, it is possible to passivize both Nancy and money in (24). These dialects can have
|(31)||a.||Peter gave Nancy money.|
|b.||Nancy was given money.|
|c.||Money was given Nancy.|
(31c) conflicts with the claim of RG that money in (31a) is a 2-chômeur. As a matter of fact, (31a) is a stylistic variant of (24) rather than the result of the promotion of Nancy from 3 to 2; therefore, the passive construction (27) must be regarded as derived from both (24) and (25).
We can advance strong arguments also against the notion of the 1-chômeur. Consider the following passive sentences:
(32) Mary was led by the hand.
(33) Mary was led by John.
RG claims correctly that by John in (33) is syntactically different from the oblique by the hand in (32) and at the same time has something in common with the subject in the active construction
(34) John led Mary.
That raises the problem, What is by John in (33), if it is different from a regular oblique?
To solve this problem, RG posits two levels for the passive construction (33) and the chômeur relation: by John is considered a chômeur on the final level and a subject on the initial level.
This solution is not satisfactory for the following reason:
The claim of RG that by John in (33) is syntactically different from the oblique by the hand in (32) is correct. But that is only half of the story. The other half is that by John also has something syntactically in common with by the hand. RG recognizes only the difference between the two prepositional phrases, but it fails to see that these prepositional phrases are syntactically identical from a structural point of view.
The crucial fact is that the two prepositional phrases are identical in their syntactic function of modifying predicates: both are predicate modifiers. But that is what can be called their inherent function. To see the difference between the two phrases, let us consider the meaning of by.
There is a striking difference between the meaning of the preposition by in (32) and in (33). In by the hand the preposition has a concrete meaning: it denotes a mode of action. In by John the preposition by has lost its concrete meaning, and by John, which is an oblique, has taken on the function of the secondary term—it has become a functional counterpart of the primary term of the corresponding active sentence.
Linguistic theory can dispense with both chômeur relations and strata in explaining passive constructions. The crucial thing is to distinguish between the inherent and superposed functions of the terms. It must be stressed that the inherent and superposed functions of oblique terms in passive constructions are not ad hoc notions. The distinction between the inherent and superposed functions concerns any syntactic, morphological, and phonological unit—a syntaxeme, a word, a morpheme, a phoneme. The distinction between inherent and superposed functions of linguistic units pervades any natural language and must be a cornerstone of universal grammar.
Some linguists have recently advanced a claim that passivization is essentially a demotion of the subject in a sentence (Keenan, 1975; Comrie, 1977; Jain, 1977). This claim is based on the syntax of impersonal passives.
True, passivization of active constructions involves the demotion of the subject. But the demotion of the subject is not peculiar to passivization; it is also involved in producing impersonal active sentences. Two kinds of demotion of the subject must be distinguished: a demotion connected with passivization, and a demotion connected with producing active impersonal constructions. Compare in Russian
|(35)||a.||Pulja ubila bojca.|
bullet-Nom. killed soldier-Acc.
‘The bullet killed the soldier’.
|b.||Boec byl ubit pulej.|
soldier-Nom. was killed bullet-Instr.
‘The soldier was killed by the bullet’.
|c.||Pulej ubilo bojca.|
bullet-Instr. killed-Impersonal soldier-Acc.
Literally: *‘It killed the soldier with the bullet’.
(35c) has no counterpart in English. It is an impersonal construction that was obtained by the demotion of the subject.
Similar examples can be found in other languages. They are crucial: a comparison of passive constructions with impersonal active constructions makes it clear that the demotion of the subject that accompanies passivization is a part of the operation of conversion.
Consider now the following expression from Russian:
By demoting and eliminating the subject, we get an impersonal active construction
|‘It is said’|
rather than an impersonal passive construction.
The demotion theory of passivization is motivated by a desire to avoid abstractions, such as the zero or dummy subject. Of course, we must avoid unnecessary abstract entities. But to explain impersonal passive constructions, we need the notion ‘dummy subject’. By introducing this notion, we can see that the passivization of the sentence with a one-place predicate is analogous to the passivization of the sentences with two-place predicates. Otherwise the mechanism of the passivization of the sentences with a one-place predicate remains a mystery.
I have considered the most important alternative theories of passivization: the transformational theory, the theory of relational grammar, and the demotion theory. In view of the foregoing discussion, each of these theories is unacceptable for one reason or another. All of them are inferior to the theory of passivization based on applicative grammar.
The formalism of applicative grammar is related to the formalism of categorial grammars, but at the same time there is an essential difference between the two formalisms.
Y. Bar-Hillel, C. Gaifman, and E. Shamir proved the formal equivalence of phrase structure grammars (immediate constituent grammars) and categorial grammars (Bar-Hillel, 1964). Since categorial grammars are equivalent to phrase structure grammars, categorial grammars are inadequate for the description of natural languages in the same way as are phrase structure grammars.
To have a grammar that could be adequate for the description of natural languages, I unified categorial grammar and the system of the combinators of combinatory logic into an integrated whole; the linguistic theory based on the resulting formalism I call applicative grammar. The relation of applicative grammar to categorial grammar is similar to the relation of generative-transformational grammar to phrase structure grammar. Just as it would be a mistake to identify generative-transformational grammar with phrase structure grammar, so it would be a mistake to identify applicative grammar with categorial grammar.
I now turn to the formalism of applicative grammar.
A language, by definition, must have three kinds, or categories, or types of expressions (henceforth I will use the term type as a synonym of the term category):
1) terms, or names of objects;
2) sentences;
3) operators, that is, expressions that combine expressions to form other expressions.
In applicative grammar, operators are connected with one another by a network of formal definitions, which eventually reach the ultimate definientia—term and sentence. Let us take, for instance, the operators one-place predicate (p1), two-place predicate (p2), and three-place predicate (p3) and see how their definitions are interrelated with one another and with the ultimate definientia, the sentence (s) and the term (t). Consider the following example:
|showed||— (p3)|
|showed Nancy||— (p2)|
|showed Nancy pictures||— (p1)|
|Jim showed Nancy pictures||— (s)|
The first expression showed is a three-place (or ditransitive) predicate of type p3. Applying it to an expression of type t produces showed Nancy, an expression of type p2, that is, a transitive predicate of the same type as, for example, took. Applying showed Nancy to pictures produces showed Nancy pictures, an intransitive predicate of the same type p1 as, for example, took the book or walked. Applying showed Nancy pictures to Jim produces Jim showed Nancy pictures, an expression of type s.
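The stepwise descent from p3 to s can be sketched in Python. This is a minimal illustration rather than part of the formalism: the names `STEP` and `derive` are assumptions, and expressions are written operator first, so the final string shows the operand Jim after its operator instead of in English word order.

```python
# Applying a predicate of a given type to a term (t) yields the next type:
# p3 -> p2 -> p1 -> s.
STEP = {"p3": "p2", "p2": "p1", "p1": "s"}


def derive(predicate, terms, pred_type="p3"):
    """Apply `predicate` to `terms` one at a time.

    Returns a list of (expression, type) pairs, one per stage.
    """
    expression, current = predicate, pred_type
    stages = [(expression, current)]
    for term in terms:
        if current not in STEP:
            raise TypeError(f"an expression of type {current} takes no term")
        current = STEP[current]
        expression = f"{expression} {term}"
        stages.append((expression, current))
    return stages


stages = derive("showed", ["Nancy", "pictures", "Jim"])
for expression, type_symbol in stages:
    print(f"{expression:28} {type_symbol}")
```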
By type I mean a class of operators. For the sake of generality, I conceive of sentences and terms as zero-place operators. This approach is convenient not only from the formal point of view; it is empirically justified, too. Thus, Jim walks or Jim takes the book is equivalent to the Latin impersonal Ningit ‘It snows’, which is nothing but a zero-place predicate. The English term the blind man is equivalent to the blind, which is nothing but a zero-place attribute.
Since I used the term type only in the sense of a class of operators, I will henceforth replace it with the term O-type.
The combinatory properties of predicates and terms can be expressed in a series of definitions:
The symbol ≡ indicates identity by definition. The juxtaposition of O-type symbols with the expression symbols indicates that a given expression belongs to a given type. The blanks between expressions are meant to divide them. The left expression is applied to the right one. The above definitions read:
1) Expression A of O-type s is identical by definition with expression B of O-type p1 applied to expression C of O-type t.
2) Expression B of O-type p1 is identical by definition with expression B1 of O-type p2 applied to expression C1 of O-type t.
3) Expression B1 of O-type p2 is identical by definition with expression B2 of O-type p3 applied to expression C2 of O-type t.
We face a conceptual problem: How do we dispose of p1, p2, and p3 by reducing them to the ultimate definientia t and s?
To solve this problem, I will construct a formal definition of O-type.
Consider an expression XY where X is an operator and Y is an operand. It is obvious that if expression XY belongs to a certain type v, and expression Y belongs to a certain type u, expression X must belong to a type of expressions that change expressions of type u into expressions of type v. Let us designate this type as
(3) Ouv
where the symbol O stands for operationality, that is, for the general notion of the type of operators. I call it the operationality primitive. This formula reads: ‘the type of operators from u into v.’
We can formulate a rule for classifying the above expression XY:
(4) If X is in Ouv and Y is in u, then XY is in v.
Now we can have a formal calculus of operators.
We postulate certain primitive O-types: c1, c2, . . . .
We define the formal concept of O-type as follows:
|RULE T:||a.||Primitive types c1, c2, . . . are O-types;|
|b.||if u and v are O-types, then Ouv is an O-type.|
Rule T can be represented by the following tree diagram:
Then we introduce notation for the relation ‘belongs to’:
Formula (6) reads: ‘expression X belongs to type y.’
Next we adopt the above rule (4) and present it as a tree diagram:
Here the horizontal line is meant to indicate that resultant v(XY) follows from the application of operator X to operand Y. I want to stress the relativity of the concepts ‘operand’ and ‘resultant’: the three expressions X, Y, and XY are all operators. But besides, Y is an operand and XY is a resultant of X.
Rule E reads: if expression X belongs to O-type Ouv, and expression Y belongs to O-type u, then expression XY belongs to O-type v.
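Rule T and Rule E translate directly into a small program. The sketch below is one possible encoding, not the author's notation: a composite O-type Ouv is represented as the tuple `("O", u, v)`, and the primitives t and s are assumed to stand in for c1 and c2.

```python
PRIMITIVES = {"t", "s"}  # assumption: t and s play the role of c1, c2


def O(u, v):
    """Build the composite O-type Ouv."""
    return ("O", u, v)


def is_otype(x):
    """Rule T: primitives are O-types; Ouv is an O-type if u and v are."""
    if isinstance(x, str):
        return x in PRIMITIVES
    return (
        isinstance(x, tuple)
        and len(x) == 3
        and x[0] == "O"
        and is_otype(x[1])
        and is_otype(x[2])
    )


def rule_e(x_type, y_type):
    """Rule E: if X is in Ouv and Y is in u, then XY is in v."""
    if isinstance(x_type, tuple) and x_type[1] == y_type:
        return x_type[2]
    raise TypeError("the operator does not accept an operand of this type")


p1 = O("t", "s")   # Ots
p2 = O("t", p1)    # OtOts
print(is_otype(p2))        # True
print(rule_e(p2, "t"))     # ('O', 't', 's'), that is, Ots
print(rule_e(p1, "t"))     # s
```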
It should be noted that formula Ouv could be presented in a different form, say as (u → v), if we adopted a convention that the arrow designates ‘O-type’. But the notation with prefix O is more convenient than the notation with infix →, since the former is bracketless and the latter involves the use of brackets, as in the case with any mathematical notation using infixes instead of prefixes. Besides, the arrow might involve conceptual ambiguity, since usually this symbol does not designate abstract objects.
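The difference between the two notations can be made concrete by converting one into the other. The following sketch parses a bracketless prefix type and prints its infix counterpart; the function names `parse` and `to_arrow` and the ASCII arrow `->` are choices made for this illustration only.

```python
def parse(src):
    """Parse a bracketless prefix O-type such as 'OtOts' into nested tuples."""
    def walk(i):
        if i >= len(src):
            raise ValueError("the O-type ends too early")
        if src[i] == "O":
            u, i = walk(i + 1)
            v, i = walk(i)
            return ("O", u, v), i
        if src[i] in "ts":
            return src[i], i + 1
        raise ValueError(f"unexpected symbol {src[i]!r}")

    typ, end = walk(0)
    if end != len(src):
        raise ValueError("trailing symbols after a complete O-type")
    return typ


def to_arrow(typ):
    """Render a parsed O-type in infix arrow notation, which needs brackets."""
    if isinstance(typ, str):
        return typ
    _, u, v = typ
    return f"({to_arrow(u)} -> {to_arrow(v)})"


for prefix in ["Ots", "OtOts", "OOttOts"]:
    print(prefix, "=", to_arrow(parse(prefix)))
```

The prefix strings are read unambiguously without any brackets, whereas the infix forms of OtOts and OOttOts differ only in where the brackets fall.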
We can deduce the following two rules from Rule E:
This rule reads: if the resultant of the application of expression X to expression Y belongs to O-type v, and expression Y belongs to O-type u, then expression X belongs to O-type Ouv.
Rule E1 is a straightforward consequence of Rule E, and the proof is obvious: let X belong to O-type z; then, according to Rule E, since Y belongs to O-type u and XY belongs to O-type v, z must be identical with Ouv.
Rule E1 defines the O-type of an operator in terms of the O-types of its operand and resultant.
The reverse of Rule E1 is this:
This rule reads: if expression XY is the resultant of the application of operator X to its operand Y, and X belongs to O-type Ouv, then Y belongs to O-type u and XY belongs to O-type v.
The proof of Rule E2 is no less obvious: if expression XY is constructed by Rule E, and operator X belongs to O-type Ouv, then Y must belong to O-type u and XY must belong to O-type v.
Rule E2 defines the O-types of an operand and of a resultant in terms of the O-type of their operator.
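Rules E1 and E2 are inverses of each other, which a short sketch can make explicit. The tuple encoding `("O", u, v)` for Ouv and the helper `show` are assumptions of this example.

```python
def rule_e1(y_type, xy_type):
    """Rule E1: from Y in u and XY in v, infer that X is in Ouv."""
    return ("O", y_type, xy_type)


def rule_e2(x_type):
    """Rule E2: from X in Ouv, infer that Y is in u and XY is in v."""
    if not (isinstance(x_type, tuple) and len(x_type) == 3 and x_type[0] == "O"):
        raise TypeError("a primitive O-type has no operand or resultant")
    _, u, v = x_type
    return u, v


def show(typ):
    """Write an O-type in bracketless prefix notation."""
    if isinstance(typ, str):
        return typ
    return "O" + show(typ[1]) + show(typ[2])


# The operand is a term (t) and the resultant is a sentence (s):
operator_type = rule_e1("t", "s")
print(show(operator_type))       # Ots
print(rule_e2(operator_type))    # ('t', 's')
```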
Let us now turn to our problem. By applying Rule E1 to the above set of definitions, we can define p1, p2, and p3 in terms of the ultimate definientia t and s. That is done in three steps: first, we define p1 as Ots, then we define p2 as OtOts, and finally we define p3 as OtOtOts. As a result, we get a new set of definitions:
Reducing the definitions of all O-types to their ultimate definientia makes it possible to determine clearly to which O-types given expressions must belong if they are to combine into well-formed expressions. As a matter of fact, Rule E formalizes the concept of well-formedness with respect to type.
I call an expression X well-formed with respect to type if it is constructed from expressions Y1, . . . ,Yn by Rule E.
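Well-formedness with respect to type can be checked by applying Rule E recursively. In the sketch below, an expression is either a word or a pair (operator, operand); the `LEXICON` and the function names are hypothetical and serve only to illustrate the check.

```python
def parse(src):
    """Parse a bracketless prefix O-type such as 'OtOtOts' into nested tuples."""
    def walk(i):
        if i >= len(src):
            raise ValueError("the O-type ends too early")
        if src[i] == "O":
            u, i = walk(i + 1)
            v, i = walk(i)
            return ("O", u, v), i
        if src[i] in "ts":
            return src[i], i + 1
        raise ValueError(f"unexpected symbol {src[i]!r}")

    typ, end = walk(0)
    if end != len(src):
        raise ValueError("trailing symbols after a complete O-type")
    return typ


# Hypothetical lexicon: p3 is written out as OtOtOts.
LEXICON = {
    "showed": parse("OtOtOts"),
    "Nancy": "t",
    "pictures": "t",
    "Jim": "t",
}


def type_of(expr):
    """Return the O-type of `expr`, or None if Rule E cannot construct it."""
    if isinstance(expr, str):
        return LEXICON.get(expr)
    operator, operand = expr
    op_type, arg_type = type_of(operator), type_of(operand)
    if isinstance(op_type, tuple) and op_type[1] == arg_type:
        return op_type[2]
    return None


sentence = ((("showed", "Nancy"), "pictures"), "Jim")
print(type_of(sentence))             # s
print(type_of(("Jim", "showed")))    # None: a term is not an operator
```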
We can construct applicative tree diagrams to represent various type analyses of expressions. For example, a type analysis of the sentence Jim showed Nancy pictures can be represented by the following diagram:
Let us now apply the calculus of O-types to other conceptual problems. We will be concerned with problems arising from conceptual ambiguity and vagueness. It is obvious that conceptual ambiguity and vagueness are highly disadvantageous for any science. The history of science abounds in examples where an increase in the conceptual clarity of a theory through careful clarifications and specifications of meaning had a significant influence on the progress of science. The emergence of the theory of special relativity, for example, depended upon the recognition and subsequent reduction of conceptual ambiguity and vagueness within a particular domain.
We will be able to construct precise definitions for syntactic concepts that either are used undefined or have definitions that are of little avail because of their ambiguity and vagueness.
I start with the definition of the syntactic system of applicative grammar. This system is defined by eight sorts of notions, as follows:
1. primitive O-types: t, s;
2. rules for constructing composite types from the primitives:
a. primitive O-types t, s are O-types,
b. if x and y are O-types, then Oxy is an O-type;
3. expressions belonging to O-types;
4. rules for constructing expressions: Rule E, Rule E1, Rule E2;
5. nine combinators (or combinatory operators): I, C, C*, W, B, S, K, ϕ, Ψ;
6. rules for applying combinators: reduction rules and expansion rules;
7. replacement rules;
8. deductive processes: expansion and reduction.
I introduce a few definitions:
DEFINITION 1. If expression XY belongs to type s, expression Y belongs to type t, and expression X belongs to type Ots, I call X a one-place predicate and Y a primary term.
DEFINITION 2. If expression XY belongs to type Ots, expression Y belongs to type t, and expression X belongs to type OtOts, I call X a two-place predicate and Y a secondary term.
DEFINITION 3. If expression XY belongs to type OtOts, expression Y belongs to type t, and expression X belongs to type OtOtOts, I call X a three-place predicate and Y a tertiary term.
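Definitions 1 to 3 assign a name to each term according to the type of the predicate that takes it. A minimal sketch, assuming the tuple encoding `("O", u, v)` for Ouv and a hypothetical helper `label_terms`:

```python
P1 = ("O", "t", "s")   # Ots
P2 = ("O", "t", P1)    # OtOts
P3 = ("O", "t", P2)    # OtOtOts

# Predicate type -> (name of the predicate, name of the term it takes).
NAMES = {
    P1: ("one-place predicate", "primary term"),
    P2: ("two-place predicate", "secondary term"),
    P3: ("three-place predicate", "tertiary term"),
}


def label_terms(predicate_type, terms):
    """Apply a predicate to `terms` in order and name each term."""
    labels = []
    current = predicate_type
    for term in terms:
        if current not in NAMES:
            raise TypeError("the expression takes no further term")
        labels.append((term, NAMES[current][1]))
        current = current[2]  # the type of the resultant, by Rule E
    return labels


labels = label_terms(P3, ["Nancy", "pictures", "Jim"])
for term, role in labels:
    print(f"{term}: {role}")
```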
The opposition of a primary and a secondary term constitutes the nucleus of a sentence. These terms I call nuclear.
It follows from definitions 1 and 2 that primary terms occur both in the opposition primary term: secondary term (with two-place predicates) and outside this opposition (with one-place predicates). Therefore, the position with a one-place predicate must be regarded as the point of neutralization of the opposition primary term: secondary term, which is represented by the primary term in this position. The secondary term is the positive (marked) member of this opposition, and the primary term is its neuter-negative (unmarked) term.
DEFINITION 4. Let AB be a well-formed expression. It follows from Rule E that A belongs to O-type Oxy, B belongs to O-type x, and AB belongs to O-type y. Either A or B can be considered the main constituent of expression AB, called its head: if x ≡ y, the head is B, and if x ≢ y, the head is A.
If B is the head, A is called a modifier of the head. If A is the head, B is called a complement of the head. The term dependent denotes modifiers and complements together.
Example: The expression Bill bought new books is represented by the following tree diagram:
books is the head of (new books), and new is its modifier; bought is the head of (bought (new books)), and (new books) is its complement; (bought (new books)) is the head of ((bought (new books)) Bill), and Bill is its complement.
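The choice of head in Definition 4 depends only on whether the operand type x and the resultant type y coincide. The sketch below applies that test to the three applications in Bill bought new books; the function name `analyse` and the type assignments to the individual words are assumptions of this example.

```python
TERM = "t"
MODIFIER = ("O", "t", "t")          # Ott, assumed type of "new"
P1 = ("O", "t", "s")                # Ots
P2 = ("O", "t", P1)                 # OtOts, assumed type of "bought"


def analyse(a, a_type, b, b_type):
    """Definition 4: A is in Oxy and B is in x; pick the head of AB.

    Returns (analysis, type of AB).
    """
    if not (isinstance(a_type, tuple) and a_type[1] == b_type):
        raise TypeError("AB is not well-formed with respect to type")
    _, x, y = a_type
    if x == y:
        return {"head": b, "modifier": a}, y
    return {"head": a, "complement": b}, y


step1, np_type = analyse("new", MODIFIER, "books", TERM)
step2, vp_type = analyse("bought", P2, "new books", np_type)
step3, s_type = analyse("bought new books", vp_type, "Bill", TERM)
print(step1)
print(step2)
print(step3)
```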
DEFINITION 5. A type symbol is called adjoined if it is introduced into the syntactic system by a definition of the form z = Oxy,
where z denotes an adjoined type symbol and Oxy denotes an O-type where x and y are either other adjoined type symbols or t, or s.
It follows from this definition that we can introduce adjoined type symbols only in stages. At the first stage we can define only four adjoined type symbols for which arbitrary letters can be used:
(10) a = Ott, m = Oss, p1 = Ots, c = Ost
By substituting these adjoined type symbols for their definientia used as components of complex categories, we can introduce new adjoined type symbols, for example:
(11) p2 = Otp1, d = Op1p1
By substituting new adjoined type symbols for their definientia used as components in more complex types, we can introduce further adjoined type symbols, and so on.
It is obvious that any adjoined type symbol introduced at later stages can be defined in terms of the ultimate definientia t and s by a series of definitions. I call this series of definitions a definitional reduction.
Example of definitional reduction:
(12) p3 = Otp2 = OtOtp1 = OtOtOts
The concept of adjoined type symbols and their definitional reduction is important, because by introducing adjoined type symbols, we can present the system of types in a compact form.
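Definitional reduction is a recursive substitution, and it can be sketched in a few lines. The definitions are those given in (10), (11), and (12); writing them as space-separated strings and the name `reduce_symbol` are conventions of this example only.

```python
# Adjoined type symbols and their definitions, with components separated
# by spaces so that multi-character symbols such as p1 stay intact.
DEFINITIONS = {
    "a": "O t t",
    "m": "O s s",
    "p1": "O t s",
    "c": "O s t",
    "p2": "O t p1",
    "d": "O p1 p1",
    "p3": "O t p2",
}


def reduce_symbol(symbol):
    """Rewrite an adjoined type symbol in terms of O, t, and s alone."""
    if symbol in ("O", "t", "s"):
        return symbol
    if symbol not in DEFINITIONS:
        raise KeyError(f"{symbol} has not been introduced by a definition")
    return "".join(reduce_symbol(part) for part in DEFINITIONS[symbol].split())


for symbol in ["p1", "p2", "p3", "d"]:
    print(symbol, "=", reduce_symbol(symbol))
```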
Here is a table of adjoined type symbols denoting some basic syntactic types:
We can introduce as many adjoined type symbols as we need.
I will close this section by suggesting an alternative method of constructing applicative tree diagrams.
The above examples presented the construction of tree diagrams in accordance with a convention that an operator must be placed before its operand and that the tree diagram must be constructed from top to bottom, starting with the smallest units.
Sometimes, however, it is convenient to construct a tree diagram from bottom up, starting with a sentence. We take a sentence and treat it as the node, extract the operator from the sentence, and write the operator on the left and the operand on the right above the horizontal line. The same procedure is repeated over new nodes from left to right until we reach the ultimate constituents of the sentence. To illustrate how that is done, let us take the sentence Bill bought new books in the above example. We will construct an applicative tree diagram in steps.
Our first step is to take the sentence, draw a horizontal line over it, extract the operator from it, and form two new nodes over the horizontal line:
Our second step is to apply this procedure to the new nodes from left to right. We get
By applying the same procedure to the new nodes, we complete the construction:
There is no need to use brackets in this type of diagram.
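The bottom-up procedure can be sketched as follows, assuming that the sentence is already given as a nested pair (operator, operand). The function name decompose and the tuple encoding are hypothetical conveniences for this illustration, not part of the notation.

```python
def decompose(node):
    """Yield (node, operator, operand) for each split, visiting new nodes left to right."""
    queue = [node]
    while queue:
        current = queue.pop(0)
        # A tuple is a combination (operator, operand); a string is an
        # ultimate constituent and is not split further.
        if isinstance(current, tuple):
            operator, operand = current
            yield current, operator, operand
            queue.extend([operator, operand])


# Bill bought new books, i.e. ((bought (new books)) Bill)
SENTENCE = (("bought", ("new", "books")), "Bill")

steps = list(decompose(SENTENCE))
for _, operator, operand in steps:
    print(operator, "|", operand)
```

Each yielded step corresponds to one horizontal line of the diagram: the operator is written on the left and the operand on the right above the node that was split.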
Here is a tree diagram without brackets for the sentence Jim showed Nancy pictures:
The syntactic system provides the means for transposing one syntactic function into another. For example, the type OOttOts is a class of operators that transpose a modifier of a term into a one-place predicate; the type OtOtt is a class of operators that transpose a term into a modifier of a term. Sentences also can be transposed into different syntactic functions: for example, the type Ost is a class of operators that transpose a sentence into a term, and so on.
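Because types are written in prefix notation without brackets, it can help to split an O-type mechanically into its operand type and its result type. The following is a minimal sketch, assuming that only the symbols O, t, and s occur in the type string; the function names end_of_type and split_o_type are hypothetical.

```python
def end_of_type(s: str, i: int = 0) -> int:
    """Return the index just past the type that starts at position i."""
    if s[i] == "O":
        i = end_of_type(s, i + 1)  # skip the operand type x
        return end_of_type(s, i)   # skip the result type y
    if s[i] in ("t", "s"):
        return i + 1
    raise ValueError(f"unexpected symbol {s[i]!r}")


def split_o_type(s: str):
    """Split an O-type Oxy into the pair (x, y)."""
    if not s.startswith("O"):
        raise ValueError("not an O-type")
    mid = end_of_type(s, 1)
    end = end_of_type(s, mid)
    if end != len(s):
        raise ValueError("trailing symbols after the type")
    return s[1:mid], s[mid:end]


print(split_o_type("OOttOts"))  # ('Ott', 'Ots')
print(split_o_type("OtOtt"))    # ('t', 'Ott')
print(split_o_type("Ost"))      # ('s', 't')
```

Read this way, OOttOts takes a modifier of a term (Ott) to a one-place predicate (Ots), OtOtt takes a term to a modifier of a term, and Ost takes a sentence to a term.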
Now we are ready to formalize the notion of functional superposition. From a formal point of view, functional superposition amounts to superposition of types. Superposition of types is to be defined as follows:
Let us define the notion of the superposer.
DEFINITION 7. An operator R of type Ox<y/x> shall be called a superposer.
As an example of superposition, let us consider the derivation of the English gerund from the finite forms of the verb. In English, the gerunds derived from the finite form of verbs are assigned a stratified type:
The suffix -ing in English gerunds is a superposer of type
(The formula Otks, where k = 1, 2, 3 gives the number of occurrences of Ot, denotes intransitive, transitive, and ditransitive predicates, respectively: Ots, OtOts, and OtOtOts.)
Here are examples illustrating (18). The English gerund leaving, in the context of the expression John proposed our immediately leaving the office, belongs to type t, because it is modified by our of type Ott; also it belongs to type OtOts because it takes the secondary term the office of type t and is modified by the adverb immediately of type OOtksOtks.
There are six rules of combination of expressions belonging to stratified types:
Here is an explanation of the notation and meaning of rules S1-6. Symbol A is a variable indicating an operator, and B is a variable indicating an operand. Expressions enclosed in angle brackets indicate stratified types.
Rule S1 is interpreted as follows. Let A be an expression of type Ox<y/x> (which means that A is a superposer). Then, if A is applied to B of type x, we obtain a combination AB of a stratified type <y/x>. Rule S1 is a version of Rule E given in section 13.1. Rule S1 is illustrated by the derivation of a gerund:
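Rule S1 as just described can also be sketched in Python. This is only an illustration under assumptions: the classes O and Stratified and the functions is_superposer and apply_s1 are hypothetical names, and the types chosen for leave (OtOts) and for -ing (a superposer with x = OtOts and y = t) are inferred from the description of leaving above rather than quoted from a formula.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class O:
    """O-type Oxy: operand type x, result type y."""
    x: "Type"
    y: "Type"


@dataclass(frozen=True)
class Stratified:
    """Stratified type <y/x>: type y superposed on type x."""
    y: "Type"
    x: "Type"


Type = Union[str, O, Stratified]


def is_superposer(a: Type) -> bool:
    """Definition 7: a superposer has type Ox<y/x>."""
    return isinstance(a, O) and isinstance(a.y, Stratified) and a.y.x == a.x


def apply_s1(a: Type, b: Type) -> Stratified:
    """Rule S1: A of type Ox<y/x> applied to B of type x gives AB of type <y/x>."""
    if not is_superposer(a) or a.x != b:
        raise TypeError("Rule S1 does not apply")
    return a.y


# Assumed types for illustration only.
P2 = O("t", O("t", "s"))          # leave as a transitive predicate, OtOts
ING = O(P2, Stratified("t", P2))  # -ing as a superposer with x = OtOts, y = t
LEAVING = apply_s1(ING, P2)       # stratified type <t/OtOts>
```

Under these assumptions, leaving receives a type that behaves as a term (t) while retaining its base type OtOts, which is what allows it to be modified by our and still take the secondary term the office.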
Rule S3 is illustrated by the combination immediately leaving:
Rule S5 is illustrated by the combination immediately leaving the office:
Rule S2 is illustrated by the combination our immediately leaving the office:
If we consider the above expressions as parts of the sentence John proposed our immediately leaving the office, we get the following tre