Word order

From Wikipedia, the free encyclopedia

In linguistics, word order typology is the study of the order of the syntactic constituents of a language, and how different languages can employ different orders. Correlations between orders found in different syntactic sub-domains are also of interest. The primary word orders that are of interest are the constituent order of a clause – the relative order of subject, object, and verb; the order of modifiers (adjectives, numerals, demonstratives, possessives, and adjuncts) in a noun phrase; and the order of adverbials.

Some languages use relatively restrictive word order, often relying on the order of constituents to convey important grammatical information. Others—often those that convey grammatical information through inflection—allow more flexibility, which can be used to encode pragmatic information such as topicalisation or focus. Most languages, however, have a preferred word order,[1] and other word orders, if used, are considered "marked".[2]

Most nominative–accusative languages—which have a major word class of nouns and clauses that include subject and object—define constituent word order in terms of the finite verb (V) and its arguments, the subject (S), and object (O).[3][4][5][6]

There are six theoretically possible basic word orders for the transitive sentence. The overwhelming majority of the world's languages are either subject–verb–object (SVO) or subject–object–verb (SOV), with a much smaller but still significant portion using verb–subject–object (VSO) word order. The remaining three arrangements are exceptionally rare, with verb–object–subject (VOS) being slightly more common than object–verb–subject (OVS), and object–subject–verb (OSV) being the rarest of all.[7][a]
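The figure of six is simply the number of ways to arrange three constituents (3! = 6). A minimal Python sketch of that enumeration follows; the function and constant names are illustrative only and are not drawn from any cited source.

```python
from itertools import permutations

# The three clause constituents: subject (S), verb (V), object (O).
CONSTITUENTS = ("S", "V", "O")


def basic_word_orders():
    """Return every ordering of S, V and O as a sorted list of strings."""
    return sorted("".join(order) for order in permutations(CONSTITUENTS))


if __name__ == "__main__":
    # Prints the six labels used throughout this article.
    print(basic_word_orders())
```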


Finding the basic constituent of word order and mode of transmission

A paper by Murray Gell-Mann and Merritt Ruhlen, building on work in comparative linguistics, asserts that the word order of the world's languages was originally SOV. The paper compares a survey of 2135 languages with a "presumed phylogenetic tree" of languages, concluding that changes in word order tend to follow particular pathways, and that the transmission of word order is to a great extent vertical (i.e. following the phylogenetic tree of ancestry) as opposed to horizontal (areal, i.e. by diffusion). According to this analysis, the most recent ancestor of currently known languages was spoken recently enough to trace the whole evolutionary path of word order in most cases.[8] A strong similarity exists between the linguistic tree and the genetic tree.[9]

It is not always easy to find the basic word order of S, O and V. First, not all languages make use of the categories of subject and object. In others, the subject and object may not form a clause with the verb. If subject and object can be identified within a clause, the problem can arise that different orders prevail in different contexts. For instance, French has SVO when the object is a noun, but SOV when the object is a pronoun; German has verb-medial order in main clauses, but verb-final order in subordinate clauses. In other languages the word order of transitive and intransitive clauses may not correspond. In still others, the rules for ordering S, O, and V may exist, but be secondary to (and often overruled by) more fundamental ordering rules, e.g. considerations such as topic–comment. To have a valid base for comparison, the basic word order is defined as:

  • declarative
  • main clause
  • S and O must both be nominal arguments
  • pragmatically neutral, i.e. no element has special emphasis

While the first two of these requirements are relatively easy to respect, the latter two are more difficult. In spoken language, there are hardly ever two full nouns in a clause; the norm is for the clause to have at most one noun, the other arguments being pronouns. In written language, this is somewhat different, but that is of no help when investigating oral languages. Finally, the notion of "pragmatically neutral" is difficult to test. While the English sentence "The king, they killed." has a heavy emphasis on king, in other languages, that order (OSV) might not carry a significantly higher emphasis than another order.

If all the requirements above are met, it still sometimes turns out that languages do not seem to prefer any particular word order. The last resort is text counts, but even then, some languages must be analyzed as having two (or even more) word orders.
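A text count of this kind can be sketched in Python as follows. The sketch assumes each clause in a corpus has already been labelled with its order by hand. The `dominance` threshold is a hypothetical parameter chosen for illustration, not a value given in the sources cited here.

```python
from collections import Counter


def text_count(clause_orders, dominance=2.0):
    """Return the order(s) that a text count supports as basic.

    clause_orders: iterable of order labels such as "SOV", one per clause
    dominance: hypothetical threshold; the top order is reported alone only
               if it is at least this many times as frequent as the runner-up
    """
    counts = Counter(clause_orders).most_common()
    if not counts:
        return []
    if len(counts) == 1 or counts[0][1] >= dominance * counts[1][1]:
        return [counts[0][0]]
    # No clear winner: report every order close enough to the most frequent.
    top = counts[0][1]
    return [order for order, n in counts if n * dominance > top]
```

With eight SOV clauses and two OSV clauses the function reports SOV alone; with five SVO, four VSO and one OSV clause it reports both SVO and VSO, mirroring the case where a language must be analysed as having two word orders.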

Constituent word orders

Word order   English equivalent   Proportion of languages   Example languages
SOV          "She him loves."     45%                       Proto-Indo-European, Sanskrit, Hindi, Ancient Greek, Latin, Japanese, Korean
SVO          "She loves him."     42%                       Cantonese, English, Hausa, Italian, Malay, Mandarin, Russian
VSO          "Loves she him."     9%                        Biblical Hebrew, Classical Arabic, Irish, Filipino, Tuareg-Berber, Welsh
VOS          "Loves him she."     3%                        Malagasy, Baure, Proto-Austronesian
OVS          "Him loves she."     1%                        Apalaí, Hixkaryana
OSV          "Him she loves."     0%                        Warao

Frequency distribution of word order in languages surveyed by Russell S. Tomlin in the 1980s[10][11]

The table above lists all six possible orders of subject, verb, and object, from most common to rarest (the examples use "she" as the subject, "loves" as the verb, and "him" as the object).

Sometimes patterns are more complex: German, Dutch, Afrikaans and Frisian have SOV in subordinate clauses, but V2 word order in main clauses, with SVO being the most common main-clause order. Using the guidelines above, the unmarked word order is then SVO. French uses SVO by default, but in the common case where the object is a clitic pronoun, the order is SOV instead.

Others, such as Latin, Greek, Persian, Romanian, Assyrian, Turkish, Finnish, and Basque, have no strict word order; rather, the sentence structure is highly flexible and reflects the pragmatics of the utterance. Similarly, Japanese requires that all sentences end with V, but the order can be SOV or OSV.

Topic-prominent languages organize sentences to emphasize their topic–comment structure. Nonetheless, there is often a preferred order; in Latin and Turkish, SOV is the most frequent outside of poetry, and in Finnish SVO is both the most frequent and obligatory when case marking fails to disambiguate argument roles.

Just as languages may have different word orders in different contexts, so may they have both fixed and free word orders. For example, Russian has a relatively fixed SVO word order in transitive clauses, but a much freer SV / VS order in intransitive clauses. Cases like this can be addressed by encoding transitive and intransitive clauses separately, with the symbol 'S' being restricted to the argument of an intransitive clause, and 'A' used for the actor/agent of a transitive clause. ('O' for object may be replaced with 'P' for 'patient' as well.) Thus, Russian is fixed AVO but flexible SV/VS.

In such an approach, the description of word order extends more easily to languages that do not meet the criteria in the preceding section. For example, Mayan languages have been described with the rather uncommon VOS word order. However, they are ergative–absolutive languages, and the more specific word order is intransitive VS, transitive VOA, where S and O arguments both trigger the same type of agreement on the verb. Indeed, many languages that some thought had a VOS word order turn out to be ergative like Mayan.

There is speculation about how the Celtic languages developed VSO word order, which is unusual in Indo-European. One popular theory is that Celtic speakers came into contact with Afro-Asiatic speakers somewhere in Europe, and that this contact gradually changed Celtic grammar, leaving a proposed Afro-Asiatic substratum.[13] However, the theory lacks sufficient evidence: the Celtic languages show no clear word borrowings from Afro-Asiatic, and the archaic Celtic cultures seem to share very few similarities with Afro-Asiatic cultures. The Afro-Asiatic substratum hypothesis is therefore widely disputed and often rejected by mainstream linguists. There is, however, evidence of Celtic words of unknown origin that may be pre-Indo-European borrowings.

For more information, see: Celtic Afro-Asiatic substratum.

An alternative proposal starts from the observation that a few of the Celtic words that are unidentified or of non-Indo-European origin seem to share at least some similarity with Basque words.[14] On this view, the change to VSO order in Celtic may have come from an extinct ancient language related to Basque, encountered when Celtic speakers first arrived in Western Europe.

Functions of constituent word order

A fixed or prototypical word order is one of many ways to ease the processing of sentence semantics and reduce ambiguity. One method of making the speech stream less open to ambiguity (complete removal of ambiguity is probably impossible) is a fixed order of arguments and other sentence constituents. This works because speech is inherently linear. Another method is to label the constituents in some way, for example with case marking, agreement, or another marker. Fixed word order reduces expressiveness, while added marking increases the information load in the speech stream; for these reasons strict word order seldom occurs together with strict morphological marking, one counter-example being Persian.[1]

Discourse patterns show that previously given information (topic) tends to precede new information (comment). Furthermore, acting participants (especially humans) are more likely to be talked about (to be topic) than things simply undergoing actions (like oranges being eaten). If acting participants are often topical, and topic tends to be expressed early in the sentence, this entails that acting participants have a tendency to be expressed early in the sentence. This tendency can then grammaticalize to a privileged position in the sentence, the subject.

The functions of word order mentioned above can be seen to affect the frequencies of the various word order patterns: the vast majority of languages have an order in which S precedes O and V. Whether V precedes O or O precedes V, however, has been shown to be a very telling difference with wide consequences for phrasal word orders.[15]
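These frequency claims can be checked against the percentages reported for Tomlin's survey earlier in this article. The Python sketch below sums the shares of all orders in which one constituent precedes another; the helper name `share_where` is illustrative, and the rounded percentages are taken as given.

```python
# Rounded percentages as reported for Tomlin's survey in this article.
SHARE = {"SOV": 45, "SVO": 42, "VSO": 9, "VOS": 3, "OVS": 1, "OSV": 0}


def share_where(first, second, shares=SHARE):
    """Sum the shares of all orders in which `first` precedes `second`."""
    return sum(
        percent
        for order, percent in shares.items()
        if order.index(first) < order.index(second)
    )


def share_subject_first(shares=SHARE):
    """Sum the shares of orders in which S precedes both O and V."""
    return sum(percent for order, percent in shares.items() if order[0] == "S")
```

On these figures, S precedes O in 96% of the surveyed languages and S precedes both O and V in 87%, whereas the split between VO (54%) and OV (46%) is far more even.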

Knowledge of word order, on the other hand, can be applied to identify the thematic relations of the NPs in a clause of an unfamiliar language. If we can identify the verb in a clause, and we know that the language is strict accusative SVO, then we know that Grob smock Blug probably means that Grob is the smocker and Blug the entity smocked. However, since very strict word order is rare in practice, such applications of word order studies are rarely effective.
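The inference in the Grob smock Blug example can be written out as a small Python sketch. It makes a strong simplifying assumption that each constituent is exactly one word, and the function name `assign_roles` is hypothetical.

```python
def assign_roles(words, verb, order="SVO"):
    """Guess subject and object of a transitive clause from a strict order.

    words: the clause as a list of three tokens (simplifying assumption:
           each constituent is exactly one word)
    verb:  the token already identified as the verb
    order: the language's basic order, a permutation of "S", "V" and "O"
    """
    if len(words) != 3 or sorted(order) != ["O", "S", "V"]:
        raise ValueError("expected three tokens and an order such as 'SVO'")
    if words[order.index("V")] != verb:
        raise ValueError("verb is not in the position the order predicts")
    return {
        "subject": words[order.index("S")],
        "verb": verb,
        "object": words[order.index("O")],
    }
```

Calling `assign_roles(["Grob", "smock", "Blug"], "smock")` yields Grob as subject and Blug as object. The check on the verb position shows why the method breaks down as soon as the language tolerates other orders.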

Phrase word orders and branching

The order of constituents in a phrase can vary as much as the order of constituents in a clause. Normally, the noun phrase and the adpositional phrase are investigated. Within the noun phrase, one investigates whether the following modifiers occur before or after the head noun.

  • adjective (red house vs house red)
  • determiner (this house vs house this)
  • numeral (two houses vs houses two)
  • possessor (my house vs house my)
  • relative clause (the by me built house vs the house built by me)

Within the adpositional phrase, one investigates whether the language makes use of prepositions (in London), postpositions (London in), or both (normally with different adpositions on each side).

There are several common correlations between sentence-level word order and phrase-level constituent order. For example, SOV languages generally put modifiers before heads and use postpositions. VSO languages tend to place modifiers after their heads, and use prepositions. For SVO languages, either order is common.

For example, French (SVO) uses prepositions (dans la voiture, à gauche), and places adjectives after nouns (une voiture spacieuse). However, a small class of adjectives generally goes before the noun (une grande voiture). On the other hand, in English (also SVO) adjectives almost always go before nouns (a big car), and adverbs can go either way, although the position before the head is more common (greatly improved). (English has a very small number of adjectives that go after their heads, such as extraordinaire, which kept its position when borrowed from French.)

Pragmatic word order

Some languages have no fixed word order. These languages often use a significant amount of morphological marking to disambiguate the roles of the arguments. However, some languages use a fixed word order, even if they provide a degree of marking that would support free word order. Also, some languages with free word order—such as some varieties of Datooga—combine free word order with a lack of morphological distinction between arguments.

Typologically, highly animate actors are more likely to be topical than low-animate undergoers. This trend comes through even in free-word-order languages, giving a statistical bias for SO order (or OS in the case of ergative systems; however, ergative systems do not usually extend to the highest levels of animacy, usually giving way to some form of nominative system, at least in the pronominal system).[16] Most languages with a high degree of morphological marking have rather flexible word orders, such as Turkish, Latin, Portuguese, Ancient and Modern Greek, Romanian, Hungarian, Lithuanian, Serbo-Croatian, Russian (in intransitive clauses), and Finnish. In some of those, a canonical order can still be identified, but in others this is not possible. When the word order is free, different choices of word order can be used to help identify the theme and the rheme.


In Hungarian, the suffix -t marks the direct object. For "Kate ate a piece of cake", the possibilities are:

  • "Kati megevett egy szelet tortát." (same word order as English) ["Kate ate a piece of cake."]
  • "Egy szelet tortát Kati evett meg." (emphasis on agent [Kate]) ["A piece of cake Kate ate."]
  • "Kati evett meg egy szelet tortát." (also emphasis on agent [Kate]) ["Kate ate a piece of cake."]
  • "Kati egy szelet tortát evett meg." (emphasis on object [cake]) ["Kate a piece of cake ate."]
  • "Egy szelet tortát evett meg Kati." (emphasis on number [a piece, i.e. only one piece]) ["A piece of cake ate Kate."]
  • "Megevett egy szelet tortát Kati." (emphasis on completeness of action) ["Ate a piece of cake Kate."]
  • "Megevett Kati egy szelet tortát." (emphasis on completeness of action) ["Ate Kate a piece of cake."]


In Portuguese, clitic pronouns and commas allow many different orders:

  • Eu vou entregar para você amanhã. ["I will deliver to you tomorrow."] (same word order as English)
  • Entregarei para você amanhã. ["{I} will deliver to you tomorrow."]
  • Eu lhe entregarei amanhã. ["I to you will deliver tomorrow."]
  • Entregar-lhe-ei amanhã. ["Deliver to you {I} will tomorrow."] (mesoclisis)
  • A si, eu entregarei amanhã. ["To you I will deliver tomorrow."]
  • A si, entregarei amanhã. ["To you {I} will deliver tomorrow."]
  • Amanhã, entregarei para você. ["Tomorrow {I} will deliver to you."]
  • Poderia entregar, eu, a você amanhã? ["Could deliver I to you tomorrow?"]

Braces ({ }) were used above to indicate omitted subject pronouns, which may be left implicit in Portuguese. Thanks to conjugation, the grammatical person is recovered.


In Latin, the endings of nouns, verbs, adjectives, and pronouns allow for extremely flexible order in most situations. Latin lacks articles.

  • Romulus condiderat urbem. ["Romulus had founded the city."] (Same order as English)
  • Romulus urbem condiderat. ["Romulus the city had founded."]
  • Condiderat Romulus urbem. ["Had founded Romulus city."]
  • Condiderat urbem Romulus. ["Had founded city Romulus."]
  • Urbem Romulus condiderat. ["The city Romulus had founded."]
  • Urbem condiderat Romulus. ["The city had founded Romulus."]

Romulus is in the nominative case, so it is the subject of the sentence. urbem is the accusative case of the third declension noun, urbs, so it is the object of the sentence. condiderat is the third person singular pluperfect indicative active form of the verb condo, condere. It tells the relationship between Romulus and urbem.

Latin prose often follows the word order "Subject, Indirect Object, Direct Object, Adverb, Verb" (commonly known by the acronym "SIDAV"), but this is more of a guideline than a rule. Adjectives normally go after the noun they modify (either the subject or the object), but this is not absolutely required. In practice, there is great flexibility in word order, though the one rule usually followed is that the verb goes last in the sentence. Nonetheless, it is not incorrect grammar to use a completely different word order. Putting a word earlier in the sentence increases the emphasis on it, but this subtlety would only be particularly obvious to a native Latin speaker. In Classical Latin poetry, lyricists followed word order very loosely to achieve a desired scansion. Romulus urbem condiderat (Subject Object Verb) is preferable, but there is nothing explicitly incorrect with condiderat urbem Romulus (Verb Object Subject).


Due to the presence of grammatical cases (nominative, genitive, dative, accusative, ablative, and in some cases or dialects vocative and locative) applied to nouns, pronouns and adjectives, the Albanian language permits a large number of positional combinations of words. In spoken language, a word order differing from the most common S-V-O helps the speaker put emphasis on a word, thus partially changing the message delivered. Here is an example:

  • "Marku më dha një dhuratë mua." ["Mark (me) gave a present to me.", neutral narrating sentence.]
  • "Marku mua më dha një dhuratë." ["Mark to me (me) gave a present.", emphasis on the indirect object, probably to compare the result of the verb on different persons.]
  • "Marku një dhuratë më dha mua." ["Mark a present (me) gave to me", meaning that Mark gave her only a present, and not something else or more presents.]
  • "Marku një dhuratë mua më dha." ["Mark a present to me (me) gave", meaning that Mark gave a present only to her.]
  • "Më dha Marku një dhuratë (mua)." ["Gave Mark to me a present.", neutral sentence, but puts less emphasis on the subject.]
  • "Më dha një dhuratë Marku mua." ["Gave a present to me Mark.", probably is the cause of an event being introduced later.]
  • "Më dha (mua) Marku një dhuratë." ["Gave to me Mark a present.", same as above.]
  • "Më dha një dhuratë mua Marku" ["(Me) gave a present to me Mark.", puts emphasis on the fact that the receiver is her and not someone else.]
  • "Një dhuratë më dha Marku mua" ["A present gave Mark to me.", meaning it was a present and not something else.]
  • "Një dhuratë Marku më dha mua" ["A present Mark gave to me.", puts emphasis on the fact that she got the present and someone else got something different.]
  • "Një dhuratë (mua) më dha Marku." ["A present to me gave Mark.", no particular emphasis, but can be used to list different actions from different subjects.]
  • "Një dhuratë mua Marku më dha." ["A present to me Mark (me) gave", recalls that at least a present was given to her by Mark.]
  • "Mua më dha Marku një dhuratë." ["To me (me) gave Mark a present.", is used when Mark gave something else to others.]
  • "Mua një dhuratë më dha Marku." ["To me a present (me) gave Mark.", emphasis on "to me" and the fact that it was a present, only one present or it was something different from usual.]
  • "Mua Marku një dhuratë më dha" ["To me Mark a present (me) gave.", Mark gave her only one present.]
  • "Mua Marku më dha një dhuratë" ["To me Mark (me) gave a present." puts emphasis on Mark. Probably the others didn't give her present, they gave something else or the present wasn't expected at all.]

Indo-Aryan languages

The word order of many Indo-Aryan languages can change depending on what specific implications a speaker wishes to make. These are generally aided by the use of appropriate inflectional suffixes. Consider these examples from Bengali:

  • আমি ওটা জানি না। ["I that don't know.", typical, neutral sentence]
  • আমি জানি না ওটা। ["I don't know that.", general emphasis on what isn't known]
  • ওটা আমি জানি না। ["That I don't know.", agitation about what isn't known]
  • ওটা জানি না আমি। ["That don't know I.", general emphasis on the person who doesn't know]
  • জানি না আমি ওটা। ["Don't know I that.", agitation about the person who doesn't know]
  • *জানি না ওটা আমি। [*"Don't know that I.", unused]

Other issues

In many languages, changes in word order occur due to topicalization or in questions. However, most languages are generally assumed to have a basic word order, called the unmarked word order; other, marked word orders can then be used to emphasize a sentence element, to indicate modality (such as an interrogative modality), or for other purposes.

For example, English is SVO (subject-verb-object), as in "I don't know that", but OSV is also possible: "That I don't know." This process is called topic-fronting (or topicalization) and is common. In English, OSV is a marked word order because it emphasises the object, and is often accompanied by a change in intonation.

An example of OSV being used for emphasis:

A: I can't see Alice. (SVO)
B: What about Bill?
A: Bill I can see. (OSV, rather than I can see Bill, SVO)

Non-standard word orders are also found in English poetry, particularly in archaic or romantic phrasing, such as the wedding phrase "With this ring, I thee wed" (SOV) or "Thee I love" (OSV), as well as in many other languages.


Differences in word order complicate translation and language education – in addition to changing the individual words, the order must also be changed. This can be simplified by first translating the individual words, then reordering the sentence, as in interlinear gloss, or by reordering the words prior to translation.
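The translate-then-reorder approach can be sketched in Python as follows. The sketch assumes the clause has already been split into subject, verb and object. The glosses come from the Latin example in this article, so case marking is folded into the lexicon entries rather than computed. The function and variable names are illustrative.

```python
def translate_and_reorder(clause, lexicon, target_order):
    """Translate each constituent, then arrange them in the target order.

    clause:       dict mapping "S", "V" and "O" to source-language phrases
    lexicon:      dict mapping source phrases to target-language phrases
    target_order: order of the target language, e.g. "SOV"
    """
    translated = {role: lexicon[phrase] for role, phrase in clause.items()}
    return " ".join(translated[role] for role in target_order)


# Glosses taken from the Latin example in this article; the accusative
# form "urbem" is stored directly (a simplification).
LEXICON = {
    "Romulus": "Romulus",
    "had founded": "condiderat",
    "the city": "urbem",
}
CLAUSE = {"S": "Romulus", "V": "had founded", "O": "the city"}

if __name__ == "__main__":
    print(translate_and_reorder(CLAUSE, LEXICON, "SOV"))
```

Keeping translation and reordering as separate steps makes it easy to produce an interlinear-style intermediate form, since the same translated constituents can be emitted in the source order first and the target order afterwards.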

See also


  1. ^ The constructed Klingon language uses OVS order for its deliberate alienness.


  1. ^ a b Comrie, 1981
  2. ^ Sakel, Jeanette (2015). Study Skills for Linguistics. Routledge. p. 61. 
  3. ^ Hengeveld, Kees (1992). Non-verbal predication. Berlin: Mouton de Gruyter. ISBN 3-11-013713-5. 
  4. ^ Sasse, H.J. (1993). "Das Nomen – eine universelle Kategorie?". Sprachtypologie und Universalienforschung. 46: 3. 
  5. ^ Jan Rijkhoff (2007) "Word Classes" Language and Linguistics Compass 1 (6) , 709–726 doi:10.1111/j.1749-818X.2007.00030.x
  6. ^ Rijkhoff, Jan (2004), "The Noun Phrase", Oxford University Press, ISBN 0-19-926964-5
  7. ^ Tomlin, Russell S. (1986). Basic word order: Functional principles. London: Croom Helm. ISBN 0-415-72357-4. 
  8. ^ Gell-Mann, Murray; Ruhlen, Merritt (10 October 2011). "The origin and evolution of word order". Proceedings of the National Academy of Sciences. 108 (42): 17290–17295. doi:10.1073/pnas.1113716108. 
  9. ^ Henn, B. M.; Cavalli-Sforza, L. L.; Feldman, M. W. (17 October 2012). "The great human expansion" (PDF). Proceedings of the National Academy of Sciences. 109 (44): 17758–17764. doi:10.1073/pnas.1212380109. PMC 3497766. PMID 23077256. 
  10. ^ Meyer, Charles F. (2010). Introducing English Linguistics International (Student ed.). Cambridge University Press. 
  11. ^ Tomlin, Russell S. (1986). Basic Word Order: Functional Principles. London: Croom Helm. p. 22. ISBN 9780709924999. OCLC 13423631. 
  12. ^ Kordić, Snježana (2006) [1st pub. 1997]. Serbo-Croatian. Languages of the World/Materials; 148. Munich & Newcastle: Lincom Europa. pp. 45–46. ISBN 3-89586-161-8. OCLC 37959860. OL 2863538W. 
  13. ^ https://mathildasanthropologyblog.wordpress.com/2008/07/12/an-afro-asiatic-connection-to-celtic-languages/
  14. ^ https://books.google.com/books?id=5TdIAAAAQBAJ&pg=PA73&lpg=PA73&dq=celtic+and+basque+word+borrowing+andere&source=bl&ots=8czDG34GtR&sig=eSDQrhiU5sAVsJn0m8S35krnLgs&hl=en&sa=X&ved=0ahUKEwjt4-q-64vWAhXKwFQKHddMCTIQ6AEILjAB#v=onepage&q=celtic%20and%20basque%20word%20borrowing%20andere&f=false
  15. ^ Dryer, Matthew S. 1992. 'The Greenbergian Word Order Correlations', Language 68: 81–138
  16. ^ "Language Universals and linguistic typology", Bernard Comrie, 1981

Further reading