Austronesian Counting

Tuesday, June 24, 2008

Kokota – one of the most bizarre numeral systems attested for an Oceanic language

Filed under: Uncategorized — richardparker01 @ 11:42 am

Robert Blust, perhaps the most influential Austronesian linguist today, said this about the Kokota (Ysabel Island, Solomons) counting/numbering system:

“The numeral system, while basically decimal (1: kaike, 2: palu, 3: tilo, 4: fnoto, 5: yaha, 6: nablo, 7: fitu, 8: hana, 9: nheva, 10: naboto), contains only one clear reflex of the POC numerals (fitu < POC *pitu). Most strikingly, the higher numerals from 20 to 60 have been constructed on the basis of alternating decimal and vigesimal principles, the former occurring with odd multiples of ten and the latter with even multiples, though this has been obscured by historical change: varedake ’20’, tulufulu ’30’ (< POC *tolupuluq = 3 x 10), palu-tutu ’40’ (= 2 x tutu, a morpheme that does not occur earlier in the numeral system, but with the implied value ’20’), limafulu (< POC *limapuluq = 5 x 10), tilo-tutu ’60’ (= 3 x tutu). The numerals 70-90 are decimal-based multiples of salai, a morpheme with the implied value ’10’ that does not occur earlier: fitu-salai ’70’, hana-salai ’80’, nheva-salai ’90’. There are separate morphemes for ‘100’ (gobi) and ‘1000’ (toga).

This must surely rank as one of the most bizarre numeral systems attested for an Oceanic language, and naturally raises questions about possible past contact influences”.
Blust, Robert. “John Lynch, Malcolm Ross, and Terry Crowley. 2002. The Oceanic languages.” Oceanic Linguistics 44.2 (Dec 2005)
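Blust's description can be checked mechanically. What follows is only a minimal sketch, not anything taken from Blust or from Lynch, Ross and Crowley: it records each Kokota decade in the quotation as a multiplier and an implied base value, then lists which base each decade uses. The names KOKOTA_DECADES and base_pattern are invented for the example, and treating varedake as an unanalysed 1 x 20 is an assumption.

```python
# Kokota decades as quoted by Blust: value -> (form, multiplier, implied base).
# Treating varedake as 1 x 20 is an assumption; the quotation does not analyse it.
KOKOTA_DECADES = {
    20: ("varedake", 1, 20),
    30: ("tulufulu", 3, 10),
    40: ("palu-tutu", 2, 20),
    50: ("limafulu", 5, 10),
    60: ("tilo-tutu", 3, 20),
    70: ("fitu-salai", 7, 10),
    80: ("hana-salai", 8, 10),
    90: ("nheva-salai", 9, 10),
}


def base_pattern(decades):
    """Return the implied base (10 or 20) of each decade, in ascending order."""
    return [decades[value][2] for value in sorted(decades)]


# Every form must multiply out to the value it names.
for value, (form, multiplier, base) in KOKOTA_DECADES.items():
    assert multiplier * base == value, form

print(base_pattern(KOKOTA_DECADES))
# [20, 10, 20, 10, 20, 10, 10, 10]
```

The alternation of 20 and 10 runs from 20 to 60 and then gives way to a plain run of tens, which is the pattern the quotation describes.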

The languages of Ysabel form a dialect chain until they meet a different language group in the SE corner. All except Bughotu are currently grouped as Meso-Melanesian; Bughotu belongs to the Central Eastern Oceanic grouping.













Kia, Zabana
pa lu
legha ha

Laghu (almost extinct)
legha ha
tilou / tilo
na boto

Zazao, Kilokaka
bo ‘tq h’
na botho

Cheke Holo, Maringe, Hograno
kaha / kaisei
pea / phia
falima / glima
famno / namno
hana / nhana
heva / nheva
botho / nabotho

Gayo, Gao
fa botho

Bugotu, Bughotu
sa lage


The Ysabel Island 1-10s, viewed as a complete set, show that some reflexes of POC 1-10 can, at least, be recognised. 

The first 7 languages are from the NW of Ysabel, where they form a chain of language ‘bands’, spanning the relatively long, narrow island, from north to south, all in the Meso-Melanesian language grouping.

Several of them use na botho for 10, instead of a reflex of proto-Oceanic *sa nga puluq.

Botho means ‘closed’, or ‘shut’; a very obvious term for the two closed fists that mark 10, after counting two hands by bending fingers, one by one, down to the palm.

Bughotu, the last language, from the SE tip of the island, is from a completely different grouping, Central Eastern Oceanic. It is clearly more ‘conservative’ of the POC terms than the others.

But Bughotu was also used as a missionary ‘lingua franca’ in the 19thC, and this may be how some of its higher numbers ‘leaked over’ into Cheke Holo and Kokota, as shown below:

Analysing only the numbers 1-10 is not enough. To understand a number system properly, you must include the higher numbers, if any existed.

A common assumption, that a special word for 10 demonstrates a fully decimal system, is simply not true.

Sometimes, in undeveloped number systems, a special word for 10 denotes only a ‘top number’ before counting the toes, using the same words again, until a special word for 20 marks the full set. But, in some recorded systems, yet another ‘top number’ kicks in at 15, to show when the left toes turn to the right, as in this example from Vanuatu:











South-East Ambrym

te he tisav = 1 (second hand)
lu he tisav = 2 (second hand)
tol he tisav = 3 (second hand)
hat he tisav = 4 (second hand)
… = ‘something’ – 2
tei e le = one on leg
le tei bus = leg 1 finishes
hanu tap = whole person


Anomalies in the Kokota decades














Kokota: na boto (10) … nheva salai (90)

Cheke Holo: botho / nabotho … thilotutu / namnosalei … kaisei gobi (1 hundred)

Bughotu: sa lage, tutugu /, tolu hanavulu, e rua tutugu, e lime hanavulu, tolu tutugu, vitu hanavulu, vati tutugu, hia hanavulu


The Kokota decades show every symptom of having been adopted, or imposed, from somewhere else. They break one of the very first ‘elementary rules’ of true decimalisation: that the decades should be a clear, simple system of multiples of 10. They are not.

Nobody innovates a system, for their own use, where they have to memorise more than a few new words to apply to the original 1-10, and 20.

Part of the reason for the modern dominance of the pure decimal system, worldwide, is that it is very simple to construct a higher number using only the first 10 number words, together with a single made-up word for each following power level. So a number like 1,236,549 can be easily spoken as ‘one million (10^6), two hundred (10^2) and thirty-six thousand (10^3), five hundred (10^2) and forty-nine’.
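The idea that a pure decimal system needs only the digit words plus one word per power can be sketched in a few lines. This is a hypothetical illustration (the function name decimal_terms is invented here): it splits a number into digit-and-power pairs, which is the raw material a fully regular decimal language would put into words. English itself groups by thousands and has irregular teens and decades, so this shows the principle rather than English usage.

```python
def decimal_terms(n):
    """Split a positive integer into (digit, power of ten) pairs, largest first."""
    terms = []
    power = 0
    while n > 0:
        n, digit = divmod(n, 10)
        if digit:  # a zero digit needs no word at all
            terms.append((digit, power))
        power += 1
    return terms[::-1]


print(decimal_terms(1236549))
# [(1, 6), (2, 5), (3, 4), (6, 3), (5, 2), (4, 1), (9, 0)]
```

Seven pairs, built from nine digit words and a handful of power words, are enough to name a number over a million.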

The highest numbers that can be spoken are limited by the words you have available, so for instance, if your highest number word is 20, you’re subject to a theoretical Limiting Number of 400 (=20×20).

So, it looks as if Bughotu had a simple and limited vigesimal system, with a morpheme (hangavulu) borrowed from another South Solomonic language, probably Nggela, its neighbour, to denote 30 and other odd-numbered decades.

Note that (hangavulu) is used as a single word, which can be multiplied, not a direct reflex of POC *sa-nga-puluq, meaning 1-10, so that POC 20 = *rua-nga-puluq.

In Bughotu, 30 is tolu-hanavulu, literally translated as: 3-1-10.

This loss of the logical meaning of *sa nga puluq (or its derivatives, hangafulu, tangavulu, etc) is very common as far to the west as the Bird’s Head of New Guinea, and throughout the Oceanic language group. I have termed it ‘Tautological Ten’.

So, when we turn to Kokota and Cheke Holo, and find their use of a reflex of *sa nga puluq (in only two odd-numbered decades) does ‘follow the rule’ (they seem to be using the –fulu portion of the morpheme ‘correctly’) this must arouse suspicions.

To find they, later on, utilise an apparent borrowing, salai, from Bughotu sa lage (remember that is from a separate language group) for 70, 80, and 90 almost proves there’s been some monkey-business.

I suspect this occurred when many Solomon Island number systems were standardised after the onset of literacy and formal education that arrived with missionaries in the late 19thC.

Often, languages, as they perceive a need for higher numbers, adopt a completely new borrowed system, as Embaloh is now doing in Borneo, Swahili did in East Africa, and Hausa did in West Africa. Each of these has adopted parts of their number systems from more ‘prestigious’ groups.

Usually, in an original vigesimal system, the counting goes up to 20 (1x ‘20 unit’) then repeats the cycle until the second 20 (2x ‘20 unit’). So, the counting of fingers goes up to a ‘top number’, 10, and the counting of toes up to another, usually glossed as ‘one man’ or ‘fingers-and-toes-finished’.

A very simple economical system, with their original 10 showing up on odd-numbered decades, is shown in this language from the western end of New Guinea:









snontujoser e sam-pur




Higher Numbers Still

In normal counting, the Dusneri don’t seem to have gone much further. Dusner has utin for 100 (or, more probably, for a ‘very large number’), but no known larger numbers.

Many pre-literate cultures standardised and formalised their number systems, but usually only if they had a special need for higher numbers.

Babylonian – taxes and astronomy,

Yele – shell money,

and many Papua New Guinea Highlanders – body-part tallies to demonstrate shares of huge feasts, etc.

But, in interpretations of pre-literate number systems, modern scholars have gone much further, pinpointing certain words, and offering very precise translations of the modern usages, perhaps unjustifiably:








Kokota: mola = 1 million

Cheke Holo: kaisei gobi = 1 hundred; kaisei thoga = 1 thousand; feferi = very large number

Bughotu: mola = 10,000; feferi = 100,000

In Kokota, gobi (100) may be related to goba – fat (obese), but probably without the same pejorative sense that we see in that description.

gobi (100) in nearby Nggela (related to Bughotu) is a specific measure – 10 canoes (or perhaps 10 canoe-worths of warriors?) (Codrington via Ivens)

toga (1000) also means ‘many’ in Kokota. Datau toga = paramount chief, tehi = many, and togatehi = a great many, or perhaps an emphatic combination.

feferi represents an uncountably large number in Cheke Holo, but is translated as a precise 100,000 in Bughotu.

mola is a common word in the Solomons for a ‘vast number’.
In some languages it has become standardised as 10,000, but in other cases, like Kokota, it now means ‘million’.
In Nggela, mola is the term for ten baskets of canarium nuts. This is not to suggest that someone once actually counted out a thousand nuts, then created a standard basket for them, as a measure, but that the word was appropriated at contact, and with the onset of literacy and decimalisation, to denote a more precise number.

Kokota certainly has a bizarre counting system, but I hope this post clarifies, a little, why that is so.




What do Welshmen have to do with Polynesians?

Filed under: Uncategorized — richardparker01 @ 8:22 am


It’s very difficult to convince Austronesian linguists that An numbers didn’t actually spring, fully-formed, as miraculously decimal systems in proto-Austronesian, or proto-Oceanic, at the latest, 3500 years ago. (It was just as difficult to convince Indo-Europeanists that this didn’t happen, either, but at least they’re coming round, now).

After all, if you collect large sets of cognate words from daughter languages, and then reconstruct them, using the phonetic rules you’ve already found by reconstructing other words, and end up with a very obvious decimal set, from 1-10, you make the most obvious deduction; it must always have been that way. Well, it wasn’t.


Vigesimal numbering, at least in the basic stages, occurred worldwide, simply because 20 is the logical end-point to counting first your fingers, and then your toes. At 20, you’ve got a higher unit, so you can memorise it somehow, and start again to count up to 20 again. Now you’ve got two higher units. Remember them, and then do it all over again. At the end, you will have x higher units, and an exact number left over.

At each count of twenty, you simply score a stick, or run your hand down your shepherd’s crook to another readymade score.
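The tally described above amounts to integer division by 20: the notches on the stick are the quotient and the animals counted since the last notch are the remainder. A minimal sketch, with a function name (tally_by_twenties) made up for the illustration:

```python
def tally_by_twenties(count):
    """Return (scores, remainder): full twenties notched, and what is left over."""
    return divmod(count, 20)


print(tally_by_twenties(87))
# (4, 7)
```

A flock of 87 is four notches on the crook and seven sheep over; nobody ever has to hold a number above 20 in their head.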

Right up to the 20th century, shepherds around England were using a ‘shepherd’s score’ to count their flocks. The system was adapted from the old Briton, or Brythonic number system, which itself evolved into separate languages (Welsh, Cumbric, Cornish and Breton) when the British were split up and forced west by the rude Anglo-Saxons in the 6thC AD.


Here’s an example of just one of these shepherds’ systems, compared with the old Welsh vigesimal system, which also persisted until the 20thC:


Old Shepherds’ Score
























Welsh

2 = dau, dwy (fem)
3 = tri, tair (fem)
4 = pedwar, pedair (fem)
11 = un ar ddeg
13 = tair ar ddeg
14 = pedair ar ddeg
16 = un ar bymtheg
17 = dwy ar bymtheg
19 = pedair ar bymtheg



So these (almost certainly illiterate) shepherds adopted a counting system from a different language (or kept it, because they were lower-status Brythons left behind) in an interesting way. They didn’t even consider applying the logic inherent in the original system, but used made-up, but memorable, rhyming nonsense words instead.

Also, young mothers started using these rhyming words as lullabies. Hence ‘counting sheep’ to go to sleep. And when they grew older, they used the same tally words to count stitches in knitting. And children still use them in counting out games.


But there’s more of interest in the Welsh counting system, and some details that might reveal its real roots.

The counts through from 20 to 100 are very practically based on 20s – vigesimal. From 20 to 40 you go up again to 10 – deg ar hugain – and on to deugain (2 ugains). But, at 50, the new word is hanner cant (half hundred). So now you’ve got three lots of higher units – 20s, 50s and 100s. With those higher units, you can go a long way; you could count yourself to sleep for a fortnight.


Old Welsh vigesimal system













30 = deg ar hugain
50 = hanner cant
70 = deg a thrigain
80 = pedwar ugain
90 = deg a phedwar ugain
100 = cant (cannoedd)


But there are a few other subtle clues in the Welsh counting system to the real history of Celtic systems:

At 15, the count starts again: 15+1, 15+2, up to 20. This isn’t unusual; many languages reach the end of counting the first lot of toes, and then start out again with the second lot, ending up with ‘whole man’. What is unusual here is that the first lot of numbers (up to 10) isn’t applied to counting the teens.

But then Welsh counting also does something else a bit strange: 18 is deunaw (two nines).

In Breton, a closely related language (fleeing Brythons doing the very opposite of Dunkirk), 18 is triwec’h (three sixes).

Why? These may be fossils of an archaic system of counting in threes, not fives.

Tuesday, May 27, 2008


Filed under: Uncategorized — richardparker01 @ 6:32 am

This post was a result of a Simon Greenhill article:
“Lexomics” – Breaking the language barrier

The trouble with “lexomics”, as some of the commenters on the Nature article pointed out, is that the language evolution process is Lamarckian, not Darwinian; it’s driven, not followed.

If I was a lithping king, I could make all my thubjects lithp without too much trouble.

The other major problem is that there aren’t any fossils*. All the ancestors are hypothetical proto-languages.

If you take all the most common characteristics of an existing clade (or as many as you can find) and distill them down to the lowest common denominators, you’ll end up with a ‘proto-language’.
But you can’t be at all sure that major characteristics of the original ancestral language have not been entirely lost, or preserved in only a minority of the existing remnant languages. (Which you ignored, just because they were a minority).

Then, to trace the ‘descendants’ from this hypothetical language is absurd.

Even then, though, I hope some of the newer generation of linguists (you, Simon? – please) can use the mechanical/statistical techniques used by geneticists to resolve some major ‘language family tree’ problems, like the star-like pattern of supposed descendants from proto-Austronesian and proto-Oceanic.

*Except where we have surviving scripts. But it was pointed out a long time ago that if the Comparative Method was used, retrospectively, on the Romance languages, the resulting proto-language would NOT be Latin.

Tuesday, April 15, 2008

Borrowed or Inherited? Mistake Gives a Clue.

Filed under: Uncategorized — richardparker01 @ 9:23 am

The modern speakers of Misima, an Austronesian language right down on the New Guinea Bird’s Tail, use the following words in counting nowadays:
6 esiwa, 7 ewon, 8 epit, 9 ewata.

The same words were recorded a century ago, as
6 siwa, 7 on, 8 pit, 9 ata,
but only in counting 10s, i.e. in 60, 70, 80, 90, while the ‘lower number’ words in Misima for 6-9 were the usual hand-1, hand-2, etc., common in most An groups in that area.

I glanced at them (they’re very familiar Austronesian number words), and thought, well, that blows my theory that number words were conceived a long time later than the reconstructed proto-Austronesian
*enem, *pitu, *walu, *siwa (in that order).

Perhaps the proto-Austronesians did, after all, have a decimal system, and the more primitive systems in New Guinea, etc, really are ‘retrograde’ systems brought on by Papuan influence, which is the conventional linguistic view.

But the Misima are the only Austronesian group in that area that has these words, so I looked again a bit harder.

And suddenly realised they were using the right words OK, but in the wrong places.
The Misima use 9 for 6, 6 for 7, 7 for 8, and 8 for 9.
Nobody else, anywhere, does that.

It almost proves that these number words were borrowed, not inherited.

And it also almost proves that someone taught them how to make number words for the higher decades, but not how to use them correctly, and they still don’t use them properly.
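The displacement is regular, which a few lines of code make visible. This is a minimal sketch: the pairing of each Misima word with a proto-Austronesian value simply restates the reading given in this post (‘9 for 6, 6 for 7, 7 for 8, 8 for 9’) and is not an independent etymology, and the names MISIMA, PAN_VALUE and displacement are invented for the example.

```python
# Old Misima decade words and the value each carries in Misima.
MISIMA = {"siwa": 6, "on": 7, "pit": 8, "ata": 9}

# The proto-Austronesian value each word is matched to in this post
# (*siwa 9, *enem 6, *pitu 7, *walu 8); an assumption, not a new etymology.
PAN_VALUE = {"siwa": 9, "on": 6, "pit": 7, "ata": 8}


def displacement(word):
    """Misima value minus proto-Austronesian value, wrapped within the 6-9 cycle."""
    return (MISIMA[word] - PAN_VALUE[word]) % 4


shifts = {word: displacement(word) for word in MISIMA}
print(shifts)
# {'siwa': 1, 'on': 1, 'pit': 1, 'ata': 1}
```

On this reading every word sits exactly one place later in the 6-9 cycle than expected, with the ‘9’ word wrapping round to 6.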

Monday, April 14, 2008

Link to Austronesian Numbers Worksheet

Filed under: Uncategorized — richardparker01 @ 11:34 pm

I’ve posted the current version of my worksheet on the web at:

So far, I’ve listed some 1600 number systems in both Austronesian and Papuan languages, and analysed them as best I can, with a code system that reduces a mass of information to a manageable size.

Maisin 6 = faketi tarosi taure sese which means ‘hand over 1’ is coded 5\1 because faketi tarosi is 5, the \ stands for a regular ‘connector’ and sese means 1.

Arifama-Miniafia, another An language close by, has 6 = umat roun ta’imon where 5 = umat roun , so that’s coded 51 because there is no ‘connector’.

Another dialect of Arifama has 6 = uma ti reban taimo nomon, 5 = uma ti morob, and 5 isn’t repeated exactly in 6, while uma means hand, so this is coded H\1 (or should have been, but I made a typing error here).
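The three examples above suggest how such a code could be derived mechanically. The sketch below is hypothetical and is not the worksheet's actual procedure: the function code_numeral is invented here, and the words passed in as ‘one’ for the two Arifama varieties (ta'imon, taimo nomon) are segmentation guesses made only so the example runs. Only the Maisin sese = 1 is stated in the text.

```python
def code_numeral(form, five, one, hand=None):
    """Code a '6' form as 5\\1, 51 or H\\1; return None if it fits none of these."""
    words = form.split()
    five_words = five.split()
    one_words = one.split()

    if words[:len(five_words)] == five_words:
        # The whole word for 5 is repeated at the start of 6.
        base, rest = "5", words[len(five_words):]
    elif hand is not None and words[0] == hand:
        # Only the 'hand' word recurs, not the exact form for 5.
        base, rest = "H", words[1:]
    else:
        return None

    if rest[-len(one_words):] != one_words:
        return None
    # Anything between the base and the word for 1 counts as a connector.
    connector = rest[:len(rest) - len(one_words)]
    return base + ("\\" if connector else "") + "1"


print(code_numeral("faketi tarosi taure sese", five="faketi tarosi", one="sese"))
print(code_numeral("umat roun ta'imon", five="umat roun", one="ta'imon"))
print(code_numeral("uma ti reban taimo nomon", five="uma ti morob",
                   one="taimo nomon", hand="uma"))
```

Under those assumptions the three calls print 5\1, 51 and H\1, matching the codes given above.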

The coding has made it much easier to visualise connections between number-types in various language sub-groups (and whether they match up or not) and their distribution over larger areas.

This information (when I’ve worked out how to use Photoshop) will be transferred to geographical maps, making the picture a whole lot clearer.

Thursday, January 24, 2008

Indo-European Numbers 1-10

Filed under: Uncategorized — richardparker01 @ 11:12 pm

I’ve only just discovered that Eugenio Ramón Luján Martínez, in ‘The Indo-European system of numerals from ‘1’ to ‘10’’, makes detailed proposals on exactly how they came about, and how they were formed (the words’ etymology). This is from notes reviewing a paper by him:

The proto-Indo-European 1-10 numerals are:
*oynos/*sem *duwo: *treyes *kwetwores *penkwe *sweks *septm *okto: *newn *dekm

Indo-European ‘6’ ‘may best be explained as a loan from Semitic’, as may ‘7’.
(This is not at all unlikely; the Akkadian 6 and 7 were shishshu and sebe – RP)
‘1’ through ‘3’ were deictic in origin
‘4’ relates to the four fingers or the width of the palm,
*okto ‘8’ resolves to a dual marker (-o) and ‘4’
‘best related to Av. ašti ‘width of four fingers, palm’;
‘5’ is generally related to ‘fist’ and ‘finger’, but is also related to ‘all’;
‘10’ the I-E root underlies *deks- ‘right [hand]’; and
‘9’ is generally related to ‘new’.


Martínez concludes that achieving units for ‘1’ through ‘10’ remains far from demonstrating an original decimal system, as the grouping of ‘1’ through ‘3’ as deictic in origin, ‘4’, ‘5’, ‘8’, and ‘10’ as involving fingers or hands, and ‘9’ as ‘new’, suggests. Thus, we can see bases for at least two, and possibly four, distinct counting systems prior to the development of the decimal system.
From: Notes on: Numeral Types and Changes Worldwide.

Martinez’ full doctoral thesis on Indo-European numbers is available online but is entirely in Spanish, and 24MB in size, which I shall endeavour to read some time. It deals with Indo-European numbers from 1 to 100.

This find certainly reinforces my conviction that numerals do not come into existence by immaculate conception, but evolve from very small, simple beginnings set in place many thousands of years ago, perhaps when humans first began to speak and estimate quantities.

1 – 3 are deictic, which means they rely on context. Early on, speakers in many languages made a distinction in pronouns: I (singular), we two (dual), we three (trial) and we (more than 3 – plural), and this also extended to the very low numbers, that used the same roots. Number markers related to these were added to many different kinds of words, not just pronouns and the lower numerals.

The dual still exists in the English distinctions both vs. all, either vs. any, twice vs. x times (an archaic thrice also exists, meaning “three times”), and so on, but the dual and the trial no longer occur in our pronouns.

Those very numbers (in fact 1-4) are also the most easily subitisable; that is, you can estimate the number very quickly by sight, without counting. Most people can estimate number by sight up to 7 or 8, but this takes a bit longer.

You can also, of course, easily subitise 1 hand, 2 hands, 1 foot, 2 feet once you start ‘bunching’ numbers into groups (mostly based on counting 5 digits, and then making that 1 unit, usually related to ‘hand’). A digit, of course, was literally, a finger or toe.

But some number systems rely on just the four fingers, so you get one bunch of 4 fingers, then the next stage is 2 bunches of 4 fingers = 8.
This seems to have happened in proto-Indo-European, or in a counting system that preceded that. (See above: *okto ‘8’ resolves to a dual marker (-o) and ‘4’,
‘best related to Av. ašti ‘width of four fingers, palm’).
9 would then be the start of a new cycle, or if 10 had become a new base, it might be a completely new word (‘9’ is generally related to ‘new’).

This kind of ‘4,8 cycle’ number system occurs in isolated areas in a few Austronesian languages around New Guinea, and in Papuan number systems as well.
A more ‘advanced’ system, with a 5,10 cycle, but with ‘relicts’ of a base 4 system, is more common in Austronesian. In these cases, the ‘9’ is usually constructed something like X1.

This puzzled me for a long time, but the problem begins to clarify itself with the knowledge that proto-Indo-European is confirmed to be probably more of a messy accumulation of different counting systems than the miraculously fully-blown decimal system it appears to be.
Of all Indo-European 1-10 numeral systems, only Vedda has a system that counts 6-9 as 5+1, 5+2, etc. But there are more than 250 of those constructions in Austronesian languages, and in many quite unrelated languages, as well.

For that reason, I believe that the “proto” Austronesian numeral words *enem=6, *pitu=7, *walu=8, and *Siwa=9 appeared latest in the majority of Austronesian 1-10 systems.
*sa puluq (= 1 x puluq) has nothing to do with hands, but probably appeared before 6-9, because many systems with 5+1, 5+2 constructions use *sa puluq. Furthermore, this particular word seems to appear quite late in Austronesian languages, suggesting it was borrowed by languages that still preserved older systems in whole or in part.

Monday, January 7, 2008

Numeral Studies in Indo-European

Filed under: Uncategorized — richardparker01 @ 2:55 am

Nineteenth century laws of sound correspondence led to major advances in linguistics. Numeracy, the linguistics of numeral systems, and calculations … now represent twentieth century contributions to an understanding of the … decades. Numeral names … recall an old pre-exponential numeral system that stands between concrete counting and exponential decimal systems.

French Decades.
Seiler has characterized breaks in numeral formations as a “turning point between serializations” that mark the “semiotic status of the base”, while Hurford called attention to the point where a language changes methods for signaling addition as indicative of a base break. So the syntax of English ‘thir-teen … nine-teen’ (digit + base), in stating the smaller number first, differs from that of 21-29 (base + digit) with the smaller number suffixed to the base. Addition in one but multiplication in the other signals the teens / decades break.

Non-standard decade formations from 30 to 90 in French, trente, quarante, cinquante, soixante, septante, huitante / octante, nonante ‘thirty, forty, fifty, sixty, seventy, eighty, ninety’, are built on the strategy digit + a ten-valued suffix -(a)nte, parallel to the English forms with digit + ‘-ty’.

But despite French numerical reforms, standard French numerals for decade counting, like many Celtic systems, retain well-known breaks reminiscent of non-decimal systems. Major breaks in the standard system begin with 70 (soixante-dix, literally ‘60-10’, to soixante-dix-neuf ‘60-ten-nine’ or ‘60-nineteen’) and 80 (quatre-vingt, literally ‘four-twenty’, to quatre-vingt-dix-neuf ‘four-twenty-nineteen’).

French soixante-dix and quatre-vingt have been accounted for as the result of Celtic influence. If Celtic, as a branch of IE, has inherited the PIE decimal system, however, both IE Celtic and French should share an inherited decimal system. To the extent that soixante ‘60’ is 6 x 10, and 60 marks a base-like entity on which to build soixante-dix ‘70’ as ‘60-ten’, soixante formations recall a base value ‘60’, but numerals quatre-vingt ‘80’ (four-twenty), quatre-vingt-dix ‘90’ (four-twenty-ten) build on 20.

French Decades

Breaks in the standard French decade system reflect factors [10 and 6] operating on base units 10 and 60 as far as 79 and factors [10, 2, and 5] operating on base units 10 and 20 from 80 to 99. These numeral bases and factors are not powers of any base, but pre-exponential factors reminiscent of traditional systems of measure rather than sequential counting. Decade numerals trente to soixante ‘30-60’ are formed regularly from the digits 3-6 plus the decade suffix -(a)nte, and French 62-69 follows the strategy of addition: ‘sixty+2 …’ established with 22.

The first break begins with soixante-dix ‘60-10’ which uses 60 as base for adding 10-19 to build 70-79. But soixante itself is otherwise not the productive base that French cent (English ‘hundred’) is. There is no soixante-vingt, for example. The second break begins with the numeral quatre-vingt that, as ‘4-20’, builds on vingt ‘20’ as a base. In quatre-vingt-dix ‘4-20-10’ the addition process of 60+10 recurs.

Is French vingt part of the paradigm, trente, quarante, …, or is / was it a separate, unanalyzable base? In the system that underlies quatre-vingt, it serves as a numeral base. By a factor of 5, numeral base vingt is converted to cent ‘100’. The numeral quatre-vingt (4 vingt’s) recalls the conversion of a base 20. Phonological correspondences with Latin make it part of an older decimal paradigm, to the extent that Latin vii-gint-ii ’20’ is ‘2-10’s’. Sound correspondences relate French vingt to Latin vii-gint-ii ‘twenty’ or IE *ui-kentii (Coleman 1992:397-398 with discussion of the relation of *kent- to IE ‘ten, decade, hundred’), while subsequent decades in -(a)nte correspond to Latin *-(a)-gint-aa: quinqu-a-gint-aa, tri-ginta ‘fifty, thirty’ (Pope 1966 [1934]:127; 318). Although historically vingt is a phonological reduction from a potential ancestral ‘two decades’ (Latin vii-gint-ii ‘two gint’s), whether vi-ngt was only accentually separated from soix-ante or not), vingt and soixante have separate roles in the French system of numeration.

NUMERACY AND THE GERMANIC UPPER DECADES, by Carol F. Justus, Journal of Indo-European Studies 24, 1996, 45-80
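The two breaks Justus describes in the standard French decades can be written out as arithmetic. This is only an illustrative sketch covering 60 to 99; the function name french_structure is made up here and the output is a structural gloss, not French spelling.

```python
def french_structure(n):
    """Arithmetic structure of a standard French numeral from 60 to 99."""
    if not 60 <= n <= 99:
        raise ValueError("this sketch covers 60-99 only")
    if n < 80:
        # 61-79 add 1-19 to soixante (60).
        return f"60 + {n - 60}" if n > 60 else "60"
    # 80-99 add 0-19 to quatre-vingt (4 x 20).
    return f"4 x 20 + {n - 80}" if n > 80 else "4 x 20"


for n in (62, 70, 79, 80, 90, 99):
    print(n, "=", french_structure(n))
# 62 = 60 + 2
# 70 = 60 + 10
# 79 = 60 + 19
# 80 = 4 x 20
# 90 = 4 x 20 + 10
# 99 = 4 x 20 + 19
```

The addends run up to 19 in both stretches, which is what makes 60 and 80 behave like bases even though neither is a power of 10.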

I tried to contact Carol Justus, Director, Numerals Project at the University of Texas at Austin, to request her advice on my own study. I found that she had passed away on 1 August 2007. So I tried to contact Winfred Lehmann, Director of the Linguistics Research Center, University of Texas at Austin, but found, to my astonishment, that he also died, on the very same day.

Do Eskimos Count Like Austronesians ?

Filed under: Uncategorized — richardparker01 @ 12:12 am

If I came across the following set of numerals amongst my current chart of some 1400 Austronesian and Papuan numeral systems, I would see nothing much amiss. Their construction, and relation to body parts, are fairly typical.
1 – ata’uzik – clearly includes a cognate of Austronesian *isa, POc *sa-kai, etc
2 – ma’dro – ditto of *dusa or *rua
3 – pi’ñasun
4 – si’saman
5 – tûdlemût – ditto of *lima (hand)
6 – atautyimiñ akbinigin tudlimût – “one hand and once on the next”
– bog standard Austronesian/Papuan construction
7 – madro’niñ akbi’nigin – “twice on the next”
8 – piñas’uniñ akbi’nigin – “three times on the next”
9 – kodlinotai’la – “that which has not its ten” – not usual, but not very rare
10 – kodlin – derived from kut or kule, “the upper part” – compare *puluq
14 – akimiaxotaityuña – “I have not fifteen.”
15 – akimi’a – fifteen (a separate word)- unusual in An
20 – inyui’na – “a man completed “- bog standard An/Pap construction
25 – inyui’na tûdlimûniñ akbini’digin – “twenty and five times on the next”
30 – inyui’na kodliniñ akbini’digin – “twenty & ten times,”
35 – inyuilna akim’iaminñ aipâliñ – “twenty & one fifteen times.”
40 – madro inyui’na or madrolipi’a – “two twenties,”
100 – tûdlimûipi’a – five ‘pi’a’

These numbers, though, are spoken by Inuits in Point Barrow, in the extreme north of Alaska. Greenland Eskimos use much the same basic number words, but construct their teens and decades differently.
The original writer* points out: “The expressions in Greenlandic and other Eskimo dialects for these higher numbers are very different, which is pretty strong evidence that they have been developed since the separation of the Eskimo into their different branches.”
That is exactly what I am finding in my study of Austronesian/Papuan numerals. At each stage in the development of counting systems, certain groups in the mainstream adopted new words for numbers that they had only expressed by gesture previously, or had expressed as separately countable (and visible) ‘chunks’ like 10s or 20s. They adopted ‘consensus’ words for 10, 6-9, the teens, decades, and 100s, roughly in that order.
Some groups still lack those ‘consensus’ words. The ‘archaic’ lower numbers, from 1-5, 6-9 and 11-20 are still preserved in many languages that haven’t yet adopted the ‘consensus’ Austronesian number lexicon, and they’re mappable.
The higher numbers, like the teens, decades, hundreds, and thousands, developed, worldwide, only quite recently, and the times of their diffusions should be dateable (if only relatively, not absolutely).
So the fact that (some) Polynesians have fully developed decimal systems, including standard “An” words for 6-9, while many Melanesians in Vanuatu and New Caledonia haven’t, shows that Vanuatu and New Caledonia were first colonised a lot earlier than Polynesia, and in at least 3 separate waves, where newcomers either pushed their predecessors south, or absorbed them.
The Maori had a system based on 20s, not 10s, so that shows they left central Polynesia before the full decimal system diffused into that area. The fact that Easter Island had a full decimal system, while Maoris didn’t, shows that Easter Island was settled later than New Zealand. (Or that Maoris kept strictly to their traditions, of course. People will be human, and upset theories like this one).

Update: April 15 2008 – Since I wrote that, I’ve found that many Polynesian languages had vigesimal systems in use prior to contact with Europeans, so that many of the decimal systems apparent today are not very old at all. I certainly wouldn’t repeat again that ‘Easter Island was settled later than New Zealand’ based solely on my faulty recording of their number-systems.

This dateable number-naming development is still going on. Americans and English (until only the last decade or so) had different meanings for a ‘billion’ – America – 1000 million, England and Germany a million million. So the division is dateable (around 1600-1800 before America, isolated, developed its own meaning for the word ‘billion’), and so is the adoption of the American ‘billion’ by the English (1990-1995).
It is only since Anglo-Saxon times that the English ‘hundred’ came to mean 10×10, not a dozen 10s (12×10). ‘Beowulf’ mentions 100 warriors coming to a place, then 80 of them leaving, and 40 staying.
So the full decimal system we use now only came to England within the last 1000 years or so. It’s very possible that ‘primitive’ Austronesians adopted their identical decimal system before we did.
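The arithmetic behind the ‘Beowulf’ example can be checked in a few lines. This is only a sketch of the reasoning above: the figures 80 and 40 are the ones quoted in this post, and the variable names are mine.

```python
# The two meanings of 'hundred' discussed above.
LONG_HUNDRED = 12 * 10   # the older 'hundred': a dozen 10s
SHORT_HUNDRED = 10 * 10  # the modern decimal hundred

# Figures as quoted in this post from 'Beowulf'.
left, stayed = 80, 40
total = left + stayed

# The count only adds up if 'hundred' means 12 x 10.
print("matches 12 x 10:", total == LONG_HUNDRED)
print("matches 10 x 10:", total == SHORT_HUNDRED)
```

The warriors who left plus the warriors who stayed come to a dozen 10s, which is why the passage only makes sense with the older meaning of ‘hundred’.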
If this analysis works, it should assist in relative dating of migrations and cross-group influences to a much greater resolution than genetic or linguistic splits and mergers. (Both genetic and linguistic dates are very much estimated on the assumption that things change on a fairly regular and smooth basis. They don’t.)
*Notes on Counting and Measuring among the Eskimo of Point Barrow, John Murdoch – American Anthropologist, Vol. 3, No. 1 (Jan., 1890), pp. 37-44.

Eskimos do count like Austronesians, but I’m certainly not claiming that they are recently related. The first few number names, and the actual ways of counting up to 1 hand and beyond, and then verbalising that, are pretty similar, worldwide.

Sunday, January 6, 2008

Erromanga – Preservation or Innovation?

Filed under: Uncategorized — richardparker01 @ 2:57 am

I wandered off-topic recently to look at the Erromangan language. (Erromanga is an island about midway between the big islands of Vanuatu and the big island of New Caledonia).

Erromanga once had at least three languages (Sye, Ura, and Utaha) but suffered very heavy depredations in the 19th century by ‘black-birders’ – recruiters for plantation labour in New Caledonia, and Queensland, Australia. There was a virtual population crash, from an estimated 6000 pre-contact, to only 400 in the 1930s, and about 1300 in 1989. Ura had (in 1989) fewer than half-a-dozen speakers, all elderly, and Utaha disappeared altogether, about a century ago.

In doing so, I re-read:
The Efate-Erromango problem in Vanuatu subgrouping, John Lynch,
Oceanic Linguistics 43.2 (Dec 2004): p311(28)
Available via JSTOR.

Lynch is a classical comparativist (the expert on Southern Vanuatu) and has 28 pages of grammatics and phonology, to support his theories of grouping/sub-grouping, but precious little about the lexicons of Erromanga, except this, under the heading of ‘innovations’: –

“(e) POC *saŋapuluq, PNCV *saŋavulu ‘ten’ is replaced by PEE *rua-lima (‘two-five’): e.g., Lewo lua-lima, South Efate ralim. (The same innovation, however, is found to the immediate north of this subgroup, in Paamese h??lualim.) (9)”
and –
“(b) Erromangan languages share innovation …, the replacement of *saŋapuluq ‘ten’ by a form composed of ‘two’ and ‘five’: cf. Sye narwolem, Ura lurem ~ durem.”
i.e., the technically more advanced (multiple) word phrase has been ‘replaced’ by a less-developed construction. This could only make sense to a specialist ruled completely and solely by the limited specialist techniques and jargon of his discipline.

Lynch nearly rescues himself from this, but not quite, by saying, in a sub-note:
“It occurred to me that the replacement of a monomorphemic word for ‘ten’ with a transparent bimorphemic one may have been part of a more general simplification of numeral systems, since many SOC languages have quinary systems. However, it turns out that many widely distributed languages that do have compound numerals, based on ‘five’ for ‘six’ through ‘nine’, nevertheless retain *saŋapuluq ‘ten’.”

Much more likely, though, is that many languages borrowed a new word, sa-puluq, meaning 1 x (bunch of) 10, before they got around to changing their old constructions for 6-9.

He kept his nose so close to the phonology and grammatics that he apparently ignores some quite amazing (to me, at least) lexical innovations:

Shoulder, which has a perfectly good POc ‘ancestral’ term, *(qa)para, is ‘innovated’ in Ura as ‘nobun-lenge’ = ‘head-arm/hand’
Neck … POc *Ruqa, *liqoR … (Ura) bo-ri-na Lit. ‘X’+ na=breast
Hair … POc *raun ni qulu … (Ura) novlingen-nobu- (Sie) novlinompu … literally feathers/hair-head. This is also the literal meaning of the POc construction, but the POc word has two lines of ‘descendants’ – one used *raun, alone, and the other used *qulu, alone.
Mouth … POc *papaq, *qawa … (Ura) nobun nggivi- = lit. head-tooth
To sleep: … POc *tiRuR … (Ura) ahlei-ba = lit. to lie down-ba
Thatch/roof … POc *qatop … (Ura) nobun sungai = lit. head-house
To sew: … POc *saquit … (Ura) ehli (Sye) … etri
To stab, pierce … POc *soka … (Ura) ehli … (Sye) satri
Bite … POc *karat … (Ura) ahli … (Sye) elintvi
(This is not wildly exciting, even to an amateur linguist, as sew and stab are very obviously related in POc).

It makes one wonder if these fellows suddenly forgot their ‘inherited’ vocabulary on an isolated island (hasn’t happened elsewhere), or if someone took certain very basic words (body parts, mainly) and deliberately changed them (i.e. a genuine invention), or if the people didn’t have those ‘ancestral proto-Oceanic’ words in the first place.

The overall differences in number words and systems between Erromanga and the Vanuatu languages further north were also completely missed by Lynch in his paper, although he did propose that ‘2 hands’ was an ‘innovation’ for ‘1 x name for 10’ (leading on, perhaps, to 2×10=20, which it does in this case: Ura – lurem gelu=20, Sye – narwolem duru=20). That suggests that Ura and Sye both adopted the idea of decimal 10s before they adopted the words.

Erromangan languages (I have numeral data for Ura, Sie, and extinct Utaha) don’t even achieve the ‘consensus’ PAn names for 1-4:

1 – *PAn – *esa, *isa …. Ura – sai – OK
2 – *PAn – *duSa ….. Ura – ge-lu – OK

3 – *PAn – *telu ……. Ura – ge-he-li – is very strange, because it (should have) descended directly from its established ‘ancestor word’, *telu. Instead it appears to be a ‘linguistic innovation’ based on a 3rd person possessive, ‘ga’ and directly on a Trial *-(t,s)ali proposed for PSV – proto-Southern Vanuatu.
A similar construction is found in older relict languages in Tanna and New Caledonia … kesel, kahar, esech, seen, hejen, etc.

4 – *PAn – *Sepat …. Ura – le-me-lu (2-2) (Sie nd-vat).
Lemelu (2-2) must be dubbed a ‘linguistic innovation’, using classical Comparative Methodology. But it’s clearly not inventive, and it is exactly the same construction I’ve found in number systems that haven’t gone much beyond naming numbers up to 5 in other parts of the world. (Example: aula aula=2-2=4, in Binahara, a Trans-New-Guinea language – it even has a very similar root-word).
Naming numbers from 6-9 is very obviously a later invention or borrowing added to the first 3 to 5 number words, so the sudden appearance of ‘vat’ in Ura – sini-vat – 9 is no surprise. It almost proves that the phrases for 6-9 were real inventions in Austronesian languages, at a later date than the first 5 number names were established.

5 – *PAn – *lima …. Ura – su-o-rem (1 hand). This is also a common construction where people mark the ‘first hand’ and then go on to mark ‘hand/hands two’ or ‘hand-hand’ for 10. (Binahara – gena-aulapu = 1-5, gena-aulapu-aulapu = 1-5-5 = 10).

6 – *PAn – *enem …. Ura – mi-sai (+1)
7 – *PAn – *pitu …. Ura – sim-he-lu
8 – *PAn – *walu …. Ura – sim-he-li
9 – *PAn – *Siwa …. Ura – sini-vat
10 – *PAn – *sa-puluq …. Ura – lu-rem (2-hand)
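The ‘hand’ pattern in the list above can be sketched in code. This is only an illustration, not a linguistic analysis: the Ura forms are copied from the list, the function names are mine, and the idea that 7 and 8 follow the same ‘one hand plus a remainder’ structure as 6 and 9 is my own assumption. The sketch also deliberately ignores 4 (le-me-lu, 2-2), which is built differently.

```python
# Ura numerals 1-10, copied from the list above.
URA = {
    1: "sai", 2: "ge-lu", 3: "ge-he-li", 4: "le-me-lu", 5: "su-o-rem",
    6: "mi-sai", 7: "sim-he-lu", 8: "sim-he-li", 9: "sini-vat", 10: "lu-rem",
}


def hands_and_remainder(n):
    """Split n into whole hands (fives) and the leftover count."""
    return divmod(n, 5)


def gloss(n):
    """Hypothetical structural gloss: whole hands plus a remainder.

    Assumption: 6-9 are all 'one hand plus something', as suggested by
    mi-sai (+1) and by the 'vat' (4) inside sini-vat (9).
    """
    hands, rest = hands_and_remainder(n)
    if hands == 0:
        return str(rest)
    if rest == 0:
        return f"{hands}-hand"
    return f"{hands}-hand + {rest}"


for n in sorted(URA):
    print(n, URA[n], gloss(n))
```

On this reading 5 comes out as ‘1-hand’ and 10 as ‘2-hand’, matching su-o-rem and lu-rem, with 6-9 filling the gap between the two hands.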

Lynch’s comparative methods of sub-grouping languages enabled him to propose a settlement hypothesis for Vanuatu, as shown (right).

SOc (Southern Oceanic) groups all the languages of Vanuatu and New Caledonia.
– that split into Northern Vanuatu and:
NSO (Nuclear Southern Oceanic – all languages south of Northern Vanuatu)
– that (NSO) split into Central Vanuatu and:
– SMel (Southern Melanesian) – all languages south of Epi and Efate islands.
– SMel split into
Southern Vanuatu (*PSV) and New Caledonian

That translates into a family tree that implies that people (who now speak North Vanuatu languages) first settled North Vanuatu, and stayed there, while another lot going south, split in the middle of Vanuatu, with one lot staying put, and another lot going south, and so on.

It implies that the speakers of languages further south would be the ones that settled their territories most recently.
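To make that implication concrete, here is a rough sketch that writes the subgrouping described above as a nested structure and counts how many splits lie between the root and each group. The tuple layout and the function name are my own; only the group names and their nesting come from the description of Lynch’s tree.

```python
# The subgrouping as described above, as (name, [children]) tuples.
# A group with an empty child list is a leaf.
LYNCH_TREE = (
    "SOc", [
        ("Northern Vanuatu", []),
        ("NSO", [
            ("Central Vanuatu", []),
            ("SMel", [
                ("Southern Vanuatu", []),
                ("New Caledonian", []),
            ]),
        ]),
    ],
)


def leaf_depths(node, depth=0):
    """Return {leaf name: number of splits between the root and that leaf}."""
    name, children = node
    if not children:
        return {name: depth}
    depths = {}
    for child in children:
        depths.update(leaf_depths(child, depth + 1))
    return depths


depths = leaf_depths(LYNCH_TREE)
for name, d in sorted(depths.items(), key=lambda item: item[1]):
    print(d, name)
```

Read as a settlement sequence, the groups with the most splits behind them (Southern Vanuatu and New Caledonian) are the ones the tree treats as the most recent arrivals.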

But, to a non-comparativist (like me) it’s ‘obvious’, from the merest of glances at number systems, that the languages in the south are the oldest, and preserve their older constructions. This thinking reverses the implications of the genetic language tree produced by comparativists.

It would mean that the first major split from Southern Oceanic (SOc) would give one branch leading to the surviving New Caledonian group, with the rest continuing to evolve.
– The next split would be between ‘the rest’ and surviving Southern Melanesian.
– The very latest split would be between ‘the rest’ and surviving North Vanuatu.

Saturday, January 5, 2008

Innovations, Shminnovations (Glossary)

Filed under: Uncategorized — richardparker01 @ 5:00 pm

Comparative Method linguists seem to use trade jargon words that are often diametrically opposed to how the rest of us would use those particular terms.

Consider this:

“POC *saŋapuluq, PNCV *saŋavulu ‘ten’ is replaced by PEE *rua-lima (‘two-five’): e.g., Lewo lua-lima, South Efate ralim. (The same innovation, however, is found to the immediate north of this subgroup, in Paamese h??lualim.)”
The Efate-Erromango problem in Vanuatu subgrouping, John Lynch,
Oceanic Linguistics 43.2 (Dec 2004): p311(28) Available via JSTOR.

Anyone who has ever studied numbering systems, per se, would never describe 2×5 as a replacement for 1×10 (or 1 x group of ten). From 2×5 to 1×10 is, quite definitely, a conceptual step forward. So 2×5 shows the preservation of an older term, not an innovation.

The major problem is that comparative linguists go down the ‘Snakes’ to reconstruct a wholly imaginary proto-language, then climb up the ‘Ladders’, look back to their construction, and base their judgement of what exists in current languages on what they themselves invented.

This leads to a few more arse-about-tit linguistic jargon words:

Retention – means a word (or bit of grammar) that apparently descends directly from the imaginary proto-language
Innovation – means a word (or bit of grammar) that apparently doesn’t descend from the imaginary proto-language
Reflex, reflected – means a word (or bit of grammar) that apparently corresponds to something in the imaginary proto-language
Conservative – means a language that apparently still preserves words (or bits of grammar) from the imaginary proto-language

In each case, the historical comparative linguist is referring back to (his own) imaginary proto-language, and not, in any way, to what might, just, have preceded that proto-language before it burst, fully-formed, into the world.

Henceforth in these posts, I will try to remember (as when I quote linguists directly) to highlight these linguistic jargon words, so you realise that they often mean exactly the opposite of what you (intuitively) might think they mean.

And I will try to remember to use completely different words myself:

Preservation – means a word (or bit of grammar) that still holds over from an earlier language.
– means a word (or bit of grammar) that doesn’t descend directly from an earlier language – it’s genuinely new.
Descends from – means a word (or bit of grammar) that does descend directly from an earlier language.
– means a language that apparently still preserves words (or bits of grammar) from an earlier language.
