GLOTTOCHRONOLOGY AND ITS APPLICATION TO THE BALTO-SLAVIC LANGUAGES

Václav Blažek


BALTISTICA XLII (2) 2007, 185–210

Petra NOVOTNÁ, Václav BLAŽEK
Masaryk University

In memoriam of Sergei Starostin (March 24, 1953 – Sept. 30, 2005)

The explicit purpose of this contribution is to present a quantitative approach to the genetic classification of the Balto-Slavic languages. The implicit aim is an attempt to rehabilitate the method called ‘glottochronology’. Although the method developed by Morris Swadesh was rightfully criticized by specialists in the Indo-European languages, this does not mean that it is impossible to reconstruct the processes of divergence of related languages, including their absolute chronology. The radical modification of ‘classical glottochronology’ formulated by Sergei Starostin (1989; 1999) eliminates its most egregious mistakes and provides a tool for quite realistic estimates of absolute dates. The present article should serve as an illustration, one which is in good agreement with both archaeological data and historical facts. The last, but not least, reason for this topic is to recall the scientific heritage of Sergei Starostin, an excellent linguist and great man, who left us so unexpectedly but accomplished so much.

0. Radiocarbon method.
1. ‘Classical glottochronology’ according to Swadesh.
2. ‘Recalibrated glottochronology’ according to Starostin.
3. Lexicostatistics and glottochronology applied to Slavic languages.
4. Lexicostatistics and glottochronology applied to Baltic and Balto-Slavic languages.
5. Correlations with the extralinguistic disciplines: history and archaeology.
6. Conclusion.

0. The method called glottochronology represents an attempt to date the divergence of related languages in absolute chronology. Its author, Morris Swadesh, was inspired by another method, used for dating organic remains, the so-called radiocarbon method.
Let us repeat the main steps in the deduction of the method. It began with the discovery of the radioactive carbon isotope C14, which exists in the atmosphere in the proportion 1 : 10^12 to the usual isotope C12. Thanks to the food chain, the radioactive isotope occurs in green plants and consequently in the biological tissues of animals. After the death of any living organism, the radioactive isotope disintegrates according to an exponential function. Exponential disintegration means that after a constant time period T (the half-life, or half-time of disintegration) the concentration of the radioactive isotope falls to one half, after 2T to one quarter, etc. On the basis of this phenomenon, W. F. Libby developed the radiocarbon method (1947), which serves to determine the age of organic remains younger than 50 millennia. The method has since been defined with more precision (e.g. the change of the half-life from 5568 to 5730 years; correlation with dendrochronology, etc.), but its basic idea remains. Since M. Swadesh borrowed the mathematical apparatus from Libby, it is useful to repeat it.

(1) $\Delta N(t) = -\lambda \cdot N(t) \cdot \Delta t$ ... the decrease $\Delta N$ out of $N$ radioactive nuclei in the time interval $\Delta t$, where $\lambda$ is a constant of proportionality.

(2) $dN(t) = -\lambda \cdot N(t) \cdot dt$ ... the approximation of discrete quantities by continuous ones, allowing the integration $\int \frac{dN}{N} = -\lambda \int dt$ and leading to the solution $\ln N(t) = -\lambda t + C$.

After exponentiation we reach $N(t) = e^{-\lambda t + C} = e^{-\lambda t} \cdot e^{C}$, where $e^{C} = K$. So we can write $N(t) = K \cdot e^{-\lambda t}$. It remains to determine the value of the constant $K$. This is possible thanks to the initial condition, i.e. at the time $t = 0$, when $N(t) = N_0$:

(3) $N(t) = N_0 \cdot e^{-\lambda t}$, where $N_0$ represents the number of undisintegrated nuclei at the beginning of the process.
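The halving rule described above can be tried out numerically. The following is only a minimal sketch: it assumes the half-life of 5730 years cited in the text, the function names are hypothetical, and the sample fraction 0.25 is an illustrative value rather than a measurement.

```python
import math

HALF_LIFE_YEARS = 5730.0  # half-life of C14 as cited in the text


def remaining_fraction(age_years, half_life=HALF_LIFE_YEARS):
    """Share N(t)/N0 of undisintegrated nuclei after age_years.

    Uses the halving rule: one half after T, one quarter after 2T, etc.
    """
    return 0.5 ** (age_years / half_life)


def age_from_fraction(fraction, half_life=HALF_LIFE_YEARS):
    """Invert the decay law: count the elapsed half-lives, multiply by T."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must lie in (0, 1]")
    n_half_lives = math.log(1.0 / fraction) / math.log(2.0)
    return n_half_lives * half_life


# Illustrative values only: a specimen retaining a quarter of its C14
print(remaining_fraction(HALF_LIFE_YEARS))      # one half-life elapsed
print(remaining_fraction(2 * HALF_LIFE_YEARS))  # two half-lives elapsed
print(round(age_from_fraction(0.25)))           # age in years
```

The same two functions carry over to the linguistic analogy: the fraction plays the role of the preserved share of the word-list, and the half-life is replaced by a rate per millennium.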
From equation (3), which is the standard solution of the differential equation (2), we deduce the meaning of the half-life T (half-time of disintegration), defined as the time interval in which the number of undisintegrated nuclei decreases to one half:

(4) $N(T) = \frac{1}{2} N_0$

$\frac{1}{2} N_0 = N_0 \cdot e^{-\lambda T}$; after cancelling $N_0$, $\frac{1}{2} = e^{-\lambda T}$; after taking logarithms, $\ln \frac{1}{2} = -\lambda T$, i.e. $\ln 2 = \lambda T$, or

(5) $\lambda = \frac{\ln 2}{T}$

The half-life of the radioactive isotope C14 was empirically established as 5730 years. This allows one to determine the value of the constant of disintegration $\lambda$. For practical calculations it is helpful to use a formula derived from the definition of the half-life. If the number of undisintegrated nuclei decreases to one half after every time period T, we get:

(6) $N = N_0 \cdot \left(\frac{1}{2}\right)^{n}$, where $n$ states how many periods T correspond to the age of the specimen.

Hence $\frac{N}{N_0} = \left(\frac{1}{2}\right)^{n}$, i.e. $\frac{N_0}{N} = 2^{n}$. Taking logarithms, $\ln \frac{N_0}{N} = n \cdot \ln 2$, and we reach

(7) $n = \frac{\ln (N_0 / N)}{\ln 2}$

From here we get the age of the specimen:

(8) $t = n \cdot T$

1. Around 1950 Libby’s radiocarbon method inspired an American anthropologist and specialist in native American languages, Morris Swadesh, to extend its application to the development of languages. His goal was the absolute dating of the time of divergence of related languages. Swadesh thought that the replacement of words in languages is governed by an exponential rule similar to the disintegration of the radioactive nuclei of the isotope C14. He needed to calculate the rate of this change. For this reason he established a testing word-list, consisting first of 215 and later of 200 semantic units, which had to be universal and immune from borrowing. Thanks to the cooperation of specialists in sinology, egyptology, classical philology, Romance and Germanic linguistics, he was able to determine the average constant of disintegration per millennium as 19,5% of changes in the testing word-list, i.e.
on average 80,5% of the units of the basic lexicon in the development of one language should be preserved during this period (see Swadesh 1952), provided, naturally, that the constant is really universal. In 1955 Swadesh published a new study, reflecting the first critical reactions. He radically reduced and changed the testing word-list. The new list consisted of 100 semantic units. On the basis of the reduced ‘basic lexicon’, the constant of disintegration was changed to 14% per millennium, i.e. 86% of the lexical units should be preserved in the development of one language after one millennium. The elementary postulates may be formulated as follows:

[1] In the lexicon of every natural language it is possible to determine a part which is more stable than the others. Let us call it the basic lexicon.
[2] It is possible to define the set of meanings expressed in every language by words from the basic lexicon. Let us designate it the basic testing list (BTL). The symbol N0 will signify the number of different meanings contained in the list.
[3] The share r of the words from the basic testing list preserved after the constant period ∆t is constant; i.e. it depends only on the length of the time interval, not on a concrete language or a choice of words.
[4] All words representing the basic testing list have equal chances of being preserved during the same time interval.
[5] The probability of being preserved for any unit from the basic testing list does not depend on the probability of its being preserved in the basic testing list of another language.

To calculate the time elapsed between the existence of two languages A and B, where B is a descendant of A, Swadesh used the mathematical apparatus of the radiocarbon method. He began from equation (3):

(9) $N(t) = N_0 \cdot e^{-\lambda t}$, where $\lambda$ is the analogue of the constant of disintegration in equation (3). More exactly, it is defined as the share of the words in the basic testing list which are replaced during one millennium.
Hence:

(10) $\frac{N(t)}{N_0} = e^{-\lambda t}$, or

(11) $\ln \frac{N(t)}{N_0} = -\lambda t$, or $\lambda = -\frac{\ln c}{t}$.

From here $t = -\frac{\ln c}{\lambda}$, where $c = \frac{N(t)}{N_0}$.

If the share r from postulate [3] is also related to the period of one millennium, it will represent the constant which is complementary to $\lambda$, i.e.

(12) $r = 1 - \lambda$

For the decrease of the words from the BTL per millennium the equation $\Delta N = N_0 - N(t_1) = N_0 - N_0 \cdot e^{-\lambda \cdot 1} = N_0 (1 - e^{-\lambda})$ is valid. The same value must be reflected in the product $N_0 \cdot \lambda$. From the comparison $1 - e^{-\lambda} = \lambda = 1 - r$ (see 12) we reach

(13) $r = e^{-\lambda}$

The same result is accessible from the comparison of the right sides of the equations expressing the shares of the preserved words in the BTL per millennium: $N = N_0 \cdot e^{-\lambda \cdot 1}$ and $N = N_0 \cdot r$. Consequently it is possible to rewrite equation (10) by means of (13) in the form

(14) $c = r^{t}$, where t indicates the time in millennia.

In view of postulate [5], the share $c_2$ of the lexicon from the BTL preserved in two related languages, i.e. languages developed from a common protolanguage, equals the square of the share of the words preserved in individual development:

(15) $c_2 = (r^{t})^{2} = r^{2t}$

Taking logarithms, we express t: $\ln c_2 = \ln r^{2t} = 2t \ln r$. From here

(16) $t = \frac{\ln c_2}{2 \ln r}$

or, with respect to equation (13),

(17) $t = -\frac{\ln c_2}{2 \lambda}$,

where $c_2$ means the share of commonly inherited pairs of words in the BTL of both analyzed languages. In applications of glottochronology the formulae (16) or (17) are used most frequently. To illustrate the practical procedure, let us estimate the time of divergence of German and French. In the BTL of both languages there are 33 pairs of commonly inherited words. Both lists are complete, which means that $c_2$ = 0,33.
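The calculation prescribed by formula (16) for this German and French example can be sketched in a few lines. This is an illustrative sketch only: it assumes the retention rate r = 0,86 per millennium quoted above for the 100-item list, and the function name is hypothetical.

```python
import math

RETENTION_PER_MILLENNIUM = 0.86  # r for the 100-item list, as quoted in the text


def divergence_time_classical(c2, r=RETENTION_PER_MILLENNIUM):
    """Formula (16): t = ln(c2) / (2 * ln(r)), with t in millennia.

    c2 is the share of commonly inherited word pairs in the testing
    lists of two related languages recorded at the same time.
    """
    if not 0.0 < c2 <= 1.0:
        raise ValueError("c2 must lie in (0, 1]")
    return math.log(c2) / (2.0 * math.log(r))


# German vs. French: 33 shared pairs out of 100, i.e. c2 = 0.33
t = divergence_time_classical(0.33)
print(f"{t:.2f} millennia")
```

The result comes out at roughly 3,7 millennia, which agrees with the estimate of 3700 BP given in the text for Swadesh's method.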
Applying this to the equations (16) or (17), we reach the time of divergence in millennia:

(16) $t = \frac{\ln 0{,}33}{2 \ln 0{,}86} \approx 3{,}7$

It is more advantageous to calculate in advance a rich set of values of t with the corresponding share of preservation of the BTL for one language (c1) or for two related languages (c2); see table 1:

Table 1
c1      c2      t
0,99    0,97    0,03
0,97    0,94    0,20
0,95    0,90    0,35
0,90    0,81    0,70
0,85    0,72    1,10
0,80    0,64    1,50
0,75    0,56    1,90
0,70    0,49    2,40
0,65    0,42    2,90
0,60    0,36    3,40
0,55    0,30    4,00
0,50    0,25    4,60
0,45    0,20    5,30
0,40    0,16    6,10
0,35    0,12    7,00
0,30    0,09    8,00
0,25    0,06    9,30
0,20    0,04    10,7
0,15    0,02    13,0
0,10    0,01    15,3

The time of divergence for German and French is found in the column t, at the position corresponding to c2 = 0,33. This value lies between the times 3,40 and 4,00 millennia in table 1. Concretely, it is possible to estimate the age of the common ancestor of German and French as 3700 BP or 1700 BC according to the methodology developed by Swadesh.

The preceding steps operated only with a pair of synchronic languages. It is also necessary to solve the situation in which each of the compared languages was recorded at a different time. Let us designate as t1 and t2 the times from the disintegration of the common ancestor of the compared languages to their respective records. In this case equation (16) can be modified as $\ln c_2 = (t_1 + t_2) \ln r$, and further

(18) $t_1 + t_2 = \frac{\ln c_2}{\ln r}$

Since t1 and t2 are usually unknown and only their difference ∆t12 is at our disposal, it is possible to substitute the sum t1 + t2 by t1 + t1 + ∆t12 = 2t1 + ∆t12, where t1 is the shorter of the two intervals t1, t2. From here, for two asynchronically attested languages, the final formula appears as follows:

(19) $t_1 = \frac{1}{2} \left( \frac{\ln c_2}{\ln r} - \Delta t_{12} \right)$, where $t_1 = \min(t_1, t_2)$.

2. Swadesh’s glottochronology was welcomed by specialists studying languages without a longer literary history. On the other hand, the sharpest negative reaction came from specialists in the Indo-European languages.
This was understandable: the comparison of the glottochronological estimates with securely known facts from the history of some Indo-European languages frequently revealed a serious disagreement. More interesting than aprioristic rejection was the criticism of concrete premises, postulates and conclusions, especially when the critics offered alternative solutions. The most remarkable modifications eliminating some of the weak points of the method were formulated by the Canadian Sheila Embleton (1986) and the Russian Sergei Starostin (1989, English 1999). Both scholars agreed that the ‘classical glottochronology’ of Swadesh was mistaken in that the replacement of words was not distinguished from borrowing. An example of such an internal innovation is Russian glaz “eye”, which replaced common Slavic *oko. On the other hand, it is possible to identify a borrowing, probably of Iranian origin, in Russian sobaka “dog”, beside the less frequent pës, which reflects common Slavic *pьsъ “dog”. Starostin offered a simple solution: eliminate all borrowings before any calculation. Applying this procedure to the testing languages used for the estimation of the constant of disintegration λ, we reach a lower value of the constant and a significantly smaller dispersion (table 3). Starostin compared the proportions of the inherited lexicon in the histories of the same languages over different time depths, each converted to a rate per millennium, concretely in some Romance languages versus Vulgar Latin from the middle of the first millennium AD and versus early classical Latin from the time of Plautus, c. 200 BC.
The values of c in table 2 are calculated without loans; time is expressed in millennia:

Table 2
language    c = N(t)/N0     λ = ln c/(-t)   c = N(t)/N0     λ = ln c/(-t)
            (t = 1,5)       (t = 1,5)       (t = 2,2)       (t = 2,2)
French      88/99 = 0,89    0,07            75/97 = 0,77    0,12
Spanish     90/98 = 0,92    0,06            79/97 = 0,80    0,10
Romanian    87/96 = 0,91    0,06            76/95 = 0,80    0,10

For the differences between the results in the third and fifth columns Starostin finds only one explanation: formula (11), which implies a constant λ, is not valid. The empirical figures from table 2 confirm that the optimal approximation is the function (20). The preceding thoughts are based on the data in table 3.

Table 3
language               age t [millennia]   λ after Swadesh   λ without loans   λ* = λ / t
English                1,3                 0,14              0,10              0,08
German                 1,2                 0,08              0,05              0,04
Norwegian (riksmål)    1,0                 0,20              0,05              0,05
Icelandic              1,0                 0,06              0,06              0,06
French                 1,5                 0,09              0,07              0,05
Spanish                1,5                 0,07              0,06              0,04
Romanian               1,5                 0,09              0,06              0,04
Japanese               1,2                 0,11              0,06              0,05
Chinese                2,6                 0,10              0,10              0,04

It is apparent that the dispersion of the ‘constant of disintegration’ λ according to Swadesh is very high, from 6 to 20%. After the elimination of borrowings, the dispersion of this value for the nine analyzed languages narrows to 5–10%. The interval becomes still narrower if λ is treated as a function of time. Leaving aside the rather specific case of English, the value oscillates from 4 to 6%. These results led Starostin to the new value of the ‘constant of decrease’: λ = 0.05 per millennium. The situation of English is more complex. Its development seems to have been faster than is usual in other languages. This phenomenon is undoubtedly connected with the massive influence of Old Norse in the period 800–1100 and of Old French in the following five centuries, which according to Starostin caused certain pidgin-like features in English. But even the new value λ = 5% does not guard against the tendency to reach too recent a date of divergence, especially in the case of longer time periods.
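The effect visible in table 2 can be recomputed from the raw fractions. The sketch below is illustrative: it takes the cognate counts from table 2 and applies λ = -ln c / t; the names are hypothetical, and values recomputed from the unrounded fractions may differ in the last digit from the rounded figures printed in the table.

```python
import math

# (retained items, compared items) per language and time depth in millennia,
# copied from table 2 (loans are already excluded there)
TABLE_2 = {
    "French": {1.5: (88, 99), 2.2: (75, 97)},
    "Spanish": {1.5: (90, 98), 2.2: (79, 97)},
    "Romanian": {1.5: (87, 96), 2.2: (76, 95)},
}


def replacement_rate(retained, compared, t_millennia):
    """lambda = -ln(c) / t, where c = retained / compared."""
    c = retained / compared
    return -math.log(c) / t_millennia


rates = {
    lang: {t: replacement_rate(n, n0, t) for t, (n, n0) in depths.items()}
    for lang, depths in TABLE_2.items()
}

for lang, by_depth in rates.items():
    print(lang, {t: round(lam, 3) for t, lam in by_depth.items()})
```

For each of the three languages the rate obtained over 2,2 millennia is higher than the rate obtained over 1,5 millennia, which is the observation that leads Starostin to abandon a constant λ.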
Starostin seeks a solution in the following idea. It is empirically proven that individual words in the lexicon of every language, including the BTL, are replaced unevenly. If the words of any language were ordered from least stable to most stable, the words with the lowest stability would be replaced most quickly, while the more stable words would have a longer life. This means that the speed of change decreases over time. In sum, c is not a constant but a function of time, c = c(t), and formula (9) should be modified as follows:

(21) $N(t) = N_0 \cdot e^{-\lambda \cdot c(t) \cdot t^{2}}$ for the development of one language, where $c(t) = \frac{N(t)}{N_0}$, and

(22) $c_2(t) = e^{-2 \lambda \sqrt{c_2(t)} \cdot t^{2}}$ for the divergence of two languages developed from a common protolanguage.

From here it is possible to deduce the time of development of one language (23), or the time of divergence of two languages (24):

(23) $t = \sqrt{\frac{\ln c}{-\lambda \cdot c}}$

(24) $t = \sqrt{\frac{\ln c_2}{-2 \lambda \sqrt{c_2}}}$

The result is a transcendental function, since c = c(t). The easiest way of determining the time of divergence for empirically investigated values is offered by table 4, calculated by Sergei Starostin:

Table 4
c1      c2      t
0,99    0,97    0,3
0,97    0,94    0,8
0,95    0,90    1,0
0,90    0,81    1,5
0,85    0,72    2,0
0,80    0,64    2,4
0,75    0,56    2,8
0,70    0,49    3,2
0,65    0,42    3,7
0,60    0,36    4,1
0,55    0,30    4,7
0,50    0,25    5,3
0,45    0,20    6,0
0,40    0,16    6,8
0,35    0,12    7,8
0,30    0,09    9,0
0,25    0,06    10,7
0,20    0,04    12,7
0,15    0,02    16,6
0,10    0,01    21,5

Now it is possible to return to the question of the time of divergence between German and French. In both languages there are 3 loans in the BTL and 33 common cognates. Hence $c_2 = 33/94 = 0{,}351$. The corresponding time of divergence is c. 4 220 years. Naturally, it would be an exaggeration to conclude that two languages separated in a single concrete decade. It is better to say that their common protolanguage disintegrated in the 23rd cent. BC.

2.1. The situation of two asynchronically attested languages is solved by Starostin differently from Swadesh.
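Both the synchronic estimate above and the treatment of asynchronically attested languages rest on formulas (23) and (24), so it may help to evaluate them directly. The following is a minimal sketch under stated assumptions: the formulas are written here as t = sqrt(ln c / (-λc)) and t = sqrt(ln c2 / (-2λ·sqrt(c2))), a reading that reproduces most rows of table 4 with λ = 0,05; the function names are hypothetical.

```python
import math

LAMBDA = 0.05  # Starostin's rate per millennium, as given in the text


def time_single(c1, lam=LAMBDA):
    """Formula (23), assumed form: t = sqrt(ln(c1) / (-lam * c1)).

    c1 is the share of the testing list preserved in one language;
    the result is in millennia.
    """
    return math.sqrt(math.log(c1) / (-lam * c1))


def time_pair(c2, lam=LAMBDA):
    """Formula (24), assumed form: t = sqrt(ln(c2) / (-2 * lam * sqrt(c2))).

    c2 is the share of cognate pairs shared by two related languages.
    """
    return math.sqrt(math.log(c2) / (-2.0 * lam * math.sqrt(c2)))


# Row c1 = 0.50 / c2 = 0.25 of table 4 (the table gives t = 5,3)
print(round(time_single(0.50), 2), round(time_pair(0.25), 2))

# German vs. French: 33 cognates among 94 loan-free pairs
print(round(time_pair(33 / 94), 2))
```

With these assumptions the German and French figure comes out at about 4,2 millennia, close to the c. 4 220 years quoted in the text; small differences of this size can arise from rounding or from interpolation in the table.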
Starostin’s strategy consists in projecting the historical data to the present level; only after this synchronization is the same approach applied to them as to living languages. It is useful to demonstrate this procedure on concrete idioms, e.g. the classical Latin of Caesar (1st cent. BC) and the Gothic of Wulfila’s translation of the New Testament (4th cent. AD). The Latin corpus (i.e. the 100-word-list) is complete, while in the Gothic list 18 units are missing (if Crimean Gothic ada “egg” is included). This means that there are 82 common semantic pairs from the BTL, and among them 39 cognates, i.e. etymologically related forms inherited from a common protolanguage. The proportion 39/82 means 47,6%. A language recorded a time interval ∆t ago would by the present preserve only the share c of its words from the BTL. For Latin, recorded 20.5 centuries ago, it is c. 0.842. If Gothic had existed till the present time, the share of the preserved BTL in its hypothetical descendant would be 0.892 (see table 4). The common heritage of Latin and Gothic projected into the present would amount to cLG · cL · cG = 0.476 · 0.842 · 0.892 = 0.357, i.e. 35,7% common words. Let us recall that the comparison of German and French gave the share 0.351. This means that the dating of the divergence of the representatives of the modern Germanic and Romance languages is practically the same as the dating of the divergence of Latin and Gothic, the 23rd cent. BC. This seems natural, but for ‘classical glottochronology’ it was an unattainable goal.

3. Quantitative methods such as lexicostatistics or glottochronology have been applied to the Slavic languages by various scholars. Let us begin with the attempts based on Swadesh’s standard variant.

3.1.1. One of the most detailed attempts to apply ‘classical glottochronology’ to the Slavic languages comes from the Czech slavicists A. Lamprecht & M. Čejka (1963) and from Čejka alone (1972).
In his study from 1972 Čejka compiled the 100-word-lists for 12 living languages. His results are collected in table 5 (the figures are %):

Table 5
        Bul.  Mac.  SC.   Sln.  Slk.  Cz.   ULus. LLus. Pol.  Blr.  Ukr.
Mac.    86
SC.     80    84
Sln.    76    75    85
Slk.    75    76    80    80
Cz.     74    75    79    84    92
ULus.   73    76    77    78    86    87
LLus.   71    73    74    78    87    87    94
Pol.    74    71    75    79    85    81    80    83
Blr.    77    74    77    76    80    77    78    78    80
Ukr.    72    71    73    71    76    73    74    74    76    92
Rus.    74    70    71    74    74    74    74    73    77    86    86

The following step consists in the determination of the closest pairs or groups of languages. The pairs (or triads etc.) with the highest grade of relationship will serve as the base of comparison, leading to the deeper past. The order of the first closest pairs is: ULus. + LLus. (= Lus.) 94%, Cz. + Slk. (= Czsl.) 92%, Blr. + Ukr. 92%, Rus. + [Blr. + Ukr.] (= ESl.) 86%, Bul. + Mac. 86%, SC. + Sln. 85%.

Table 6
            Bul.+Mac.   SC.+Sln.   Czsl.   Lus.   Pol.
SC.+Sln.    78.8
Czsl.       75.0        80.8
Lus.        73.3        76.8       86.8
Pol.        72.5        77.0       83.0    81.5
ESl.        73.0        73.7       75.7    75.2   77.7

It is apparent that the West Slavic languages form a branch consisting of Polish and the compact unit of Lusatian and Czech-Slovak, considering the high score of 86.75% between the latter subgroups. Slovenian is in a special position between Serbo-Croatian (85%) and Czech (84%). Naturally, it is not possible to separate Czech and Slovak. That is why it is necessary to evaluate the Czech-Slovenian relation from the Czech-Slovak perspective. The average of the Czech-Slovak vs. Slovenian scores is 82%, which is less than the 85% for Slovenian vs. Serbo-Croatian, still less than the average for all 5 West Slavic languages (86.2%), and even less than the average of the lowest scores within West Slavic, Polish vs. Lusatian and Polish vs. Czech-Slovak, namely (83.0 + 81.5)%/2 = 82.3%. And so it is necessary to accept the traditional affiliation of Slovenian with Serbo-Croatian, although the position of Slovenian is more or less transitional.
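The group scores in table 6 follow from table 5 by averaging all pairwise scores between the members of two groups, as the figure 86.75% for Lusatian vs. Czech-Slovak shows. A minimal sketch of that averaging step is given below; it uses only the West Slavic entries as read from table 5, and the data structure and function name are hypothetical.

```python
from itertools import product
from statistics import mean

# Pairwise shares of cognates (%) as read from table 5, West Slavic subset
PAIR_SCORES = {
    frozenset({"Cz", "Slk"}): 92,
    frozenset({"ULus", "LLus"}): 94,
    frozenset({"ULus", "Slk"}): 86,
    frozenset({"ULus", "Cz"}): 87,
    frozenset({"LLus", "Slk"}): 87,
    frozenset({"LLus", "Cz"}): 87,
    frozenset({"Pol", "Slk"}): 85,
    frozenset({"Pol", "Cz"}): 81,
    frozenset({"Pol", "ULus"}): 80,
    frozenset({"Pol", "LLus"}): 83,
}


def group_score(group_a, group_b, scores=PAIR_SCORES):
    """Average of all pairwise scores between two disjoint language groups."""
    return mean(scores[frozenset({a, b})] for a, b in product(group_a, group_b))


CZSL = ("Cz", "Slk")
LUS = ("ULus", "LLus")

print(group_score(LUS, CZSL))      # Lusatian vs. Czech-Slovak
print(group_score(("Pol",), CZSL))  # Polish vs. Czech-Slovak
print(group_score(("Pol",), LUS))   # Polish vs. Lusatian
```

Keying the scores by unordered pairs avoids storing both halves of the symmetric matrix; the three calls correspond to the table 6 entries 86.8, 83.0 and 81.5.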
Interesting are the almost equal common proportions of cognates between West Slavic & Slovenian-Serbo-Croatian (78.4%) and Slovenian-Serbo-Croatian & Bulgar-Macedonian (78.8%), indicating a common Southwest Slavic dialect continuum, although the result 73.8% for the West Slavic branch and Bulgar-Macedonian is lower than the average score 75.9% for West and East Slavic and very close to 73.1% between South and East Slavic. This lowest result and the common arithmetic average 74.6% between East and Southwest Slavic define the period of the disintegration for all Slavic languages. Čejka’s results may be depicted by the following tree-diagram (Čejka did not present any diagram of this type, but his data became a source for the diagram created by G i r d e n i s, M a ž i u l i s 1994, 11; the model of divergence presented here is based on the preceding discussion): Diagram 1 74 76 78 80 82 84 86 86% 73.1% - 74.6% 88 90 94% 78.4% 86.8% 76.1% 85% 86% 92% 94 Russian Ukrainian Belarusian 92% 82.3% 78.8% 92 Polish Lower Lusatian Upper Lusatian Slovak Czech Slovenian Serbo-Croatian Macedonian Bulgarian 195 3.1.2. Another scholar who tried to apply ‘classical glottochronology’ to the Slavic languages, was the German J. Vollmer. His results were published by Johann T i s c h l e r in his monograph Glottochronologie und Lexikostatistik (Innsbruck 1973, 133). Vollmer compared 6 modern Slavic languages, plus Old Church Slavonic (his word-lists were not published): Ta b l e 7 OCSl. Bul. SC. Slk. Cz. Pol. Bul. SC. Slk. Cz. Pol. Rus. 
75 81 80 81 78 80 81 81 74 72 74 82 77 77 77 86 81 79 86 76 74 Abstracting from Old Church Slavonic as an extinct literary language, Vollmer’s results can be depicted as follows: Diagram 2 74 76 78 80 82 84 86 88 90 92 94 Russian 86% 75.5% - 76.5% Czech 83.5% 86% 77.2% Polish Slovenian Serbo-Croatian 81% Bulgarian It is apparent that the topology of the diagram based on Vollmer’s data is in principle in good agreement with Čejka’ results, perhaps only the equality of Czech-Slovak and Czech-Polish is rather surprising. But both models, translated into the absolute chronology according to Swadesh’s scenario, give, too young and thus ahistorical results: Čejka (74±1)%, i.e. AD 1000, Vollmer (75±0.5)%, i.e. AD 1050 as the date of disintegration of the Slavic languages. 3.2. Let us compare the results based on ‘classical glottochronology’ with the results reached by applying the recalibrated glottochronology: 196 3.2.1. The first model was developed directly by Sergei Starostin with his team. We are grateful him for unpublished data from his database. Ta b l e 8 Mac. SC. Sln. Slk. Cz. ULus. LLus. Bul. Mac. SC. Sln. Slk. Cz. ULus. ULus. Plb. Pol. Blr. Ukr. 90 Plb. Pol. Blr. Ukr. Rus. 88 84 82 81 75 75 77 80 82 76 80 90 83 79 82 79 79 83 81 84 78 81 93 89 89 83 82 88 86 88 82 84 87 90 82 81 88 86 85 79 85 91 85 87 85 90 91 85 83 89 88 88 88 87 80 82 96 89 85 86 78 80 90 89 86 79 80 87 86 81 83 90 85 85 97 92 88 D i a g r a m 3. Classification of the Slavic languages after S. Starostin (presented in Santa Fe, NM, USA, March 2004) 0 200 400 600 800 1000 1200 Russian Ukrainian East Slavic 800 1390 270 1300 130 Belarusian Polabian Upper Lusatian 840 420 West Slavic 1400 Lower Lusatian Polish 780 Slovak 960 Czech 670 Slovenian 1080 Serbo-Croatian Macedonian South Slavic 1000 Bulgarian 197 The present tree-diagram was generated by a computer program prepared by Sergei Starostin in the late 1980s. 
A preliminary version of this model was published in Starostin’s article Methodology of Long-Range Comparison, which first appeared in the volume V. Shevoroshkin (ed.), Nostratic, Dene-Caucasian, Austric and Amerind, Bochum 1992, 78, and was later reproduced in the volume V. Shevoroshkin, P.J. Sidwell (eds.), Historical Linguistics & Lexicostatistics, Melbourne 1999, 65. The first version of the diagram still operated with the trichotomy opposing the East, West and South branches, but the latter without Slovenian and Serbo-Croatian, which were classified together with the West branch.

3.2.2. The second model based on ‘recalibrated glottochronology’ was prepared by the authors of the present study (Novotná 2004; Novotná, Blažek 2005). The word-lists cover 15 modern idioms, plus Polabian and Old Church Slavonic. In contrast to Starostin, our calculation was carried out ‘manually’, not via any computer program, but in agreement with the rules formulated by Starostin. The only methodological difference from Starostin consists in the systematic inclusion of synonyms. Swadesh postulated choosing only the so-called ‘main’ synonyms, the most frequent equivalents of concrete semantic units. But if there are several synonyms and some of them are related, the degree of mutual genetic relationship is higher. And so it is not correct to eliminate synonyms. That is why we operate with 100 semantic units, while the number of lexical units is usually higher. From personal communication we know that Starostin also operated with synonyms, but not systematically. He also did not explain how to calculate with them. Our strategy is based on the standard list of 100 semantic units chosen by Swadesh in 1955. The number of semantically identical and unborrowed units attested in both compared languages, i.e. N0, corresponds to 100%. The numerator in our proportion is represented by the number of all cognates, including synonyms.
Our results are summarized in table 9: 198 Blr. Ukr. Rus. Cz. 96/ 99 0.970 85/ 99 0.859 83/ 99 0.838 86.5/ 98 0.883 90.5/ 99 0.914 90/ 99 0.909 97/ 99 0.980 Pol. Slk. 94/ 99 0.949 85/ 99 0.859 83/ 99 0.838 86.5/ 98 0.883 89.5/ 99 0.904 88/ 99 0.889 Kaš. Sln. 90.5 100 0.905 89/ 100 0.890 88.5/ 100 0.885 95.5/ 99 0.965 98.5/ 100 0.985 Plb. Cr. 92/ 100 0.920 92.5/ 100 0.925 92/ 100 0.920 99/ 99 1.000 LLus. Srb. Mac. 90/ 88/ 90/ 100 100 99 0.900 0.880 0.909 96/ 91.5/ 100 99 0.960 0.924 91/ 99 0.919 ULus. Ukr. Blr. Pol. Kaš. Plb. LLus. ULus. Cz. Slk. Sln. Cr. Srb. Mac. Bul. OCSl. Bul. Ta b l e 9 92/ 99 0.909 86/ 99 0.869 84/ 99 0.848 86.5/ 98 0.883 89.5/ 99 0.904 88/ 99 0.889 93/ 99 0.939 92/ 99 0.929 90/ 99 0.909 85/ 99 0.859 84/ 99 0.848 87/ 98 0.888 90/ 99 0.909 88.5/ 99 0.894 92/ 99 0.929 91/ 99 0.919 98/ 99 0.990 77/ 88 0.875 70/ 88 0.795 71/ 88 0.807 74/ 87 0.851 78/ 88 0.886 76.5/ 88 0.869 74/ 88 0.841 76/ 88 0.864 77/ 88 0.875 77/ 88 0.875 85/ 97 0.876 81/ 97 0.835 78/ 97 0.804 82.5/ 96 0.859 84.5/ 97 0.871 84.9/ 97 0.866 85/ 96 0.885 84/ 96 0.875 86/ 96 0.896 86/ 96 0.896 76/ 87 0.874 88/ 99 0.889 83/ 99 0.838 81/ 99 0.818 82.5/ 98 0.842 85.5/ 99 0.864 86/ 99 0.869 89.5/ 98 0.913 89.5/ 98 0.913 89.5/ 98 0.913 91.5/ 98 0.934 74/ 87 0.851 96/ 96 1.000 81/ 97 0.835 81/ 97 0.835 81/ 97 0.835 82/ 96 0.854 85/ 97 0.876 83.5/ 97 0.861 85/ 96 0.885 85/ 96 0.885 86/ 96 0.896 85/ 96 0.885 70/ 86 0.814 80/ 94 0.851 84/ 96 0.875 81/ 99 0.818 79/ 99 0.798 79/ 99 0.798 80/ 98 0.816 83/ 99 0.838 81.5/ 99 0.823 84/ 98 0.857 83/ 98 0.847 84/ 98 0.857 83/ 98 0.847 71/ 88 0.807 80/ 96 0.833 84/ 98 0.857 96/ 97 0.990 85/ 100 0.850 83/ 100 0.830 83/ 100 0.830 85.5/ 99 0.864 88.5/ 100 0.885 86/ 100 0.870 86/ 99 0.869 86/ 99 0.869 88/ 99 0889 87/ 99 0.879 74/ 88 0.841 82/ 97 0.845 84/ 99 0.848 93/ 97 0.959 91/ 99 0.919 199 In the following steps we will abstract from Old Church Slavonic as an old literary (and rather artificial) language with an incomplete lexical corpus (the same 
may be said about Polabian; for this reason its results are rather problematic). The unexpectable share 93.2% connecting Old Church Slavonic with Czech requires a special explanation which is not a subject of the present study. Let us order the languages in groups, usually in pairs, according to languages with the closest relationship: Srb.-Cr. (= SC.) and Kaš.-Pol. agree 100%; regarding the different distribution of synonyms, they will be taken into account separately. Further ULus.-LLus. (= Lus.) 99%, Blr.-Ukr. 99%, SC.-Sln. 98%,.Cz.-Slk. 97%, Bul.-Mac. 95%. The comparison of Russian vs. Belarusian & Ukrainian gives 92.9%, indicating the East Slavic (= ESl.) unit. The results of the comparison between these groups are summarized in table 10. T a b l e 10 Bul.-Mak. SC.-Sln. Cz.-Slk. Lus. Plb. Kaš.-Pol. SC.-Sln. Cz.-Slk. Lus. Plb. Kaš.-Pol. ESl. 92.0 86.9 90.4 86.9 89.2 91.4 80.7 86.0 85.4 88.0 84.2 86.9 90.0 92.5 85.6 82.8 83.3 85.3 86.4 82.3 85.2 The East Slavic unit was already defined. It is apparent that the South Slavic unit with the average score 92.0% in the BTL exists too. It is more than 89.2% between SC.-Sln. and Cz.-Slk. For the existence of the West Slavic (= WSl.) unit there are also the arguments: 91.3% without Polabian, 89.6% including Polabian. The final step is the comparison of the South, West and East branches of Slavic, in t a b l e 11a without Polabian, in table 11b with Polabian: T a b l e 11a SSl. WSl. T a b l e 11b WSl. ESl. 87.4 83.1 85.7 SSl. WSl. WSl ESl. 87.0 83.1 85.2 This means that the traditional trichotomic classification of the Slavic languages should be corrected. In contrary to the usual three equidistant units it is necessary to introduce a hierarchic model with a sequention of 200 two dichotomies. The first division separated the ancestors of the East and Southwest Slavic dialects, the second division separated West and South Slavic. The average of all scores gives the result 85.7% without Polabian and 85.5% with Polabian. 
The dating of the disintegration of the Slavic dialect continuum should be defined by the value of the lowest result 83.1%, reached for South and East Slavic. Translated into absolute chronology (see table 4 calculated by Starostin), it is possible to date the disintegration of the Slavic languages to AD 520. The West and South Slavic languages were separated in the middle of the 8th cent., West Slavic began its disintegration in the end of the 9th cent. and during 10th cent., South Slavic in the beginning of the 11th cent. and East Slavic around 1070. The position of Polabian is between Lusatian (88.0%), Czech (87.8%) and Polish-Kašubian (85.6%). Remarkable is the low score between Polabian and Slovak (83.0%) in comparison with Czech, and the high score between Polabian and Slovenian-Serbo-Croatian (86.0%). The mutual relations are depicted in diagram 4: Diagram 4 81 83 85 87 89 91 93 95 1070 97 99% 1630 AD 520 1020 1630 900 1390 750 1390 1020 1220 Russian Ukrainian Belarusian Polish Kašubian Polabian Lower Lusatian Upper Lusatian Slovak Czech Slovenian Serbo-Croatian Macedonian Bulgarian The chronology of the following divergencies is difficult, regarding the phenomenon of ‘dialect’ chain. This chain appears, if we order the closest idioms in the direct neighbourhood: 201 LLus. Plb. Ukr. 99| |88.5 |99 Bul.-95-Mac.-94-Cr.-98-Sln.-92-Cz.-92-ULus.-93-Pol.-88.5-Blr.-94-Rus. |97 Slk. The scheme is more linear, if the common units Serbo-Croatian, Czech-Slovak, Lusatian and Belarusian-Ukrainian are taken in account (Polabian was left aside for its incomplete lexicon). Bul.-95-Mac.-93-SC.-97-Sln.-91-Cz.+Slk.-91.5-Lus.-92.5-Pol.+Kaš.-86Blr.+Ukr.-93-Rus. Only in two cases do the figures fall under 90%. It is symptomatic that the lowest values indicate the limits between the south and west branches (91%) and west and east branches (86%). This means that this alternative approach gives the same results as the preceding steps, i.e. 
the divergence of the Slavic languages can be described as a sequence of two dichotomies: (1) east vs. southwest (6th cent.); (2) south vs. west (middle of the 8th cent.).

4. According to tradition, the Baltic languages are divided into a western part, represented by Old Prussian, extinct since c. 1700, and an eastern part, represented by the living languages Lithuanian and Latvian. But Baltic dialectology was much more complex a millennium ago. The following model was proposed by V. Mažiulis (1981):

Diagram 5
Baltic
  North periphery: Zemgalian, Selian, Couronian
  Central: Latvian, Lithuanian
  South periphery: Yatvingian, Prussian, Galindian

4.1. The first serious application of lexicostatistics (with a 140-word list, reduced because of the limited Prussian lexicon) was made by Lanszweert (1984, xxxii–xxxvii), who found 63.6% for Lithuanian vs. Latvian, 58.6% for Prussian vs. Lithuanian and 55.2% for Prussian vs. Latvian:

Diagram 6 [tree against a scale of 50–70%: East Baltic (Latvian, Lithuanian) 63.6%; Baltic (East Baltic and West Baltic = Prussian) ø 56.9%]

4.2. The results of Girdenis, Mažiulis (1994, 9) are rather different:

Table 12
             Latvian   Prussian
Lithuanian   68        53.6 / 49.0*
Latvian                44.3

Note: The figure 49.0% is a result of the correction 0.490 = 0.536 · 0.915, where the latter coefficient expresses the age of 600 years of most of the Prussian records.

The study of Girdenis & Mažiulis is also valuable for the individual comparison of Lithuanian, Latvian and Prussian with 12 Slavic languages:

Table 13
      Bul.  Mac.  SC.   Sln.  Slk.  Cz.   ULus. LLus. Pol.  Blr.  Ukr.  Rus.
Li.   46    45    44    44    46    44    45    46    43    47    47    47
La.   42    41    41    40    42    41    45    43    40    44    40    45
Pr.   49!   39    41    40    42    42    42    42    39    40    41    41

Note: The figure 49% between Bulgarian and Prussian is apparently mistaken; probably it should be 39%.

Using their own data for the Baltic languages and Čejka’s data for the Slavic languages and applying ‘classical glottochronology’, Girdenis, Mažiulis (1994, 11) proposed the scheme:

Diagram 7 [tree against a time scale from -1000 to 1000: Balto-Slavic -910±340; Baltic -530±170; East Baltic (Lithuanian, Latvian) 700; West Baltic (Prussian); Slavic 1000]

4.3. Starostin (Workshop “Quantitative methods in Classification of Languages and Human Populations”; Santa Fe, NM, 2004, and p.c., June 2005) dated the separation of Lithuanian and Latvian to 80 BC, Lithuanian and the ‘Dialect of Narew’ to 30 BC, Latvian and the ‘Dialect of Narew’ to 230 BC. The position of Prussian in his calculations is rather strange: it is supposed to be closer to Slavic than to Baltic. The disintegration of the Balto-Slavic unity was dated to 1210 BC.

4.4. Our results were reached on the basis of the lexical data compiled in Appendix 1. Table 14 summarizes the mutual scores between the Baltic languages, table 15 those between the Baltic and Slavic languages:

Table 14
language / %   Lithuanian   Latvian   Prussian
Latvian        84.8
Prussian       62.0         55.2
‘Narewian’     76.5         76.1      43.0

Table 15
%     Bul.  Mac.  Srb.  Cr.   Sln.  Slk.  Cz.   ULus. LLus. Plb.  Kaš.  Pol.  Blr.  Ukr.  Rus.
Li.   49.0  48.0  48.5  49.0  48.0  51.5  51.5  50.5  48.5  47.7  48.5  49.5  50.5  49.5  50.0
La.   43.4  43.4  43.9  44.4  45.4  44.9  45.9  44.9  42.8  43.7  43.8  43.9  43.8  42.9  43.4
Pr.   49.4  48.3  49.9  49.4  48.3  50.4  52.5  50.4  48.3  47.4  48.9  48.9  46.7  46.7  46.2
Nar.  44.0  44.0  44.9  45.9  48.8  45.0  46.6  44.7  43.1  43.0  48.8  45.9  42.1  42.1  42.1

Table 16 demonstrates the average scores between South, West, East & all Slavic and the individual and all Baltic languages:

Table 16
             South Slavic   West Slavic    East Slavic    all Slavic
Lithuanian   48.5           49.7           50.0           49.4
Latvian      44.1           44.3           43.4           44.1
Prussian     49.0           49.5           46.5           48.7
‘Narewian’   45.5           45.3           42.1           44.7
all Baltic   46.8 / 47.2*   47.2 / 47.8*   45.5 / 46.6*   46.7 / 47.4*

Note: *Without ‘Narewian’.

Applying the ‘recalibrated glottochronology’ and including a calculation of synonyms, we reach diagram 8:

Diagram 8 [tree against a time scale from -1400 to +600: Latvian and Lithuanian 84.8%, +600; ‘Dialect of Narew’ 76.3%, +190; Prussian 56%* / 58%, -830* / -730; Common Slavic 46.7%* / 47.4%, -1400* / -1340]

4.4.1. The double result 58/56% for Prussian vs. the other Baltic languages reflects the calculation without / with the ‘Dialect of Narew’ (Pogańske gwary z Narewu; see Zinkevičius 1984). The score of 43% between Prussian and the ‘Dialect of Narew’, in comparison with 62% and 55.2% for Prussian vs. Lithuanian and Prussian vs. Latvian respectively, excludes the identification of the ‘Dialect of Narew’ with the historical Yatvingians, known from the Middle Ages, if their language is to be connected with the other Baltic idioms of the southern periphery, including Prussian. In view of this big difference, it seems better to accept the explanation of Schmid (1986), who identified in the ‘Dialect of Narew’ a strong influence of Northeast Yiddish, spoken in the big cities of Lithuania and Latvia, hence a hybrid East Baltic-German idiom. For the relatively big difference between the Prussian-Lithuanian and Prussian-Latvian scores, viz. 62.0% vs. 55.2% respectively, there are at least two explanations: (i) The mutual influence between Prussian and Lithuanian, caused by their geographical proximity. (ii) The areal influence of Balto-Fennic or East Slavic on Latvian.
In the analyzed 100-word list, there is only one apparent borrowing of East Slavic origin in Latvian, viz. cilvēks “person, human being”, and nothing from Balto-Fennic. This single item plays a minimal role. That is why it is necessary to admit a stronger role of mutual influence between Prussian and Lithuanian. For this reason, the separation of the central dialect, the ancestor of Lithuanian & Latvian, and the southern dialect, the ancestor of Prussian, should be closer to the result indicated by the score between Prussian & Latvian, i.e. 55.2%, reflecting 920 BC as the date of divergence, with a correction for the age of the Prussian language fragments (the coefficient 0.985 corresponding to the date c. 1400).

5. We have compared four attempts to apply glottochronology to the Slavic languages. All agree in the conclusion that the most divergent groups are East Slavic and Bulgarian-Macedonian. In three cases East Slavic is identified as the first separated branch; only Starostin saw Bulgarian-Macedonian in this role. Applying ‘classical glottochronology’, Čejka and Vollmer reached very recent dates for the divergence of Common Slavic, c. AD 1000 (similarly Fodor; it was in fact his main objection against the method). Starostin’s dating to AD 130 represents the opposite extreme. In the absence of any reference in the historical documents, it is necessary to use indirect evidence to verify it. A counter-argument may be sought in the stratum of archaic Germanic borrowings in Common Slavic, which have been ascribed to the Goths (cf. Kiparsky 1934, 192f). The most intensive contact was probably realized from the middle of the 4th cent., when the Slavs were integrated into the tribal union formed by the Gothic king Ermanaric, as described by the Gothic historian Jordanes, writing in the middle of the 6th cent. (Get. §119: Post Herulorum cede item Hermanaricus in Venethos arma commovit, qui, quamvis armis despecti, sed numerositate pollentes, primum resistere conabantur.
Sed nihil valet multitudo inbellium, praesertim ubi et deus permittit et multitudo armata advenerit. Nam hi, ut in initio expositionis vel catalogo gentium dicere coepimus, ab una stirpe exorti, tria nunc nomina ediderunt, id est Venethi, Antes, Sclaveni; qui quamvis nunc, ita facientibus peccatis nostris, ubique deseviunt, tamen tunc omnes Hermanarici imperiis servierunt). Elsewhere Jordanes informs us about the Slavic settlement of the first half of the 6th cent.: Introrsus illis Dacia est, ad coronae speciem arduis Alpibus emunita, iuxta quorum sinistrum latus, qui in aquilone vergit, ab ortu Vistulae fluminis per immensa spatia Venetharum natio populosa considet. Quorum nomina licet per varias familias et loca mutentur, principaliter tamen Sclaveni et Antes nominantur. Sclaveni a civitate Novitunense et lacu qui appellatur Mursiano usque ad Danastrum et in boream Viscla tenus commorantur: hi paludes silvasque pro civitatibus habent. Antes vero, qui sunt eorum fortissimi, qua Ponticum mare curvatur, a Danastro extenduntur usque ad Danaprum, quae flumina multis mansionibus ad invicem absunt (Get. §§34–35).

From both passages it is apparent that Jordanes recognized three ethnonyms relating to the Slavs: Venethi, Antes, Sclaveni. They cannot all be synonyms, since only the Antes are localized between the rivers Dniestr and Dniepr. The Venethi must have lived left (i.e. west?) of the northern branch of the Carpathian Mountains (Alpes) and the source of the Vistula river. And the territory inhabited by the Sclaveni was defined by the city Novietunense, the Mursian lake and the rivers Vistula/Viscla and Danaster, i.e. Dniestr [§35]. This means that the territory of the Venethi was a part of the territory of the Sclaveni, complementary to the Antes. It is almost generally accepted that the Antes represented the ancestors of the East Slavs (e.g. Niederle 1953, 145–47). It would imply the equation Venethi / Sclaveni = non-Antes.
Briefly, the opposition Antes : non-Antes probably reflects the dichotomy East Slavic vs. Southwest Slavic. Jordanes’ contemporary, the Byzantine historian Procopius of Caesarea, in his work ΥΠΕΡ ΤΩΝ ΠΟΛΕΜΩΝ ΛΟΓΟΙ differentiated only Σκλαβηνοί and Ἄνται: Χρόνῳ δὲ ὕστερον Ἄνται καὶ Σκλαβηνοὶ διάφοροι ἀλλήλοις γενόμενοι ἐς χεῖρας ἦλθον, ἔνθα δὴ τοῖς Ἄνταις ἡσσηθῆναι τῶν ἐναντίων τετύχηκεν. But he was sure that they still used the same language: ἔστι δὲ καὶ μία ἑκατέροις φωνὴ ἀτεχνῶς βάρβαρος (III, 14). The separation of the Antes = East Slavs can thus be interpreted as the result of the disintegration of the Common Slavic ethnic and dialect continuum.

5.2. The first archaeological culture for which a direct development to the historical Slavs was proposed is Trzciniec-Komarov, localized from Silesia to Central Ukraine and dated to the period 1500–1200 BC (Gimbutas 1963, 61; Rybakov 1978, 182–96; Sedov 1979, 16; EIEC 338, 605–06; EIEC 526). This archaeological dating agrees with our glottochronological estimate of the disintegration of the Baltic and Slavic languages, c. 1400 BC. The separation of the ancestors of the Lithuanians & Latvians and the Prussians, dated to the 9th–8th cent. BC or, better, already to the 10th cent. BC (see above), correlates with the dating of the differences in the burial rites: after c. 1000 BC cremation was preferred in the Southwest Baltic area, while inhumation burials continued in the East Baltic region (Kilian 1982, 47; EIEC 50). The reflex of the Slavic-Gothic symbiosis, indicated by the stratum of East Germanic loanwords in Common Slavic, may be associated with at least one of the following cultures: Przeworsk, from the territory of the upper Vistula, San and upper Dniestr, flourishing in the 2nd–4th cent. AD; Zarubincy, from the basin of the upper Dniepr, dating from the 2nd cent. BC to the 2nd cent. AD; Černjaxovo, known from the basins of the middle and lower Dniestr and Dniepr from the 2nd–5th cent.
AD (EIEC 104–05, 470, 657; EIEC 526). The historically described Slavic expansion, with its centre of gravity in the 6th cent., corresponds to the Prague & Penkov cultures. The Prague culture expanded in western Slavia (eastern Germany, Poland, the Czech and Slovak Republics, Hungary, Romania, northwest Ukraine), the Penkov culture in eastern Slavia (southern Ukraine, Moldova and Romania). The Penkov culture has been identified with the Antes (EIEC 416, 448; EIEC 526).

6. Summing up, it is possible to reconstruct the prehistory and early history of the Balto-Slavic dialect continuum in time as follows:

15th/14th cent. BC – crystallization of the proto-Slavs on the southern periphery of the proto-Baltic continuum, localized from Silesia to Central Ukraine (Trzciniec-Komarov culture). Let us compare the glottochronological estimates of the dates of divergence for some of the other Indo-European branches: Indo-Iranian – 2000 BC, Celtic – 1000 BC (Starostin; our date 1100 BC is very close), Germanic – 1st cent. BC, Tocharian – 1st cent. BC (see Appendix 2). These results represent unambiguous evidence for Balto-Slavic unity.

10th/8th cent. BC – separation of the southwest Baltic dialect, the ancestor of Prussian, from the central Baltic dialect, the ancestor of Lithuanian and Latvian. The corresponding ancient communities differed in burial rites, namely cremation vs. inhumation respectively.

200 AD – 5th cent. AD – coexistence of the Slavs and some East Germanic tribes (Goths?) in the territory from the upper Vistula and San to the middle Dniepr, i.e. including the probable Slavic homeland to the north and northeast of the Carpathian Mountains.

6th cent. AD – Slavic expansion and first dialect differentiation between East Slavic (dialect of the Antes) and the rest of Slavic. What was the first impulse for this disintegration?
The migration and military activities of the Huns in Europe are probably too early (their power culminated in Europe in AD 375–453); on the other hand, the Avars came too late (568 is the date of their first conflict with the Byzantine Empire). Perhaps some of the East Germanic tribes, Goths or Gepids or both, occupying the territory between the Dniestr and the Carpathian Mountains, separated the Antes from the other Slavs.

600 AD – separation of Latvian from the other central Baltic dialects, represented especially by Lithuanian. Regarding the phenomenon of Latvian palatalization, resembling the Slavic second palatalization, it is tempting to see here a specific Slavic influence, caused by the Slavic expansion, culminating in the 6th and 7th cent.

Note: The so-called Pogańske gwary z Narewu probably represent a hybrid idiom based on the interference of Lithuanian & Latvian and Northeast Yiddish (Schmid 1986). From the point of view of Baltic dialectology, their identification with Yatvingian seems to be excluded.

(To be continued in Blt 42(3))

Petra NOVOTNÁ, Václav BLAŽEK
Department of Linguistics & Baltic Studies
Faculty of Arts of Masaryk University
A. Nováka 1
CZ-60200 Brno
Czech Republic
[petano16@seznam.cz], [blazek@phil.muni.cz]
