List of languages by number of native speakers

Current distribution of human language families

Human languages ranked by their number of native speakers are as follows. All such rankings should be used with caution, because it is not possible to devise a coherent set of linguistic criteria for distinguishing languages in a dialect continuum. [1] For example, a language is often defined as a set of mutually intelligible varieties, but independent national standard languages may be considered separate languages even though they are largely mutually intelligible, as in the case of Danish and Norwegian. [2] Conversely, many commonly accepted languages, including German, Italian and even English, encompass varieties that are not mutually intelligible. [1] While Arabic is sometimes considered a single language centred on Modern Standard Arabic, other authors consider its mutually unintelligible varieties separate languages. [3] Similarly, Chinese is sometimes viewed as a single language because of a shared culture and common literary language. [4] It is also common to describe various Chinese dialect groups, such as Mandarin, Wu and Yue, as languages, even though each of these groups contains many mutually unintelligible varieties. [5]

There are also difficulties in obtaining reliable counts of speakers, which vary over time because of population change and language shift. In some areas, there is no reliable census data, the data is not current, or the census may not record languages spoken, or record them ambiguously. Sometimes speaker populations are exaggerated for political reasons, or speakers of minority languages may be underreported in favour of a national language. [6]

Top languages by population

Ethnologue (2024)

The following languages are listed as having at least 50 million first-language speakers in the 27th edition of Ethnologue, published in 2024. [7] This section does not include entries that Ethnologue identifies as macrolanguages encompassing all their respective varieties, such as Arabic, Lahnda, Persian, Malay, Pashto, and Chinese.

Languages with at least 50 million first-language speakers [7]

| Language | Native speakers (in millions) | Language family | Branch |
|---|---|---|---|
| Mandarin Chinese | 941 | Sino-Tibetan | Sinitic |
| Spanish | 486 | Indo-European | Romance |
| English | 380 | Indo-European | Germanic |
| Hindi | 345 | Indo-European | Indo-Aryan |
| Bengali | 237 | Indo-European | Indo-Aryan |
| Portuguese | 236 | Indo-European | Romance |
| Russian | 148 | Indo-European | Balto-Slavic |
| Japanese | 123 | Japonic | Japanese |
| Yue Chinese | 86 | Sino-Tibetan | Sinitic |
| Vietnamese | 85 | Austroasiatic | Vietic |
| Turkish | 84 | Turkic | Oghuz |
| Wu Chinese | 83 | Sino-Tibetan | Sinitic |
| Marathi | 83 | Indo-European | Indo-Aryan |
| Telugu | 83 | Dravidian | South-Central |
| Western Punjabi | 82 | Indo-European | Indo-Aryan |
| Korean | 81 | Koreanic | |
| Tamil | 79 | Dravidian | South |
| Egyptian Arabic | 78 | Afroasiatic | Semitic |
| Standard German | 76 | Indo-European | Germanic |
| French | 74 | Indo-European | Romance |
| Urdu | 70 | Indo-European | Indo-Aryan |
| Javanese | 68 | Austronesian | Malayo-Polynesian |
| Italian | 64 | Indo-European | Romance |
| Iranian Persian | 62 | Indo-European | Iranian |
| Gujarati | 58 | Indo-European | Indo-Aryan |
| Hausa | 54 | Afroasiatic | Chadic |
| Bhojpuri | 53 | Indo-European | Indo-Aryan |
| Levantine Arabic | 51 | Afroasiatic | Semitic |
| Southern Min | 51 | Sino-Tibetan | Sinitic |
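
As a rough illustration of how the figures in the table above can be combined, the following Python sketch sums the listed first-language speakers by language family. The names `ROWS` and `speakers_by_family` are hypothetical and chosen only for this example. Because the table covers only languages with at least 50 million first-language speakers, the sums understate each family's true total and should be read as lower bounds, not as family-level statistics.

```python
from collections import defaultdict

# (language, native speakers in millions, language family), copied from the table above
ROWS = [
    ("Mandarin Chinese", 941, "Sino-Tibetan"),
    ("Spanish", 486, "Indo-European"),
    ("English", 380, "Indo-European"),
    ("Hindi", 345, "Indo-European"),
    ("Bengali", 237, "Indo-European"),
    ("Portuguese", 236, "Indo-European"),
    ("Russian", 148, "Indo-European"),
    ("Japanese", 123, "Japonic"),
    ("Yue Chinese", 86, "Sino-Tibetan"),
    ("Vietnamese", 85, "Austroasiatic"),
    ("Turkish", 84, "Turkic"),
    ("Wu Chinese", 83, "Sino-Tibetan"),
    ("Marathi", 83, "Indo-European"),
    ("Telugu", 83, "Dravidian"),
    ("Western Punjabi", 82, "Indo-European"),
    ("Korean", 81, "Koreanic"),
    ("Tamil", 79, "Dravidian"),
    ("Egyptian Arabic", 78, "Afroasiatic"),
    ("Standard German", 76, "Indo-European"),
    ("French", 74, "Indo-European"),
    ("Urdu", 70, "Indo-European"),
    ("Javanese", 68, "Austronesian"),
    ("Italian", 64, "Indo-European"),
    ("Iranian Persian", 62, "Indo-European"),
    ("Gujarati", 58, "Indo-European"),
    ("Hausa", 54, "Afroasiatic"),
    ("Bhojpuri", 53, "Indo-European"),
    ("Levantine Arabic", 51, "Afroasiatic"),
    ("Southern Min", 51, "Sino-Tibetan"),
]


def speakers_by_family(rows):
    """Sum listed native speakers (in millions) per family, largest first."""
    totals = defaultdict(int)
    for _language, millions, family in rows:
        totals[family] += millions
    # Sort families by descending total so the largest appears first.
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


if __name__ == "__main__":
    for family, millions in speakers_by_family(ROWS).items():
        print(f"{family}: {millions} million")
```

With the figures as listed, the Indo-European entries sum to 2,454 million and the Sino-Tibetan entries to 1,161 million.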

CIA World Factbook (2018 estimates)

According to the CIA World Factbook , the most-spoken first languages in 2018 were: [8]

Top first languages by population per CIA [8]

| Rank | Language | Percentage of world population (2018) |
|---|---|---|
| 1 | Mandarin Chinese | 12.3% |
| 2 | Spanish | 6.0% |
| 3 | English | 5.1% |
| 3 | Arabic | 5.1% |
| 5 | Hindi | 3.5% |
| 6 | Bengali | 3.3% |
| 7 | Portuguese | 3.0% |
| 8 | Russian | 2.1% |
| 9 | Japanese | 1.7% |
| 10 | Western Punjabi | 1.3% |
| 11 | Javanese | 1.1% |

Related Research Articles

Arabic: Semitic language and lingua franca of the Arab world

Arabic is a Central Semitic language of the Semitic language family spoken primarily in the Arab world. The ISO assigns language codes to 32 varieties of Arabic, including its standard form of Literary Arabic, known as Modern Standard Arabic, which is derived from Classical Arabic. This distinction exists primarily among Western linguists; Arabic speakers themselves generally do not distinguish between Modern Standard Arabic and Classical Arabic, but rather refer to both as al-ʿarabiyyatu l-fuṣḥā or simply al-fuṣḥā (اَلْفُصْحَىٰ).

Chinese language: National language of China

Chinese is a group of languages spoken natively by the ethnic Han Chinese majority and many minority ethnic groups in China. Approximately 1.35 billion people, or around 16% of the global population, speak a variety of Chinese as their first language.

Dialect refers to two distinctly different types of linguistic relationships.

Ethnologue: Languages of the World is an annual reference publication in print and online that provides statistics and other information on the living languages of the world. It is the world's most comprehensive catalogue of languages. It was first issued in 1951, and is now published by SIL International, an American evangelical Christian non-profit organization.

Hakka Chinese: Sinitic language originating in southern China

Hakka forms a language group of varieties of Chinese, spoken natively by the Hakka people in parts of Southern China and some diaspora areas of Taiwan, Southeast Asia and in overseas Chinese communities around the world.

Languages of Africa

The number of languages natively spoken in Africa is variously estimated at between 1,250 and 2,100, and by some counts at over 3,000. Nigeria alone has over 500 languages, one of the greatest concentrations of linguistic diversity in the world. The languages of Africa belong to many distinct language families, among which the largest are:

Southern Min: Branch of the Min Chinese languages

Southern Min, Minnan or Banlam, is a group of linguistically similar and historically related Chinese languages that form a branch of Min Chinese spoken in Fujian, most of Taiwan, Eastern Guangdong, Hainan, and Southern Zhejiang. Southern Min dialects are also spoken by descendants of emigrants from these areas in the diaspora, most notably in Southeast Asia (Singapore, Malaysia, the Philippines, Indonesia, Brunei, Southern Thailand, Myanmar, Cambodia, and Southern and Central Vietnam), as well as in San Francisco, Los Angeles and New York City. Minnan is the most widely spoken branch of Min, with approximately 48 million speakers as of 2017–2018.

Varieties of Chinese: Family of local language varieties

There are hundreds of local Chinese language varieties forming a branch of the Sino-Tibetan language family, many of which are not mutually intelligible. Variation is particularly strong in the more mountainous southeast part of mainland China. The varieties are typically classified into several groups: Mandarin, Wu, Min, Xiang, Gan, Jin, Hakka and Yue, though some varieties remain unclassified. These groups are neither clades nor individual languages defined by mutual intelligibility, but reflect common phonological developments from Middle Chinese.

Languages of Pakistan: Overview of languages spoken in Pakistan

Pakistan is a multilingual country with over 70 languages spoken as first languages. The majority of Pakistan's languages belong to the Indo-Iranian group of the Indo-European language family.

A dialect continuum or dialect chain is a series of language varieties spoken across some geographical area such that neighboring varieties are mutually intelligible, but the differences accumulate over distance so that widely separated varieties may not be. This is a typical occurrence with widely spread languages and language families around the world, when these languages did not spread recently. Some prominent examples include the Indo-Aryan languages across large parts of India, varieties of Arabic across north Africa and southwest Asia, the Turkic languages, the Chinese languages or dialects, and parts of the Romance, Germanic and Slavic families in Europe. Terms used in older literature include dialect area and L-complex.

Zhuang languages: Various Tai languages used by the Zhuang people of southern China

The Zhuang languages are any of more than a dozen Tai languages spoken by the Zhuang people of Southern China in the province of Guangxi and adjacent parts of Yunnan and Guangdong. The Zhuang languages do not form a monophyletic linguistic unit, as northern and southern Zhuang languages are more closely related to other Tai languages than to each other. Northern Zhuang languages form a dialect continuum with Northern Tai varieties across the provincial border in Guizhou, which are designated as Bouyei, whereas Southern Zhuang languages form another dialect continuum with Central Tai varieties such as Nung, Tay and Caolan in Vietnam. Standard Zhuang is based on the Northern Zhuang dialect of Wuming.

Marwari language: Language spoken in Rajasthan, India

Marwari is a group of languages within the Rajasthani branch of the Indo-Aryan languages, as well as an individual language within this group. It is spoken in the Indian state of Rajasthan, as well as the neighbouring states of Gujarat and Haryana, some adjacent areas in eastern parts of Pakistan, and some migrant communities in Nepal. The most prominent languages included within Marwari are Marwari, Dhundhari, Shekhawati and Mewari. There are two dozen varieties of Marwari. Marwari is also referred to simply as Rajasthani.

Mutual intelligibility: Closeness of linguistic varieties

In linguistics, mutual intelligibility is a relationship between languages or dialects in which speakers of different but related varieties can readily understand each other without prior familiarity or special effort. It is sometimes used as an important criterion for distinguishing languages from dialects, although sociolinguistic factors are often also used.

A pluricentric language or polycentric language is a language with several codified standard forms, often corresponding to different countries. Many examples of such languages can be found worldwide among the most-spoken languages, including but not limited to Chinese in mainland China, Taiwan and Singapore; English in the United States, United Kingdom, Canada, Australia, New Zealand, Ireland, South Africa, India, and elsewhere; and French in France, Canada, and elsewhere. The converse case is a monocentric language, which has only one formally standardized version. Examples include Japanese and Russian. In some cases, the different standards of a pluricentric language may be elaborated to appear as separate languages, e.g. Malaysian and Indonesian, Hindi and Urdu, while Serbo-Croatian is in an earlier stage of that process.

Autonomy and heteronomy are complementary attributes of a language variety describing its functional relationship with related varieties. The concepts were introduced by William A. Stewart in 1968, and provide a way of distinguishing a language from a dialect.

Linguistic demography is the statistical study of languages among all populations. Estimating the number of speakers of a given language is not straightforward, and various estimates may diverge considerably. This is first of all due to the question of defining "language" vs. "dialect". Identification of varieties as a single language or as distinct languages is often based on ethnic, cultural, or political considerations rather than mutual intelligibility. The second difficulty is multilingualism, complicating the definition of "native language". Finally, in many countries, insufficient census data add to the difficulties.

Northeastern Neo-Aramaic (NENA) is a grouping of related dialects of Neo-Aramaic spoken before World War I as a vernacular language by Jews and Assyrian Christians between the Tigris and Lake Urmia, stretching north to Lake Van and southwards to Mosul and Kirkuk. As a result of the Assyrian genocide, Christian speakers were forced out of the area that is now Turkey and in the early 1950s most Jewish speakers moved to Israel. The Kurdish-Turkish conflict resulted in further dislocations of speaker populations. As of the 1990s, the NENA group had an estimated number of fluent speakers among the Assyrians just below 500,000, spread throughout the Middle East and the Assyrian diaspora. In 2007, linguist Geoffrey Khan wrote that many dialects were nearing extinction with fluent speakers difficult to find.

The Fangyan was a Chinese dictionary compiled in the early 1st century CE by the Western Han dynasty poet and philosopher Yang Xiong. It was the first Chinese dictionary to include significant regional vocabulary, and is considered the "most significant lexicographic work" of its era. His dictionary's preface explains how he spent 27 years amassing and collating the dictionary. Yang collected regionalisms from many sources, particularly the 'light carriage' surveys made during the Zhou and Qin dynasties, where imperial emissaries were sent into the countryside annually to record folk songs and idioms from across China, reaching as far north as Korea.

Rawang, also known as Krangku, Kiutze (Qiuze), and Ch’opa, is a Sino-Tibetan language of India and Burma. Rawang has a high degree of internal diversity, and some varieties are not mutually intelligible. Most, however, understand Mutwang (Matwang), the standard dialect, and basis of written Rawang.

References

  1. Paolillo, John C.; Das, Anupam (31 March 2006). "Evaluating language statistics: the Ethnologue and beyond" (PDF). UNESCO Institute of Statistics. pp. 3–5. Archived (PDF) from the original on 10 January 2017. Retrieved 17 November 2018.
  2. Chambers, J.K.; Trudgill, Peter (1998). Dialectology (2nd ed.). Cambridge University Press. ISBN 978-0-521-59646-6.
  3. Kaye, Alan S.; Rosenhouse, Judith (1997). "Arabic Dialects and Maltese". In Hetzron, Robert (ed.). The Semitic Languages. Routledge. pp. 263–311. ISBN 978-0-415-05767-7.
  4. Norman, Jerry (1988). Chinese. Cambridge University Press. p. 2. ISBN 978-0-521-29653-3.
  5. Norman, Jerry (2003). "The Chinese dialects: phonology". In Thurgood, Graham; LaPolla, Randy J. (eds.). The Sino-Tibetan languages. Routledge. pp. 72–83. ISBN 978-0-7007-1129-1.
  6. Crystal, David (1988). The Cambridge Encyclopedia of Language. Cambridge University Press. pp. 286–287. ISBN 978-0-521-26438-9.
  7. Statistics, in Eberhard, David M.; Simons, Gary F.; Fennig, Charles D., eds. (2024). Ethnologue: Languages of the World (27th ed.). Dallas, Texas: SIL International.
  8. "The World Factbook. People and Society. Languages". The World Factbook. Central Intelligence Agency. 29 November 2023. Retrieved 30 November 2023.