A database designed to give more reliable and more readily available answers to questions concerning the distribution of phonological segments in the world's languages has been created as part of the research program of the UCLA Phonetics Laboratory. The database is known formally as the UCLA Phonological Segment Inventory Database, and for convenience is referred to by the acronym UPSID. UPSID has been used to investigate a number of hypothesized phonological universals and “universal tendencies”. Principal among these have been certain ideas concerning the overall size and structure of phonological inventories. The design of the database is briefly described in this chapter. A full description is given in chapter 10, and the various appendices at the end of the book report on the data contained in the UPSID files. The remainder of the present chapter discusses the issues concerning the overall structure and size of phonological inventories that have been examined using UPSID.
Design of the database
The languages included in UPSID have been chosen to approximate a properly constructed quota sample, on a genetic basis, of the world's extant languages. The quota rule is that only one language may be included from each small family grouping. For example, among the Germanic languages, one is included from West Germanic and one from North Germanic; East Germanic, being extinct and insufficiently documented for a reliable phonological analysis to be made, is not included.
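The quota rule can be illustrated with a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the `Language` record, the `quota_sample` function, and the short candidate list are not part of UPSID. In particular, the sketch simply keeps the first eligible candidate listed for each grouping, which is an assumption; the passage above does not say how the representative of a grouping is chosen, and the languages named here should not be read as UPSID's actual selections.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Language:
    """Hypothetical record for a candidate language (not a UPSID format)."""
    name: str
    grouping: str          # the small family grouping, e.g. "West Germanic"
    extinct: bool = False  # the sample is drawn from extant languages only


def quota_sample(candidates):
    """Return at most one extant language per small family grouping.

    Assumption: the first eligible candidate listed for a grouping is kept.
    """
    chosen = {}
    for lang in candidates:
        if lang.extinct:
            continue  # skip extinct languages such as those of East Germanic
        # setdefault keeps the existing entry, so later candidates from a
        # grouping that is already represented are ignored.
        chosen.setdefault(lang.grouping, lang)
    return list(chosen.values())


# Illustrative candidates only; not the contents of UPSID.
candidates = [
    Language("German", "West Germanic"),
    Language("English", "West Germanic"),
    Language("Norwegian", "North Germanic"),
    Language("Swedish", "North Germanic"),
    Language("Gothic", "East Germanic", extinct=True),
]

sample = quota_sample(candidates)
for lang in sample:
    print(f"{lang.grouping}: {lang.name}")
```

With these candidates the sketch yields one language for West Germanic and one for North Germanic, and none for East Germanic, mirroring the Germanic example in the text.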