assigned to the project full time. The Ethnologue shares space and resources with other
SIL projects on the Dallas campus. While it is one of SIL’s most visible and well-known
projects, it consumes a tiny fraction of SIL’s $150 million annual budget. Editorial
policies must also fit within these resource constraints, when it comes to producing a
printed volume or providing information services over the Internet. For example, there is
no one in SIL assigned to the Ethnologue for the purpose of developing its web-based
services — it shares maintenance of its website with SIL more generally — making
development of new forms of web-based presentation unlikely.
2.2. The Linguasphere Register
The Linguasphere Register is a comprehensive list of speech communities representing a
career-spanning effort of David Dalby to provide a complete catalogue of the world’s
speech communities and their relations to one another. Compilation of data that was
eventually incorporated into the Linguasphere was begun by Dalby in the 1950s, and the
Linguasphere Observatory, which now oversees the project, was founded in 1983.
Preview editions of the Linguasphere Register were published in 1997 (formally presented
to the UNESCO Director-General) and 1998, and the framework edition was published in
2000.
While the Linguasphere Register and the Ethnologue both aim to be
comprehensive catalogues of the world’s languages, the aims of the Linguasphere Register
are somewhat different, and this is reflected in both its structure and organization. First, it
explicitly seeks to treat language and language varieties as a global system of
communication (the “linguasphere”). This leads it to adopt the speech community as its
smallest unit of analysis. A speech community is a group of people who are bound
together by regular patterns and norms of communication. In the conception used by the
Linguasphere Register, speech communities constitute a hierarchy of specificity from
individual locales at the lowest level up to the entire community of humanity. Since
speech communities often cross national boundaries, the Linguasphere Register places
less emphasis on the borders of countries than does the Ethnologue, in which border-area
speech communities are split into separate entries under each country.
The primary goal of the Linguasphere Register is to place all human speech
communities into a comprehensive taxonomy of language varieties. Where most
linguistic taxonomies, including that of the Ethnologue, emphasize historical (“genetic”)
relationships among language varieties, the taxonomy used in the Linguasphere Register
does not use historical origin as its sole organizing criterion. Instead, “sectors” and
“zones” are established as the two outermost levels of classification. Both zones and
sectors can pertain to either geographic region (e.g. “African geosector”, “East Sahel
geozone”) or linguistic family affiliation (“Afro-Asian phylosector”, “Semitic
phylozone”). These two levels of classification are partly independent. Geosectors may
contain either geozones or phylozones. In more traditional linguistic family
classifications, phylozones within a common geosector would simply be treated as
separate families without grouping them together with other families in any way.
Phylosectors appear to only have phylozones within them, and do not contain geozones.
Presumably, this is because language family has already been accepted as the taxonomic
principle for classifying these languages. Hence, the primary consideration in classifying
any speech community comes down to its family relatedness to other languages, or its
lack of established family relatedness. The inclusion of geographic classifications
nonetheless permits the Linguasphere Register to recognize classifications of
linguistically and geographically similar languages where a common historical
antecedent cannot be established (e.g. the North America geosector, the Sepik Valley
geozone). An advantage of this is that it becomes easier to navigate the taxonomy from
the top levels and work down to find a desired language or group.
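The containment rules described above can be sketched as a small data model. This is only an illustration of the constraints as stated in the text, not a representation used by the Linguasphere Register itself: the names `Basis`, `Zone`, `Sector` and `add_zone` are hypothetical, the pairing of each example zone with a sector is for illustration only, and the rule that phylosectors hold only phylozones is the apparent pattern noted above rather than a documented requirement.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Basis(Enum):
    GEO = "geo"      # grouped by geographic region
    PHYLO = "phylo"  # grouped by language-family affiliation


@dataclass
class Zone:
    name: str
    basis: Basis


@dataclass
class Sector:
    name: str
    basis: Basis
    zones: List[Zone] = field(default_factory=list)

    def add_zone(self, zone: Zone) -> None:
        # Assumed rule, taken from the description above: a phylosector
        # holds only phylozones, while a geosector may hold either kind.
        if self.basis is Basis.PHYLO and zone.basis is Basis.GEO:
            raise ValueError(f"{self.name} cannot contain geozone {zone.name}")
        self.zones.append(zone)


# Sector and zone names quoted in the text; pairings are illustrative.
africa = Sector("African geosector", Basis.GEO)
africa.add_zone(Zone("East Sahel geozone", Basis.GEO))
# A geosector may also hold a phylozone (placeholder name, hypothetical).
africa.add_zone(Zone("Example phylozone", Basis.PHYLO))

afro_asian = Sector("Afro-Asian phylosector", Basis.PHYLO)
afro_asian.add_zone(Zone("Semitic phylozone", Basis.PHYLO))
```

Under these assumptions, adding a geozone to `afro_asian` raises a `ValueError`, while `africa` accepts both kinds of zone.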
Each speech community listed in the Linguasphere Register is given a unique
language code that identifies its place within the taxonomy. The sector and zone of each
language are encoded in the two-digit prefix of each code. These sectors and zones are
considered to be fixed, and not subject to future change. The remainder of the code is a
sequence of up to six characters from the roman alphabet, the number of characters
depending on the level of detail of classification of the speech community. The first three
characters are upper case, and reflect the set, chain and net to which the speech
community belongs, respectively. The remaining three are in lower-case and correspond
to “outer language”, “inner language” and dialect, respectively. “Outer language” and
“inner language” are terms unique to the Linguasphere Register that are not widely current
in linguistics, and they are not clearly defined in the Register.
Hence, the Linguasphere Register provides a maximum of eight levels of
taxonomic classification. As an example of the Register’s classifications, consider
English and English-based Creoles, which are placed within the Germanic phylozone of
the Indo-European phylosector (52). The English net of speech communities is labeled
52-ABA, where the first A indicates English is part of a set with Norse (Scandinavian)
and Frysk (Frisian), the B indicates it is part of a chain involving English and
Anglo-Creoles, and the second A distinguishes the English net from the Anglo-Creole net,
identified as 52-ABB. Within Anglo-Creole, Caribbean Anglo-Creole is recognized as an
outer language (52-ABB-a), which has several inner languages (e.g. Gullah Creole
52-ABB-aa, Belizean Creole 52-ABB-ad, etc.) and dialects (e.g. belize-creole-urban
52-ABB-ada, belize-creole-vehicular 52-ABB-adb, etc.). The assignment of alphabetic
symbols at each level is arbitrary, serving only to identify specific groups of speech
communities as related or distinct.
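The code structure just described can be made concrete with a short parsing sketch. It is a minimal illustration, not an official Linguasphere tool: the function name `parse_code` and the level labels are hypothetical, the hyphen placement is inferred from the examples above, and reading the first digit as the sector and the second as the zone is an assumption consistent with the two-digit prefix described in the text.

```python
import re
from typing import Dict, Optional

# The eight taxonomic levels, from outermost to innermost.
LEVELS = ("sector", "zone", "set", "chain", "net",
          "outer_language", "inner_language", "dialect")

# Two digits, then up to three upper-case letters, then up to three
# lower-case letters, with hyphens as in the examples (assumed layout).
_CODE = re.compile(r"(\d)(\d)(?:-([A-Z]{1,3})(?:-([a-z]{1,3}))?)?")


def parse_code(code: str) -> Dict[str, Optional[str]]:
    """Split a code such as '52-ABB-ada' into one symbol per level."""
    m = _CODE.fullmatch(code)
    if m is None:
        raise ValueError(f"not a well-formed code: {code!r}")
    sector, zone, upper, lower = m.groups()
    upper, lower = upper or "", lower or ""
    # Lower-case levels only make sense below a fully specified net.
    if lower and len(upper) < 3:
        raise ValueError(f"lower-case part needs set, chain and net: {code!r}")
    symbols = [sector, zone, *upper, *lower]
    symbols += [None] * (len(LEVELS) - len(symbols))  # unspecified levels
    return dict(zip(LEVELS, symbols))


belize_urban = parse_code("52-ABB-ada")
english_net = parse_code("52-ABA")
```

Here `belize_urban` carries a symbol at all eight levels, while `english_net` stops at the net level and leaves the three language and dialect levels as `None`. Because each code spells out its full path through the taxonomy, two codes that share a prefix (for example 52-ABB and 52-ABB-ada) identify a group and one of its members.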
Entries in the Linguasphere Register are organized in five columns. The first
column gives the taxonomic code for the speech community represented in the entry. The
second gives the name of the speech community so classified. The third column gives
alternative names and explanatory comments. The fourth indicates the geographic
location of the speech community, and the fifth indicates the relative size of the speech
community. Populations of the language groups are a secondary concern in the
Linguasphere Register, and are not generally given for all taxonomic levels of speech
community identified. Typically, only figures for outer languages are given, although
sometimes there are figures for inner languages. In addition, populations are merely given
as a single digit (1 through 9) indicating the magnitude of the population of speakers as a