3. Evaluation of language statistics

The Ethnologue provides extensive information on its sources both in the

entries themselves and in the bibliography. Hence, it is possible to get an idea of the nature of

the information in the Ethnologue and the quality of its data. The Linguasphere Register

does not provide the same kind of documentation within entries, but instead provides

links to many of its sources on various pages of its website. Hence, we cannot evaluate

the Linguasphere directly, but we can compare it to the Ethnologue. Of

particular interest are the cited population figures, their source and the currency of the

data represented. Also of interest are the methods by which the data were collected

(through field linguistic survey, census, etc.). Finally, we should also be interested in

what, if anything, we can learn from the statistics that are presented. By tabulating the

statistics presented in different ways, and attempting to understand what they might tell

us about language populations, diversity and endangerment, we can potentially learn

about the gaps in the existing knowledge about languages and their speakers, as well as

the nature of the sources of information that we do have.

This section is primarily an evaluation of the Ethnologue, comparing it at

relevant points with the Linguasphere Register as well as other relevant references. The

evaluation of language entries is conducted on two distinct sets of data. The first is a

random sample of 2001 entries from the 15th

edition of the Ethnologue, for which we can

conduct a more in-depth investigation. The second data set is the complete set of

language entries from the 14th

edition of the Ethnologue, which was collected for an

earlier project (Paolillo 2005). We also undertake a separate analysis of country entries

and maps, from the 15th

edition.

The analysis of the language entries proceeds in three parts. First, in section 3.1

we investigate the cited sources for language entries, using summary counts of the

different sources classified according to type. Second, we examine the currency of the

data across language entries, by source, language family, country and region. Third, we

investigate the language group sizes recorded in the Ethnologue using the same

breakdowns as for currency, also comparing with the Linguasphere Register to examine

the consistency across the two resources. We then consider location information in

section 3.4, information about media, literatures and language use in section 3.5,

followed by classification issues in section 3.6. A short summary in section 3.8 concludes

this section of the report.

3.1. Sources and currency of data

Our evaluation of the sources of the Ethnologue is based upon the random sample of

2001 language entries. We identified the population estimate for each entry, if present,

and identified its source, and the year of the citation. We then classified each of the

sources according to one of several types: SIL, academic, government, World Christian

Database, other Christian missionary, and other sources. When multiple sources were

given for a single entry, we used only the most recent one to determine both source type

and year. A number of entries had a date for a population figure, but no source. These

were recorded as “not indicated”. Still others gave a population figure, but had no source

or date. These were recorded as “none”. Finally, a number had no population estimate,

and hence no source information for it; these were recorded as “no estimate”. The types

and year of sources are cross-tabulated in Table 2.
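
The classification procedure described above can be sketched in code. This is only an illustrative sketch, not the procedure actually used for the report: the `Citation` and `Entry` records, the function names, and the handling of a source that is named but undated are all assumptions introduced here. The year bins are copied from the column headings of Table 2.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional

# Year bins copied from the column headings of Table 2.
YEAR_BINS = [
    (1920, 1925, "1920-5"),
    (1956, 1965, "1956-65"),
    (1966, 1975, "1966-75"),
    (1976, 1985, "1976-85"),
    (1986, 1995, "1986-95"),
    (1996, None, "1996-pres."),
]


@dataclass
class Citation:
    """Hypothetical record for one cited population source."""
    source_type: Optional[str]  # e.g. "SIL", "Academic"; None if no source is named
    year: Optional[int]         # None if the figure carries no date


@dataclass
class Entry:
    """Hypothetical record for one language entry."""
    population: Optional[int]   # None if the entry gives no estimate
    citations: list = field(default_factory=list)


def year_bin(year):
    """Map a citation year to its Table 2 column label."""
    for low, high, label in YEAR_BINS:
        if year >= low and (high is None or year <= high):
            return label
    return "unbinned"  # assumption: years outside the table's columns


def classify(entry):
    """Return (source type, year bin) for one entry."""
    if entry.population is None:
        return ("No estimate", None)
    dated = [c for c in entry.citations if c.year is not None]
    if dated:
        # With multiple sources, only the most recent one is used.
        latest = max(dated, key=lambda c: c.year)
        return (latest.source_type or "Not indicated", year_bin(latest.year))
    sourced = [c for c in entry.citations if c.source_type]
    if sourced:
        # Assumption: the text does not say how a named but undated
        # source was recorded; here it keeps its type with no year bin.
        return (sourced[0].source_type, None)
    return ("None", None)


def cross_tabulate(entries):
    """Count entries per (source type, year bin) cell."""
    return Counter(classify(e) for e in entries)
```

For example, an entry citing an SIL source from 1987 and a government source from 2000 would be counted once, under the government type in the 1996-pres. column.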

Table 2. Type of source by year for population figures in a random sample of 2001 Ethnologue entries.

1920-5 1956-65 1966-75 1976-85 1986-95 1996-pres. Total
SIL 169 242 519
Academic 149 104 204 477
Government 114 103 245
WCD 157 160
Missionary 121
Other
Not indicated 192 282
None 118
No estimate
Total 298 526 951 2001

Table 2 indicates that almost half of the Ethnologue’s sources for population

figures in the language entries are relatively recent; the bulk of the remainder fall within

the last 30 years, but there are some disturbingly old sources, such as one from 1920 and

one from 1925, in this sample. The two languages in question are both reportedly spoken

in Nigeria: Beele [bxq], 120 speakers in Bauchi state in a few villages near the Bole, and

Sheni [scv], 200 speakers in Kaduna state. It is unclear whether these would have

survived to the present day with such small numbers of speakers. Nigeria has 510 living

languages listed, so perhaps it is understandable that these small languages have been

missed in subsequent reports.

The distribution of source types indicates that the Ethnologue relies on SIL

sources for more than a quarter of its population estimates, and on academic sources for

nearly as many. Presumably this is because many of the languages reported in the

Ethnologue are smaller and would not be reliably individuated by government and other

sources. A second major source of population estimates is the World Christian

Database (WCD) and other Christian missionary sources, collectively accounting for just

over a tenth of the language entries. What distinguishes these sources from many others

is the possibility that they have staff reporting these estimates from the field, in the

manner of academic linguists and SIL. However, it is less likely that such estimates

would be from trained linguists employing established language survey methods.

Conversations with the Ethnologue editorial staff indicated that their main concern with these

data sources is that they might report ethnic populations, instead of actual language

populations. While the two methods of counting can give similar estimates, it is

hazardous to assume so, especially in cases of language shift.
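
The proportions discussed above can be checked with a short calculation. The sketch below assumes that the last figure surviving in each row of Table 2 is that row's total; rows with no surviving figure (Other, No estimate) are left out. The dictionary and function names are illustrative only.

```python
# Row totals as read from Table 2. Assumption: the last figure in each
# row is the row total; rows with no surviving figure are omitted.
ROW_TOTALS = {
    "SIL": 519,
    "Academic": 477,
    "Government": 245,
    "WCD": 160,
    "Missionary": 121,
    "Not indicated": 282,
    "None": 118,
}
SAMPLE_SIZE = 2001  # number of entries in the random sample


def share(*source_types):
    """Fraction of the sample accounted for by the given source types."""
    return sum(ROW_TOTALS[t] for t in source_types) / SAMPLE_SIZE


if __name__ == "__main__":
    groups = [
        ("SIL", ("SIL",)),
        ("Academic", ("Academic",)),
        ("WCD + other missionary", ("WCD", "Missionary")),
        ("Not indicated + none", ("Not indicated", "None")),
    ]
    for label, types in groups:
        print(f"{label}: {share(*types):.1%}")
```

Under that reading, the SIL share comes out above one quarter and the academic share slightly below it, which matches the description in the text.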

The “not indicated” and “none” categories also account for a large proportion of

the language entries. For both of these categories, the Ethnologue staff surmised that
