Chapter 1
Language documentation:
What is it and what is it good for?
Nikolaus P. Himmelmann
Introduction
This chapter defines language documentation as a field of linguistic inquiry
and practice in its own right which is primarily concerned with the compi-
lation and preservation of linguistic primary data and interfaces between
primary data and various types of analyses based on these data. Further-
more, it argues (in Section 2) that while language endangerment is a major
reason for getting involved in language documentation, it is not the only
one. Language documentations strengthen the empirical foundations of
those branches of linguistics and related disciplines which heavily draw on
data of little-known speech communities (e.g. linguistic typology, cognitive
anthropology, etc.) in that they significantly improve accountability (verifiability)
and economize research resources.
The primary data which constitute the core of a language documentation
include audio or video recordings of a communicative event (a narrative, a
conversation, etc.), but also the notes taken in an elicitation session, or a
genealogy written down by a literate native speaker. These primary data are
compiled in a structured corpus and have to be made accessible by various
types of annotations and commentary, here summarily referred to as the
“apparatus”. Sections 3 and 4 provide further discussion of the components
and structure of language documentations. Section 5 concludes with a pre-
view of the remaining chapters of this book.
1. What is a language documentation?
An initial, preliminary answer to this question is: a language documenta-
tion is a lasting, multipurpose record of a language. This answer, of
course, is not quite satisfactory since it immediately raises the question of
what we mean by “lasting”, “multipurpose” and “record of a language”. In
the following, these constituents of the definition are taken up in reverse
order, beginning with “record of a language”.
At first sight, a further definition of “record of a language” may look
like a bigger problem than it actually is since it involves the highly complex
and controversial issue of defining “a language”. The main problem
with defining “a language” consists in the fact that the word language refers
to a number of different, though interrelated phenomena. The problems in
defining it vary considerably, depending on which phenomenon is focused
upon. That is, different problems surface when the task is to define lan-
guage as opposed to dialect, or language as a field of scientific enquiry, or
language as a cognitive faculty of humans, and so on. Unless we want to
postpone working on language documentations until the probably never
arriving day when all the conceptual problems of defining language in all
of its different senses are resolved and a theoretically well-balanced delimi-
tation of “a language” for the purposes of language documentations is pos-
sible, we need a pragmatic approach in dealing with this problem.
The basic tenet of such a pragmatic approach is implied by the qualifiers
multipurpose and lasting in the definition above: The net should be cast as
widely as possible. That is, a language documentation should strive to in-
clude as many and as varied records as practically feasible, covering all
aspects of the set of interrelated phenomena commonly called a language.
Ideally, then, a language documentation would cover all registers and varie-
ties, social or local; it would contain evidence for language as a social prac-
tice as well as a cognitive faculty; it would include specimens of spoken
and written language; and so on.
A language documentation broadly conceived along these lines could
serve a large variety of different uses in, for example, language planning
decisions, preparing educational materials, or analyzing a set of problems
in syntactic theory. Users of such a multipurpose documentation would
include the speech community itself, national and international agencies
concerned with education and language planning, as well as researchers in
various disciplines (linguistics, anthropology, oral history, etc.). In fact, the
qualifier lasting adds a long-term perspective which goes beyond current
issues and concerns. The goal is not a short-term record for a specific pur-
pose or interest group, but a record for generations and user groups whose
identity is still unknown and who may want to explore questions not yet
raised at the time when the language documentation was compiled.
Obviously, this pragmatic explication of “lasting, multipurpose record
of a language” rests on the assumption that it is possible and useful to com-
pile a database for a very broadly defined subject matter (“a language”)
without having in mind a specific theoretical or practical problem
which could be resolved on the basis of this database. With regard to its use
in scientific inquiries, the validity of this assumption is shown by the suc-
cess of all those social and historical disciplines working with data not spe-
cifically produced for research purposes. Thus, for example, cave dwellers
in the Stone Age did not discard shellfish, animal bones, fragments of tools,
and the like within the cave with the purpose in mind of documenting their
presence and aspects of their diet and culture. But archeologists today use
this haphazardly discarded waste as the primary data for determining the
length and type of human occupation found in a given location. Similarly,
inscriptions on stones, bones, or clay tablets were not produced in order to
provide a record of linguistic structures and practices, but they have suc-
cessfully been used to explore the structural properties of languages such as
Hittite or Sumerian, which had already been extinct for millennia before
their modern linguistic analysis began.
However, it is also well known that historical remains and records tend
to be deficient in some ways with regard to modern purposes. Stone in-
scriptions and other historic documents with linguistic content, for exam-
ple, never provide a comprehensive record of the linguistic structures and
practices in use in the community at the time when these documents were
written. Thus, given that the Hittite records discovered to date mostly per-
tain to matters of government, law, trade, and religion, it remains unknown
how Hittite adolescents chatted with each other or whether it was possible
to have the verb in first position in subordinate clauses.¹
The experience with historical remains and records thus is ambivalent:
On the one hand, it clearly shows that they may serve as the database for
exploring issues they were not intended for. On the other hand, it shows
that haphazardly compiled databases hardly ever contain all the information
one needs to answer all the questions of current interest. Based on this ob-
servation, the basic idea of a language documentation as developed here
can be stated as follows: The goal is to create a record of a language in the
sense of a comprehensive corpus of primary data which leaves nothing to
be desired by later generations wanting to explore whatever aspect of the
language they are interested in (what exactly is meant by “primary data”
here is further discussed in Section 3.1.1 below).
Put in this way, the task of compiling a language documentation is
enormous, and there is no principled upper limit for it. Obviously, every
specific documentation project will have to limit its scope and set specific