Paul Trilsbeek and Peter Wittenburg
3.3. Archive coherence
There are two diverging archiving strategies: (1) some digital archives
accept all donated digital material regardless of its format and store it
exactly as it was delivered; (2)
others rely on a few well-supported open formats and require that all archi-
val objects are presented in these formats. It is obvious that a coherent ar-
chive, i.e. an archive relying on a few open formats, is more attractive for
users since it is easier to use. Despite the fact that it possibly imposes re-
quirements on them, it is also attractive for depositors in that coherence
increases the chance of preservation. It is easier and less costly to migrate
a coherent archive to new formats as they emerge in the coming decades.
Maintaining an extremely incoherent archive and making its objects
accessible to users will always be more problematic and cost-intensive. In
actual practice, most archives for endangered language resources will apply
a mixed strategy with different foci.
The optimal way for creating and maintaining a coherent archive is to
specify format requirements that have to be adhered to by depositors. How-
ever, such requirements may pose a problem for the depositors, as they may
be unwilling or unable to follow them (see Section 2.1 above). A way out
of this problem, also practiced in the DoBeS programme, is for the archive to
accept materials in a broader range of formats and to convert them before
ingestion as extensively as possible. The original formats need to be stored
as well, since conversions do not always preserve the full content of the
original. However, as indicated before, some original formats lack the nec-
essary explicitness and are not very well documented, making a conversion
expensive and prone to errors. Hence, depositors and archivists have to
agree on a selection of formats acceptable for the archive. Obviously, there
are also limits on the resources that an archive can afford to invest in
conversions, which may further limit the range of formats workable for a
given archive.
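
The convert-before-ingestion strategy described above can be sketched as a small policy function. This is a minimal illustration under stated assumptions, not the DoBeS workflow: the format tables and the name `plan_ingest` are hypothetical, and no actual conversion is performed. The sketch only decides what would be stored.

```python
from pathlib import PurePosixPath

# Hypothetical policy tables: these formats are illustrative assumptions,
# not the format list of any actual archive.
ARCHIVAL_FORMATS = {".wav", ".xml", ".txt"}
CONVERSIONS = {".aif": ".wav", ".doc": ".txt"}


def plan_ingest(filename: str) -> dict:
    """Decide how a deposited file would be handled at ingestion."""
    suffix = PurePosixPath(filename).suffix.lower()
    if suffix in ARCHIVAL_FORMATS:
        # Already in an agreed archival format: store as delivered.
        return {"status": "accepted", "store": [filename]}
    if suffix in CONVERSIONS:
        target = CONVERSIONS[suffix]
        converted = str(PurePosixPath(filename).with_suffix(target))
        # Keep the original next to the converted copy, since a conversion
        # may not preserve the full content of the original.
        return {"status": "converted", "store": [filename, converted]}
    # Outside the agreed selection of formats: depositor and archivist
    # would have to negotiate.
    return {"status": "rejected", "store": []}
```

Under these assumptions, `plan_ingest("notes.doc")` would schedule both `notes.doc` and `notes.txt` for storage, while a file in an unlisted format is flagged for discussion rather than silently accepted.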
4. Short-term needs of known user groups
While the long-term requirements of archives are defined by the idea that
future generations will be interested in accessing comprehensive informa-
tion regarding cultures and languages of their ancestors, the short-term
needs are mainly defined by current usage scenarios. Technologically speak-
ing, their focus will be less on the storage side and more on the presentation
Chapter 13 – Archiving challenges
side. The presentation of material is determined by the available technology
on the one hand, and the interests of users on the other. In this section, we
briefly characterize some typical usage scenarios.
4.1. Internet access vs. local copying
Current technology favors online presentation because, via
the Internet, all media can be presented jointly, e.g. a transcription can be
viewed while listening to the corresponding audio file, a lexical entry can
be explained by a video clip, ritual ceremonies can be viewed in their com-
plex organization by using textual descriptions, listening to the voice of the
shaman and watching concomitant activities. The Internet will be more and
more preferred since it brings all digital information to the desk of the user
without having to worry about local storage capabilities, etc. However, the
presentation of high-quality videos is still a demanding task for networks.
For some users, including remote speech community centers, even the
transfer requirements of highly compressed video formats such as MPEG-4
may still be too high. So for some years to come, it will still be necessary
to provide local copies of archival materials for some users. Setting up such
local copies with all the components necessary for an optimal use, however,
is not a trivial task and needs to be planned ahead at the time when the basic
architecture of the archive is determined. Similarly, some users may not
have computers at their disposal so that, for example, a hardcopy version of
a resource such as a lexicon or a compilation of texts has to be provided.
Again, the basic architecture has to allow for such printed output.
4.2. What different user groups may be looking for
Researchers generally will want to discover suitable material by posing
complex metadata and/or content questions. They may, for example, want
to analyze the rich linguistic encoding contained in a lexicon in conjunction
with ethnographic notes. Based on new insights obtained by browsing ar-
chival materials, they may want to add new types of annotations or draw
relations between elements within a lexicon or even across documents. In
short, a language archive is seen as a multi-dimensional and multi-medial
space in which they want to navigate easily, view fragments, combine in-
formation, and create extensions of various sorts. This requires that each
resource contained in the archive can be discovered and accessed separately
and that it is stored as neutrally as possible. Web-based analysis and
annotation frameworks with standard viewers and core functionality may be
of help here for researchers who are not computer specialists. For special-
ists, open and well-documented formats are essential to allow them to write
their own software.
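
As a rough illustration of a combined metadata and content question, the following sketch filters a list of resources first by metadata fields and then by a search term in the text. All names here (`Resource`, `search`, the metadata keys) are hypothetical; a real archive would define its own metadata schema and use a proper search index.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    # Field names are assumptions made for this sketch only.
    identifier: str
    metadata: dict
    text: str = ""


def search(resources, content_term=None, **criteria):
    """Return resources that match every metadata criterion and,
    if content_term is given, contain it in their text."""
    hits = []
    for res in resources:
        # Skip resources that miss any requested metadata value.
        if any(res.metadata.get(key) != value for key, value in criteria.items()):
            continue
        # Case-insensitive substring match on the content.
        if content_term and content_term.lower() not in res.text.lower():
            continue
        hits.append(res)
    return hits
```

A call such as `search(resources, content_term="canoe", genre="narrative")` would then return only narrative resources whose text mentions the term, each of them individually addressable through its identifier.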
Community members may want to use the material for
entertainment, self-reflection, or educational purposes. They often will be
interested primarily in audio and video recordings, i.e. the raw material. But
we also expect community members to indicate errors of various sorts and
to fill in missing information, i.e. they too may want to extend and enrich
the archive.
In collaboration with educators or documenting researchers, community
members may want to create school material that can be used to teach
community members. This may require the combination of different media
into one single multimedia presentation. Alternatively, the goal may be a
book that combines text and images. To prepare such a resource, one needs
to have a good overview of all material available and to have access to
every single object or even fragments of objects, such as short video clips
extracted from lengthy recordings. In both cases, the archive has to offer
atomic objects in their original form.
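
Access to fragments of objects presupposes some way of addressing a time span within a recording. A minimal sketch, assuming a hypothetical `Fragment` type that refers to an archived object by identifier and a start and end time in seconds, could look as follows. It describes the fragment without cutting any media.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    # Hypothetical reference to part of an archived recording.
    object_id: str
    start_s: float
    end_s: float

    @property
    def length_s(self) -> float:
        return self.end_s - self.start_s


def make_fragment(object_id: str, duration_s: float,
                  start_s: float, end_s: float) -> Fragment:
    """Create a fragment reference, checking that it lies within the recording."""
    if not (0 <= start_s < end_s <= duration_s):
        raise ValueError("fragment must lie within the recording")
    return Fragment(object_id, start_s, end_s)
```

Because the fragment only points into the original object, the archived recording itself stays untouched, and the same clip can be reused in a multimedia presentation or a printed book.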
For many indigenous communities it will be important to have easy and
direct access to methods and presentation styles that are adapted to their
own culture (cf. the concept of “mobilized data” discussed in Chapter 15).
It is unlikely that archives will be able to offer such highly customized data
access, because they generally will lack the necessary resources and exper-
tise. This is also true for the creation of educational materials. However,
archives can facilitate the creation of both types of data presentations as
much as possible by offering the resources in a neutral and open form so
that specialists can combine them in a flexible way.
Material contained in language resource archives can be expected to be
used as educational resources at universities and schools. Undergraduate
students, for example, may be asked to search for a specific phenomenon in
an archive or to carry out certain extensions of the material by adding anno-
tations, lexical attributes, comments, relations, etc. Education at the level of
primary and secondary schools, however, will probably require simpler and
more attractive discovery and presentation methods than the ones provided
by a multipurpose archive.
Journalists working on a broad variety of topics ranging from general
interest topics relating to language and culture, to specific issues pertaining to