Paul Trilsbeek and Peter Wittenburg
guides, electronic newspaper articles, and conference contributions will also
help in spreading relevant information.
5.2. Archivist–user interaction
To date, there is only limited experience regarding the interaction with dif-
ferent user groups of an archive. The following is almost exclusively based
on the many discussions with various DoBeS teams. In addition, we had
some interactions with journalists working on stories about language en-
dangerment and cultural heritage preservation.
Figure 5 summarizes major topics in the interaction between archivists
and users, and major methods used in complying with user requests.
Figure 5. Topics in the interaction between users and archivists
[Figure title: Archivist–User Interaction. Figure labels: User, Archivist, Views, Archive]
“Agreements” about:
– data distribution
– metadata-based navigation
– neutral access to objects
– web-based exploitation
– print facilities
Methods for:
– simple and complex search
– web-based presentation
– commenting
– establishing relations
– conversion (on the fly)
Various demands on discovery procedures were discussed in Section 4.2
above and will not be repeated here. Once a user has found some interesting
resources, he or she must be able to download or copy them. It should be
possible to copy whole subcorpora, including the metadata descriptions and
the resources. As already mentioned in Section 4.1, it should not be too
difficult to install a fully operational copy on another computer, for example,
in a local community center.
Chapter 13 – Archiving challenges
When a single resource, such as an annotation, a simple lexicon, or a
media file, is found with the help of metadata, it should also be possible to
play or visualize it directly using the usual web-browser plug-ins. However,
for complex linguistic data types, such as annotated media files that consist
of various media streams and several layers of annotations, this will not
work with standard browsers. Here, more specialized browsers are required
which can exploit the bundling of different media types. ELAN and
LEXUS, developed at the MPI for Psycholinguistics, are such tools. Another
approach is used by SMIL (Synchronized Multimedia Integration Language),
which is a World Wide Web Consortium (W3C) standard for integrating
multimedia files. It can be used, e.g., for adding subtitles to a video
recording. A SMIL file does not contain the actual media themselves, but
contains links referring to them. A media player supporting the SMIL
standard is needed in order to display the combined media files.
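To make the linking idea concrete, the following is a minimal sketch in Python that writes a small SMIL document combining a video with a subtitle stream. The function name build_smil, the region names, the pixel sizes, and the example URLs are assumptions made for illustration only; they do not come from any particular archive or player. The point is that the resulting file holds only references to the media, not the media themselves.

```python
import xml.etree.ElementTree as ET


def build_smil(video_url: str, subtitle_url: str) -> str:
    """Return a minimal SMIL document that plays a video with a subtitle stream."""
    smil = ET.Element("smil")

    # Layout: one region for the video and one below it for the subtitles.
    # The sizes are arbitrary illustrative values.
    head = ET.SubElement(smil, "head")
    layout = ET.SubElement(head, "layout")
    ET.SubElement(layout, "root-layout", width="320", height="280")
    ET.SubElement(layout, "region", id="video_region",
                  top="0", left="0", width="320", height="240")
    ET.SubElement(layout, "region", id="text_region",
                  top="240", left="0", width="320", height="40")

    # <par> presents its children in parallel. Each child carries only a
    # link (src) to a media file; no media data is embedded in the document.
    body = ET.SubElement(smil, "body")
    par = ET.SubElement(body, "par")
    ET.SubElement(par, "video", src=video_url, region="video_region")
    ET.SubElement(par, "textstream", src=subtitle_url, region="text_region")

    return ET.tostring(smil, encoding="unicode")
```

Calling build_smil("http://example.org/session1.mpg", "http://example.org/session1.rt") returns the SMIL text as a string, which could then be saved and opened in a SMIL-capable player. Whether a given player accepts a particular subtitle format is not covered by this sketch.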
In general, we can expect more tools to be developed that support complex
operations using web access as a basis. LEXUS is one such framework: it
allows one to create new lexica and manipulate existing ones via the
web. ANNEX is a framework for operating on a set of annotated multimedia
files via the web. With ANNEX and LEXUS, the user can collect
various annotated media files or lexica from different subarchives; the
clear intention is to support cross-language work. Mechanisms to solve
structural and semantic interoperability problems are in the process of being
designed. Resources are selected by metadata browsing and/or searching.
Researchers often request the ability to create printouts of the materials
deposited in an archive. While this may seem to be a simple task, it
requires the developer to make many decisions on how to generate paper
layouts for computer-based material. Different researchers may also have
different requirements in this respect. To date, there is no standard
technology that inexperienced users can use to associate their own layout
with richly structured XML documents, although the basic technology
(XSLT) is available.
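The kind of layout decision involved can be illustrated with a short sketch. It is written in plain Python rather than XSLT, as a stand-in for what a stylesheet would encode, and it assumes a hypothetical XML structure (session, utterance, and tier elements with a type attribute) that is invented here and is not the format of any actual archive. The caller chooses which annotation tiers to print and in which order, which is exactly where different researchers are likely to differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotated session; element and attribute names are assumptions.
SAMPLE = """
<session>
  <utterance speaker="A">
    <tier type="text">first line</tier>
    <tier type="translation">gloss of first line</tier>
  </utterance>
  <utterance speaker="B">
    <tier type="text">second line</tier>
    <tier type="translation">gloss of second line</tier>
  </utterance>
</session>
"""


def render_printout(xml_text: str, tiers: list[str]) -> str:
    """Lay out each utterance as a block of labelled lines, one per requested tier."""
    root = ET.fromstring(xml_text)
    blocks = []
    for utt in root.iter("utterance"):
        lines = [f"Speaker {utt.get('speaker')}"]
        # The order of `tiers` determines the order of lines on paper.
        for tier_type in tiers:
            tier = utt.find(f"tier[@type='{tier_type}']")
            if tier is not None and tier.text:
                lines.append(f"  {tier_type}: {tier.text.strip()}")
        blocks.append("\n".join(lines))
    # Separate utterances with a blank line.
    return "\n\n".join(blocks)
```

One researcher might call render_printout(SAMPLE, ["text", "translation"]) and another render_printout(SAMPLE, ["text"]); the underlying XML stays the same and only the layout choice changes. A real solution would additionally have to handle pagination, fonts, and time alignment, which this sketch leaves out.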
6. Access management
As long as individual researchers or projects were responsible for the re-
corded data and stored them in their offices, the legal and ethical problems
involved with holding and using such data did not become apparent. Due to
some cases of misuse, the availability of data via the Internet, a greater
general awareness regarding the relevance of ethical issues, and the intro-
duction of language archives as a new abstract type of institution between
the researcher and the consultants, legal and ethical issues have recently
received much more attention. Any archive will be faced with a number of
legal and ethical issues and has to treat them with great sensitivity.
6.1. Legal and ethical issues
The legal situation of an archive tends to be very complex, since usually
different legal systems are involved. The speech community may be located
in one country, the researcher in another country, and the archivist even in a
third, all with potentially different legal systems. There are great differences,
e.g., between Australia, Europe, and the U.S. with respect to copyright law,
which is one of the legal aspects of potential relevance for the resources
that archives store. For further details and problems, see Liberman (2000)
and Chapter 2.
Given the complexity and relative newness of all legal matters relating
to language archives, it is currently difficult if not impossible to get formal
legal advice. Nevertheless, it is necessary that an archive defines the legal
basis for its activity and comes to workable agreements with depositors and
users. Among other things, it has to claim the right to archive the deposited
material and it has to reserve all rights on the materials for the creators. It
also has to claim the right to give access to the resources, based on an in-
formed consent achieved by researcher(s) and speaker(s) with regard to
possible uses of the collected materials. Documents detailing these claims
and agreements should be made available to everyone via the web site so
that everyone is informed about the rules that apply in accessing and using
the archive.
Since many legal aspects remain uncertain and probably will remain
uncertain for some time to come, it is of crucial importance to develop a
relationship based on mutual trust among all participants. In this regard, it
will be useful to develop an explicit code of conduct (see the DoBeS website
for an example) which has to be accepted by everyone involved in
building, maintaining, and using the archive as their principal guideline of
behavior. The material stored in a language archive, in particular the
recordings, has to be generated with the consent of the speech community.
This consent should be explicit with regard to expectations about its usage
by others. Note that statements regarding the openness of resources may
change over time.
The main burden with regard to regulating access to resources has to be
carried by the main depositor, who often will also be a researcher. In gen-
eral, archivists will assume that the depositor/researcher knows the expecta-
tions of the speakers and that he/she has a deep understanding of the ethical
aspects involved. The depositor has to translate his or her knowledge in this