Essentials of Language Documentation


Peter K. Austin

Media Player formats are all compressed in a way that loses information; they are useful for working and presentation (e.g. for publication, on web sites) but not suitable for archiving.

More on sound archiving

There are a large number of well-equipped sound archives around the world, ranging in coverage from regional to national to international. Some, such as the Austrian National Sound Archive, have been established for a long time and have extensive experience with material in older ‘legacy’ formats. The International Association of Sound Archives (IASA) publishes a great deal of valuable and up-to-date advice about archiving issues, and the Language Archives Newsletter (http://www.mpi.nl/LAN) focuses on archiving for linguistic research.

3.4. Presentation, publication, and distribution

One way in which rich language documentations can currently be presented, published, and distributed is via multimedia that links media, annotations (time-aligned transcriptions, analysis and translations, hyperlinks), and metadata. One such format is linked files (including HTML, MP3 sound clips, QuickTime, etc.) distributed via the World Wide Web, but bandwidth can be a problem for the publication of media files: even small movies of a few minutes in a compressed format can be megabytes in size and take a long time to download via slow connections (the use of video streaming software can partially overcome this limitation).
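To make the bandwidth problem concrete, the following back-of-the-envelope sketch estimates download time from file size and connection speed. The 5 MB clip size and the two connection speeds are illustrative assumptions, not figures taken from this chapter, and the function name `download_seconds` is invented for the example.

```python
def download_seconds(size_megabytes: float, speed_kilobits_per_sec: float) -> float:
    """Estimate transfer time, ignoring protocol overhead and latency."""
    # 1 MB is taken as 1000 kB, and 1 byte as 8 bits.
    size_kilobits = size_megabytes * 1000 * 8
    return size_kilobits / speed_kilobits_per_sec


# Hypothetical example: a 5 MB compressed video clip.
CLIP_MB = 5.0
for label, kbps in [("56 kbit/s dial-up modem", 56), ("512 kbit/s broadband", 512)]:
    minutes = download_seconds(CLIP_MB, kbps) / 60
    print(f"{label}: about {minutes:.1f} minutes")
```

Under these assumptions the clip needs roughly twelve minutes over the modem but little more than one minute over the faster line, which is why the same media file can be usable for one audience and impractical for another.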

There is also SMIL (‘Synchronized Multimedia Integration Language’), an application of XML that encodes mixed media, text, and image information in a presentation form.
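As a rough illustration of what such an XML encoding looks like, the sketch below uses Python's standard `xml.etree.ElementTree` module to build a minimal SMIL-style document in which an audio clip and a transcript text play in parallel. The file names, the region name `transcript`, and the layout dimensions are hypothetical; the element names (`smil`, `head`, `layout`, `region`, `body`, `par`, `audio`, `text`) follow the SMIL 1.0 vocabulary, and a real presentation would need more detail than is shown here.

```python
import xml.etree.ElementTree as ET


def build_smil(audio_src: str, text_src: str) -> str:
    """Return a minimal SMIL-style document as a string."""
    smil = ET.Element("smil")

    # The head describes where visual media appear on screen.
    head = ET.SubElement(smil, "head")
    layout = ET.SubElement(head, "layout")
    ET.SubElement(layout, "root-layout", width="400", height="200")
    ET.SubElement(layout, "region", id="transcript",
                  top="0", left="0", width="400", height="200")

    # Children of <par> are presented in parallel (at the same time).
    body = ET.SubElement(smil, "body")
    par = ET.SubElement(body, "par")
    ET.SubElement(par, "audio", src=audio_src)
    ET.SubElement(par, "text", src=text_src, region="transcript")

    return ET.tostring(smil, encoding="unicode")


# Hypothetical file names, for illustration only.
print(build_smil("conversation.mp3", "transcript.txt"))
```

The point of the format is that the media files themselves stay separate; the SMIL document only states which files belong together and how they are timed and laid out.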

For highly complex, richly annotated, and linked media, we currently need to use multimedia platforms such as Macromedia Director, delivered on CD-ROM or DVD as a publication format (see Chapter 15). Unfortunately, the future of these formats and their carriers is unclear, and how we can archive multimedia for the future is also currently problematic. One major current need is for good multimedia players and ways for users to interact with rich documentations; it is necessary to model and design interfaces and access formats for various audiences. An example of such a format is the Spoken Karaim CD, described by Csató and Nathan (2003b), which presents video and audio recordings with accompanying transcriptions, translations, glosses, lexicon, and cultural information, all of which are linked and interactive. The interface enables users to explore their own pathways through the corpus and to search, collect items of interest, backtrack, and interact with the corpus. It has a simple, attractive interface that enables maximum interactivity without forcing the user to digest too much information, and it has been used for Karaim language support in education, language maintenance, and revitalization (Nathan and Csató, forthc.).

Chapter 4 – Data and language documentation

Figure 5 is a screenshot from a CD-ROM of conversational documentary materials in the Sasak language of eastern Indonesia (Austin, Jukes, and Nathan 2000), which is based on the Karaim model. The top-left window shows images of the consultants who worked on the corpus, and below it is a Sasak lexicon arranged alphabetically (clicking on an entry in the lexicon reveals full details of the individual item in the top-left window in place of the images). On the top right is the Sasak transcription of the conversation (colors indicate the two speakers, whose voices can be heard in the left and right channels respectively of the associated time-aligned digital stereo recording). Below the transcription is a small central window displaying morpheme-by-morpheme analysis and gloss for a selected item in the text, and below that, a display of the free translation in English of the speaker turns (again color-coded). In the bottom left of the display there is a search facility which the user can employ to find occurrences of morphemes or glosses of interest throughout the corpus, and in the top left is a set of buttons that produce pronominal inflected forms of verbs (via a morphological generator) when the user moves them over a selected lexical entry in the top-left window (see Chapter 15 and Nathan 2000b for further details about the morphological generator developed for the Spoken Karaim CD).

Figure 5. Screenshot from a CD-ROM presenting Sasak conversational materials
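A search facility of the kind described for the Sasak CD-ROM depends on the annotations being stored as structured, time-aligned data. The sketch below is one possible way to model speaker turns with morpheme-by-morpheme glosses and to search them by morpheme or gloss; it is not the implementation used on the CD-ROM. The class names, the `search` function, and all forms and translations in the sample corpus are invented placeholders, not real Sasak data.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Morph:
    form: str   # morpheme as transcribed
    gloss: str  # its gloss


@dataclass
class Turn:
    speaker: str
    start: float  # seconds into the recording
    end: float
    morphs: list[Morph]
    translation: str


def search(corpus: list[Turn], query: str) -> list[tuple[Turn, Morph]]:
    """Return (turn, morph) pairs whose form or gloss equals the query."""
    q = query.lower()
    return [
        (turn, morph)
        for turn in corpus
        for morph in turn.morphs
        if q in (morph.form.lower(), morph.gloss.lower())
    ]


# Placeholder corpus: forms are invented, NOT real language data.
corpus = [
    Turn("A", 0.0, 2.1, [Morph("xx", "1SG"), Morph("yy", "go")], "placeholder translation 1"),
    Turn("B", 2.1, 4.0, [Morph("zz", "2SG"), Morph("yy", "go")], "placeholder translation 2"),
]

for turn, morph in search(corpus, "go"):
    print(f"{turn.speaker} [{turn.start}-{turn.end}s]: {morph.form} '{morph.gloss}'")
```

Because each hit carries its turn's start and end times, an interface built on such a structure could jump straight to the corresponding stretch of the recording.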

4. Conclusions

Language documentation is an emerging field that involves recording, analysis, annotation, archiving, and publication of rich and complex data. By properly structuring the data representations and planning methods to flow data between different formats and contexts, you can work productively with your materials, as well as publish and distribute them for others and archive your resources to preserve them for the future. It is important that all these aspects of a documentation project be incorporated in its planning and execution, in order to ensure maximally effective and useful documentation.
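The idea of flowing data between formats can be illustrated with a small sketch in which a single structured record is exported both to a plain, self-describing form suitable for archiving and to a presentation form for the web. The record fields, the function names, and the sample values are assumptions made for the example; actual projects would follow the formats recommended by their archive.

```python
import json
from html import escape

# One annotated speaker turn; all values are invented placeholders.
record = {
    "id": "session-001-turn-001",
    "speaker": "A",
    "start": 0.0,
    "end": 2.1,
    "transcription": "xx yy",
    "translation": "placeholder & translation",
}


def to_archive_json(rec: dict) -> str:
    """Archive form: explicit field names, stable key order, Unicode kept as-is."""
    return json.dumps(rec, ensure_ascii=False, sort_keys=True, indent=2)


def to_html(rec: dict) -> str:
    """Presentation form: the same content marked up for display on a web page."""
    return (
        f'<p class="turn" data-start="{rec["start"]}" data-end="{rec["end"]}">'
        f'<b>{escape(rec["speaker"])}:</b> {escape(rec["transcription"])} '
        f'<i>{escape(rec["translation"])}</i></p>'
    )


print(to_archive_json(record))
print(to_html(record))
```

Keeping one structured source and generating the other forms from it means that a correction made once carries through to every format on the next export.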

Acknowledgements

Most of the material presented here has been “road tested” in lectures at Frankfurt University, Uppsala University, the School of Oriental and African Studies, and the DoBeS summer school; I am grateful for comments and feedback from audiences on these occasions. A proportion of this chapter derives from information on language documentation and guidelines for grant applicants co-written by David Nathan and myself and published on the Hans Rausing Endangered Languages website (see particularly http://www.hrelp.org/documentation/whatisit). I am grateful to David Nathan for permission to incorporate this material into the present chapter, and for his detailed comments on an earlier draft which picked up a number of errors and infelicities. Thanks also to Jost Gippert, Nikolaus Himmelmann, Robert Munro, and Peter Wittenburg for suggestions for improvement of earlier presentations. Any remaining errors are solely mine.
