Essentials of Language Documentation



Chapter 6 – Documenting lexical knowledge


Table 1. Hanunoo pronouns

Form   Gloss                  Category    M
kuh    ‘I’                    1s          –  +
muh    ‘you’                  2s          –  +
yah    ‘s/he’                 3s          –  +
tah    ‘we two’               1du         +
tam    ‘we all’               1pl INCL    –
yuh    ‘you all’              2pl         –  –
dah    ‘they’                 3pl         –  –  –
mih    ‘we (but not you)’     1pl EXCL    –

Another useful descriptive paradigm widely applied to (and in fact driven by) lexicographic practice is the “frame-semantics” approach associated with Charles Fillmore (see, for example, Fillmore and Atkins 1992). Individual words, on this view, project wider, structured “frames” – configurations of elements and actions, some of which receive explicit grammatical realization and some of which remain implicit in the frame. Families of words then share frames. For example, the FrameNet description of the “Commerce-buy” frame – which might be instantiated by such verbs as buy, lease, or rent – is:

These are words describing a basic commercial transaction involving a buyer and a seller exchanging money and goods, taking the perspective of the buyer. The words vary individually in the patterns of frame element realization they allow. For example, the typical pattern for the verb BUY: BUYER buys GOODS from SELLER for MONEY. Abby bought a car from Robin for $5,000.

Clearly, frames themselves can be interrelated. Compare the description for the “Giving” frame, which the “Commerce” frame above “inherits”:

A Donor transfers a Theme from a Donor to a Recipient. This frame includes only actions that are initiated by the Donor (the one that starts out owning the Theme). Sentences (even metaphorical ones) must meet the following entailments: the Donor first has possession of the Theme. Following the transfer the Donor no longer has the Theme and the Recipient does.
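
The inheritance relation between frames can be pictured as a small data structure. The following Python sketch is only an illustration under stated assumptions: the class `Frame`, its method `project`, and above all the correspondence between Commerce-buy elements and Giving elements (Seller as Donor, Goods as Theme, Buyer as Recipient) are hypothetical and are not taken from FrameNet.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Frame:
    """A frame: a name, its frame elements, and an optional parent frame."""
    name: str
    elements: Tuple[str, ...]
    parent: Optional["Frame"] = None
    # Hypothetical mapping from this frame's elements to the parent's elements.
    element_map: Dict[str, str] = field(default_factory=dict)

    def project(self, fillers: Dict[str, str]) -> Dict[str, str]:
        """Re-describe the fillers of this frame in terms of the parent frame."""
        if self.parent is None:
            raise ValueError(f"{self.name} has no parent frame")
        return {
            self.element_map[element]: filler
            for element, filler in fillers.items()
            if element in self.element_map
        }


giving = Frame("Giving", ("Donor", "Theme", "Recipient"))

# Assumed correspondence between the two frames, for illustration only.
commerce_buy = Frame(
    "Commerce-buy",
    ("Buyer", "Goods", "Seller", "Money"),
    parent=giving,
    element_map={"Seller": "Donor", "Goods": "Theme", "Buyer": "Recipient"},
)

# "Abby bought a car from Robin for $5,000."
sentence = {"Buyer": "Abby", "Goods": "a car", "Seller": "Robin", "Money": "$5,000"}
as_giving = commerce_buy.project(sentence)
print(as_giving)
```

Money has no counterpart in the Giving frame, so it drops out of the projection. The entailments of Giving (Robin first has the car; after the transfer Abby has it) can then be stated over the projected roles.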


John B. Haviland

In some ways related as a metasemantic device is the approach, most explicitly developed in Levin (1993), that uses various syntactic diagnostics – such as patterns of diathesis – to partition lexical sets into families or classes. Testing various diagnostic syntactic behaviors against their occurrence with specific verbs partitions the verbs into classes which can, according to this logic, be expected to display commonalities of meaning. For example, Levin proposes the following constructions as relevant tests to discover semantic classes among transitive verbs.

(9) Diathesis diagnostics

MIDDLE: The bread cuts easily.

CONATIVE: Carla hit at the door.

BODY-PART POSSESSOR ASCENSION: Terry touched Bill on the shoulder.

Applied to specific verbs (each of which may have a variety of hyponyms, thus forming meaning families), these tests reveal different syntactic classes corresponding to putative meaning families. The meaning families can, in turn, be used to group individual lexical items, and the groupings are thus justified not simply on notional but also on syntactic grounds.

(10) Diathesis diagnostics applied to different verbs (from Levin 1993: 6)

                          touch   hit    cut    break
CONATIVE                  No      Yes    Yes    No
BODY-PART POSS. ASC.      Yes     Yes    Yes    No
MIDDLE                    No      No     Yes    Yes
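
The partitioning logic can be sketched in a few lines of Python. The sketch groups verbs by their pattern of responses to the three diagnostics. The function name `partition_by_profile` is hypothetical, and the Yes/No values are entered as in Levin's (1993) table for the four verbs in (10).

```python
from collections import defaultdict

DIAGNOSTICS = ("conative", "body-part possessor ascension", "middle")

# True = the verb occurs in the construction. Values as in Levin (1993).
PROFILES = {
    "touch": (False, True, False),
    "hit": (True, True, False),
    "cut": (True, True, True),
    "break": (False, False, True),
}


def partition_by_profile(profiles):
    """Group verbs that share the same pattern of diagnostic results."""
    classes = defaultdict(list)
    for verb, profile in profiles.items():
        classes[profile].append(verb)
    return dict(classes)


classes = partition_by_profile(PROFILES)
for profile, verbs in classes.items():
    accepted = [name for name, ok in zip(DIAGNOSTICS, profile) if ok]
    print(verbs, "accept:", accepted)
```

Each of the four verbs ends up in a class of its own. Further verbs tested against the same diagnostics would join whichever class matches their profile, which is how the syntactic tests come to delimit putative meaning families.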

4. Systematic extraction of lexical databases

After one has documented the basic structures of a grammar, and collected an ample corpus of texts, how does one supplement elicited examples and textually situated tokens of use to achieve a systematic compilation of lexical knowledge? Interlinear glossing of a large corpus can be used mechanically to generate a structured word list, whose analytical perspicacity is in direct proportion to the compiler’s care and consistency in morphological and semantic tagging during the glossing procedure. Various computational tools aid lexical extraction from text corpora – not only dedicated linguistic database tools like SIL’s Shoebox/Toolbox, but also both general and specialized concordance tools (written, for example, as Unix shell scripts, or with programming languages like Perl or Icon).
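
As a rough illustration of how a word list can be generated mechanically from interlinear glosses, consider the following Python sketch. Everything in it is an assumption made for the example: the function name `build_wordlist`, the input format (hyphen-separated morphemes aligned one-to-one with hyphen-separated glosses), and the toy forms themselves, which are invented placeholders and not data from any language.

```python
from collections import defaultdict

# Toy glossed corpus: (morpheme-segmented text, aligned glosses).
# The forms are invented placeholders, not data from any language.
CORPUS = [
    ("ka-tum mi-sol", "1SG-see 3PL-dog"),
    ("mi-tum ka-sol", "3PL-see 1SG-dog"),
    ("ka-tum", "1SG-look"),
]


def build_wordlist(corpus):
    """Map each morpheme to the glosses it received and how often."""
    entries = defaultdict(lambda: defaultdict(int))
    for text, glosses in corpus:
        words, word_glosses = text.split(), glosses.split()
        if len(words) != len(word_glosses):
            raise ValueError(f"misaligned line: {text!r}")
        for word, gloss in zip(words, word_glosses):
            morphs, morph_glosses = word.split("-"), gloss.split("-")
            if len(morphs) != len(morph_glosses):
                raise ValueError(f"misaligned word: {word!r}")
            for morph, morph_gloss in zip(morphs, morph_glosses):
                entries[morph][morph_gloss] += 1
    # Sort alphabetically by morpheme to obtain a stable word list.
    return {m: dict(g) for m, g in sorted(entries.items())}


wordlist = build_wordlist(CORPUS)
for morph, glosses in wordlist.items():
    print(morph, glosses)
```

In the toy corpus the form tum is glossed once as "look" and twice as "see". The extracted list surfaces that inconsistency immediately, which illustrates why the quality of the result depends on consistent tagging during glossing.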

Other computer techniques can also aid in eliciting lexemes in a language, taking advantage of regular phonological patterns. A well-known example is Terry Kaufman’s method for generating an exhaustive list of “potential roots” in Mayan languages, based on the observation that the root canon in Mayan is CVC or some simple variant thereof. Table 2 shows a short Icon program that begins with all the consonants and vowels in the Mayan language Tseltal and produces a complete list of all combinations of the form CV(:)(j)C. The program produces 8820 potential roots (21 initial consonants × 10 vowel symbols × 2 options for medial j × 21 final consonants). (The first of those beginning with b are shown in Table 3.) Each of these can be exhaustively (and exhaustingly) tested with native speakers to see which forms actually produce recognizable lexical items – many speakers of Mayan languages and others with similarly straightforward phonotactics have, over the years, been subjected to such a mind-numbing task.

Table 2. Tseltal root salad, in the Icon programming language

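procedure main()
   # One character per segment: consonants, vowels, optional medial j
   C := "`bcCjkKlmnpPrstTwxyzZ"
   V := "aAeEiIoOuU"
   M := "0j"          # "0" presumably marks the absence of medial j
   # Nested generators: the final consonant varies fastest
   every (c1 := !C) do {
      every (v1 := !V) do {
         every (m1 := !M) do {
            every (c2 := !C) do {
               root := c1 || v1 || m1 || c2
               write(root)
            }
         }
      }
   }
end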

Table 3. The first possible Tseltal roots beginning with b

ba' bab bach bach' baj bak bak' bal bam ban bap bap' bar bas bat
bat' baw bax bay bats bats' baj' bajb bajch bajch' bajj bajk bajk' bajl
bajm bajn bajp bajp' bajr bajs bajt bajt' bajw bajx bajy bajts bajts'
baa' baab baach … etc.
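
For readers without an Icon interpreter, the same enumeration can be sketched in Python. The spelling rules below (for instance `c` as ch, `Z` as ts', a capital vowel as a doubled vowel) are inferred from the forms in Table 3 and are an assumption, as is the reading of `0` in Table 2 as "no medial j". The names `spell` and `potential_roots` are hypothetical.

```python
from itertools import product

CONSONANTS = "`bcCjkKlmnpPrstTwxyzZ"  # one code per consonant, as in Table 2
VOWELS = "aAeEiIoOuU"                 # capitals taken here to be long vowels
MEDIALS = ("", "j")                   # "" replaces the "0" placeholder

# Assumed practical-orthography spellings, inferred from Table 3.
SPELL = {"`": "'", "c": "ch", "C": "ch'", "K": "k'", "P": "p'",
         "T": "t'", "z": "ts", "Z": "ts'"}


def spell(code):
    """Spell a one-character code; capital vowels become doubled vowels."""
    if code in "AEIOU":
        return code.lower() * 2
    return SPELL.get(code, code)


def potential_roots():
    """Yield every CV(:)(j)C string in the order of the nested loops."""
    # product() varies its last argument fastest, like the innermost loop.
    for c1, v, m, c2 in product(CONSONANTS, VOWELS, MEDIALS, CONSONANTS):
        yield spell(c1) + spell(v) + m + spell(c2)


roots = list(potential_roots())
b_roots = [r for r in roots if r.startswith("b")]
print(len(roots))
print(" ".join(b_roots[:15]))
```

Run as written, the sketch reports 8820 candidate roots, and the forms beginning with b come out in the order shown in Table 3 (ba', bab, bach, and so on).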
