Phrase Detectives

Subject: Phrase Detectives
Date: Sat, 16 Jan 2010 09:59:51 -0800 (PST)
Just posted by Mark Liberman on Language Log
http://languagelog.ldc.upenn.edu/nll/?p=2050
------------------------------------
>Massimo Poesio writes:

>>Phrase Detectives [http://www.phrasedetectives.org/] is a game-with-a-purpose designed to gather data about anaphora. We put online about 1.2 million words - half Wikipedia, half fiction from the Project Gutenberg (the plan is to make all the data freely available through LDC and the Anaphoric Bank), and ask our players to tell us what an anaphoric expression refers to, or to check what other 'detectives' have done. The game collects 8 judgments for every anaphoric expression, and each interpretation is validated by 5 other players, so that the data can also be used to study disagreements in anaphoric interpretation. We have collected over 700,000 anaphoric judgments in this first year and around 300,000 validations, and we'd like to complete the annotation of the first 1 million words before moving on to release 2 of the game (as you'll see if you play, there are several limitations), so we started a competition - $500 to whomever gets the most points in January - to double the number of players (we have around 1500, it would be nice to get to at least 3000).

>As Massimo suggests, the goal is to create a large text corpus annotated for anaphora and coreference. Annotations of this kind are used by linguists to determine the norms of language structure and use, by computational linguists to train and test their programs, and by psychologists to develop and test hypotheses about the mechanisms of human language processing.

>In a well-ordered universe, such corpora would also be of interest to those who develop usage advice. A couple of years ago, I discussed some earlier work of Massimo's in that connection ("A test kitchen for stylistic recipes", 6/1/2008) – though I don't think that release 1 of Phrase Detectives deals with discourse deixis.

>Anyhow, I urge everyone to participate in crowd-sourced linguistic annotation of this kind.

>And shouldn't there be some way to make things like this part of the educational curriculum, so that students could learn about grammar while simultaneously contributing to new research?

------------------------------------

An open invitation, which I second.
Your chance to make your grammatical opinion count.
Literally.

P.S. "Anaphor" is a technical term referring mostly to pronouns (he,
it, this), but also to epithets (as in "Bill left early and then [_the
bastard_ went for a beer without me]") as well as Zero (as in "Bill
left early because of [_ _ having to [_ _ walk the dog]]"). It
comes from "anaphora", which has its own Wikipedia page:
http://en.wikipedia.org/wiki/Anaphora_(linguistics)

-John Lawler http://www.umich.edu/~jlawler/aue
"As an adolescent I aspired to lasting fame, I
craved factual certainty, and I thirsted for a
meaningful vision of human life -- so I became
a scientist. This is like becoming an archbishop
so you can meet girls." -- M. Cartmill


