http://languagelog.ldc.upenn.edu/nll/?p=2050
------------------------------------
>Massimo Poesio writes:
>>Phrase Detectives [http://www.phrasedetectives.org/] is a game-with-a-purpose designed to gather data about anaphora. We put online about 1.2 million words - half Wikipedia, half fiction from the Project Gutenberg (the plan is to make all the data freely available through LDC and the Anaphoric Bank), and ask our players to tell us what an anaphoric expression refers to, or to check what other 'detectives' have done. The game collects 8 judgments for every anaphoric expression, and each interpretation is validated by 5 other players, so that the data can also be used to study disagreements in anaphoric interpretation. We have collected over 700,000 anaphoric judgments in this first year and around 300,000 validations, and we'd like to complete the annotation of the first 1 million words before moving on to release 2 of the game (as you'll see if you play, there are several limitations), so we started a competition - $500 to whomever gets the most points in January - to double the number of players (we have around 1500, it would be nice to get to at least 3000).
>As Massimo suggests, the goal is to create a large text corpus annotated for anaphora and coreference. Annotations of this kind are used by linguists to determine the norms of language structure and use, by computational linguists to train and test their programs, and by psychologists to develop and test hypotheses about the mechanisms of human language processing.
>In a well-ordered universe, such corpora would also be of interest to those who develop usage advice. A couple of years ago, I discussed some earlier work of Massimo's in that connection ("A test kitchen for stylistic recipes", 6/1/2008) - though I don't think that release 1 of Phrase Detectives deals with discourse deixis.
>Anyhow, I urge everyone to participate in crowd-sourced linguistic annotation of this kind.
>And shouldn't there be some way to make things like this part of the educational curriculum, so that students could learn about grammar while simultaneously contributing to new research?
------------------------------------
An open invitation, which I second. Your chance to make your grammatical opinion count. Literally.

P.S. "Anaphor" is a technical term referring mostly to pronouns (he, it, this), but also to epithets (as in "Bill left early and then [_the bastard_ went for a beer without me]") as well as Zero (as in "Bill left early because of [_ _ having to [_ _ walk the dog]]"). It comes from "anaphora", which has its own Wikipedia page: http://en.wikipedia.org/wiki/Anaphora_(linguistics)

-John Lawler http://www.umich.edu/~jlawler/aue
"As an adolescent I aspired to lasting fame, I craved factual certainty, and I thirsted for a meaningful vision of human life -- so I became a scientist. This is like becoming an archbishop so you can meet girls." -- M. Cartmill