Anyone can play, but there’s a gag order on any result comparisons and the data’s strictly proprietary (from LDC).
The evaluation plan is particularly interesting as a case study in
specifying an ontology and tagging standard for a complicated problem.
If you’ve never thought about something like this, I’d recommend it
highly. Don’t spend too much time worrying about their evaluation
metric, though.

One interesting note: they’ll be using Breck Baldwin and Amit
Bagga’s B-Cubed measure for cross-document coreference scoring. I still
like relational scoring myself, especially as it allows for uncertainty
on coreference to be “integrated out”. But I’ve never been able to
convince anyone else about its usefulness.
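
For the curious, here’s a toy sketch of how B-Cubed works, using a
representation I made up for illustration (clusterings as lists of sets
of mention ids); a real scorer would also have to handle mentions that
appear on only one side.

```python
# Minimal sketch of Bagga and Baldwin's B-Cubed measure. For each
# mention, precision is the fraction of its predicted cluster that
# shares its gold cluster, and recall is the fraction of its gold
# cluster that shares its predicted cluster; both are averaged over
# mentions.

def b_cubed(gold, pred):
    """Return (precision, recall, F1), averaged over mentions."""
    gold_of = {m: c for c in gold for m in c}  # mention -> its gold cluster
    pred_of = {m: c for c in pred for m in c}  # mention -> its predicted cluster
    mentions = list(gold_of)                   # assume same mention set on both sides
    p = sum(len(pred_of[m] & gold_of[m]) / len(pred_of[m])
            for m in mentions) / len(mentions)
    r = sum(len(pred_of[m] & gold_of[m]) / len(gold_of[m])
            for m in mentions) / len(mentions)
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Splitting the gold cluster {1, 2, 3} and merging mention 3 into
# {4, 5} costs both precision and recall:
gold = [{1, 2, 3}, {4, 5}]
pred = [{1, 2}, {3, 4, 5}]
precision, recall, f1 = b_cubed(gold, pred)
```

Note how the per-mention averaging rewards getting big clusters mostly
right rather than all-or-nothing cluster matching.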

We (Alias-i) probably won’t have time to participate. Our research
energy’s going to be mostly directed toward our NIH grant, which itself
has a cross-document coreference and entity extraction component, but
one focused on high recall and database linkage rather than first-best
extraction.

If I had my research druthers and decided to take on this problem,
I’d focus on Dirichlet process clusterers. In particular, I’d like to
see a truly Bayesian version of something like Haghighi and Klein (2007)
that used posterior sampling. Even more fun would be to integrate (pun
intended) with a Bayesian tagger, like the one described in Finkel et al. (2005). In fact, it looks like Finkel et al. (2006) are already thinking along these lines in other arenas.
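
To give a flavor of what a Dirichlet process clusterer does, here’s a
toy sketch of the Chinese restaurant process that underlies it; this is
purely illustrative and not Haghighi and Klein’s actual model, which
would also weigh the likelihood of each mention under each cluster and
resample assignments in a Gibbs loop to do posterior sampling.

```python
import random

# Chinese restaurant process sketch: mention n joins an existing
# cluster with probability proportional to that cluster's size, or
# opens a new cluster with probability proportional to the
# concentration parameter alpha. The number of clusters is thus
# inferred rather than fixed in advance.

def crp_assignments(n_mentions, alpha, seed=0):
    rng = random.Random(seed)
    sizes = []   # current cluster sizes
    labels = []  # cluster index assigned to each mention
    for _ in range(n_mentions):
        weights = sizes + [alpha]  # existing clusters, then the "new table"
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(0)        # open a new cluster
        sizes[k] += 1
        labels.append(k)
    return labels
```

Larger alpha yields more, smaller clusters; the rich-get-richer
dynamics are what make the prior plausible for coreference chains,
where a few entities account for most mentions.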

I’ve been fascinated by cascading processing, and in particular
by on-line disambiguation, ever since grad school, where I was encouraged
in this pursuit by Mark Steedman
(who knew computational linguists had Wikipedia entries?). Online
processing and disparate information integration for disambiguation was
even the subject of my job talk at Carnegie Mellon way back in 1989. It
was what I was working on at the end of my time at Bell Labs, which
spun out into my last ACL publication, Collins et al. (2004).