originally by lingpipe
Here’s a two-page write-up of one of the models I’ve been looking at for evaluating data annotation, with the aim of assessing coding standards and estimating annotator sensitivity and specificity:
* Carpenter, Bob. 2008. Hierarchical Bayesian Models of Categorical Data Annotation.
I’ve submitted it as a poster to the New York Academy of Sciences 3rd Annual Machine Learning Symposium, which will be October 10, 2008.
Please let me know what you think (carp@alias-i.com). I didn’t have room to squeeze in the more complex model that accounts for “easy” items. Both this model and the one for easy items derive from the epidemiology literature (cited in the paper), where the goal is to estimate disease prevalence from a heterogeneous set of tests. I’ve added some more general Bayesian reasoning and suggested applications to annotation (though Bruce and Wiebe were mostly there in their 1999 paper, which I cite) and to training with probabilistic supervision (I don’t think anyone’s done this yet).
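To make the sensitivity/specificity idea concrete, here is a minimal Python sketch of the kind of generative story such a model assumes. It is not the R/BUGS code from the write-up. The function names, the fixed per-annotator parameters, and the binary-label setup are all assumptions made for illustration. Each item gets a true category drawn with probability equal to the prevalence. Each annotator then labels a positive item correctly with probability equal to their sensitivity, and a negative item correctly with probability equal to their specificity. The hierarchical priors over the annotator parameters are left out.

```python
import random


def simulate_annotations(n_items, sens, spec, prevalence, seed=0):
    """Simulate binary annotations (illustrative sketch, not the paper's code).

    sens[j] and spec[j] are the assumed sensitivity and specificity of
    annotator j; prevalence is the probability that an item is truly positive.
    """
    if len(sens) != len(spec):
        raise ValueError("sens and spec must have one entry per annotator")
    rng = random.Random(seed)
    # True (latent) category of each item.
    true_labels = [1 if rng.random() < prevalence else 0 for _ in range(n_items)]
    annotations = []
    for c in true_labels:
        row = []
        for s1, s0 in zip(sens, spec):
            # P(label = 1) is the sensitivity for positive items and
            # one minus the specificity for negative items.
            p_pos = s1 if c == 1 else 1.0 - s0
            row.append(1 if rng.random() < p_pos else 0)
        annotations.append(row)
    return true_labels, annotations


def empirical_sens_spec(true_labels, annotations, j):
    """Sensitivity and specificity of annotator j measured against known truth.

    Returns None for a rate whose denominator is zero.
    """
    pos = sum(true_labels)
    neg = len(true_labels) - pos
    tp = sum(1 for c, row in zip(true_labels, annotations) if c == 1 and row[j] == 1)
    tn = sum(1 for c, row in zip(true_labels, annotations) if c == 0 and row[j] == 0)
    sens_hat = tp / pos if pos else None
    spec_hat = tn / neg if neg else None
    return sens_hat, spec_hat


if __name__ == "__main__":
    truth, labels = simulate_annotations(
        n_items=1000, sens=[0.9, 0.7, 0.8], spec=[0.95, 0.85, 0.6], prevalence=0.2
    )
    for j in range(3):
        print(j, empirical_sens_spec(truth, labels, j))
```

In real annotation data the true categories are unobserved, so a helper like `empirical_sens_spec` only works on simulated data. The point of fitting a model is to infer prevalence, sensitivities, and specificities jointly from the annotations alone.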
I’m happy to share the R scripts and BUGS models I used to generate the data, fit the models, and display the results. I’d also love to know how to get rid of those useless vertical axes in the posterior histogram plots.
Hierarchical Bayesian Models of Categorical Data Annotation
Posted by jeffy
Posted on 8:20 PM