originally by lingpipe
Here’s a two-page write-up of one of the models I’ve been looking at for evaluating data annotation, with the aim of assessing coding standards and estimating annotator sensitivity and specificity:
* Carpenter, Bob. 2008. Hierarchical Bayesian Models of Categorical Data Analysis.
I’ve submitted it as a poster to the New York Academy of Sciences 3rd Annual Machine Learning Symposium, which will be October 10, 2008.
Please let me know what you think (carp@alias-i.com). I didn’t have room to squeeze in the more complex model that accounts for “easy” items. This model and the one for easy items derive from the epidemiology literature (cited in the paper), where the goal is to estimate disease prevalence from a heterogeneous set of tests. I’ve added some more general Bayesian reasoning, and suggested applications for annotation (though Bruce and Wiebe were mostly there in their 1999 paper, which I cite), and for training using probabilistic supervision (I don’t think anyone’s done this yet).
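To make the sensitivity/specificity idea concrete, here is a minimal Python sketch. It is not the model from the paper and not the R/BUGS code: it assumes binary labels, annotators who err independently, and parameters that are already known rather than inferred. The function names (`simulate_annotations`, `posterior_positive`) and all numbers are made up for illustration.

```python
import random


def simulate_annotations(n_items, prevalence, sensitivity, specificity, seed=0):
    """Draw true binary labels and noisy annotations (illustrative only).

    sensitivity[j] = P(annotator j says 1 | true label is 1)
    specificity[j] = P(annotator j says 0 | true label is 0)
    """
    rng = random.Random(seed)
    truth, labels = [], []
    for _ in range(n_items):
        # True category drawn with probability `prevalence` of being positive.
        c = 1 if rng.random() < prevalence else 0
        row = []
        for sens, spec in zip(sensitivity, specificity):
            # Chance this annotator responds 1 given the true category.
            p_one = sens if c == 1 else (1.0 - spec)
            row.append(1 if rng.random() < p_one else 0)
        truth.append(c)
        labels.append(row)
    return truth, labels


def posterior_positive(row, prevalence, sensitivity, specificity):
    """P(true label = 1 | annotations) by Bayes' rule, parameters assumed known."""
    p1, p0 = prevalence, 1.0 - prevalence
    for y, sens, spec in zip(row, sensitivity, specificity):
        p1 *= sens if y == 1 else (1.0 - sens)
        p0 *= (1.0 - spec) if y == 1 else spec
    return p1 / (p1 + p0)


if __name__ == "__main__":
    sens = [0.9, 0.8, 0.7]      # hypothetical annotator sensitivities
    spec = [0.95, 0.9, 0.85]    # hypothetical annotator specificities
    truth, labels = simulate_annotations(5, 0.2, sens, spec, seed=1)
    for c, row in zip(truth, labels):
        print(c, row, round(posterior_positive(row, 0.2, sens, spec), 3))
```

The per-item posterior is the sort of quantity that could serve as a soft label for probabilistic supervision. In a full Bayesian treatment the prevalence, sensitivities, and specificities would themselves be estimated from the annotations rather than fixed up front.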
I’m happy to share the R scripts and BUGS models I used to generate the data, fit the models, and display the results. I’d also love to know how to get rid of those useless vertical axes in the posterior histogram plots.
Hierarchical Bayesian Models of Categorical Data Annotation
Posted by jeffy
Posted on 8:20 PM