Dirichlet prior for smoothing
via Research on Search by Dell Zhang on 5/25/09
Using a symmetric Dirichlet distribution as the prior for smoothing in statistical language modeling leads to additive smoothing (a.k.a. Lidstone smoothing), which includes Laplace smoothing (i.e., add one) and Jeffreys-Perks smoothing (i.e., add half, a.k.a. Expected Likelihood Estimation) as special cases. This family of smoothing methods can be regarded as a document-dependent extension of linear interpolated smoothing: the estimate is a mixture of the maximum-likelihood estimate and a uniform distribution, with a mixing weight that depends on the document length.
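To make the connection concrete, here is a minimal Python sketch (not from the original post). It assumes a symmetric Dirichlet prior with parameter alpha over a fixed vocabulary V, so the posterior-mean estimate is P(w|d) = (c(w,d) + alpha) / (|d| + alpha*|V|). The function names, the toy document, and the extra vocabulary word "dog" are all made up for illustration.

```python
from collections import Counter


def additive_smoothing(counts, vocab, alpha):
    """Posterior-mean estimate under a symmetric Dirichlet(alpha) prior.

    alpha = 1.0 gives Laplace (add one); alpha = 0.5 gives Jeffreys-Perks (add half).
    """
    n = sum(counts.get(w, 0) for w in vocab)  # document length |d|
    v = len(vocab)                            # vocabulary size |V|
    return {w: (counts.get(w, 0) + alpha) / (n + alpha * v) for w in vocab}


def interpolated(counts, vocab, alpha):
    """The same estimate, written as linear interpolation between the
    maximum-likelihood estimate and the uniform distribution.

    The weight lam depends on the document length n, which is what makes
    the interpolation document dependent.
    """
    n = sum(counts.get(w, 0) for w in vocab)
    v = len(vocab)
    lam = n / (n + alpha * v)
    return {
        w: lam * (counts.get(w, 0) / n if n else 0.0) + (1.0 - lam) / v
        for w in vocab
    }


# Toy data (hypothetical): "dog" is in the vocabulary but unseen in the document.
doc = "the cat sat on the mat".split()
vocab = sorted(set(doc) | {"dog"})
counts = Counter(doc)

laplace = additive_smoothing(counts, vocab, alpha=1.0)
jeffreys = additive_smoothing(counts, vocab, alpha=0.5)
mixed = interpolated(counts, vocab, alpha=1.0)
```

In this sketch `laplace` and `mixed` agree word for word, and the unseen word "dog" receives nonzero probability under both values of alpha. Longer documents push `lam` toward 1, so the smoothed estimate moves closer to the raw relative frequencies.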
It has been shown that Laplace smoothing, though the most popular, is often inferior to Lidstone smoothing with an additive constant less than one when modeling natural language data, e.g., in text classification tasks.
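One common way to pick the additive constant is to compare held-out log-likelihood across candidate values. The sketch below is an illustration only, not the procedure from any particular study: the function name, the candidate values, and the tiny token lists are hypothetical, and it assumes every training token belongs to the vocabulary and that alpha is positive.

```python
import math
from collections import Counter


def heldout_log_likelihood(train_tokens, heldout_tokens, vocab, alpha):
    """Log-likelihood of held-out tokens under an additively smoothed
    unigram model estimated from train_tokens.

    Assumes all training tokens are in vocab and alpha > 0.
    """
    counts = Counter(train_tokens)
    n = len(train_tokens)
    v = len(vocab)
    return sum(
        math.log((counts[w] + alpha) / (n + alpha * v)) for w in heldout_tokens
    )


# Toy data (hypothetical); real comparisons need a proper corpus split.
train = "a a a b b c".split()
heldout = "a b a d".split()
vocab = ["a", "b", "c", "d"]

alphas = [0.1, 0.5, 1.0]  # 1.0 is Laplace, 0.5 is Jeffreys-Perks
scores = {
    alpha: heldout_log_likelihood(train, heldout, vocab, alpha)
    for alpha in alphas
}
best_alpha = max(scores, key=scores.get)
```

Which value wins depends entirely on the data, so a toy example like this says nothing about natural language in general; it only shows the mechanics of the comparison.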