The latest release of LingPipe is LingPipe 3.5.0. This release replaces
LingPipe 3.4.0, with which it is backward compatible other than for the
matrix.Vector interface (details below).
Logistic Regression (aka Max Entropy)
The main addition in this release is multinomial logistic
regression (often called "maximum entropy classification" in the
computational linguistics literature). Logistic regression produces a
probabilistic discriminative classifier with state-of-the-art accuracy.
The regression estimators use stochastic gradient descent for
scalability to large feature spaces with sparse inputs.
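As a rough illustration of why stochastic gradient descent scales to sparse inputs, the sketch below performs one update per training example and touches only that example's non-zero dimensions. It is a minimal sketch, not LingPipe's implementation: the class name SgdLogisticSketch, the parallel-array representation of a sparse example, and the choice to fix the last category's weights at zero are all assumptions made for the example.

```java
import java.util.Arrays;

/** Illustrative sketch only; names and layout are hypothetical, not LingPipe's API. */
final class SgdLogisticSketch {

    /**
     * Category probabilities for a sparse input given as parallel arrays of
     * non-zero dimensions and their values. beta holds one weight vector per
     * category except the last, whose weights are assumed fixed at zero.
     */
    static double[] probabilities(double[][] beta, int[] dims, double[] vals) {
        int numCategories = beta.length + 1;
        double[] scores = new double[numCategories]; // last score stays 0.0
        for (int c = 0; c < beta.length; ++c)
            for (int i = 0; i < dims.length; ++i)
                scores[c] += beta[c][dims[i]] * vals[i];
        // Subtract the maximum before exponentiating to avoid overflow.
        double max = Arrays.stream(scores).max().getAsDouble();
        double sum = 0.0;
        for (int c = 0; c < numCategories; ++c) {
            scores[c] = Math.exp(scores[c] - max);
            sum += scores[c];
        }
        for (int c = 0; c < numCategories; ++c)
            scores[c] /= sum;
        return scores;
    }

    /** One stochastic gradient step on the log likelihood of a single example. */
    static void update(double[][] beta, int[] dims, double[] vals,
                       int label, double learningRate) {
        double[] p = probabilities(beta, dims, vals);
        for (int c = 0; c < beta.length; ++c) {
            double error = (c == label ? 1.0 : 0.0) - p[c];
            // Only the example's non-zero dimensions are updated.
            for (int i = 0; i < dims.length; ++i)
                beta[c][dims[i]] += learningRate * error * vals[i];
        }
    }
}
```

Because each update costs time proportional to the number of non-zero dimensions in the example rather than the size of the feature space, the per-example cost stays small even when the feature space is very large.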
There is a direct matrix-based implementation in
and an adapter based on general feature extraction
There are two support classes introduced for logistic regression, one
for Laplace, Gaussian and Cauchy priors (also known as "regularizers"):
and one for annealing schedules to control the gradient descent:
A new tutorial, referenced below, describes how to use these classes.
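To make the role of the two support classes more concrete, the following sketch shows what a prior and an annealing schedule typically contribute to gradient descent: the prior adds a per-coefficient gradient term that pulls weights toward zero, and the schedule lowers the learning rate over epochs. The class and method names are hypothetical, the priors are assumed to be zero-centered with the parameterizations noted in the comments, and the two schedules are common choices rather than a description of what LingPipe provides.

```java
/** Illustrative sketch only; names and parameterizations are assumptions. */
final class PriorAndAnnealingSketch {

    /** Gradient of the log density of a zero-mean Gaussian with the given variance. */
    static double gaussianGradient(double beta, double variance) {
        return -beta / variance;
    }

    /** Gradient of the log density of a zero-mean Laplace with the given scale. */
    static double laplaceGradient(double beta, double scale) {
        return -Math.signum(beta) / scale;
    }

    /** Gradient of the log density of a zero-centered Cauchy with the given scale. */
    static double cauchyGradient(double beta, double scale) {
        return -2.0 * beta / (scale * scale + beta * beta);
    }

    /** Inverse annealing: the rate halves once epoch reaches base. */
    static double inverseRate(double initialRate, double base, int epoch) {
        return initialRate / (1.0 + epoch / base);
    }

    /** Exponential annealing: the rate is multiplied by base (below 1) each epoch. */
    static double exponentialRate(double initialRate, double base, int epoch) {
        return initialRate * Math.pow(base, epoch);
    }
}
```

In a regularized update, the prior's gradient for a coefficient would be added to the likelihood gradient before scaling by the current epoch's learning rate.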
Cross-Validating Classification Corpus
To support cross-validation evaluations for classifiers, there is a new corpus implementation:
A new tutorial section, referenced below, describes how to use this class.
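For readers unfamiliar with the technique, cross-validation divides a corpus into folds, and each fold in turn serves as the test section while the remaining folds are used for training. The sketch below shows one way to compute that partition over an in-memory list. The class and method names are hypothetical and do not reflect the new corpus class's API.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the API of the new corpus implementation. */
final class CrossValidationSketch {

    /** Index of the first item of the given fold (the end index for fold - 1). */
    private static int boundary(int numItems, int numFolds, int fold) {
        return (int) ((long) fold * numItems / numFolds);
    }

    /** Items held out for testing in the given fold. */
    static <E> List<E> testFold(List<E> items, int numFolds, int fold) {
        int start = boundary(items.size(), numFolds, fold);
        int end = boundary(items.size(), numFolds, fold + 1);
        return new ArrayList<>(items.subList(start, end));
    }

    /** All items outside the given fold, used for training. */
    static <E> List<E> trainFolds(List<E> items, int numFolds, int fold) {
        int start = boundary(items.size(), numFolds, fold);
        int end = boundary(items.size(), numFolds, fold + 1);
        List<E> train = new ArrayList<>(items.subList(0, start));
        train.addAll(items.subList(end, items.size()));
        return train;
    }
}
```

A typical evaluation loops over every fold, trains a fresh classifier on trainFolds, evaluates it on testFold, and averages the results. Shuffling the items once before partitioning is usually advisable when the corpus is ordered by category.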
LineParser and SVMlight Classification Parser
There’s an implementation of a parser for the SVMlight file-based classifier format:
This parser is based on a new abstract line-based parser implementation in:
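As background, each line of an SVMlight file holds a target followed by whitespace-separated index:value pairs, optionally followed by a comment introduced by #. The sketch below parses a single such line. It is an independent illustration with a hypothetical class name, assumes a non-blank line with integer feature indexes, and does not handle special feature names such as qid.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only; not LingPipe's parser. */
final class SvmLightLineSketch {

    /** The target (category or value) at the start of the line. */
    final String target;

    /** Feature index to value, in the order the features appear. */
    final Map<Integer, Double> features = new LinkedHashMap<>();

    SvmLightLineSketch(String line) {
        // Drop any trailing comment before tokenizing.
        int hash = line.indexOf('#');
        String body = (hash >= 0 ? line.substring(0, hash) : line).trim();
        String[] tokens = body.split("\\s+");
        target = tokens[0];
        for (int i = 1; i < tokens.length; ++i) {
            int colon = tokens[i].indexOf(':');
            features.put(Integer.valueOf(tokens[i].substring(0, colon)),
                         Double.valueOf(tokens[i].substring(colon + 1)));
        }
    }
}
```

A line-based parser built this way only needs to read its input one line at a time and hand each parsed line to a handler, which is the general shape an abstract line parser would factor out.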
Pair Utility Class
There is a new class for pairs of heterogeneous types, introduced
primarily as a utility for methods that return pairs of results:
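A generic pair of this kind can be sketched in a few lines. The version below is a hypothetical stand-in to show the idea, with assumed accessor names a() and b(), and is not the signature of the new LingPipe class.

```java
import java.util.Objects;

/** Illustrative sketch only; an immutable pair of two possibly different types. */
final class PairSketch<A, B> {
    private final A first;
    private final B second;

    PairSketch(A first, B second) {
        this.first = first;
        this.second = second;
    }

    /** The first member of the pair. */
    A a() { return first; }

    /** The second member of the pair. */
    B b() { return second; }

    @Override
    public boolean equals(Object that) {
        if (!(that instanceof PairSketch)) return false;
        PairSketch<?, ?> other = (PairSketch<?, ?>) that;
        return Objects.equals(first, other.first)
            && Objects.equals(second, other.second);
    }

    @Override
    public int hashCode() {
        return Objects.hash(first, second);
    }
}
```

A method that needs to return, say, a category and its score can then declare a return type of PairSketch<String, Double> instead of introducing a one-off result class.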
Additional Vector Methods
The matrix.Vector interface has been updated with two new methods that
allow efficient inspection of non-zero dimensions. This was done so that
the Vector interface can be used directly in logistic regression.
Vector and matrix client code remains unaffected. A conflict will
arise only for implementations of the Vector interface defined outside of LingPipe.
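The benefit of exposing non-zero dimensions can be seen in a dot product against a dense weight vector, which is the core operation in logistic regression. The sketch below uses a hypothetical sparse vector class whose method names are assumptions for the example, not the two methods actually added to the interface.

```java
/** Illustrative sketch only; a minimal sparse vector with hypothetical method names. */
final class SparseVectorSketch {
    private final int[] dims;
    private final double[] vals;

    /** Parallel arrays of non-zero dimensions and their values. */
    SparseVectorSketch(int[] dims, double[] vals) {
        this.dims = dims.clone();
        this.vals = vals.clone();
    }

    /** The dimensions at which this vector has a non-zero value. */
    int[] nonZeroDimensions() {
        return dims.clone();
    }

    /** The value at a dimension, or 0.0 if the dimension is not stored. */
    double value(int dimension) {
        for (int i = 0; i < dims.length; ++i)
            if (dims[i] == dimension) return vals[i];
        return 0.0;
    }

    /** Dot product that visits only the non-zero dimensions of this vector. */
    double dot(double[] weights) {
        double sum = 0.0;
        for (int i = 0; i < dims.length; ++i)
            sum += weights[dims[i]] * vals[i];
        return sum;
    }
}
```

Without a way to enumerate non-zero dimensions, a caller would have to probe every dimension of the vector, which is prohibitive when the feature space is large and each input has only a handful of active features.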
Logistic Regression Tutorial
There’s a new classification tutorial which covers
the new logistic regression classes in the
Cross-Validation Tutorial Section
There’s a new section in the topic classification tutorial covering cross-validation of classifiers and the