
LingPipe 3.5.0 Released

Intermediate Release



The latest release of LingPipe is LingPipe 3.5.0. This release replaces
LingPipe 3.4.0 and is backward compatible with it except for the
matrix.Vector interface (details below).


Logistic Regression (aka Max Entropy)


The main addition in this release is multinomial logistic regression
(often called "maximum entropy classification" in the computational
linguistics literature). Logistic regression produces a probabilistic
discriminative classifier with state-of-the-art accuracy. The
regression estimators use stochastic gradient descent, which scales to
large feature spaces with sparse inputs.
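The idea can be sketched in plain Java, independent of LingPipe's own classes. The class name `SgdLogisticSketch`, the nested `SparseVec` type, and all parameter names are hypothetical and chosen only for illustration. The sketch assumes a fixed learning rate and no prior, and it keeps one coefficient vector per category, which may differ from what LingPipe does internally.

```java
/** Minimal sketch of multinomial logistic regression trained by stochastic gradient descent. */
final class SgdLogisticSketch {

    /** Sparse input: parallel arrays of non-zero dimensions and their values. */
    static final class SparseVec {
        final int[] dims;
        final double[] vals;
        SparseVec(int[] dims, double[] vals) {
            this.dims = dims;
            this.vals = vals;
        }
    }

    /** Returns p(k | x) for each category k, using a numerically stable softmax. */
    static double[] probabilities(double[][] beta, SparseVec x) {
        int numCats = beta.length;
        double[] scores = new double[numCats];
        double max = Double.NEGATIVE_INFINITY;
        for (int k = 0; k < numCats; k++) {
            // only the non-zero dimensions of x contribute to the score
            for (int i = 0; i < x.dims.length; i++)
                scores[k] += beta[k][x.dims[i]] * x.vals[i];
            max = Math.max(max, scores[k]);
        }
        double sum = 0.0;
        for (int k = 0; k < numCats; k++) {
            scores[k] = Math.exp(scores[k] - max);
            sum += scores[k];
        }
        for (int k = 0; k < numCats; k++)
            scores[k] /= sum;
        return scores;
    }

    /** Trains one coefficient vector per category, updating after every example. */
    static double[][] train(SparseVec[] xs, int[] ys, int numCats, int numDims,
                            double learningRate, int epochs) {
        double[][] beta = new double[numCats][numDims];
        for (int epoch = 0; epoch < epochs; epoch++) {
            for (int j = 0; j < xs.length; j++) {
                double[] p = probabilities(beta, xs[j]);
                for (int k = 0; k < numCats; k++) {
                    // log likelihood gradient for category k: (indicator(y == k) - p[k]) * x
                    double err = (ys[j] == k ? 1.0 : 0.0) - p[k];
                    for (int i = 0; i < xs[j].dims.length; i++)
                        beta[k][xs[j].dims[i]] += learningRate * err * xs[j].vals[i];
                }
            }
        }
        return beta;
    }
}
```

Each update touches only the dimensions that are non-zero in the current example, so the cost per example depends on the number of non-zero features, not on the size of the feature space.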



There is a direct matrix-based implementation and an adapter based on general feature extraction.




There are two new support classes for logistic regression: one for
Laplace, Gaussian and Cauchy priors (also known as "regularizers") and
one for annealing schedules that control the gradient descent.




A new tutorial, referenced below, describes how to use these classes.
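As a rough illustration of what priors and annealing schedules contribute to gradient descent, the sketch below gives the gradient of the log density for zero-centered Gaussian, Laplace and Cauchy priors, plus two common ways of shrinking the learning rate over epochs. The class name, method names, and parameterizations (variance for the Gaussian, scale for the Laplace and Cauchy) are assumptions made for this example and are not taken from the LingPipe API.

```java
/** Sketch of zero-centered prior gradients and learning-rate annealing schedules. */
final class PriorAnnealingSketch {

    /** Gradient of the log Gaussian density with the given variance: -beta / variance. */
    static double gaussianGradient(double beta, double variance) {
        return -beta / variance;
    }

    /** Gradient of the log Laplace density with the given scale: -sign(beta) / scale (zero at beta == 0). */
    static double laplaceGradient(double beta, double scale) {
        return -Math.signum(beta) / scale;
    }

    /** Gradient of the log Cauchy density with the given scale: -2 beta / (scale^2 + beta^2). */
    static double cauchyGradient(double beta, double scale) {
        return -2.0 * beta / (scale * scale + beta * beta);
    }

    /** Inverse annealing: the rate decays as 1 / (1 + epoch / annealingRate). */
    static double inverseRate(double initialRate, double annealingRate, int epoch) {
        return initialRate / (1.0 + epoch / annealingRate);
    }

    /** Exponential annealing: the rate is multiplied by base (between 0 and 1) once per epoch. */
    static double exponentialRate(double initialRate, double base, int epoch) {
        return initialRate * Math.pow(base, epoch);
    }
}
```

In a gradient ascent step, the prior gradient for a coefficient would be added to its likelihood gradient, and the sum scaled by the annealed learning rate for the current epoch. All three priors pull coefficients toward zero; the Cauchy pull weakens for large coefficients, while the Laplace pull is constant in magnitude.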


Cross-Validating Classification Corpus



To support cross-validation evaluations for classifiers, there is a new corpus implementation.




A new tutorial section, referenced below, describes how to use this class.
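For readers unfamiliar with cross-validation, the following sketch shows the underlying bookkeeping: the items are divided into contiguous folds, and each fold in turn serves as the test section while the remaining items form the training section. `XValSketch` and its methods are hypothetical names for this illustration and do not reflect the interface of the new corpus class.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of splitting a list of items into training and test sections for one fold. */
final class XValSketch {

    /** Index of the first item in the given fold; fold == numFolds gives the list size. */
    static int foldStart(int numItems, int numFolds, int fold) {
        return (int) ((long) numItems * fold / numFolds);
    }

    /** Items held out for evaluation in the given fold. */
    static <E> List<E> testSection(List<E> items, int numFolds, int fold) {
        int start = foldStart(items.size(), numFolds, fold);
        int end = foldStart(items.size(), numFolds, fold + 1);
        return new ArrayList<E>(items.subList(start, end));
    }

    /** Items used for training in the given fold: everything outside the test section. */
    static <E> List<E> trainSection(List<E> items, int numFolds, int fold) {
        int start = foldStart(items.size(), numFolds, fold);
        int end = foldStart(items.size(), numFolds, fold + 1);
        List<E> train = new ArrayList<E>(items.subList(0, start));
        train.addAll(items.subList(end, items.size()));
        return train;
    }
}
```

An evaluation would loop over the folds from 0 to numFolds - 1, train a fresh classifier on `trainSection`, evaluate it on `testSection`, and average the results. Shuffling the items before splitting is usually advisable when the data is ordered by category.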


LineParser and SVMlight Classification Parser


There’s an implementation of a parser for the SVMlight file-based classifier format. This parser is based on a new abstract line-based parser implementation.
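To make the format concrete: each SVMlight line holds a target followed by whitespace-separated `feature:value` pairs, with an optional comment after `#`. The sketch below parses one such line. `SvmLightLineSketch` and `Example` are hypothetical names, the code is not the LingPipe parser, and it does not handle the `qid:` tokens used in SVMlight's ranking mode.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of parsing one line of the SVMlight format: target feature:value ... # comment */
final class SvmLightLineSketch {

    /** One parsed example: the target string and a sparse feature map. */
    static final class Example {
        final String target;
        final Map<Integer, Double> features;
        Example(String target, Map<Integer, Double> features) {
            this.target = target;
            this.features = features;
        }
    }

    /** Parses a line, returning null for blank or comment-only lines. */
    static Example parseLine(String line) {
        int hash = line.indexOf('#');
        String content = (hash >= 0 ? line.substring(0, hash) : line).trim();
        if (content.isEmpty())
            return null;
        String[] fields = content.split("\\s+");
        Map<Integer, Double> features = new LinkedHashMap<Integer, Double>();
        for (int i = 1; i < fields.length; i++) {
            int colon = fields[i].indexOf(':');
            if (colon < 0)
                throw new IllegalArgumentException("Expected feature:value, found " + fields[i]);
            features.put(Integer.valueOf(fields[i].substring(0, colon)),
                         Double.valueOf(fields[i].substring(colon + 1)));
        }
        return new Example(fields[0], features);
    }
}
```

A line-based parser of this kind only needs a per-line callback, which is presumably why a general abstract line parser is a natural base for it.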



Pair Utility Class


There is a new class for pairs of heterogeneous types, introduced
primarily as a utility for methods that return pairs of results.
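A generic pair of this kind is straightforward in Java. The sketch below is an assumption about the general shape of such a class, not LingPipe's implementation: the class name `PairSketch`, the accessors `first()` and `second()`, and the `bestCategory` helper are all hypothetical.

```java
import java.util.Objects;

/** Sketch of an immutable pair holding two values of possibly different types. */
final class PairSketch<A, B> {
    private final A first;
    private final B second;

    PairSketch(A first, B second) {
        this.first = first;
        this.second = second;
    }

    A first() { return first; }

    B second() { return second; }

    @Override
    public boolean equals(Object that) {
        if (!(that instanceof PairSketch))
            return false;
        PairSketch<?, ?> p = (PairSketch<?, ?>) that;
        return Objects.equals(first, p.first) && Objects.equals(second, p.second);
    }

    @Override
    public int hashCode() {
        return Objects.hash(first, second);
    }

    /** Example of a method with two results: the highest-scoring category and its score. */
    static PairSketch<String, Double> bestCategory(String[] categories, double[] scores) {
        int best = 0;
        for (int i = 1; i < scores.length; i++)
            if (scores[i] > scores[best])
                best = i;
        return new PairSketch<String, Double>(categories[best], scores[best]);
    }
}
```

Returning a typed pair avoids both an ad hoc result class for every such method and an untyped `Object[]` that callers would have to cast.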



Additional Vector Methods


The interface matrix.Vector
has been updated with two new methods that allow efficient
inspection of non-zero dimensions. This was done so that the vector
interface can be used directly in logistic regression.


Vector and matrix client code remains unaffected. A conflict will
arise only for implementations of Vector outside of LingPipe, which
will need to implement the new methods.
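The benefit of exposing non-zero dimensions can be seen in a small sketch. The interface `SketchVector` and the method name `nonZeroDimensions()` below are illustrative assumptions and are not copied from the matrix.Vector interface. The point is that a dot product against a dense coefficient array needs to visit only the populated dimensions of a sparse vector.

```java
import java.util.Arrays;

/** Hypothetical vector interface that exposes its non-zero dimensions. */
interface SketchVector {
    int numDimensions();
    double value(int dimension);
    int[] nonZeroDimensions();
}

/** Sparse implementation backed by parallel arrays, with dimensions sorted ascending. */
final class SparseSketchVector implements SketchVector {
    private final int[] dims;
    private final double[] vals;
    private final int numDimensions;

    SparseSketchVector(int[] sortedDims, double[] vals, int numDimensions) {
        this.dims = sortedDims.clone();
        this.vals = vals.clone();
        this.numDimensions = numDimensions;
    }

    public int numDimensions() { return numDimensions; }

    public double value(int dimension) {
        int i = Arrays.binarySearch(dims, dimension);
        return i >= 0 ? vals[i] : 0.0;
    }

    public int[] nonZeroDimensions() { return dims.clone(); }

    /** Dot product that visits only the non-zero dimensions of x. */
    static double dot(SketchVector x, double[] coefficients) {
        double sum = 0.0;
        for (int d : x.nonZeroDimensions())
            sum += x.value(d) * coefficients[d];
        return sum;
    }
}
```

The sketch also shows why the change is incompatible for outside implementers: a class written against an earlier version of an interface no longer compiles once the interface gains abstract methods, whereas code that only calls the existing methods is unaffected.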


Logistic Regression Tutorial


There’s a new classification tutorial which covers the new logistic
regression classes in the stats and classify packages.



Cross-Validation Tutorial Section


There’s a new section in the topic classification tutorial covering cross-validation of classifiers and the corpus.Corpus class.

