Release: | 1.4 |
---|---|
Date: | October 13, 2010 |
Author: | peter.prettenhofer@gmail.com |
Introduction
Bolt features discriminative learning of linear predictors (e.g. SVM or Logistic Regression) using fast online learning algorithms. It is aimed at large-scale, high-dimensional, and sparse machine-learning problems, in particular those encountered in information retrieval and natural language processing.
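Bolt considers linear models (bolt.model.LinearModel) for binary classification,

$$f(x) = w^T x + b,$$

and generalized linear models (bolt.model.GeneralizedLinearModel) for multi-class classification,

$$f(x) = \operatorname*{arg\,max}_{y} \; w^T \Phi(x, y) + b_y,$$

where $w$ and $b$ are the model parameters that are learned from training data and $\Phi(x, y)$ is a joint feature map of input $x$ and class $y$. In Bolt the model parameters are learned by minimizing the regularized training error given by

$$E(w, b) = \sum_{i=1}^{n} L(y_i, f(x_i)) + \lambda R(w),$$

where $L$ is a loss function that measures model fit, $R$ is a regularization term that measures model complexity, and $\lambda \ge 0$ weights the two against each other over the $n$ training examples $(x_i, y_i)$.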
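
To make the objective concrete, here is a minimal sketch in plain Python. It does not use Bolt's API: the function names `predict_score`, `hinge_loss`, `l2_penalty`, and `regularized_error` are hypothetical, the linear score is assumed to take the usual form f(x) = w . x + b, and the hinge loss with an L2 penalty is only one possible choice of loss and regularization term. Examples are stored sparsely as `{feature_index: value}` dictionaries, which matches the high-dimensional, sparse setting described above.

```python
def predict_score(w, b, x):
    """Linear score f(x) = w . x + b for a sparse example x given as {index: value}."""
    return sum(w[j] * v for j, v in x.items()) + b


def hinge_loss(y, score):
    """Hinge loss for a label y in {-1, +1} (assumed choice of loss)."""
    return max(0.0, 1.0 - y * score)


def l2_penalty(w):
    """L2 regularization term 0.5 * ||w||^2 (assumed choice of regularizer)."""
    return 0.5 * sum(wj * wj for wj in w)


def regularized_error(w, b, examples, lam):
    """Sum of per-example losses plus lam times the regularization term."""
    data_term = sum(hinge_loss(y, predict_score(w, b, x)) for x, y in examples)
    return data_term + lam * l2_penalty(w)


# Toy data: two sparse examples in a 3-dimensional feature space.
examples = [({0: 1.0, 2: 2.0}, +1), ({1: 1.0}, -1)]
w = [0.5, -0.5, 0.25]
b = 0.0
error = regularized_error(w, b, examples, lam=0.1)
```

Only the non-zero features of each example are touched when computing the score, so the cost of evaluating one example scales with its number of non-zero entries rather than with the full dimensionality.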
Features
Bolt supports the following trainers for binary classification:
- Stochastic Gradient Descent (bolt.trainer.sgd.SGD)
  - Supports various loss functions: Hinge, Modified Huber, Log.
  - Supports various regularization terms: L2, L1, and Elastic Net.
- PEGASOS (bolt.trainer.sgd.PEGASOS)
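
The loss functions and regularization terms named above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not Bolt's implementation: all function names are hypothetical, the losses are written as functions of the margin z = y * f(x) with labels in {-1, +1}, the Elastic Net mixing convention (weight `rho` on the L2 part) is assumed, and the update in `sgd_hinge_l2_step` uses a fixed step size `eta` because the text does not specify a learning-rate schedule.

```python
import math


def hinge(z):
    """Hinge loss as a function of the margin z = y * f(x)."""
    return max(0.0, 1.0 - z)


def modified_huber(z):
    """Modified Huber loss: zero for z >= 1, quadratic on [-1, 1), linear below -1."""
    if z >= 1.0:
        return 0.0
    if z >= -1.0:
        return (1.0 - z) ** 2
    return -4.0 * z


def log_loss(z):
    """Logistic loss log(1 + exp(-z)), written to avoid overflow for very negative z."""
    if z >= 0.0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


def l2(w):
    """L2 term 0.5 * ||w||_2^2."""
    return 0.5 * sum(v * v for v in w)


def l1(w):
    """L1 term ||w||_1."""
    return sum(abs(v) for v in w)


def elastic_net(w, rho):
    """Convex mix of L2 and L1; putting weight rho on L2 is an assumed convention."""
    return rho * l2(w) + (1.0 - rho) * l1(w)


def sgd_hinge_l2_step(w, b, x, y, lam, eta):
    """One SGD update for hinge loss with an L2 penalty on a sparse example x."""
    margin = y * (sum(w[j] * v for j, v in x.items()) + b)
    # The gradient of the L2 penalty shrinks every weight towards zero.
    w = [(1.0 - eta * lam) * wj for wj in w]
    if margin < 1.0:
        # A subgradient of the hinge loss is -y * x when the margin is violated.
        for j, v in x.items():
            w[j] += eta * y * v
        b += eta * y
    return w, b


w_new, b_new = sgd_hinge_l2_step([0.0, 0.0], 0.0, {0: 1.0}, +1, lam=0.1, eta=0.5)
```

The shrink step here touches every weight for clarity; practical sparse implementations usually avoid that dense pass, for example by keeping a separate scale factor for the weight vector.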
For multi-class classification:
- One-versus-all (bolt.trainer.OVA)
- Averaged Perceptron (bolt.trainer.avgperceptron.AveragedPerceptron)
- Maximum Entropy (bolt.trainer.maxent.MaxentSGD)
  - aka Multinomial Logistic Regression
  - Trained via SGD.
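
The one-versus-all reduction can be sketched as follows. This is a generic illustration of the technique in plain Python, not the interface of bolt.trainer.OVA: the helper names `binary_labels`, `score`, and `ova_predict`, the class labels, and the weights are all hypothetical. One binary problem is built per class by relabeling, one linear model is fit per class (fitting is omitted here), and prediction picks the class whose model scores highest.

```python
def binary_labels(labels, positive_class):
    """Relabel a multi-class problem as +1 for positive_class and -1 otherwise."""
    return [1 if y == positive_class else -1 for y in labels]


def score(w, b, x):
    """Linear score w . x + b for a sparse example x given as {index: value}."""
    return sum(w[j] * v for j, v in x.items()) + b


def ova_predict(models, x):
    """Return the class whose binary model gives the highest score.

    models maps each class label to a (w, b) pair.
    """
    return max(models, key=lambda c: score(models[c][0], models[c][1], x))


labels = ["sports", "politics", "sports", "tech"]
sports_vs_rest = binary_labels(labels, "sports")

# Hand-picked weights standing in for three trained binary models.
models = {
    "sports": ([1.0, 0.0, 0.0], 0.0),
    "politics": ([0.0, 1.0, 0.0], 0.0),
    "tech": ([0.0, 0.0, 1.0], -0.5),
}
prediction = ova_predict(models, {0: 1.0, 2: 2.0})
```

Because each per-class model is an ordinary binary linear classifier, any of the binary trainers listed earlier could in principle supply the `(w, b)` pairs.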
Benchmark
The following RCV1-CCAT benchmark results show that Bolt is competitive with state-of-the-art linear SVM solvers such as SVMPerf, liblinear, and sgd. The dataset comprises 781,264 training documents, each represented by a 47,152-dimensional feature vector.
Algorithm | Training time (sec) | Accuracy (%) |
---|---|---|
SVMlight | >600.00 | |
SVMPerf [1] | 11.60 | 94.79 |
liblinear [2] | 9.00 | 94.77 |
bolt [3] | 2.33 | 94.79 |
sgd [4] | 1.09 | 94.77 |

- [1] Uses C=1000
- [2] Uses SVM (Dual), B=1
- [3] Uses E=5, r=0.00001, l=0, b
- [4] Uses epochs=5, lambda=0.00001