stacked_generalization

Machine Learning Stacking Uti

A handy Python library implementing the machine learning *stacking technique* [1]. Feature-weighted linear stacking is also available. (See https://github.com/fukatani/stacked_generalization/tree/master/stacked_generalization/example.)

Features

1) Any scikit-learn model is available as a Stage 0 or Stage 1 model, and the stacked model itself has the same interface as scikit-learn models.

ex.

from stacked_generalization.lib.stacking import StackedClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression, RidgeClassifier
from sklearn import datasets, metrics
iris = datasets.load_iris()

# Stage 1 model
bclf = LogisticRegression(random_state=1)

# Stage 0 models
clfs = [RandomForestClassifier(n_estimators=40, criterion = 'gini', random_state=1),
        GradientBoostingClassifier(n_estimators=25, random_state=1),
        RidgeClassifier(random_state=1)]

# same interface as scikit-learn
sl = StackedClassifier(bclf, clfs)
sl.fit(iris.data, iris.target)
score = metrics.accuracy_score(iris.target, sl.predict(iris.data))
print("Accuracy: %f" % score)

More detailed examples are available here:

https://github.com/fukatani/stacked_generalization/blob/master/stacked_generalization/example/cross_validation_for_iris.py

https://github.com/fukatani/stacked_generalization/blob/master/stacked_generalization/example/simple_regression.py

The stacked model itself can be used like any scikit-learn model, so you can easily replace a model such as RandomForestClassifier with a stacked model in your scripts.

2) Model evaluation by out-of-bag score.

Stacking itself applies cross-validation (CV) to stage 0. So if you also use CV to evaluate the entire stacked model, *each stage 0 model is fitted n_folds squared times.* This computational cost can be significant, so we implemented CV only for stage 1 [2].

For example, when we have 3 blends (stage 0 predictions), 2 blends are used for fitting stage 1 and the remaining blend is used for testing the model. By repeating this cycle for all 3 blends and averaging the scores, we get the OOB (out-of-bag) score *with only n_folds stage 0 fittings.*
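The difference in cost can be checked with a small counting sketch. The helper `count_stage0_fits` is hypothetical (it is not part of this library) and only counts fits made during cross-validation, ignoring any final refit on the full data.

```python
def count_stage0_fits(n_folds, n_stage0_models, nested):
    """Count stage 0 fits performed during cross-validation.

    nested=True  : CV wraps the whole stacked model, so the internal
                   stage 0 CV is repeated once per outer fold.
    nested=False : stage 0 CV runs once and its blends are reused
                   to evaluate stage 1.
    """
    fits = 0
    outer_rounds = n_folds if nested else 1
    for _ in range(outer_rounds):
        for _ in range(n_folds):  # stage 0 CV that builds the blends
            fits += n_stage0_models
    return fits


naive = count_stage0_fits(n_folds=3, n_stage0_models=3, nested=True)
oob = count_stage0_fits(n_folds=3, n_stage0_models=3, nested=False)
print("CV over the whole stack: %d fits, stage 1 only: %d fits" % (naive, oob))
```

With 3 folds and 3 stage 0 models this gives 27 fits against 9, that is n_folds squared against n_folds fits per stage 0 model.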

ex.

sl = StackedClassifier(bclf, clfs, oob_score_flag=True)
sl.fit(iris.data, iris.target)
print("Accuracy: %f" % sl.oob_score_)
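To make the blend cycle concrete, here is a minimal sketch of the same idea written with plain scikit-learn. It is not the code used inside StackedClassifier. The choice of `StratifiedKFold`, the use of class probabilities as blend features, and all variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_predict

X, y = datasets.load_iris(return_X_y=True)

# Fix the 3 folds once so stage 0 and stage 1 share the same splits.
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=1)
folds = list(cv.split(X, y))

stage0 = [RandomForestClassifier(n_estimators=40, random_state=1),
          GradientBoostingClassifier(n_estimators=25, random_state=1)]

# Stage 0: out-of-fold class probabilities form the blend data.
# Each stage 0 model is fitted once per fold (3 times in total).
blend = np.hstack([
    cross_val_predict(clf, X, y, cv=folds, method="predict_proba")
    for clf in stage0
])

# Stage 1: fit on two blends, score on the held-out blend, then rotate.
scores = []
for train_idx, test_idx in folds:
    stage1 = LogisticRegression(max_iter=1000, random_state=1)
    stage1.fit(blend[train_idx], y[train_idx])
    scores.append(stage1.score(blend[test_idx], y[test_idx]))

oob_score = float(np.mean(scores))
print("OOB accuracy: %f" % oob_score)
```

No stage 0 model is refitted inside the stage 1 loop, which is where the saving over a full nested CV comes from. The resulting score is an approximation of the nested CV score rather than an exact replacement.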

3) Caching stage1 blend_data and trained model. (optional)

sl = StackedClassifier(bclf, clfs, save_stage0=True, save_dir='stack_temp')
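The caching idea can be sketched with the standard library alone. The helper `cached`, the key name `blend_rf`, and the pickle file layout below are hypothetical; they do not describe how this library stores its files in `save_dir`.

```python
import os
import pickle
import tempfile


def cached(save_dir, key, compute):
    """Return the result stored under `key`, computing it only on a cache miss."""
    os.makedirs(save_dir, exist_ok=True)
    path = os.path.join(save_dir, key + ".pkl")
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    result = compute()
    with open(path, "wb") as f:
        pickle.dump(result, f)
    return result


calls = []


def expensive_blend():
    # Stand-in for fitting a stage 0 model and predicting blend data.
    calls.append(1)
    return [[0.9, 0.1], [0.2, 0.8]]


save_dir = tempfile.mkdtemp()
first = cached(save_dir, "blend_rf", expensive_blend)
second = cached(save_dir, "blend_rf", expensive_blend)  # loaded from disk
print("stage 0 computed %d time(s)" % len(calls))
```

The second call reads the pickled result instead of recomputing it, which is the kind of saving a cache offers when stage 0 models are slow to fit and only the stage 1 model changes between runs.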

Software Requirements

Python (2.7 or later)
scikit-learn

Installation

git clone https://github.com/fukatani/stacked_generalization.git
cd stacked_generalization
python setup.py install

License

MIT License. (http://opensource.org/licenses/mit-license.php)

Copyright

Many parts of the implementation are based on the following. Thanks! https://github.com/log0/vertebral/blob/master/stacked_generalization.py

Other

Any contributions (implementation, documentation, tests, or ideas…) are welcome.

References

[1] L. Breiman, “Stacked Regressions”, Machine Learning, 24, 49-64 (1996).

[2] J. Sill et al., “Feature-Weighted Linear Stacking”, https://arxiv.org/abs/0911.0460, 2009.

stacked_generalization 0.0.3
