Tools for doing hyperparameter search with Scikit-Learn and Dask
Tools for performing hyperparameter search with Scikit-Learn and Dask.
Highlights
Drop-in replacement for Scikit-Learn’s GridSearchCV and RandomizedSearchCV.
Hyperparameter optimization can run in parallel using threads or processes, or be distributed across a cluster.
Works well with Dask collections. Dask arrays, dataframes, and delayed objects can be passed to fit.
Candidate estimators with identical parameters and inputs are fit only once. For composite estimators such as Pipeline, this can be significantly more efficient because it avoids expensive repeated computations.
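The last highlight can be illustrated with a small counting sketch. It is not the library's implementation. It only counts how many step fits a grid over a hypothetical two-step pipeline would need for one cross-validation split, with and without sharing identical work. The step names (pca, svc), the grid values, and the count_fits helper are assumptions made up for this illustration.

```python
from itertools import product

# Hypothetical grid over a two-step pipeline (names are illustrative only).
grid = {
    "pca__n_components": [8, 16],
    "svc__C": [0.1, 1.0, 10.0],
}
steps = ["pca", "svc"]  # order of the steps in the pipeline


def count_fits(grid, steps, share):
    """Count step fits for one CV split, with or without sharing."""
    names = sorted(grid)
    seen = set()
    fits = 0
    for values in product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        upstream = ()
        for step in steps:
            # A fitted step depends on its own parameters and on
            # every step that comes before it in the pipeline.
            own = tuple(sorted(
                (k, v) for k, v in params.items()
                if k.startswith(step + "__")
            ))
            key = upstream + ((step, own),)
            if not share or key not in seen:
                fits += 1
                seen.add(key)
            upstream = key
    return fits


naive_fits = count_fits(grid, steps, share=False)
shared_fits = count_fits(grid, steps, share=True)
print(naive_fits, shared_fits)
```

In this toy grid the first step has only two distinct configurations, so sharing fits it twice instead of once per candidate. The saving grows with the number of parameters that apply only to later steps.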
For more information, check out the documentation.
Install
Dask-searchcv is available via conda or pip:
# Install with conda
$ conda install dask-searchcv -c conda-forge
# Install with pip
$ pip install dask-searchcv
Example
from sklearn.datasets import load_digits
from sklearn.svm import SVC
import dask_searchcv as dcv
import numpy as np
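digits = load_digits()
# Log-spaced grid over C and gamma, with and without class balancing
param_space = {'C': np.logspace(-4, 4, 9),
               'gamma': np.logspace(-4, 4, 9),
               'class_weight': [None, 'balanced']}
model = SVC(kernel='rbf')
# Same interface as Scikit-Learn's GridSearchCV, here with 3-fold cross-validation
search = dcv.GridSearchCV(model, param_space, cv=3)
search.fit(digits.data, digits.target)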
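To get a feel for the size of this search, the grid can be enumerated by hand. The sketch below rebuilds the same parameter space with the standard library only (np.logspace(-4, 4, 9) yields the nine powers of ten from 1e-4 to 1e4) and counts candidates and cross-validation fits. The variable names are illustrative, and the count assumes one fit per candidate per fold, excluding any final refit on the full data.

```python
from itertools import product

# Same values as np.logspace(-4, 4, 9): 1e-4, 1e-3, ..., 1e4
powers_of_ten = [10.0 ** e for e in range(-4, 5)]

param_space = {
    "C": powers_of_ten,
    "gamma": powers_of_ten,
    "class_weight": [None, "balanced"],
}
cv = 3  # number of cross-validation folds used in the example

# Every combination of parameter values is one candidate.
candidates = [
    dict(zip(param_space, combo))
    for combo in product(*param_space.values())
]
n_candidates = len(candidates)
n_fits = n_candidates * cv
print(n_candidates, n_fits)
```

The grid grows multiplicatively with each added parameter, which is why running the fits in parallel matters for searches like this one.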