Tools for doing hyperparameter search with Scikit-Learn and Dask

Tools for performing hyperparameter search with Scikit-Learn and Dask.

Highlights

  • Drop-in replacement for Scikit-Learn’s GridSearchCV and RandomizedSearchCV.

  • Hyperparameter optimization can run in parallel using threads or processes, or be distributed across a cluster.

  • Works well with Dask collections. Dask arrays, dataframes, and delayed objects can be passed to fit.

  • Candidate estimators with identical parameters and inputs are fit only once. For composite estimators such as Pipeline, this can be significantly more efficient because it avoids expensive repeated computations.
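
To give a feel for the last point, here is a rough counting sketch. It does not use or reproduce dask-searchcv's internals. It only counts how often the first step of a hypothetical two-step pipeline would be fit if every candidate refit it, versus if fits with identical parameters and inputs were shared. The parameter names (`vect__ngram_max`, `clf__alpha`) and the number of splits are assumptions made up for illustration.

```python
from itertools import product

# Hypothetical grid for a two-step pipeline (names are illustrative only).
param_grid = {
    "vect__ngram_max": [1, 2, 3],           # parameter of the first step
    "clf__alpha": [1e-4, 1e-3, 1e-2, 1e-1],  # parameter of the second step
}
n_splits = 3  # assumed number of cross-validation splits

# Expand the grid into one dict per candidate.
names = sorted(param_grid)
candidates = [
    dict(zip(names, values))
    for values in product(*(param_grid[name] for name in names))
]

# Without sharing: every candidate refits the first step on every split.
naive_first_step_fits = len(candidates) * n_splits

# With sharing: a first-step fit is identified by its own parameters
# and the training split it sees, so duplicates collapse.
unique_first_step_fits = {
    (candidate["vect__ngram_max"], split)
    for candidate in candidates
    for split in range(n_splits)
}
shared_first_step_fits = len(unique_first_step_fits)

print("first-step fits without sharing:", naive_first_step_fits)
print("first-step fits with sharing:   ", shared_first_step_fits)
```

The gap widens as more parameters are searched on later steps, since each added value multiplies the unshared count but leaves the shared count unchanged.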

For more information, check out the documentation.

Install

Dask-searchcv is available via conda or pip:

# Install with conda
$ conda install dask-searchcv -c conda-forge

# Install with pip
$ pip install dask-searchcv

Example

from sklearn.datasets import load_digits
from sklearn.svm import SVC
import dask_searchcv as dcv
import numpy as np

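# Load the handwritten digits dataset bundled with Scikit-Learn
digits = load_digits()

# Grid of candidate values: 9 x 9 x 2 = 162 parameter combinations
param_space = {'C': np.logspace(-4, 4, 9),
               'gamma': np.logspace(-4, 4, 9),
               'class_weight': [None, 'balanced']}

# Same constructor arguments as Scikit-Learn's GridSearchCV
model = SVC(kernel='rbf')
search = dcv.GridSearchCV(model, param_space, cv=3)

# Fit every candidate with 3-fold cross-validation
search.fit(digits.data, digits.target)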
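
Because the search classes are described as drop-in replacements, the fitted object should expose the same result attributes as Scikit-Learn's search classes (`best_params_`, `best_score_`, `cv_results_`). The sketch below inspects them on a deliberately smaller grid so that it finishes quickly. The smaller grid and the fallback import are assumptions added for illustration: if dask-searchcv cannot be imported, the same code runs against Scikit-Learn's own `GridSearchCV`.

```python
from sklearn.datasets import load_digits
from sklearn.svm import SVC

try:
    import dask_searchcv as dcv
    GridSearchCV = dcv.GridSearchCV
except ImportError:
    # Assumption for this sketch: fall back to Scikit-Learn's class,
    # which shares the interface used below.
    from sklearn.model_selection import GridSearchCV

digits = load_digits()

# Smaller, illustrative grid: 3 x 3 = 9 candidates
small_space = {'C': [0.1, 1.0, 10.0],
               'gamma': [1e-4, 1e-3, 1e-2]}

search = GridSearchCV(SVC(kernel='rbf'), small_space, cv=3)
search.fit(digits.data, digits.target)

# Best parameter combination and its mean cross-validated score
print(search.best_params_)
print(search.best_score_)

# One entry per candidate in the results table
n_candidates = len(search.cv_results_['params'])
print(n_candidates)
```

Only the import differs between the two variants, which is the practical meaning of "drop-in" here.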