From 1fe3fc483721d6d63cd30fb42974f71339226412 Mon Sep 17 00:00:00 2001 From: perib Date: Fri, 20 Sep 2024 14:48:56 -0700 Subject: [PATCH 01/44] edits --- Tutorial/1_Estimators_Overview.ipynb | 1911 ----------------- Tutorial/1_Using_TPOT.ipynb | 599 ++++++ tpot2/config/classifiers.py | 3 +- tpot2/config/regressors.py | 4 +- tpot2/config/template_search_spaces.py | 83 +- tpot2/objectives/average_path_length.py | 9 + tpot2/objectives/complexity.py | 16 +- tpot2/objectives/number_of_leaves.py | 10 +- tpot2/objectives/number_of_nodes.py | 9 + tpot2/tpot_estimator/estimator.py | 2 +- .../tpot_estimator/templates/tpottemplates.py | 8 +- 11 files changed, 730 insertions(+), 1924 deletions(-) delete mode 100644 Tutorial/1_Estimators_Overview.ipynb create mode 100644 Tutorial/1_Using_TPOT.ipynb diff --git a/Tutorial/1_Estimators_Overview.ipynb b/Tutorial/1_Estimators_Overview.ipynb deleted file mode 100644 index cae78a96..00000000 --- a/Tutorial/1_Estimators_Overview.ipynb +++ /dev/null @@ -1,1911 +0,0 @@ -{ - "cells": [ - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Overview\n", - "\n", - "There are two evolutionary algorithms built into TPOT2, which corresponds to two different estimator classes.\n", - "\n", - "1. The `tpot2.TPOTEstimator` uses a standard evolutionary algorithm that evaluates exactly population_size individuals each generation. This is similar to the algorithm in TPOT1. The next generation does not start until the previous is completely finished evaluating. This leads to underutilized CPU time as the cores are waiting for the last individuals to finish training, but may preserve diversity in the population. \n", - "\n", - "2. The `tpot2.TPOTEstimatorSteadyState` differs in that it will generate and evaluate the next individual as soon as an individual finishes evaluation. The number of individuals being evaluated is determined by the n_jobs parameter. There is no longer a concept of generations. 
The population_size parameter now refers to the size of the list of evaluated parents. When an individual is evaluated, the selection method updates the list of parents. This allows more efficient utilization when using multiple cores.\n", - "\n", - "\n", - "Additionally, two other simplified estimators are provided. These have a simplified set of hyperparameters with default values set for classification and regression problems. Currently, both of these use the standard evolutionary algorithm in the `tpot2.TPOTEstimator` class.\n", - "\n", - "1. `tpot2.TPOTClassifier` for classification tasks\n", - "2. `tpot2.TPOTRegressor` for regression tasks" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Scorers, Objective Functions, and multi objective optimization.\n", - "\n", - "There are two ways of passing objectives into TPOT2. \n", - "\n", - "1. `scorers`: Scorers are functions that have the signature (estimator, X, y). These can be produced with the [sklearn.metrics.make_scorer](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.make_scorer.html) function. This function is used to evaluate the test folds during cross validation. These are passed into TPOT2 via the scorers parameter. This can take in the scorer itself or the string corresponding to a scoring function ([as listed here](https://scikit-learn.org/stable/modules/model_evaluation.html)). TPOT2 also supports passing in a list of several scorers for multiobjective optimization. \n", - "\n", - "2. `other_objective_functions` : Other objective functions in TPOT2 have the signature (estimator) and returns a float or list of floats. These get passed an unfitted estimator (in the case of TPOT2, a `tpot2.GraphPipeline`). \n", - "\n", - "\n", - "Each scorer and objective function must be accompanied by a list of weights corresponding to the list of objectives. By default, TPOT2 maximizes objective functions (this can be changed by `bigger_is_better=False`). 
Positive weights means that TPOT2 will seek to maximize that objective, and negative weights correspond to minimization.\n", - "\n", - "Here is an example of using two scorers\n", - "\n", - " scorers=['roc_auc_ovr',tpot2.objectives.complexity_scorer],\n", - " scorers_weights=[1,-1],\n", - "\n", - "\n", - "Here is an example with a scorer and a secondary objective function\n", - "\n", - " scorers=['roc_auc_ovr'],\n", - " scorers_weights=[1],\n", - " other_objective_functions=[tpot2.objectives.number_of_leaves_objective],\n", - " other_objective_functions_weights=[-1]," - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 0%| | 0/1 [00:00 \n", - " Pipeline has none of the following attributes: predict_proba. \n", - " Traceback (most recent call last):\n", - " File \"/home/ribeirop/common/Projects/TPOT_Dev/tpot2/tpot2/utils/eval_utils.py\", line 53, in objective_nan_wrapper\n", - " value = func_timeout.func_timeout(timeout, objective_function, args=[individual], kwargs=objective_kwargs)\n", - " File \"/home/ribeirop/miniconda3/envs/tpot2env/lib/python3.10/site-packages/func_timeout/dafunc.py\", line 108, in func_timeout\n", - " raise_exception(exception)\n", - " File \"/home/ribeirop/miniconda3/envs/tpot2env/lib/python3.10/site-packages/func_timeout/py3_raise.py\", line 7, in raise_exception\n", - " raise exception[0] from None\n", - " File \"/home/ribeirop/common/Projects/TPOT_Dev/tpot2/tpot2/tpot_estimator/estimator.py\", line 620, in objective_function\n", - " return objective_function_generator(\n", - " File \"/home/ribeirop/common/Projects/TPOT_Dev/tpot2/tpot2/tpot_estimator/estimator_utils.py\", line 55, in objective_function_generator\n", - " cv_obj_scores = cross_val_score_objective(sklearn.base.clone(pipeline),x,y,scorers=scorers, cv=cv , fold=step)\n", - " File 
\"/home/ribeirop/common/Projects/TPOT_Dev/tpot2/tpot2/tpot_estimator/cross_val_utils.py\", line 31, in cross_val_score_objective\n", - " this_fold_scores = [sklearn.metrics.get_scorer(scorer)(this_fold_pipeline, X_test, y_test) for scorer in scorers]\n", - " File \"/home/ribeirop/common/Projects/TPOT_Dev/tpot2/tpot2/tpot_estimator/cross_val_utils.py\", line 31, in \n", - " this_fold_scores = [sklearn.metrics.get_scorer(scorer)(this_fold_pipeline, X_test, y_test) for scorer in scorers]\n", - " File \"/home/ribeirop/miniconda3/envs/tpot2env/lib/python3.10/site-packages/sklearn/metrics/_scorer.py\", line 253, in __call__\n", - " return self._score(partial(_cached_call, None), estimator, X, y_true, **_kwargs)\n", - " File \"/home/ribeirop/miniconda3/envs/tpot2env/lib/python3.10/site-packages/sklearn/metrics/_scorer.py\", line 344, in _score\n", - " response_method = _check_response_method(estimator, self._response_method)\n", - " File \"/home/ribeirop/miniconda3/envs/tpot2env/lib/python3.10/site-packages/sklearn/utils/validation.py\", line 2106, in _check_response_method\n", - " raise AttributeError(\n", - "AttributeError: Pipeline has none of the following attributes: predict_proba.\n", - "\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 100%|██████████| 1/1 [00:07<00:00, 7.82s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 1\n", - "Best roc_auc_score score: 0.9938492063492064\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "\n", - "2024-06-28 17:22:24,449 - distributed.scheduler - ERROR - Removing worker 'tcp://127.0.0.1:33053' caused the cluster to lose scattered data, which can't be recovered: {'ndarray-71df36028cf839ff98696c18d6668a27', 'ndarray-809a54d2fd885201030a189763e7bd92'} (stimulus_id='handle-worker-cleanup-1719620544.4491522')\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "1.0\n" - ] - } - ], - "source": [ - 
"import tpot2\n", - "import sklearn\n", - "import sklearn.datasets\n", - "\n", - "scorer = sklearn.metrics.get_scorer('roc_auc_ovo')\n", - "X, y = sklearn.datasets.load_iris(return_X_y=True)\n", - "X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, train_size=0.75, test_size=0.25)\n", - "\n", - "\n", - "est = tpot2.TPOTClassifier(n_jobs=40, max_time_mins=30, verbose=5, generations=1, population_size=5)\n", - "est.fit(X_train, y_train)\n", - "\n", - "\n", - "print(scorer(est, X_test, y_test))" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
Pipeline(steps=[('robustscaler',\n",
-       "                 RobustScaler(quantile_range=(0.16675428907107737,\n",
-       "                                              0.7012433303146526))),\n",
-       "                ('passthrough', Passthrough()),\n",
-       "                ('featureunion-1',\n",
-       "                 FeatureUnion(transformer_list=[('skiptransformer',\n",
-       "                                                 SkipTransformer()),\n",
-       "                                                ('passthrough',\n",
-       "                                                 Passthrough())])),\n",
-       "                ('featureunion-2',\n",
-       "                 FeatureUnion(transformer_list=[('skiptransformer',\n",
-       "                                                 SkipTransformer()),\n",
-       "                                                ('passthrough',\n",
-       "                                                 Passthrough())])),\n",
-       "                ('bernoullinb',\n",
-       "                 BernoulliNB(alpha=0.7637690262115946, fit_prior=False))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" - ], - "text/plain": [ - "Pipeline(steps=[('robustscaler',\n", - " RobustScaler(quantile_range=(0.16675428907107737,\n", - " 0.7012433303146526))),\n", - " ('passthrough', Passthrough()),\n", - " ('featureunion-1',\n", - " FeatureUnion(transformer_list=[('skiptransformer',\n", - " SkipTransformer()),\n", - " ('passthrough',\n", - " Passthrough())])),\n", - " ('featureunion-2',\n", - " FeatureUnion(transformer_list=[('skiptransformer',\n", - " SkipTransformer()),\n", - " ('passthrough',\n", - " Passthrough())])),\n", - " ('bernoullinb',\n", - " BernoulliNB(alpha=0.7637690262115946, fit_prior=False))])" - ] - }, - "execution_count": 2, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "est._evolver_instance.population.evaluated_individuals.iloc[0]['Individual'].export_pipeline()" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: : 1it [00:35, 35.93s/it]\n", - "/home/ribeirop/miniconda3/envs/tpot2env/lib/python3.10/site-packages/sklearn/covariance/_empirical_covariance.py:102: UserWarning: Only one sample available. You may want to reshape your data array\n", - " warnings.warn(\n", - "/home/ribeirop/miniconda3/envs/tpot2env/lib/python3.10/site-packages/sklearn/covariance/_empirical_covariance.py:102: UserWarning: Only one sample available. You may want to reshape your data array\n", - " warnings.warn(\n", - "/home/ribeirop/miniconda3/envs/tpot2env/lib/python3.10/site-packages/sklearn/covariance/_empirical_covariance.py:102: UserWarning: Only one sample available. You may want to reshape your data array\n", - " warnings.warn(\n", - "/home/ribeirop/miniconda3/envs/tpot2env/lib/python3.10/site-packages/sklearn/covariance/_empirical_covariance.py:102: UserWarning: Only one sample available. 
You may want to reshape your data array\n", - " warnings.warn(\n", - "/home/ribeirop/miniconda3/envs/tpot2env/lib/python3.10/site-packages/sklearn/covariance/_empirical_covariance.py:102: UserWarning: Only one sample available. You may want to reshape your data array\n", - " warnings.warn(\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "-5421.324324324324\n" - ] - } - ], - "source": [ - "import tpot2\n", - "import sklearn\n", - "import sklearn.metrics\n", - "import sklearn.datasets\n", - "\n", - "scorer = sklearn.metrics.get_scorer('neg_mean_squared_error')\n", - "X, y = sklearn.datasets.load_diabetes(return_X_y=True)\n", - "X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, train_size=0.75, test_size=0.25)\n", - "\n", - "est = tpot2.tpot_estimator.templates.TPOTRegressor(n_jobs=4, max_time_mins=30, verbose=2, cv=5)\n", - "est.fit(X_train, y_train)\n", - "\n", - "print(scorer(est, X_test, y_test))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Best Practices\n", - "\n", - "When running tpot from an .py script, it is important to protect code with `if __name__==\"__main__\":`" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: : 1it [01:05, 65.90s/it]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "0.9994639357871052\n" - ] - } - ], - "source": [ - "#my_analysis.py\n", - "\n", - "from dask.distributed import Client, LocalCluster\n", - "import tpot2\n", - "import sklearn\n", - "import sklearn.datasets\n", - "import numpy as np\n", - "\n", - "if __name__==\"__main__\":\n", - " scorer = sklearn.metrics.get_scorer('roc_auc_ovo')\n", - " X, y = sklearn.datasets.load_digits(return_X_y=True)\n", - " X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, train_size=0.75, test_size=0.25)\n", - "\n", - "\n", - 
" est = tpot2.TPOTClassifier(n_jobs=4, max_time_mins=60, verbose=2)\n", - " est.fit(X_train, y_train)\n", - "\n", - "\n", - " print(scorer(est, X_test, y_test))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Common parameters\n", - "\n", - " scorers : (list, scorer)\n", - " A scorer or list of scorers to be used in the cross-validation process. \n", - " see https://scikit-learn.org/stable/modules/model_evaluation.html\n", - " \n", - " scorers_weights : list\n", - " A list of weights to be applied to the scorers during the optimization process.\n", - " \n", - " classification : bool\n", - " If True, the problem is treated as a classification problem. If False, the problem is treated as a regression problem.\n", - " Used to determine the CV strategy.\n", - " \n", - " cv : int, cross-validator\n", - " - (int): Number of folds to use in the cross-validation process. By uses the sklearn.model_selection.KFold cross-validator for regression and StratifiedKFold for classification. In both cases, shuffled is set to True.\n", - " - (sklearn.model_selection.BaseCrossValidator): A cross-validator to use in the cross-validation process.\n", - " - max_depth (int): The maximum depth from any node to the root of the pipelines to be generated.\n", - " \n", - " other_objective_functions : list, default=[tpot2.objectives.estimator_objective_functions.average_path_length_objective]\n", - " A list of other objective functions to apply to the pipeline.\n", - " \n", - " other_objective_functions_weights : list, default=[-1]\n", - " A list of weights to be applied to the other objective functions.\n", - " \n", - " objective_function_names : list, default=None\n", - " A list of names to be applied to the objective functions. If None, will use the names of the objective functions.\n", - " \n", - " bigger_is_better : bool, default=True\n", - " If True, the objective function is maximized. If False, the objective function is minimized. 
Use negative weights to reverse the direction.\n", - " \n", - " generations : int, default=50\n", - " Number of generations to run\n", - " \n", - " max_time_mins : float, default=float(\"inf\")\n", - " Maximum time to run the optimization. If none or inf, will run until the end of the generations.\n", - " \n", - " max_eval_time_mins : float, default=60*5\n", - " Maximum time to evaluate a single individual. If none or inf, there will be no time limit per evaluation.\n", - "\n", - " n_jobs : int, default=1\n", - " Number of processes to run in parallel.\n", - " \n", - " memory_limit : str, default=\"4GB\"\n", - " Memory limit for each job. See Dask [LocalCluster documentation](https://distributed.dask.org/en/stable/api.html#distributed.Client) for more information.\n", - "\n", - " \n", - " verbose : int, default=1 \n", - " How much information to print during the optimization process. Higher values include the information from lower values.\n", - " 0. nothing\n", - " 1. progress bar\n", - " \n", - " 3. best individual\n", - " 4. warnings\n", - " >=5. full warnings trace\n", - " 6. evaluations progress bar. (Temporary: This used to be 2. Currently, using evaluation progress bar may prevent some instances were we terminate a generation early due to it reaching max_time_mins in the middle of a generation OR a pipeline failed to be terminated normally and we need to manually terminate it.)\n", - " \n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# TPOTEstimator and TPOTEstimatorSteadyState\n", - "\n", - "TPOTEstimator and TPOTEstimatorSteadyState expose more parameters for customizing search spaces and evolutionary algorithms. The next tutorial will cover customizing search spaces in more detail.\n", - "\n", - "The TPOTClassifier and TPOTRegressor set default parameters for the TPOTEstimator for Classification and Regression.\n", - "In the future, a metalearner will be used to predict the best values for a given dataset." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### tpot2.TPOTEstimatorSteadyState" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Evaluations: : 77it [00:30, 2.54it/s]\n", - "/home/ribeirop/miniconda3/envs/tpot2env/lib/python3.10/site-packages/sklearn/linear_model/_sag.py:350: ConvergenceWarning: The max_iter was reached which means the coef_ did not converge\n", - " warnings.warn(\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "1.0\n" - ] - } - ], - "source": [ - "import tpot2\n", - "import sklearn\n", - "import sklearn.datasets\n", - "\n", - "\n", - "graph_search_space = tpot2.search_spaces.pipelines.GraphPipeline(\n", - " root_search_space= tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"]),\n", - " leaf_search_space = tpot2.config.get_search_space(\"selectors\"), \n", - " inner_search_space = tpot2.config.get_search_space([\"transformers\"]),\n", - " max_size = 10,\n", - ")\n", - "\n", - "est = tpot2.TPOTEstimatorSteadyState( \n", - " search_space = graph_search_space,\n", - " scorers=['roc_auc_ovr'], #scorers can be a list of strings or a list of scorers. These get evaluated during cross validation. \n", - " scorers_weights=[1],\n", - "\n", - " classification=True,\n", - "\n", - " max_eval_time_mins=15,\n", - " max_time_mins=30,\n", - " verbose=2)\n", - "\n", - "\n", - "scorer = sklearn.metrics.get_scorer('roc_auc_ovo')\n", - "X, y = sklearn.datasets.load_iris(return_X_y=True)\n", - "X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, train_size=0.75, test_size=0.25)\n", - "est.fit(X_train, y_train)\n", - "print(scorer(est, X_test, y_test))\n" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [ - { - "data": { - "image/png": "", - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], - "source": [ - "fitted_pipeline = est.fitted_pipeline_ # access best pipeline directly\n", - "fitted_pipeline.plot()" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
roc_auc_scoreParentsVariation_FunctionIndividualSubmitted TimestampCompleted TimestampEval ErrorPareto_FrontInstance
00.914484NaNNaN<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09NoneNaN[('DecisionTreeClassifier_1', 'SelectFwe_1'), ...
10.966071NaNNaN<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09NoneNaN[('DecisionTreeClassifier_1', 'SelectPercentil...
20.735952NaNNaN<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09NoneNaN[('DecisionTreeClassifier_1', 'PassKBinsDiscre...
30.991534NaNNaN<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09NoneNaN[('LogisticRegression_1', 'SelectPercentile_1'...
40.997540NaNNaN<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09NoneNaN[('LogisticRegression_1', 'VarianceThreshold_1...
..............................
720.992910(19, 19)ind_mutate<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09NoneNaN[('KNeighborsClassifier_1', 'ColumnOneHotEncod...
730.983743(8, 8)ind_mutate<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09NoneNaN[('KNeighborsClassifier_1', 'VarianceThreshold...
740.997540(63, 63)ind_mutate<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09NoneNaN[('LogisticRegression_1', 'VarianceThreshold_1...
750.978929(63, 63)ind_mutate<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09NoneNaN[('LogisticRegression_1', 'SelectFwe_1'), ('Lo...
760.997540(65, 42)ind_crossover<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09NoneNaN[('LogisticRegression_1', 'VarianceThreshold_1...
\n", - "

77 rows × 9 columns

\n", - "
" - ], - "text/plain": [ - " roc_auc_score Parents Variation_Function \\\n", - "0 0.914484 NaN NaN \n", - "1 0.966071 NaN NaN \n", - "2 0.735952 NaN NaN \n", - "3 0.991534 NaN NaN \n", - "4 0.997540 NaN NaN \n", - ".. ... ... ... \n", - "72 0.992910 (19, 19) ind_mutate \n", - "73 0.983743 (8, 8) ind_mutate \n", - "74 0.997540 (63, 63) ind_mutate \n", - "75 0.978929 (63, 63) ind_mutate \n", - "76 0.997540 (65, 42) ind_crossover \n", - "\n", - " Individual Submitted Timestamp \\\n", - "0 " - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], - "source": [ - "fitted_pipeline = est.fitted_pipeline_ # access best pipeline directly\n", - "fitted_pipeline.plot() #plot the best pipeline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "view the results of all evaluated individuals as a pandas dataframe" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
roc_auc_scorecomplexity_scorerParentsVariation_FunctionIndividualSubmitted TimestampCompleted TimestampEval ErrorPareto_FrontInstance
00.97674616.0NaNNaN<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09NoneNaN[('LogisticRegression_1', 'FastICA_1'), ('Fast...
1NaNNaNNaNNaN<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09INVALIDNaN[('DecisionTreeClassifier_1', 'FeatureAgglomer...
20.99555615.0NaNNaN<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09NoneNaN[('LogisticRegression_1', 'VarianceThreshold_1')]
30.9856154.0NaNNaN<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09None1.0[('KNeighborsClassifier_1', 'SelectPercentile_...
40.65190590.0NaNNaN<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09NoneNaN[('KNeighborsClassifier_1', 'SelectFwe_1'), ('...
.................................
82NaNNaN(23, 23)ind_mutate<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09INVALIDNaN[('KNeighborsClassifier_1', 'SelectPercentile_...
830.9667064.0(66, 26)ind_mutate , ind_mutate , ind_crossover<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09NoneNaN[('KNeighborsClassifier_1', 'SelectPercentile_...
84NaNNaN(44, 44)ind_mutate<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09INVALIDNaN[('DecisionTreeClassifier_1', 'SelectPercentil...
850.998730308.8(63, 63)ind_mutate<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09NoneNaN[('LogisticRegression_1', 'SelectPercentile_2'...
860.998889301.0(24, 24)ind_mutate<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09NoneNaN[('LogisticRegression_1', 'QuantileTransformer...
\n", - "

87 rows × 10 columns

\n", - "
" - ], - "text/plain": [ - " roc_auc_score complexity_scorer Parents \\\n", - "0 0.976746 16.0 NaN \n", - "1 NaN NaN NaN \n", - "2 0.995556 15.0 NaN \n", - "3 0.985615 4.0 NaN \n", - "4 0.651905 90.0 NaN \n", - ".. ... ... ... \n", - "82 NaN NaN (23, 23) \n", - "83 0.966706 4.0 (66, 26) \n", - "84 NaN NaN (44, 44) \n", - "85 0.998730 308.8 (63, 63) \n", - "86 0.998889 301.0 (24, 24) \n", - "\n", - " Variation_Function \\\n", - "0 NaN \n", - "1 NaN \n", - "2 NaN \n", - "3 NaN \n", - "4 NaN \n", - ".. ... \n", - "82 ind_mutate \n", - "83 ind_mutate , ind_mutate , ind_crossover \n", - "84 ind_mutate \n", - "85 ind_mutate \n", - "86 ind_mutate \n", - "\n", - " Individual Submitted Timestamp \\\n", - "0 \n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
roc_auc_scorecomplexity_scorerParentsVariation_FunctionIndividualSubmitted TimestampCompleted TimestampEval ErrorPareto_FrontInstance
30.9856154.0NaNNaN<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09None1.0[('KNeighborsClassifier_1', 'SelectPercentile_...
140.9975408.0NaNNaN<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09None1.0[('KNeighborsClassifier_1', 'SelectPercentile_...
241.00000023.0NaNNaN<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09None1.0[('LogisticRegression_1', 'SelectFwe_1'), ('Lo...
250.99761917.4NaNNaN<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09None1.0[('LogisticRegression_1', 'FastICA_1'), ('Logi...
420.9905566.0NaNNaN<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09None1.0[('LogisticRegression_1', 'VarianceThreshold_1...
440.9933137.4NaNNaN<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09None1.0[('LogisticRegression_1', 'SelectPercentile_1'...
630.9656751.0(9, 42)ind_crossover<tpot2.search_spaces.pipelines.graph.GraphPipe...1.719621e+091.719621e+09None1.0[('KNeighborsClassifier_1', 'VarianceThreshold...
\n", - "" - ], - "text/plain": [ - " roc_auc_score complexity_scorer Parents Variation_Function \\\n", - "3 0.985615 4.0 NaN NaN \n", - "14 0.997540 8.0 NaN NaN \n", - "24 1.000000 23.0 NaN NaN \n", - "25 0.997619 17.4 NaN NaN \n", - "42 0.990556 6.0 NaN NaN \n", - "44 0.993313 7.4 NaN NaN \n", - "63 0.965675 1.0 (9, 42) ind_crossover \n", - "\n", - " Individual Submitted Timestamp \\\n", - "3 " - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], - "source": [ - "pareto_front = est.pareto_front\n", - "\n", - "#plot the pareto front of number_of_leaves_objective vs roc_auc_score\n", - "\n", - "import matplotlib.pyplot as plt\n", - "plt.scatter(pareto_front['complexity_scorer'], pareto_front['roc_auc_score'])\n", - "plt.xlabel('complexity_scorer')\n", - "plt.ylabel('roc_auc_score')\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### tpot2.TPOTEstimator" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 100%|██████████| 5/5 [01:12<00:00, 14.45s/it]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "0.9971509971509972\n" - ] - } - ], - "source": [ - "import tpot2\n", - "import sklearn\n", - "import sklearn.datasets\n", - "\n", - "est = tpot2.TPOTEstimator( \n", - " search_space = graph_search_space,\n", - " population_size=30,\n", - " generations=5,\n", - " scorers=['roc_auc_ovr'], #scorers can be a list of strings or a list of scorers. These get evaluated during cross validation. \n", - " scorers_weights=[1],\n", - " classification=True,\n", - " n_jobs=1, \n", - " early_stop=5, #how many generations with no improvement to stop after\n", - " \n", - " #List of other objective functions. 
All objective functions take in an untrained GraphPipeline and return a score or a list of scores\n", - " other_objective_functions= [ ],\n", - " \n", - " #List of weights for the other objective functions. Must be the same length as other_objective_functions. By default, bigger is better is set to True. \n", - " other_objective_functions_weights=[],\n", - " verbose=2)\n", - "\n", - "scorer = sklearn.metrics.get_scorer('roc_auc_ovo')\n", - "X, y = sklearn.datasets.load_iris(return_X_y=True)\n", - "X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, train_size=0.75, test_size=0.25)\n", - "est.fit(X_train, y_train)\n", - "print(scorer(est, X_test, y_test))" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "tpot_dev", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.14" - }, - "orig_nbformat": 4, - "vscode": { - "interpreter": { - "hash": "7fe1fe9ef32cd5efd76326a08046147513534f0dd2318301a1a96ae9071c1c4e" - } - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} diff --git a/Tutorial/1_Using_TPOT.ipynb b/Tutorial/1_Using_TPOT.ipynb new file mode 100644 index 00000000..b4f18627 --- /dev/null +++ b/Tutorial/1_Using_TPOT.ipynb @@ -0,0 +1,599 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# What to expect from AutoML software\n", + "Automated machine learning (AutoML) takes a higher-level approach to machine learning than most practitioners are used to, so we've gathered a handful of guidelines on what to expect when running AutoML software such as TPOT.\n", + "\n", + "#### AUTOML ALGORITHMS AREN'T INTENDED TO RUN FOR ONLY A FEW MINUTES\n", + "Of course, you can run TPOT for only a few minutes and it will find a reasonably good pipeline for your 
dataset. However, if you don't run TPOT for long enough, it may not find the best possible pipeline for your dataset. It may not even find any suitable pipeline at all, in which case a RuntimeError('A pipeline has not yet been optimized. Please call fit() first.') will be raised. Often it is worthwhile to run multiple instances of TPOT in parallel for a long time (hours to days) to allow TPOT to thoroughly search the pipeline space for your dataset.\n", + "\n", + "#### AUTOML ALGORITHMS CAN TAKE A LONG TIME TO FINISH THEIR SEARCH\n", + "AutoML algorithms aren't as simple as fitting one model on the dataset; they consider multiple machine learning algorithms (random forests, linear models, SVMs, etc.) in a pipeline with multiple preprocessing steps (missing value imputation, scaling, PCA, feature selection, etc.), the hyperparameters for all of the models and preprocessing steps, as well as multiple ways to ensemble or stack the algorithms within the pipeline.\n", + "\n", + "As such, TPOT will take a while to run on larger datasets, but it's important to realize why. With the default TPOT settings (100 generations with a population size of 100), TPOT will evaluate 10,000 pipeline configurations before finishing. To put this number into context, think about a grid search of 10,000 hyperparameter combinations for a machine learning algorithm and how long that grid search will take. That is 10,000 model configurations to evaluate with 10-fold cross-validation, which means that roughly 100,000 models are fit and evaluated on the training data in one grid search. That's a time-consuming procedure, even for simpler models like decision trees.\n", + "\n", + "Typical TPOT runs will take hours to days to finish (unless it's a small dataset), but you can always interrupt the run partway through and see the best results so far. 
TPOT also provides a warm_start parameter that lets you restart a TPOT run from where it left off.\n", + "\n", + "#### AUTOML ALGORITHMS CAN RECOMMEND DIFFERENT SOLUTIONS FOR THE SAME DATASET\n", + "If you're working with a reasonably complex dataset or run TPOT for a short amount of time, different TPOT runs may result in different pipeline recommendations. TPOT's optimization algorithm is stochastic in nature, which means that it uses randomness (in part) to search the possible pipeline space. When two TPOT runs recommend different pipelines, this means that the TPOT runs didn't converge due to lack of time or that multiple pipelines perform more-or-less the same on your dataset.\n", + "\n", + "This is actually an advantage over fixed grid search techniques: TPOT is meant to be an assistant that gives you ideas on how to solve a particular machine learning problem by exploring pipeline configurations that you might have never considered, then leaves the fine-tuning to more constrained parameter tuning techniques such as grid search." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# TPOT with code\n", + "\n", + "We've taken care to design the TPOT interface to be as similar as possible to scikit-learn.\n", + "\n", + "TPOT can be imported just like any regular Python module. To import TPOT, type:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "from tpot2 import TPOTClassifier" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "then create an instance of TPOT as follows:" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "classification_optimizer = TPOTClassifier()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "It's also possible to use TPOT for regression problems with the TPOTRegressor class. 
Other than the class name, a TPOTRegressor is used the same way as a TPOTClassifier. You can read more about the TPOTClassifier and TPOTRegressor classes in the API documentation." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from tpot2 import TPOTRegressor\n", + "regression_optimizer = TPOTRegressor()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Fitting a TPOT model works exactly like any other sklearn estimator. Some example code with custom TPOT parameters might look like:" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Generation: : 3it [00:33, 11.04s/it]\n", + "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/preprocessing/_data.py:2762: UserWarning: n_quantiles (1895) is greater than the total number of samples (455). n_quantiles is set to n_samples.\n", + " warnings.warn(\n", + "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/linear_model/_sag.py:350: ConvergenceWarning: The max_iter was reached which means the coef_ did not converge\n", + " warnings.warn(\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "auroc_score: 0.9904100529100529\n" + ] + } + ], + "source": [ + "import sklearn\n", + "import sklearn.datasets\n", + "import sklearn.metrics\n", + "\n", + "classification_optimizer = TPOTClassifier(search_space=\"light\", max_time_mins=30/60, n_jobs=30, cv=5)\n", + "\n", + "X, y = sklearn.datasets.load_breast_cancer(return_X_y=True)\n", + "X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, random_state=1, test_size=0.2)\n", + "\n", + "classification_optimizer.fit(X_train, y_train)\n", + "\n", + "auroc_score = sklearn.metrics.roc_auc_score(y_test, classification_optimizer.predict_proba(X_test)[:,1])\n", + "print(\"auroc_score: \", auroc_score)" + 
] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Scorers, Objective Functions, and Multi-Objective Optimization\n", + "\n", + "There are two ways of passing objectives into TPOT2. \n", + "\n", + "1. `scorers`: Scorers are functions that have the signature (estimator, X, y). These can be produced with the [sklearn.metrics.make_scorer](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.make_scorer.html) function. A scorer is used to evaluate the test folds during cross-validation. Scorers are passed into TPOT2 via the `scorers` parameter, which accepts either the scorer itself or the string corresponding to a scoring function ([as listed here](https://scikit-learn.org/stable/modules/model_evaluation.html)). TPOT2 also supports passing in a list of several scorers for multi-objective optimization. \n", + "\n", + "2. `other_objective_functions` : Other objective functions in TPOT2 have the signature (estimator) and return a float or list of floats. These get passed an unfitted estimator (in the case of TPOT2, a `tpot2.GraphPipeline`). \n", + "\n", + "\n", + "\n", + "Each scorer and objective function must be accompanied by a list of weights corresponding to the list of objectives. By default, TPOT2 maximizes objective functions (this can be changed by setting `bigger_is_better=False`). 
Positive weights mean that TPOT2 will seek to maximize that objective, and negative weights correspond to minimization.\n", + "\n", + "Here is an example of using two scorers:\n", + "\n", + " scorers=['roc_auc_ovr',tpot2.objectives.complexity_scorer],\n", + " scorers_weights=[1,-1],\n", + "\n", + "\n", + "Here is an example with a scorer and a secondary objective function:\n", + "\n", + " scorers=['roc_auc_ovr'],\n", + " scorers_weights=[1],\n", + " other_objective_functions=[tpot2.objectives.number_of_leaves_objective],\n", + " other_objective_functions_weights=[-1],\n", + "\n", + "\n", + "TPOT will automatically name the scores based on the function name for the columns in the final results dataframe. If you would like to specify custom function names, you can set `objective_function_names` to a list of names (str), one for each score. The names are ordered with scorers first and other objective functions second (e.g. `objective_function_names=['scorer1','scorer2', 'objective1','objective2']`).\n", + "\n", + "It is possible for either a scorer or an other objective function to return multiple values. In that case, just make sure that `scorers_weights` and `other_objective_functions_weights` are the same length as the number of returned scores.\n", + "\n", + "\n", + "TPOT comes with a few additional built-in objective functions you can use. The first table lists objectives applied to fitted pipelines, which are therefore passed into the `scorers` parameter. The second table lists objective functions for the `other_objective_functions` parameter.\n", + "\n", + "Scorers:\n", + "| Function | Description |\n", + "| :--- | :----: |\n", + "| tpot2.objectives.complexity_scorer | Estimates the number of learned parameters across all classifiers and regressors in the pipeline. Currently, transformers add 1 point and selectors add 0 points (since they don't affect the complexity of the \"final\" predictive pipeline). 
|\n", + "\n", + "Other Objective Functions:\n", + "\n", + "| Function | Description |\n", + "| :--- | :----: |\n", + "| tpot2.objectives.average_path_length_objective | Computes the average shortest path from all nodes to the root/final estimator (only supported for GraphPipeline) |\n", + "| tpot2.objectives.number_of_leaves_objective | Calculates the number of leaves (input nodes) in a GraphPipeline |\n", + "| tpot2.objectives.number_of_nodes_objective | Calculates the number of nodes in a pipeline (whether it is a scikit-learn Pipeline, GraphPipeline, FeatureUnion, or any of these nested within each other) |" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Built In Configurations\n", + "TPOT can be used to optimize hyperparameters, select models, and optimize pipelines of models, including determining the sequence of steps. Tutorial 2 goes into more detail on how to customize search spaces with custom hyperparameter ranges, model types, and possible pipeline configurations. TPOT also comes with a handful of default operators and parameter configurations that we believe work well for optimizing machine learning pipelines. Below is a list of the current built-in configurations that come with TPOT. These can be passed in as strings to the `search_space` parameter of any of the TPOT estimators.\n", + "\n", + "| String | Description |\n", + "| :--- | :----: |\n", + "| linear | A linear pipeline with the structure of \"Selector->(transformers+Passthrough)->(classifiers/regressors+Passthrough)->final classifier/regressor.\" For both the transformer and inner estimator layers, TPOT may choose one or more transformers/classifiers, or it may choose none. The inner classifier/regressor layer is optional. |\n", + "| light | Same search space as linear, but without the inner classifier/regressor layer and with a reduced set of faster-running estimators. |\n", + "| graph | TPOT will optimize a pipeline in the shape of a directed acyclic graph. 
The nodes of the graph can include selectors, scalers, transformers, or classifiers/regressors (inner classifiers/regressors can optionally be excluded). This will return a custom GraphPipeline rather than an sklearn Pipeline. More details in Tutorial 6. |\n", + "| mdr | TPOT will search over a series of feature selectors and Multifactor Dimensionality Reduction models to find a series of operators that maximize prediction accuracy. The TPOT MDR configuration is specialized for genome-wide association studies (GWAS), and is described in detail online here. Note that TPOT MDR may be slow to run because the feature selection routines are computationally expensive, especially on large datasets. |\n", + "\n", + "\n", + "\n", + "Note: the `linear` and `graph` configurations by default allow for additional stacked classifiers/regressors within the pipeline in addition to the final classifier/regressor. If you would like to disable this, you can manually get the search space without inner classifiers/regressors through the function `tpot2.config.template_search_spaces.get_template_search_spaces` with `inner_predictors=False`. You can pass the resulting search space into the `search_space` parameter.\n", + "\n", + "The specific hyperparameter ranges used by TPOT can be found in files in the tpot2/config folder. The template search spaces listed above are defined in tpot2/config/template_search_spaces.py. Search spaces for individual models can be acquired in the tpot2/config/get_configspace.py file (`tpot2.config.get_search_space`). More details in Tutorial 2." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Example analysis " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Best Practices\n", + "\n", + "When running TPOT from a .py script, it is important to protect code with `if __name__==\"__main__\":`. 
This is because of how TPOT handles parallelization with Python and Dask." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "#my_analysis.py\n", + "\n", + "import tpot2\n", + "import sklearn\n", + "import sklearn.datasets\n", + "import sklearn.metrics\n", + "import sklearn.model_selection\n", + "\n", + "if __name__==\"__main__\":\n", + " scorer = sklearn.metrics.get_scorer('roc_auc_ovo')\n", + " X, y = sklearn.datasets.load_digits(return_X_y=True)\n", + " X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, train_size=0.75, test_size=0.25)\n", + "\n", + "\n", + " est = tpot2.TPOTClassifier(n_jobs=4, max_time_mins=60, verbose=2)\n", + " est.fit(X_train, y_train)\n", + "\n", + "\n", + " print(scorer(est, X_test, y_test))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Common parameters\n", + "\n", + " scorers : (list, scorer)\n", + " A scorer or list of scorers to be used in the cross-validation process. \n", + " see https://scikit-learn.org/stable/modules/model_evaluation.html\n", + " \n", + " scorers_weights : list\n", + " A list of weights to be applied to the scorers during the optimization process.\n", + " \n", + " classification : bool\n", + " If True, the problem is treated as a classification problem. If False, the problem is treated as a regression problem.\n", + " Used to determine the CV strategy.\n", + " \n", + " cv : int, cross-validator\n", + " - (int): Number of folds to use in the cross-validation process. By default, uses the sklearn.model_selection.KFold cross-validator for regression and StratifiedKFold for classification. 
In both cases, `shuffle` is set to True.\n", + " - (sklearn.model_selection.BaseCrossValidator): A cross-validator to use in the cross-validation process.\n", + " - max_depth (int): The maximum depth from any node to the root of the pipelines to be generated.\n", + " \n", + " other_objective_functions : list, default=[tpot2.objectives.estimator_objective_functions.average_path_length_objective]\n", + " A list of other objective functions to apply to the pipeline.\n", + " \n", + " other_objective_functions_weights : list, default=[-1]\n", + " A list of weights to be applied to the other objective functions.\n", + " \n", + " objective_function_names : list, default=None\n", + " A list of names to be applied to the objective functions. If None, will use the names of the objective functions.\n", + " \n", + " bigger_is_better : bool, default=True\n", + " If True, the objective function is maximized. If False, the objective function is minimized. Use negative weights to reverse the direction.\n", + " \n", + " generations : int, default=50\n", + " Number of generations to run\n", + " \n", + " max_time_mins : float, default=float(\"inf\")\n", + " Maximum time to run the optimization. If None or inf, will run until the end of the generations.\n", + " \n", + " max_eval_time_mins : float, default=60*5\n", + " Maximum time to evaluate a single individual. If None or inf, there will be no time limit per evaluation.\n", + "\n", + " n_jobs : int, default=1\n", + " Number of processes to run in parallel.\n", + " \n", + " memory_limit : str, default=\"4GB\"\n", + " Memory limit for each job. See Dask [LocalCluster documentation](https://distributed.dask.org/en/stable/api.html#distributed.Client) for more information.\n", + "\n", + " \n", + " verbose : int, default=1 \n", + " How much information to print during the optimization process. Higher values include the information from lower values.\n", + " 0. nothing\n", + " 1. progress bar\n", + " \n", + " 3. best individual\n", + " 4. 
warnings\n", + " >=5. full warnings trace\n", + " 6. evaluations progress bar. (Temporary: This used to be 2. Currently, using the evaluation progress bar may prevent some instances where we terminate a generation early due to it reaching max_time_mins in the middle of a generation OR a pipeline failed to be terminated normally and we need to manually terminate it.)\n", + " \n" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# More Options\n", + "\n", + "`tpot2.TPOTClassifier` and `tpot2.TPOTRegressor` have a simplified set of hyperparameters with default values set for classification and regression problems. Currently, both of these use the standard evolutionary algorithm in the `tpot2.TPOTEstimator` class. If you want more control you can look into either the `tpot2.TPOTEstimator` or `tpot2.TPOTEstimatorSteadyState` class.\n", + "\n", + "There are two evolutionary algorithms built into TPOT2, which correspond to two different estimator classes.\n", + "\n", + "1. The `tpot2.TPOTEstimator` uses a standard evolutionary algorithm that evaluates exactly population_size individuals each generation. This is similar to the algorithm in TPOT1. The next generation does not start until the previous one has completely finished evaluating. This leads to underutilized CPU time as the cores are waiting for the last individuals to finish training, but may preserve diversity in the population. \n", + "\n", + "2. The `tpot2.TPOTEstimatorSteadyState` differs in that it will generate and evaluate the next individual as soon as an individual finishes evaluation. The number of individuals being evaluated is determined by the n_jobs parameter. There is no longer a concept of generations. The population_size parameter now refers to the size of the list of evaluated parents. When an individual is evaluated, the selection method updates the list of parents. 
This allows more efficient utilization when using multiple cores.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import tpot2\n", + "import sklearn\n", + "import sklearn.datasets\n", + "\n", + "scorer = sklearn.metrics.get_scorer('roc_auc_ovo')\n", + "X, y = sklearn.datasets.load_iris(return_X_y=True)\n", + "X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, train_size=0.75, test_size=0.25)\n", + "\n", + "\n", + "est = tpot2.TPOTClassifier(n_jobs=40, max_time_mins=30, verbose=5, generations=1, population_size=5)\n", + "est.fit(X_train, y_train)\n", + "\n", + "\n", + "print(scorer(est, X_test, y_test))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "est._evolver_instance.population.evaluated_individuals.iloc[0]['Individual'].export_pipeline()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import tpot2\n", + "import sklearn\n", + "import sklearn.metrics\n", + "import sklearn.datasets\n", + "\n", + "scorer = sklearn.metrics.get_scorer('neg_mean_squared_error')\n", + "X, y = sklearn.datasets.load_diabetes(return_X_y=True)\n", + "X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, train_size=0.75, test_size=0.25)\n", + "\n", + "est = tpot2.tpot_estimator.templates.TPOTRegressor(n_jobs=4, max_time_mins=30, verbose=2, cv=5)\n", + "est.fit(X_train, y_train)\n", + "\n", + "print(scorer(est, X_test, y_test))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### tpot2.TPOTEstimatorSteadyState" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import tpot2\n", + "import sklearn\n", + "import sklearn.datasets\n", + "\n", + "\n", + "graph_search_space = tpot2.search_spaces.pipelines.GraphPipeline(\n", + " root_search_space= 
tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"]),\n", + " leaf_search_space = tpot2.config.get_search_space(\"selectors\"), \n", + " inner_search_space = tpot2.config.get_search_space([\"transformers\"]),\n", + " max_size = 10,\n", + ")\n", + "\n", + "est = tpot2.TPOTEstimatorSteadyState( \n", + " search_space = graph_search_space,\n", + " scorers=['roc_auc_ovr'], #scorers can be a list of strings or a list of scorers. These get evaluated during cross validation. \n", + " scorers_weights=[1],\n", + "\n", + " classification=True,\n", + "\n", + " max_eval_time_mins=15,\n", + " max_time_mins=30,\n", + " verbose=2)\n", + "\n", + "\n", + "scorer = sklearn.metrics.get_scorer('roc_auc_ovo')\n", + "X, y = sklearn.datasets.load_iris(return_X_y=True)\n", + "X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, train_size=0.75, test_size=0.25)\n", + "est.fit(X_train, y_train)\n", + "print(scorer(est, X_test, y_test))\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "fitted_pipeline = est.fitted_pipeline_ # access best pipeline directly\n", + "fitted_pipeline.plot()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "#view the summary of all evaluated individuals as a pandas dataframe\n", + "est.evaluated_individuals" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import tpot2\n", + "import sklearn\n", + "import sklearn.datasets\n", + "\n", + "est = tpot2.TPOTEstimatorSteadyState( \n", + " search_space = graph_search_space,\n", + " scorers=['roc_auc_ovr',tpot2.objectives.complexity_scorer],\n", + " scorers_weights=[1,-1],\n", + "\n", + " classification=True,\n", + "\n", + " max_eval_time_mins=15,\n", + " max_time_mins=30,\n", + " verbose=2)\n", + "\n", + "\n", + "scorer = 
sklearn.metrics.get_scorer('roc_auc_ovo')\n", + "X, y = sklearn.datasets.load_iris(return_X_y=True)\n", + "X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, train_size=0.75, test_size=0.25)\n", + "est.fit(X_train, y_train)\n", + "print(scorer(est, X_test, y_test))\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "fitted_pipeline = est.fitted_pipeline_ # access best pipeline directly\n", + "fitted_pipeline.plot() #plot the best pipeline" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "view the results of all evaluated individuals as a pandas dataframe" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "est.evaluated_individuals" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "view pareto front as a pandas dataframe" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "est.pareto_front" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "pareto_front = est.pareto_front\n", + "\n", + "#plot the pareto front of complexity_scorer vs roc_auc_score\n", + "\n", + "import matplotlib.pyplot as plt\n", + "plt.scatter(pareto_front['complexity_scorer'], pareto_front['roc_auc_score'])\n", + "plt.xlabel('complexity_scorer')\n", + "plt.ylabel('roc_auc_score')\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### tpot2.TPOTEstimator" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import tpot2\n", + "import sklearn\n", + "import sklearn.datasets\n", + "\n", + "est = tpot2.TPOTEstimator( \n", + " search_space = graph_search_space,\n", + " population_size=30,\n", + " generations=5,\n", + " scorers=['roc_auc_ovr'], #scorers can be 
a list of strings or a list of scorers. These get evaluated during cross validation. \n", + " scorers_weights=[1],\n", + " classification=True,\n", + " n_jobs=1, \n", + " early_stop=5, #how many generations with no improvement to stop after\n", + " \n", + " #List of other objective functions. All objective functions take in an untrained GraphPipeline and return a score or a list of scores\n", + " other_objective_functions= [ ],\n", + " \n", + " #List of weights for the other objective functions. Must be the same length as other_objective_functions. By default, bigger is better is set to True. \n", + " other_objective_functions_weights=[],\n", + " verbose=2)\n", + "\n", + "scorer = sklearn.metrics.get_scorer('roc_auc_ovo')\n", + "X, y = sklearn.datasets.load_iris(return_X_y=True)\n", + "X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, train_size=0.75, test_size=0.25)\n", + "est.fit(X_train, y_train)\n", + "print(scorer(est, X_test, y_test))" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "tpot_dev", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.14" + }, + "orig_nbformat": 4, + "vscode": { + "interpreter": { + "hash": "7fe1fe9ef32cd5efd76326a08046147513534f0dd2318301a1a96ae9071c1c4e" + } + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/tpot2/config/classifiers.py b/tpot2/config/classifiers.py index 2fb09e41..8ee26963 100644 --- a/tpot2/config/classifiers.py +++ b/tpot2/config/classifiers.py @@ -42,7 +42,6 @@ def get_KNeighborsClassifier_ConfigurationSpace(n_samples): 'n_neighbors': Integer("n_neighbors", bounds=(1, min(100,n_samples)), log=True), 'weights': Categorical("weights", ['uniform', 'distance']), 'p': Integer("p", bounds=(1, 3)), - 'metric': 
Categorical("metric", ['euclidean', 'minkowski']), 'n_jobs': 1, } ) @@ -79,7 +78,7 @@ def get_DecisionTreeClassifier_ConfigurationSpace(n_featues, random_state): space = { 'criterion': Categorical("criterion", ['gini', 'entropy']), 'max_depth': Integer("max_depth", bounds=(1, min(20,2*n_featues))), #max of 20? log scale? - 'min_samples_split': Integer("min_samples_split", bounds=(1, 20)), + 'min_samples_split': Integer("min_samples_split", bounds=(2, 20)), 'min_samples_leaf': Integer("min_samples_leaf", bounds=(1, 20)), 'max_features': Categorical("max_features", [NONE_SPECIAL_STRING, 'sqrt', 'log2']), 'min_weight_fraction_leaf': 0.0, diff --git a/tpot2/config/regressors.py b/tpot2/config/regressors.py index ab14a7ea..6348f5c2 100644 --- a/tpot2/config/regressors.py +++ b/tpot2/config/regressors.py @@ -240,9 +240,7 @@ def get_KNeighborsRegressor_ConfigurationSpace(n_samples): space = { 'n_neighbors': Integer("n_neighbors", bounds=(1, min(100,n_samples))), 'weights': Categorical("weights", ['uniform', 'distance']), - 'p': Integer("p", bounds=(1, 3)), - 'metric': Categorical("metric", ['minkowski', 'euclidean', 'manhattan']), - } + 'p': Integer("p", bounds=(1, 3)), } ) diff --git a/tpot2/config/template_search_spaces.py b/tpot2/config/template_search_spaces.py index 593f0499..23b7a324 100644 --- a/tpot2/config/template_search_spaces.py +++ b/tpot2/config/template_search_spaces.py @@ -85,12 +85,93 @@ def get_graph_search_space(classification=True, inner_predictors=True, **get_sea return search_space -def get_template_search_spaces(default_search_space, classification=True, inner_predictors=True, **get_search_space_params): + +def get_light_search_space(classification=True, inner_predictors=False, **get_search_space_params ): + + selectors = get_search_space(["SelectFwe", "SelectPercentile", "VarianceThreshold","Passthrough"], **get_search_space_params) + + if classification: + estimators = get_search_space(['BernoulliNB', 'DecisionTreeClassifier', 'GaussianNB', 
'KNeighborsClassifier', 'LogisticRegression', 'MultinomialNB'], **get_search_space_params) + else: + estimators = get_search_space(["RidgeCV", "LinearSVR", "LassoLarsCV", "KNeighborsRegressor", "DecisionTreeRegressor", "ElasticNetCV"], **get_search_space_params) + + # this allows us to wrap the classifiers in the EstimatorTransformer + # this is necessary so that classifiers can be used inside of sklearn pipelines + wrapped_estimators = WrapperPipeline(tpot2.builtin_modules.EstimatorTransformer, {}, estimators) + + scalers = get_search_space(["scalers","Passthrough"], **get_search_space_params) + + transformers_layer =UnionPipeline([ + ChoicePipeline([ + DynamicUnionPipeline(get_search_space(["transformers"],**get_search_space_params)), + get_search_space("SkipTransformer", **get_search_space_params), + ]), + get_search_space("Passthrough", **get_search_space_params) + ] + ) + + inner_estimators_layer = UnionPipeline([ + ChoicePipeline([ + DynamicUnionPipeline(wrapped_estimators), + get_search_space("SkipTransformer", **get_search_space_params), + ]), + get_search_space("Passthrough", **get_search_space_params)] + ) + + if inner_predictors: + search_space = SequentialPipeline(search_spaces=[ + scalers, + selectors, + transformers_layer, + inner_estimators_layer, + estimators, + ]) + else: + search_space = SequentialPipeline(search_spaces=[ + scalers, + selectors, + transformers_layer, + estimators, + ]) + + return search_space + +def get_mdr_search_space(classification=True, **get_search_space_params ): + + mdr_sp = DynamicLinearPipeline(get_search_space(["ReliefF", "SURF", "SURFstar", "MultiSURF", "ContinuousMDR"], **get_search_space_params), max_length=10) + + if classification: + estimators = get_search_space(['LogisticRegression'], **get_search_space_params) + else: + estimators = get_search_space(["ElasticNetCV"], **get_search_space_params) + + search_space = SequentialPipeline(search_spaces=[ + mdr_sp, + estimators, + ]) + + return search_space + + + + +def 
get_template_search_spaces(default_search_space, classification=True, inner_predictors=None, **get_search_space_params): + + if inner_predictors is None: + if default_search_space == "light": + inner_predictors = False + else: + inner_predictors = True + if isinstance(default_search_space, str): if default_search_space == "linear": return get_linear_search_space(classification, inner_predictors, **get_search_space_params) elif default_search_space == "graph": return get_graph_search_space(classification, inner_predictors, **get_search_space_params) + elif default_search_space == "light": + return get_light_search_space(classification, inner_predictors, **get_search_space_params) + elif default_search_space == "mdr": + return get_mdr_search_space(classification, **get_search_space_params) else: raise ValueError("Invalid search space") else: diff --git a/tpot2/objectives/average_path_length.py b/tpot2/objectives/average_path_length.py index 407b918b..dd3fcb0d 100644 --- a/tpot2/objectives/average_path_length.py +++ b/tpot2/objectives/average_path_length.py @@ -2,6 +2,15 @@ import numpy as np def average_path_length_objective(graph_pipeline): + """ + Computes the average shortest path from all nodes to the root/final estimator (only supported for GraphPipeline) + + Parameters + ---------- + graph_pipeline: GraphPipeline + The pipeline to compute the average path length for + + """ path_lengths = nx.shortest_path_length(graph_pipeline.graph, source=graph_pipeline.root) return np.mean(np.array(list(path_lengths.values())))+1 \ No newline at end of file diff --git a/tpot2/objectives/complexity.py b/tpot2/objectives/complexity.py index f4d5112d..f3c305a1 100644 --- a/tpot2/objectives/complexity.py +++ b/tpot2/objectives/complexity.py @@ -228,6 +228,20 @@ def calculate_model_complexity(est): return 1 -def complexity_scorer(est, X, y): +def complexity_scorer(est, X=None, y=None): + """ + Estimates the number of learned parameters across all classifiers and regressors in the 
pipelines. + Additionally, transformers currently add 1 point and selectors add 0 points (since they don't affect the complexity of the "final" predictive pipeline). + + Parameters + ---------- + est: sklearn.base.BaseEstimator + The estimator or pipeline to compute the complexity for + X: array-like + The input samples (unused) + y: array-like + The target values (unused) + + """ return calculate_model_complexity(est) diff --git a/tpot2/objectives/number_of_leaves.py b/tpot2/objectives/number_of_leaves.py index 2ea34c62..f876caff 100644 --- a/tpot2/objectives/number_of_leaves.py +++ b/tpot2/objectives/number_of_leaves.py @@ -1,5 +1,13 @@ -def number_of_leaves_scorer(est,X,y): +def number_of_leaves_scorer(est,X=None, y=None): return len([v for v, d in est.graph.out_degree() if d == 0]) def number_of_leaves_objective(est): + """ + Calculates the number of leaves (input nodes) in a GraphPipeline + + Parameters + ---------- + est: GraphPipeline + The pipeline to compute the number of leaves for + """ return len([v for v, d in est.graph.out_degree() if d == 0]) \ No newline at end of file diff --git a/tpot2/objectives/number_of_nodes.py b/tpot2/objectives/number_of_nodes.py index a56368e2..a531851d 100644 --- a/tpot2/objectives/number_of_nodes.py +++ b/tpot2/objectives/number_of_nodes.py @@ -3,6 +3,15 @@ import sklearn def number_of_nodes_objective(est): + """ + Calculates the number of nodes in an sklearn pipeline + + Parameters + ---------- + est: GraphPipeline | Pipeline | FeatureUnion | BaseEstimator + The pipeline to compute the number of nodes from. 
+ """ + if isinstance(est, GraphPipeline): return sum(number_of_nodes_objective(est.graph.nodes[node]["instance"]) for node in est.graph.nodes) if isinstance(est, Pipeline): diff --git a/tpot2/tpot_estimator/estimator.py b/tpot2/tpot_estimator/estimator.py index 4623b12f..fdaaae86 100644 --- a/tpot2/tpot_estimator/estimator.py +++ b/tpot2/tpot_estimator/estimator.py @@ -36,7 +36,7 @@ def __init__(self, scorers, scorers_weights, classification, - cv = 5, + cv = 10, other_objective_functions=[], other_objective_functions_weights = [], objective_function_names = None, diff --git a/tpot2/tpot_estimator/templates/tpottemplates.py b/tpot2/tpot_estimator/templates/tpottemplates.py index c91645eb..ce37d31b 100644 --- a/tpot2/tpot_estimator/templates/tpottemplates.py +++ b/tpot2/tpot_estimator/templates/tpottemplates.py @@ -29,11 +29,11 @@ def __init__( self, early_stop = None, warm_start = False, periodic_checkpoint_folder = None, - verbose = 0, + verbose = 2, memory_limit = "4GB", client = None, random_state=None, - allow_inner_regressors=True, + allow_inner_regressors=None, **tpotestimator_kwargs, ): ''' @@ -282,11 +282,11 @@ def __init__( self, early_stop = None, warm_start = False, periodic_checkpoint_folder = None, - verbose = 0, + verbose = 2, memory_limit = "4GB", client = None, random_state=None, - allow_inner_classifiers=True, + allow_inner_classifiers=None, **tpotestimator_kwargs, ): From 1314eb75849d3fd37654fdbbbd65f4c85d6da477 Mon Sep 17 00:00:00 2001 From: perib Date: Fri, 20 Sep 2024 15:27:26 -0700 Subject: [PATCH 02/44] edits --- Tutorial/1_Using_TPOT.ipynb | 355 +++++++++++++++++- tpot2/evolvers/base_evolver.py | 2 +- tpot2/tpot_estimator/estimator.py | 2 +- .../tpot_estimator/templates/tpottemplates.py | 4 +- 4 files changed, 348 insertions(+), 15 deletions(-) diff --git a/Tutorial/1_Using_TPOT.ipynb b/Tutorial/1_Using_TPOT.ipynb index b4f18627..9e707306 100644 --- a/Tutorial/1_Using_TPOT.ipynb +++ b/Tutorial/1_Using_TPOT.ipynb @@ -137,7 +137,7 @@ "\n", 
"\n", "\n", - "Each scorer and objective function must be accompanied by a list of weights corresponding to the list of objectives. By default, TPOT2 maximizes objective functions (this can be changed by `bigger_is_better=False`). Positive weights means that TPOT2 will seek to maximize that objective, and negative weights correspond to minimization.\n", + "Each scorer and objective function must be accompanied by a list of weights corresponding to the list of objectives; these are `scorers_weights` and `other_objective_functions_weights`, respectively. By default, TPOT2 maximizes objective functions (this can be changed by `bigger_is_better=False`). Positive weights mean that TPOT2 will seek to maximize that objective, and negative weights correspond to minimization. For most selectors (and the default), only the sign matters. The scale of the weight may matter if using a custom selection function for the optimization algorithm. A zero weight means that the score will not have an impact on the selection algorithm.\n", "\n", "Here is an example of using two scorers\n", "\n", @@ -199,23 +199,26 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Example analysis " + "## Terminating Optimization\n", + "\n", + "Note that we use a short time duration for a quick example, but in practice you may need to run TPOT for a longer duration. By default, TPOT sets a time limit of 1 hour with a max limit of 5 minutes per pipeline. In practice you may want to increase these values.\n", + "\n", + "There are three methods of terminating a TPOT run and ending the optimization process. TPOT will always terminate as soon as one of the conditions is met.\n", + "* `max_time_mins` : (Default, 1 hour) After this many minutes, TPOT will terminate and return the best pipeline it found so far.\n", + "* `early_stop` : An int causes TPOT to terminate early if it goes that number of generations without seeing an improvement in performance. 
Generally a value of around 5 to 20 is sufficient to be reasonably sure that performance has converged.\n", + "* `generations` : The total number of generations of the evolutionary algorithm to run.\n", + "\n", + "By default, TPOT will run until the time limit is up, with no generation or early stop limits." ] }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### Best Practices\n", + "### Best Practices and tips:\n", "\n", - "When running tpot from an .py script, it is important to protect code with `if __name__==\"__main__\":` . This is because of how TPOT handles parallelization with Python and Dask." + "* When running tpot from an .py script, it is important to protect code with `if __name__==\"__main__\":` . This is because of how TPOT handles parallelization with Python and Dask.\n", + "* You can use the `early_stop` parameter to have TPOT terminate early. " ] }, { @@ -245,6 +248,336 @@ " print(scorer(est, X_test, y_test))" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Example analysis " + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "ename": "TypeError", + "evalue": "TPOTEstimator.__init__() got an unexpected keyword argument 'scorer_weights'", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", + "Cell \u001b[0;32mIn[5], line 29\u001b[0m\n\u001b[1;32m 15\u001b[0m X_train, X_test, y_train, y_test \u001b[38;5;241m=\u001b[39m sklearn\u001b[38;5;241m.\u001b[39mmodel_selection\u001b[38;5;241m.\u001b[39mtrain_test_split(X, y, train_size\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m0.75\u001b[39m, test_size\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m0.25\u001b[39m)\n\u001b[1;32m 18\u001b[0m est 
\u001b[38;5;241m=\u001b[39m tpot2\u001b[38;5;241m.\u001b[39mTPOTClassifier(\n\u001b[1;32m 19\u001b[0m scorers\u001b[38;5;241m=\u001b[39m[scorer, tpot2\u001b[38;5;241m.\u001b[39mobjectives\u001b[38;5;241m.\u001b[39mcomplexity_scorer],\n\u001b[1;32m 20\u001b[0m scorer_weights\u001b[38;5;241m=\u001b[39m[\u001b[38;5;241m1.0\u001b[39m, \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m1.0\u001b[39m],\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 27\u001b[0m early_stop\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m2\u001b[39m,\n\u001b[1;32m 28\u001b[0m verbose\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m2\u001b[39m,)\n\u001b[0;32m---> 29\u001b[0m \u001b[43mest\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfit\u001b[49m\u001b[43m(\u001b[49m\u001b[43mX_train\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43my_train\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 32\u001b[0m \u001b[38;5;28mprint\u001b[39m(scorer(est, X_test, y_test))\n", + "File \u001b[0;32m~/Projects/common/Projects/TPOT_Dev/tpot2/tpot2/tpot_estimator/templates/tpottemplates.py:487\u001b[0m, in \u001b[0;36mTPOTClassifier.fit\u001b[0;34m(self, X, y)\u001b[0m\n\u001b[1;32m 479\u001b[0m get_search_space_params \u001b[38;5;241m=\u001b[39m {\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mn_classes\u001b[39m\u001b[38;5;124m\"\u001b[39m: \u001b[38;5;28mlen\u001b[39m(np\u001b[38;5;241m.\u001b[39munique(y)), \n\u001b[1;32m 480\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mn_samples\u001b[39m\u001b[38;5;124m\"\u001b[39m:\u001b[38;5;28mlen\u001b[39m(y), \n\u001b[1;32m 481\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mn_features\u001b[39m\u001b[38;5;124m\"\u001b[39m:X\u001b[38;5;241m.\u001b[39mshape[\u001b[38;5;241m1\u001b[39m], \n\u001b[1;32m 482\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mrandom_state\u001b[39m\u001b[38;5;124m\"\u001b[39m:\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mrandom_state}\n\u001b[1;32m 484\u001b[0m search_space \u001b[38;5;241m=\u001b[39m 
get_template_search_spaces(\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39msearch_space, classification\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mTrue\u001b[39;00m, inner_predictors\u001b[38;5;241m=\u001b[39m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mallow_inner_classifiers, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mget_search_space_params)\n\u001b[0;32m--> 487\u001b[0m \u001b[38;5;28;43msuper\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43mTPOTClassifier\u001b[49m\u001b[43m,\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[38;5;21;43m__init__\u001b[39;49m\u001b[43m(\u001b[49m\n\u001b[1;32m 488\u001b[0m \u001b[43m \u001b[49m\u001b[43msearch_space\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43msearch_space\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 489\u001b[0m \u001b[43m \u001b[49m\u001b[43mscorers\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mscorers\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\n\u001b[1;32m 490\u001b[0m \u001b[43m \u001b[49m\u001b[43mscorers_weights\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mscorers_weights\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 491\u001b[0m \u001b[43m \u001b[49m\u001b[43mcv\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mcv\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 492\u001b[0m \u001b[43m \u001b[49m\u001b[43mother_objective_functions\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mother_objective_functions\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;66;43;03m#tpot2.objectives.estimator_objective_functions.number_of_nodes_objective],\u001b[39;49;00m\n\u001b[1;32m 
493\u001b[0m \u001b[43m \u001b[49m\u001b[43mother_objective_functions_weights\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mother_objective_functions_weights\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 494\u001b[0m \u001b[43m \u001b[49m\u001b[43mobjective_function_names\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mobjective_function_names\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 495\u001b[0m \u001b[43m \u001b[49m\u001b[43mbigger_is_better\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mbigger_is_better\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 496\u001b[0m \u001b[43m \u001b[49m\u001b[43mcategorical_features\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mcategorical_features\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 497\u001b[0m \u001b[43m \u001b[49m\u001b[43mmemory\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmemory\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 498\u001b[0m \u001b[43m \u001b[49m\u001b[43mpreprocessing\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mpreprocessing\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 499\u001b[0m \u001b[43m \u001b[49m\u001b[43mmax_time_mins\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmax_time_mins\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\n\u001b[1;32m 
500\u001b[0m \u001b[43m \u001b[49m\u001b[43mmax_eval_time_mins\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmax_eval_time_mins\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\n\u001b[1;32m 501\u001b[0m \u001b[43m \u001b[49m\u001b[43mn_jobs\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mn_jobs\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 502\u001b[0m \u001b[43m \u001b[49m\u001b[43mvalidation_strategy\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mvalidation_strategy\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 503\u001b[0m \u001b[43m \u001b[49m\u001b[43mvalidation_fraction\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mvalidation_fraction\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\n\u001b[1;32m 504\u001b[0m \u001b[43m \u001b[49m\u001b[43mearly_stop\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mearly_stop\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 505\u001b[0m \u001b[43m \u001b[49m\u001b[43mwarm_start\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mwarm_start\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 506\u001b[0m \u001b[43m \u001b[49m\u001b[43mperiodic_checkpoint_folder\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mperiodic_checkpoint_folder\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\n\u001b[1;32m 507\u001b[0m \u001b[43m 
\u001b[49m\u001b[43mverbose\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mverbose\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 508\u001b[0m \u001b[43m \u001b[49m\u001b[43mclassification\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mTrue\u001b[39;49;00m\u001b[43m,\u001b[49m\n\u001b[1;32m 509\u001b[0m \u001b[43m \u001b[49m\u001b[43mmemory_limit\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmemory_limit\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 510\u001b[0m \u001b[43m \u001b[49m\u001b[43mclient\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mclient\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 511\u001b[0m \u001b[43m \u001b[49m\u001b[43mrandom_state\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mrandom_state\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 512\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mtpotestimator_kwargs\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 513\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39minitialized \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mTrue\u001b[39;00m\n\u001b[1;32m 515\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28msuper\u001b[39m()\u001b[38;5;241m.\u001b[39mfit(X,y)\n", + "\u001b[0;31mTypeError\u001b[0m: TPOTEstimator.__init__() got an unexpected keyword argument 'scorer_weights'" + ] + } + ], + "source": [ + "#my_analysis.py\n", + "\n", + "from dask.distributed import Client, LocalCluster\n", + "import tpot2\n", + "import sklearn\n", + 
"import sklearn.datasets\n", + "import numpy as np\n", + "\n", + "import tpot2.objectives\n", + "\n", + "\n", + "scorer = sklearn.metrics.get_scorer('roc_auc_ovr')\n", + "\n", + "X, y = sklearn.datasets.load_digits(return_X_y=True)\n", + "X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, train_size=0.75, test_size=0.25)\n", + "\n", + "\n", + "est = tpot2.TPOTClassifier(\n", + " scorers=[scorer, tpot2.objectives.complexity_scorer],\n", + " scorer_weights=[1.0, -1.0],\n", + " objective_function_names=['roc_auc_ovr', 'complexity'],\n", + "\n", + " search_space=\"light\",\n", + " n_jobs=4, \n", + " max_time_mins=60, \n", + " max_eval_time_mins=5,\n", + " early_stop=2,\n", + " verbose=2,)\n", + "est.fit(X_train, y_train)\n", + "\n", + "print(scorer(est, X_test, y_test))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Further analysis of results\n", + "\n", + "The `evaluated_individuals` attribute of the tpot estimator object is a Pandas Dataframe containing information about a run. Each row corresponds to an individual pipeline explored by tpot. The dataframe contains the following columns:\n", + "\n", + "| Column | Description |\n", + "| :--- | :----: |\n", + "| | The first set of columns will correspond to each objective function. These can either be automatically named by TPOT, or passed in by the user. |\n", + "| Parents | This contains a tuple that contains the indexes of the 'parents' of the current pipeline. For example, (29, 42) means that the pipelines in indexes 29 and 42 were utilized to generate that pipeline. |\n", + "| | |\n", + "| | |\n", + "| | |\n", + "| | |\n", + "| | |" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
| | roc_auc_score | Parents | Variation_Function | Individual | Generation | Submitted Timestamp | Completed Timestamp | Eval Error | Pareto_Front | Instance |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 0 | 0.992125 | NaN | NaN | &lt;tpot2.search_spaces.pipelines.sequential.Sequ... | 0.0 | 1.726870e+09 | 1.726870e+09 | None | NaN | (Passthrough(), SelectFwe(alpha=0.005516602107... |
| 1 | NaN | NaN | NaN | &lt;tpot2.search_spaces.pipelines.sequential.Sequ... | 0.0 | 1.726870e+09 | 1.726870e+09 | TIMEOUT | NaN | (RobustScaler(quantile_range=(0.232938031941, ... |
| 2 | 0.991613 | NaN | NaN | &lt;tpot2.search_spaces.pipelines.sequential.Sequ... | 0.0 | 1.726870e+09 | 1.726870e+09 | None | NaN | (MinMaxScaler(), Passthrough(), FeatureUnion(t... |
| 3 | 0.974053 | NaN | NaN | &lt;tpot2.search_spaces.pipelines.sequential.Sequ... | 0.0 | 1.726870e+09 | 1.726870e+09 | None | NaN | (MaxAbsScaler(), VarianceThreshold(threshold=0... |
| 4 | NaN | NaN | NaN | &lt;tpot2.search_spaces.pipelines.sequential.Sequ... | 0.0 | 1.726870e+09 | 1.726870e+09 | TIMEOUT | NaN | (RobustScaler(quantile_range=(0.0238159499352,... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 195 | 0.946552 | (108, 124) | ind_crossover , ind_mutate | &lt;tpot2.search_spaces.pipelines.sequential.Sequ... | 3.0 | 1.726871e+09 | 1.726871e+09 | None | NaN | (StandardScaler(), SelectPercentile(percentile... |
| 196 | NaN | (94, 94) | ind_mutate | &lt;tpot2.search_spaces.pipelines.sequential.Sequ... | 3.0 | 1.726871e+09 | 1.726871e+09 | TIMEOUT | NaN | (MinMaxScaler(), SelectPercentile(percentile=9... |
| 197 | 0.999038 | (34, 12) | ind_crossover | &lt;tpot2.search_spaces.pipelines.sequential.Sequ... | 3.0 | 1.726871e+09 | 1.726871e+09 | None | NaN | (RobustScaler(quantile_range=(0.2919651866521,... |
| 198 | 0.998404 | (118, 118) | ind_mutate | &lt;tpot2.search_spaces.pipelines.sequential.Sequ... | 3.0 | 1.726871e+09 | 1.726871e+09 | None | NaN | (RobustScaler(quantile_range=(0.2919651866521,... |
| 199 | NaN | (103, 124) | ind_crossover | &lt;tpot2.search_spaces.pipelines.sequential.Sequ... | 3.0 | 1.726871e+09 | 1.726871e+09 | TIMEOUT | NaN | (StandardScaler(), SelectPercentile(percentile... |

200 rows × 10 columns
" + ], + "text/plain": [ + " roc_auc_score Parents Variation_Function \\\n", + "0 0.992125 NaN NaN \n", + "1 NaN NaN NaN \n", + "2 0.991613 NaN NaN \n", + "3 0.974053 NaN NaN \n", + "4 NaN NaN NaN \n", + ".. ... ... ... \n", + "195 0.946552 (108, 124) ind_crossover , ind_mutate \n", + "196 NaN (94, 94) ind_mutate \n", + "197 0.999038 (34, 12) ind_crossover \n", + "198 0.998404 (118, 118) ind_mutate \n", + "199 NaN (103, 124) ind_crossover \n", + "\n", + " Individual Generation \\\n", + "0 Date: Fri, 20 Sep 2024 19:44:59 -0700 Subject: [PATCH 03/44] tutorial 1 --- Tutorial/1_Using_TPOT.ipynb | 1589 ++++++++++++++++++++++++++++++++--- 1 file changed, 1478 insertions(+), 111 deletions(-) diff --git a/Tutorial/1_Using_TPOT.ipynb b/Tutorial/1_Using_TPOT.ipynb index 9e707306..d3cf8307 100644 --- a/Tutorial/1_Using_TPOT.ipynb +++ b/Tutorial/1_Using_TPOT.ipynb @@ -131,13 +131,13 @@ "\n", "There are two ways of passing objectives into TPOT2. \n", "\n", - "1. `scorers`: Scorers are functions that have the signature (estimator, X, y). These can be produced with the [sklearn.metrics.make_scorer](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.make_scorer.html) function. This function is used to evaluate the test folds during cross validation. These are passed into TPOT2 via the scorers parameter. This can take in the scorer itself or the string corresponding to a scoring function ([as listed here](https://scikit-learn.org/stable/modules/model_evaluation.html)). TPOT2 also supports passing in a list of several scorers for multiobjective optimization. \n", + "1. `scorers`: Scorers are functions that have the signature (estimator, X_test, y_test) and take in estimators that are expected to be fitted to training data. These can be produced with the [sklearn.metrics.make_scorer](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.make_scorer.html) function. 
This function is used to evaluate the test folds during cross validation (defined in the `cv` parameter). These are passed into TPOT2 via the scorers parameter. This can take in the scorer itself or the string corresponding to a scoring function ([as listed here](https://scikit-learn.org/stable/modules/model_evaluation.html)). TPOT2 also supports passing in a list of several scorers for multi-objective optimization. For each fold of CV, TPOT only fits the estimator once, then evaluates all provided scorers in a loop.\n", "\n", - "2. `other_objective_functions` : Other objective functions in TPOT2 have the signature (estimator) and returns a float or list of floats. These get passed an unfitted estimator (in the case of TPOT2, a `tpot2.GraphPipeline`). \n", + "2. `other_objective_functions` : Other objective functions in TPOT2 have the signature (estimator) and return a float or list of floats. These get passed a single unfitted estimator once, outside of cross validation. The user may choose to fit the pipeline within this objective function as well.\n", "\n", "\n", "\n", - "Each scorer and objective function must be accompanied by a list of weights corresponding to the list of objectives, these are `scorer_weights` and `other_objective_function_weights`, respectively. By default, TPOT2 maximizes objective functions (this can be changed by `bigger_is_better=False`). Positive weights means that TPOT2 will seek to maximize that objective, and negative weights correspond to minimization. For most selectors (and the default), only the sign matters. The scale of the weight may matter if using a custom selection function for the optimization algorithm. A zero weight means that the score will not have an impact on the selection algorithm.\n", + "Each scorer and objective function must be accompanied by a list of weights corresponding to the list of objectives; these are `scorers_weights` and `other_objective_functions_weights`, respectively.
By default, TPOT2 maximizes objective functions (this can be changed by `bigger_is_better=False`). Positive weights mean that TPOT2 will seek to maximize that objective, and negative weights correspond to minimization. For most selectors (and the default), only the sign matters. The scale of the weight may matter if using a custom selection function for the optimization algorithm. A zero weight means that the score will not have an impact on the selection algorithm.\n", "\n", "Here is an example of using two scorers\n", "\n", @@ -153,9 +153,9 @@ " other_objective_functions_weights=[-1],\n", "\n", "\n", - "TPOT will automatically name the scores based on the function name for the columns in the final results dataframe. If you would like to specify custom function names, you can set the `objective_function_names` to be a list of names (str) for each score. The order of the names are scorers first, and other objective functions second. (e.g. `objective_function_names=['scorer1','scorer2', 'objective1','objective2'])`.\n", + "TPOT will always automatically name the scorers based on the function name for the columns in the final results dataframe. TPOT will use the function name as the column name for `other_objective_functions`. However, if you would like to specify custom column names, you can set the `objective_function_names` to be a list of names (str) for each value returned by the function in `other_objective_functions`. This can be useful if your additional functions return more than one value per function.\n", "\n", - "It is possible to have either the scorer or other_objective_function to return multiple values. In that case, just make sure that the `scorer_weights` and `other_objective_function_weights` are the same length as the number of returned scores.\n", + "It is possible for either the scorer or other_objective_function to return multiple values.
In that case, just make sure that the `scorers_weights` and `other_objective_functions_weights` are the same length as the number of returned scores.\n", "\n", "\n", "TPOT comes with a few additional built in objective functions you can use. The first table are objectives applied to fitted pipelines, and thus are passed into the `scorers` parameter. The second table are objective functions for the `other_objective_functions` param.\n", @@ -174,6 +174,19 @@ "| tpot2.objectives.number_of_nodes_objective | Calculates the number of nodes in a pipeline (whether it is an scikit-learn Pipeline, GraphPipeline, Feature Union, or the previous nested within each other) |" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Measuring Model Complexity\n", + "\n", + "When running TPOT, it can sometimes be beneficial to include a secondary objective that measures model complexity. More complex models can yield higher performance, but this comes at the cost of interpretability. Simpler models may be more interpretable, but often have lower predictive performance. Sometimes, however, vast increases in complexity only marginally improve predictive performance. There may be other simpler and more interpretable pipelines with marginal performance decreases that could be acceptable for the increased interpretability. However, these pipelines are often missed by optimizing purely for performance. By including both performance and complexity as objective functions, TPOT will attempt to optimize the best pipeline for all complexity levels simultaneously. After optimization, the user will be able to see the complexity vs performance tradeoff and make the decision of which pipeline best suits their needs. \n", + "\n", + "Two methods of measuring complexity to consider are `tpot2.objectives.number_of_nodes_objective` and `tpot2.objectives.complexity_scorer`. The number of nodes objective simply calculates the number of steps within a pipeline.
This is a simple metric, however it does not differentiate between the complexity of different model types. For example, a simple LogisticRegression counts the same as the much more complex XGBoost. The complexity scorer tries to estimate the number of learned parameters included in the classifiers and regressors of the pipeline. It is challenging and potentially subjective how to exactly quantify and compare complexity between different classes of models. However, this function provides a reasonable heuristic for the evolutionary algorithm that at least separates out qualitatively more or less complex algorithms from one another. While it may be hard to exactly compare the relative complexities of LogisticRegression and XGBoost, for example, both will always be on opposite ends of the complexity values returned by this function. This allows for pareto fronts with LogisticRegression on one side, and XGBoost on the other.\n", + "\n", + "An example of this analysis is demonstrated in a following section." + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -252,24 +265,39 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Example analysis " + "# Example analysis and the Estimator class \n", + "\n", + "Here we use a toy example dataset included in scikit-learn. 
We will use the `light` configuration and the `complexity_scorer` to estimate complexity.\n", + "\n" ] }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 2, "metadata": {}, "outputs": [ { - "ename": "TypeError", - "evalue": "TPOTEstimator.__init__() got an unexpected keyword argument 'scorer_weights'", - "output_type": "error", - "traceback": [ - "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", - "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", - "Cell \u001b[0;32mIn[5], line 29\u001b[0m\n\u001b[1;32m 15\u001b[0m X_train, X_test, y_train, y_test \u001b[38;5;241m=\u001b[39m sklearn\u001b[38;5;241m.\u001b[39mmodel_selection\u001b[38;5;241m.\u001b[39mtrain_test_split(X, y, train_size\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m0.75\u001b[39m, test_size\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m0.25\u001b[39m)\n\u001b[1;32m 18\u001b[0m est \u001b[38;5;241m=\u001b[39m tpot2\u001b[38;5;241m.\u001b[39mTPOTClassifier(\n\u001b[1;32m 19\u001b[0m scorers\u001b[38;5;241m=\u001b[39m[scorer, tpot2\u001b[38;5;241m.\u001b[39mobjectives\u001b[38;5;241m.\u001b[39mcomplexity_scorer],\n\u001b[1;32m 20\u001b[0m scorer_weights\u001b[38;5;241m=\u001b[39m[\u001b[38;5;241m1.0\u001b[39m, \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m1.0\u001b[39m],\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 27\u001b[0m early_stop\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m2\u001b[39m,\n\u001b[1;32m 28\u001b[0m verbose\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m2\u001b[39m,)\n\u001b[0;32m---> 29\u001b[0m \u001b[43mest\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfit\u001b[49m\u001b[43m(\u001b[49m\u001b[43mX_train\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43my_train\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 32\u001b[0m \u001b[38;5;28mprint\u001b[39m(scorer(est, X_test, y_test))\n", - "File 
\u001b[0;32m~/Projects/common/Projects/TPOT_Dev/tpot2/tpot2/tpot_estimator/templates/tpottemplates.py:487\u001b[0m, in \u001b[0;36mTPOTClassifier.fit\u001b[0;34m(self, X, y)\u001b[0m\n\u001b[1;32m 479\u001b[0m get_search_space_params \u001b[38;5;241m=\u001b[39m {\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mn_classes\u001b[39m\u001b[38;5;124m\"\u001b[39m: \u001b[38;5;28mlen\u001b[39m(np\u001b[38;5;241m.\u001b[39munique(y)), \n\u001b[1;32m 480\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mn_samples\u001b[39m\u001b[38;5;124m\"\u001b[39m:\u001b[38;5;28mlen\u001b[39m(y), \n\u001b[1;32m 481\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mn_features\u001b[39m\u001b[38;5;124m\"\u001b[39m:X\u001b[38;5;241m.\u001b[39mshape[\u001b[38;5;241m1\u001b[39m], \n\u001b[1;32m 482\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mrandom_state\u001b[39m\u001b[38;5;124m\"\u001b[39m:\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mrandom_state}\n\u001b[1;32m 484\u001b[0m search_space \u001b[38;5;241m=\u001b[39m get_template_search_spaces(\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39msearch_space, classification\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mTrue\u001b[39;00m, inner_predictors\u001b[38;5;241m=\u001b[39m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mallow_inner_classifiers, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mget_search_space_params)\n\u001b[0;32m--> 487\u001b[0m \u001b[38;5;28;43msuper\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43mTPOTClassifier\u001b[49m\u001b[43m,\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[38;5;21;43m__init__\u001b[39;49m\u001b[43m(\u001b[49m\n\u001b[1;32m 488\u001b[0m \u001b[43m \u001b[49m\u001b[43msearch_space\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43msearch_space\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 489\u001b[0m \u001b[43m 
\u001b[49m\u001b[43mscorers\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mscorers\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\n\u001b[1;32m 490\u001b[0m \u001b[43m \u001b[49m\u001b[43mscorers_weights\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mscorers_weights\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 491\u001b[0m \u001b[43m \u001b[49m\u001b[43mcv\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mcv\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 492\u001b[0m \u001b[43m \u001b[49m\u001b[43mother_objective_functions\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mother_objective_functions\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;66;43;03m#tpot2.objectives.estimator_objective_functions.number_of_nodes_objective],\u001b[39;49;00m\n\u001b[1;32m 493\u001b[0m \u001b[43m \u001b[49m\u001b[43mother_objective_functions_weights\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mother_objective_functions_weights\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 494\u001b[0m \u001b[43m \u001b[49m\u001b[43mobjective_function_names\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mobjective_function_names\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 495\u001b[0m \u001b[43m \u001b[49m\u001b[43mbigger_is_better\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m 
\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mbigger_is_better\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 496\u001b[0m \u001b[43m \u001b[49m\u001b[43mcategorical_features\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mcategorical_features\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 497\u001b[0m \u001b[43m \u001b[49m\u001b[43mmemory\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmemory\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 498\u001b[0m \u001b[43m \u001b[49m\u001b[43mpreprocessing\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mpreprocessing\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 499\u001b[0m \u001b[43m \u001b[49m\u001b[43mmax_time_mins\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmax_time_mins\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\n\u001b[1;32m 500\u001b[0m \u001b[43m \u001b[49m\u001b[43mmax_eval_time_mins\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmax_eval_time_mins\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\n\u001b[1;32m 501\u001b[0m \u001b[43m \u001b[49m\u001b[43mn_jobs\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mn_jobs\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 502\u001b[0m \u001b[43m \u001b[49m\u001b[43mvalidation_strategy\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m 
\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mvalidation_strategy\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 503\u001b[0m \u001b[43m \u001b[49m\u001b[43mvalidation_fraction\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mvalidation_fraction\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\n\u001b[1;32m 504\u001b[0m \u001b[43m \u001b[49m\u001b[43mearly_stop\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mearly_stop\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 505\u001b[0m \u001b[43m \u001b[49m\u001b[43mwarm_start\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mwarm_start\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 506\u001b[0m \u001b[43m \u001b[49m\u001b[43mperiodic_checkpoint_folder\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mperiodic_checkpoint_folder\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\n\u001b[1;32m 507\u001b[0m \u001b[43m \u001b[49m\u001b[43mverbose\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mverbose\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 508\u001b[0m \u001b[43m \u001b[49m\u001b[43mclassification\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mTrue\u001b[39;49;00m\u001b[43m,\u001b[49m\n\u001b[1;32m 509\u001b[0m \u001b[43m \u001b[49m\u001b[43mmemory_limit\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m 
\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmemory_limit\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 510\u001b[0m \u001b[43m \u001b[49m\u001b[43mclient\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mclient\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 511\u001b[0m \u001b[43m \u001b[49m\u001b[43mrandom_state\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mrandom_state\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 512\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mtpotestimator_kwargs\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 513\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39minitialized \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mTrue\u001b[39;00m\n\u001b[1;32m 515\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28msuper\u001b[39m()\u001b[38;5;241m.\u001b[39mfit(X,y)\n", - "\u001b[0;31mTypeError\u001b[0m: TPOTEstimator.__init__() got an unexpected keyword argument 'scorer_weights'" + "name": "stderr", + "output_type": "stream", + "text": [ + "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/distributed/node.py:182: UserWarning: Port 8787 is already in use.\n", + "Perhaps you already have a cluster running?\n", + "Hosting the HTTP server on port 44127 instead\n", + " warnings.warn(\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Generation: : 3it [00:45, 15.13s/it]\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0.9988455597304394\n" ] } ], @@ -287,14 +315,13 @@ "\n", "scorer = sklearn.metrics.get_scorer('roc_auc_ovr')\n", "\n", - "X, y = sklearn.datasets.load_digits(return_X_y=True)\n", + "X, y = 
sklearn.datasets.load_breast_cancer(return_X_y=True)\n", "X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, train_size=0.75, test_size=0.25)\n", "\n", "\n", "est = tpot2.TPOTClassifier(\n", " scorers=[scorer, tpot2.objectives.complexity_scorer],\n", - " scorer_weights=[1.0, -1.0],\n", - " objective_function_names=['roc_auc_ovr', 'complexity'],\n", + " scorers_weights=[1.0, -1.0],\n", "\n", " search_space=\"light\",\n", " n_jobs=4, \n", @@ -311,7 +338,538 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Further analysis of results\n", + "You can access the best pipeline selected by TPOT with the `fitted_pipeline_` attribute. This is the pipeline with the highest cross validation score (on the first scorer, or first objective function if no scorer is provided.)" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
Pipeline(steps=[('maxabsscaler', MaxAbsScaler()),\n",
+       "                ('variancethreshold',\n",
+       "                 VarianceThreshold(threshold=0.0056828922429)),\n",
+       "                ('featureunion',\n",
+       "                 FeatureUnion(transformer_list=[('skiptransformer',\n",
+       "                                                 SkipTransformer()),\n",
+       "                                                ('passthrough',\n",
+       "                                                 Passthrough())])),\n",
+       "                ('kneighborsclassifier',\n",
+       "                 KNeighborsClassifier(n_jobs=1, n_neighbors=28, p=1,\n",
+       "                                      weights='distance'))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + ], + "text/plain": [ + "Pipeline(steps=[('maxabsscaler', MaxAbsScaler()),\n", + " ('variancethreshold',\n", + " VarianceThreshold(threshold=0.0056828922429)),\n", + " ('featureunion',\n", + " FeatureUnion(transformer_list=[('skiptransformer',\n", + " SkipTransformer()),\n", + " ('passthrough',\n", + " Passthrough())])),\n", + " ('kneighborsclassifier',\n", + " KNeighborsClassifier(n_jobs=1, n_neighbors=28, p=1,\n", + " weights='distance'))])" + ] + }, + "execution_count": 22, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "best_pipeline = est.fitted_pipeline_\n", + "best_pipeline" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([0, 1, 3, 2, 9, 1, 2, 5, 4, 2, 3, 2, 0, 0, 2, 0, 1, 8, 2, 8, 9, 5,\n", + " 7, 3, 9, 1, 8, 2, 9, 3, 4, 5, 6, 1, 0, 9, 8, 3, 1, 3, 3, 1, 1, 4,\n", + " 3, 0, 3, 2, 2, 5, 2, 4, 0, 8, 6, 3, 1, 3, 9, 8, 3, 4, 7, 8, 4, 3,\n", + " 0, 3, 1, 8, 7, 0, 2, 8, 2, 0, 4, 5, 8, 1, 1, 8, 4, 0, 1, 5, 9, 9,\n", + " 7, 8, 7, 3, 2, 3, 5, 2, 9, 2, 2, 5, 9, 8, 3, 1, 9, 2, 6, 5, 5, 8,\n", + " 2, 3, 7, 4, 0, 2, 7, 9, 6, 3, 8, 8, 8, 9, 4, 4, 7, 7, 8, 6, 1, 0,\n", + " 4, 3, 8, 4, 4, 2, 1, 0, 6, 7, 9, 0, 4, 7, 6, 0, 4, 5, 0, 0, 3, 4,\n", + " 5, 5, 9, 5, 7, 2, 9, 7, 4, 2, 3, 2, 5, 7, 4, 8, 4, 6, 6, 0, 1, 2,\n", + " 7, 0, 1, 9, 7, 5, 5, 8, 3, 7, 9, 5, 9, 5, 7, 7, 2, 2, 1, 6, 9, 3,\n", + " 8, 9, 8, 8, 8, 3, 9, 1, 1, 5, 3, 6, 9, 8, 6, 4, 7, 7, 2, 0, 7, 8,\n", + " 8, 0, 7, 9, 6, 9, 4, 2, 1, 8, 5, 6, 1, 2, 5, 8, 3, 9, 0, 6, 2, 7,\n", + " 5, 5, 3, 6, 7, 8, 6, 6, 5, 4, 1, 1, 6, 5, 9, 4, 9, 1, 1, 8, 8, 1,\n", + " 0, 5, 6, 9, 3, 4, 1, 5, 9, 7, 0, 2, 1, 9, 3, 6, 4, 3, 6, 0, 2, 6,\n", + " 1, 3, 1, 4, 7, 6, 2, 4, 2, 1, 9, 7, 6, 7, 4, 1, 8, 8, 7, 6, 1, 6,\n", + " 7, 5, 7, 7, 3, 8, 8, 0, 6, 4, 5, 2, 0, 4, 4, 4, 4, 2, 4, 2, 1, 0,\n", + " 5, 4, 1, 5, 9, 0, 6, 2, 4, 5, 7, 3, 1, 1, 5, 1, 9, 3, 2, 4, 7, 9,\n", + " 4, 4, 0, 1, 2, 2, 5, 9, 7, 1, 6, 7, 1, 1, 1, 7, 6, 1, 
1, 9, 8, 3,\n", + " 0, 5, 4, 7, 7, 7, 5, 4, 8, 1, 3, 1, 5, 8, 1, 0, 0, 4, 8, 7, 2, 7,\n", + " 8, 0, 1, 5, 3, 5, 0, 6, 2, 0, 1, 5, 7, 5, 2, 5, 7, 7, 9, 0, 2, 9,\n", + " 9, 5, 0, 5, 6, 3, 5, 5, 0, 6, 5, 5, 3, 7, 9, 5, 9, 4, 6, 2, 3, 9,\n", + " 3, 7, 1, 3, 8, 0, 1, 2, 1, 5])" + ] + }, + "execution_count": 23, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "best_pipeline.predict(X_test)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### saving the pipeline\n", + "\n", + "We recommend using dill or pickle to save the instance of the fitted_pipeline_. Note that we do not recommend pickling the TPOT object itself." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import dill as pickle\n", + "with open(\"best_pipeline.pkl\", \"wb\") as f:\n", + " pickle.dump(best_pipeline, f)\n", + "\n", + "#load the pipeline\n", + "import dill as pickle\n", + "with open(\"best_pipeline.pkl\", \"rb\") as f:\n", + " my_loaded_best_pipeline = pickle.load(f)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## The evaluated_individuals Dataframe - Further analysis of results\n", "\n", "The `evaluated_individuals` attribute of the tpot estimator object is a Pandas Dataframe containing information about a run. Each row corresponds to an individual pipeline explored by tpot. The dataframe contains the following columns:\n", "\n", @@ -319,16 +877,39 @@ "| :--- | :----: |\n", "| | The first set of columns will correspond to each objective function. These can either be automatically named by TPOT, or passed in by the user. |\n", "| Parents | This contains a tuple that contains the indexes of the 'parents' of the current pipeline. For example, (29, 42) means that the pipelines in indexes 29 and 42 were utilized to generate that pipeline. 
|\n", - "| | |\n", - "| | |\n", - "| | |\n", - "| | |\n", - "| | |" + "| Variation_Function | The function applied to the parents to generate the new pipeline |\n", + "| Individual | The individual class that represents a specific pipeline and hyperparameter configuration. This class also contains functions for mutation and crossover. To get the sklearn estimator/pipeline object from the individual you can call the `export_pipeline()` function. (as in, `pipe = ind.export_pipeline()`) |\n", + "| Generation | The generation where the individual was created. (Note that the higher performing pipelines from previous generations may still be present in the current \"population\" of a given generation if selected.) |\n", + "| Submitted Timestamp | Timestamp, in seconds, at which the pipeline was sent to be evaluated. This is the output of time.time(), which is \"Return the time in seconds since the epoch as a floating-point number. \" |\n", + "| Completed Timestamp | Timestamp at which the pipeline evaluation completed in the same units as Submitted Timestamp |\n", + "| Pareto_Front\t | If you have multiple objectives, this column is True if the pipeline's performance falls on the pareto front line. This is the set of pipelines with scores that are strictly better than pipelines not on the line, but not strictly better than one another. |\n", + "| Instance | This contains the unfitted pipeline evaluated for this row. 
(This is the pipeline returned by calling the export_pipeline() function of the individual class) |\n" ] }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 24, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "['roc_auc_score', 'complexity_scorer']" + ] + }, + "execution_count": 24, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "#get the score/objective column names generated by TPOT\n", + "est.objective_names" + ] + }, + { + "cell_type": "code", + "execution_count": 5, "metadata": {}, "outputs": [ { @@ -353,6 +934,7 @@ " \n", " \n", " roc_auc_score\n", + " complexity_scorer\n", " Parents\n", " Variation_Function\n", " Individual\n", @@ -367,68 +949,73 @@ " \n", " \n", " 0\n", - " 0.992125\n", + " NaN\n", + " NaN\n", " NaN\n", " NaN\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 0.0\n", - " 1.726870e+09\n", - " 1.726870e+09\n", - " None\n", + " 1.726883e+09\n", + " 1.726883e+09\n", + " INVALID\n", " NaN\n", - " (Passthrough(), SelectFwe(alpha=0.005516602107...\n", + " (Normalizer(norm='max'), SelectPercentile(perc...\n", " \n", " \n", " 1\n", - " NaN\n", + " 0.945922\n", + " 546.0\n", " NaN\n", " NaN\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 0.0\n", - " 1.726870e+09\n", - " 1.726870e+09\n", - " TIMEOUT\n", + " 1.726883e+09\n", + " 1.726883e+09\n", + " None\n", " NaN\n", - " (RobustScaler(quantile_range=(0.232938031941, ...\n", + " (RobustScaler(quantile_range=(0.2886683384991,...\n", " \n", " \n", " 2\n", - " 0.991613\n", + " 0.991940\n", + " 23.0\n", " NaN\n", " NaN\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 0.0\n", - " 1.726870e+09\n", - " 1.726870e+09\n", + " 1.726883e+09\n", + " 1.726883e+09\n", " None\n", " NaN\n", - " (MinMaxScaler(), Passthrough(), FeatureUnion(t...\n", + " (MaxAbsScaler(), SelectFwe(alpha=0.00027494990...\n", " \n", " \n", " 3\n", - " 0.974053\n", + " 0.975386\n", + " 34.0\n", " NaN\n", " NaN\n", " 
<tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 0.0\n", - " 1.726870e+09\n", - " 1.726870e+09\n", + " 1.726883e+09\n", + " 1.726883e+09\n", " None\n", " NaN\n", - " (MaxAbsScaler(), VarianceThreshold(threshold=0...\n", + " (Passthrough(), Passthrough(), FeatureUnion(tr...\n", " \n", " \n", " 4\n", - " NaN\n", + " 0.990177\n", + " 24.0\n", " NaN\n", " NaN\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 0.0\n", - " 1.726870e+09\n", - " 1.726870e+09\n", - " TIMEOUT\n", + " 1.726883e+09\n", + " 1.726883e+09\n", + " None\n", " NaN\n", - " (RobustScaler(quantile_range=(0.0238159499352,...\n", + " (Normalizer(norm='l1'), Passthrough(), Feature...\n", " \n", " \n", " ...\n", @@ -442,90 +1029,109 @@ " ...\n", " ...\n", " ...\n", + " ...\n", " \n", " \n", " 195\n", - " 0.946552\n", - " (108, 124)\n", - " ind_crossover , ind_mutate\n", + " 0.964649\n", + " 4.0\n", + " (120, 120)\n", + " ind_mutate\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 3.0\n", - " 1.726871e+09\n", - " 1.726871e+09\n", + " 1.726883e+09\n", + " 1.726883e+09\n", " None\n", " NaN\n", - " (StandardScaler(), SelectPercentile(percentile...\n", + " (MaxAbsScaler(), SelectFwe(alpha=0.00062804731...\n", " \n", " \n", " 196\n", - " NaN\n", - " (94, 94)\n", + " 0.994112\n", + " 4.0\n", + " (124, 124)\n", " ind_mutate\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 3.0\n", - " 1.726871e+09\n", - " 1.726871e+09\n", - " TIMEOUT\n", + " 1.726883e+09\n", + " 1.726883e+09\n", + " None\n", " NaN\n", - " (MinMaxScaler(), SelectPercentile(percentile=9...\n", + " (Passthrough(), SelectFwe(alpha=0.011355913641...\n", " \n", " \n", " 197\n", - " 0.999038\n", - " (34, 12)\n", - " ind_crossover\n", + " 0.982564\n", + " 5.0\n", + " (34, 114)\n", + " ind_mutate , ind_mutate , ind_crossover\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 3.0\n", - " 1.726871e+09\n", - " 1.726871e+09\n", + " 1.726883e+09\n", + " 1.726883e+09\n", " None\n", " NaN\n", - " 
(RobustScaler(quantile_range=(0.2919651866521,...\n", + " (MinMaxScaler(), SelectFwe(alpha=0.01135591364...\n", " \n", " \n", " 198\n", - " 0.998404\n", - " (118, 118)\n", - " ind_mutate\n", + " 0.683244\n", + " 19.0\n", + " (114, 37)\n", + " ind_crossover\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 3.0\n", - " 1.726871e+09\n", - " 1.726871e+09\n", + " 1.726883e+09\n", + " 1.726883e+09\n", " None\n", " NaN\n", - " (RobustScaler(quantile_range=(0.2919651866521,...\n", + " (RobustScaler(quantile_range=(0.0249691582537,...\n", " \n", " \n", " 199\n", " NaN\n", - " (103, 124)\n", + " NaN\n", + " (133, 62)\n", " ind_crossover\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 3.0\n", - " 1.726871e+09\n", - " 1.726871e+09\n", + " 1.726883e+09\n", + " 1.726883e+09\n", " TIMEOUT\n", " NaN\n", " (StandardScaler(), SelectPercentile(percentile...\n", " \n", " \n", "\n", - "

200 rows × 10 columns

\n", + "

200 rows × 11 columns

\n", "" ], "text/plain": [ - " roc_auc_score Parents Variation_Function \\\n", - "0 0.992125 NaN NaN \n", - "1 NaN NaN NaN \n", - "2 0.991613 NaN NaN \n", - "3 0.974053 NaN NaN \n", - "4 NaN NaN NaN \n", - ".. ... ... ... \n", - "195 0.946552 (108, 124) ind_crossover , ind_mutate \n", - "196 NaN (94, 94) ind_mutate \n", - "197 0.999038 (34, 12) ind_crossover \n", - "198 0.998404 (118, 118) ind_mutate \n", - "199 NaN (103, 124) ind_crossover \n", + " roc_auc_score complexity_scorer Parents \\\n", + "0 NaN NaN NaN \n", + "1 0.945922 546.0 NaN \n", + "2 0.991940 23.0 NaN \n", + "3 0.975386 34.0 NaN \n", + "4 0.990177 24.0 NaN \n", + ".. ... ... ... \n", + "195 0.964649 4.0 (120, 120) \n", + "196 0.994112 4.0 (124, 124) \n", + "197 0.982564 5.0 (34, 114) \n", + "198 0.683244 19.0 (114, 37) \n", + "199 NaN NaN (133, 62) \n", + "\n", + " Variation_Function \\\n", + "0 NaN \n", + "1 NaN \n", + "2 NaN \n", + "3 NaN \n", + "4 NaN \n", + ".. ... \n", + "195 ind_mutate \n", + "196 ind_mutate \n", + "197 ind_mutate , ind_mutate , ind_crossover \n", + "198 ind_crossover \n", + "199 ind_crossover \n", "\n", " Individual Generation \\\n", "0 " + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "import seaborn as sns\n", + "import matplotlib.pyplot as plt\n", + "#replace nans in pareto front with 0\n", + "fig, ax = plt.subplots(figsize=(5,5))\n", + "sns.scatterplot(df[df['Pareto_Front']!=1], x='roc_auc_score', y='complexity_scorer', label='other', ax=ax)\n", + "sns.scatterplot(df[df['Pareto_Front']==1], x='roc_auc_score', y='complexity_scorer', label='Pareto Front', ax=ax)\n", + "ax.title.set_text('Performance of all pipelines')\n", + "#log scale y\n", + "ax.set_yscale('log')\n", + "plt.show()\n", + "\n", + "#replace nans in pareto front with 0\n", + "fig, ax = plt.subplots(figsize=(10,5))\n", + "sns.scatterplot(df[df['Pareto_Front']==1], x='roc_auc_score', y='complexity_scorer', label='Pareto Front', ax=ax)\n", + "ax.title.set_text('Performance of only the Pareto Front')\n", + "#log scale y\n", + "# ax.set_yscale('log')\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
roc_auc_scorecomplexity_scorerParentsVariation_FunctionIndividualGenerationSubmitted TimestampCompleted TimestampEval ErrorPareto_FrontInstance
1830.99898931.0(5, 5)ind_mutate<tpot2.search_spaces.pipelines.sequential.Sequ...3.01.726883e+091.726883e+09None1.0(MaxAbsScaler(), VarianceThreshold(threshold=0...
1460.99885325.0(43, 43)ind_mutate<tpot2.search_spaces.pipelines.sequential.Sequ...2.01.726883e+091.726883e+09None1.0(StandardScaler(), SelectPercentile(percentile...
620.99867114.0(43, 43)ind_mutate<tpot2.search_spaces.pipelines.sequential.Sequ...1.01.726883e+091.726883e+09None1.0(StandardScaler(), SelectPercentile(percentile...
1750.99853813.0(86, 62)ind_crossover<tpot2.search_spaces.pipelines.sequential.Sequ...3.01.726883e+091.726883e+09None1.0(MinMaxScaler(), VarianceThreshold(threshold=0...
1870.99780010.0(144, 144)ind_mutate<tpot2.search_spaces.pipelines.sequential.Sequ...3.01.726883e+091.726883e+09None1.0(RobustScaler(quantile_range=(0.1489485642159,...
770.9977959.0(18, 18)ind_mutate<tpot2.search_spaces.pipelines.sequential.Sequ...1.01.726883e+091.726883e+09None1.0(Passthrough(), VarianceThreshold(threshold=0....
1480.9970147.0(43, 43)ind_mutate<tpot2.search_spaces.pipelines.sequential.Sequ...2.01.726883e+091.726883e+09None1.0(Passthrough(), SelectPercentile(percentile=72...
1160.9949444.0(85, 39)ind_crossover<tpot2.search_spaces.pipelines.sequential.Sequ...2.01.726883e+091.726883e+09None1.0(RobustScaler(quantile_range=(0.2187724978734,...
\n", + "
" + ], + "text/plain": [ + " roc_auc_score complexity_scorer Parents Variation_Function \\\n", + "183 0.998989 31.0 (5, 5) ind_mutate \n", + "146 0.998853 25.0 (43, 43) ind_mutate \n", + "62 0.998671 14.0 (43, 43) ind_mutate \n", + "175 0.998538 13.0 (86, 62) ind_crossover \n", + "187 0.997800 10.0 (144, 144) ind_mutate \n", + "77 0.997795 9.0 (18, 18) ind_mutate \n", + "148 0.997014 7.0 (43, 43) ind_mutate \n", + "116 0.994944 4.0 (85, 39) ind_crossover \n", + "\n", + " Individual Generation \\\n", + "183 #sk-container-id-3 {\n", + " /* Definition of color scheme common for light and dark mode */\n", + " --sklearn-color-text: black;\n", + " --sklearn-color-line: gray;\n", + " /* Definition of color scheme for unfitted estimators */\n", + " --sklearn-color-unfitted-level-0: #fff5e6;\n", + " --sklearn-color-unfitted-level-1: #f6e4d2;\n", + " --sklearn-color-unfitted-level-2: #ffe0b3;\n", + " --sklearn-color-unfitted-level-3: chocolate;\n", + " /* Definition of color scheme for fitted estimators */\n", + " --sklearn-color-fitted-level-0: #f0f8ff;\n", + " --sklearn-color-fitted-level-1: #d4ebff;\n", + " --sklearn-color-fitted-level-2: #b3dbfd;\n", + " --sklearn-color-fitted-level-3: cornflowerblue;\n", + "\n", + " /* Specific color for light theme */\n", + " --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n", + " --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, white)));\n", + " --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n", + " --sklearn-color-icon: #696969;\n", + "\n", + " @media (prefers-color-scheme: dark) {\n", + " /* Redefinition of color scheme for dark theme */\n", + " --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n", + " --sklearn-color-background: 
var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, #111)));\n", + " --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n", + " --sklearn-color-icon: #878787;\n", + " }\n", + "}\n", + "\n", + "#sk-container-id-3 {\n", + " color: var(--sklearn-color-text);\n", + "}\n", + "\n", + "#sk-container-id-3 pre {\n", + " padding: 0;\n", + "}\n", + "\n", + "#sk-container-id-3 input.sk-hidden--visually {\n", + " border: 0;\n", + " clip: rect(1px 1px 1px 1px);\n", + " clip: rect(1px, 1px, 1px, 1px);\n", + " height: 1px;\n", + " margin: -1px;\n", + " overflow: hidden;\n", + " padding: 0;\n", + " position: absolute;\n", + " width: 1px;\n", + "}\n", + "\n", + "#sk-container-id-3 div.sk-dashed-wrapped {\n", + " border: 1px dashed var(--sklearn-color-line);\n", + " margin: 0 0.4em 0.5em 0.4em;\n", + " box-sizing: border-box;\n", + " padding-bottom: 0.4em;\n", + " background-color: var(--sklearn-color-background);\n", + "}\n", + "\n", + "#sk-container-id-3 div.sk-container {\n", + " /* jupyter's `normalize.less` sets `[hidden] { display: none; }`\n", + " but bootstrap.min.css set `[hidden] { display: none !important; }`\n", + " so we also need the `!important` here to be able to override the\n", + " default hidden behavior on the sphinx rendered scikit-learn.org.\n", + " See: https://github.com/scikit-learn/scikit-learn/issues/21755 */\n", + " display: inline-block !important;\n", + " position: relative;\n", + "}\n", + "\n", + "#sk-container-id-3 div.sk-text-repr-fallback {\n", + " display: none;\n", + "}\n", + "\n", + "div.sk-parallel-item,\n", + "div.sk-serial,\n", + "div.sk-item {\n", + " /* draw centered vertical line to link estimators */\n", + " background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));\n", + " background-size: 2px 100%;\n", + " background-repeat: no-repeat;\n", + " background-position: center 
center;\n", + "}\n", + "\n", + "/* Parallel-specific style estimator block */\n", + "\n", + "#sk-container-id-3 div.sk-parallel-item::after {\n", + " content: \"\";\n", + " width: 100%;\n", + " border-bottom: 2px solid var(--sklearn-color-text-on-default-background);\n", + " flex-grow: 1;\n", + "}\n", + "\n", + "#sk-container-id-3 div.sk-parallel {\n", + " display: flex;\n", + " align-items: stretch;\n", + " justify-content: center;\n", + " background-color: var(--sklearn-color-background);\n", + " position: relative;\n", + "}\n", + "\n", + "#sk-container-id-3 div.sk-parallel-item {\n", + " display: flex;\n", + " flex-direction: column;\n", + "}\n", + "\n", + "#sk-container-id-3 div.sk-parallel-item:first-child::after {\n", + " align-self: flex-end;\n", + " width: 50%;\n", + "}\n", + "\n", + "#sk-container-id-3 div.sk-parallel-item:last-child::after {\n", + " align-self: flex-start;\n", + " width: 50%;\n", + "}\n", + "\n", + "#sk-container-id-3 div.sk-parallel-item:only-child::after {\n", + " width: 0;\n", + "}\n", + "\n", + "/* Serial-specific style estimator block */\n", + "\n", + "#sk-container-id-3 div.sk-serial {\n", + " display: flex;\n", + " flex-direction: column;\n", + " align-items: center;\n", + " background-color: var(--sklearn-color-background);\n", + " padding-right: 1em;\n", + " padding-left: 1em;\n", + "}\n", + "\n", + "\n", + "/* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is\n", + "clickable and can be expanded/collapsed.\n", + "- Pipeline and ColumnTransformer use this feature and define the default style\n", + "- Estimators will overwrite some part of the style using the `sk-estimator` class\n", + "*/\n", + "\n", + "/* Pipeline and ColumnTransformer style (default) */\n", + "\n", + "#sk-container-id-3 div.sk-toggleable {\n", + " /* Default theme specific background. 
It is overwritten whether we have a\n", + " specific estimator or a Pipeline/ColumnTransformer */\n", + " background-color: var(--sklearn-color-background);\n", + "}\n", + "\n", + "/* Toggleable label */\n", + "#sk-container-id-3 label.sk-toggleable__label {\n", + " cursor: pointer;\n", + " display: block;\n", + " width: 100%;\n", + " margin-bottom: 0;\n", + " padding: 0.5em;\n", + " box-sizing: border-box;\n", + " text-align: center;\n", + "}\n", + "\n", + "#sk-container-id-3 label.sk-toggleable__label-arrow:before {\n", + " /* Arrow on the left of the label */\n", + " content: \"▸\";\n", + " float: left;\n", + " margin-right: 0.25em;\n", + " color: var(--sklearn-color-icon);\n", + "}\n", + "\n", + "#sk-container-id-3 label.sk-toggleable__label-arrow:hover:before {\n", + " color: var(--sklearn-color-text);\n", + "}\n", + "\n", + "/* Toggleable content - dropdown */\n", + "\n", + "#sk-container-id-3 div.sk-toggleable__content {\n", + " max-height: 0;\n", + " max-width: 0;\n", + " overflow: hidden;\n", + " text-align: left;\n", + " /* unfitted */\n", + " background-color: var(--sklearn-color-unfitted-level-0);\n", + "}\n", + "\n", + "#sk-container-id-3 div.sk-toggleable__content.fitted {\n", + " /* fitted */\n", + " background-color: var(--sklearn-color-fitted-level-0);\n", + "}\n", + "\n", + "#sk-container-id-3 div.sk-toggleable__content pre {\n", + " margin: 0.2em;\n", + " border-radius: 0.25em;\n", + " color: var(--sklearn-color-text);\n", + " /* unfitted */\n", + " background-color: var(--sklearn-color-unfitted-level-0);\n", + "}\n", + "\n", + "#sk-container-id-3 div.sk-toggleable__content.fitted pre {\n", + " /* unfitted */\n", + " background-color: var(--sklearn-color-fitted-level-0);\n", + "}\n", + "\n", + "#sk-container-id-3 input.sk-toggleable__control:checked~div.sk-toggleable__content {\n", + " /* Expand drop-down */\n", + " max-height: 200px;\n", + " max-width: 100%;\n", + " overflow: auto;\n", + "}\n", + "\n", + "#sk-container-id-3 
input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {\n", + " content: \"▾\";\n", + "}\n", + "\n", + "/* Pipeline/ColumnTransformer-specific style */\n", + "\n", + "#sk-container-id-3 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {\n", + " color: var(--sklearn-color-text);\n", + " background-color: var(--sklearn-color-unfitted-level-2);\n", + "}\n", + "\n", + "#sk-container-id-3 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n", + " background-color: var(--sklearn-color-fitted-level-2);\n", + "}\n", + "\n", + "/* Estimator-specific style */\n", + "\n", + "/* Colorize estimator box */\n", + "#sk-container-id-3 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {\n", + " /* unfitted */\n", + " background-color: var(--sklearn-color-unfitted-level-2);\n", + "}\n", + "\n", + "#sk-container-id-3 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n", + " /* fitted */\n", + " background-color: var(--sklearn-color-fitted-level-2);\n", + "}\n", + "\n", + "#sk-container-id-3 div.sk-label label.sk-toggleable__label,\n", + "#sk-container-id-3 div.sk-label label {\n", + " /* The background is the default theme color */\n", + " color: var(--sklearn-color-text-on-default-background);\n", + "}\n", + "\n", + "/* On hover, darken the color of the background */\n", + "#sk-container-id-3 div.sk-label:hover label.sk-toggleable__label {\n", + " color: var(--sklearn-color-text);\n", + " background-color: var(--sklearn-color-unfitted-level-2);\n", + "}\n", + "\n", + "/* Label box, darken color on hover, fitted */\n", + "#sk-container-id-3 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {\n", + " color: var(--sklearn-color-text);\n", + " background-color: var(--sklearn-color-fitted-level-2);\n", + "}\n", + "\n", + "/* Estimator label */\n", + "\n", + "#sk-container-id-3 div.sk-label label {\n", + " font-family: 
monospace;\n", + " font-weight: bold;\n", + " display: inline-block;\n", + " line-height: 1.2em;\n", + "}\n", + "\n", + "#sk-container-id-3 div.sk-label-container {\n", + " text-align: center;\n", + "}\n", + "\n", + "/* Estimator-specific */\n", + "#sk-container-id-3 div.sk-estimator {\n", + " font-family: monospace;\n", + " border: 1px dotted var(--sklearn-color-border-box);\n", + " border-radius: 0.25em;\n", + " box-sizing: border-box;\n", + " margin-bottom: 0.5em;\n", + " /* unfitted */\n", + " background-color: var(--sklearn-color-unfitted-level-0);\n", + "}\n", + "\n", + "#sk-container-id-3 div.sk-estimator.fitted {\n", + " /* fitted */\n", + " background-color: var(--sklearn-color-fitted-level-0);\n", + "}\n", + "\n", + "/* on hover */\n", + "#sk-container-id-3 div.sk-estimator:hover {\n", + " /* unfitted */\n", + " background-color: var(--sklearn-color-unfitted-level-2);\n", + "}\n", + "\n", + "#sk-container-id-3 div.sk-estimator.fitted:hover {\n", + " /* fitted */\n", + " background-color: var(--sklearn-color-fitted-level-2);\n", + "}\n", + "\n", + "/* Specification for estimator info (e.g. 
\"i\" and \"?\") */\n", + "\n", + "/* Common style for \"i\" and \"?\" */\n", + "\n", + ".sk-estimator-doc-link,\n", + "a:link.sk-estimator-doc-link,\n", + "a:visited.sk-estimator-doc-link {\n", + " float: right;\n", + " font-size: smaller;\n", + " line-height: 1em;\n", + " font-family: monospace;\n", + " background-color: var(--sklearn-color-background);\n", + " border-radius: 1em;\n", + " height: 1em;\n", + " width: 1em;\n", + " text-decoration: none !important;\n", + " margin-left: 1ex;\n", + " /* unfitted */\n", + " border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n", + " color: var(--sklearn-color-unfitted-level-1);\n", + "}\n", + "\n", + ".sk-estimator-doc-link.fitted,\n", + "a:link.sk-estimator-doc-link.fitted,\n", + "a:visited.sk-estimator-doc-link.fitted {\n", + " /* fitted */\n", + " border: var(--sklearn-color-fitted-level-1) 1pt solid;\n", + " color: var(--sklearn-color-fitted-level-1);\n", + "}\n", + "\n", + "/* On hover */\n", + "div.sk-estimator:hover .sk-estimator-doc-link:hover,\n", + ".sk-estimator-doc-link:hover,\n", + "div.sk-label-container:hover .sk-estimator-doc-link:hover,\n", + ".sk-estimator-doc-link:hover {\n", + " /* unfitted */\n", + " background-color: var(--sklearn-color-unfitted-level-3);\n", + " color: var(--sklearn-color-background);\n", + " text-decoration: none;\n", + "}\n", + "\n", + "div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,\n", + ".sk-estimator-doc-link.fitted:hover,\n", + "div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,\n", + ".sk-estimator-doc-link.fitted:hover {\n", + " /* fitted */\n", + " background-color: var(--sklearn-color-fitted-level-3);\n", + " color: var(--sklearn-color-background);\n", + " text-decoration: none;\n", + "}\n", + "\n", + "/* Span, style for the box shown on hovering the info icon */\n", + ".sk-estimator-doc-link span {\n", + " display: none;\n", + " z-index: 9999;\n", + " position: relative;\n", + " font-weight: normal;\n", + " right: .2ex;\n", + " 
padding: .5ex;\n", + " margin: .5ex;\n", + " width: min-content;\n", + " min-width: 20ex;\n", + " max-width: 50ex;\n", + " color: var(--sklearn-color-text);\n", + " box-shadow: 2pt 2pt 4pt #999;\n", + " /* unfitted */\n", + " background: var(--sklearn-color-unfitted-level-0);\n", + " border: .5pt solid var(--sklearn-color-unfitted-level-3);\n", + "}\n", + "\n", + ".sk-estimator-doc-link.fitted span {\n", + " /* fitted */\n", + " background: var(--sklearn-color-fitted-level-0);\n", + " border: var(--sklearn-color-fitted-level-3);\n", + "}\n", + "\n", + ".sk-estimator-doc-link:hover span {\n", + " display: block;\n", + "}\n", + "\n", + "/* \"?\"-specific style due to the `` HTML tag */\n", + "\n", + "#sk-container-id-3 a.estimator_doc_link {\n", + " float: right;\n", + " font-size: 1rem;\n", + " line-height: 1em;\n", + " font-family: monospace;\n", + " background-color: var(--sklearn-color-background);\n", + " border-radius: 1rem;\n", + " height: 1rem;\n", + " width: 1rem;\n", + " text-decoration: none;\n", + " /* unfitted */\n", + " color: var(--sklearn-color-unfitted-level-1);\n", + " border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n", + "}\n", + "\n", + "#sk-container-id-3 a.estimator_doc_link.fitted {\n", + " /* fitted */\n", + " border: var(--sklearn-color-fitted-level-1) 1pt solid;\n", + " color: var(--sklearn-color-fitted-level-1);\n", + "}\n", + "\n", + "/* On hover */\n", + "#sk-container-id-3 a.estimator_doc_link:hover {\n", + " /* unfitted */\n", + " background-color: var(--sklearn-color-unfitted-level-3);\n", + " color: var(--sklearn-color-background);\n", + " text-decoration: none;\n", + "}\n", + "\n", + "#sk-container-id-3 a.estimator_doc_link.fitted:hover {\n", + " /* fitted */\n", + " background-color: var(--sklearn-color-fitted-level-3);\n", + "}\n", + "
Pipeline(steps=[('robustscaler',\n",
+       "                 RobustScaler(quantile_range=(0.2187724978734,\n",
+       "                                              0.7909007640608))),\n",
+       "                ('variancethreshold',\n",
+       "                 VarianceThreshold(threshold=0.0193318854527)),\n",
+       "                ('featureunion',\n",
+       "                 FeatureUnion(transformer_list=[('skiptransformer',\n",
+       "                                                 SkipTransformer()),\n",
+       "                                                ('passthrough',\n",
+       "                                                 Passthrough())])),\n",
+       "                ('kneighborsclassifier',\n",
+       "                 KNeighborsClassifier(n_jobs=1, n_neighbors=1))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + ], + "text/plain": [ + "Pipeline(steps=[('robustscaler',\n", + " RobustScaler(quantile_range=(0.2187724978734,\n", + " 0.7909007640608))),\n", + " ('variancethreshold',\n", + " VarianceThreshold(threshold=0.0193318854527)),\n", + " ('featureunion',\n", + " FeatureUnion(transformer_list=[('skiptransformer',\n", + " SkipTransformer()),\n", + " ('passthrough',\n", + " Passthrough())])),\n", + " ('kneighborsclassifier',\n", + " KNeighborsClassifier(n_jobs=1, n_neighbors=1))])" + ] + }, + "execution_count": 28, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "#access the best performing pipeline with the lowest complexity\n", + "\n", + "best_pipeline_lowest_complexity = sorted_pareto_front.iloc[-1]['Instance']\n", + "best_pipeline_lowest_complexity" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Plot performance over time + Continuing a run from where it left off\n", + "\n", + "Plotting the performance over time is a good way of trying to access whether or not the TPOT model has converged. If performance seems to asymptote over time, there may not be much more performance to be gained by running for a longer period of time. If the plot looks like it is still actively improving, it may be worth running TPOT for a longer duration. \n", + "\n", + "There are two ways to resume TPOT. If the `warm_start` parameter is set to True, subsequent calls to `fit` will continue training where it left off (The conventional scikit-learn default is to retrain from scratch on subsequent calls to fit). Additionally, if `periodic_checkpoint_folder` is set, TPOT will periodically save its current state. If TPOT terminates normally, is interrupted (job canceled, PC shut off), or crashes (memory issues), it will be able to resume training from where it left off. 
** NOTE: If the periodic_checkpoint_folder is set, TPOT will always resume from the **" ] }, { @@ -640,6 +1987,26 @@ " \n" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Common mistakes or issues\n", + "\n", + "If you are experiencing issues with TPOT, here are some common issues and how to address them.\n", + "\n", + "* Performance is lower than expected.\n", + " * TPOT may have to be run for a longer duration, increase `max_time_mins`, `early_stop`, or `generations`.\n", + " * Individual pipelines may need more time to complete fitting, increase `max_eval_time_seconds`.\n", + " * The configuration may not include the optimal model types or hyperparameter ranges, explore other included templates or customize your own search space (see Tutorial 2!)\n", + "* TPOT is running forever and never terminating\n", + " * Check that at least one of the three termination conditions is set to a reasonable level. These are `max_time_mins`, `early_stop`, or `generations`. Additionally check that `max_eval_time_seconds` is giving enough time for most models to train without being overly long. (Some estimators may take an unreasonably long time to fit, this parameter is intended to prevent them from slowing everything to a halt. In my experience, SVC and SVR tend to be the culprits, so removing them from the search space may also improve run time).\n", + "* Many pipelines in the evaluated_individuals dataframe have crashed or turned up invalid!\n", + " * This may actually be normal and is expected behavior for TPOT. In some cases, TPOT may attempt an invalid hyperparameter combination which results in the pipeline not working. Other times, the pipeline configuration itself may be invalid. For example, a selector may not select any features due to its hyperparameter. Another common example is `MultinomialNB` throwing an error because it expects positive values, but a prior transformation yielded a negative value. 
\n", + " * If you used custom search spaces, you can use `ConfigSpace` conditionals to prevent invalid hyperparameters (this may still occur due to how TPOT uses crossover).\n", + " * Setting `verbose=5` will print out the full error message for all failed pipelines. This can be useful for debugging whether or not there is something misconfigured in your pipeline, custom search space modules, or something else." + ] + }, { "attachments": {}, "cell_type": "markdown", From 2ee9ac8ea5cf84858147b442b82a2ef3d2532bb9 Mon Sep 17 00:00:00 2001 From: perib Date: Fri, 20 Sep 2024 19:54:21 -0700 Subject: [PATCH 04/44] edits --- Tutorial/1_Using_TPOT.ipynb | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/Tutorial/1_Using_TPOT.ipynb b/Tutorial/1_Using_TPOT.ipynb index d3cf8307..6b74347e 100644 --- a/Tutorial/1_Using_TPOT.ipynb +++ b/Tutorial/1_Using_TPOT.ipynb @@ -1991,7 +1991,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Common mistakes or issues\n", + "### Common mistakes or issues - Debugging\n", "\n", "If you are experiencing issues with TPOT, here are some common issues and how to address them.\n", "\n", @@ -2004,7 +2004,16 @@ "* Many pipelines in the evaluated_individuals dataframe have crashed or turned up invalid!\n", " * This may actually be normal and is expected behavior for TPOT. In some cases, TPOT may attempt an invalid hyperparameter combination which results in the pipeline not working. Other times, the pipeline configuration itself may be invalid. For example, a selector may not select any features due to its hyperparameter. Another common example is `MultinomialNB` throwing an error because it expects positive values, but a prior transformation yielded a negative value. 
\n", " * If you used custom search spaces, you can use `ConfigSpace` conditionals to prevent invalid hyperparameters (this may still occur due to how TPOT uses crossover).\n", - " * Setting `verbose=5` will print out the full error message for all failed pipelines. This can be useful for debugging whether or not there is something misconfigured in your pipeline, custom search space modules, or something else." + " * Setting `verbose=5` will print out the full error message for all failed pipelines. This can be useful for debugging whether or not there is something misconfigured in your pipeline, custom search space modules, or something else.\n", + "\n", + "Other things to be aware of:\n", + "\n", + "* On small datasets, it is possible for TPOT to overfit the cross validation score itself. This can lead to lower than expected performance on held out datasets. TPOT will always return the model with the highest CV score as its final fitted_pipeline. However, if the highest performing model as evaluated by cross validation was actually just overfit to the CV score, it may perform worse than other models on the Pareto front.\n", + " * Using a secondary complexity objective and evaluating the entire Pareto front may be beneficial. In some cases a lower performing pipeline with lower complexity can actually perform better on held out sets. These can either be evaluated and compared on a held out validation set, or sometimes, if very data limited, simply using a different seed for splitting the CV folds can work as well.\n", + " * TPOT can do this automatically. The `validation_strategy` parameter can select between re-testing the final Pareto front on either a held out validation set (percent of data set by `validation_fraction`) or on a different seed for splitting the CV folds. 
These can be selected by setting `validation_strategy` to \"split\" or \"reshuffled\", respectively.\n", + " * Increasing the number of folds of cross validation can mitigate this. \n", + " * Nested cross validation can also be used to estimate the performance of the TPOT optimization algorithm itself.\n", + " * Removing more complex methods from the search space can reduce the chances of overfitting." ] }, { From 714b1e81f4e44ee107c1cd110b21ad67d375f6d9 Mon Sep 17 00:00:00 2001 From: perib Date: Fri, 20 Sep 2024 20:11:17 -0700 Subject: [PATCH 05/44] allowed memory to be passed through to linear pipelines (previously only supported graphpipelines) --- tpot2/search_spaces/base.py | 2 +- tpot2/search_spaces/nodes/fss_node.py | 2 +- tpot2/search_spaces/nodes/genetic_feature_selection.py | 2 +- tpot2/search_spaces/pipelines/choice.py | 4 ++-- tpot2/search_spaces/pipelines/dynamic_linear.py | 4 ++-- tpot2/search_spaces/pipelines/dynamicunion.py | 2 +- tpot2/search_spaces/pipelines/graph.py | 8 +++----- tpot2/search_spaces/pipelines/sequential.py | 9 ++++----- tpot2/search_spaces/pipelines/union.py | 2 +- tpot2/search_spaces/pipelines/wrapper.py | 4 ++-- tpot2/tpot_estimator/estimator.py | 2 +- tpot2/tpot_estimator/estimator_utils.py | 6 +++--- tpot2/tpot_estimator/steady_state_estimator.py | 2 +- 13 files changed, 23 insertions(+), 26 deletions(-) diff --git a/tpot2/search_spaces/base.py b/tpot2/search_spaces/base.py index 97bb2a57..ea3b751b 100644 --- a/tpot2/search_spaces/base.py +++ b/tpot2/search_spaces/base.py @@ -33,7 +33,7 @@ def wrapper(self, other, rng=None, **kwargs): return wrapper - def export_pipeline(self) -> BaseEstimator: + def export_pipeline(self, **kwargs) -> BaseEstimator: return def unique_id(self): diff --git a/tpot2/search_spaces/nodes/fss_node.py b/tpot2/search_spaces/nodes/fss_node.py index 4dda0d92..a5dfe2b8 100644 --- a/tpot2/search_spaces/nodes/fss_node.py +++ b/tpot2/search_spaces/nodes/fss_node.py @@ -55,7 +55,7 @@ def crossover(self, other, 
rng=None): self.selected_subset_name = other.selected_subset_name self.sel_subset = other.sel_subset - def export_pipeline(self): + def export_pipeline(self, **kwargs): return FeatureSetSelector(sel_subset=self.sel_subset, name=self.selected_subset_name) diff --git a/tpot2/search_spaces/nodes/genetic_feature_selection.py b/tpot2/search_spaces/nodes/genetic_feature_selection.py index 0bea039a..0af42aa1 100644 --- a/tpot2/search_spaces/nodes/genetic_feature_selection.py +++ b/tpot2/search_spaces/nodes/genetic_feature_selection.py @@ -160,7 +160,7 @@ def _crossover_swap(self, ss2, rng=None): self.mask = np.where(mask, self.mask, ss2.mask) - def export_pipeline(self): + def export_pipeline(self, **kwargs): return MaskSelector(mask=self.mask) diff --git a/tpot2/search_spaces/pipelines/choice.py b/tpot2/search_spaces/pipelines/choice.py index 694567db..da1fcfd0 100644 --- a/tpot2/search_spaces/pipelines/choice.py +++ b/tpot2/search_spaces/pipelines/choice.py @@ -33,8 +33,8 @@ def _mutate_node(self, rng=None): def crossover(self, other, rng=None): return self.node.crossover(other.node, rng) - def export_pipeline(self): - return self.node.export_pipeline() + def export_pipeline(self, **kwargs): + return self.node.export_pipeline(**kwargs) def unique_id(self): return self.node.unique_id() diff --git a/tpot2/search_spaces/pipelines/dynamic_linear.py b/tpot2/search_spaces/pipelines/dynamic_linear.py index 528ec7c4..e58005d3 100644 --- a/tpot2/search_spaces/pipelines/dynamic_linear.py +++ b/tpot2/search_spaces/pipelines/dynamic_linear.py @@ -127,8 +127,8 @@ def _crossover_node(self, other, rng): crossover_success = True return crossover_success - def export_pipeline(self): - return sklearn.pipeline.make_pipeline(*[step.export_pipeline() for step in self.pipeline]) + def export_pipeline(self, memory=None, **kwargs): + return sklearn.pipeline.make_pipeline(*[step.export_pipeline() for step in self.pipeline], memory=memory) def unique_id(self): l = [step.unique_id() for step in 
self.pipeline] diff --git a/tpot2/search_spaces/pipelines/dynamicunion.py b/tpot2/search_spaces/pipelines/dynamicunion.py index 8d8772eb..25a0147f 100644 --- a/tpot2/search_spaces/pipelines/dynamicunion.py +++ b/tpot2/search_spaces/pipelines/dynamicunion.py @@ -136,7 +136,7 @@ def _crossover_node(self, other, rng): return changed - def export_pipeline(self): + def export_pipeline(self, **kwargs): values = list(self.union_dict.values()) return sklearn.pipeline.make_union(*[step.export_pipeline() for step in values]) diff --git a/tpot2/search_spaces/pipelines/graph.py b/tpot2/search_spaces/pipelines/graph.py index fc769b1c..68a64441 100644 --- a/tpot2/search_spaces/pipelines/graph.py +++ b/tpot2/search_spaces/pipelines/graph.py @@ -85,7 +85,6 @@ def __init__( self.cross_val_predict_cv = cross_val_predict_cv self.method = method - self.memory = memory self.use_label_encoder = use_label_encoder self.root = self.root_search_space.generate(rng) @@ -597,7 +596,7 @@ def _merge_duplicated_nodes(self): return graph_changed - def export_pipeline(self): + def export_pipeline(self, memory=None, **kwargs): estimator_graph = self.graph.copy() #mapping = {node:node.method_class(**node.hyperparameters) for node in estimator_graph} @@ -623,7 +622,7 @@ def export_pipeline(self): for label, instance in label_to_instance.items(): estimator_graph.nodes[label]["instance"] = instance - return tpot2.GraphPipeline(graph=estimator_graph, memory=self.memory, use_label_encoder=self.use_label_encoder, method=self.method, cross_val_predict_cv=self.cross_val_predict_cv) + return tpot2.GraphPipeline(graph=estimator_graph, memory=memory, use_label_encoder=self.use_label_encoder, method=self.method, cross_val_predict_cv=self.cross_val_predict_cv) def plot(self): @@ -749,13 +748,12 @@ def __init__(self, self.cross_val_predict_cv = cross_val_predict_cv self.method = method - self.memory = memory self.use_label_encoder = use_label_encoder def generate(self, rng=None): rng = np.random.default_rng(rng) 
ind = GraphPipelineIndividual(self.root_search_space, self.leaf_search_space, self.inner_search_space, self.max_size, self.crossover_same_depth, - self.cross_val_predict_cv, self.method, self.memory, self.use_label_encoder, rng=rng) + self.cross_val_predict_cv, self.method, self.use_label_encoder, rng=rng) # if user specified limit, grab a random number between that limit if self.max_size is None or self.max_size == np.inf: diff --git a/tpot2/search_spaces/pipelines/sequential.py b/tpot2/search_spaces/pipelines/sequential.py index 8d7fc9ca..ab9f97da 100644 --- a/tpot2/search_spaces/pipelines/sequential.py +++ b/tpot2/search_spaces/pipelines/sequential.py @@ -126,8 +126,8 @@ def _crossover_node(self, other, rng): return crossover_success - def export_pipeline(self): - return sklearn.pipeline.make_pipeline(*[step.export_pipeline() for step in self.pipeline], memory=self.memory) + def export_pipeline(self, memory=None, **kwargs): + return sklearn.pipeline.make_pipeline(*[step.export_pipeline() for step in self.pipeline], memory=memory) def unique_id(self): l = [step.unique_id() for step in self.pipeline] @@ -138,13 +138,12 @@ def unique_id(self): class SequentialPipeline(SklearnIndividualGenerator): - def __init__(self, search_spaces : List[SklearnIndividualGenerator], memory=None ) -> None: + def __init__(self, search_spaces : List[SklearnIndividualGenerator] ) -> None: """ Takes in a list of search spaces. will produce a pipeline of Sequential length. Each step in the pipeline will correspond to the the search space provided in the same index. 
""" self.search_spaces = search_spaces - self.memory = memory def generate(self, rng=None): - return SequentialPipelineIndividual(self.search_spaces, memory=self.memory, rng=rng) \ No newline at end of file + return SequentialPipelineIndividual(self.search_spaces, rng=rng) \ No newline at end of file diff --git a/tpot2/search_spaces/pipelines/union.py b/tpot2/search_spaces/pipelines/union.py index 811ef38b..a8fd392b 100644 --- a/tpot2/search_spaces/pipelines/union.py +++ b/tpot2/search_spaces/pipelines/union.py @@ -60,7 +60,7 @@ def _crossover_node(self, other, rng): return crossover_success - def export_pipeline(self): + def export_pipeline(self, **kwargs): return sklearn.pipeline.make_union(*[step.export_pipeline() for step in self.pipeline]) def unique_id(self): diff --git a/tpot2/search_spaces/pipelines/wrapper.py b/tpot2/search_spaces/pipelines/wrapper.py index d61bc5f3..1b5807c8 100644 --- a/tpot2/search_spaces/pipelines/wrapper.py +++ b/tpot2/search_spaces/pipelines/wrapper.py @@ -100,14 +100,14 @@ def check_hyperparameters_for_None(self): self.hyperparameters[key] = False - def export_pipeline(self): + def export_pipeline(self, **kwargs): if self.hyperparameters_parser is not None: final_params = self.hyperparameters_parser(self.hyperparameters) else: final_params = self.hyperparameters - est = self.node.export_pipeline() + est = self.node.export_pipeline(**kwargs) wrapped_est = self.method(est, **final_params) return wrapped_est diff --git a/tpot2/tpot_estimator/estimator.py b/tpot2/tpot_estimator/estimator.py index ae1762d8..b8a14b75 100644 --- a/tpot2/tpot_estimator/estimator.py +++ b/tpot2/tpot_estimator/estimator.py @@ -881,7 +881,7 @@ def ind_generator(rng): if self.export_graphpipeline: best_individual_pipeline = best_individual.export_flattened_graphpipeline(memory=self.memory, cross_val_predict_cv=self.cross_val_predict_cv) else: - best_individual_pipeline = best_individual.export_pipeline() + best_individual_pipeline = 
best_individual.export_pipeline(memory=self.memory) if self.preprocessing: self.fitted_pipeline_ = sklearn.pipeline.make_pipeline(sklearn.base.clone(self._preprocessing_pipeline), best_individual_pipeline ) diff --git a/tpot2/tpot_estimator/estimator_utils.py b/tpot2/tpot_estimator/estimator_utils.py index 7be96e26..0eef127b 100644 --- a/tpot2/tpot_estimator/estimator_utils.py +++ b/tpot2/tpot_estimator/estimator_utils.py @@ -19,7 +19,7 @@ def apply_make_pipeline(graphindividual, preprocessing_pipeline=None, export_gra if export_graphpipeline: est = graphindividual.export_flattened_graphpipeline(**pipeline_kwargs) else: - est = graphindividual.export_pipeline() + est = graphindividual.export_pipeline(**pipeline_kwargs) if preprocessing_pipeline is None: @@ -38,7 +38,7 @@ def objective_function_generator(pipeline, x,y, scorers, cv, other_objective_fun if export_graphpipeline: pipeline = pipeline.export_flattened_graphpipeline(**pipeline_kwargs) else: - pipeline = pipeline.export_pipeline() + pipeline = pipeline.export_pipeline(**pipeline_kwargs) if budget is not None and budget < 1: if is_classification: @@ -70,7 +70,7 @@ def val_objective_function_generator(pipeline, X_train, y_train, X_test, y_test, if export_graphpipeline: pipeline = pipeline.export_flattened_graphpipeline(**pipeline_kwargs) else: - pipeline = pipeline.export_pipeline() + pipeline = pipeline.export_pipeline(**pipeline_kwargs) fitted_pipeline = sklearn.base.clone(pipeline) fitted_pipeline.fit(X_train, y_train) diff --git a/tpot2/tpot_estimator/steady_state_estimator.py b/tpot2/tpot_estimator/steady_state_estimator.py index e3440dd9..0a55c2c1 100644 --- a/tpot2/tpot_estimator/steady_state_estimator.py +++ b/tpot2/tpot_estimator/steady_state_estimator.py @@ -893,7 +893,7 @@ def ind_generator(rng): if self.export_graphpipeline: best_individual_pipeline = best_individual.export_flattened_graphpipeline(memory=self.memory, cross_val_predict_cv=self.cross_val_predict_cv) else: - best_individual_pipeline 
= best_individual.export_pipeline() + best_individual_pipeline = best_individual.export_pipeline(memory=self.memory) if self.preprocessing: self.fitted_pipeline_ = sklearn.pipeline.make_pipeline(sklearn.base.clone(self._preprocessing_pipeline), best_individual_pipeline ) From 37291373d94a7fec7035a57f2aac82ca0423fcb8 Mon Sep 17 00:00:00 2001 From: perib Date: Fri, 20 Sep 2024 20:13:51 -0700 Subject: [PATCH 06/44] edits --- Tutorial/1_Using_TPOT.ipynb | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/Tutorial/1_Using_TPOT.ipynb b/Tutorial/1_Using_TPOT.ipynb index 6b74347e..67abf7de 100644 --- a/Tutorial/1_Using_TPOT.ipynb +++ b/Tutorial/1_Using_TPOT.ipynb @@ -1999,12 +1999,15 @@ " * TPOT may have to be run for a longer duration, increase `max_time_mins`, `early_stop`, or `generations`.\n", " * Individual pipelines may need more time to complete fitting, increase `max_eval_time_seconds`.\n", " * The configuration may not include the optimal model types or hyperparameter ranges, explore other included templates or customize your own search space (see Tutorial 2!)\n", - "* TPOT is running forever and never terminating\n", + "* TPOT is too slow! It is running forever and never terminating\n", " * Check that at least one of the three termination conditions is set to a reasonable level. These are `max_time_mins`, `early_stop`, or `generations`. Additionally check that `max_eval_time_seconds` is giving enough time for most models to train without being overly long. (Some estimators may take an unreasonably long time to fit, this parameter is intended to prevent them from slowing everything to a halt. 
In my experience, SVC and SVR tend to be the culprits, so removing them from the search space may also improve run time).\n", + " * Set the `memory` parameter to allow TPOT to prevent repeated work when using either scikit-learn pipelines or TPOT GraphPipelines.\n", "* Many pipelines in the evaluated_individuals dataframe have crashed or turned up invalid!\n", " * This may actually be normal and is expected behavior for TPOT. In some cases, TPOT may attempt an invalid hyperparameter combination which results in the pipeline not working. Other times, the pipeline configuration itself may be invalid. For example, a selector may not select any features due to its hyperparameter. Another common example is `MultinomialNB` throwing an error because it expects positive values, but a prior transformation yielded a negative value. \n", " * If you used custom search spaces, you can use `ConfigSpace` conditionals to prevent invalid hyperparameters (this may still occur due to how TPOT uses crossover).\n", " * Setting `verbose=5` will print out the full error message for all failed pipelines. This can be useful for debugging whether or not there is something misconfigured in your pipeline, custom search space modules, or something else.\n", + "* TPOT is crashing due to memory issues\n", + " * Set the `memory_limit` parameter so that n_jobs*memorylimit is less than the available RAM on your machine plus some wiggle room. 
This should prevent crashing due to memory concerns.\n", "\n", "Other things to be aware of:\n", "\n", From fc4c4e554d3bd1e71ad20f4f95331260701e6874 Mon Sep 17 00:00:00 2001 From: perib Date: Fri, 20 Sep 2024 20:25:42 -0700 Subject: [PATCH 07/44] edit --- Tutorial/1_Using_TPOT.ipynb | 41 ++++++++++++++++++++++++++++++++++++- 1 file changed, 40 insertions(+), 1 deletion(-) diff --git a/Tutorial/1_Using_TPOT.ipynb b/Tutorial/1_Using_TPOT.ipynb index 67abf7de..64ef0153 100644 --- a/Tutorial/1_Using_TPOT.ipynb +++ b/Tutorial/1_Using_TPOT.ipynb @@ -1922,7 +1922,41 @@ "\n", "Plotting the performance over time is a good way of trying to access whether or not the TPOT model has converged. If performance seems to asymptote over time, there may not be much more performance to be gained by running for a longer period of time. If the plot looks like it is still actively improving, it may be worth running TPOT for a longer duration. \n", "\n", - "There are two ways to resume TPOT. If the `warm_start` parameter is set to True, subsequent calls to `fit` will continue training where it left off (The conventional scikit-learn default is to retrain from scratch on subsequent calls to fit). Additionally, if `periodic_checkpoint_folder` is set, TPOT will periodically save its current state. If TPOT terminates normally, is interrupted (job canceled, PC shut off), or crashes (memory issues), it will be able to resume training from where it left off. ** NOTE: If the periodic_checkpoint_folder is set, TPOT will always resume from the **" + "There are two ways to resume TPOT. If the `warm_start` parameter is set to True, subsequent calls to `fit` will continue training where it left off (The conventional scikit-learn default is to retrain from scratch on subsequent calls to fit). Additionally, if `periodic_checkpoint_folder` is set, TPOT will periodically save its current state. 
If TPOT terminates normally, is interrupted (job canceled, PC shut off), or crashes (memory issues), it will be able to resume training from where it left off. ** NOTE: If the periodic_checkpoint_folder is set, TPOT will always resume from the **\n", + "\n", + "In this case we can see that performance is near optimal and has slowed, so more time is likely unnecessary." + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "#get columns where roc_auc_score is not NaN\n", + "scores_and_times = df[df['roc_auc_score'].notna()][['roc_auc_score', 'Completed Timestamp']].sort_values('Completed Timestamp', ascending=True).to_numpy()\n", + "\n", + "#get best score at a given time\n", + "best_scores = np.maximum.accumulate(scores_and_times[:,0])\n", + "times = scores_and_times[:,1]\n", + "times = times - df['Submitted Timestamp'].min()\n", + "\n", + "fig, ax = plt.subplots(figsize=(10,5))\n", + "ax.plot(times, best_scores)\n", + "ax.set_xlabel('Time (seconds)')\n", + "ax.set_ylabel('Best Score')\n", + "plt.show()\n" ] }, { @@ -2002,12 +2036,17 @@ "* TPOT is too slow! It is running forever and never terminating\n", " * Check that at least one of the three termination conditions is set to a reasonable level. These are `max_time_mins`, `early_stop`, or `generations`. Additionally check that `max_eval_time_seconds` is giving enough time for most models to train without being overly long. (Some estimators may take an unreasonably long time to fit, this parameter is intended to prevent them from slowing everything to a halt. In my experience, SVC and SVR tend to be the culprits, so removing them from the search space may also improve run time).\n", " * Set the `memory` parameter to allow TPOT to prevent repeated work when using either scikit-learn pipelines or TPOT GraphPipelines.\n", + " * Increase n_jobs to use more processes/CPU power. 
See Tutorial 7 for advanced Dask usage, including parallelizing across multiple nodes on an HPC.\n", + " * Use feature selection, either the built-in configuration of sklearn methods (see Tutorial 2), or genetic feature selection (see Tutorials 3 and 5 for two different strategies).\n", + " * Use successive halving to reduce computational load (see Tutorial 8).\n", "* Many pipelines in the evaluated_individuals dataframe have crashed or turned up invalid!\n", " * This may actually be normal and is expected behavior for TPOT. In some cases, TPOT may attempt an invalid hyperparameter combination which results in the pipeline not working. Other times, the pipeline configuration itself may be invalid. For example, a selector may not select any features due to its hyperparameter. Another common example is `MultinomialNB` throwing an error because it expects positive values, but a prior transformation yielded a negative value. \n", " * If you used custom search spaces, you can use `ConfigSpace` conditionals to prevent invalid hyperparameters (this may still occur due to how TPOT uses crossover).\n", " * Setting `verbose=5` will print out the full error message for all failed pipelines. This can be useful for debugging whether or not there is something misconfigured in your pipeline, custom search space modules, or something else.\n", "* TPOT is crashing due to memory issues\n", " * Set the `memory_limit` parameter so that n_jobs*memorylimit is less than the available RAM on your machine plus some wiggle room. This should prevent crashing due to memory concerns.\n", + " * Using feature selection may also improve memory usage as described above.\n", + " * Remove modules that create high RAM usage (e.g. 
multiple PolynomialFeatures or one with high degree).\n", "\n", "Other things to be aware of:\n", "\n", From 66eaf7a788807e4537be292f35bbc406ce77b384 Mon Sep 17 00:00:00 2001 From: perib Date: Fri, 20 Sep 2024 20:57:42 -0700 Subject: [PATCH 08/44] edit --- Tutorial/1_Using_TPOT.ipynb | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/Tutorial/1_Using_TPOT.ipynb b/Tutorial/1_Using_TPOT.ipynb index 64ef0153..0e34e66d 100644 --- a/Tutorial/1_Using_TPOT.ipynb +++ b/Tutorial/1_Using_TPOT.ipynb @@ -1941,6 +1941,13 @@ }, "metadata": {}, "output_type": "display_data" + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Generation: : 0it [1:47:46, ?it/s]\n" + ] } ], "source": [ From 308fef3c5ff1239635d3aa3ba8f1a6390c06d7e0 Mon Sep 17 00:00:00 2001 From: perib Date: Mon, 23 Sep 2024 09:02:43 -0700 Subject: [PATCH 09/44] more docs --- ISSUE_TEMPLATE.md | 2 +- Tutorial/1_Using_TPOT.ipynb | 156 +++++++++++------- tpot2/tpot_estimator/estimator.py | 2 +- .../tpot_estimator/steady_state_estimator.py | 2 +- .../tpot_estimator/templates/tpottemplates.py | 4 +- 5 files changed, 105 insertions(+), 61 deletions(-) diff --git a/ISSUE_TEMPLATE.md b/ISSUE_TEMPLATE.md index 1bf2b7b4..ae32ba3b 100644 --- a/ISSUE_TEMPLATE.md +++ b/ISSUE_TEMPLATE.md @@ -8,7 +8,7 @@ ## Process to reproduce the issue -[ordered list the process to finding and recreating the issue, example below] +[ordered list the process to finding and recreating the issue, example below. A minimally reproducible example would be ideal. This refers to the minimum amount of code necessary to reproduce the issue.] 1. User creates TPOT instance 2. 
User calls TPOT `fit()` function with training data diff --git a/Tutorial/1_Using_TPOT.ipynb b/Tutorial/1_Using_TPOT.ipynb index 0e34e66d..182710da 100644 --- a/Tutorial/1_Using_TPOT.ipynb +++ b/Tutorial/1_Using_TPOT.ipynb @@ -1972,59 +1972,27 @@ "source": [ "### Common parameters\n", "\n", - " scorers : (list, scorer)\n", - " A scorer or list of scorers to be used in the cross-validation process. \n", - " see https://scikit-learn.org/stable/modules/model_evaluation.html\n", - " \n", - " scorers_weights : list\n", - " A list of weights to be applied to the scorers during the optimization process.\n", - " \n", - " classification : bool\n", - " If True, the problem is treated as a classification problem. If False, the problem is treated as a regression problem.\n", - " Used to determine the CV strategy.\n", - " \n", - " cv : int, cross-validator\n", - " - (int): Number of folds to use in the cross-validation process. By uses the sklearn.model_selection.KFold cross-validator for regression and StratifiedKFold for classification. In both cases, shuffled is set to True.\n", - " - (sklearn.model_selection.BaseCrossValidator): A cross-validator to use in the cross-validation process.\n", - " - max_depth (int): The maximum depth from any node to the root of the pipelines to be generated.\n", - " \n", - " other_objective_functions : list, default=[tpot2.objectives.estimator_objective_functions.average_path_length_objective]\n", - " A list of other objective functions to apply to the pipeline.\n", - " \n", - " other_objective_functions_weights : list, default=[-1]\n", - " A list of weights to be applied to the other objective functions.\n", - " \n", - " objective_function_names : list, default=None\n", - " A list of names to be applied to the objective functions. If None, will use the names of the objective functions.\n", - " \n", - " bigger_is_better : bool, default=True\n", - " If True, the objective function is maximized. If False, the objective function is minimized. 
Use negative weights to reverse the direction.\n", - " \n", - " generations : int, default=50\n", - " Number of generations to run\n", - " \n", - " max_time_mins : float, default=float(\"inf\")\n", - " Maximum time to run the optimization. If none or inf, will run until the end of the generations.\n", - " \n", - " max_eval_time_mins : float, default=60*5\n", - " Maximum time to evaluate a single individual. If none or inf, there will be no time limit per evaluation.\n", - "\n", - " n_jobs : int, default=1\n", - " Number of processes to run in parallel.\n", - " \n", - " memory_limit : str, default=\"4GB\"\n", - " Memory limit for each job. See Dask [LocalCluster documentation](https://distributed.dask.org/en/stable/api.html#distributed.Client) for more information.\n", - "\n", - " \n", - " verbose : int, default=1 \n", - " How much information to print during the optimization process. Higher values include the information from lower values.\n", - " 0. nothing\n", - " 1. progress bar\n", - " \n", - " 3. best individual\n", - " 4. warnings\n", - " >=5. full warnings trace\n", - " 6. evaluations progress bar. (Temporary: This used to be 2. Currently, using evaluation progress bar may prevent some instances were we terminate a generation early due to it reaching max_time_mins in the middle of a generation OR a pipeline failed to be terminated normally and we need to manually terminate it.)\n", + "Here is a subset of the most common parameters to customize and what they do. See the docs for `TPOTEstimator` or `TPOTEstimatorSteadyState` for full documentation of all parameters. 
\n", + "\n", + "| Parameter | Type | Description |\n", + "|--------------------------------|-----------------------|-----------------------------------------------------------------------------|\n", + "| scorers | list, scorer | List of scorers for cross-validation; see https://scikit-learn.org/stable/modules/model_evaluation.html |\n", + "| scorers_weights | list | Weights applied to scorers during optimization |\n", + "| classification | bool | Problem type: True for classification, False for regression |\n", + "| cv | int, cross-validator | Cross-validation strategy: int for folds or custom cross-validator |\n", + "| max_depth | int | Maximum pipeline depth |\n", + "| other_objective_functions | list | Additional objective functions; default: [average_path_length_objective] |\n", + "| other_objective_functions_weights | list | Weights for additional objective functions; default: [-1] |\n", + "| objective_function_names | list | Names for objective functions; default: None (uses function names) |\n", + "| bigger_is_better | bool | Optimization direction: True for maximize, False for minimize |\n", + "| generations | int | Number of optimization generations; default: 50 |\n", + "| max_time_mins | float | Maximum optimization time (minutes); default: infinite |\n", + "| max_eval_time_mins | float | Maximum evaluation time per individual (minutes); default: 300 |\n", + "| n_jobs | int | Number of parallel processes; default: 1 |\n", + "| memory_limit | str | Memory limit per job; default: \"4GB\" |\n", + "| verbose | int | Optimization process verbosity: 0 (none), 1 (progress), 3 (best individual), 4 (warnings), 5+ (full warnings) |\n", + "| memory | str, memory object | If supplied, pipeline will cache each transformer after calling fit with joblib.Memory. |\n", + "| periodic_checkpoint_folder | str | Folder to save the population to periodically. If None, no periodic saving will be done. 
If provided, training will resume from this checkpoint.|\n", " \n" ] }, @@ -2032,14 +2000,76 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Common mistakes or issues - Debugging\n", + "# Pipeline caching in TPOT (joblib.Memory)\n", + "\n", + "With the memory parameter, pipelines can cache the results of each transformer after fitting them. This feature is used to avoid repeated computation by transformers within a pipeline if the parameters and input data are identical to another fitted pipeline during the optimization process. TPOT allows users to specify a custom directory path or joblib.Memory in case they want to re-use the memory cache in future TPOT runs (or a warm_start run).\n", + "\n", + "There are three methods for enabling memory caching in TPOT:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from tpot2 import TPOTClassifier\n", + "from tempfile import mkdtemp\n", + "from joblib import Memory\n", + "from shutil import rmtree\n", + "\n", + "# Method 1, auto mode: TPOT uses memory caching with a temporary directory and cleans it up upon shutdown\n", + "est = TPOTClassifier(memory='auto')\n", + "\n", + "# Method 2, with a custom directory for memory caching\n", + "est = TPOTClassifier(memory='/to/your/path')\n", + "\n", + "# Method 3, with a Memory object\n", + "cachedir = mkdtemp() # Create a temporary folder\n", + "memory = Memory(location=cachedir, verbose=0) # joblib uses `location`; the older `cachedir` keyword was removed\n", + "est = TPOTClassifier(memory=memory)\n", + "\n", + "# Clear the cache directory when you don't need it anymore\n", + "rmtree(cachedir)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Note: TPOT does NOT clean up memory caches if users set a custom directory path or Memory object. 
We recommend that you clean up the memory caches when you don't need them anymore.**" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Checkpointing\n", + "\n", + "TPOT can checkpoint its progress to disk and resume from that point later if the `periodic_checkpoint_folder` parameter is used. TPOT will save its internal dataframe of pipelines and their performance to disk every generation, allowing you to interrupt TPOT’s execution and resume it later on the same or a different machine.\n", + "\n", + "This feature is useful in several scenarios:\n", + "\n", + "* Interrupting TPOT’s execution and resuming it later on the same or a different machine.\n", + "* Handling unexpected terminations, such as power outages, cluster job cancellations, bugs, errors, or out-of-memory issues. The checkpointed dataframe can be loaded and inspected to help diagnose problems.\n", + "* Running TPOT on a cluster and periodically saving its progress to disk.\n", + "\n", + "**Note: TPOT does not clean up the checkpoint files. If the `periodic_checkpoint_folder` parameter is set, it will always continue training from the last saved point, even if the input data has changed. A common issue is forgetting to change this folder between experiments, and TPOT continuing training from pipelines optimized for another dataset. If you intend to start a run from scratch, you must either remove the parameter, supply an empty folder, or delete the original checkpoint folder.**" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### FAQ and Debugging\n", "\n", "If you are experiencing issues with TPOT, here are some common issues and how to address them.\n", "\n", - "* Performance is lower than expected.\n", + "* Performance is lower than expected. 
what can I do?\n", " * TPOT may have to be run for a longer duration, increase `max_time_mins`, `early_stop`, or `generations`.\n", " * Individual pipelines may need more time to complete fitting, increase `max_eval_time_seconds`.\n", " * The configuration may not include the optimal model types or hyperparameter ranges, explore other included templates or customize your own search space (see Tutorial 2!)\n", + " * Check that `periodic_checkpoint_folder` is set correctly. A common issue is forgetting to change this folder between experiments, and TPOT continuing training from pipelines optimized for another dataset.\n", "* TPOT is too slow! It is running forever and never terminating\n", " * Check that at least one of the three termination conditions is set to a reasonable level. These are `max_time_mins`, `early_stop`, or `generations`. Additionally check that `max_eval_time_seconds` is giving enough time for most models to train without being overly long. (Some estimators may take an unreasonably long time to fit, this parameter is intended to prevent them from slowing everything to a halt. In my experience, SVC and SVR tend to be the culprits, so removing them from the search space may also improve run time).\n", " * Set the `memory` parameter to allow TPOT to prevent repeated work when using either scikit-learn pipelines or TPOT GraphPipelines.\n", @@ -2054,10 +2084,24 @@ " * Set the `memory_limit` parameter so that n_jobs*memorylimit is less than the available RAM on your machine plus some wiggle room. This should prevent crashing due to memory concerns.\n", " * Using feature selection may also improve memory usage as described above.\n", " * Remove modules that create high RAM usage (e.g. multiple PolynomialFeatures or one with high degree).\n", + "* Why are my TPOT runs not reproducible when random_state is set?\n", + " * Check that `periodic_checkpoint_folder` is set correctly. 
If this is set to a non-empty folder, TPOT will continue training from the checkpoint rather than start a new run from scratch. For TPOT runs to be reproducible, they have to have the same starting points.\n", + " * If using custom search spaces, make sure to pass a fixed `random_state` value into the configspace of the scikit-learn modules that utilize them. TPOT does not check whether estimators do or do not take in a random state value (See Tutorial 2).\n", + " * If using the pre-built search spaces provided by TPOT, make sure to pass in `random_state` to `tpot2.config.get_configspace` or `tpot2.config.template_search_spaces.get_template_search_spaces`. This ensures all estimators that support it get a fixed random_state value. (See Tutorial 2).\n", + " * If using custom Node and Pipeline types, make sure that all random decisions utilize the rng parameter passed into the mutation/crossover functions.\n", + " * If `max_eval_time_mins` is set, TPOT will terminate pipelines that go over this time limit. If a pipeline's evaluation time happens to be very close to the time limit, it's possible that small random fluctuations in CPU allocation may cause a given pipeline to be evaluated in one run but not another. This slightly different result would throw off the random number generator throughout the rest of the run. Setting `max_eval_time_mins` to None or a higher value may prevent this edge case.\n", + " * If using `TPOTEstimatorSteadyState` with `n_jobs`>1, it is also possible that random fluctuations in CPU allocation slightly change the order in which pipelines are evaluated, which will affect the downstream results. 
`TPOTEstimatorSteadyState` is more reliably reproducible when `n_jobs=1` (This is not an issue for the default `TPOTEstimator`, `TPOTClassifier`, `TPOTRegressor` as they use a batched generational approach where execution order does not impact results).\n", + "* TPOT is not using all the CPU cores I expected given my `n_jobs` setting.\n", + " * The default TPOT algorithm uses a generational approach. This means that TPOT needs to fully evaluate `population_size` (default 50) pipelines before starting the next batch. Often, TPOT will be waiting for the last few pipelines to finish evaluating, which may be fewer than `n_jobs`. Some estimators or pipelines can be significantly slower to evaluate than others. This can be addressed in a few ways:\n", + " * Decrease `max_eval_time_mins` to cut long running pipeline evaluations early.\n", + " * Remove estimators or hyperparameter configurations that are prone to very slow convergence (which is very often `SVC` or `SVR`).\n", + " * Alternatively, `TPOTEstimatorSteadyState` uses a slightly different backend for the evolutionary algorithm that does not utilize the generational approach. Instead, new pipelines are generated and evaluated as soon as the previous one finishes. With this estimator, all cores should be utilized at all times. \n", + "\n", + "\n", "\n", "Other things to be aware of:\n", "\n", - "* On small datasets, it is not impossible for TPOT to over fit the cross validation score itself. This can lead to lower than expected performance on held out datasets. TPOT will always return the model with the highest CV score as its final fitted_pipeline. However, if the highest performing model as evaluated by cross validation actually was just overfit to the CV score, it may actually be worse performing compared to other models on the pareto front.\n", + "* **Overfitting** On small datasets, it is not impossible for TPOT to over fit the cross validation score itself. 
This can lead to lower than expected performance on held out datasets. TPOT will always return the model with the highest CV score as its final fitted_pipeline. However, if the highest performing model as evaluated by cross validation actually was just overfit to the CV score, it may actually be worse performing compared to other models on the pareto front.\n", " * Using a secondary complexity objective and evaluating the entire pareto front may be beneficial. In some cases a lower performing pipeline with lower complexity can actually perform better on held out sets. These can either be evaluated and compared on a held out validation set, or sometimes, if very data limited, simply using a different seed of splitting the CV folds can work as well.\n", " * TPOT can do this automatically. The `validation_strategy` parameter than select between re-testing the final pareto front on either a held out validation set (percent of data set by `validation_fraction`) or on a different seed for splitting the CV folds. These can be selected by setting `validation_strategy` to \"split\" or \"reshuffled\", respectively.\n", " * Increasing the number of folds of cross validation can mitigate this. \n", diff --git a/tpot2/tpot_estimator/estimator.py b/tpot2/tpot_estimator/estimator.py index b8a14b75..825fb93f 100644 --- a/tpot2/tpot_estimator/estimator.py +++ b/tpot2/tpot_estimator/estimator.py @@ -154,7 +154,7 @@ def __init__(self, However, the output to the next node will come from cross_val_predict with the specified number of folds. memory: Memory object or string, default=None - If supplied, pipeline will cache each transformer after calling fit. This feature + If supplied, pipeline will cache each transformer after calling fit with joblib.Memory. This feature is used to avoid computing the fit transformers within a pipeline if the parameters and input data are identical with another fitted pipeline during optimization process. 
- String 'auto': diff --git a/tpot2/tpot_estimator/steady_state_estimator.py b/tpot2/tpot_estimator/steady_state_estimator.py index 0a55c2c1..bcfef964 100644 --- a/tpot2/tpot_estimator/steady_state_estimator.py +++ b/tpot2/tpot_estimator/steady_state_estimator.py @@ -214,7 +214,7 @@ def __init__(self, memory: Memory object or string, default=None - If supplied, pipeline will cache each transformer after calling fit. This feature + If supplied, pipeline will cache each transformer after calling fit with joblib.Memory. This feature is used to avoid computing the fit transformers within a pipeline if the parameters and input data are identical with another fitted pipeline during optimization process. - String 'auto': diff --git a/tpot2/tpot_estimator/templates/tpottemplates.py b/tpot2/tpot_estimator/templates/tpottemplates.py index cf529965..6b776754 100644 --- a/tpot2/tpot_estimator/templates/tpottemplates.py +++ b/tpot2/tpot_estimator/templates/tpottemplates.py @@ -87,7 +87,7 @@ def __init__( self, memory: Memory object or string, default=None - If supplied, pipeline will cache each transformer after calling fit. This feature + If supplied, pipeline will cache each transformer after calling fit with joblib.Memory. This feature is used to avoid computing the fit transformers within a pipeline if the parameters and input data are identical with another fitted pipeline during optimization process. - String 'auto': @@ -341,7 +341,7 @@ def __init__( self, memory: Memory object or string, default=None - If supplied, pipeline will cache each transformer after calling fit. This feature + If supplied, pipeline will cache each transformer after calling fit with joblib.Memory. This feature is used to avoid computing the fit transformers within a pipeline if the parameters and input data are identical with another fitted pipeline during optimization process. 
- String 'auto': From 1a891fbff2bcf25d0b5d3480f4a8f1c675cb7d31 Mon Sep 17 00:00:00 2001 From: perib Date: Mon, 23 Sep 2024 09:27:17 -0700 Subject: [PATCH 10/44] updated contribute, added graph light template, doc updates --- Tutorial/1_Using_TPOT.ipynb | 37 ++++++---- docs/contribute.md | 97 +++++++++++++++++++++++++- tpot2/config/template_search_spaces.py | 35 +++++++++- 3 files changed, 153 insertions(+), 16 deletions(-) diff --git a/Tutorial/1_Using_TPOT.ipynb b/Tutorial/1_Using_TPOT.ipynb index 182710da..da5ff472 100644 --- a/Tutorial/1_Using_TPOT.ipynb +++ b/Tutorial/1_Using_TPOT.ipynb @@ -178,7 +178,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Measuring Model Complexity\n", + "## Measuring Model Complexity\n", "\n", "When running TPOT, it can sometimes be beneficial to include a secondary objective that measures model complexity. More complex models can yield higher performance but this comes at the cost of interpretability. Simpler models may be more interpretable, but often have lower predictive performance. Sometimes, however, vast increases in complexity only marginally improve predictive performance. There may be other simpler and more interpretable pipelines with marginal performance decreases that could be acceptable for the increased interpretability. However, these pipelines are often missed by optimizing purely for performance. By including both performance and complexity as objective functions, TPOT will attempt to optimize the best pipeline for all complexity levels simultaneously. After optimization, the user will be able to see the complexity vs performance tradeoff and make the decision of which pipeline best suits their needs. 
\n", "\n", @@ -199,6 +199,7 @@ "| linear | A linear pipeline with the structure of \"Selector->(transformers+Passthrough)->(classifiers/regressors+Passthrough)->final classifier/regressor.\" For both the transformer and inner estimator layers, TPOT may choose one or more transformers/classifiers, or it may choose none. The inner classifier/regressor layer is optional. |\n", "| light | Same search space as linear, but without the inner classifier/regressor layer and with a reduced set of faster running estimators. |\n", "| graph | TPOT will optimize a pipeline in the shape of a directed acyclic graph. The nodes of the graph can include selectors, scalers, transformers, or classifiers/regressors (inner classifiers/regressors can optionally be not included). This will return a custom GraphPipeline rather than an sklearn Pipeline. More details in Tutorial 6. |\n", + "| graph-light | Same as graph search space, but with a reduced set of faster running estimators. |\n", "| mdr |TPOT will search over a series of feature selectors and Multifactor Dimensionality Reduction models to find a series of operators that maximize prediction accuracy. The TPOT MDR configuration is specialized for genome-wide association studies (GWAS), and is described in detail online here.\n", "\n", "Note that TPOT MDR may be slow to run because the feature selection routines are computationally expensive, especially on large datasets. |\n", @@ -212,7 +213,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Terminating Optimization\n", + "## Terminating Optimization (Early Stopping)\n", "\n", "Note that we use a short time duration for a quick example, but in practice you may need to run TPOT for a longer duration. by default, TPOT sets a time limit of 1 hour with a max limit of 5 minutes per pipeline. 
In practice you may want to increase these values.\n", "\n", @@ -228,10 +229,10 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Best Practices and tips:\n", + "## Best Practices and tips:\n", "\n", - "* When running tpot from an .py script, it is important to protect code with `if __name__==\"__main__\":` . This is because of how TPOT handles parallelization with Python and Dask.\n", - "* You can use the `early_stop` parameter to have TPOT terminate early. " + "* You can use the `early_stop` parameter to have TPOT terminate early. \n", + "* When running tpot from an .py script, it is important to protect code with `if __name__==\"__main__\":` . This is because of how TPOT handles parallelization with Python and Dask." ] }, { @@ -268,7 +269,8 @@ "# Example analysis and the Estimator class \n", "\n", "Here we use a toy example dataset included in scikit-learn. We will use the `light` configuration and the `complexity_scorer` to estimate complexity.\n", - "\n" + "\n", + "Note, for this toy example, we set a relatively short run time. In practice, we would recommend running TPOT for a longer duration with an `early_stop` value of around 5 to 20 (more details below)." ] }, { @@ -326,7 +328,7 @@ " search_space=\"light\",\n", " n_jobs=4, \n", " max_time_mins=60, \n", - " max_eval_time_mins=5,\n", + " max_eval_time_mins=10,\n", " early_stop=2,\n", " verbose=2,)\n", "est.fit(X_train, y_train)\n", @@ -844,7 +846,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### saving the pipeline\n", + "## Saving the Pipeline\n", "\n", "We recommend using dill or pickle to save the instance of the fitted_pipeline_. Note that we do not recommend pickling the TPOT object itself." 
] @@ -1189,7 +1191,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### Lets plot the performances of the different pipelines including the pareto front\n", + "## Lets plot the performances of the different pipelines including the pareto front\n", "\n", "Plotting the performance of multiple objectives in a scatterplot is a useful way to visualize the tradeoff between model complexity and predictive performance. This is best visualized when plotting the pareto front pipelines, which presents the best performing pipeline along the spectrum of complexity. Generally, higher complexity models may yield higher performance, but be more difficult to interpret. " ] @@ -1918,7 +1920,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### Plot performance over time + Continuing a run from where it left off\n", + "## Plot performance over time + Continuing a run from where it left off\n", "\n", "Plotting the performance over time is a good way of trying to access whether or not the TPOT model has converged. If performance seems to asymptote over time, there may not be much more performance to be gained by running for a longer period of time. If the plot looks like it is still actively improving, it may be worth running TPOT for a longer duration. \n", "\n", @@ -2044,7 +2046,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Checkpointing\n", + "# Checkpointing\n", "\n", "TPOT can checkpoint its progress to disk and resume from that point later if the `periodic_checkpoint_folder` parameter is used. 
TPOT will save its internal dataframe of pipelines and their performance to disk every generation, allowing you to interrupt TPOT’s execution and resume it later on the same or a different machine.\n", "\n", @@ -2061,7 +2063,16 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### FAQ and Debugging\n", + "# Parallelization\n", + "\n", + "See Tutorial 7 for more details on parallelization with Dask, including information of using multiple nodes." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# FAQ and Debugging\n", "\n", "If you are experiencing issues with TPOT, here are some common issues and how to address them.\n", "\n", @@ -2099,7 +2110,7 @@ "\n", "\n", "\n", - "Other things to be aware of:\n", + "## Other things to be aware of:\n", "\n", "* **Overfitting** On small datasets, it is not impossible for TPOT to over fit the cross validation score itself. This can lead to lower than expected performance on held out datasets. TPOT will always return the model with the highest CV score as its final fitted_pipeline. However, if the highest performing model as evaluated by cross validation actually was just overfit to the CV score, it may actually be worse performing compared to other models on the pareto front.\n", " * Using a secondary complexity objective and evaluating the entire pareto front may be beneficial. In some cases a lower performing pipeline with lower complexity can actually perform better on held out sets. These can either be evaluated and compared on a held out validation set, or sometimes, if very data limited, simply using a different seed of splitting the CV folds can work as well.\n", diff --git a/docs/contribute.md b/docs/contribute.md index ae86a06b..d8edf39a 100644 --- a/docs/contribute.md +++ b/docs/contribute.md @@ -1,3 +1,98 @@ # Contributing -We welcome you to check the existing issues for bugs or enhancements to work on. 
If you have an idea for an extension to TPOT, please file a new issue so we can discuss it. \ No newline at end of file +We welcome you to check the existing issues for bugs or enhancements to work on. If you have an idea for an extension to TPOT, please file a new issue so we can discuss it. + +# Contribution Guide + +We welcome you to [check the existing issues](https://github.com/EpistasisLab/tpot2/issues/) for bugs or enhancements to work on. If you have an idea for an extension to TPOT, please [file a new issue](https://github.com/EpistasisLab/tpot2/issues/new) so we can discuss it. + +## Project layout + +The latest stable release of TPOT is on the [main branch](https://github.com/EpistasisLab/tpot2/tree/main), whereas the latest version of TPOT in development is on the [dev branch](https://github.com/EpistasisLab/tpot2/tree/dev). Make sure you are looking at and working on the correct branch if you're looking to contribute code. + +In terms of directory structure: + +* All of TPOT's code sources are in the `tpot2` directory +* The documentation sources are in the `docs` directory +* Images in the documentation are in the `images` directory +* Tutorials for TPOT are in the `Tutorial` directory +* Unit tests for TPOT are in the `tpot2/tests` directory + +Make sure to familiarize yourself with the project layout before making any major contributions, and especially make sure to send all code changes to the `dev` branch. + +## How to contribute + +The preferred way to contribute to TPOT is to fork the +[main repository](https://github.com/EpistasisLab/tpot2/) on +GitHub: + +1. Fork the [project repository](https://github.com/EpistasisLab/tpot2): + click on the 'Fork' button near the top of the page. This creates + a copy of the code under your account on the GitHub server. + +2. Clone this copy to your local disk: + + $ git clone git@github.com:YourUsername/tpot2.git + $ cd tpot2 + +3. 
Create a branch to hold your changes: + + $ git checkout -b my-contribution + +4. Make sure your local environment is setup correctly for development. Installation instructions are almost identical to [the user instructions](installing.md) except that TPOT should *not* be installed. If you have TPOT installed on your computer then make sure you are using a virtual environment that does not have TPOT installed. Furthermore, you should make sure you have installed the `pytest` package into your development environment so that you can test changes locally. + + $ conda install pytest + +5. Start making changes on your newly created branch, remembering to never work on the ``main`` branch! Work on this copy on your computer using Git to do the version control. + +6. Once some changes are saved locally, you can use your tweaked version of TPOT by navigating to the project's base directory and running TPOT directly from the command line: + + $ python -m tpot.driver + + or by running script that imports and uses the TPOT module with code similar to `from tpot import TPOTClassifier` + +7. To check your changes haven't broken any existing tests and to check new tests you've added pass run the following (note, you must have the `pytest` package installed within your dev environment for this to work): + + $ pytest -s -v + +8. When you're done editing and local testing, run: + + $ git add modified_files + $ git commit + + to record your changes in Git, then push them to GitHub with: + + $ git push -u origin my-contribution + +Finally, go to the web page of your fork of the TPOT repo, and click 'Pull Request' (PR) to send your changes to the maintainers for review. Make sure that you send your PR to the `dev` branch, as the `main` branch is reserved for the latest stable release. This will start the CI server to check all the project's unit tests run and send an email to the maintainers. 
+ +(If any of the above seems like magic to you, then look up the +[Git documentation](http://git-scm.com/documentation) on the web.) + +## Before submitting your pull request + +Before you submit a pull request for your contribution, please work through this checklist to make sure that you have done everything necessary so we can efficiently review and accept your changes. + +If your contribution changes TPOT in any way: + +* Update the [documentation](https://github.com/EpistasisLab/tpot2/tree/main/docs) so all of your changes are reflected there. + +* Update the [README](https://github.com/EpistasisLab/tpot2/blob/main/README.md) if anything there has changed. + +If your contribution involves any code changes: + +* Update the [project unit tests](https://github.com/EpistasisLab/tpot2/tree/main/tpot2/tests) to test your code changes. + +* Make sure that your code is properly commented with [docstrings](https://www.python.org/dev/peps/pep-0257/) and comments explaining your rationale behind non-obvious coding practices. + + +If your contribution requires a new library dependency: + +* Double-check that the new dependency is easy to install via `pip` or Anaconda. If the dependency requires a complicated installation, then we most likely won't merge your changes because we want to keep TPOT easy to install. + + +## After submitting your pull request + +After submitting your pull request, GitHub will automatically run unit tests on your changes and make sure that your updated code builds and runs. We also use services that automatically check code quality and test coverage. + +Check back shortly after submitting your pull request to make sure that your code passes these checks. If any of the checks come back with a red X, then do your best to address the errors. 
diff --git a/tpot2/config/template_search_spaces.py b/tpot2/config/template_search_spaces.py index 23b7a324..934720fb 100644 --- a/tpot2/config/template_search_spaces.py +++ b/tpot2/config/template_search_spaces.py @@ -67,9 +67,9 @@ def get_graph_search_space(classification=True, inner_predictors=True, **get_sea if classification: if inner_predictors: - inner_search_space = tpot2.config.get_search_space(["classifiers","transformers","scalers","selectors_regression"],**get_search_space_params) + inner_search_space = tpot2.config.get_search_space(["classifiers","transformers","scalers","selectors_classification"],**get_search_space_params) else: - inner_search_space = tpot2.config.get_search_space(["transformers","scalers","selectors_regression"],**get_search_space_params) + inner_search_space = tpot2.config.get_search_space(["transformers","scalers","selectors_classification"],**get_search_space_params) else: if inner_predictors: inner_search_space = tpot2.config.get_search_space(["regressors", "transformers","scalers","selectors_regression"],**get_search_space_params) @@ -86,6 +86,35 @@ def get_graph_search_space(classification=True, inner_predictors=True, **get_sea return search_space +def get_graph_search_space_light(classification=True, inner_predictors=True, **get_search_space_params ): + + if classification: + root_search_space = get_search_space(['BernoulliNB', 'DecisionTreeClassifier', 'GaussianNB', 'KNeighborsClassifier', 'LogisticRegression', 'MultinomialNB'], **get_search_space_params) + else: + root_search_space = get_search_space(["RidgeCV", "LinearSVR", "LassoLarsCV", "KNeighborsRegressor", "DecisionTreeRegressor", "ElasticNetCV"], **get_search_space_params) + + + if classification: + if inner_predictors: + inner_search_space = tpot2.config.get_search_space(['BernoulliNB', 'DecisionTreeClassifier', 'GaussianNB', 'KNeighborsClassifier', 'LogisticRegression', 'MultinomialNB',"transformers","scalers","SelectFwe", "SelectPercentile", 
"VarianceThreshold"],**get_search_space_params) + else: + inner_search_space = tpot2.config.get_search_space(["transformers","scalers","SelectFwe", "SelectPercentile", "VarianceThreshold"],**get_search_space_params) + else: + if inner_predictors: + inner_search_space = tpot2.config.get_search_space(["RidgeCV", "LinearSVR", "LassoLarsCV", "KNeighborsRegressor", "DecisionTreeRegressor", "ElasticNetCV", "transformers","scalers", "SelectFwe", "SelectPercentile", "VarianceThreshold"],**get_search_space_params) + else: + inner_search_space = tpot2.config.get_search_space(["transformers", "scalers", "SelectFwe", "SelectPercentile", "VarianceThreshold"],**get_search_space_params) + + + search_space = tpot2.search_spaces.pipelines.GraphPipeline( + root_search_space= root_search_space, + leaf_search_space = None, + inner_search_space = inner_search_space, + ) + + return search_space + + def get_light_search_space(classification=True, inner_predictors=False, **get_search_space_params ): selectors = get_search_space(["SelectFwe", "SelectPercentile", "VarianceThreshold","Passthrough"], **get_search_space_params) @@ -168,6 +197,8 @@ def get_template_search_spaces(default_search_space, classification=True, inner_ return get_linear_search_space(classification, inner_predictors, **get_search_space_params) elif default_search_space == "graph": return get_graph_search_space(classification, inner_predictors, **get_search_space_params) + elif default_search_space == "graph_light": + return get_graph_search_space_light(classification, inner_predictors, **get_search_space_params) elif default_search_space == "light": return get_light_search_space(classification, inner_predictors, **get_search_space_params) elif default_search_space == "mdr": From bd927f62e7949a48cb6c0d1bbac3403ee941a216 Mon Sep 17 00:00:00 2001 From: perib Date: Mon, 23 Sep 2024 10:43:58 -0700 Subject: [PATCH 11/44] docs and bug fixes --- Tutorial/1_Using_TPOT.ipynb | 4 +-- docs/contribute.md | 11 +++----- 
tpot2/builtin_modules/passkbinsdiscretizer.py | 2 +- tpot2/config/regressors.py | 6 ++--- tpot2/config/template_search_spaces.py | 6 ++--- tpot2/config/templates/__init__.py | 0 tpot2/config/templates/autoqtl.py | 0 tpot2/config/templates/stc.py | 0 .../nodes/estimator_node_gradual.py | 3 +++ tpot2/tpot_estimator/estimator.py | 13 +++++++++- .../tpot_estimator/templates/tpottemplates.py | 26 ++++++++++++++++--- 11 files changed, 49 insertions(+), 22 deletions(-) delete mode 100644 tpot2/config/templates/__init__.py delete mode 100644 tpot2/config/templates/autoqtl.py delete mode 100644 tpot2/config/templates/stc.py diff --git a/Tutorial/1_Using_TPOT.ipynb b/Tutorial/1_Using_TPOT.ipynb index da5ff472..0e1461f5 100644 --- a/Tutorial/1_Using_TPOT.ipynb +++ b/Tutorial/1_Using_TPOT.ipynb @@ -197,9 +197,9 @@ "| String | Description |\n", "| :--- | :----: |\n", "| linear | A linear pipeline with the structure of \"Selector->(transformers+Passthrough)->(classifiers/regressors+Passthrough)->final classifier/regressor.\" For both the transformer and inner estimator layers, TPOT may choose one or more transformers/classifiers, or it may choose none. The inner classifier/regressor layer is optional. |\n", - "| light | Same search space as linear, but without the inner classifier/regressor layer and with a reduced set of faster running estimators. |\n", + "| linear-light | Same search space as linear, but without the inner classifier/regressor layer and with a reduced set of faster running estimators. |\n", "| graph | TPOT will optimize a pipeline in the shape of a directed acyclic graph. The nodes of the graph can include selectors, scalers, transformers, or classifiers/regressors (inner classifiers/regressors can optionally be not included). This will return a custom GraphPipeline rather than an sklearn Pipeline. More details in Tutorial 6. |\n", - "| graph-light | Same as graph search space, but with a reduced set of faster running estimators. 
|\n", + "| graph-light | Same as graph search space, but without the inner classifier/regressors and with a reduced set of faster running estimators. |\n", "| mdr |TPOT will search over a series of feature selectors and Multifactor Dimensionality Reduction models to find a series of operators that maximize prediction accuracy. The TPOT MDR configuration is specialized for genome-wide association studies (GWAS), and is described in detail online here.\n", "\n", "Note that TPOT MDR may be slow to run because the feature selection routines are computationally expensive, especially on large datasets. |\n", diff --git a/docs/contribute.md b/docs/contribute.md index d8edf39a..6ef12518 100644 --- a/docs/contribute.md +++ b/docs/contribute.md @@ -45,17 +45,12 @@ GitHub: 5. Start making changes on your newly created branch, remembering to never work on the ``main`` branch! Work on this copy on your computer using Git to do the version control. -6. Once some changes are saved locally, you can use your tweaked version of TPOT by navigating to the project's base directory and running TPOT directly from the command line: - $ python -m tpot.driver +6. Check your changes haven't broken any existing tests and pass all your new tests. Navigate the terminal into the `tpot2/tpot2/` folder and run the command `pytest` to start all tests. (note, you must have the `pytest` package installed within your dev environment for this to work): - or by running script that imports and uses the TPOT module with code similar to `from tpot import TPOTClassifier` + $ pytest -7. To check your changes haven't broken any existing tests and to check new tests you've added pass run the following (note, you must have the `pytest` package installed within your dev environment for this to work): - - $ pytest -s -v - -8. When you're done editing and local testing, run: +7. 
When you're done editing and local testing, run: $ git add modified_files $ git commit diff --git a/tpot2/builtin_modules/passkbinsdiscretizer.py b/tpot2/builtin_modules/passkbinsdiscretizer.py index 6ca5a9b5..7b8d5f3d 100644 --- a/tpot2/builtin_modules/passkbinsdiscretizer.py +++ b/tpot2/builtin_modules/passkbinsdiscretizer.py @@ -15,7 +15,7 @@ class PassKBinsDiscretizer(BaseEstimator, TransformerMixin): """ Same as sklearn.preprocessing.KBinsDiscretizer, but passes through columns that are not discretized due to having fewer than n_bins unique values instead of ignoring them. """ - def __init__(self, n_bins=5, encode='onehot-dense', strategy='quantile', subsample='warn', random_state=None): + def __init__(self, n_bins=5, encode='onehot-dense', strategy='quantile', subsample=None, random_state=None): self.n_bins = n_bins self.encode = encode self.strategy = strategy diff --git a/tpot2/config/regressors.py b/tpot2/config/regressors.py index 6348f5c2..11362cce 100644 --- a/tpot2/config/regressors.py +++ b/tpot2/config/regressors.py @@ -15,7 +15,7 @@ def get_RandomForestRegressor_ConfigurationSpace(random_state): space = { 'n_estimators': 100, - 'criterion': Categorical("criterion", ['mse', 'mae', "friedman_mse"]), + 'criterion': Categorical("criterion", ['friedman_mse', 'poisson', 'absolute_error', 'squared_error']), 'max_features': Float("max_features", bounds=(0.05, 1.0)), 'bootstrap': Categorical("bootstrap", [True, False]), 'min_samples_split': Integer("min_samples_split", bounds=(2, 21)), @@ -221,7 +221,7 @@ def get_Perceptron_ConfigurationSpace(random_state): def get_DecisionTreeRegressor_ConfigurationSpace(random_state): space = { - 'criterion': Categorical("criterion", ['squared_error', 'friedman_mse', 'mae']), + 'criterion': Categorical("criterion", ['friedman_mse', 'poisson', 'absolute_error', 'squared_error']), # 'max_depth': Integer("max_depth", bounds=(1, n_features*2)), 'min_samples_split': Integer("min_samples_split", bounds=(2, 21)), 
'min_samples_leaf': Integer("min_samples_leaf", bounds=(1, 21)), @@ -334,7 +334,7 @@ def get_AdaBoostRegressor_ConfigurationSpace(random_state): def get_ExtraTreesRegressor_ConfigurationSpace(random_state): space = { 'n_estimators': 100, - 'criterion': Categorical("criterion", ["squared_error", "friedman_mse", "mae"]), + 'criterion': Categorical("criterion", ['friedman_mse', 'poisson', 'absolute_error', 'squared_error']), 'max_features': Float("max_features", bounds=(0.05, 1.0)), 'min_samples_split': Integer("min_samples_split", bounds=(2, 21)), 'min_samples_leaf': Integer("min_samples_leaf", bounds=(1, 21)), diff --git a/tpot2/config/template_search_spaces.py b/tpot2/config/template_search_spaces.py index 934720fb..8407fef8 100644 --- a/tpot2/config/template_search_spaces.py +++ b/tpot2/config/template_search_spaces.py @@ -187,7 +187,7 @@ def get_mdr_search_space(classification=True, **get_search_space_params ): def get_template_search_spaces(default_search_space, classification=True, inner_predictors=None, **get_search_space_params): if inner_predictors is None: - if default_search_space == "light": + if default_search_space == "linear-light" or default_search_space == "graph-light": inner_predictors = False else: inner_predictors = True @@ -197,9 +197,9 @@ def get_template_search_spaces(default_search_space, classification=True, inner_ return get_linear_search_space(classification, inner_predictors, **get_search_space_params) elif default_search_space == "graph": return get_graph_search_space(classification, inner_predictors, **get_search_space_params) - elif default_search_space == "graph_light": + elif default_search_space == "graph-light": return get_graph_search_space_light(classification, inner_predictors, **get_search_space_params) - elif default_search_space == "light": + elif default_search_space == "linear-light": return get_light_search_space(classification, inner_predictors, **get_search_space_params) elif default_search_space == "mdr": return 
get_mdr_search_space(classification, **get_search_space_params) diff --git a/tpot2/config/templates/__init__.py b/tpot2/config/templates/__init__.py deleted file mode 100644 index e69de29b..00000000 diff --git a/tpot2/config/templates/autoqtl.py b/tpot2/config/templates/autoqtl.py deleted file mode 100644 index e69de29b..00000000 diff --git a/tpot2/config/templates/stc.py b/tpot2/config/templates/stc.py deleted file mode 100644 index e69de29b..00000000 diff --git a/tpot2/search_spaces/nodes/estimator_node_gradual.py b/tpot2/search_spaces/nodes/estimator_node_gradual.py index b59e263b..b10cfba2 100644 --- a/tpot2/search_spaces/nodes/estimator_node_gradual.py +++ b/tpot2/search_spaces/nodes/estimator_node_gradual.py @@ -131,6 +131,9 @@ def gradual_hyperparameter_update(params:dict, configspace:ConfigurationSpace, r elif new_params[param] > configspace[param].upper: new_params[param] = configspace[param].upper new_params[param] = int(new_params[param]) + # TODO : add support for categorical hyperparameters + else: + new_params[param] = params[param] except: pass diff --git a/tpot2/tpot_estimator/estimator.py b/tpot2/tpot_estimator/estimator.py index 825fb93f..1077f014 100644 --- a/tpot2/tpot_estimator/estimator.py +++ b/tpot2/tpot_estimator/estimator.py @@ -113,7 +113,18 @@ def __init__(self, Parameters ---------- search_space : (String, tpot2.search_spaces.SklearnIndividualGenerator) - - String : The default search space to use for the optimization. This can be either "linear" or "graph". If "linear", will use the default linear pipeline search space. If "graph", will use the default graph pipeline search space. + - String : The default search space to use for the optimization. + | String | Description | + | :--- | :----: | + | linear | A linear pipeline with the structure of "Selector->(transformers+Passthrough)->(classifiers/regressors+Passthrough)->final classifier/regressor." 
For both the transformer and inner estimator layers, TPOT may choose one or more transformers/classifiers, or it may choose none. The inner classifier/regressor layer is optional. | + | linear-light | Same search space as linear, but without the inner classifier/regressor layer and with a reduced set of faster running estimators. | + | graph | TPOT will optimize a pipeline in the shape of a directed acyclic graph. The nodes of the graph can include selectors, scalers, transformers, or classifiers/regressors (inner classifiers/regressors can optionally be not included). This will return a custom GraphPipeline rather than an sklearn Pipeline. More details in Tutorial 6. | + | graph-light | Same as graph search space, but without the inner classifier/regressors and with a reduced set of faster running estimators. | + | mdr |TPOT will search over a series of feature selectors and Multifactor Dimensionality Reduction models to find a series of operators that maximize prediction accuracy. The TPOT MDR configuration is specialized for genome-wide association studies (GWAS), and is described in detail online here. + + Note that TPOT MDR may be slow to run because the feature selection routines are computationally expensive, especially on large datasets. | + + - SklearnIndividualGenerator : The search space to use for the optimization. This should be an instance of a SklearnIndividualGenerator. The search space to use for the optimization. This should be an instance of a SklearnIndividualGenerator. TPOT2 has groups of search spaces found in the following folders, tpot2.search_spaces.nodes for the nodes in the pipeline and tpot2.search_spaces.pipelines for the pipeline structure. 
diff --git a/tpot2/tpot_estimator/templates/tpottemplates.py b/tpot2/tpot_estimator/templates/tpottemplates.py index 6b776754..005e4747 100644 --- a/tpot2/tpot_estimator/templates/tpottemplates.py +++ b/tpot2/tpot_estimator/templates/tpottemplates.py @@ -10,7 +10,7 @@ class TPOTRegressor(TPOTEstimator): def __init__( self, - search_space = "linear", + search_space = "linear-light", scorers=['neg_mean_squared_error'], scorers_weights=[1], cv = 10, #remove this and use a value based on dataset size? @@ -44,7 +44,16 @@ def __init__( self, ---------- search_space : (String, tpot2.search_spaces.SklearnIndividualGenerator) - - String : The default search space to use for the optimization. This can be either "linear" or "graph". If "linear", will use the default linear pipeline search space. If "graph", will use the default graph pipeline search space. + - String : The default search space to use for the optimization. + | String | Description | + | :--- | :----: | + | linear | A linear pipeline with the structure of "Selector->(transformers+Passthrough)->(classifiers/regressors+Passthrough)->final classifier/regressor." For both the transformer and inner estimator layers, TPOT may choose one or more transformers/classifiers, or it may choose none. The inner classifier/regressor layer is optional. | + | linear-light | Same search space as linear, but without the inner classifier/regressor layer and with a reduced set of faster running estimators. | + | graph | TPOT will optimize a pipeline in the shape of a directed acyclic graph. The nodes of the graph can include selectors, scalers, transformers, or classifiers/regressors (inner classifiers/regressors can optionally be not included). This will return a custom GraphPipeline rather than an sklearn Pipeline. More details in Tutorial 6. | + | graph-light | Same as graph search space, but without the inner classifier/regressors and with a reduced set of faster running estimators. 
| + | mdr |TPOT will search over a series of feature selectors and Multifactor Dimensionality Reduction models to find a series of operators that maximize prediction accuracy. The TPOT MDR configuration is specialized for genome-wide association studies (GWAS), and is described in detail online here. + + Note that TPOT MDR may be slow to run because the feature selection routines are computationally expensive, especially on large datasets. | - SklearnIndividualGenerator : The search space to use for the optimization. This should be an instance of a SklearnIndividualGenerator. The search space to use for the optimization. This should be an instance of a SklearnIndividualGenerator. TPOT2 has groups of search spaces found in the following folders, tpot2.search_spaces.nodes for the nodes in the pipeline and tpot2.search_spaces.pipelines for the pipeline structure. @@ -263,7 +272,7 @@ def fit(self, X, y): class TPOTClassifier(TPOTEstimator): def __init__( self, - search_space = "linear", + search_space = "linear-light", scorers=['roc_auc_ovr'], scorers_weights=[1], cv = 10, @@ -298,7 +307,16 @@ def __init__( self, ---------- search_space : (String, tpot2.search_spaces.SklearnIndividualGenerator) - - String : The default search space to use for the optimization. This can be either "linear" or "graph". If "linear", will use the default linear pipeline search space. If "graph", will use the default graph pipeline search space. + - String : The default search space to use for the optimization. + | String | Description | + | :--- | :----: | + | linear | A linear pipeline with the structure of "Selector->(transformers+Passthrough)->(classifiers/regressors+Passthrough)->final classifier/regressor." For both the transformer and inner estimator layers, TPOT may choose one or more transformers/classifiers, or it may choose none. The inner classifier/regressor layer is optional. 
| + | linear-light | Same search space as linear, but without the inner classifier/regressor layer and with a reduced set of faster running estimators. | + | graph | TPOT will optimize a pipeline in the shape of a directed acyclic graph. The nodes of the graph can include selectors, scalers, transformers, or classifiers/regressors (inner classifiers/regressors can optionally be not included). This will return a custom GraphPipeline rather than an sklearn Pipeline. More details in Tutorial 6. | + | graph-light | Same as graph search space, but without the inner classifier/regressors and with a reduced set of faster running estimators. | + | mdr |TPOT will search over a series of feature selectors and Multifactor Dimensionality Reduction models to find a series of operators that maximize prediction accuracy. The TPOT MDR configuration is specialized for genome-wide association studies (GWAS), and is described in detail online here. + + Note that TPOT MDR may be slow to run because the feature selection routines are computationally expensive, especially on large datasets. | - SklearnIndividualGenerator : The search space to use for the optimization. This should be an instance of a SklearnIndividualGenerator. The search space to use for the optimization. This should be an instance of a SklearnIndividualGenerator. TPOT2 has groups of search spaces found in the following folders, tpot2.search_spaces.nodes for the nodes in the pipeline and tpot2.search_spaces.pipelines for the pipeline structure. 
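For illustration only, the mapping from search-space string to the default `inner_predictors` value described in the docstring tables above can be sketched as follows. This is a minimal sketch under stated assumptions, not the TPOT2 implementation: `resolve_inner_predictors`, `KNOWN_SPACES`, and `LIGHT_SPACES` are hypothetical names, and the default for the light spaces is assumed from the table wording ("without the inner classifier/regressor layer").

```python
# Hypothetical sketch: the string names mirror the docstring table,
# but none of these helpers exist in TPOT2.
KNOWN_SPACES = {"linear", "linear-light", "graph", "graph-light", "mdr"}
LIGHT_SPACES = {"linear-light", "graph-light"}


def resolve_inner_predictors(default_search_space, inner_predictors=None):
    """Return whether inner classifiers/regressors should be included."""
    if default_search_space not in KNOWN_SPACES:
        raise ValueError(f"Unknown search space: {default_search_space!r}")
    if inner_predictors is None:
        # Assumed default: light spaces drop the inner estimator layer.
        return default_search_space not in LIGHT_SPACES
    # An explicit user choice always wins over the default.
    return inner_predictors
```

Keeping the accepted names in one set means a renamed string (for example `light` becoming `linear-light`) is rejected explicitly rather than silently falling through to a default.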
From ff8137fba0acb5408d9a4adb4f07eb804ed4fecb Mon Sep 17 00:00:00 2001 From: perib Date: Mon, 23 Sep 2024 15:55:57 -0700 Subject: [PATCH 12/44] documentation, fix wrapper, fix estimatortransformer, rename GraphPipeline to GraphSearchPipeline, add cross_val_predict_cv option to templates, param update --- Tutorial/2_Search_Spaces.ipynb | 1758 ++++++++++++----- tpot2/builtin_modules/estimatortransformer.py | 30 +- tpot2/config/classifiers.py | 2 +- tpot2/config/template_search_spaces.py | 28 +- tpot2/old_config_utils/old_config_utils.py | 4 +- tpot2/search_spaces/pipelines/dynamicunion.py | 6 +- tpot2/search_spaces/pipelines/graph.py | 4 +- .../pipelines/tests/test_graphspace.py | 2 +- tpot2/search_spaces/pipelines/tree.py | 2 +- tpot2/tests/test_estimators.py | 2 +- 10 files changed, 1267 insertions(+), 571 deletions(-) diff --git a/Tutorial/2_Search_Spaces.ipynb b/Tutorial/2_Search_Spaces.ipynb index 3e975731..d97489e6 100644 --- a/Tutorial/2_Search_Spaces.ipynb +++ b/Tutorial/2_Search_Spaces.ipynb @@ -4,58 +4,89 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Everything can be done with the TPOTEstimator class. All other classes (TPOTRegressor, TPOTClassifier, TPOTSymbolicClassifier, TPOTSymbolicRegression, TPOTGeneticFeatureSetSelector, etc.) are actually just different default settings for TPOTEstimator.\n", + "# Intro\n", "\n", - "\n", - "By Default, TPOT will generate pipelines with a default set of classifiers or regressors as roots (this depends on whether classification is set to true or false). All other nodes are selected from a default list of selectors and transformers. Note: This differs from the TPOT1 behavior where by default classifiers and regressors can appear in locations other than the root. 
You can modify the the search space for leaves, inner nodes, and roots (final classifiers) separately through built in options or custom configuration dictionaries.\n", - "\n", - "In this tutorial we will walk through using the built in configurations, creating custom configurations, and using nested configurations." + "TPOT gives the user a lot of options for customizing the search space, from hyperparameter ranges to model selection to pipeline configuration. TPOT is able to select models, optimize their hyperparameters, as well as build a complex pipeline structure. Each level of detail has multiple options for customization. In this tutorial, first we will explore how to set up a hyperparameter search space for a single method. Next we will describe how to set up simultaneous model selection and hyperparameter tuning. And finally we will cover how to utilize these steps to configure a search space for a fixed pipeline of multiple steps as well as having TPOT optimize the pipeline structure itself.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "# ConfigSpace\n", + "# Hyperparameter Search Spaces with ConfigSpace\n", + "\n", + "Hyperparameter search spaces are defined using the [ConfigSpace package found here](https://github.com/automl/ConfigSpace). More information on how to set up a hyperparameter space can be found in their [documentation here](https://automl.github.io/ConfigSpace/main/guide.html).\n", + "\n", + "TPOT uses `ConfigSpace.ConfigurationSpace` objects to define the hyperparameter search space for individual models. This object can be used to keep track of the desired hyperparameters as well as provide functions for randomly sampling from this space.\n", "\n", - "Hyperparameter search spaces are defined using the [ConfigSpace package found here](https://github.com/automl/ConfigSpace). 
More information on how to set up a hyperparameter space can be found in their [documentation here](https://automl.github.io/ConfigSpace/main/guide.html)." + "In short, you can use the `Integer`, `Float`, and `Categorical` functions of `ConfigSpace` to define a range of values used for each param. Alternatively, a tuple with (min,max) ints or floats can be used to specify an int/float search space, and a list is used to specify a categorical search space. A fixed value can also be provided for parameters that are not tuned. The space parameter of `ConfigurationSpace` takes in a dictionary of param name to these ranges.\n", + "\n", + "Note: If you want results to be reproducible, you need to set a fixed random_state in the search space.\n", + "\n", + "Here is an example of a hyperparameter range for RandomForest" ] }, { "cell_type": "code", - "execution_count": 15, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled hyperparameters\n", - "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 10, 'p': 2, 'weights': 'distance'}\n" - ] - } - ], + "outputs": [], "source": [ "from ConfigSpace import ConfigurationSpace\n", "from ConfigSpace import ConfigurationSpace, Integer, Float, Categorical, Normal\n", - "from sklearn.neighbors import KNeighborsClassifier\n", + "from sklearn.ensemble import RandomForestClassifier\n", "\n", - "knn_configspace = ConfigurationSpace(\n", + "rf_configspace = ConfigurationSpace(\n", " space = {\n", + " 'n_estimators': 128, #as recommended by Oshiro et al.
(2012)\n", + " 'max_features': Float(\"max_features\", bounds=(0.01,1), log=True), #log scale like autosklearn?\n", + " 'criterion': Categorical(\"criterion\", ['gini', 'entropy']),\n", + " 'min_samples_split': Integer(\"min_samples_split\", bounds=(2, 20)),\n", + " 'min_samples_leaf': Integer(\"min_samples_leaf\", bounds=(1, 20)),\n", + " 'bootstrap': Categorical(\"bootstrap\", [True, False]),\n", + " #'random_state': 1, # If you want results to be reproducible, you can set a fixed random_state.\n", + " }\n", + ")\n", "\n", - " 'n_neighbors': (1, 10),\n", - " 'weights': Categorical(\"weights\", ['uniform', 'distance']),\n", - " 'p': (1, 3),\n", - " 'metric': Categorical(\"metric\", ['euclidean', 'minkowski']),\n", - " 'n_jobs': 1,\n", - " }\n", + "hyperparameters = dict(rf_configspace.sample_configuration())\n", + "print(\"sampled hyperparameters\")\n", + "print(hyperparameters)\n", + "\n", + "rf = RandomForestClassifier(**hyperparameters)\n", + "rf" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "More simply:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "rf_configspace = ConfigurationSpace(\n", + " space = {\n", + " 'n_estimators': 128, #as recommended by Oshiro et al.
(2012)\n", + " 'max_features':(0.01,1.0), #not log scaled\n", + " 'criterion': ['gini', 'entropy'],\n", + " 'min_samples_split': (2, 20),\n", + " 'min_samples_leaf': (1, 20),\n", + " 'bootstrap': [True, False],\n", + " #'random_state': 1, # If you want results to be reproducible, you can set a fixed random_state.\n", + " }\n", ")\n", "\n", - "hyperparameters = dict(knn_configspace.sample_configuration())\n", + "hyperparameters = dict(rf_configspace.sample_configuration())\n", "print(\"sampled hyperparameters\")\n", "print(hyperparameters)\n", "\n", - "knn = KNeighborsClassifier(**hyperparameters)" + "rf = RandomForestClassifier(**hyperparameters)\n", + "rf" ] }, { @@ -68,22 +99,18 @@ "\n", "TPOT search spaces are found in the `search_spaces` module. There are two primary kinds of search spaces, node and pipeline. Node search spaces specify the search space of a single sklearn `BaseEstimator`. Pipeline search spaces define the possible structures for a group of node search spaces. These take in node search spaces and produce a pipeline using nodes from that search space. Since sklearn Pipelines are also `BaseEstimator`, pipeline search spaces are also technically node search spaces. Meaning that pipeline search spaces can take in other pipeline search spaces in order to define more complex structures. The primary differentiating factor bewteen node and pipeline search spaces is that pipeline search spaces must take in another search space as input to feed its individual nodes. Therefore, all search spaces eventually end in a node search space at the lowest level. 
Note that parameters for pipeline search spaces can differ, some take in only a single search space, some take in a list, or some take in multiple defined parameters.\n", "\n", - "search spaces can be found in tpot2.search_spaces.nodes and tpot2.search_spaces.pipelines\n", + "## node search spaces\n", "\n", - "### node search spaces\n", - "found in tpot2.search_spaces.nodes\n", "\n", - "\n", - "EstimatorNode, GeneticFeatureSelector\n", "| Name | Info |\n", "| :--- | :----: |\n", "| EstimatorNode | Takes in a ConfigSpace along with the class of the method. This node will optimize the hyperparameters for a single method. |\n", "| GeneticFeatureSelectorNode | Uses evolution to optimize a set of features, exports a basic sklearn Selector that simply selects the features chosen by the node. |\n", + "| FSSNode | FSS stands for FeatureSetSelector. This node takes in a list of user-defined subsets of features and selects a single predefined subset. Note that TPOT will not create new subsets nor will it select multiple subsets per node. If using a linear pipeline, this node should be set as the first step. In linear pipelines it is recommended that you only use a small number of feature sets. I recommend exploring using FSSNodes in pipelines that allow TPOT to select more than one FSSNode at a time. For example, DynamicUnionPipeline and GraphSearchPipeline are both excellent combos for FSSNode. Use FSSNode inside a DynamicUnionPipeline at the start of a linear pipeline to explore optimal combinations of subsets in linear pipelines. Set FSSNode as the leaf_search_space of GraphSearchPipeline so that TPOT can use multiple feature sets in different ways, for example, with different transformers for different sets. 
|\n", "\n", "\n", "\n", - "\n", - "### pipeline search spaces\n", + "## pipeline search spaces\n", "\n", "found in tpot2.search_spaces.pipelines\n", "\n", @@ -96,8 +123,10 @@ "| ChoicePipeline | Takes in a list of search spaces. Will select one node from the search space. 
|\n", "| SequentialPipeline | Takes in a list of search spaces. will produce a pipeline of Sequential length. Each step in the pipeline will correspond to the the search space provided in the same index. |\n", "| DynamicLinearPipeline | Takes in a single search space. Will produce a linear pipeline of variable length. Each step in the pipeline will be pulled from the search space provided. |\n", + "| UnionPipeline | Takes in a list of search spaces. The returned pipeline will include one estimator per search space joined in an sklearn FeatureUnion. Useful for having many steps in one layer. |\n", + "| DynamicUnionPipeline | Takes in a single search space. It will pull anywhere from 1 to max_estimators number of estimators from the search space and concatenate them in a FeatureUnion. |\n", "| TreePipeline |Generates a pipeline of variable length. Pipeline will have a tree structure similar to TPOT1. |\n", - "| GraphPipeline | Generates a directed acyclic graph of variable size. Search spaces for root, leaf, and inner nodes can be defined separately if desired. |\n", + "| GraphSearchPipeline | Generates a directed acyclic graph of variable size. Search spaces for root, leaf, and inner nodes can be defined separately if desired. |\n", "| WrapperPipeline | This search space is for wrapping a sklearn estimator with a method that takes another estimator and hyperparameters as arguments. For example, this can be used with sklearn.ensemble.BaggingClassifier or sklearn.ensemble.AdaBoostClassifier. |\n" ] }, @@ -105,12 +134,22 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Estimator node example" + "## Node Search Space Examples\n", + "\n", + "Node search spaces represent the smallest unit of an sklearn pipeline. All node search spaces create and optimize a single node, or estimator object. 
For example, this could be a KNeighborsClassifier or a FeatureSetSelector.\n", + "\n", + "### EstimatorNode\n", + "\n", + "The EstimatorNode represents the hyperparameter search space for a scikit-learn estimator. \n", + "\n", + "Note that `ConfigSpace` doesn't support `None` in its search space, and does not support the booleans True or False as fixed parameters (though booleans seem to be allowed in Categorical search spaces). To get around this, use the special string constants defined in:\n", + "\n", + "`from tpot2.search_spaces.nodes.estimator_node import NONE_SPECIAL_STRING, TRUE_SPECIAL_STRING, FALSE_SPECIAL_STRING`" ] }, { "cell_type": "code", - "execution_count": 16, + "execution_count": 8, "metadata": {}, "outputs": [], "source": [ @@ -141,14 +180,33 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "You can sample generate an individual with the generate() function. This individual samples from the search space as well as provides mutation and crossover functions to modify the current sample.\n", - "\n", - "Note that ConfigurationSpace does not support None as a parameter. Instead, use the special string \"\\\". TPOT will automatically replace instances of this string with the Python None." + "You can generate an individual with the generate() function. This individual samples from the search space and provides mutation and crossover functions to modify the current sample." 
] }, { "cell_type": "code", - "execution_count": 17, + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "knn_individual = knn_node.generate()\n", + "knn_individual" + ] + }, + { + "cell_type": "code", + "execution_count": 11, "metadata": {}, "outputs": [ { @@ -156,17 +214,37 @@ "output_type": "stream", "text": [ "sampled hyperparameters\n", - "{'metric': 'minkowski', 'n_jobs': 1, 'n_neighbors': 6, 'p': 3, 'weights': 'distance'}\n", - "mutated hyperparameters\n", - "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 6, 'p': 1, 'weights': 'uniform'}\n" + "{'metric': 'minkowski', 'n_jobs': 1, 'n_neighbors': 4, 'p': 1, 'weights': 'uniform'}\n" ] } ], "source": [ - "knn_individual = knn_node.generate()\n", - "\n", "print(\"sampled hyperparameters\")\n", - "print(knn_individual.hyperparameters)\n", + "print(knn_individual.hyperparameters)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "All Individual objects have mutation and crossover operators that TPOT uses to optimize the pipelines." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "mutated hyperparameters\n", + "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 10, 'p': 3, 'weights': 'distance'}\n" + ] + } + ], + "source": [ "knn_individual.mutate() # mutate the individual\n", "print(\"mutated hyperparameters\")\n", "print(knn_individual.hyperparameters)" @@ -181,7 +259,7 @@ }, { "cell_type": "code", - "execution_count": 18, + "execution_count": 13, "metadata": {}, "outputs": [ { @@ -189,14 +267,14 @@ "output_type": "stream", "text": [ "original hyperparameters for individual 1\n", - "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 7, 'p': 2, 'weights': 'uniform'}\n", + "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 2, 'p': 2, 'weights': 'uniform'}\n", "original hyperparameters for individual 2\n", - "{'metric': 'minkowski', 'n_jobs': 1, 'n_neighbors': 1, 'p': 2, 'weights': 'uniform'}\n", + "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 10, 'p': 2, 'weights': 'uniform'}\n", "\n", "post crossover hyperparameters for individual 1\n", - "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 7, 'p': 2, 'weights': 'uniform'}\n", + "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 10, 'p': 2, 'weights': 'uniform'}\n", "post crossover hyperparameters for individual 2\n", - "{'metric': 'minkowski', 'n_jobs': 1, 'n_neighbors': 1, 'p': 2, 'weights': 'uniform'}\n" + "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 10, 'p': 2, 'weights': 'uniform'}\n" ] } ], @@ -229,7 +307,7 @@ }, { "cell_type": "code", - "execution_count": 19, + "execution_count": 14, "metadata": {}, "outputs": [ { @@ -639,31 +717,32 @@ " /* fitted */\n", " background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
KNeighborsClassifier(metric='euclidean', n_jobs=1, n_neighbors=7)
" + "
KNeighborsClassifier(metric='euclidean', n_jobs=1, n_neighbors=10)
" ], "text/plain": [ - "KNeighborsClassifier(metric='euclidean', n_jobs=1, n_neighbors=7)" + "KNeighborsClassifier(metric='euclidean', n_jobs=1, n_neighbors=10)" ] }, - "execution_count": 19, + "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "knn_individual1.export_pipeline()" + "est = knn_individual1.export_pipeline()\n", + "est" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "If a dictionary of parameters is passed instead of of a ConfigSpace, then the hyperparameters will be fixed and not learned." + "If a dictionary of parameters is passed instead of a ConfigSpace object, then the hyperparameters will always be fixed and not learned." ] }, { "cell_type": "code", - "execution_count": 20, + "execution_count": 15, "metadata": {}, "outputs": [ { @@ -1073,13 +1152,13 @@ " /* fitted */\n", " background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
KNeighborsClassifier(n_neighbors=10)
" + "
KNeighborsClassifier(n_neighbors=10)
" ], "text/plain": [ "KNeighborsClassifier(n_neighbors=10)" ] }, - "execution_count": 20, + "execution_count": 15, "metadata": {}, "output_type": "execute_result" } @@ -1107,30 +1186,41 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Pipeline Search Spaces" + "### FSSNode and GeneticFeatureSelectorNode\n", + "\n", + "Both of these are given their own tutorials. See Tutorial 3 for FSSNode and Tutorial 5 for GeneticFeatureSelectorNode." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Pipeline Search Space Examples\n", + "\n", + "Pipeline search spaces are used to define the structure and restrictions of the pipelines TPOT can search. Unlike Node search spaces, all pipeline search spaces take in other search spaces as inputs. Rather than sample hyperparameters, pipeline search spaces can select models from the input search spaces and organize them within a linear sklearn Pipeline or a TPOT GraphPipeline." 
] }, { "cell_type": "code", - "execution_count": 21, + "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "" + "" ] }, - "execution_count": 21, + "execution_count": 16, "metadata": {}, "output_type": "execute_result" } @@ -1228,7 +1318,7 @@ }, { "cell_type": "code", - "execution_count": 22, + "execution_count": 17, "metadata": {}, "outputs": [ { @@ -1645,16 +1735,16 @@ " /* fitted */\n", " background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
LogisticRegression(C=174.83656421187536, class_weight='balanced', dual=True,\n",
-       "                   max_iter=1000, n_jobs=1, penalty='l1', solver='saga')
" + "
DecisionTreeClassifier(max_depth=5, max_features='sqrt', min_samples_leaf=4,\n",
+       "                       min_samples_split=4)
" ], "text/plain": [ - "LogisticRegression(C=174.83656421187536, class_weight='balanced', dual=True,\n", - " max_iter=1000, n_jobs=1, penalty='l1', solver='saga')" + "DecisionTreeClassifier(max_depth=5, max_features='sqrt', min_samples_leaf=4,\n", + " min_samples_split=4)" ] }, - "execution_count": 22, + "execution_count": 17, "metadata": {}, "output_type": "execute_result" } @@ -1668,7 +1758,7 @@ }, { "cell_type": "code", - "execution_count": 23, + "execution_count": 18, "metadata": {}, "outputs": [ { @@ -2085,16 +2175,13 @@ " /* fitted */\n", " background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
KNeighborsClassifier(metric='euclidean', n_jobs=1, n_neighbors=3,\n",
-       "                     weights='distance')
" + "
KNeighborsClassifier(n_jobs=1, n_neighbors=6, p=1)
" ], "text/plain": [ - "KNeighborsClassifier(metric='euclidean', n_jobs=1, n_neighbors=3,\n", - " weights='distance')" + "KNeighborsClassifier(n_jobs=1, n_neighbors=6, p=1)" ] }, - "execution_count": 23, + "execution_count": 18, "metadata": {}, "output_type": "execute_result" } @@ -2111,21 +2198,29 @@ "source": [ "TPOT2 also comes with predefined search spaces. The current search spaces were adapted from a combination of the original TPOT package as well as the search spaces used in [AutoSklearn](https://github.com/automl/auto-sklearn/tree/development/autosklearn/pipeline/components). The helper function `tpot2.config.get_search_space` takes in a string or a list of strings, and returns either a EstimatorNode or a ChoicePipeline,respectively. \n", "\n", - "strings can correspond to individual methods. Tehre are also special strings that return predefined lists of methods. \n", + "strings can correspond to individual methods. There are also special strings that return predefined lists of methods. 
\n", "\n", - "Special strings are \"selectors\", \"classifiers\", \"transformers\"\n", - "\n", - "EstimatorNode, GeneticFeatureSelector\n", "| Special String | Included methods |\n", "| :--- | :----: |\n", - "| \"selectors\" | \"SelectFwe\", \"SelectPercentile\", \"VarianceThreshold\", \"RFE\", \"SelectFromModel\" |\n", - "| \"classifiers\" | \"LogisticRegression\", \"KNeighborsClassifier\", \"DecisionTreeClassifier\", \"SVC\", \"LinearSVC\", \"RandomForestClassifier\", \"GradientBoostingClassifier\", \"XGBClassifier\", \"LGBMClassifier\", \"ExtraTreesClassifier\", \"SGDClassifier\", \"MLPClassifier\", \"BernoulliNB\", \"MultinomialNB\" |\n", - "| \"transformers\" | \"Binarizer\", \"Normalizer\", \"PCA\", \"ZeroCount\", \"OneHotEncoder\", \"FastICA\", \"FeatureAgglomeration\", \"Nystroem\", \"RBFSampler\" |" + "| \"selectors\" | [\"SelectFwe\", \"SelectPercentile\", \"VarianceThreshold\",] |\n", + "| \"selectors_classification\" | [\"SelectFwe\", \"SelectPercentile\", \"VarianceThreshold\", \"RFE_classification\", \"SelectFromModel_classification\"] |\n", + "| \"selectors_regression\" | [\"SelectFwe\", \"SelectPercentile\", \"VarianceThreshold\", \"RFE_regression\", \"SelectFromModel_regression\"] |\n", + "| \"classifiers\" | [\"LGBMClassifier\", \"BaggingClassifier\", 'AdaBoostClassifier', 'BernoulliNB', 'DecisionTreeClassifier', 'ExtraTreesClassifier', 'GaussianNB', 'HistGradientBoostingClassifier', 'KNeighborsClassifier','LinearDiscriminantAnalysis', 'LogisticRegression', \"LinearSVC\", \"SVC\", 'MLPClassifier', 'MultinomialNB', \"QuadraticDiscriminantAnalysis\", 'RandomForestClassifier', 'SGDClassifier', 'XGBClassifier'] |\n", + "| \"regressors\" | [\"LGBMRegressor\", 'AdaBoostRegressor', \"ARDRegression\", 'DecisionTreeRegressor', 'ExtraTreesRegressor', 'HistGradientBoostingRegressor', 'KNeighborsRegressor', 'LinearSVR', \"MLPRegressor\", 'RandomForestRegressor', 'SGDRegressor', 'SVR', 'XGBRegressor'] |\n", + "| \"transformers\" | [\"PassKBinsDiscretizer\", 
\"Binarizer\", \"PCA\", \"ZeroCount\", \"ColumnOneHotEncoder\", \"FastICA\", \"FeatureAgglomeration\", \"Nystroem\", \"RBFSampler\", \"QuantileTransformer\", \"PowerTransformer\"] |\n", + "| \"scalers\" | [\"MinMaxScaler\", \"RobustScaler\", \"StandardScaler\", \"MaxAbsScaler\", \"Normalizer\", ] |\n", + "| \"all_transformers\" | [\"transformers\", \"scalers\"] |\n", + "| \"arithmatic\" | [\"AddTransformer\", \"mul_neg_1_Transformer\", \"MulTransformer\", \"SafeReciprocalTransformer\", \"EQTransformer\", \"NETransformer\", \"GETransformer\", \"GTTransformer\", \"LETransformer\", \"LTTransformer\", \"MinTransformer\", \"MaxTransformer\"] |\n", + "| \"imputers\" | [\"SimpleImputer\", \"IterativeImputer\", \"KNNImputer\"] |\n", + "| \"skrebate\" | [\"ReliefF\", \"SURF\", \"SURFstar\", \"MultiSURF\"] |\n", + "| \"genetic_encoders\" | [\"DominantEncoder\", \"RecessiveEncoder\", \"HeterosisEncoder\", \"UnderDominanceEncoder\", \"OverDominanceEncoder\"] |\n", + "| \"classifiers_sklearnex\" | [\"RandomForestClassifier_sklearnex\", \"LogisticRegression_sklearnex\", \"KNeighborsClassifier_sklearnex\", \"SVC_sklearnex\",\"NuSVC_sklearnex\"] |\n", + "| \"regressors_sklearnex\" | [\"LinearRegression_sklearnex\", \"Ridge_sklearnex\", \"Lasso_sklearnex\", \"ElasticNet_sklearnex\", \"SVR_sklearnex\", \"NuSVR_sklearnex\", \"RandomForestRegressor_sklearnex\", \"KNeighborsRegressor_sklearnex\"] |" ] }, { "cell_type": "code", - "execution_count": 24, + "execution_count": 19, "metadata": {}, "outputs": [ { @@ -2542,19 +2637,19 @@ " /* fitted */\n", " background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
LogisticRegression(C=0.09214193108798754, l1_ratio=0.6425731475282531,\n",
-       "                   max_iter=1000, n_jobs=1, penalty='elasticnet',\n",
-       "                   solver='saga')
" + "
LogisticRegression(C=0.6262919454224, class_weight='balanced',\n",
+       "                   l1_ratio=0.1219417333128, max_iter=1000, n_jobs=1,\n",
+       "                   penalty='elasticnet', solver='saga')
" ], "text/plain": [ - "LogisticRegression(C=0.09214193108798754, l1_ratio=0.6425731475282531,\n", - " max_iter=1000, n_jobs=1, penalty='elasticnet',\n", - " solver='saga')" + "LogisticRegression(C=0.6262919454224, class_weight='balanced',\n", + " l1_ratio=0.1219417333128, max_iter=1000, n_jobs=1,\n", + " penalty='elasticnet', solver='saga')" ] }, - "execution_count": 24, + "execution_count": 19, "metadata": {}, "output_type": "execute_result" } @@ -2569,7 +2664,7 @@ }, { "cell_type": "code", - "execution_count": 25, + "execution_count": 20, "metadata": {}, "outputs": [ { @@ -2986,16 +3081,19 @@ " /* fitted */\n", " background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
DecisionTreeClassifier(class_weight='balanced', max_depth=1, min_samples_leaf=8,\n",
-       "                       min_samples_split=9)
" + "
DecisionTreeClassifier(class_weight='balanced', max_depth=14,\n",
+       "                       max_features='sqrt', min_samples_leaf=3,\n",
+       "                       min_samples_split=16)
" ], "text/plain": [ - "DecisionTreeClassifier(class_weight='balanced', max_depth=1, min_samples_leaf=8,\n", - " min_samples_split=9)" + "DecisionTreeClassifier(class_weight='balanced', max_depth=14,\n", + " max_features='sqrt', min_samples_leaf=3,\n", + " min_samples_split=16)" ] }, - "execution_count": 25, + "execution_count": 20, "metadata": {}, "output_type": "execute_result" } @@ -3007,7 +3105,7 @@ }, { "cell_type": "code", - "execution_count": 26, + "execution_count": 21, "metadata": {}, "outputs": [ { @@ -3424,13 +3522,13 @@ " /* fitted */\n", " background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
LinearDiscriminantAnalysis(shrinkage=0.6166902161314916, solver='eigen')
" + "
KNeighborsClassifier(n_jobs=1, n_neighbors=3, p=3, weights='distance')
" ], "text/plain": [ - "LinearDiscriminantAnalysis(shrinkage=0.6166902161314916, solver='eigen')" + "KNeighborsClassifier(n_jobs=1, n_neighbors=3, p=3, weights='distance')" ] }, - "execution_count": 26, + "execution_count": 21, "metadata": {}, "output_type": "execute_result" } @@ -3445,7 +3543,7 @@ }, { "cell_type": "code", - "execution_count": 27, + "execution_count": 22, "metadata": {}, "outputs": [ { @@ -3862,16 +3960,16 @@ " /* fitted */\n", " background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
LogisticRegression(C=0.13397662986842293, max_iter=1000, n_jobs=1, penalty='l1',\n",
-       "                   solver='saga')
" + "
AdaBoostClassifier(algorithm='SAMME', learning_rate=0.0231006103189,\n",
+       "                   n_estimators=391)
" ], "text/plain": [ - "LogisticRegression(C=0.13397662986842293, max_iter=1000, n_jobs=1, penalty='l1',\n", - " solver='saga')" + "AdaBoostClassifier(algorithm='SAMME', learning_rate=0.0231006103189,\n", + " n_estimators=391)" ] }, - "execution_count": 27, + "execution_count": 22, "metadata": {}, "output_type": "execute_result" } @@ -3885,14 +3983,14 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Sequential Example\n", + "## SequentialPipeline\n", "\n", - "SequentialPipelines are of fixed length and sample from a predefined distribution for each step. Here is an example of the form Selector-Transformer-Classifer" + "SequentialPipelines are of fixed length and sample from a predefined distribution for each step. " ] }, { "cell_type": "code", - "execution_count": 28, + "execution_count": 27, "metadata": {}, "outputs": [ { @@ -3905,7 +4003,7 @@ { "data": { "text/html": [ - "
Pipeline(steps=[('selectpercentile',\n",
-       "                 SelectPercentile(percentile=67.96672316882378)),\n",
-       "                ('columnonehotencoder', ColumnOneHotEncoder()),\n",
+       "
Pipeline(steps=[('variancethreshold',\n",
+       "                 VarianceThreshold(threshold=0.0009684094023)),\n",
+       "                ('pca', PCA(n_components=0.8999344355157)),\n",
        "                ('logisticregression',\n",
-       "                 LogisticRegression(C=5839.203596349427,\n",
-       "                                    class_weight='balanced', max_iter=1000,\n",
-       "                                    n_jobs=1, solver='saga'))])
" ], "text/plain": [ - "Pipeline(steps=[('robustscaler',\n", - " RobustScaler(quantile_range=(0.2187724978734,\n", - " 0.7909007640608))),\n", - " ('variancethreshold',\n", - " VarianceThreshold(threshold=0.0193318854527)),\n", - " ('featureunion',\n", - " FeatureUnion(transformer_list=[('skiptransformer',\n", - " SkipTransformer()),\n", + "Pipeline(steps=[('minmaxscaler', MinMaxScaler()),\n", + " ('selectfwe', SelectFwe(alpha=0.0014048740592)),\n", + " ('featureunion-1',\n", + " FeatureUnion(transformer_list=[('featureunion',\n", + " FeatureUnion(transformer_list=[('zerocount',\n", + " ZeroCount())])),\n", + " ('passthrough',\n", + " Passthrough())])),\n", + " ('featureunion-2',\n", + " FeatureUnion(transformer_list=[('featureunion',\n", + " FeatureUnion(transformer_list=[('estimatortransformer',\n", + " EstimatorTransformer(estimator=BernoulliNB(alpha=76.5761838773666,\n", + " fit_prior=False)))])),\n", " ('passthrough',\n", " Passthrough())])),\n", " ('kneighborsclassifier',\n", - " KNeighborsClassifier(n_jobs=1, n_neighbors=1))])" + " KNeighborsClassifier(n_jobs=1, n_neighbors=1, p=3))])" ] }, - "execution_count": 28, + "execution_count": 18, "metadata": {}, "output_type": "execute_result" } @@ -1931,25 +1917,18 @@ }, { "cell_type": "code", - "execution_count": 37, + "execution_count": 19, "metadata": {}, "outputs": [ { "data": { - "image/png": "", + "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: : 0it [1:47:46, ?it/s]\n" - ] } ], "source": [ @@ -2011,7 +1990,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 22, "metadata": {}, "outputs": [], "source": [ @@ -2027,12 +2006,8 @@ "est = TPOTClassifier(memory='/to/your/path')\n", "\n", "# Method 3, with a Memory object\n", - "cachedir = mkdtemp() # Create a temporary folder\n", - "memory = Memory(cachedir=cachedir, verbose=0)\n", - "est = TPOTClassifier(memory=memory)\n", - "\n", - "# Clear the cache directory when you don't need it anymore\n", - "rmtree(cachedir)" + "memory = Memory(location='./to/your/path', verbose=0)\n", + "est = TPOTClassifier(memory=memory)\n" ] }, { @@ -2136,208 +2111,234 @@ "2. The `tpot2.TPOTEstimatorSteadyState` differs in that it will generate and evaluate the next individual as soon as an individual finishes evaluation. The number of individuals being evaluated is determined by the n_jobs parameter. There is no longer a concept of generations. The population_size parameter now refers to the size of the list of evaluated parents. When an individual is evaluated, the selection method updates the list of parents. 
This allows more efficient utilization when using multiple cores.\n" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### tpot2.TPOTEstimatorSteadyState" + ] + }, { "cell_type": "code", - "execution_count": null, + "execution_count": 27, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Evaluations: : 21it [00:13, 1.61it/s]\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0.9786392405063291\n" + ] + } + ], "source": [ "import tpot2\n", "import sklearn\n", "import sklearn.datasets\n", "\n", - "scorer = sklearn.metrics.get_scorer('roc_auc_ovo')\n", - "X, y = sklearn.datasets.load_iris(return_X_y=True)\n", - "X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, train_size=0.75, test_size=0.25)\n", "\n", - "\n", - "est = tpot2.TPOTClassifier(n_jobs=40, max_time_mins=30, verbose=5, generations=1, population_size=5)\n", - "est.fit(X_train, y_train)\n", - "\n", - "\n", - "print(scorer(est, X_test, y_test))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "est._evolver_instance.population.evaluated_individuals.iloc[0]['Individual'].export_pipeline()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import tpot2\n", - "import sklearn\n", - "import sklearn.metrics\n", - "import sklearn.datasets\n", - "\n", - "scorer = sklearn.metrics.get_scorer('neg_mean_squared_error')\n", - "X, y = sklearn.datasets.load_diabetes(return_X_y=True)\n", - "X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, train_size=0.75, test_size=0.25)\n", - "\n", - "est = tpot2.tpot_estimator.templates.TPOTRegressor(n_jobs=4, max_time_mins=30, verbose=2, cv=5)\n", - "est.fit(X_train, y_train)\n", - "\n", - "print(scorer(est, X_test, y_test))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - 
"source": [ - "### tpot2.TPOTEstimatorSteadyState" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import tpot2\n", - "import sklearn\n", - "import sklearn.datasets\n", - "\n", - "\n", - "graph_search_space = tpot2.search_spaces.pipelines.GraphPipeline(\n", - " root_search_space= tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"]),\n", - " leaf_search_space = tpot2.config.get_search_space(\"selectors\"), \n", - " inner_search_space = tpot2.config.get_search_space([\"transformers\"]),\n", - " max_size = 10,\n", - ")\n", - "\n", - "est = tpot2.TPOTEstimatorSteadyState( \n", - " search_space = graph_search_space,\n", - " scorers=['roc_auc_ovr'], #scorers can be a list of strings or a list of scorers. These get evaluated during cross validation. \n", - " scorers_weights=[1],\n", - "\n", - " classification=True,\n", - "\n", - " max_eval_time_mins=15,\n", - " max_time_mins=30,\n", - " verbose=2)\n", - "\n", - "\n", - "scorer = sklearn.metrics.get_scorer('roc_auc_ovo')\n", - "X, y = sklearn.datasets.load_iris(return_X_y=True)\n", - "X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, train_size=0.75, test_size=0.25)\n", - "est.fit(X_train, y_train)\n", - "print(scorer(est, X_test, y_test))\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fitted_pipeline = est.fitted_pipeline_ # access best pipeline directly\n", - "fitted_pipeline.plot()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#view the summary of all evaluated individuals as a pandas dataframe\n", - "est.evaluated_individuals" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import tpot2\n", - "import sklearn\n", - "import sklearn.datasets\n", + 
"graph_search_space = tpot2.search_spaces.pipelines.GraphSearchPipeline(\n", + " root_search_space= tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"]),\n", + " leaf_search_space = tpot2.config.get_search_space(\"selectors\"), \n", + " inner_search_space = tpot2.config.get_search_space([\"transformers\"]),\n", + " max_size = 10,\n", + ")\n", "\n", "est = tpot2.TPOTEstimatorSteadyState( \n", " search_space = graph_search_space,\n", " scorers=['roc_auc_ovr',tpot2.objectives.complexity_scorer],\n", " scorers_weights=[1,-1],\n", "\n", + "\n", " classification=True,\n", "\n", " max_eval_time_mins=15,\n", " max_time_mins=30,\n", + " early_stop=10, #In TPOTEstimatorSteadyState, since there are no generations, early_stop is the number of pipelines to evaluate before stopping.\n", " verbose=2)\n", "\n", "\n", "scorer = sklearn.metrics.get_scorer('roc_auc_ovo')\n", - "X, y = sklearn.datasets.load_iris(return_X_y=True)\n", + "X, y = sklearn.datasets.load_breast_cancer(return_X_y=True)\n", "X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, train_size=0.75, test_size=0.25)\n", "est.fit(X_train, y_train)\n", - "print(scorer(est, X_test, y_test))\n" + "print(scorer(est, X_test, y_test))" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 28, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], "source": [ "fitted_pipeline = est.fitted_pipeline_ # access best pipeline directly\n", - "fitted_pipeline.plot() #plot the best pipeline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "view the results of all evaluated individuals as a pandas dataframe" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "est.evaluated_individuals" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "view pareto front as a pandas dataframe" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "est.pareto_front" + "fitted_pipeline.plot()" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 30, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
roc_auc_scorecomplexity_scorerParentsVariation_FunctionIndividualSubmitted TimestampCompleted TimestampEval ErrorPareto_FrontInstance
00.47480520.0NaNNaN<tpot2.search_spaces.pipelines.graph.GraphPipe...1.727144e+091.727144e+09NoneNaN[('LogisticRegression_1', 'FastICA_1'), ('Fast...
10.96298378.0NaNNaN<tpot2.search_spaces.pipelines.graph.GraphPipe...1.727144e+091.727144e+09NoneNaN[('DecisionTreeClassifier_1', 'PCA_1'), ('Quan...
20.96231057.0NaNNaN<tpot2.search_spaces.pipelines.graph.GraphPipe...1.727144e+091.727144e+09NoneNaN[('DecisionTreeClassifier_1', 'SelectFwe_1')]
30.95690866.0NaNNaN<tpot2.search_spaces.pipelines.graph.GraphPipe...1.727144e+091.727144e+09NoneNaN[('DecisionTreeClassifier_1', 'SelectFwe_2'), ...
40.87919515.0NaNNaN<tpot2.search_spaces.pipelines.graph.GraphPipe...1.727144e+091.727144e+09NoneNaN[('DecisionTreeClassifier_1', 'SelectFwe_1')]
\n", + "
" + ], + "text/plain": [ + " roc_auc_score complexity_scorer Parents Variation_Function \\\n", + "0 0.474805 20.0 NaN NaN \n", + "1 0.962983 78.0 NaN NaN \n", + "2 0.962310 57.0 NaN NaN \n", + "3 0.956908 66.0 NaN NaN \n", + "4 0.879195 15.0 NaN NaN \n", + "\n", + " Individual Submitted Timestamp \\\n", + "0 #sk-container-id-1 {\n", + " /* Definition of color scheme common for light and dark mode */\n", + " --sklearn-color-text: black;\n", + " --sklearn-color-line: gray;\n", + " /* Definition of color scheme for unfitted estimators */\n", + " --sklearn-color-unfitted-level-0: #fff5e6;\n", + " --sklearn-color-unfitted-level-1: #f6e4d2;\n", + " --sklearn-color-unfitted-level-2: #ffe0b3;\n", + " --sklearn-color-unfitted-level-3: chocolate;\n", + " /* Definition of color scheme for fitted estimators */\n", + " --sklearn-color-fitted-level-0: #f0f8ff;\n", + " --sklearn-color-fitted-level-1: #d4ebff;\n", + " --sklearn-color-fitted-level-2: #b3dbfd;\n", + " --sklearn-color-fitted-level-3: cornflowerblue;\n", + "\n", + " /* Specific color for light theme */\n", + " --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n", + " --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, white)));\n", + " --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n", + " --sklearn-color-icon: #696969;\n", + "\n", + " @media (prefers-color-scheme: dark) {\n", + " /* Redefinition of color scheme for dark theme */\n", + " --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n", + " --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, #111)));\n", + " --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n", + 
" --sklearn-color-icon: #878787;\n", + " }\n", + "}\n", + "\n", + "#sk-container-id-1 {\n", + " color: var(--sklearn-color-text);\n", + "}\n", + "\n", + "#sk-container-id-1 pre {\n", + " padding: 0;\n", + "}\n", + "\n", + "#sk-container-id-1 input.sk-hidden--visually {\n", + " border: 0;\n", + " clip: rect(1px 1px 1px 1px);\n", + " clip: rect(1px, 1px, 1px, 1px);\n", + " height: 1px;\n", + " margin: -1px;\n", + " overflow: hidden;\n", + " padding: 0;\n", + " position: absolute;\n", + " width: 1px;\n", + "}\n", + "\n", + "#sk-container-id-1 div.sk-dashed-wrapped {\n", + " border: 1px dashed var(--sklearn-color-line);\n", + " margin: 0 0.4em 0.5em 0.4em;\n", + " box-sizing: border-box;\n", + " padding-bottom: 0.4em;\n", + " background-color: var(--sklearn-color-background);\n", + "}\n", + "\n", + "#sk-container-id-1 div.sk-container {\n", + " /* jupyter's `normalize.less` sets `[hidden] { display: none; }`\n", + " but bootstrap.min.css set `[hidden] { display: none !important; }`\n", + " so we also need the `!important` here to be able to override the\n", + " default hidden behavior on the sphinx rendered scikit-learn.org.\n", + " See: https://github.com/scikit-learn/scikit-learn/issues/21755 */\n", + " display: inline-block !important;\n", + " position: relative;\n", + "}\n", + "\n", + "#sk-container-id-1 div.sk-text-repr-fallback {\n", + " display: none;\n", + "}\n", + "\n", + "div.sk-parallel-item,\n", + "div.sk-serial,\n", + "div.sk-item {\n", + " /* draw centered vertical line to link estimators */\n", + " background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));\n", + " background-size: 2px 100%;\n", + " background-repeat: no-repeat;\n", + " background-position: center center;\n", + "}\n", + "\n", + "/* Parallel-specific style estimator block */\n", + "\n", + "#sk-container-id-1 div.sk-parallel-item::after {\n", + " content: \"\";\n", + " width: 100%;\n", + " border-bottom: 2px solid 
var(--sklearn-color-text-on-default-background);\n", + " flex-grow: 1;\n", + "}\n", + "\n", + "#sk-container-id-1 div.sk-parallel {\n", + " display: flex;\n", + " align-items: stretch;\n", + " justify-content: center;\n", + " background-color: var(--sklearn-color-background);\n", + " position: relative;\n", + "}\n", + "\n", + "#sk-container-id-1 div.sk-parallel-item {\n", + " display: flex;\n", + " flex-direction: column;\n", + "}\n", + "\n", + "#sk-container-id-1 div.sk-parallel-item:first-child::after {\n", + " align-self: flex-end;\n", + " width: 50%;\n", + "}\n", + "\n", + "#sk-container-id-1 div.sk-parallel-item:last-child::after {\n", + " align-self: flex-start;\n", + " width: 50%;\n", + "}\n", + "\n", + "#sk-container-id-1 div.sk-parallel-item:only-child::after {\n", + " width: 0;\n", + "}\n", + "\n", + "/* Serial-specific style estimator block */\n", + "\n", + "#sk-container-id-1 div.sk-serial {\n", + " display: flex;\n", + " flex-direction: column;\n", + " align-items: center;\n", + " background-color: var(--sklearn-color-background);\n", + " padding-right: 1em;\n", + " padding-left: 1em;\n", + "}\n", + "\n", + "\n", + "/* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is\n", + "clickable and can be expanded/collapsed.\n", + "- Pipeline and ColumnTransformer use this feature and define the default style\n", + "- Estimators will overwrite some part of the style using the `sk-estimator` class\n", + "*/\n", + "\n", + "/* Pipeline and ColumnTransformer style (default) */\n", + "\n", + "#sk-container-id-1 div.sk-toggleable {\n", + " /* Default theme specific background. 
It is overwritten whether we have a\n", + " specific estimator or a Pipeline/ColumnTransformer */\n", + " background-color: var(--sklearn-color-background);\n", + "}\n", + "\n", + "/* Toggleable label */\n", + "#sk-container-id-1 label.sk-toggleable__label {\n", + " cursor: pointer;\n", + " display: block;\n", + " width: 100%;\n", + " margin-bottom: 0;\n", + " padding: 0.5em;\n", + " box-sizing: border-box;\n", + " text-align: center;\n", + "}\n", + "\n", + "#sk-container-id-1 label.sk-toggleable__label-arrow:before {\n", + " /* Arrow on the left of the label */\n", + " content: \"▸\";\n", + " float: left;\n", + " margin-right: 0.25em;\n", + " color: var(--sklearn-color-icon);\n", + "}\n", + "\n", + "#sk-container-id-1 label.sk-toggleable__label-arrow:hover:before {\n", + " color: var(--sklearn-color-text);\n", + "}\n", + "\n", + "/* Toggleable content - dropdown */\n", + "\n", + "#sk-container-id-1 div.sk-toggleable__content {\n", + " max-height: 0;\n", + " max-width: 0;\n", + " overflow: hidden;\n", + " text-align: left;\n", + " /* unfitted */\n", + " background-color: var(--sklearn-color-unfitted-level-0);\n", + "}\n", + "\n", + "#sk-container-id-1 div.sk-toggleable__content.fitted {\n", + " /* fitted */\n", + " background-color: var(--sklearn-color-fitted-level-0);\n", + "}\n", + "\n", + "#sk-container-id-1 div.sk-toggleable__content pre {\n", + " margin: 0.2em;\n", + " border-radius: 0.25em;\n", + " color: var(--sklearn-color-text);\n", + " /* unfitted */\n", + " background-color: var(--sklearn-color-unfitted-level-0);\n", + "}\n", + "\n", + "#sk-container-id-1 div.sk-toggleable__content.fitted pre {\n", + " /* unfitted */\n", + " background-color: var(--sklearn-color-fitted-level-0);\n", + "}\n", + "\n", + "#sk-container-id-1 input.sk-toggleable__control:checked~div.sk-toggleable__content {\n", + " /* Expand drop-down */\n", + " max-height: 200px;\n", + " max-width: 100%;\n", + " overflow: auto;\n", + "}\n", + "\n", + "#sk-container-id-1 
input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {\n", + " content: \"▾\";\n", + "}\n", + "\n", + "/* Pipeline/ColumnTransformer-specific style */\n", + "\n", + "#sk-container-id-1 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {\n", + " color: var(--sklearn-color-text);\n", + " background-color: var(--sklearn-color-unfitted-level-2);\n", + "}\n", + "\n", + "#sk-container-id-1 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n", + " background-color: var(--sklearn-color-fitted-level-2);\n", + "}\n", + "\n", + "/* Estimator-specific style */\n", + "\n", + "/* Colorize estimator box */\n", + "#sk-container-id-1 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {\n", + " /* unfitted */\n", + " background-color: var(--sklearn-color-unfitted-level-2);\n", + "}\n", + "\n", + "#sk-container-id-1 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n", + " /* fitted */\n", + " background-color: var(--sklearn-color-fitted-level-2);\n", + "}\n", + "\n", + "#sk-container-id-1 div.sk-label label.sk-toggleable__label,\n", + "#sk-container-id-1 div.sk-label label {\n", + " /* The background is the default theme color */\n", + " color: var(--sklearn-color-text-on-default-background);\n", + "}\n", + "\n", + "/* On hover, darken the color of the background */\n", + "#sk-container-id-1 div.sk-label:hover label.sk-toggleable__label {\n", + " color: var(--sklearn-color-text);\n", + " background-color: var(--sklearn-color-unfitted-level-2);\n", + "}\n", + "\n", + "/* Label box, darken color on hover, fitted */\n", + "#sk-container-id-1 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {\n", + " color: var(--sklearn-color-text);\n", + " background-color: var(--sklearn-color-fitted-level-2);\n", + "}\n", + "\n", + "/* Estimator label */\n", + "\n", + "#sk-container-id-1 div.sk-label label {\n", + " font-family: 
monospace;\n", + " font-weight: bold;\n", + " display: inline-block;\n", + " line-height: 1.2em;\n", + "}\n", + "\n", + "#sk-container-id-1 div.sk-label-container {\n", + " text-align: center;\n", + "}\n", + "\n", + "/* Estimator-specific */\n", + "#sk-container-id-1 div.sk-estimator {\n", + " font-family: monospace;\n", + " border: 1px dotted var(--sklearn-color-border-box);\n", + " border-radius: 0.25em;\n", + " box-sizing: border-box;\n", + " margin-bottom: 0.5em;\n", + " /* unfitted */\n", + " background-color: var(--sklearn-color-unfitted-level-0);\n", + "}\n", + "\n", + "#sk-container-id-1 div.sk-estimator.fitted {\n", + " /* fitted */\n", + " background-color: var(--sklearn-color-fitted-level-0);\n", + "}\n", + "\n", + "/* on hover */\n", + "#sk-container-id-1 div.sk-estimator:hover {\n", + " /* unfitted */\n", + " background-color: var(--sklearn-color-unfitted-level-2);\n", + "}\n", + "\n", + "#sk-container-id-1 div.sk-estimator.fitted:hover {\n", + " /* fitted */\n", + " background-color: var(--sklearn-color-fitted-level-2);\n", + "}\n", + "\n", + "/* Specification for estimator info (e.g. 
\"i\" and \"?\") */\n", + "\n", + "/* Common style for \"i\" and \"?\" */\n", + "\n", + ".sk-estimator-doc-link,\n", + "a:link.sk-estimator-doc-link,\n", + "a:visited.sk-estimator-doc-link {\n", + " float: right;\n", + " font-size: smaller;\n", + " line-height: 1em;\n", + " font-family: monospace;\n", + " background-color: var(--sklearn-color-background);\n", + " border-radius: 1em;\n", + " height: 1em;\n", + " width: 1em;\n", + " text-decoration: none !important;\n", + " margin-left: 1ex;\n", + " /* unfitted */\n", + " border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n", + " color: var(--sklearn-color-unfitted-level-1);\n", + "}\n", + "\n", + ".sk-estimator-doc-link.fitted,\n", + "a:link.sk-estimator-doc-link.fitted,\n", + "a:visited.sk-estimator-doc-link.fitted {\n", + " /* fitted */\n", + " border: var(--sklearn-color-fitted-level-1) 1pt solid;\n", + " color: var(--sklearn-color-fitted-level-1);\n", + "}\n", + "\n", + "/* On hover */\n", + "div.sk-estimator:hover .sk-estimator-doc-link:hover,\n", + ".sk-estimator-doc-link:hover,\n", + "div.sk-label-container:hover .sk-estimator-doc-link:hover,\n", + ".sk-estimator-doc-link:hover {\n", + " /* unfitted */\n", + " background-color: var(--sklearn-color-unfitted-level-3);\n", + " color: var(--sklearn-color-background);\n", + " text-decoration: none;\n", + "}\n", + "\n", + "div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,\n", + ".sk-estimator-doc-link.fitted:hover,\n", + "div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,\n", + ".sk-estimator-doc-link.fitted:hover {\n", + " /* fitted */\n", + " background-color: var(--sklearn-color-fitted-level-3);\n", + " color: var(--sklearn-color-background);\n", + " text-decoration: none;\n", + "}\n", + "\n", + "/* Span, style for the box shown on hovering the info icon */\n", + ".sk-estimator-doc-link span {\n", + " display: none;\n", + " z-index: 9999;\n", + " position: relative;\n", + " font-weight: normal;\n", + " right: .2ex;\n", + " 
padding: .5ex;\n", + " margin: .5ex;\n", + " width: min-content;\n", + " min-width: 20ex;\n", + " max-width: 50ex;\n", + " color: var(--sklearn-color-text);\n", + " box-shadow: 2pt 2pt 4pt #999;\n", + " /* unfitted */\n", + " background: var(--sklearn-color-unfitted-level-0);\n", + " border: .5pt solid var(--sklearn-color-unfitted-level-3);\n", + "}\n", + "\n", + ".sk-estimator-doc-link.fitted span {\n", + " /* fitted */\n", + " background: var(--sklearn-color-fitted-level-0);\n", + " border: var(--sklearn-color-fitted-level-3);\n", + "}\n", + "\n", + ".sk-estimator-doc-link:hover span {\n", + " display: block;\n", + "}\n", + "\n", + "/* \"?\"-specific style due to the `` HTML tag */\n", + "\n", + "#sk-container-id-1 a.estimator_doc_link {\n", + " float: right;\n", + " font-size: 1rem;\n", + " line-height: 1em;\n", + " font-family: monospace;\n", + " background-color: var(--sklearn-color-background);\n", + " border-radius: 1rem;\n", + " height: 1rem;\n", + " width: 1rem;\n", + " text-decoration: none;\n", + " /* unfitted */\n", + " color: var(--sklearn-color-unfitted-level-1);\n", + " border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n", + "}\n", + "\n", + "#sk-container-id-1 a.estimator_doc_link.fitted {\n", + " /* fitted */\n", + " border: var(--sklearn-color-fitted-level-1) 1pt solid;\n", + " color: var(--sklearn-color-fitted-level-1);\n", + "}\n", + "\n", + "/* On hover */\n", + "#sk-container-id-1 a.estimator_doc_link:hover {\n", + " /* unfitted */\n", + " background-color: var(--sklearn-color-unfitted-level-3);\n", + " color: var(--sklearn-color-background);\n", + " text-decoration: none;\n", + "}\n", + "\n", + "#sk-container-id-1 a.estimator_doc_link.fitted:hover {\n", + " /* fitted */\n", + " background-color: var(--sklearn-color-fitted-level-3);\n", + "}\n", + "" + ], + "text/plain": [ + "RandomForestClassifier(bootstrap=False, max_features=0.0109527542096,\n", + " min_samples_leaf=15, min_samples_split=4,\n", + " n_estimators=128)" + ] + }, + 
"execution_count": 1, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from ConfigSpace import ConfigurationSpace\n", + "from ConfigSpace import ConfigurationSpace, Integer, Float, Categorical, Normal\n", + "from sklearn.ensemble import RandomForestClassifier\n", + "import tpot2\n", + "import numpy as np\n", + "import sklearn\n", + "import sklearn.datasets\n", + "\n", + "rf_configspace = ConfigurationSpace(\n", + " space = {\n", + " 'n_estimators': 128, #as recommended by Oshiro et al. (2012)\n", + " 'max_features': Float(\"max_features\", bounds=(0.01,1), log=True), #log scale like autosklearn?\n", + " 'criterion': Categorical(\"criterion\", ['gini', 'entropy']),\n", + " 'min_samples_split': Integer(\"min_samples_split\", bounds=(2, 20)),\n", + " 'min_samples_leaf': Integer(\"min_samples_leaf\", bounds=(1, 20)),\n", + " 'bootstrap': Categorical(\"bootstrap\", [True, False]),\n", + " #random_state = 1, # If you want results to be reproducible, you can set a fixed random_state.\n", + " }\n", + ")\n", + "\n", + "hyperparameters = dict(rf_configspace.sample_configuration())\n", + "print(\"sampled hyperparameters\")\n", + "print(hyperparameters)\n", + "\n", + "rf = RandomForestClassifier(**hyperparameters)\n", + "rf" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "More simply:" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled hyperparameters\n", + "{'bootstrap': True, 'criterion': 'entropy', 'max_features': 0.8366498702446, 'min_samples_leaf': 11, 'min_samples_split': 20, 'n_estimators': 128}\n" + ] + }, + { + "data": { + "text/html": [ + "
RandomForestClassifier(criterion='entropy', max_features=0.8366498702446,\n",
+       "                       min_samples_leaf=11, min_samples_split=20,\n",
+       "                       n_estimators=128)
" + ], + "text/plain": [ + "RandomForestClassifier(criterion='entropy', max_features=0.8366498702446,\n", + " min_samples_leaf=11, min_samples_split=20,\n", + " n_estimators=128)" + ] + }, + "execution_count": 2, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "rf_configspace = ConfigurationSpace(\n", + " space = {\n", + " 'n_estimators': 128, #as recommended by Oshiro et al. (2012)\n", + " 'max_features':(0.01,1), #not log scaled\n", + " 'criterion': ['gini', 'entropy'],\n", + " 'min_samples_split': (2, 20),\n", + " 'min_samples_leaf': (1, 20),\n", + " 'bootstrap': [True, False],\n", + " #random_state = 1, # If you want results to be reproducible, you can set a fixed random_state.\n", + " }\n", + ")\n", + "\n", + "hyperparameters = dict(rf_configspace.sample_configuration())\n", + "print(\"sampled hyperparameters\")\n", + "print(hyperparameters)\n", + "\n", + "rf = RandomForestClassifier(**hyperparameters)\n", + "rf" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# TPOT Search spaces\n", + "\n", + "TPOT allows you to define both hyperparameter search spaces for individual methods and pipeline structure search spaces. For example, TPOT can create linear pipelines, trees, or graphs. \n", + "\n", + "TPOT search spaces are found in the `search_spaces` module. There are two primary kinds of search spaces, node and pipeline. Node search spaces specify the search space of a single sklearn `BaseEstimator`. Pipeline search spaces define the possible structures for a group of node search spaces. These take in node search spaces and produce a pipeline using nodes from that search space. Since sklearn Pipelines are also `BaseEstimator`, pipeline search spaces are also technically node search spaces, meaning that pipeline search spaces can take in other pipeline search spaces in order to define more complex structures. 
The primary differentiating factor between node and pipeline search spaces is that pipeline search spaces must take in another search space as input to feed its individual nodes. Therefore, all search spaces eventually end in a node search space at the lowest level. Note that parameters for pipeline search spaces can differ: some take in only a single search space, some take in a list, and some take in multiple defined parameters.\n", + "\n", + "## node search spaces\n", + "\n", + "\n", + "| Name | Info |\n", + "| :--- | :----: |\n", + "| EstimatorNode | Takes in a ConfigSpace along with the class of the method. This node will optimize the hyperparameters for a single method. |\n", + "| GeneticFeatureSelectorNode | Uses evolution to optimize a set of features, exports a basic sklearn Selector that simply selects the features chosen by the node. |\n", + "| FSSNode | FSS stands for FeatureSetSelector. This node takes in a list of user-defined subsets of features and selects a single predefined subset. Note that TPOT will not create new subsets nor will it select multiple subsets per node. If using a linear pipeline, this node should be set as the first step. In linear pipelines it is recommended that you only use a small number of feature sets. I recommend exploring using FSSNodes in pipelines that allow TPOT to select more than one FSSNode at a time. For example, DynamicUnionPipeline and GraphPipeline are both excellent combos for FSSNode. Use FSSNode inside a DynamicUnionPipeline at the start of a linear pipeline to explore optimal combinations of subsets in linear pipelines. When FSSNode is set as the leaf_search_space of a GraphSearchPipeline, TPOT can use multiple feature sets in different ways, for example, with different transformers for different sets. 
|\n", + "\n", + "\n", + "\n", + "## pipeline search spaces\n", + "\n", + "Found in `tpot2.search_spaces.pipelines`.\n", + "\n", + "\n", + "| Name | Info |\n", + "| :--- | :----: |\n", + "| ChoicePipeline | Takes in a list of search spaces. Will select one of the search spaces and sample from it. |\n", + "| SequentialPipeline | Takes in a list of search spaces. Will produce a pipeline of fixed length. Each step in the pipeline will correspond to the search space provided at the same index. |\n", + "| DynamicLinearPipeline | Takes in a single search space. Will produce a linear pipeline of variable length. Each step in the pipeline will be pulled from the search space provided. |\n", + "| UnionPipeline | Takes in a list of search spaces. The returned pipeline will include one estimator per search space joined in an sklearn FeatureUnion. Useful for having many steps in one layer. |\n", + "| DynamicUnionPipeline | Takes in a single search space. It will pull anywhere from 1 to max_estimators estimators from the search space and concatenate them in a FeatureUnion. |\n", + "| TreePipeline | Generates a pipeline of variable length. Pipeline will have a tree structure similar to TPOT1. |\n", + "| GraphSearchPipeline | Generates a directed acyclic graph of variable size. Search spaces for root, leaf, and inner nodes can be defined separately if desired. |\n", + "| WrapperPipeline | This search space is for wrapping an sklearn estimator with a method that takes another estimator and hyperparameters as arguments. For example, this can be used with sklearn.ensemble.BaggingClassifier or sklearn.ensemble.AdaBoostClassifier. 
|\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Node Search Space Examples\n", + "\n", + "Node search spaces represent the smallest unit of an sklearn pipeline. All node search spaces create and optimize a single node, or estimator object. For example, this could be a KNeighborsClassifier or a FeatureSetSelector.\n", + "\n", + "### EstimatorNode\n", + "\n", + "The EstimatorNode represents the hyperparameter search space for a scikit-learn estimator. \n", + "\n", + "Note that `ConfigSpace` doesn't support `None` in its search space, and does not support the booleans True or False as fixed parameters (though booleans seem to be allowed in Categorical search spaces). To get around this, use the special string constants defined in:\n", + "\n", + "`from tpot2.search_spaces.nodes.estimator_node import NONE_SPECIAL_STRING, TRUE_SPECIAL_STRING, FALSE_SPECIAL_STRING`" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "import tpot2\n", + "from ConfigSpace import ConfigurationSpace\n", + "from ConfigSpace import ConfigurationSpace, Integer, Float, Categorical, Normal\n", + "from sklearn.neighbors import KNeighborsClassifier\n", + "\n", + "knn_configspace = ConfigurationSpace(\n", + " space = {\n", + "\n", + " 'n_neighbors': Integer(\"n_neighbors\", bounds=(1, 10)),\n", + " 'weights': Categorical(\"weights\", ['uniform', 'distance']),\n", + " 'p': Integer(\"p\", bounds=(1, 3)),\n", + " 'metric': Categorical(\"metric\", ['euclidean', 'minkowski']),\n", + " 'n_jobs': 1,\n", + " }\n", + ")\n", + "\n", + "\n", + "knn_node = tpot2.search_spaces.nodes.EstimatorNode(\n", + " method = KNeighborsClassifier,\n", + " space = knn_configspace,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can generate an individual with the generate() function. 
This individual holds a sample from the search space and provides mutation and crossover functions to modify the current sample." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "knn_individual = knn_node.generate()\n", + "knn_individual" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled hyperparameters\n", + "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 8, 'p': 2, 'weights': 'uniform'}\n" + ] + } + ], + "source": [ + "print(\"sampled hyperparameters\")\n", + "print(knn_individual.hyperparameters)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "All Individual objects have mutation and crossover operators that TPOT uses to optimize the pipelines." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "mutated hyperparameters\n", + "{'metric': 'minkowski', 'n_jobs': 1, 'n_neighbors': 1, 'p': 3, 'weights': 'distance'}\n" + ] + } + ], + "source": [ + "knn_individual.mutate() # mutate the individual\n", + "print(\"mutated hyperparameters\")\n", + "print(knn_individual.hyperparameters)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In TPOT2, crossover only modifies the individual calling the crossover function; the second individual remains unchanged." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "original hyperparameters for individual 1\n", + "{'metric': 'minkowski', 'n_jobs': 1, 'n_neighbors': 8, 'p': 1, 'weights': 'uniform'}\n", + "original hyperparameters for individual 2\n", + 
"{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 7, 'p': 3, 'weights': 'distance'}\n", + "\n", + "post crossover hyperparameters for individual 1\n", + "{'metric': 'minkowski', 'n_jobs': 1, 'n_neighbors': 8, 'p': 3, 'weights': 'uniform'}\n", + "post crossover hyperparameters for individual 2\n", + "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 7, 'p': 3, 'weights': 'distance'}\n" + ] + } + ], + "source": [ + "knn_individual1 = knn_node.generate()\n", + "knn_individual2 = knn_node.generate()\n", + "\n", + "print(\"original hyperparameters for individual 1\")\n", + "print(knn_individual1.hyperparameters)\n", + "\n", + "print(\"original hyperparameters for individual 2\")\n", + "print(knn_individual2.hyperparameters)\n", + "\n", + "print()\n", + "\n", + "knn_individual1.crossover(knn_individual2) # crossover the individuals\n", + "print(\"post crossover hyperparameters for individual 1\")\n", + "print(knn_individual1.hyperparameters)\n", + "print(\"post crossover hyperparameters for individual 2\")\n", + "print(knn_individual2.hyperparameters)\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "All search spaces have an export_pipeline function that returns an sklearn `BaseEstimator`" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
KNeighborsClassifier(n_jobs=1, n_neighbors=8, p=3)
" + ], + "text/plain": [ + "KNeighborsClassifier(n_jobs=1, n_neighbors=8, p=3)" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "est = knn_individual1.export_pipeline()\n", + "est" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If a dictionary of parameters is passed instead of a ConfigSpace object, then the hyperparameters will always be fixed and not learned." + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
KNeighborsClassifier(n_neighbors=10)
" + ], + "text/plain": [ + "KNeighborsClassifier(n_neighbors=10)" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "import tpot2\n", + "from ConfigSpace import ConfigurationSpace\n", + "from ConfigSpace import ConfigurationSpace, Integer, Float, Categorical, Normal\n", + "from sklearn.neighbors import KNeighborsClassifier\n", + "\n", + "space = {\n", + "\n", + " 'n_neighbors':10,\n", + "}\n", + "\n", + "knn_node = tpot2.search_spaces.nodes.EstimatorNode(\n", + " method = KNeighborsClassifier,\n", + " space = space,\n", + ")\n", + "\n", + "knn_node.generate().export_pipeline()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### FSSNode and GeneticFeatureSelectorNode\n", + "\n", + "Both of these are given their own tutorials. See Tutorial 3 for FSSNode and Tutorial 5 for GeneticFeatureSelectorNode." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Pipeline Search Space Examples\n", + "\n", + "Pipeline search spaces are used to define the structure and restrictions of the pipelines TPOT can search. Unlike Node search spaces, all pipeline search spaces take in other search spaces as inputs. Rather than sample hyperparameters, pipeline search spaces can select models from the input search spaces and organize them within a linear sklearn Pipeline or a TPOT GraphPipeline." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### ChoicePipeline\n", + "\n", + "The simplest pipeline search space is the ChoicePipeline. This takes in a list of search spaces and simply selects and samples from one. In this example, we will construct a search space that takes in several options for a classifier. The resulting search space will then first select a model from KNeighborsClassifier, LogisticRegression or DecisionTreeClassifier, and then select the hyperparameters for the given model." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "import tpot2\n", + "from ConfigSpace import ConfigurationSpace\n", + "from ConfigSpace import ConfigurationSpace, Integer, Float, Categorical, Normal\n", + "from sklearn.neighbors import KNeighborsClassifier\n", + "from sklearn.linear_model import LogisticRegression\n", + "from sklearn.tree import DecisionTreeClassifier\n", + "\n", + "knn_configspace = ConfigurationSpace(\n", + " space = {\n", + "\n", + " 'n_neighbors': Integer(\"n_neighbors\", bounds=(1, 10)),\n", + " 'weights': Categorical(\"weights\", ['uniform', 'distance']),\n", + " 'p': Integer(\"p\", bounds=(1, 3)),\n", + " 'metric': Categorical(\"metric\", ['euclidean', 'minkowski']),\n", + " 'n_jobs': 1,\n", + " }\n", + ")\n", + "\n", + "lr_configspace = ConfigurationSpace(\n", + " space = {\n", + " 'solver': Categorical(\"solver\", ['newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga']),\n", + " 'penalty': Categorical(\"penalty\", ['l1', 'l2']),\n", + " 'dual': Categorical(\"dual\", [True, False]),\n", + " 'C': Float(\"C\", bounds=(1e-4, 1e4), log=True),\n", + " 'class_weight': Categorical(\"class_weight\", ['balanced']),\n", + " 'n_jobs': 1,\n", + " 'max_iter': 1000,\n", + " }\n", + " )\n", + "\n", + "dt_configspace = ConfigurationSpace(\n", + " space = {\n", + " 'criterion': Categorical(\"criterion\", ['gini', 'entropy']),\n", + " 'max_depth': Integer(\"max_depth\", bounds=(1, 11)),\n", + " 'min_samples_split': Integer(\"min_samples_split\", bounds=(2, 21)),\n", + " 'min_samples_leaf': Integer(\"min_samples_leaf\", bounds=(1, 21)),\n", + " 'max_features': Categorical(\"max_features\", ['sqrt', 'log2']),\n", + " 'min_weight_fraction_leaf': 0.0,\n", + " }\n", + " )\n", + "\n", + "knn_node = tpot2.search_spaces.nodes.EstimatorNode(\n", + " method = 
KNeighborsClassifier,\n", + " space = knn_configspace,\n", + ")\n", + "\n", + "lr_node = tpot2.search_spaces.nodes.EstimatorNode(\n", + " method = LogisticRegression,\n", + " space = lr_configspace,\n", + ")\n", + "\n", + "dt_node = tpot2.search_spaces.nodes.EstimatorNode(\n", + " method = DecisionTreeClassifier,\n", + " space = dt_configspace,\n", + ")\n", + "\n", + "classifier_node = tpot2.search_spaces.pipelines.ChoicePipeline(\n", + " search_spaces=[\n", + " knn_node,\n", + " lr_node,\n", + " dt_node,\n", + " ]\n", + ")\n", + "\n", + "\n", + "tpot2.search_spaces.pipelines.ChoicePipeline(\n", + " search_spaces = [\n", + " tpot2.search_spaces.nodes.EstimatorNode(\n", + " method = KNeighborsClassifier,\n", + " space = knn_configspace,\n", + " ),\n", + " tpot2.search_spaces.nodes.EstimatorNode(\n", + " method = LogisticRegression,\n", + " space = lr_configspace,\n", + " ),\n", + " tpot2.search_spaces.nodes.EstimatorNode(\n", + " method = DecisionTreeClassifier,\n", + " space = dt_configspace,\n", + " ),\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Search space objects provided by pipeline search spaces work the same as with node search spaces. Note that crossover only works when both individuals have sampled the same method. " + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline\n" + ] + }, + { + "data": { + "text/html": [ + "
LogisticRegression(C=49.6823706295106, class_weight='balanced', dual=True,\n",
+       "                   max_iter=1000, n_jobs=1, solver='liblinear')
" + ], + "text/plain": [ + "LogisticRegression(C=49.6823706295106, class_weight='balanced', dual=True,\n", + " max_iter=1000, n_jobs=1, solver='liblinear')" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "classifier_individual = classifier_node.generate()\n", + "\n", + "print(\"sampled pipeline\")\n", + "classifier_individual.export_pipeline()" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "mutated pipeline\n" + ] + }, + { + "data": { + "text/html": [ + "
LogisticRegression(C=997.5163561212358, class_weight='balanced', dual=True,\n",
+       "                   max_iter=1000, n_jobs=1, solver='newton-cg')
" + ], + "text/plain": [ + "LogisticRegression(C=997.5163561212358, class_weight='balanced', dual=True,\n", + " max_iter=1000, n_jobs=1, solver='newton-cg')" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "print(\"mutated pipeline\")\n", + "classifier_individual.mutate()\n", + "classifier_individual.export_pipeline()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "TPOT2 also comes with predefined search spaces. The current search spaces were adapted from a combination of the original TPOT package and the search spaces used in [AutoSklearn](https://github.com/automl/auto-sklearn/tree/development/autosklearn/pipeline/components). The helper function `tpot2.config.get_search_space` takes in a string or a list of strings, and returns either an EstimatorNode or a ChoicePipeline, respectively. \n", + "\n", + "Strings can correspond to individual methods. There are also special strings that return predefined lists of methods. 
\n", + "\n", + "| Special String | Included methods |\n", + "| :--- | :----: |\n", + "| \"selectors\" | [\"SelectFwe\", \"SelectPercentile\", \"VarianceThreshold\",] |\n", + "| \"selectors_classification\" | [\"SelectFwe\", \"SelectPercentile\", \"VarianceThreshold\", \"RFE_classification\", \"SelectFromModel_classification\"] |\n", + "| \"selectors_regression\" | [\"SelectFwe\", \"SelectPercentile\", \"VarianceThreshold\", \"RFE_regression\", \"SelectFromModel_regression\"] |\n", + "| \"classifiers\" | [\"LGBMClassifier\", \"BaggingClassifier\", 'AdaBoostClassifier', 'BernoulliNB', 'DecisionTreeClassifier', 'ExtraTreesClassifier', 'GaussianNB', 'HistGradientBoostingClassifier', 'KNeighborsClassifier','LinearDiscriminantAnalysis', 'LogisticRegression', \"LinearSVC\", \"SVC\", 'MLPClassifier', 'MultinomialNB', \"QuadraticDiscriminantAnalysis\", 'RandomForestClassifier', 'SGDClassifier', 'XGBClassifier'] |\n", + "| \"regressors\" | [\"LGBMRegressor\", 'AdaBoostRegressor', \"ARDRegression\", 'DecisionTreeRegressor', 'ExtraTreesRegressor', 'HistGradientBoostingRegressor', 'KNeighborsRegressor', 'LinearSVR', \"MLPRegressor\", 'RandomForestRegressor', 'SGDRegressor', 'SVR', 'XGBRegressor'] |\n", + "| \"transformers\" | [\"PassKBinsDiscretizer\", \"Binarizer\", \"PCA\", \"ZeroCount\", \"ColumnOneHotEncoder\", \"FastICA\", \"FeatureAgglomeration\", \"Nystroem\", \"RBFSampler\", \"QuantileTransformer\", \"PowerTransformer\"] |\n", + "| \"scalers\" | [\"MinMaxScaler\", \"RobustScaler\", \"StandardScaler\", \"MaxAbsScaler\", \"Normalizer\", ] |\n", + "| \"all_transformers\" | [\"transformers\", \"scalers\"] |\n", + "| \"arithmatic\" | [\"AddTransformer\", \"mul_neg_1_Transformer\", \"MulTransformer\", \"SafeReciprocalTransformer\", \"EQTransformer\", \"NETransformer\", \"GETransformer\", \"GTTransformer\", \"LETransformer\", \"LTTransformer\", \"MinTransformer\", \"MaxTransformer\"] |\n", + "| \"imputers\" | [\"SimpleImputer\", \"IterativeImputer\", \"KNNImputer\"] |\n", + "| 
\"skrebate\" | [\"ReliefF\", \"SURF\", \"SURFstar\", \"MultiSURF\"] |\n", + "| \"genetic_encoders\" | [\"DominantEncoder\", \"RecessiveEncoder\", \"HeterosisEncoder\", \"UnderDominanceEncoder\", \"OverDominanceEncoder\"] |\n", + "| \"classifiers_sklearnex\" | [\"RandomForestClassifier_sklearnex\", \"LogisticRegression_sklearnex\", \"KNeighborsClassifier_sklearnex\", \"SVC_sklearnex\",\"NuSVC_sklearnex\"] |\n", + "| \"regressors_sklearnex\" | [\"LinearRegression_sklearnex\", \"Ridge_sklearnex\", \"Lasso_sklearnex\", \"ElasticNet_sklearnex\", \"SVR_sklearnex\", \"NuSVR_sklearnex\", \"RandomForestRegressor_sklearnex\", \"KNeighborsRegressor_sklearnex\"] |" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline 1\n" + ] + }, + { + "data": { + "text/html": [ + "
LogisticRegression(C=2532.59836574515, class_weight='balanced', max_iter=1000,\n",
+       "                   n_jobs=1, penalty='l1', solver='saga')
" + ], + "text/plain": [ + "LogisticRegression(C=2532.59836574515, class_weight='balanced', max_iter=1000,\n", + " n_jobs=1, penalty='l1', solver='saga')" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "#same pipeline search space as before.\n", + "classifier_choice = tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"])\n", + "\n", + "print(\"sampled pipeline 1\")\n", + "classifier_choice.generate().export_pipeline()" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline 2\n" + ] + }, + { + "data": { + "text/html": [ + "
DecisionTreeClassifier(class_weight='balanced', max_depth=20,\n",
+       "                       max_features='log2', min_samples_leaf=2,\n",
+       "                       min_samples_split=4)
" + ], + "text/plain": [ + "DecisionTreeClassifier(class_weight='balanced', max_depth=20,\n", + " max_features='log2', min_samples_leaf=2,\n", + " min_samples_split=4)" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "print(\"sampled pipeline 2\")\n", + "classifier_choice.generate().export_pipeline()" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline 1\n" + ] + }, + { + "data": { + "text/html": [ + "
BernoulliNB(alpha=95.0262026809266)
" + ], + "text/plain": [ + "BernoulliNB(alpha=95.0262026809266)" + ] + }, + "execution_count": 15, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "#search space for all classifiers\n", + "classifier_choice = tpot2.config.get_search_space(\"classifiers\")\n", + "\n", + "print(\"sampled pipeline 1\")\n", + "classifier_choice.generate().export_pipeline()" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline 2\n" + ] + }, + { + "data": { + "text/html": [ + "
BaggingClassifier(bootstrap=False, bootstrap_features=True,\n",
+       "                  max_features=0.8753796928965, max_samples=0.8146576017845,\n",
+       "                  n_estimators=17, n_jobs=1)
" + ], + "text/plain": [ + "BaggingClassifier(bootstrap=False, bootstrap_features=True,\n", + " max_features=0.8753796928965, max_samples=0.8146576017845,\n", + " n_estimators=17, n_jobs=1)" + ] + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "print(\"sampled pipeline 2\")\n", + "classifier_choice.generate().export_pipeline()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## SequentialPipeline\n", + "\n", + "SequentialPipelines are of fixed length and sample from a predefined distribution for each step. " + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline\n" + ] + }, + { + "data": { + "text/html": [ + "
Pipeline(steps=[('variancethreshold',\n",
+       "                 VarianceThreshold(threshold=0.0038921831393)),\n",
+       "                ('pca', PCA(n_components=0.7545742110409)),\n",
+       "                ('logisticregression',\n",
+       "                 LogisticRegression(C=85638.13831296022,\n",
+       "                                    class_weight='balanced',\n",
+       "                                    l1_ratio=0.6102894736188, max_iter=1000,\n",
+       "                                    n_jobs=1, penalty='elasticnet',\n",
+       "                                    solver='saga'))])
" + ], + "text/plain": [ + "Pipeline(steps=[('variancethreshold',\n", + " VarianceThreshold(threshold=0.0038921831393)),\n", + " ('pca', PCA(n_components=0.7545742110409)),\n", + " ('logisticregression',\n", + " LogisticRegression(C=85638.13831296022,\n", + " class_weight='balanced',\n", + " l1_ratio=0.6102894736188, max_iter=1000,\n", + " n_jobs=1, penalty='elasticnet',\n", + " solver='saga'))])" + ] + }, + "execution_count": 17, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "selector_choicepipeline = tpot2.config.get_search_space(\"VarianceThreshold\")\n", + "transformer_choicepipeline = tpot2.config.get_search_space(\"PCA\")\n", + "classifier_choicepipeline = tpot2.config.get_search_space(\"LogisticRegression\")\n", + "\n", + "stc_pipeline = tpot2.search_spaces.pipelines.SequentialPipeline([\n", + " selector_choicepipeline,\n", + " transformer_choicepipeline,\n", + " classifier_choicepipeline,\n", + "])\n", + "\n", + "print(\"sampled pipeline\")\n", + "stc_pipeline.generate().export_pipeline()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Here is an example of the form Selector-Transformer-Classifier.\n", + "\n", + "Note that each step in the sequence is a ChoicePipeline this time. Here, the SequentialPipeline samples from each provided search space in order." + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline\n" + ] + }, + { + "data": { + "text/html": [ + "
Pipeline(steps=[('selectpercentile',\n",
+       "                 SelectPercentile(percentile=97.2182140589731)),\n",
+       "                ('binarizer', Binarizer(threshold=0.7460953779809)),\n",
+       "                ('gaussiannb', GaussianNB())])
" + ], + "text/plain": [ + "Pipeline(steps=[('selectpercentile',\n", + " SelectPercentile(percentile=97.2182140589731)),\n", + " ('binarizer', Binarizer(threshold=0.7460953779809)),\n", + " ('gaussiannb', GaussianNB())])" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "selector_choicepipeline = tpot2.config.get_search_space(\"selectors\")\n", + "transformer_choicepipeline = tpot2.config.get_search_space(\"transformers\")\n", + "classifier_choicepipeline = tpot2.config.get_search_space(\"classifiers\")\n", + "\n", + "stc_pipeline = tpot2.search_spaces.pipelines.SequentialPipeline([\n", + " selector_choicepipeline,\n", + " transformer_choicepipeline,\n", + " classifier_choicepipeline,\n", + "])\n", + "\n", + "print(\"sampled pipeline\")\n", + "stc_pipeline.generate().export_pipeline()" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline\n" + ] + }, + { + "data": { + "text/html": [ + "
Pipeline(steps=[('selectfwe', SelectFwe(alpha=0.018605231348)),\n",
+       "                ('powertransformer', PowerTransformer()),\n",
+       "                ('adaboostclassifier',\n",
+       "                 AdaBoostClassifier(algorithm='SAMME',\n",
+       "                                    learning_rate=0.2576218451608,\n",
+       "                                    n_estimators=260))])
" + ], + "text/plain": [ + "Pipeline(steps=[('selectfwe', SelectFwe(alpha=0.018605231348)),\n", + " ('powertransformer', PowerTransformer()),\n", + " ('adaboostclassifier',\n", + " AdaBoostClassifier(algorithm='SAMME',\n", + " learning_rate=0.2576218451608,\n", + " n_estimators=260))])" + ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "print(\"sampled pipeline\")\n", + "stc_pipeline.generate().export_pipeline()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## DynamicLinearPipeline\n", + "\n", + "DynamicLinearPipeline takes in a single search space and randomly samples and places estimators in a list without a predefined sequence. DynamicLinearPipeline is most often paired with SequentialPipeline. A common strategy is to use DynamicLinearPipeline to optimize a series of preprocessing or feature engineering steps, followed by a final classifier or regressor." + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline\n" + ] + }, + { + "data": { + "text/html": [ + "
Pipeline(steps=[('minmaxscaler-1', MinMaxScaler()),\n",
+       "                ('minmaxscaler-2', MinMaxScaler()),\n",
+       "                ('minmaxscaler-3', MinMaxScaler())])
" + ], + "text/plain": [ + "Pipeline(steps=[('minmaxscaler-1', MinMaxScaler()),\n", + " ('minmaxscaler-2', MinMaxScaler()),\n", + " ('minmaxscaler-3', MinMaxScaler())])" + ] + }, + "execution_count": 20, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "import tpot2.config\n", + "\n", + "\n", + "linear_feature_engineering = tpot2.search_spaces.pipelines.DynamicLinearPipeline(search_space = tpot2.config.get_search_space([\"all_transformers\",\"selectors_classification\"]), max_length=10)\n", + "print(\"sampled pipeline\")\n", + "linear_feature_engineering.generate().export_pipeline()" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline\n" + ] + }, + { + "data": { + "text/html": [ + "
Pipeline(steps=[('powertransformer', PowerTransformer()),\n",
+       "                ('nystroem',\n",
+       "                 Nystroem(gamma=0.9541024274994, kernel='cosine',\n",
+       "                          n_components=12)),\n",
+       "                ('variancethreshold',\n",
+       "                 VarianceThreshold(threshold=0.0109488305621))])
" + ], + "text/plain": [ + "Pipeline(steps=[('powertransformer', PowerTransformer()),\n", + " ('nystroem',\n", + " Nystroem(gamma=0.9541024274994, kernel='cosine',\n", + " n_components=12)),\n", + " ('variancethreshold',\n", + " VarianceThreshold(threshold=0.0109488305621))])" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "print(\"sampled pipeline\")\n", + "linear_feature_engineering.generate().export_pipeline()" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline\n" + ] + }, + { + "data": { + "text/html": [ + "
Pipeline(steps=[('pipeline',\n",
+       "                 Pipeline(steps=[('pca', PCA(n_components=0.543582719063)),\n",
+       "                                 ('quantiletransformer',\n",
+       "                                  QuantileTransformer(n_quantiles=1182)),\n",
+       "                                 ('passkbinsdiscretizer',\n",
+       "                                  PassKBinsDiscretizer(n_bins=13,\n",
+       "                                                       strategy='uniform'))])),\n",
+       "                ('randomforestclassifier',\n",
+       "                 RandomForestClassifier(bootstrap=False,\n",
+       "                                        max_features=0.078312000096,\n",
+       "                                        min_samples_leaf=7, min_samples_split=3,\n",
+       "                                        n_estimators=128))])
" + ], + "text/plain": [ + "Pipeline(steps=[('pipeline',\n", + " Pipeline(steps=[('pca', PCA(n_components=0.543582719063)),\n", + " ('quantiletransformer',\n", + " QuantileTransformer(n_quantiles=1182)),\n", + " ('passkbinsdiscretizer',\n", + " PassKBinsDiscretizer(n_bins=13,\n", + " strategy='uniform'))])),\n", + " ('randomforestclassifier',\n", + " RandomForestClassifier(bootstrap=False,\n", + " max_features=0.078312000096,\n", + " min_samples_leaf=7, min_samples_split=3,\n", + " n_estimators=128))])" + ] + }, + "execution_count": 22, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "full_search_space = tpot2.search_spaces.pipelines.SequentialPipeline([\n", + " linear_feature_engineering,\n", + " tpot2.config.get_search_space(\"classifiers\"),\n", + "])\n", + "\n", + "print(\"sampled pipeline\")\n", + "full_search_space.generate().export_pipeline()" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline\n" + ] + }, + { + "data": { + "text/html": [ + "
Pipeline(steps=[('pipeline',\n",
+       "                 Pipeline(steps=[('passkbinsdiscretizer',\n",
+       "                                  PassKBinsDiscretizer()),\n",
+       "                                 ('robustscaler',\n",
+       "                                  RobustScaler(quantile_range=(0.07166946516,\n",
+       "                                                               0.7478574798356))),\n",
+       "                                 ('zerocount', ZeroCount())])),\n",
+       "                ('baggingclassifier',\n",
+       "                 BaggingClassifier(bootstrap=False, bootstrap_features=True,\n",
+       "                                   max_features=0.1521715848495,\n",
+       "                                   max_samples=0.1213783267153, n_estimators=25,\n",
+       "                                   n_jobs=1))])
" + ], + "text/plain": [ + "Pipeline(steps=[('pipeline',\n", + " Pipeline(steps=[('passkbinsdiscretizer',\n", + " PassKBinsDiscretizer()),\n", + " ('robustscaler',\n", + " RobustScaler(quantile_range=(0.07166946516,\n", + " 0.7478574798356))),\n", + " ('zerocount', ZeroCount())])),\n", + " ('baggingclassifier',\n", + " BaggingClassifier(bootstrap=False, bootstrap_features=True,\n", + " max_features=0.1521715848495,\n", + " max_samples=0.1213783267153, n_estimators=25,\n", + " n_jobs=1))])" + ] + }, + "execution_count": 23, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "print(\"sampled pipeline\")\n", + "full_search_space.generate().export_pipeline()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### UnionPipeline\n", + "\n", + "Union pipelines can be useful when you want to apply multiple transformations in a single layer. Another common strategy is to union a transformer with a passthrough when you want to keep the original data in addition to the transformation. " + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
FeatureUnion(transformer_list=[('featureagglomeration',\n",
+       "                                FeatureAgglomeration(n_clusters=257,\n",
+       "                                                     pooling_func=<function median at 0x7e7cb00dcf70>)),\n",
+       "                               ('passthrough', Passthrough())])
" + ], "text/plain": [ - "" + "FeatureUnion(transformer_list=[('featureagglomeration',\n", + " FeatureAgglomeration(n_clusters=257,\n", + " pooling_func=)),\n", + " ('passthrough', Passthrough())])" ] }, - "execution_count": 9, + "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "knn_individual = knn_node.generate()\n", - "knn_individual" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled hyperparameters\n", - "{'metric': 'minkowski', 'n_jobs': 1, 'n_neighbors': 4, 'p': 1, 'weights': 'uniform'}\n" - ] - } - ], - "source": [ - "print(\"sampled hyperparameters\")\n", - "print(knn_individual.hyperparameters)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "All Individual objects have mutation and crossover operators that TPOT uses to optimize the pipelines." - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "mutated hyperparameters\n", - "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 10, 'p': 3, 'weights': 'distance'}\n" - ] - } - ], - "source": [ - "knn_individual.mutate() # mutate the individual\n", - "print(\"mutated hyperparameters\")\n", - "print(knn_individual.hyperparameters)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In TPOT2, crossover only modifies the individual calling the crossover function, the second individual remains the same" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "original hyperparameters for individual 1\n", - "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 2, 'p': 2, 'weights': 'uniform'}\n", - "original hyperparameters for individual 2\n", - "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 10, 
'p': 2, 'weights': 'uniform'}\n", - "\n", - "post crossover hyperparameters for individual 1\n", - "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 10, 'p': 2, 'weights': 'uniform'}\n", - "post crossover hyperparameters for individual 2\n", - "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 10, 'p': 2, 'weights': 'uniform'}\n" - ] - } - ], - "source": [ - "knn_individual1 = knn_node.generate()\n", - "knn_individual2 = knn_node.generate()\n", - "\n", - "print(\"original hyperparameters for individual 1\")\n", - "print(knn_individual1.hyperparameters)\n", - "\n", - "print(\"original hyperparameters for individual 2\")\n", - "print(knn_individual2.hyperparameters)\n", - "\n", - "print()\n", + "transform_and_passthrough = tpot2.search_spaces.pipelines.UnionPipeline([\n", + " tpot2.config.get_search_space(\"transformers\"),\n", + " tpot2.config.get_search_space(\"Passthrough\"),\n", + "])\n", "\n", - "knn_individual1.crossover(knn_individual2) # crossover the individuals\n", - "print(\"post crossover hyperparameters for individual 1\")\n", - "print(knn_individual1.hyperparameters)\n", - "print(\"post crossover hyperparameters for individual 2\")\n", - "print(knn_individual2.hyperparameters)\n", - "\n" + "transform_and_passthrough.generate().export_pipeline()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "All search spaces have an export_pipeline function that returns an sklearn `BaseEstimator`" + "UnionPipelines are an excellent tool to expand the capabilities of the linear search spaces." ] }, { "cell_type": "code", - "execution_count": 14, + "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
KNeighborsClassifier(metric='euclidean', n_jobs=1, n_neighbors=10)
" + "
Pipeline(steps=[('selectfwe', SelectFwe(alpha=0.0043464782007)),\n",
+       "                ('featureunion',\n",
+       "                 FeatureUnion(transformer_list=[('powertransformer',\n",
+       "                                                 PowerTransformer()),\n",
+       "                                                ('passthrough',\n",
+       "                                                 Passthrough())])),\n",
+       "                ('lineardiscriminantanalysis',\n",
+       "                 LinearDiscriminantAnalysis(shrinkage=0.6381012799603,\n",
+       "                                            solver='lsqr'))])
" ], "text/plain": [ - "KNeighborsClassifier(metric='euclidean', n_jobs=1, n_neighbors=10)" + "Pipeline(steps=[('selectfwe', SelectFwe(alpha=0.0043464782007)),\n", + " ('featureunion',\n", + " FeatureUnion(transformer_list=[('powertransformer',\n", + " PowerTransformer()),\n", + " ('passthrough',\n", + " Passthrough())])),\n", + " ('lineardiscriminantanalysis',\n", + " LinearDiscriminantAnalysis(shrinkage=0.6381012799603,\n", + " solver='lsqr'))])" ] }, - "execution_count": 14, + "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "est = knn_individual1.export_pipeline()\n", - "est" + "stc_pipeline2 = tpot2.search_spaces.pipelines.SequentialPipeline([\n", + " tpot2.config.get_search_space(\"selectors\"),\n", + " transform_and_passthrough,\n", + " tpot2.config.get_search_space(\"classifiers\"),\n", + "])\n", + "\n", + "stc_pipeline2.generate().export_pipeline()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "If a dictionary of parameters is passed instead of of a ConfigSpace object, then the hyperparameters will always be fixed and not learned." + "Union pipelines can also be used to create \"branches\" if you are trying to create a tree-like search space. This can be particularly useful when paired with the FeatureSetSelector node (FSSNode) as each branch can learn different feature engineering for different subsets of the features, for example." ] }, { "cell_type": "code", - "execution_count": 15, + "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
KNeighborsClassifier(n_neighbors=10)
" - ], - "text/plain": [ - "KNeighborsClassifier(n_neighbors=10)" - ] - }, - "execution_count": 15, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "import tpot2\n", - "from ConfigSpace import ConfigurationSpace\n", - "from ConfigSpace import ConfigurationSpace, Integer, Float, Categorical, Normal\n", - "from sklearn.neighbors import KNeighborsClassifier\n", - "\n", - "space = {\n", - "\n", - " 'n_neighbors':10,\n", - "}\n", - "\n", - "knn_node = tpot2.search_spaces.nodes.EstimatorNode(\n", - " method = KNeighborsClassifier,\n", - " space = space,\n", - ")\n", - "\n", - "knn_node.generate().export_pipeline()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### FSSNode and GeneticFeatureSelectorNode\n", - "\n", - "Both of these are given their own tutorials. See Tutorial 3 for FFSNode and Tutorial 5 for GeneticFeatureSelectorNode" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Pipeline Search Space Examples\n", - "\n", - "Pipeline search spaces are used to define the structure and restrictions of the pipelines TPOT can search. Unlike Node search spaces, all pipeline search spaces take in other search spaces as inputs. Rather than sample hyperparameters, pipeline search spaces can select models from the input search spaces and organize them within a linear sklearn Pipeline or a TPOT GraphPipeline." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### ChoicePipeline\n", - "\n", - "The simplest pipeline search space is the ChoicePipeline. This takes in a list of search spaces and simply selects and samples from one. In this example, we will construct a search space that takes in several options for a classifier. The resulting search space will then first select a model from KNeighborsClassifier, LogisticRegression or DecisionTreeClassifier, and then select the hyperparameters for the given model." 
- ] - }, - { - "cell_type": "code", - "execution_count": 16, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 16, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "import tpot2\n", - "from ConfigSpace import ConfigurationSpace\n", - "from ConfigSpace import ConfigurationSpace, Integer, Float, Categorical, Normal\n", - "from sklearn.neighbors import KNeighborsClassifier\n", - "from sklearn.linear_model import LogisticRegression\n", - "from sklearn.tree import DecisionTreeClassifier\n", - "\n", - "knn_configspace = ConfigurationSpace(\n", - " space = {\n", - "\n", - " 'n_neighbors': Integer(\"n_neighbors\", bounds=(1, 10)),\n", - " 'weights': Categorical(\"weights\", ['uniform', 'distance']),\n", - " 'p': Integer(\"p\", bounds=(1, 3)),\n", - " 'metric': Categorical(\"metric\", ['euclidean', 'minkowski']),\n", - " 'n_jobs': 1,\n", - " }\n", - ")\n", - "\n", - "lr_configspace = ConfigurationSpace(\n", - " space = {\n", - " 'solver': Categorical(\"solver\", ['newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga']),\n", - " 'penalty': Categorical(\"penalty\", ['l1', 'l2']),\n", - " 'dual': Categorical(\"dual\", [True, False]),\n", - " 'C': Float(\"C\", bounds=(1e-4, 1e4), log=True),\n", - " 'class_weight': Categorical(\"class_weight\", ['balanced']),\n", - " 'n_jobs': 1,\n", - " 'max_iter': 1000,\n", - " }\n", - " )\n", - "\n", - "dt_configspace = ConfigurationSpace(\n", - " space = {\n", - " 'criterion': Categorical(\"criterion\", ['gini', 'entropy']),\n", - " 'max_depth': Integer(\"max_depth\", bounds=(1, 11)),\n", - " 'min_samples_split': Integer(\"min_samples_split\", bounds=(2, 21)),\n", - " 'min_samples_leaf': Integer(\"min_samples_leaf\", bounds=(1, 21)),\n", - " 'max_features': Categorical(\"max_features\", ['sqrt', 'log2']),\n", - " 'min_weight_fraction_leaf': 0.0,\n", - " }\n", - " )\n", - "\n", - "knn_node = tpot2.search_spaces.nodes.EstimatorNode(\n", - " method = 
KNeighborsClassifier,\n", - " space = knn_configspace,\n", - ")\n", - "\n", - "lr_node = tpot2.search_spaces.nodes.EstimatorNode(\n", - " method = LogisticRegression,\n", - " space = lr_configspace,\n", - ")\n", - "\n", - "dt_node = tpot2.search_spaces.nodes.EstimatorNode(\n", - " method = DecisionTreeClassifier,\n", - " space = dt_configspace,\n", - ")\n", - "\n", - "classifier_node = tpot2.search_spaces.pipelines.ChoicePipeline(\n", - " search_spaces=[\n", - " knn_node,\n", - " lr_node,\n", - " dt_node,\n", - " ]\n", - ")\n", + " background-color: var(--sklearn-color-background);\n", + " border-radius: 1rem;\n", + " height: 1rem;\n", + " width: 1rem;\n", + " text-decoration: none;\n", + " /* unfitted */\n", + " color: var(--sklearn-color-unfitted-level-1);\n", + " border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n", + "}\n", + "\n", + "#sk-container-id-20 a.estimator_doc_link.fitted {\n", + " /* fitted */\n", + " border: var(--sklearn-color-fitted-level-1) 1pt solid;\n", + " color: var(--sklearn-color-fitted-level-1);\n", + "}\n", + "\n", + "/* On hover */\n", + "#sk-container-id-20 a.estimator_doc_link:hover {\n", + " /* unfitted */\n", + " background-color: var(--sklearn-color-unfitted-level-3);\n", + " color: var(--sklearn-color-background);\n", + " text-decoration: none;\n", + "}\n", + "\n", + "#sk-container-id-20 a.estimator_doc_link.fitted:hover {\n", + " /* fitted */\n", + " background-color: var(--sklearn-color-fitted-level-3);\n", + "}\n", + "
Pipeline(steps=[('featureunion',\n",
+       "                 FeatureUnion(transformer_list=[('pipeline-1',\n",
+       "                                                 Pipeline(steps=[('selectfwe',\n",
+       "                                                                  SelectFwe(alpha=0.000231370784)),\n",
+       "                                                                 ('zerocount',\n",
+       "                                                                  ZeroCount())])),\n",
+       "                                                ('pipeline-2',\n",
+       "                                                 Pipeline(steps=[('selectpercentile',\n",
+       "                                                                  SelectPercentile(percentile=56.9207229949532)),\n",
+       "                                                                 ('rbfsampler',\n",
+       "                                                                  RBFSampler(gamma=0.9667449310006,\n",
+       "                                                                             n_components=45))]))])),\n",
+       "                ('linearsvc', LinearSVC(C=8596.144097926976, penalty='l1'))])
" + ], + "text/plain": [ + "Pipeline(steps=[('featureunion',\n", + " FeatureUnion(transformer_list=[('pipeline-1',\n", + " Pipeline(steps=[('selectfwe',\n", + " SelectFwe(alpha=0.000231370784)),\n", + " ('zerocount',\n", + " ZeroCount())])),\n", + " ('pipeline-2',\n", + " Pipeline(steps=[('selectpercentile',\n", + " SelectPercentile(percentile=56.9207229949532)),\n", + " ('rbfsampler',\n", + " RBFSampler(gamma=0.9667449310006,\n", + " n_components=45))]))])),\n", + " ('linearsvc', LinearSVC(C=8596.144097926976, penalty='l1'))])" + ] + }, + "execution_count": 26, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "st_pipeline = tpot2.search_spaces.pipelines.SequentialPipeline([\n", + " tpot2.config.get_search_space(\"selectors\"),\n", + " tpot2.config.get_search_space(\"transformers\"),\n", + "])\n", "\n", + "branched_pipeline = tpot2.search_spaces.pipelines.SequentialPipeline([\n", + " tpot2.search_spaces.pipelines.UnionPipeline([\n", + " st_pipeline,\n", + " st_pipeline,\n", + " ]),\n", + " tpot2.config.get_search_space(\"classifiers\"),\n", + "])\n", "\n", - "tpot2.search_spaces.pipelines.ChoicePipeline(\n", - " search_spaces = [\n", - " tpot2.search_spaces.nodes.EstimatorNode(\n", - " method = KNeighborsClassifier,\n", - " space = knn_configspace,\n", - " ),\n", - " tpot2.search_spaces.nodes.EstimatorNode(\n", - " method = LogisticRegression,\n", - " space = lr_configspace,\n", - " ),\n", - " tpot2.search_spaces.nodes.EstimatorNode(\n", - " method = DecisionTreeClassifier,\n", - " space = dt_configspace,\n", - " ),\n", - " ]\n", - ")" + "branched_pipeline.generate().export_pipeline()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Search space objects provided by pipeline search spaces work the same as with node search spaces. Note that crossover only works when both individuals have sampled the same method. " + "### DynamicUnionPipeline\n", + "\n", + "DynamicUnionPipeline works similarly to UnionPipeline.
Whereas UnionPipeline is fixed length, with each index corresponding to the search space provided as a list, DynamicUnionPipeline takes in a single search space and will sample 1 or more estimators/pipelines and concatenate them with a FeatureUnion. \n", + "\n", + "Note that DynamicUnionPipeline will check for pipeline uniqueness, so it will never concatenate two completely identical pipelines. In other words, all steps within the feature union will be unique.\n", + "\n", + "This can be useful when you want multiple transformers (or in some cases, pipelines), but are not sure how many or which ones." ] }, { "cell_type": "code", - "execution_count": 17, + "execution_count": 27, "metadata": {}, "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline\n" - ] - }, { "data": { "text/html": [ - "
DecisionTreeClassifier(max_depth=5, max_features='sqrt', min_samples_leaf=4,\n",
-       "                       min_samples_split=4)
" + "
FeatureUnion(transformer_list=[('powertransformer', PowerTransformer())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "DecisionTreeClassifier(max_depth=5, max_features='sqrt', min_samples_leaf=4,\n", - " min_samples_split=4)" + "FeatureUnion(transformer_list=[('powertransformer', PowerTransformer())])" ] }, - "execution_count": 17, + "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "classifier_individual = classifier_node.generate()\n", - "\n", - "print(\"sampled pipeline\")\n", - "classifier_individual.export_pipeline()" + "dynamic_transformers = tpot2.search_spaces.pipelines.DynamicUnionPipeline(tpot2.config.get_search_space(\"transformers\"), max_estimators=4)\n", + "dynamic_transformers.generate().export_pipeline()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "One good strategy could be to pair this with Passthrough in a feature union so that you output all the transformations along with the original data." ] }, { "cell_type": "code", - "execution_count": 18, + "execution_count": 28, "metadata": {}, "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "mutated pipeline\n" - ] - }, { "data": { "text/html": [ - "
KNeighborsClassifier(n_jobs=1, n_neighbors=6, p=1)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                FeatureUnion(transformer_list=[('rbfsampler',\n",
+       "                                                                RBFSampler(gamma=0.3736377055485,\n",
+       "                                                                           n_components=3)),\n",
+       "                                                               ('quantiletransformer',\n",
+       "                                                                QuantileTransformer(n_quantiles=955)),\n",
+       "                                                               ('fastica',\n",
+       "                                                                FastICA(algorithm='deflation',\n",
+       "                                                                        n_components=92))])),\n",
+       "                               ('passthrough', Passthrough())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "KNeighborsClassifier(n_jobs=1, n_neighbors=6, p=1)" + "FeatureUnion(transformer_list=[('featureunion',\n", + " FeatureUnion(transformer_list=[('rbfsampler',\n", + " RBFSampler(gamma=0.3736377055485,\n", + " n_components=3)),\n", + " ('quantiletransformer',\n", + " QuantileTransformer(n_quantiles=955)),\n", + " ('fastica',\n", + " FastICA(algorithm='deflation',\n", + " n_components=92))])),\n", + " ('passthrough', Passthrough())])" ] }, - "execution_count": 18, + "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "print(\"mutated pipeline\")\n", - "classifier_individual.mutate()\n", - "classifier_individual.export_pipeline()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "TPOT2 also comes with predefined search spaces. The current search spaces were adapted from a combination of the original TPOT package as well as the search spaces used in [AutoSklearn](https://github.com/automl/auto-sklearn/tree/development/autosklearn/pipeline/components). The helper function `tpot2.config.get_search_space` takes in a string or a list of strings, and returns either a EstimatorNode or a ChoicePipeline,respectively. \n", - "\n", - "strings can correspond to individual methods. There are also special strings that return predefined lists of methods. 
\n", + "dynamic_transformers_with_passthrough = tpot2.search_spaces.pipelines.UnionPipeline([\n", + " dynamic_transformers,\n", + " tpot2.config.get_search_space(\"Passthrough\")],\n", + " )\n", "\n", - "| Special String | Included methods |\n", - "| :--- | :----: |\n", - "| \"selectors\" | [\"SelectFwe\", \"SelectPercentile\", \"VarianceThreshold\",] |\n", - "| \"selectors_classification\" | [\"SelectFwe\", \"SelectPercentile\", \"VarianceThreshold\", \"RFE_classification\", \"SelectFromModel_classification\"] |\n", - "| \"selectors_regression\" | [\"SelectFwe\", \"SelectPercentile\", \"VarianceThreshold\", \"RFE_regression\", \"SelectFromModel_regression\"] |\n", - "| \"classifiers\" | [\"LGBMClassifier\", \"BaggingClassifier\", 'AdaBoostClassifier', 'BernoulliNB', 'DecisionTreeClassifier', 'ExtraTreesClassifier', 'GaussianNB', 'HistGradientBoostingClassifier', 'KNeighborsClassifier','LinearDiscriminantAnalysis', 'LogisticRegression', \"LinearSVC\", \"SVC\", 'MLPClassifier', 'MultinomialNB', \"QuadraticDiscriminantAnalysis\", 'RandomForestClassifier', 'SGDClassifier', 'XGBClassifier'] |\n", - "| \"regressors\" | [\"LGBMRegressor\", 'AdaBoostRegressor', \"ARDRegression\", 'DecisionTreeRegressor', 'ExtraTreesRegressor', 'HistGradientBoostingRegressor', 'KNeighborsRegressor', 'LinearSVR', \"MLPRegressor\", 'RandomForestRegressor', 'SGDRegressor', 'SVR', 'XGBRegressor'] |\n", - "| \"transformers\" | [\"PassKBinsDiscretizer\", \"Binarizer\", \"PCA\", \"ZeroCount\", \"ColumnOneHotEncoder\", \"FastICA\", \"FeatureAgglomeration\", \"Nystroem\", \"RBFSampler\", \"QuantileTransformer\", \"PowerTransformer\"] |\n", - "| \"scalers\" | [\"MinMaxScaler\", \"RobustScaler\", \"StandardScaler\", \"MaxAbsScaler\", \"Normalizer\", ] |\n", - "| \"all_transformers\" | [\"transformers\", \"scalers\"] |\n", - "| \"arithmatic\" | [\"AddTransformer\", \"mul_neg_1_Transformer\", \"MulTransformer\", \"SafeReciprocalTransformer\", \"EQTransformer\", \"NETransformer\", \"GETransformer\", 
\"GTTransformer\", \"LETransformer\", \"LTTransformer\", \"MinTransformer\", \"MaxTransformer\"] |\n", - "| \"imputers\" | [\"SimpleImputer\", \"IterativeImputer\", \"KNNImputer\"] |\n", - "| \"skrebate\" | [\"ReliefF\", \"SURF\", \"SURFstar\", \"MultiSURF\"] |\n", - "| \"genetic_encoders\" | [\"DominantEncoder\", \"RecessiveEncoder\", \"HeterosisEncoder\", \"UnderDominanceEncoder\", \"OverDominanceEncoder\"] |\n", - "| \"classifiers_sklearnex\" | [\"RandomForestClassifier_sklearnex\", \"LogisticRegression_sklearnex\", \"KNeighborsClassifier_sklearnex\", \"SVC_sklearnex\",\"NuSVC_sklearnex\"] |\n", - "| \"regressors_sklearnex\" | [\"LinearRegression_sklearnex\", \"Ridge_sklearnex\", \"Lasso_sklearnex\", \"ElasticNet_sklearnex\", \"SVR_sklearnex\", \"NuSVR_sklearnex\", \"RandomForestRegressor_sklearnex\", \"KNeighborsRegressor_sklearnex\"] |" + "dynamic_transformers_with_passthrough.generate().export_pipeline()" ] }, { "cell_type": "code", - "execution_count": 19, + "execution_count": 29, "metadata": {}, "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline 1\n" - ] - }, { "data": { "text/html": [ - "
LogisticRegression(C=0.6262919454224, class_weight='balanced',\n",
-       "                   l1_ratio=0.1219417333128, max_iter=1000, n_jobs=1,\n",
-       "                   penalty='elasticnet', solver='saga')
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
Pipeline(steps=[('selectpercentile',\n",
+       "                 SelectPercentile(percentile=1.2220353454141)),\n",
+       "                ('featureunion',\n",
+       "                 FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                                 FeatureUnion(transformer_list=[('rbfsampler',\n",
+       "                                                                                 RBFSampler(gamma=0.0989706913466,\n",
+       "                                                                                            n_components=61)),\n",
+       "                                                                                ('passkbinsdiscretizer',\n",
+       "                                                                                 PassKBinsDiscretizer(n_bins=42))])),\n",
+       "                                                ('passthrough',\n",
+       "                                                 Passthrough())])),\n",
+       "                ('bernoullinb',\n",
+       "                 BernoulliNB(alpha=7.5106513153016, fit_prior=False))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "LogisticRegression(C=0.6262919454224, class_weight='balanced',\n", - " l1_ratio=0.1219417333128, max_iter=1000, n_jobs=1,\n", - " penalty='elasticnet', solver='saga')" + "Pipeline(steps=[('selectpercentile',\n", + " SelectPercentile(percentile=1.2220353454141)),\n", + " ('featureunion',\n", + " FeatureUnion(transformer_list=[('featureunion',\n", + " FeatureUnion(transformer_list=[('rbfsampler',\n", + " RBFSampler(gamma=0.0989706913466,\n", + " n_components=61)),\n", + " ('passkbinsdiscretizer',\n", + " PassKBinsDiscretizer(n_bins=42))])),\n", + " ('passthrough',\n", + " Passthrough())])),\n", + " ('bernoullinb',\n", + " BernoulliNB(alpha=7.5106513153016, fit_prior=False))])" ] }, - "execution_count": 19, + "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "#same pipeline search space as before.\n", - "classifier_choice = tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"])\n", + "stc_pipeline3 = tpot2.search_spaces.pipelines.SequentialPipeline([\n", + " tpot2.config.get_search_space(\"selectors\"),\n", + " dynamic_transformers_with_passthrough,\n", + " tpot2.config.get_search_space(\"classifiers\"),\n", + "])\n", "\n", - "print(\"sampled pipeline 1\")\n", - "classifier_choice.generate().export_pipeline()" + "stc_pipeline3.generate().export_pipeline()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### WrapperPipeline\n", + "\n", + "Some sklearn estimators take in other sklearn estimators as a parameter. The wrapper pipeline is used to tune both the original estimators hyperparameters simultaneously with the inner estimators hyperparameters. In fact, the inner estimator in WrapperPipeline can be any search space defined with any of the methods described in this Tutorial.\n", + "\n", + "The `get_search_space` will automatically create an inner search space for sklearn estimators that do use require an inner estimator. 
For example, \"SelectFromModel_classification\" will return the following search space:" ] }, { "cell_type": "code", - "execution_count": 20, + "execution_count": 30, "metadata": {}, "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline 2\n" - ] - }, { "data": { "text/html": [ - "
DecisionTreeClassifier(class_weight='balanced', max_depth=14,\n",
-       "                       max_features='sqrt', min_samples_leaf=3,\n",
-       "                       min_samples_split=16)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
ExtraTreesClassifier(bootstrap=True, class_weight='balanced',\n",
+       "                     criterion='entropy', max_features=0.5945413838121,\n",
+       "                     min_samples_leaf=4, min_samples_split=8, n_jobs=1)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "DecisionTreeClassifier(class_weight='balanced', max_depth=14,\n", - " max_features='sqrt', min_samples_leaf=3,\n", - " min_samples_split=16)" + "ExtraTreesClassifier(bootstrap=True, class_weight='balanced',\n", + " criterion='entropy', max_features=0.5945413838121,\n", + " min_samples_leaf=4, min_samples_split=8, n_jobs=1)" ] }, - "execution_count": 20, + "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "print(\"sampled pipeline 2\")\n", - "classifier_choice.generate().export_pipeline()" + "SelectFromModel_configspace_part = ConfigurationSpace(\n", + " space = {\n", + " 'threshold': Float('threshold', bounds=(1e-4, 1.0), log=True),\n", + " }\n", + " )\n", + "\n", + "extratrees_estimator_node = tpot2.config.get_search_space(\"ExtraTreesClassifier\") #this exports an ExtraTreesClassifier node\n", + "extratrees_estimator_node.generate().export_pipeline()" ] }, { "cell_type": "code", - "execution_count": 21, + "execution_count": 31, "metadata": {}, "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline 1\n" - ] - }, { "data": { "text/html": [ - "
KNeighborsClassifier(n_jobs=1, n_neighbors=3, p=3, weights='distance')
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
SelectFromModel(estimator=ExtraTreesClassifier(bootstrap=True,\n",
+       "                                               class_weight='balanced',\n",
+       "                                               criterion='entropy',\n",
+       "                                               max_features=0.0157364601821,\n",
+       "                                               min_samples_leaf=14,\n",
+       "                                               min_samples_split=6, n_jobs=1),\n",
+       "                threshold=0.947435367985)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "KNeighborsClassifier(n_jobs=1, n_neighbors=3, p=3, weights='distance')" + "SelectFromModel(estimator=ExtraTreesClassifier(bootstrap=True,\n", + " class_weight='balanced',\n", + " criterion='entropy',\n", + " max_features=0.0157364601821,\n", + " min_samples_leaf=14,\n", + " min_samples_split=6, n_jobs=1),\n", + " threshold=0.947435367985)" ] }, - "execution_count": 21, + "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "#search space for all classifiers\n", - "classifier_choice = tpot2.config.get_search_space(\"classifiers\")\n", + "from sklearn.ensemble import ExtraTreesClassifier\n", + "from sklearn.feature_selection import SelectFromModel\n", "\n", - "print(\"sampled pipeline 1\")\n", - "classifier_choice.generate().export_pipeline()" + "select_from_model_wrapper_searchspace = tpot2.search_spaces.pipelines.WrapperPipeline(\n", + " method=SelectFromModel,\n", + " space = SelectFromModel_configspace_part,\n", + " estimator_search_space= extratrees_estimator_node,\n", + " )\n", + "\n", + "select_from_model_wrapper_searchspace.generate().export_pipeline()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### WrapperPipeline strategy for ensembles/inner classifiers/regressors EstimatorTransformer- \n", + "\n", + "Sklearn Pipelines only allow classifiers/regressors as the final step. All other steps are expected to implement a transform function. We can get around this by wrapping it in another transformer class that returns the output of predict or predict_proba inside the transform() function.\n", + "\n", + "To wrap classifiers as transfomers, you can use the following class: `tpot2.builtin_modules.EstimatorTransformer`. You can specify whether to pass the outputs of predict, predict_proba, or decision function with the `method` parameter. \n", + "\n", + "An additional consideration is whether or not to use `cross_val_predict_cv`. 
Stacked models can sometimes perform better when the predictions they are trained on are out-of-sample predictions, which cross-validation provides. Subsequent steps in the pipeline can then better judge the accuracy of the preceding models. Otherwise, some models may overfit the training data, making their predictions look more reliable than they are, so subsequent models weight them too highly. By default, `cross_val_predict_cv` is not enabled because of its computational cost.\n", + "\n", + "Note: This is not necessary for `GraphSearchPipeline`, as the exported GraphPipeline estimator has built-in support for inner classifiers/regressors. Instead of using a wrapper, you can set the `cross_val_predict_cv` param when initializing the `GraphSearchPipeline` object." ] }, { "cell_type": "code", - "execution_count": 22, + "execution_count": 32, "metadata": {}, "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline 2\n" - ] - }, { "data": { "text/html": [ - "
AdaBoostClassifier(algorithm='SAMME', learning_rate=0.0231006103189,\n",
-       "                   n_estimators=391)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
EstimatorTransformer(estimator=KNeighborsClassifier(n_jobs=1, n_neighbors=10,\n",
+       "                                                    p=1))
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "AdaBoostClassifier(algorithm='SAMME', learning_rate=0.0231006103189,\n", - " n_estimators=391)" + "EstimatorTransformer(estimator=KNeighborsClassifier(n_jobs=1, n_neighbors=10,\n", + " p=1))" + ] + }, + "execution_count": 32, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "classifiers = tpot2.config.get_search_space(\"classifiers\")\n", + "wrapped_estimators = tpot2.search_spaces.pipelines.WrapperPipeline(tpot2.builtin_modules.EstimatorTransformer, {}, classifiers)\n", + "\n", + "est = wrapped_estimators.generate().export_pipeline() #returns an estimator with a transform function\n", + "est" + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[0.3, 0.7],\n", + " [0.5, 0.5],\n", + " [0.8, 0.2],\n", + " [0.7, 0.3],\n", + " [0.5, 0.5]])" + ] + }, + "execution_count": 33, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "import numpy as np\n", + "X, y = np.random.rand(100, 10), np.random.randint(0, 2, 100)\n", + "\n", + "est.fit_transform(X, y)[0:5]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "you can manually set the settings for an estimator the same way you would do it for an EstimatorNode. Here's another example with cross_val_predict and method being used." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[1],\n", + " [1],\n", + " [0],\n", + " [0],\n", + " [1]])" ] }, - "execution_count": 22, + "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "print(\"sampled pipeline 2\")\n", - "classifier_choice.generate().export_pipeline()" + "classifiers = tpot2.config.get_search_space(\"classifiers\")\n", + "wrapped_estimators_cv = tpot2.search_spaces.pipelines.WrapperPipeline(tpot2.builtin_modules.EstimatorTransformer, {'cross_val_predict_cv':10, 'method':'predict'}, classifiers)\n", + "est = wrapped_estimators_cv.generate().export_pipeline() #returns an estimator with a transform function\n", + "est.fit_transform(X, y)[0:5]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "## SequentialPipeline\n", - "\n", - "SequentialPipelines are of fixed length and sample from a predefined distribution for each step. " + "These can now be used inside a linear pipeline. This is fairly similar to the default linear pipeline search space." ] }, { "cell_type": "code", - "execution_count": 27, + "execution_count": 35, "metadata": {}, "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline\n" - ] - }, { "data": { "text/html": [ - "
Pipeline(steps=[('variancethreshold',\n",
-       "                 VarianceThreshold(threshold=0.0009684094023)),\n",
-       "                ('pca', PCA(n_components=0.8999344355157)),\n",
-       "                ('logisticregression',\n",
-       "                 LogisticRegression(C=0.1195118894279, class_weight='balanced',\n",
-       "                                    max_iter=1000, n_jobs=1, solver='saga'))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
Pipeline(steps=[('maxabsscaler', MaxAbsScaler()),\n",
+       "                ('featureunion-1',\n",
+       "                 FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                                 FeatureUnion(transformer_list=[('rbfsampler',\n",
+       "                                                                                 RBFSampler(gamma=0.6486019086026,\n",
+       "                                                                                            n_components=82)),\n",
+       "                                                                                ('nystroem',\n",
+       "                                                                                 Nystroem(gamma=0.185797439118,\n",
+       "                                                                                          kernel='additive_chi2',\n",
+       "                                                                                          n_components=87))])),\n",
+       "                                                ('passthrough',\n",
+       "                                                 Passthrough())])),\n",
+       "                ('featureunion-2',\n",
+       "                 Feat...\n",
+       "                                                                                                                              missing=nan,\n",
+       "                                                                                                                              monotone_constraints=None,\n",
+       "                                                                                                                              multi_strategy=None,\n",
+       "                                                                                                                              n_estimators=100,\n",
+       "                                                                                                                              n_jobs=1,\n",
+       "                                                                                                                              nthread=1,\n",
+       "                                                                                                                              num_parallel_tree=None, ...),\n",
+       "                                                                                                      method='predict'))])),\n",
+       "                                                ('passthrough',\n",
+       "                                                 Passthrough())])),\n",
+       "                ('randomforestclassifier',\n",
+       "                 RandomForestClassifier(bootstrap=False, criterion='entropy',\n",
+       "                                        max_features=0.6976552018012,\n",
+       "                                        min_samples_leaf=8,\n",
+       "                                        min_samples_split=16,\n",
+       "                                        n_estimators=128))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "Pipeline(steps=[('variancethreshold',\n", - " VarianceThreshold(threshold=0.0009684094023)),\n", - " ('pca', PCA(n_components=0.8999344355157)),\n", - " ('logisticregression',\n", - " LogisticRegression(C=0.1195118894279, class_weight='balanced',\n", - " max_iter=1000, n_jobs=1, solver='saga'))])" + "Pipeline(steps=[('maxabsscaler', MaxAbsScaler()),\n", + " ('featureunion-1',\n", + " FeatureUnion(transformer_list=[('featureunion',\n", + " FeatureUnion(transformer_list=[('rbfsampler',\n", + " RBFSampler(gamma=0.6486019086026,\n", + " n_components=82)),\n", + " ('nystroem',\n", + " Nystroem(gamma=0.185797439118,\n", + " kernel='additive_chi2',\n", + " n_components=87))])),\n", + " ('passthrough',\n", + " Passthrough())])),\n", + " ('featureunion-2',\n", + " Feat...\n", + " missing=nan,\n", + " monotone_constraints=None,\n", + " multi_strategy=None,\n", + " n_estimators=100,\n", + " n_jobs=1,\n", + " nthread=1,\n", + " num_parallel_tree=None, ...),\n", + " method='predict'))])),\n", + " ('passthrough',\n", + " Passthrough())])),\n", + " ('randomforestclassifier',\n", + " RandomForestClassifier(bootstrap=False, criterion='entropy',\n", + " max_features=0.6976552018012,\n", + " min_samples_leaf=8,\n", + " min_samples_split=16,\n", + " n_estimators=128))])" ] }, - "execution_count": 27, + "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "selector_choicepipeline = tpot2.config.get_search_space(\"VarianceThreshold\")\n", - "transformer_choicepipeline = tpot2.config.get_search_space(\"PCA\")\n", - "classifier_choicepipeline = tpot2.config.get_search_space(\"LogisticRegression\")\n", + "dynamic_wrapped_classifiers_with_passthrough = tpot2.search_spaces.pipelines.UnionPipeline([\n", + " tpot2.search_spaces.pipelines.DynamicUnionPipeline(wrapped_estimators_cv, max_estimators=4),\n", + " tpot2.config.get_search_space(\"Passthrough\")\n", + " ])\n", "\n", - "stc_pipeline = 
tpot2.search_spaces.pipelines.SequentialPipeline([\n", - " selector_choicepipeline,\n", - " transformer_choicepipeline,\n", - " classifier_choicepipeline,\n", + "stc_pipeline4 = tpot2.search_spaces.pipelines.SequentialPipeline([\n", + " tpot2.config.get_search_space(\"scalers\"),\n", + " dynamic_transformers_with_passthrough,\n", + " dynamic_wrapped_classifiers_with_passthrough,\n", + " tpot2.config.get_search_space(\"classifiers\"),\n", "])\n", "\n", - "print(\"sampled pipeline\")\n", - "stc_pipeline.generate().export_pipeline()" + "stc_pipeline4.generate().export_pipeline()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Here is an example of the form Selector-Transformer-Classifier.\n", + "### GraphSearchPipeline\n", "\n", - "Note that each step in the sequence is a ChoicePipeline this time. Here, the SequentialPipeline can sample from search provided search space in order." + "The GraphSearchPipeline is a flexible search space that places no prior restriction on pipeline structure. With GraphSearchPipeline, TPOT will create a pipeline in the shape of a directed acyclic graph. Throughout the optimization process, TPOT may add/remove nodes, add/remove edges, and perform model selection and hyperparameter tuning for each node.\n", + "\n", + "The primary parameters of GraphSearchPipeline are root_search_space, inner_search_space, and leaf_search_space.\n", + "\n", + "| Parameter | Type | Description |\n", + "|------------------------|-------------------------------------|-----------------------------------------------------------------------------------------------------------|\n", + "| root_search_space | SklearnIndividualGenerator | The search space for the root node of the graph. This node will be the final estimator in the pipeline. |\n", + "| inner_search_space | SklearnIndividualGenerator, optional| The search space for the inner nodes of the graph. If not defined, there will be no inner nodes. 
|\n", + "| leaf_search_space | SklearnIndividualGenerator, optional| The search space for the leaf nodes of the graph. If not defined, the leaf nodes will be drawn from the inner_search_space. |\n", + "| crossover_same_depth | bool, optional | If True, crossover will only occur between nodes at the same depth in the graph. If False, crossover will occur between nodes at any depth. |\n", + "| cross_val_predict_cv | int, cross-validation generator or an iterable, optional | Determines the cross-validation splitting strategy used in inner classifiers or regressors. |\n", + "| method | str, optional | The prediction method to use for the inner classifiers or regressors. If 'auto', it will try to use predict_proba, decision_function, or predict in that order. |\n", + "\n", + "This search space exports a `tpot2.GraphPipeline`. This is similar to a scikit-learn Pipeline, but for directed acyclic graph pipelines. You can learn more about using this module in Tutorial 6." ] }, { "cell_type": "code", - "execution_count": 26, + "execution_count": 36, + "metadata": {}, + "outputs": [], + "source": [ + "graph_search_space = tpot2.search_spaces.pipelines.GraphSearchPipeline(\n", + " root_search_space= tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"]),\n", + " leaf_search_space = tpot2.config.get_search_space(\"selectors\"), \n", + " inner_search_space = tpot2.config.get_search_space([\"transformers\"]),\n", + " max_size = 10,\n", + ")\n", + "\n", + "ind = graph_search_space.generate()" + ] + }, + { + "cell_type": "code", + "execution_count": 37, "metadata": {}, "outputs": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline\n" - ] + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "est1 = ind.export_pipeline()\n", + "est1.plot() #GraphPipelines have a helpful plotting function to visualize the pipeline" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Lets add a few more mutations and plot the final pipeline to get a sense of the diversity of pipelines that can be generated with this search space" + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "for i in range(0,50):\n", + " ind.mutate()\n", + " if i%5==0:\n", + " est = ind.export_pipeline()\n", + " est.plot()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### TreePipeline" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "TreePipelines work the same way as GraphPipelines, but they are limited to a tree structure. This is similar to the search space in the original TPOT.\n", + "\n", + "(This search space is still experimental and currently built off GraphSearchPipeline. It may be rewritten with its own code in the future.)" + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "tree_search_space = tpot2.search_spaces.pipelines.TreePipeline(\n", + " root_search_space= tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"]),\n", + " leaf_search_space = tpot2.config.get_search_space(\"selectors\"), \n", + " inner_search_space = tpot2.config.get_search_space([\"transformers\"]),\n", + " max_size = 10,\n", + ")\n", + "\n", + "ind = tree_search_space.generate()\n", + "exp = ind.export_pipeline()\n", + "exp.plot()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Tips and Tricks\n", + "\n", + "* Two very helpful transformers to use with search spaces are tpot2.builtin_modules.Passthrough and tpot2.builtin_modules.SkipTransformer. \n", + " Passthrough will simply pass through the exact inputs it receives into the next step. This is particularly useful inside UnionPipeline, as it allows both the transformed data and the original data to be passed into the next step.\n", + " SkipTransformer will always return nothing. This is helpful when inside a union with Passthrough and an optional second method. For example, if you are unsure of whether or not you will need a transformer, you can have SkipTransformer be one option that will skip the transformation step if selected." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In this example, the FeatureUnion layer will always have at least one transformer selected and will always have one Passthrough." + ] + }, + { + "cell_type": "code", + "execution_count": 46, + "metadata": {}, + "outputs": [ { "data": { "text/html": [ - "
Pipeline(steps=[('variancethreshold',\n",
-       "                 VarianceThreshold(threshold=0.0031706560759)),\n",
-       "                ('fastica', FastICA(n_components=8)),\n",
-       "                ('histgradientboostingclassifier',\n",
-       "                 HistGradientBoostingClassifier(early_stopping=False,\n",
-       "                                                l2_regularization=5.767e-10,\n",
-       "                                                learning_rate=0.0104696477137,\n",
-       "                                                max_features=0.1498293764962,\n",
-       "                                                max_leaf_nodes=1795,\n",
-       "                                                min_samples_leaf=40, tol=0.0001,\n",
-       "                                                validation_fraction=None))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                FeatureUnion(transformer_list=[('powertransformer',\n",
+       "                                                                PowerTransformer()),\n",
+       "                                                               ('passkbinsdiscretizer',\n",
+       "                                                                PassKBinsDiscretizer(n_bins=11,\n",
+       "                                                                                     strategy='uniform'))])),\n",
+       "                               ('passthrough', Passthrough())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "Pipeline(steps=[('variancethreshold',\n", - " VarianceThreshold(threshold=0.0031706560759)),\n", - " ('fastica', FastICA(n_components=8)),\n", - " ('histgradientboostingclassifier',\n", - " HistGradientBoostingClassifier(early_stopping=False,\n", - " l2_regularization=5.767e-10,\n", - " learning_rate=0.0104696477137,\n", - " max_features=0.1498293764962,\n", - " max_leaf_nodes=1795,\n", - " min_samples_leaf=40, tol=0.0001,\n", - " validation_fraction=None))])" + "FeatureUnion(transformer_list=[('featureunion',\n", + " FeatureUnion(transformer_list=[('powertransformer',\n", + " PowerTransformer()),\n", + " ('passkbinsdiscretizer',\n", + " PassKBinsDiscretizer(n_bins=11,\n", + " strategy='uniform'))])),\n", + " ('passthrough', Passthrough())])" ] }, - "execution_count": 26, + "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "selector_choicepipeline = tpot2.config.get_search_space(\"selectors\")\n", - "transformer_choicepipeline = tpot2.config.get_search_space(\"transformers\")\n", - "classifier_choicepipeline = tpot2.config.get_search_space(\"classifiers\")\n", + "from tpot2.search_spaces.pipelines import *\n", + "from tpot2.config import get_search_space\n", "\n", - "stc_pipeline = tpot2.search_spaces.pipelines.SequentialPipeline([\n", - " selector_choicepipeline,\n", - " transformer_choicepipeline,\n", - " classifier_choicepipeline,\n", - "])\n", + "#This FeatureUnion layer will always have at least one transformer selected and will always have one passthrough\n", + "transformers_with_passthrough = UnionPipeline([\n", + " DynamicUnionPipeline(get_search_space([\"transformers\"])),\n", + " get_search_space(\"Passthrough\")\n", + " ]\n", + " )\n", "\n", - "print(\"sampled pipeline\")\n", - "stc_pipeline.generate().export_pipeline()" + "transformers_with_passthrough.generate().export_pipeline()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In this example, the FeatureUnion 
layer will always include one Passthrough. In addition, it may select one or more transformers, or it may skip transformers altogether and include only the Passthrough. " ] }, { "cell_type": "code", - "execution_count": 24, + "execution_count": 47, "metadata": {}, "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline\n" - ] - }, { "data": { "text/html": [ - "
Pipeline(steps=[('selectpercentile',\n",
-       "                 SelectPercentile(percentile=8.477077621045)),\n",
-       "                ('passkbinsdiscretizer',\n",
-       "                 PassKBinsDiscretizer(n_bins=22, strategy='uniform')),\n",
-       "                ('lineardiscriminantanalysis',\n",
-       "                 LinearDiscriminantAnalysis(shrinkage=0.8475252442099,\n",
-       "                                            solver='eigen'))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
FeatureUnion(transformer_list=[('skiptransformer', SkipTransformer()),\n",
+       "                               ('passthrough', Passthrough())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "Pipeline(steps=[('selectpercentile',\n", - " SelectPercentile(percentile=8.477077621045)),\n", - " ('passkbinsdiscretizer',\n", - " PassKBinsDiscretizer(n_bins=22, strategy='uniform')),\n", - " ('lineardiscriminantanalysis',\n", - " LinearDiscriminantAnalysis(shrinkage=0.8475252442099,\n", - " solver='eigen'))])" + "FeatureUnion(transformer_list=[('skiptransformer', SkipTransformer()),\n", + " ('passthrough', Passthrough())])" ] }, - "execution_count": 24, + "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "print(\"sampled pipeline\")\n", - "stc_pipeline.generate().export_pipeline()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## DynamicLinearPipeline\n", + "final_transformers_layer =UnionPipeline([\n", + " ChoicePipeline([\n", + " DynamicUnionPipeline(get_search_space([\"transformers\"])),\n", + " get_search_space(\"SkipTransformer\"),\n", + " ]),\n", + " get_search_space(\"Passthrough\")\n", + " ]\n", + " )\n", "\n", - "DynamicLinearPipeline takes in a single search space and randomly samples and places estimators in a list without a predefined sequence. DynamicLinearPipeline are most often used when paired with LinearPipeline. A common strategy is to use DynamicLinearPipeline to optimize a series of preprocessing or feature engineering steps, followed by a final classifier or regressor." + "final_transformers_layer.generate().export_pipeline()" ] }, { "cell_type": "code", - "execution_count": 29, + "execution_count": 52, "metadata": {}, "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline\n" - ] - }, { "data": { "text/html": [ - "
Pipeline(steps=[('rfe',\n",
-       "                 RFE(estimator=ExtraTreesClassifier(class_weight='balanced',\n",
-       "                                                    criterion='entropy',\n",
-       "                                                    max_features=0.5499673528175,\n",
-       "                                                    min_samples_leaf=15,\n",
-       "                                                    min_samples_split=20,\n",
-       "                                                    n_jobs=1),\n",
-       "                     step=0.719969257916)),\n",
-       "                ('zerocount', ZeroCount())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                FeatureUnion(transformer_list=[('estimatortransformer-1',\n",
+       "                                                                EstimatorTransformer(cross_val_predict_cv=10,\n",
+       "                                                                                     estimator=RandomForestClassifier(criterion='entropy',\n",
+       "                                                                                                                      max_features=0.0291036830622,\n",
+       "                                                                                                                      min_samples_leaf=10,\n",
+       "                                                                                                                      min_samples_split=20,\n",
+       "                                                                                                                      n_estimators=128),\n",
+       "                                                                                     method='predict')),\n",
+       "                                                               ('estimatortransformer-2',\n",
+       "                                                                EstimatorTransformer(cross_val_predict_cv=10,\n",
+       "                                                                                     estimator=QuadraticDiscriminantAnalysis(reg_param=0.6791389504331),\n",
+       "                                                                                     method='predict')),\n",
+       "                                                               ('estimatortransformer-3',\n",
+       "                                                                EstimatorTransformer(cross_val_predict_cv=10,\n",
+       "                                                                                     estimator=QuadraticDiscriminantAnalysis(reg_param=0.8087868529112),\n",
+       "                                                                                     method='predict'))])),\n",
+       "                               ('passthrough', Passthrough())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "Pipeline(steps=[('rfe',\n", - " RFE(estimator=ExtraTreesClassifier(class_weight='balanced',\n", - " criterion='entropy',\n", - " max_features=0.5499673528175,\n", - " min_samples_leaf=15,\n", - " min_samples_split=20,\n", - " n_jobs=1),\n", - " step=0.719969257916)),\n", - " ('zerocount', ZeroCount())])" + "FeatureUnion(transformer_list=[('featureunion',\n", + " FeatureUnion(transformer_list=[('estimatortransformer-1',\n", + " EstimatorTransformer(cross_val_predict_cv=10,\n", + " estimator=RandomForestClassifier(criterion='entropy',\n", + " max_features=0.0291036830622,\n", + " min_samples_leaf=10,\n", + " min_samples_split=20,\n", + " n_estimators=128),\n", + " method='predict')),\n", + " ('estimatortransformer-2',\n", + " EstimatorTransformer(cross_val_predict_cv=10,\n", + " estimator=QuadraticDiscriminantAnalysis(reg_param=0.6791389504331),\n", + " method='predict')),\n", + " ('estimatortransformer-3',\n", + " EstimatorTransformer(cross_val_predict_cv=10,\n", + " estimator=QuadraticDiscriminantAnalysis(reg_param=0.8087868529112),\n", + " method='predict'))])),\n", + " ('passthrough', Passthrough())])" ] }, - "execution_count": 29, + "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "import tpot2.config\n", - "\n", + "inner_estimators_layer = UnionPipeline([\n", + " ChoicePipeline([\n", + " DynamicUnionPipeline(wrapped_estimators, max_estimators=4),\n", + " get_search_space(\"SkipTransformer\"),\n", + " ]),\n", + " get_search_space(\"Passthrough\")]\n", + " )\n", "\n", - "linear_feature_engineering = tpot2.search_spaces.pipelines.DynamicLinearPipeline(search_space = tpot2.config.get_search_space([\"all_transformers\",\"selectors_classification\"]), max_length=10)\n", - "print(\"sampled pipeline\")\n", - "linear_feature_engineering.generate().export_pipeline()" + "inner_estimators_layer.generate().export_pipeline()" ] }, { "cell_type": "code", - "execution_count": 30, + "execution_count": 53, 
"metadata": {}, "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline\n" - ] - }, { "data": { "text/html": [ - "
Pipeline(steps=[('robustscaler',\n",
-       "                 RobustScaler(quantile_range=(0.181922714148,\n",
-       "                                              0.9343902611602))),\n",
-       "                ('variancethreshold',\n",
-       "                 VarianceThreshold(threshold=0.0001632138019)),\n",
-       "                ('minmaxscaler', MinMaxScaler())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "}\n", + "
Pipeline(steps=[('robustscaler',\n",
+       "                 RobustScaler(quantile_range=(0.1562687943568,\n",
+       "                                              0.8028910581685))),\n",
+       "                ('featureunion-1',\n",
+       "                 FeatureUnion(transformer_list=[('skiptransformer',\n",
+       "                                                 SkipTransformer()),\n",
+       "                                                ('passthrough',\n",
+       "                                                 Passthrough())])),\n",
+       "                ('featureunion-2',\n",
+       "                 FeatureUnion(transformer_list=[('skiptransformer',\n",
+       "                                                 SkipTransformer()),\n",
+       "                                                ('passthrough',\n",
+       "                                                 Passthrough())])),\n",
+       "                ('baggingclassifier',\n",
+       "                 BaggingClassifier(bootstrap_features=True,\n",
+       "                                   max_features=0.1392808422872,\n",
+       "                                   max_samples=0.5344888038724, n_estimators=3,\n",
+       "                                   n_jobs=1, oob_score=True))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ "Pipeline(steps=[('robustscaler',\n", - " RobustScaler(quantile_range=(0.181922714148,\n", - " 0.9343902611602))),\n", - " ('variancethreshold',\n", - " VarianceThreshold(threshold=0.0001632138019)),\n", - " ('minmaxscaler', MinMaxScaler())])" + " RobustScaler(quantile_range=(0.1562687943568,\n", + " 0.8028910581685))),\n", + " ('featureunion-1',\n", + " FeatureUnion(transformer_list=[('skiptransformer',\n", + " SkipTransformer()),\n", + " ('passthrough',\n", + " Passthrough())])),\n", + " ('featureunion-2',\n", + " FeatureUnion(transformer_list=[('skiptransformer',\n", + " SkipTransformer()),\n", + " ('passthrough',\n", + " Passthrough())])),\n", + " ('baggingclassifier',\n", + " BaggingClassifier(bootstrap_features=True,\n", + " max_features=0.1392808422872,\n", + " max_samples=0.5344888038724, n_estimators=3,\n", + " n_jobs=1, oob_score=True))])" ] }, - "execution_count": 30, + "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "print(\"sampled pipeline\")\n", - "linear_feature_engineering.generate().export_pipeline()" + "final_linear_pipeline = SequentialPipeline([\n", + " get_search_space(\"scalers\"),\n", + " final_transformers_layer,\n", + " inner_estimators_layer,\n", + " get_search_space(\"classifiers\"),\n", + " ])\n", + "\n", + "final_linear_pipeline.generate().export_pipeline()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Optimize Search Space with TPOTEstimator\n", + "\n", + "Once you have constructed a search space, you can use TPOTEstimator to optimize a pipeline within that space. Simply pass that search space into the `search_space` parameter. Here is a cell where you can select different search spaces that we created in this tutorial." 
] }, { "cell_type": "code", - "execution_count": 31, + "execution_count": 55, + "metadata": {}, + "outputs": [], + "source": [ + "all_search_spaces ={\n", + " \"classifiers_only\" : classifier_choice,\n", + " \"stc_pipeline\" : stc_pipeline,\n", + " \"stc_pipeline2\": stc_pipeline2,\n", + " \"stc_pipeline3\": stc_pipeline3,\n", + " \"stc_pipeline4\": stc_pipeline4,\n", + " \"final_linear_pipeline\": final_linear_pipeline,\n", + " \"graph_pipeline\": graph_search_space,\n", + "}\n", + "\n", + "X, y = sklearn.datasets.load_breast_cancer(return_X_y=True)\n", + "X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, test_size=0.5)" + ] + }, + { + "cell_type": "code", + "execution_count": 62, "metadata": {}, "outputs": [ { - "name": "stdout", + "name": "stderr", "output_type": "stream", "text": [ - "sampled pipeline\n" + "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/distributed/node.py:182: UserWarning: Port 8787 is already in use.\n", + "Perhaps you already have a cluster running?\n", + "Hosting the HTTP server on port 40273 instead\n", + " warnings.warn(\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Generation: 60%|██████ | 3/5 [00:44<00:29, 14.98s/it]\n", + "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/preprocessing/_data.py:2785: UserWarning: n_quantiles (688) is greater than the total number of samples (284). n_quantiles is set to n_samples.\n", + " warnings.warn(\n" ] }, { "data": { "text/html": [ - "
Pipeline(steps=[('pipeline',\n",
-       "                 Pipeline(steps=[('quantiletransformer',\n",
-       "                                  QuantileTransformer(n_quantiles=1366)),\n",
-       "                                 ('featureagglomeration',\n",
-       "                                  FeatureAgglomeration(n_clusters=81)),\n",
-       "                                 ('selectpercentile',\n",
-       "                                  SelectPercentile(percentile=74.3065720272799))])),\n",
-       "                ('kneighborsclassifier',\n",
-       "                 KNeighborsClassifier(n_jobs=1, n_neighbors=13,\n",
-       "                                      weights='distance'))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
TPOTEstimator(classification=True, cv=5, early_stop=2, generations=5,\n",
+       "              max_eval_time_mins=10, n_jobs=4,\n",
+       "              scorers=['roc_auc_ovr',\n",
+       "                       <function complexity_scorer at 0x7e7bacf5b9a0>],\n",
+       "              scorers_weights=[1.0, -1.0],\n",
+       "              search_space=<tpot2.search_spaces.pipelines.sequential.SequentialPipeline object at 0x7e7ba3b078b0>,\n",
+       "              verbose=2)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "Pipeline(steps=[('pipeline',\n", - " Pipeline(steps=[('quantiletransformer',\n", - " QuantileTransformer(n_quantiles=1366)),\n", - " ('featureagglomeration',\n", - " FeatureAgglomeration(n_clusters=81)),\n", - " ('selectpercentile',\n", - " SelectPercentile(percentile=74.3065720272799))])),\n", - " ('kneighborsclassifier',\n", - " KNeighborsClassifier(n_jobs=1, n_neighbors=13,\n", - " weights='distance'))])" + "TPOTEstimator(classification=True, cv=5, early_stop=2, generations=5,\n", + " max_eval_time_mins=10, n_jobs=4,\n", + " scorers=['roc_auc_ovr',\n", + " ],\n", + " scorers_weights=[1.0, -1.0],\n", + " search_space=,\n", + " verbose=2)" ] }, - "execution_count": 31, + "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "full_search_space = tpot2.search_spaces.pipelines.SequentialPipeline([\n", - " linear_feature_engineering,\n", - " tpot2.config.get_search_space(\"classifiers\"),\n", - "])\n", + "selected_search_space = all_search_spaces[\"stc_pipeline\"] #change this to select a different search space\n", "\n", - "print(\"sampled pipeline\")\n", - "full_search_space.generate().export_pipeline()" + "\n", + "est = tpot2.TPOTEstimator(\n", + " scorers=[\"roc_auc_ovr\", tpot2.objectives.complexity_scorer],\n", + " scorers_weights=[1.0, -1.0],\n", + " classification = True,\n", + " cv = 5,\n", + " search_space = selected_search_space,\n", + " population_size= 50,\n", + " generations = 5,\n", + " max_eval_time_mins = 10,\n", + " early_stop = 2,\n", + " verbose = 2,\n", + " n_jobs=4,\n", + ")\n", + "\n", + "est.fit(X_train, y_train)" ] }, { "cell_type": "code", - "execution_count": 36, + "execution_count": 68, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "sampled pipeline\n" + "auroc score 0.9845976642022929\n" ] - }, + } + ], + "source": [ + "# score the model\n", + "auroc_scorer = sklearn.metrics.get_scorer(\"roc_auc\")\n", + "auroc_score = auroc_scorer(est, 
X_test, y_test)\n", + "\n", + "print(\"auroc score\", auroc_score)" + ] + }, + { + "cell_type": "code", + "execution_count": 69, + "metadata": {}, + "outputs": [ { "data": { "text/html": [ - "
Pipeline(steps=[('pipeline',\n",
-       "                 Pipeline(steps=[('zerocount', ZeroCount()),\n",
-       "                                 ('quantiletransformer',\n",
-       "                                  QuantileTransformer(n_quantiles=1844))])),\n",
-       "                ('linearsvc', LinearSVC(C=6046.824997100118, penalty='l1'))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
Pipeline(steps=[('variancethreshold',\n",
+       "                 VarianceThreshold(threshold=0.0003237844275)),\n",
+       "                ('quantiletransformer', QuantileTransformer(n_quantiles=688)),\n",
+       "                ('baggingclassifier',\n",
+       "                 BaggingClassifier(bootstrap_features=True,\n",
+       "                                   max_features=0.2631592196919,\n",
+       "                                   max_samples=0.488886320861, n_estimators=72,\n",
+       "                                   n_jobs=1))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "Pipeline(steps=[('pipeline',\n", - " Pipeline(steps=[('zerocount', ZeroCount()),\n", - " ('quantiletransformer',\n", - " QuantileTransformer(n_quantiles=1844))])),\n", - " ('linearsvc', LinearSVC(C=6046.824997100118, penalty='l1'))])" + "Pipeline(steps=[('variancethreshold',\n", + " VarianceThreshold(threshold=0.0003237844275)),\n", + " ('quantiletransformer', QuantileTransformer(n_quantiles=688)),\n", + " ('baggingclassifier',\n", + " BaggingClassifier(bootstrap_features=True,\n", + " max_features=0.2631592196919,\n", + " max_samples=0.488886320861, n_estimators=72,\n", + " n_jobs=1))])" ] }, - "execution_count": 36, + "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "print(\"sampled pipeline\")\n", - "full_search_space.generate().export_pipeline()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### UnionPipeline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### DynamicUnionPipeline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### WrapperPipeline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### TreePipeline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### GraphSearchPipeline" + "#plot the best pipeline\n", + "if isinstance(est.fitted_pipeline_, tpot2.GraphPipeline):\n", + " est.fitted_pipeline_.plot()\n", + " \n", + "est.fitted_pipeline_" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "# Optimize Search Space with TPOTEstimator\n", - "\n", - "Once you have constructed a search space, you can use TPOTEstimator to optimize a pipeline within that space." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import tpot2\n", - "import numpy as np\n", - "import sklearn\n", - "import sklearn.datasets\n", - "\n", - "# create dummy dataset\n", - "X, y = sklearn.datasets.make_classification(n_samples=200, n_features=10, n_classes=2)\n", - "\n", - "# train test split\n", - "X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, test_size=0.5)\n", - "\n", - "\n", - "\n", - "graph_search_space = tpot2.search_spaces.pipelines.GraphPipeline(\n", - " root_search_space= tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"]),\n", - " leaf_search_space = tpot2.config.get_search_space(\"selectors\"), \n", - " inner_search_space = tpot2.config.get_search_space([\"transformers\"]),\n", - " max_size = 10,\n", - ")\n", - "\n", - "est = tpot2.TPOTEstimator(\n", - " scorers = [\"roc_auc\"],\n", - " scorers_weights = [1],\n", - " classification = True,\n", - " cv = 5,\n", - " search_space = graph_search_space,\n", - " population_size= 10,\n", - " generations = 5,\n", - " max_eval_time_mins = 60*5,\n", - " verbose = 2,\n", - ")\n", + "## Template Search Spaces\n", "\n", - "est.fit(X_train, y_train)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# score the model\n", + "As mentioned in Tutorial 1, TPOT has several built-in search spaces. Here is the same table:\n", "\n", - "auroc_scorer = sklearn.metrics.get_scorer(\"roc_auc\")\n", - "auroc_score = auroc_scorer(est, X_test, y_test)\n", + "| String | Description |\n", + "| :--- | :----: |\n", + "| linear | A linear pipeline with the structure of \"Selector->(transformers+Passthrough)->(classifiers/regressors+Passthrough)->final classifier/regressor.\" For both the transformer and inner estimator layers, TPOT may choose one or more transformers/classifiers, or it may choose none. 
The inner classifier/regressor layer is optional. |\n", + "| linear-light | Same search space as linear, but without the inner classifier/regressor layer and with a reduced set of faster-running estimators. |\n", + "| graph | TPOT will optimize a pipeline in the shape of a directed acyclic graph. The nodes of the graph can include selectors, scalers, transformers, or classifiers/regressors (inner classifiers/regressors can optionally be excluded). This will return a custom GraphPipeline rather than an sklearn Pipeline. More details in Tutorial 6. |\n", + "| graph-light | Same as the graph search space, but without the inner classifiers/regressors and with a reduced set of faster-running estimators. |\n", + "| mdr |TPOT will search over a series of feature selectors and Multifactor Dimensionality Reduction models to find a series of operators that maximize prediction accuracy. The TPOT MDR configuration is specialized for genome-wide association studies (GWAS), and is described in detail online here. |\n", "\n", - "print(\"auroc score\", auroc_score)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#plot the best pipeline\n", - "est.fitted_pipeline_.plot()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "est" + "Rather than create your own search space, you can simply pass the string into the `search_space` parameter. Alternatively, you can access tpot2.config.template_search_spaces.get_template_search_spaces directly, which offers a few more customizable options for each template, including `cross_val_predict_cv` and whether or not stacked classifiers/regressors are allowed. Or you can copy the code and customize it manually!" 
] }, { @@ -7373,7 +16063,7 @@ "from tpot2.search_spaces.pipelines import *\n", "from tpot2.config import get_search_space\n", "\n", - "selectors = get_search_space([\"selectors\",\"selectors_classification\", \"Passthrough\"])\n", + "selectors = get_search_space([\"selectors_classification\", \"Passthrough\"])\n", "estimators = get_search_space([\"classifiers\"])\n", "\n", "\n", From 01f3a785e4d6652f436b86b5bfcd8b03d95a4c92 Mon Sep 17 00:00:00 2001 From: perib Date: Tue, 24 Sep 2024 07:12:30 -0700 Subject: [PATCH 15/44] removed cross_val_predict_cv from the estimators. use wrapper pipeline instead (or set in graphsearchpipeline --- tpot2/builtin_modules/estimatortransformer.py | 6 ++++- tpot2/search_spaces/pipelines/graph.py | 9 ++++--- tpot2/search_spaces/pipelines/sequential.py | 3 +-- tpot2/tpot_estimator/estimator.py | 24 ++----------------- .../tpot_estimator/steady_state_estimator.py | 24 ++----------------- 5 files changed, 16 insertions(+), 50 deletions(-) diff --git a/tpot2/builtin_modules/estimatortransformer.py b/tpot2/builtin_modules/estimatortransformer.py index 5ae15a1c..9ed10c89 100644 --- a/tpot2/builtin_modules/estimatortransformer.py +++ b/tpot2/builtin_modules/estimatortransformer.py @@ -23,7 +23,11 @@ def __init__(self, estimator, method='auto', passthrough=False, cross_val_predic passthrough : bool, default=False Whether to pass the original input through. cross_val_predict_cv : int, default=0 - If greater than 0, will use cross_val_predict with the specified cv value to generate the transformed output. + Number of folds to use for the cross_val_predict function for inner classifiers and regressors. Estimators will still be fit on the full dataset, but the following node will get the outputs from cross_val_predict. + + - 0-1 : When set to 0 or 1, the cross_val_predict function will not be used. The next layer will get the outputs from fitting and transforming the full dataset. 
+ - >=2 : When fitting pipelines with inner classifiers or regressors, they will still be fit on the full dataset. + However, the output to the next node will come from cross_val_predict with the specified number of folds. """ self.estimator = estimator diff --git a/tpot2/search_spaces/pipelines/graph.py b/tpot2/search_spaces/pipelines/graph.py index c972b59a..f767f88e 100644 --- a/tpot2/search_spaces/pipelines/graph.py +++ b/tpot2/search_spaces/pipelines/graph.py @@ -695,7 +695,6 @@ def __init__(self, crossover_same_depth: bool = False, cross_val_predict_cv: Union[int, Callable] = 0, #signature function(estimator, X, y=none) method: str = 'auto', - memory=None, use_label_encoder: bool = False): """ @@ -723,8 +722,12 @@ def __init__(self, crossover_same_depth: bool, optional If True, crossover will only occur between nodes at the same depth in the graph. If False, crossover will occur between nodes at any depth. - cross_val_predict_cv: int, cross-validation generator or an iterable, optional - Determines the cross-validation splitting strategy used in inner classifiers or regressors + cross_val_predict_cv : int, default=0 + Number of folds to use for the cross_val_predict function for inner classifiers and regressors. Estimators will still be fit on the full dataset, but the following node will get the outputs from cross_val_predict. + + - 0-1 : When set to 0 or 1, the cross_val_predict function will not be used. The next layer will get the outputs from fitting and transforming the full dataset. + - >=2 : When fitting pipelines with inner classifiers or regressors, they will still be fit on the full dataset. + However, the output to the next node will come from cross_val_predict with the specified number of folds. method: str, optional The prediction method to use for the inner classifiers or regressors. If 'auto', it will try to use predict_proba, decision_function, or predict in that order. 
diff --git a/tpot2/search_spaces/pipelines/sequential.py b/tpot2/search_spaces/pipelines/sequential.py index ab9f97da..3fae8a52 100644 --- a/tpot2/search_spaces/pipelines/sequential.py +++ b/tpot2/search_spaces/pipelines/sequential.py @@ -12,10 +12,9 @@ class SequentialPipelineIndividual(SklearnIndividual): # takes in a list of search spaces. each space is a list of SklearnIndividualGenerators. # will produce a pipeline of Sequential length. Each step in the pipeline will correspond to the the search space provided in the same index. - def __init__(self, search_spaces : List[SklearnIndividualGenerator], memory=None, rng=None) -> None: + def __init__(self, search_spaces : List[SklearnIndividualGenerator], rng=None) -> None: super().__init__() self.search_spaces = search_spaces - self.memory = memory self.pipeline = [] for space in self.search_spaces: diff --git a/tpot2/tpot_estimator/estimator.py b/tpot2/tpot_estimator/estimator.py index 1077f014..1e53ee64 100644 --- a/tpot2/tpot_estimator/estimator.py +++ b/tpot2/tpot_estimator/estimator.py @@ -43,7 +43,6 @@ def __init__(self, bigger_is_better = True, export_graphpipeline = False, - cross_val_predict_cv = 0, memory = None, categorical_features = None, @@ -157,13 +156,6 @@ def __init__(self, bigger_is_better : bool, default=True If True, the objective function is maximized. If False, the objective function is minimized. Use negative weights to reverse the direction. - cross_val_predict_cv : int, default=0 - Number of folds to use for the cross_val_predict function for inner classifiers and regressors. Estimators will still be fit on the full dataset, but the following node will get the outputs from cross_val_predict. - - - 0-1 : When set to 0 or 1, the cross_val_predict function will not be used. The next layer will get the outputs from fitting and transforming the full dataset. - - >=2 : When fitting pipelines with inner classifiers or regressors, they will still be fit on the full dataset. 
- However, the output to the next node will come from cross_val_predict with the specified number of folds. - memory: Memory object or string, default=None If supplied, pipeline will cache each transformer after calling fit with joblib.Memory. This feature is used to avoid computing the fit transformers within a pipeline if the parameters @@ -390,13 +382,8 @@ def __init__(self, self.search_space = search_space self.export_graphpipeline = export_graphpipeline - self.cross_val_predict_cv = cross_val_predict_cv self.memory = memory - if self.cross_val_predict_cv !=0 or self.memory is not None: - if not self.export_graphpipeline: - raise ValueError("cross_val_predict_cv and memory parameters are parameters for GraphPipeline. To enable these options export_graphpipeline to be True. Otherwise these can be passed into the relevant Search spaces as parameters.") - self.categorical_features = categorical_features self.subsets = subsets @@ -629,7 +616,6 @@ def objective_function(pipeline_individual, other_objective_functions=self.other_objective_functions, export_graphpipeline=self.export_graphpipeline, memory=self.memory, - cross_val_predict_cv=self.cross_val_predict_cv, **kwargs): return objective_function_generator( pipeline_individual, @@ -641,7 +627,6 @@ def objective_function(pipeline_individual, other_objective_functions=other_objective_functions, export_graphpipeline=export_graphpipeline, memory=memory, - cross_val_predict_cv=cross_val_predict_cv, **kwargs, ) @@ -786,8 +771,6 @@ def ind_generator(rng): other_objective_functions=self.other_objective_functions, export_graphpipeline=self.export_graphpipeline, memory=self.memory, - cross_val_predict_cv=self.cross_val_predict_cv, - **kwargs: objective_function_generator( ind, X, @@ -798,7 +781,6 @@ def ind_generator(rng): other_objective_functions=other_objective_functions, export_graphpipeline=export_graphpipeline, memory=memory, - cross_val_predict_cv=cross_val_predict_cv, **kwargs, )] @@ -844,7 +826,6 @@ def 
ind_generator(rng): other_objective_functions=self.other_objective_functions, export_graphpipeline=self.export_graphpipeline, memory=self.memory, - cross_val_predict_cv=self.cross_val_predict_cv, **kwargs: val_objective_function_generator( ind, X, @@ -855,7 +836,6 @@ def ind_generator(rng): other_objective_functions=other_objective_functions, export_graphpipeline=export_graphpipeline, memory=memory, - cross_val_predict_cv=cross_val_predict_cv, **kwargs, )] @@ -890,7 +870,7 @@ def ind_generator(rng): #TODO #best_individual_pipeline = best_individual.export_pipeline(memory=self.memory, cross_val_predict_cv=self.cross_val_predict_cv) if self.export_graphpipeline: - best_individual_pipeline = best_individual.export_flattened_graphpipeline(memory=self.memory, cross_val_predict_cv=self.cross_val_predict_cv) + best_individual_pipeline = best_individual.export_flattened_graphpipeline(memory=self.memory) else: best_individual_pipeline = best_individual.export_pipeline(memory=self.memory) @@ -982,7 +962,7 @@ def make_evaluated_individuals(self): self.evaluated_individuals = self.evaluated_individuals.set_index(self.evaluated_individuals.index.map(object_to_int)) self.evaluated_individuals['Parents'] = self.evaluated_individuals['Parents'].apply(lambda row: convert_parents_tuples_to_integers(row, object_to_int)) - self.evaluated_individuals["Instance"] = self.evaluated_individuals["Individual"].apply(lambda ind: apply_make_pipeline(ind, preprocessing_pipeline=self._preprocessing_pipeline, export_graphpipeline=self.export_graphpipeline, memory=self.memory, cross_val_predict_cv=self.cross_val_predict_cv)) + self.evaluated_individuals["Instance"] = self.evaluated_individuals["Individual"].apply(lambda ind: apply_make_pipeline(ind, preprocessing_pipeline=self._preprocessing_pipeline, export_graphpipeline=self.export_graphpipeline, memory=self.memory)) return self.evaluated_individuals diff --git a/tpot2/tpot_estimator/steady_state_estimator.py 
b/tpot2/tpot_estimator/steady_state_estimator.py index bcfef964..b86b9f8f 100644 --- a/tpot2/tpot_estimator/steady_state_estimator.py +++ b/tpot2/tpot_estimator/steady_state_estimator.py @@ -40,7 +40,6 @@ def __init__(self, export_graphpipeline = False, - cross_val_predict_cv = 0, memory = None, categorical_features = None, @@ -192,13 +191,6 @@ def __init__(self, - list : a list of strings out of the above options to include the corresponding methods in the configuration dictionary. - None : If None, a leaf will not be required (i.e. the pipeline can be a single root node). Leaf nodes will be generated from the inner_config_dict. - cross_val_predict_cv : int, default=0 - Number of folds to use for the cross_val_predict function for inner classifiers and regressors. Estimators will still be fit on the full dataset, but the following node will get the outputs from cross_val_predict. - - - 0-1 : When set to 0 or 1, the cross_val_predict function will not be used. The next layer will get the outputs from fitting and transforming the full dataset. - - >=2 : When fitting pipelines with inner classifiers or regressors, they will still be fit on the full dataset. - However, the output to the next node will come from cross_val_predict with the specified number of folds. - categorical_features: list or None Categorical columns to inpute and/or one hot encode during the preprocessing step. Used only if preprocessing is not False. - None : If None, TPOT2 will automatically use object columns in pandas dataframes as objects for one hot encoding in preprocessing. @@ -431,14 +423,8 @@ def __init__(self, self.bigger_is_better = bigger_is_better self.export_graphpipeline = export_graphpipeline - self.cross_val_predict_cv = cross_val_predict_cv self.memory = memory - if self.cross_val_predict_cv !=0 or self.memory is not None: - if not self.export_graphpipeline: - raise ValueError("cross_val_predict_cv and memory parameters are parameters for GraphPipeline. 
To enable these options export_graphpipeline to be True. Otherwise these can be passed into the relevant Search spaces as parameters.") - - self.categorical_features = categorical_features self.subsets = subsets self.preprocessing = preprocessing @@ -673,7 +659,6 @@ def objective_function(pipeline_individual, other_objective_functions=self.other_objective_functions, export_graphpipeline=self.export_graphpipeline, memory=self.memory, - cross_val_predict_cv=self.cross_val_predict_cv, **kwargs): return objective_function_generator( pipeline_individual, @@ -685,7 +670,6 @@ def objective_function(pipeline_individual, other_objective_functions=other_objective_functions, export_graphpipeline=export_graphpipeline, memory=memory, - cross_val_predict_cv=cross_val_predict_cv, **kwargs, ) @@ -794,7 +778,6 @@ def ind_generator(rng): other_objective_functions=self.other_objective_functions, export_graphpipeline=self.export_graphpipeline, memory=self.memory, - cross_val_predict_cv=self.cross_val_predict_cv, **kwargs: objective_function_generator( ind, @@ -806,7 +789,6 @@ def ind_generator(rng): other_objective_functions=other_objective_functions, export_graphpipeline=export_graphpipeline, memory=memory, - cross_val_predict_cv=cross_val_predict_cv, **kwargs, )] @@ -848,7 +830,6 @@ def ind_generator(rng): other_objective_functions=self.other_objective_functions, export_graphpipeline=self.export_graphpipeline, memory=self.memory, - cross_val_predict_cv=self.cross_val_predict_cv, **kwargs: val_objective_function_generator( ind, X, @@ -859,7 +840,6 @@ def ind_generator(rng): other_objective_functions=other_objective_functions, export_graphpipeline=export_graphpipeline, memory=memory, - cross_val_predict_cv=cross_val_predict_cv, **kwargs, )] @@ -891,7 +871,7 @@ def ind_generator(rng): if self.export_graphpipeline: - best_individual_pipeline = best_individual.export_flattened_graphpipeline(memory=self.memory, cross_val_predict_cv=self.cross_val_predict_cv) + best_individual_pipeline = 
best_individual.export_flattened_graphpipeline(memory=self.memory) else: best_individual_pipeline = best_individual.export_pipeline(memory=self.memory) @@ -981,7 +961,7 @@ def make_evaluated_individuals(self): self.evaluated_individuals = self.evaluated_individuals.set_index(self.evaluated_individuals.index.map(object_to_int)) self.evaluated_individuals['Parents'] = self.evaluated_individuals['Parents'].apply(lambda row: convert_parents_tuples_to_integers(row, object_to_int)) - self.evaluated_individuals["Instance"] = self.evaluated_individuals["Individual"].apply(lambda ind: apply_make_pipeline(ind, preprocessing_pipeline=self._preprocessing_pipeline, export_graphpipeline=self.export_graphpipeline, memory=self.memory, cross_val_predict_cv=self.cross_val_predict_cv)) + self.evaluated_individuals["Instance"] = self.evaluated_individuals["Individual"].apply(lambda ind: apply_make_pipeline(ind, preprocessing_pipeline=self._preprocessing_pipeline, export_graphpipeline=self.export_graphpipeline, memory=self.memory)) return self.evaluated_individuals From 29a4745899728de34ed1bf2fc3894ecf1262cd63 Mon Sep 17 00:00:00 2001 From: perib Date: Tue, 24 Sep 2024 07:49:04 -0700 Subject: [PATCH 16/44] replaces deprecated call with replacement function --- tpot2/config/classifiers.py | 34 ++++++++++++++++----------------- tpot2/config/get_configspace.py | 2 +- tpot2/config/imputers.py | 4 ++-- tpot2/config/regressors.py | 22 ++++++++++----------- tpot2/config/transformers.py | 8 ++++---- 5 files changed, 35 insertions(+), 35 deletions(-) diff --git a/tpot2/config/classifiers.py b/tpot2/config/classifiers.py index ddcff3d0..dd30efb5 100644 --- a/tpot2/config/classifiers.py +++ b/tpot2/config/classifiers.py @@ -28,8 +28,8 @@ def get_LogisticRegression_ConfigurationSpace(random_state): cs = ConfigurationSpace(space) - cs.add_hyperparameters([penalty, C, l1_ratio, class_weight]) - cs.add_conditions([l1_ratio_condition]) + cs.add([penalty, C, l1_ratio, class_weight]) + 
cs.add([l1_ratio_condition]) return cs @@ -68,8 +68,8 @@ def get_BaggingClassifier_ConfigurationSpace(random_state): space = space ) - cs.add_hyperparameters([bootstrap, oob_score]) - cs.add_conditions([oob_condition]) + cs.add([bootstrap, oob_score]) + cs.add([oob_condition]) return cs @@ -107,8 +107,8 @@ def get_LinearSVC_ConfigurationSpace(random_state): cs = ConfigurationSpace(space) - cs.add_hyperparameters([penalty, C, loss]) - cs.add_conditions([loss_condition]) + cs.add([penalty, C, loss]) + cs.add([loss_condition]) return cs @@ -136,8 +136,8 @@ def get_SVC_ConfigurationSpace(random_state): cs = ConfigurationSpace(space) - cs.add_hyperparameters([kernel, C, coef0, degree, gamma, shrinking, class_weight]) - cs.add_conditions([degree_condition, gamma_condition, coef0_condition]) + cs.add([kernel, C, coef0, degree, gamma, shrinking, class_weight]) + cs.add([degree_condition, gamma_condition, coef0_condition]) return cs @@ -250,8 +250,8 @@ def get_SGDClassifier_ConfigurationSpace(random_state): space = space ) - cs.add_hyperparameters([power_t, learning_rate]) - cs.add_conditions([powertcond]) + cs.add([power_t, learning_rate]) + cs.add([powertcond]) return cs @@ -321,8 +321,8 @@ def get_LinearDiscriminantAnalysis_ConfigurationSpace(): shrinkcond = NotEqualsCondition(shrinkage, solver, 'svd') cs = ConfigurationSpace() - cs.add_hyperparameters([solver, shrinkage]) - cs.add_conditions([shrinkcond]) + cs.add([solver, shrinkage]) + cs.add([shrinkcond]) return cs @@ -360,8 +360,8 @@ def get_GradientBoostingClassifier_ConfigurationSpace(n_classes, random_state): cs = ConfigurationSpace( space = space ) - cs.add_hyperparameters([n_iter_no_change, validation_fraction, early_stop ]) - cs.add_conditions([validation_fraction_cond, n_iter_no_change_cond]) + cs.add([n_iter_no_change, validation_fraction, early_stop ]) + cs.add([validation_fraction_cond, n_iter_no_change_cond]) return cs def GradientBoostingClassifier_hyperparameter_parser(params): @@ -429,8 +429,8 @@ def 
get_HistGradientBoostingClassifier_ConfigurationSpace(random_state): cs = ConfigurationSpace( space = space ) - cs.add_hyperparameters([n_iter_no_change, validation_fraction, early_stop ]) - cs.add_conditions([validation_fraction_cond, n_iter_no_change_cond]) + cs.add([n_iter_no_change, validation_fraction, early_stop ]) + cs.add([validation_fraction_cond, n_iter_no_change_cond]) return cs @@ -506,7 +506,7 @@ def get_MLPClassifier_ConfigurationSpace(random_state): learning_rate_init = Float("learning_rate_init", bounds=(1e-4, 1e-1), log=True) learning_rate = Categorical("learning_rate", ['constant', 'invscaling', 'adaptive']) - cs.add_hyperparameters([n_hidden_layers, n_nodes_per_layer, activation, alpha, learning_rate, early_stopping, learning_rate_init]) + cs.add([n_hidden_layers, n_nodes_per_layer, activation, alpha, learning_rate, early_stopping, learning_rate_init]) return cs diff --git a/tpot2/config/get_configspace.py b/tpot2/config/get_configspace.py index 46b13b60..a044c215 100644 --- a/tpot2/config/get_configspace.py +++ b/tpot2/config/get_configspace.py @@ -285,7 +285,7 @@ def get_configspace(name, n_classes=3, n_samples=1000, n_features=100, random_st case "PowerTransformer": return {} case "QuantileTransformer": - return transformers.get_QuantileTransformer_configspace(random_state=random_state) + return transformers.get_QuantileTransformer_configspace(n_samples=n_samples, random_state=random_state) case "RobustScaler": return transformers.RobustScaler_configspace case "ColumnOneHotEncoder": diff --git a/tpot2/config/imputers.py b/tpot2/config/imputers.py index 2c33629f..b991bf8d 100644 --- a/tpot2/config/imputers.py +++ b/tpot2/config/imputers.py @@ -38,8 +38,8 @@ def get_IterativeImputer_config_space(n_features, random_state): space['random_state'] = random_state cs = ConfigurationSpace(space=space) - cs.add_hyperparameters([estimator, sample_posterior]) - cs.add_conditions([sampling_condition]) + cs.add([estimator, sample_posterior]) + 
cs.add([sampling_condition]) return cs def get_KNNImputer_config_space(n_samples): diff --git a/tpot2/config/regressors.py b/tpot2/config/regressors.py index 11362cce..9e586947 100644 --- a/tpot2/config/regressors.py +++ b/tpot2/config/regressors.py @@ -62,10 +62,10 @@ def get_SGDRegressor_ConfigurationSpace(random_state): eta0_in_inv_con = InCondition(eta0, learning_rate, ["invscaling", "constant"]) power_t_condition = EqualsCondition(power_t, learning_rate, "invscaling") - cs.add_hyperparameters( + cs.add( [l1_ratio, penalty, epsilon, loss, eta0, learning_rate, power_t] ) - cs.add_conditions( + cs.add( [elasticnet, epsilon_condition, power_t_condition, eta0_in_inv_con] ) @@ -283,8 +283,8 @@ def get_SVR_ConfigurationSpace(): gamma_condition = InCondition(gamma, kernel, ['poly', 'rbf',]) coef0_condition = InCondition(coef0, kernel, ['poly', 'sigmoid']) - cs.add_hyperparameters([kernel, degree, gamma, coef0]) - cs.add_conditions([degree_condition,gamma_condition]) + cs.add([kernel, degree, gamma, coef0]) + cs.add([degree_condition,gamma_condition]) return cs @@ -409,8 +409,8 @@ def get_GradientBoostingRegressor_ConfigurationSpace(random_state): cs = ConfigurationSpace( space = space ) - cs.add_hyperparameters([n_iter_no_change, validation_fraction, early_stop ]) - cs.add_conditions([validation_fraction_cond, n_iter_no_change_cond]) + cs.add([n_iter_no_change, validation_fraction, early_stop ]) + cs.add([validation_fraction_cond, n_iter_no_change_cond]) return cs def GradientBoostingRegressor_hyperparameter_parser(params): @@ -479,8 +479,8 @@ def get_HistGradientBoostingRegressor_ConfigurationSpace(random_state): cs = ConfigurationSpace( space = space ) - cs.add_hyperparameters([n_iter_no_change, validation_fraction, early_stop ]) - cs.add_conditions([validation_fraction_cond, n_iter_no_change_cond]) + cs.add([n_iter_no_change, validation_fraction, early_stop ]) + cs.add([validation_fraction_cond, n_iter_no_change_cond]) return cs @@ -549,7 +549,7 @@ def 
get_MLPRegressor_ConfigurationSpace(random_state): learning_rate_init = Float("learning_rate_init", bounds=(1e-4, 1e-1), log=True) learning_rate = Categorical("learning_rate", ['constant', 'invscaling', 'adaptive']) - cs.add_hyperparameters([n_hidden_layers, n_nodes_per_layer, activation, alpha, learning_rate, early_stopping, learning_rate_init]) + cs.add([n_hidden_layers, n_nodes_per_layer, activation, alpha, learning_rate, early_stopping, learning_rate_init]) return cs @@ -592,8 +592,8 @@ def get_BaggingRegressor_ConfigurationSpace(random_state): space = space ) - cs.add_hyperparameters([bootstrap, oob_score]) - cs.add_conditions([oob_condition]) + cs.add([bootstrap, oob_score]) + cs.add([oob_condition]) return cs diff --git a/tpot2/config/transformers.py b/tpot2/config/transformers.py index 6d393460..ced2dee2 100644 --- a/tpot2/config/transformers.py +++ b/tpot2/config/transformers.py @@ -54,8 +54,8 @@ def get_FeatureAgglomeration_configspace(n_samples): metric_condition = NotEqualsCondition(metric, linkage, 'ward') cs = ConfigurationSpace() - cs.add_hyperparameters([linkage, metric, n_clusters, pooling_func]) - cs.add_condition(metric_condition) + cs.add([linkage, metric, n_clusters, pooling_func]) + cs.add(metric_condition) return cs @@ -108,10 +108,10 @@ def get_RBFSampler_configspace(n_features=100, random_state=None): ) -def get_QuantileTransformer_configspace(random_state=None): +def get_QuantileTransformer_configspace(random_state=None, n_samples=1000): space = { - 'n_quantiles': Integer('n_quantiles', bounds=(10, 2000)), + 'n_quantiles': Integer('n_quantiles', bounds=(10, n_samples)), 'output_distribution': Categorical('output_distribution', ['uniform', 'normal']), } From b0d58a8129bfaa945ee51c5079c51d9bd9113e93 Mon Sep 17 00:00:00 2001 From: perib Date: Tue, 24 Sep 2024 10:04:46 -0700 Subject: [PATCH 17/44] add iterative imputer with learned estimators and update tutorials more --- Tutorial/1_Using_TPOT.ipynb | 916 +- Tutorial/2_Search_Spaces.ipynb | 
15163 +--------------------------- tpot2/config/get_configspace.py | 13 +- tpot2/config/imputers.py | 19 + tpot2/tpot_estimator/estimator.py | 3 - 5 files changed, 776 insertions(+), 15338 deletions(-) diff --git a/Tutorial/1_Using_TPOT.ipynb b/Tutorial/1_Using_TPOT.ipynb index 34e74c2b..c1d72eeb 100644 --- a/Tutorial/1_Using_TPOT.ipynb +++ b/Tutorial/1_Using_TPOT.ipynb @@ -8,19 +8,19 @@ "Automated machine learning (AutoML) takes a higher-level approach to machine learning than most practitioners are used to, so we've gathered a handful of guidelines on what to expect when running AutoML software such as TPOT.\n", "\n", "#### AUTOML ALGORITHMS AREN'T INTENDED TO RUN FOR ONLY A FEW MINUTES\n", - "Of course, you can run TPOT for only a few minutes and it will find a reasonably good pipeline for your dataset. However, if you don't run TPOT for long enough, it may not find the best possible pipeline for your dataset. It may even not find any suitable pipeline at all, in which case a RuntimeError('A pipeline has not yet been optimized. Please call fit() first.') will be raised. Often it is worthwhile to run multiple instances of TPOT in parallel for a long time (hours to days) to allow TPOT to thoroughly search the pipeline space for your dataset.\n", + "Of course, you can run TPOT for only a few minutes, and it will find a reasonably good pipeline for your dataset. However, if you don't run TPOT for long enough, it may not find the best possible pipeline for your dataset. It may not even find any suitable pipeline at all, in which case a RuntimeError('A pipeline has not yet been optimized. Please call fit() first.') will be raised. 
Often it is worthwhile to run multiple instances of TPOT in parallel for a long time (hours to days) to allow TPOT to thoroughly search the pipeline space for your dataset.\n", "\n", "#### AUTOML ALGORITHMS CAN TAKE A LONG TIME TO FINISH THEIR SEARCH\n", - "AutoML algorithms aren't as simple as fitting one model on the dataset; they are considering multiple machine learning algorithms (random forests, linear models, SVMs, etc.) in a pipeline with multiple preprocessing steps (missing value imputation, scaling, PCA, feature selection, etc.), the hyperparameters for all of the models and preprocessing steps, as well as multiple ways to ensemble or stack the algorithms within the pipeline.\n", + "AutoML algorithms aren't as simple as fitting one model on the dataset; they consider multiple machine learning algorithms (random forests, linear models, SVMs, etc.) in a pipeline with multiple preprocessing steps (missing value imputation, scaling, PCA, feature selection, etc.), the hyperparameters for all of the models and preprocessing steps, and multiple ways to ensemble or stack the algorithms within the pipeline.\n", "\n", "As such, TPOT will take a while to run on larger datasets, but it's important to realize why. With the default TPOT settings (100 generations with 100 population size), TPOT will evaluate 10,000 pipeline configurations before finishing. To put this number into context, think about a grid search of 10,000 hyperparameter combinations for a machine learning algorithm and how long that grid search will take. That is 10,000 model configurations to evaluate with 10-fold cross-validation, which means that roughly 100,000 models are fit and evaluated on the training data in one grid search. That's a time-consuming procedure, even for simpler models like decision trees.\n", "\n", - "Typical TPOT runs will take hours to days to finish (unless it's a small dataset), but you can always interrupt the run partway through and see the best results so far. 
TPOT also provides a warm_start parameter that lets you restart a TPOT run from where it left off.\n", + "Typical TPOT runs will take hours to days to finish (unless it's a small dataset), but you can always interrupt the run partway through and see the best results so far. TPOT also provides a warm_start and a periodic_checkpoint_folder parameter that lets you restart a TPOT run from where it left off.\n", "\n", "#### AUTOML ALGORITHMS CAN RECOMMEND DIFFERENT SOLUTIONS FOR THE SAME DATASET\n", - "If you're working with a reasonably complex dataset or run TPOT for a short amount of time, different TPOT runs may result in different pipeline recommendations. TPOT's optimization algorithm is stochastic in nature, which means that it uses randomness (in part) to search the possible pipeline space. When two TPOT runs recommend different pipelines, this means that the TPOT runs didn't converge due to lack of time or that multiple pipelines perform more-or-less the same on your dataset.\n", + "If you're working with a reasonably complex dataset or run TPOT for a short amount of time, different TPOT runs may result in different pipeline recommendations. TPOT's optimization algorithm is stochastic, which means that it uses randomness (in part) to search the possible pipeline space. When two TPOT runs recommend different pipelines, this means that the TPOT runs didn't converge due to lack of time or that multiple pipelines perform more-or-less the same on your dataset.\n", "\n", - "This is actually an advantage over fixed grid search techniques: TPOT is meant to be an assistant that gives you ideas on how to solve a particular machine learning problem by exploring pipeline configurations that you might have never considered, then leaves the fine-tuning to more constrained parameter tuning techniques such as grid search." 
+ "This is actually an advantage over fixed grid search techniques: TPOT is meant to be an assistant that gives you ideas on how to solve a particular machine learning problem by exploring pipeline configurations that you might have never considered, then leaves the fine-tuning to more constrained parameter tuning techniques such as grid search or bayesian optimization." ] }, { @@ -29,7 +29,7 @@ "source": [ "# TPOT with code\n", "\n", - "We've taken care to design the TPOT interface to be as similar as possible to scikit-learn.\n", + "We've designed the TPOT interface to be as similar as possible to scikit-learn.\n", "\n", "TPOT can be imported just like any regular Python module. To import TPOT, type:" ] @@ -85,21 +85,21 @@ }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ - "Generation: : 3it [00:31, 10.38s/it]\n" + "Generation: : 4it [00:32, 8.03s/it]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ - "auroc_score: 0.9904100529100529\n" + "auroc_score: 0.9900793650793651\n" ] } ], @@ -176,9 +176,9 @@ "source": [ "## Measuring Model Complexity\n", "\n", - "When running TPOT, it can sometimes be beneficial to include a secondary objective that measures model complexity. More complex models can yield higher performance but this comes at the cost of interpretability. Simpler models may be more interpretable, but often have lower predictive performance. Sometimes, however, vast increases in complexity only marginally improve predictive performance. There may be other simpler and more interpretable pipelines with marginal performance decreases that could be acceptable for the increased interpretability. However, these pipelines are often missed by optimizing purely for performance. By including both performance and complexity as objective functions, TPOT will attempt to optimize the best pipeline for all complexity levels simultaneously. 
After optimization, the user will be able to see the complexity vs performance tradeoff and make the decision of which pipeline best suits their needs. \n", + "When running TPOT, including a secondary objective that measures model complexity can sometimes be beneficial. More complex models can yield higher performance, but this comes at the cost of interpretability. Simpler models may be more interpretable but often have lower predictive performance. Sometimes, however, vast increases in complexity only marginally improve predictive performance. There may be other simpler and more interpretable pipelines with marginal performance decreases that could be acceptable for the increased interpretability. However, these pipelines are often missed when optimizing purely for performance. By including both performance and complexity as objective functions, TPOT will attempt to optimize the best pipeline for all complexity levels simultaneously. After optimization, the user will be able to see the complexity vs performance tradeoff and decide which pipeline best suits their needs. \n", "\n", - "Two methods of measuring complexity to consider would be `tpot2.objectives.number_of_nodes_objective` or `tpot2.objectives.complexity_scorer`. The number of nodes objective simply calculates the number of steps within a pipeline. This is a simple metric, however it does not differentiate between the complexity of different model types. For example, a simple LogisticRegression counts the same as the much more complex XGBoost. The complexity scorer tries to estimate the number of learned parameters included in the classifiers and regressors of the pipeline. It is challenging and potentially subjective how to exactly quantify and compare complexity between different classes of models. However, this function provides a reasonable heuristic for the evolutionary algorithm that at least separates out qualitatively more or less complex algorithms from one another. 
While it may be hard to exactly compare the relative complexities of LogisticRegression and XGBoost, for example, both will always be on opposite ends of the complexity values returned by this function. This allows for pareto fronts with LogisticRegression on one side, and XGBoost on the other.\n", + "Two methods of measuring complexity to consider would be `tpot2.objectives.number_of_nodes_objective` or `tpot2.objectives.complexity_scorer`. The number of nodes objective simply calculates the number of steps within a pipeline. This is a simple metric, however it does not differentiate between the complexity of different model types. For example, a simple LogisticRegression counts the same as the much more complex XGBoost. The complexity scorer tries to estimate the number of learned parameters included in the classifiers and regressors of the pipeline. It is challenging and potentially subjective how to exactly quantify and compare complexity between different classes of models. However, this function provides a reasonable heuristic for the evolutionary algorithm that at least separates out qualitatively more or less complex algorithms from one another. While it may be hard to compare the relative complexities of LogisticRegression and XGBoost exactly, for example, both will always be on opposite ends of the complexity values returned by this function. This allows for pareto fronts with LogisticRegression on one side, and XGBoost on the other.\n", "\n", "An example of this analysis is demonstrated in a following section." ] @@ -211,12 +211,12 @@ "source": [ "## Terminating Optimization (Early Stopping)\n", "\n", - "Note that we use a short time duration for a quick example, but in practice you may need to run TPOT for a longer duration. by default, TPOT sets a time limit of 1 hour with a max limit of 5 minutes per pipeline. 
In practice you may want to increase these values.\n", + "Note that we use a short time duration for a quick example, but in practice, you may need to run TPOT for a longer duration. By default, TPOT sets a time limit of 1 hour with a max limit of 5 minutes per pipeline. In practice, you may want to increase these values.\n", "\n", - "There are three methods of terminating a TPOT run and ending the optimization process. TPOT will always terminate as soon as one of the conditions is met.\n", - "* `max_time_mins` : (Default, 1 hour) After this many minutes, TPOT will terminate and return the best pipeline it found so far.\n", - "* `early_stop` : An int causes TPOT to terminate early if it goes that number of generations without seeing an improvement in performance. Generally a value of around 5 to 20 is sufficient to be reasonably sure that performance has converged.\n", - "* `generations` : The total number of generations of the evolutionary algorithm to run.\n", + "There are three methods of terminating a TPOT run and ending the optimization process. TPOT will terminate as soon as one of the conditions is met.\n", + "* `max_time_mins` : (Default, 60 minutes) After this many minutes, TPOT will terminate and return the best pipeline it found so far.\n", + "* `early_stop` : The number of generations without seeing an improvement in performance, after which TPOT terminates. Generally, a value of around 5 to 20 is sufficient to be reasonably sure that performance has converged.\n", + "* `generations`: The total number of generations of the evolutionary algorithm to run.\n", "\n", "By default, TPOT will run until the time limit is up, with no generation or early stop limits." 
] @@ -290,14 +290,14 @@ }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ - "Generation: : 4it [01:39, 24.97s/it]\n", + "Generation: : 5it [01:33, 18.64s/it]\n", "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/linear_model/_sag.py:349: ConvergenceWarning: The max_iter was reached which means the coef_ did not converge\n", " warnings.warn(\n" ] @@ -306,7 +306,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "0.9948227797690163\n" + "0.9797527706734868\n" ] } ], @@ -352,7 +352,7 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 6, "metadata": {}, "outputs": [ { @@ -762,11 +762,12 @@ " /* fitted */\n", " background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
Pipeline(steps=[('standardscaler', StandardScaler()),\n",
-       "                ('selectfwe', SelectFwe(alpha=0.0001226579434)),\n",
+       "
Pipeline(steps=[('maxabsscaler', MaxAbsScaler()),\n",
+       "                ('selectfwe', SelectFwe(alpha=0.0142080454732)),\n",
        "                ('featureunion-1',\n",
-       "                 FeatureUnion(transformer_list=[('skiptransformer',\n",
-       "                                                 SkipTransformer()),\n",
+       "                 FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                                 FeatureUnion(transformer_list=[('passkbinsdiscretizer',\n",
+       "                                                                                 PassKBinsDiscretizer(n_bins=8))])),\n",
        "                                                ('passthrough',\n",
        "                                                 Passthrough())])),\n",
        "                ('featureunion-2',\n",
@@ -775,12 +776,14 @@
        "                                                ('passthrough',\n",
        "                                                 Passthrough())])),\n",
        "                ('logisticregression',\n",
-       "                 LogisticRegression(C=31921.0176296069, max_iter=1000, n_jobs=1,\n",
-       "                                    solver='saga'))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
MaxAbsScaler()
SelectFwe(alpha=0.0142080454732)
FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                FeatureUnion(transformer_list=[('passkbinsdiscretizer',\n",
+       "                                                                PassKBinsDiscretizer(n_bins=8))])),\n",
+       "                               ('passthrough', Passthrough())])
PassKBinsDiscretizer(n_bins=8)
Passthrough()
FeatureUnion(transformer_list=[('skiptransformer', SkipTransformer()),\n",
+       "                               ('passthrough', Passthrough())])
SkipTransformer()
Passthrough()
LogisticRegression(C=462.7983711938423, class_weight='balanced', max_iter=1000,\n",
+       "                   n_jobs=1, solver='saga')
" ], "text/plain": [ - "Pipeline(steps=[('standardscaler', StandardScaler()),\n", - " ('selectfwe', SelectFwe(alpha=0.0001226579434)),\n", + "Pipeline(steps=[('maxabsscaler', MaxAbsScaler()),\n", + " ('selectfwe', SelectFwe(alpha=0.0142080454732)),\n", " ('featureunion-1',\n", - " FeatureUnion(transformer_list=[('skiptransformer',\n", - " SkipTransformer()),\n", + " FeatureUnion(transformer_list=[('featureunion',\n", + " FeatureUnion(transformer_list=[('passkbinsdiscretizer',\n", + " PassKBinsDiscretizer(n_bins=8))])),\n", " ('passthrough',\n", " Passthrough())])),\n", " ('featureunion-2',\n", @@ -808,11 +816,12 @@ " ('passthrough',\n", " Passthrough())])),\n", " ('logisticregression',\n", - " LogisticRegression(C=31921.0176296069, max_iter=1000, n_jobs=1,\n", - " solver='saga'))])" + " LogisticRegression(C=462.7983711938423,\n", + " class_weight='balanced', max_iter=1000,\n", + " n_jobs=1, solver='saga'))])" ] }, - "execution_count": 11, + "execution_count": 6, "metadata": {}, "output_type": "execute_result" } @@ -824,22 +833,22 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "array([1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1,\n", - " 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1,\n", - " 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0,\n", - " 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1,\n", - " 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0,\n", - " 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0,\n", - " 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1])" + "array([0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0,\n", + " 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1,\n", + " 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0,\n", + " 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1,\n", + " 1, 0, 1, 1, 1, 1, 1, 1, 
0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0,\n", + " 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1,\n", + " 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1])" ] }, - "execution_count": 12, + "execution_count": 7, "metadata": {}, "output_type": "execute_result" } @@ -859,7 +868,7 @@ }, { "cell_type": "code", - "execution_count": 13, + "execution_count": 8, "metadata": {}, "outputs": [], "source": [ @@ -883,7 +892,7 @@ "\n", "| Column | Description |\n", "| :--- | :----: |\n", - "| | The first set of columns will correspond to each objective function. These can either be automatically named by TPOT, or passed in by the user. |\n", + "| \\ | The first set of columns will correspond to each objective function. These can either be automatically named by TPOT, or passed in by the user. |\n", "| Parents | This contains a tuple that contains the indexes of the 'parents' of the current pipeline. For example, (29, 42) means that the pipelines in indexes 29 and 42 were utilized to generate that pipeline. |\n", "| Variation_Function | The function applied to the parents to generate the new pipeline |\n", "| Individual | The individual class that represents a specific pipeline and hyperparameter configuration. This class also contains functions for mutation and crossover. To get the sklearn estimator/pipeline object from the individual you can call the `export_pipeline()` function. 
(as in, `pipe = ind.export_pipeline()`) |\n", @@ -896,7 +905,7 @@ }, { "cell_type": "code", - "execution_count": 14, + "execution_count": 9, "metadata": {}, "outputs": [ { @@ -905,7 +914,7 @@ "['roc_auc_score', 'complexity_scorer']" ] }, - "execution_count": 14, + "execution_count": 9, "metadata": {}, "output_type": "execute_result" } @@ -917,7 +926,7 @@ }, { "cell_type": "code", - "execution_count": 15, + "execution_count": 10, "metadata": {}, "outputs": [ { @@ -957,73 +966,73 @@ " \n", " \n", " 0\n", - " NaN\n", - " NaN\n", + " 0.994779\n", + " 38.8\n", " NaN\n", " NaN\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 0.0\n", - " 1.727136e+09\n", - " 1.727136e+09\n", - " INVALID\n", + " 1.727192e+09\n", + " 1.727192e+09\n", + " None\n", " NaN\n", - " (StandardScaler(), VarianceThreshold(threshold...\n", + " (Normalizer(norm='l1'), Passthrough(), Feature...\n", " \n", " \n", " 1\n", - " 0.990146\n", - " 10.0\n", + " 0.884608\n", + " 12.0\n", " NaN\n", " NaN\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 0.0\n", - " 1.727136e+09\n", - " 1.727136e+09\n", + " 1.727192e+09\n", + " 1.727192e+09\n", " None\n", " NaN\n", - " (MaxAbsScaler(), VarianceThreshold(threshold=0...\n", + " (MinMaxScaler(), SelectPercentile(percentile=6...\n", " \n", " \n", " 2\n", - " NaN\n", - " NaN\n", + " 0.995994\n", + " 277.0\n", " NaN\n", " NaN\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 0.0\n", - " 1.727136e+09\n", - " 1.727136e+09\n", - " INVALID\n", + " 1.727192e+09\n", + " 1.727192e+09\n", + " None\n", " NaN\n", - " (StandardScaler(), VarianceThreshold(threshold...\n", + " (MaxAbsScaler(), Passthrough(), FeatureUnion(t...\n", " \n", " \n", " 3\n", - " 0.961892\n", - " 80.0\n", + " 0.969714\n", + " 97.0\n", " NaN\n", " NaN\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 0.0\n", - " 1.727136e+09\n", - " 1.727136e+09\n", + " 1.727192e+09\n", + " 1.727192e+09\n", " None\n", " NaN\n", - " 
(RobustScaler(quantile_range=(0.2009793033711,...\n", + " (RobustScaler(quantile_range=(0.1797291876324,...\n", " \n", " \n", " 4\n", - " 0.955582\n", - " 8.0\n", + " 0.977700\n", + " 10.0\n", " NaN\n", " NaN\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 0.0\n", - " 1.727136e+09\n", - " 1.727136e+09\n", + " 1.727192e+09\n", + " 1.727192e+09\n", " None\n", " NaN\n", - " (MaxAbsScaler(), Passthrough(), FeatureUnion(t...\n", + " (RobustScaler(quantile_range=(0.0723902721638,...\n", " \n", " \n", " ...\n", @@ -1040,93 +1049,106 @@ " ...\n", " \n", " \n", - " 245\n", - " 0.981354\n", - " 11.0\n", - " (176, 176)\n", - " ind_mutate\n", + " 295\n", + " 0.994663\n", + " 17.0\n", + " (245, 2)\n", + " ind_crossover\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 4.0\n", - " 1.727136e+09\n", - " 1.727136e+09\n", + " 5.0\n", + " 1.727192e+09\n", + " 1.727192e+09\n", " None\n", - " NaN\n", - " (MinMaxScaler(), SelectFwe(alpha=0.00036066272...\n", + " 1.0\n", + " (MinMaxScaler(), SelectPercentile(percentile=3...\n", " \n", " \n", - " 246\n", - " 0.972795\n", - " 10.0\n", - " (145, 58)\n", - " ind_crossover\n", + " 296\n", + " NaN\n", + " NaN\n", + " (190, 190)\n", + " ind_mutate\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 4.0\n", - " 1.727136e+09\n", - " 1.727136e+09\n", - " None\n", + " 5.0\n", + " 1.727192e+09\n", + " 1.727192e+09\n", + " INVALID\n", " NaN\n", - " (MaxAbsScaler(), Passthrough(), FeatureUnion(t...\n", + " (RobustScaler(quantile_range=(0.0790382918495,...\n", " \n", " \n", - " 247\n", - " 0.895754\n", - " 6.0\n", - " (195, 195)\n", - " ind_mutate\n", + " 297\n", + " 0.995224\n", + " 231.0\n", + " (168, 232)\n", + " ind_crossover\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 4.0\n", - " 1.727136e+09\n", - " 1.727136e+09\n", + " 5.0\n", + " 1.727192e+09\n", + " 1.727192e+09\n", " None\n", " NaN\n", - " (StandardScaler(), VarianceThreshold(threshold...\n", + " (MaxAbsScaler(), 
SelectFwe(alpha=0.03220532277...\n", " \n", " \n", - " 248\n", - " 0.978311\n", - " 7.0\n", - " (32, 32)\n", - " ind_mutate\n", + " 298\n", + " 0.974412\n", + " 10.0\n", + " (93, 205)\n", + " ind_mutate , ind_mutate , ind_crossover\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 4.0\n", - " 1.727136e+09\n", - " 1.727136e+09\n", + " 5.0\n", + " 1.727192e+09\n", + " 1.727192e+09\n", " None\n", " NaN\n", - " (MaxAbsScaler(), SelectFwe(alpha=0.00140487405...\n", + " (MaxAbsScaler(), Passthrough(), FeatureUnion(t...\n", " \n", " \n", - " 249\n", - " 0.983915\n", - " 9.0\n", - " (99, 99)\n", + " 299\n", + " 0.932393\n", + " 8.0\n", + " (234, 234)\n", " ind_mutate\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 4.0\n", - " 1.727136e+09\n", - " 1.727136e+09\n", + " 5.0\n", + " 1.727192e+09\n", + " 1.727192e+09\n", " None\n", " NaN\n", - " (StandardScaler(), VarianceThreshold(threshold...\n", + " (Normalizer(norm='l1'), VarianceThreshold(thre...\n", " \n", " \n", "\n", - "

250 rows × 11 columns

\n", + "

300 rows × 11 columns

\n", "" ], "text/plain": [ - " roc_auc_score complexity_scorer Parents Variation_Function \\\n", - "0 NaN NaN NaN NaN \n", - "1 0.990146 10.0 NaN NaN \n", - "2 NaN NaN NaN NaN \n", - "3 0.961892 80.0 NaN NaN \n", - "4 0.955582 8.0 NaN NaN \n", - ".. ... ... ... ... \n", - "245 0.981354 11.0 (176, 176) ind_mutate \n", - "246 0.972795 10.0 (145, 58) ind_crossover \n", - "247 0.895754 6.0 (195, 195) ind_mutate \n", - "248 0.978311 7.0 (32, 32) ind_mutate \n", - "249 0.983915 9.0 (99, 99) ind_mutate \n", + " roc_auc_score complexity_scorer Parents \\\n", + "0 0.994779 38.8 NaN \n", + "1 0.884608 12.0 NaN \n", + "2 0.995994 277.0 NaN \n", + "3 0.969714 97.0 NaN \n", + "4 0.977700 10.0 NaN \n", + ".. ... ... ... \n", + "295 0.994663 17.0 (245, 2) \n", + "296 NaN NaN (190, 190) \n", + "297 0.995224 231.0 (168, 232) \n", + "298 0.974412 10.0 (93, 205) \n", + "299 0.932393 8.0 (234, 234) \n", + "\n", + " Variation_Function \\\n", + "0 NaN \n", + "1 NaN \n", + "2 NaN \n", + "3 NaN \n", + "4 NaN \n", + ".. ... \n", + "295 ind_crossover \n", + "296 ind_mutate \n", + "297 ind_crossover \n", + "298 ind_mutate , ind_mutate , ind_crossover \n", + "299 ind_mutate \n", "\n", " Individual Generation \\\n", "0 " ] @@ -1206,7 +1228,7 @@ }, { "data": { - "image/png": "", + "image/png": "", "text/plain": [ "
" ] @@ -1238,7 +1260,7 @@ }, { "cell_type": "code", - "execution_count": 17, + "execution_count": 12, "metadata": {}, "outputs": [ { @@ -1277,88 +1299,158 @@ " \n", " \n", " \n", - " 58\n", - " 0.997081\n", - " 30.4\n", - " (17, 17)\n", + " 211\n", + " 0.997632\n", + " 231.0\n", + " (2, 2)\n", " ind_mutate\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 1.0\n", - " 1.727136e+09\n", - " 1.727136e+09\n", + " 4.0\n", + " 1.727192e+09\n", + " 1.727192e+09\n", " None\n", " 1.0\n", - " (StandardScaler(), SelectFwe(alpha=0.000122657...\n", + " (MaxAbsScaler(), SelectFwe(alpha=0.01420804547...\n", " \n", " \n", - " 141\n", - " 0.994594\n", - " 11.0\n", - " (46, 46)\n", + " 251\n", + " 0.996697\n", + " 213.9\n", + " (211, 211)\n", " ind_mutate\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 2.0\n", - " 1.727136e+09\n", - " 1.727136e+09\n", + " 5.0\n", + " 1.727192e+09\n", + " 1.727192e+09\n", " None\n", " 1.0\n", - " (MaxAbsScaler(), Passthrough(), FeatureUnion(t...\n", + " (MaxAbsScaler(), VarianceThreshold(threshold=0...\n", " \n", " \n", - " 204\n", - " 0.993951\n", - " 9.0\n", - " (90, 90)\n", - " ind_mutate\n", + " 261\n", + " 0.996394\n", + " 182.4\n", + " (211, 182)\n", + " ind_crossover\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 4.0\n", - " 1.727136e+09\n", - " 1.727136e+09\n", + " 5.0\n", + " 1.727192e+09\n", + " 1.727192e+09\n", " None\n", " 1.0\n", - " (MinMaxScaler(), Passthrough(), FeatureUnion(t...\n", + " (MaxAbsScaler(), SelectFwe(alpha=0.01420804547...\n", " \n", " \n", - " 149\n", - " 0.985930\n", - " 8.0\n", - " (88, 88)\n", - " ind_mutate\n", + " 132\n", + " 0.996313\n", + " 33.0\n", + " (85, 18)\n", + " ind_crossover\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 2.0\n", - " 1.727136e+09\n", - " 1.727136e+09\n", + " 1.727192e+09\n", + " 1.727192e+09\n", " None\n", " 1.0\n", - " (MinMaxScaler(), SelectPercentile(percentile=5...\n", + " (StandardScaler(), 
SelectFwe(alpha=0.000377258...\n", " \n", " \n", - " 178\n", - " 0.980084\n", - " 7.0\n", - " (131, 131)\n", - " ind_mutate\n", + " 190\n", + " 0.996207\n", + " 27.5\n", + " (141, 122)\n", + " ind_mutate , ind_mutate , ind_crossover\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 3.0\n", - " 1.727136e+09\n", - " 1.727136e+09\n", + " 1.727192e+09\n", + " 1.727192e+09\n", " None\n", " 1.0\n", - " (MaxAbsScaler(), SelectFwe(alpha=0.00020686984...\n", + " (MaxAbsScaler(), VarianceThreshold(threshold=0...\n", " \n", " \n", - " 201\n", - " 0.949153\n", - " 6.0\n", - " (176, 32)\n", + " 250\n", + " 0.995593\n", + " 24.3\n", + " (173, 173)\n", + " ind_mutate\n", + " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", + " 5.0\n", + " 1.727192e+09\n", + " 1.727192e+09\n", + " None\n", + " 1.0\n", + " (MinMaxScaler(), SelectFwe(alpha=0.00034913463...\n", + " \n", + " \n", + " 295\n", + " 0.994663\n", + " 17.0\n", + " (245, 2)\n", + " ind_crossover\n", + " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", + " 5.0\n", + " 1.727192e+09\n", + " 1.727192e+09\n", + " None\n", + " 1.0\n", + " (MinMaxScaler(), SelectPercentile(percentile=3...\n", + " \n", + " \n", + " 227\n", + " 0.990767\n", + " 11.0\n", + " (188, 168)\n", + " ind_crossover\n", + " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", + " 4.0\n", + " 1.727192e+09\n", + " 1.727192e+09\n", + " None\n", + " 1.0\n", + " (Passthrough(), SelectFwe(alpha=0.000109999882...\n", + " \n", + " \n", + " 226\n", + " 0.989583\n", + " 9.0\n", + " (144, 90)\n", " ind_mutate , ind_mutate , ind_crossover\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 4.0\n", - " 1.727136e+09\n", - " 1.727136e+09\n", + " 1.727192e+09\n", + " 1.727192e+09\n", " None\n", " 1.0\n", - " (MinMaxScaler(), SelectFwe(alpha=0.00140487405...\n", + " (StandardScaler(), VarianceThreshold(threshold...\n", + " \n", + " \n", + " 213\n", + " 0.976500\n", + " 8.1\n", + " (151, 151)\n", + " ind_crossover , ind_mutate\n", + " 
<tpot2.search_spaces.pipelines.sequential.Sequ...\n", + " 4.0\n", + " 1.727192e+09\n", + " 1.727192e+09\n", + " None\n", + " 1.0\n", + " (MaxAbsScaler(), SelectFwe(alpha=0.03007503043...\n", + " \n", + " \n", + " 290\n", + " 0.960114\n", + " 6.0\n", + " (240, 240)\n", + " ind_mutate\n", + " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", + " 5.0\n", + " 1.727192e+09\n", + " 1.727192e+09\n", + " None\n", + " 1.0\n", + " (MinMaxScaler(), VarianceThreshold(threshold=0...\n", " \n", " \n", "\n", @@ -1366,47 +1458,72 @@ ], "text/plain": [ " roc_auc_score complexity_scorer Parents \\\n", - "58 0.997081 30.4 (17, 17) \n", - "141 0.994594 11.0 (46, 46) \n", - "204 0.993951 9.0 (90, 90) \n", - "149 0.985930 8.0 (88, 88) \n", - "178 0.980084 7.0 (131, 131) \n", - "201 0.949153 6.0 (176, 32) \n", + "211 0.997632 231.0 (2, 2) \n", + "251 0.996697 213.9 (211, 211) \n", + "261 0.996394 182.4 (211, 182) \n", + "132 0.996313 33.0 (85, 18) \n", + "190 0.996207 27.5 (141, 122) \n", + "250 0.995593 24.3 (173, 173) \n", + "295 0.994663 17.0 (245, 2) \n", + "227 0.990767 11.0 (188, 168) \n", + "226 0.989583 9.0 (144, 90) \n", + "213 0.976500 8.1 (151, 151) \n", + "290 0.960114 6.0 (240, 240) \n", "\n", " Variation_Function \\\n", - "58 ind_mutate \n", - "141 ind_mutate \n", - "204 ind_mutate \n", - "149 ind_mutate \n", - "178 ind_mutate \n", - "201 ind_mutate , ind_mutate , ind_crossover \n", + "211 ind_mutate \n", + "251 ind_mutate \n", + "261 ind_crossover \n", + "132 ind_crossover \n", + "190 ind_mutate , ind_mutate , ind_crossover \n", + "250 ind_mutate \n", + "295 ind_crossover \n", + "227 ind_crossover \n", + "226 ind_mutate , ind_mutate , ind_crossover \n", + "213 ind_crossover , ind_mutate \n", + "290 ind_mutate \n", "\n", " Individual Generation \\\n", - "58
Pipeline(steps=[('minmaxscaler', MinMaxScaler()),\n",
-       "                ('selectfwe', SelectFwe(alpha=0.0014048740592)),\n",
+       "                ('variancethreshold',\n",
+       "                 VarianceThreshold(threshold=0.0004675292341)),\n",
        "                ('featureunion-1',\n",
        "                 FeatureUnion(transformer_list=[('featureunion',\n",
-       "                                                 FeatureUnion(transformer_list=[('zerocount',\n",
-       "                                                                                 ZeroCount())])),\n",
+       "                                                 FeatureUnion(transformer_list=[('quantiletransformer',\n",
+       "                                                                                 QuantileTransformer(n_quantiles=104))])),\n",
        "                                                ('passthrough',\n",
        "                                                 Passthrough())])),\n",
        "                ('featureunion-2',\n",
-       "                 FeatureUnion(transformer_list=[('featureunion',\n",
-       "                                                 FeatureUnion(transformer_list=[('estimatortransformer',\n",
-       "                                                                                 EstimatorTransformer(estimator=BernoulliNB(alpha=76.5761838773666,\n",
-       "                                                                                                                            fit_prior=False)))])),\n",
+       "                 FeatureUnion(transformer_list=[('skiptransformer',\n",
+       "                                                 SkipTransformer()),\n",
        "                                                ('passthrough',\n",
        "                                                 Passthrough())])),\n",
        "                ('kneighborsclassifier',\n",
-       "                 KNeighborsClassifier(n_jobs=1, n_neighbors=1, p=3))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
MinMaxScaler()
VarianceThreshold(threshold=0.0004675292341)
FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                FeatureUnion(transformer_list=[('quantiletransformer',\n",
+       "                                                                QuantileTransformer(n_quantiles=104))])),\n",
+       "                               ('passthrough', Passthrough())])
QuantileTransformer(n_quantiles=104)
Passthrough()
FeatureUnion(transformer_list=[('skiptransformer', SkipTransformer()),\n",
+       "                               ('passthrough', Passthrough())])
SkipTransformer()
Passthrough()
KNeighborsClassifier(n_jobs=1, n_neighbors=1)
" ], "text/plain": [ "Pipeline(steps=[('minmaxscaler', MinMaxScaler()),\n", - " ('selectfwe', SelectFwe(alpha=0.0014048740592)),\n", + " ('variancethreshold',\n", + " VarianceThreshold(threshold=0.0004675292341)),\n", " ('featureunion-1',\n", " FeatureUnion(transformer_list=[('featureunion',\n", - " FeatureUnion(transformer_list=[('zerocount',\n", - " ZeroCount())])),\n", + " FeatureUnion(transformer_list=[('quantiletransformer',\n", + " QuantileTransformer(n_quantiles=104))])),\n", " ('passthrough',\n", " Passthrough())])),\n", " ('featureunion-2',\n", - " FeatureUnion(transformer_list=[('featureunion',\n", - " FeatureUnion(transformer_list=[('estimatortransformer',\n", - " EstimatorTransformer(estimator=BernoulliNB(alpha=76.5761838773666,\n", - " fit_prior=False)))])),\n", + " FeatureUnion(transformer_list=[('skiptransformer',\n", + " SkipTransformer()),\n", " ('passthrough',\n", " Passthrough())])),\n", " ('kneighborsclassifier',\n", - " KNeighborsClassifier(n_jobs=1, n_neighbors=1, p=3))])" + " KNeighborsClassifier(n_jobs=1, n_neighbors=1))])" ] }, - "execution_count": 18, + "execution_count": 13, "metadata": {}, "output_type": "execute_result" } @@ -1908,21 +2019,19 @@ "source": [ "## Plot performance over time + Continuing a run from where it left off\n", "\n", - "Plotting the performance over time is a good way of trying to access whether or not the TPOT model has converged. If performance seems to asymptote over time, there may not be much more performance to be gained by running for a longer period of time. If the plot looks like it is still actively improving, it may be worth running TPOT for a longer duration. \n", + "Plotting performance over time is a good way to assess whether or not the TPOT model has converged. If performance asymptotes over time, there may not be much more performance to be gained by running for a longer period. If the plot looks like it is still improving, it may be worth running TPOT for a longer duration. 
\n", "\n", - "There are two ways to resume TPOT. If the `warm_start` parameter is set to True, subsequent calls to `fit` will continue training where it left off (The conventional scikit-learn default is to retrain from scratch on subsequent calls to fit). Additionally, if `periodic_checkpoint_folder` is set, TPOT will periodically save its current state. If TPOT terminates normally, is interrupted (job canceled, PC shut off), or crashes (memory issues), it will be able to resume training from where it left off. ** NOTE: If the periodic_checkpoint_folder is set, TPOT will always resume from the **\n", - "\n", - "In this case we can see that performance is near optimal and has slowed, so more time is likely unnecessary." + "In this case, we can see that performance is near optimal and has slowed, so more time is likely unnecessary." ] }, { "cell_type": "code", - "execution_count": 19, + "execution_count": 14, "metadata": {}, "outputs": [ { "data": { - "image/png": "", + "image/png": "", "text/plain": [ "
" ] @@ -1947,6 +2056,20 @@ "plt.show()\n" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Checkpointing\n", + "\n", + "There are two ways to resume TPOT. \n", + "* If the `warm_start` parameter is set to True, subsequent calls to `fit` will continue training where it left off (The conventional scikit-learn default is to retrain from scratch on subsequent calls to fit). \n", + "* If `periodic_checkpoint_folder` is set, TPOT will periodically save its current state to disk. If TPOT is interrupted (job canceled, PC shut off, crashes), you can resume training from where it left off. The checkpoint folder stores a data frame of all evaluated pipelines. This data frame can be loaded and inspected to help diagnose problems when debugging.\n", + "\n", + "\n", + "**Note: TPOT does not clean up the checkpoint files. If the `periodic_checkpoint_folder` parameter is set, training from the last saved point will always continue, even if the input data has changed. A common issue is forgetting to change this folder between experiments and TPOT continuing training from pipelines optimized for another dataset. If you intend to start a run from scratch, you must either remove the parameter, supply an empty folder, or delete the original checkpoint folder.**" + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -1983,14 +2106,14 @@ "source": [ "# Pipeline caching in TPOT (joblib.Memory)\n", "\n", - "With the memory parameter, pipelines can cache the results of each transformer after fitting them. This feature is used to avoid repeated computation by transformers within a pipeline if the parameters and input data are identical to another fitted pipeline during optimization process. TPOT allows users to specify a custom directory path or joblib.Memory in case they want to re-use the memory cache in future TPOT runs (or a warm_start run).\n", + "With the memory parameter, pipelines can cache the results of each transformer after fitting them. 
This feature is used to avoid repeated computation by transformers within a pipeline if the parameters and input data are identical to another fitted pipeline during the optimization process. TPOT allows users to specify a custom directory path or joblib.Memory in case they want to re-use the memory cache in future TPOT runs (or a warm_start run).\n", "\n", "There are three methods for enabling memory caching in TPOT:" ] }, { "cell_type": "code", - "execution_count": 22, + "execution_count": 15, "metadata": {}, "outputs": [], "source": [ @@ -2017,23 +2140,6 @@ "**Note: TPOT does NOT clean up memory caches if users set a custom directory path or Memory object. We recommend that you clean up the memory caches when you don't need it anymore.**" ] }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Checkpointing\n", - "\n", - "TPOT can checkpoint its progress to disk and resume from that point later if the `periodic_checkpoint_folder` parameter is used. TPOT will save its internal dataframe of pipelines and their performance to disk every generation, allowing you to interrupt TPOT’s execution and resume it later on the same or a different machine.\n", - "\n", - "This feature is useful in several scenarios:\n", - "\n", - "Interrupting TPOT’s execution and resuming it later on the same or a different machine.\n", - "Handling unexpected terminations, such as power outages, cluster job cancellations, bugs, errors, or out-of-memory issues. The checkpointed dataframe can be loaded and inspected to help diagnose problems.\n", - "Running TPOT on a cluster and periodically saving its progress to disk.\n", - "\n", - "**Note: TPOT does not clean up the checkpoint files. If the `periodic_checkpoint_folder` parameter is set, it will always continue training from the last saved point, even if the input data has changed. A common issue is forgetting to change this folder between experiments, and TPOT continuing training from pipelines optimized for another dataset. 
If you intend to start a run from scratch, you must either remove the parameter, supply an empty folder, or delete the original checkpoint folder.**" - ] - }, { "cell_type": "markdown", "metadata": {}, @@ -2051,48 +2157,49 @@ "\n", "If you are experiencing issues with TPOT, here are some common issues and how to address them.\n", "\n", - "* Performance is lower than expected. what can I do?\n", + "* Performance is lower than expected. What can I do?\n", " * TPOT may have to be run for a longer duration, increase `max_time_mins`, `early_stop`, or `generations`.\n", - " * Individual pipelines may need more time to complete fitting, increase `max_eval_time_seconds`.\n", - " * The configuration may not include the optimal model types or hyperparameter ranges, explore other included templates or customize your own search space (see Tutorial 2!)\n", - " * Check that `periodic_checkpoint_folder` is set correctly. A common issue is forgetting to change this folder between experiments, and TPOT continuing training from pipelines optimized for another dataset.\n", + " * Individual pipelines may need more time to complete fitting; increase `max_eval_time_seconds`.\n", + " * The configuration may not include the optimal model types or hyperparameter ranges; explore other included templates or customize your own search space (see Tutorial 2!)\n", + " * Check that `periodic_checkpoint_folder` is set correctly. A common issue is forgetting to change this folder between experiments and TPOT continuing training from pipelines optimized for another dataset.\n", "* TPOT is too slow! It is running forever and never terminating\n", - " * Check that at least one of the three termination conditions is set to a reasonable level. These are `max_time_mins`, `early_stop`, or `generations`. Additionally check that `max_eval_time_seconds` is giving enough time for most models to train without being overly long. 
(Some estimators may take an unreasonably long time to fit, this parameter is intended to prevent them from slowing everything to a halt. In my experience, SVC and SVR tend to be the culprits, so removing them from the search space may also improve run time).\n", + " * Check that at least one of the three termination conditions is set to a reasonable level. These are `max_time_mins`, `early_stop`, or `generations`. Additionally, check that `max_eval_time_seconds` gives enough time for most models to train without being overly long. (Some estimators may take an unreasonably long time to fit; this parameter is intended to prevent them from slowing everything to a halt. In my experience, SVC and SVR tend to be the culprits, so removing them from the search space may also improve run time).\n", " * Set the `memory` parameter to allow TPOT to prevent repeated work when using either scikit-learn pipelines or TPOT GraphPipelines.\n", " * Increase n_jobs to use more processes/CPU power. See Tutorial 7 for advanced Dask usage, including parallelizing across multiple nodes on an HPC.\n", " * Use feature selection, either the build in configuration of sklearn methods (see Tutorial 2), or genetic feature selection (see Tutorials 3 and 5 for two different strategies).\n", " * Use successive halving to reduce computational load (See tutorial 8).\n", - "* Many pipelines in the evaluated_individuals dataframe have crashed or turned up invalid!\n", - " * This may actually be normal and is expected behavior for TPOT. In some cases, TPOT may attempt an invalid hyperparameter combination which results in the pipeline not working. Other times, the pipeline configuration itself may be invalid. For example, a selector may not select any features due to its hyperparameter. Another common example is `MultinomialNB` throwing an error because it expects positive values, but a prior transformation yielded a negative value. 
\n", + "* Many pipelines in the `evaluated_individuals` dataframe have crashed or turned up invalid!\n", + " * This is normal and is expected behavior for TPOT. In some cases, TPOT may attempt an invalid hyperparameter combination, resulting in the pipeline not working. Other times, the pipeline configuration itself may be invalid. For example, a selector may not select any features due to its hyperparameter. Another common example is `MultinomialNB` throwing an error because it expects non-negative values, but a prior transformation yielded a negative value. \n", " * If you used custom search spaces, you can use `ConfigSpace` conditionals to prevent invalid hyperparameters (this may still occur due to how TPOT uses crossover).\n", " * Setting `verbose=5` will print out the full error message for all failed pipelines. This can be useful for debugging whether or not there is something misconfigured in your pipeline, custom search space modules, or something else.\n", "* TPOT is crashing due to memory issues\n", - " * Set the `memory_limit` parameter so that n_jobs*memorylimit is less than the available RAM on your machine plus some wiggle room. This should prevent crashing due to memory concerns.\n", - " * Using feature selection may also improve memory usage as described above.\n", + " * Set the `memory_limit` parameter so that `n_jobs * memory_limit` is less than the available RAM on your machine, plus some wiggle room. This should prevent crashing due to memory concerns.\n", + " * Using feature selection may also improve memory usage, as described above.\n", " * Remove modules that create high RAM usage (e.g. multiple PolynomialFeatures or one with high degree).\n", "* Why are my TPOT runs not reproducible when random_state is set?\n", " * Check that `periodic_checkpoint_folder` is set correctly. If this is set to a non-empty folder, TPOT will continue training from the checkpoint rather than start a new run from scratch. 
For TPOT runs to be reproducible, they have to have the same starting points.\n", - " * If using custom search spaces, make sure to pass in a fixed `random_state` value into the configspace of the scikit-learn modules that utilize them. TPOT does not check whether estimators do or do not take in a random state value (See Tutorial 2).\n", + " * If using custom search spaces, pass in a fixed `random_state` value into the configspace of the scikit-learn modules that utilize them. TPOT does not check whether estimators do or do not take in a random state value (See Tutorial 2).\n", " * If using the pre-built search spaces provided by TPOT, make sure to pass in `random_state` to `tpot2.config.get_configspace` or `tpot2.config.template_search_spaces.get_template_search_spaces`. This ensures all estimators that support it get a fixed random_state value. (See Tutorial 2).\n", - " * If using custom Node and Pipeline types, make sure that all random decisions utilize the rng parameter passed into the mutation/crossover functions.\n", - " * If `max_eval_time_mins` is set, TPOT will terminate pipelines that go over this time limit. If the pipeline evaluation happens to be very similar to the time limit, its possible that small random fluctuations in CPU allocation may cause a give pipeline to happen to be evaluated in one run but not another. This slightly different result would throw off the random number generator thoughout the rest of the run. Setting `max_eval_time_mins` to None or a higher value may prevent this edge case.\n", + " * If using custom Node and Pipeline types, ensure all random decisions utilize the rng parameter passed into the mutation/crossover functions.\n", + " * If `max_eval_time_mins` is set, TPOT will terminate pipelines that exceed this time limit. If the pipeline evaluation happens to be very similar to the time limit, small random fluctuations in CPU allocation may cause a given pipeline to be evaluated in one run but not another. 
This slightly different result would throw off the random number generator throughout the rest of the run. Setting `max_eval_time_mins` to None or a higher value may prevent this edge case.\n", " * If using `TPOTEstimatorSteadyState` with `n_jobs`>1, it is also possible that random fluctuations in CPU allocation slightly change the order in which pipelines are evaluated, which will affect the downstream results. `TPOTEstimatorSteadyState` is more reliably reproducible when `n_jobs=1` (This is not an issue for the default `TPOTEstimator`, `TPOTClassifier`, `TPOTRegressor` as they used a batched generational approach where execution order does not impact results).\n", - "* TPOT is not using all the CPU cores I expected given my `n_jobs` setting.\n", - " * The default TPOT algorithm uses a generational approach. This means the TPOT will need to fully evaluated `population_size` (default 50) pipelines before starting the next batch. Often, TPOT will be waiting for the last few pipelines to finish evaluating, which could be less than `n_jobs`. Some estimators or pipelines can be significantly slower to evaluated than others. This can be addressed in a few ways:\n", - " * Decrease `max_eval_time_mins` to cut long running pipeline evaluations early.\n", + "* TPOT is not using all the CPU cores I expected, given my `n_jobs` setting.\n", + " * The default TPOT algorithm uses a generational approach. This means TPOT must finish evaluating all `population_size` (default 50) pipelines before starting the next batch. At the end of each generation, TPOT may leave threads unused while it waits for the last few pipelines to finish evaluating. Some estimators or pipelines can be significantly slower to evaluate than others. 
This can be addressed in a few ways:\n", + " * Decrease `max_eval_time_mins` to cut long-running pipeline evaluations early.\n", " * Remove estimators or hyperparameter configurations that are prone to very slow convergence (which is very often `SVC` or `SVR`).\n", " * Alternatively, `TPOTEstimatorSteadyState` uses a slightly different backend for the evolutionary algorithm that does not utilize the generational approach. Instead, new pipelines are generated and evaluated as soon as the previous one finishes. With this estimator, all cores should be utilized at all times. \n", + " * Sometimes, setting `n_jobs` to a multiple of the number of threads can help minimize the chances of threads being idle while waiting for others to finish.\n", "\n", "\n", "\n", "## Other things to be aware of:\n", "\n", - "* **Overfitting** On small datasets, it is not impossible for TPOT to over fit the cross validation score itself. This can lead to lower than expected performance on held out datasets. TPOT will always return the model with the highest CV score as its final fitted_pipeline. However, if the highest performing model as evaluated by cross validation actually was just overfit to the CV score, it may actually be worse performing compared to other models on the pareto front.\n", - " * Using a secondary complexity objective and evaluating the entire pareto front may be beneficial. In some cases a lower performing pipeline with lower complexity can actually perform better on held out sets. These can either be evaluated and compared on a held out validation set, or sometimes, if very data limited, simply using a different seed of splitting the CV folds can work as well.\n", - " * TPOT can do this automatically. The `validation_strategy` parameter than select between re-testing the final pareto front on either a held out validation set (percent of data set by `validation_fraction`) or on a different seed for splitting the CV folds. 
These can be selected by setting `validation_strategy` to \"split\" or \"reshuffled\", respectively.\n", - " * Increasing the number of folds of cross validation can mitigate this. \n", - " * Nested cross validation can also be used to estimate the performance of the TPOT optimization algorithm itself.\n", - " * Removing more complex methods from the search space can reduce the changes of overfitting" + "* **Overfitting** On small datasets, TPOT can overfit the cross-validation score itself. This can lead to lower-than-expected performance on held-out datasets. TPOT will always return the model with the highest CV score as its final fitted_pipeline. However, if the highest-performing model, as evaluated by cross-validation, was merely overfit to the CV score, it may perform worse than other models on the Pareto front.\n", + "  * Using a secondary complexity objective and evaluating the entire Pareto front may be beneficial. In some cases, a lower-performing pipeline with lower complexity can perform better on held-out sets. The candidates can be compared on a held-out validation set or, if data is very limited, by re-running cross-validation with a different seed for splitting the CV folds.\n", + "    * TPOT can do this automatically. The `validation_strategy` parameter can be set to re-test the final Pareto front on either a held-out validation set (the fraction of data is set by `validation_fraction`) or a different seed for splitting the CV folds. These can be selected by setting `validation_strategy` to \"split\" or \"reshuffled\", respectively.\n", + "  * Increasing the number of folds of cross-validation can mitigate this. 
\n", + "  * Nested cross-validation can also be used to estimate the performance of the TPOT optimization algorithm itself.\n", + "  * Removing more complex methods from the search space can reduce the chances of overfitting" ] }, { @@ -2102,13 +2209,13 @@ "source": [ "# More Options\n", "\n", - "`tpot2.TPOTClassifier` and `tpot2.TPOTRegressor` have a simplified set of hyperparameters with default values set for classification and regression problems. Currently, both of these use the standard evolutionary algorithm in the `tpot2.TPOTEstimator` class. If you want more control you can look into either the `tpot2.TPOTEstimator` or `tpot2.TPOTEstimatorSteadyState` class.\n", + "`tpot2.TPOTClassifier` and `tpot2.TPOTRegressor` have a simplified set of hyperparameters with default values set for classification and regression problems. Currently, both of these use the standard evolutionary algorithm in the `tpot2.TPOTEstimator` class. If you want more control, you can look into either the `tpot2.TPOTEstimator` or `tpot2.TPOTEstimatorSteadyState` class.\n", "\n", "There are two evolutionary algorithms built into TPOT2, which corresponds to two different estimator classes.\n", "\n", "1. The `tpot2.TPOTEstimator` uses a standard evolutionary algorithm that evaluates exactly population_size individuals each generation. This is similar to the algorithm in TPOT1. The next generation does not start until the previous is completely finished evaluating. This leads to underutilized CPU time as the cores are waiting for the last individuals to finish training, but may preserve diversity in the population. \n", "\n", - "2. The `tpot2.TPOTEstimatorSteadyState` differs in that it will generate and evaluate the next individual as soon as an individual finishes evaluation. The number of individuals being evaluated is determined by the n_jobs parameter. There is no longer a concept of generations. The population_size parameter now refers to the size of the list of evaluated parents. 
When an individual is evaluated, the selection method updates the list of parents. This allows more efficient utilization when using multiple cores.\n" + "2. The `tpot2.TPOTEstimatorSteadyState` differs in that it will generate and evaluate the next individual as soon as an individual finishes the evaluation. The number of individuals being evaluated is determined by the n_jobs parameter. There is no longer a concept of generations. The population_size parameter now refers to the size of the list of evaluated parents. When an individual is evaluated, the selection method updates the list of parents. This allows more efficient utilization when using multiple cores.\n" ] }, { @@ -2120,21 +2227,21 @@ }, { "cell_type": "code", - "execution_count": 27, + "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ - "Evaluations: : 21it [00:13, 1.61it/s]\n" + "Evaluations: : 25it [00:10, 2.47it/s]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ - "0.9786392405063291\n" + "0.9824771007566706\n" ] } ], @@ -2174,12 +2281,12 @@ }, { "cell_type": "code", - "execution_count": 28, + "execution_count": 17, "metadata": {}, "outputs": [ { "data": { - "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAnYAAAHWCAYAAAD6oMSKAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8fJSN1AAAACXBIWXMAAA9hAAAPYQGoP6dpAAAbKUlEQVR4nO3dfZBddZ3n8U933zx1SCdpWjohIZGBJICQhGQUJcA4gsPAjhGKGkVmrNV1ZnVndlYdy9LCmVlGXdwS1LLWGtHxadhRdFQsIj5QCssogUENJEiIECEm6Tw0STok5Dn9sH8QIiE8JJCQ9NfXq4oq7ul77vmde6n6vTn3nnOaBgYGBgIAwKDXfKQHAADAoSHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoAhhBwBQhLADAChC2AEAFCHsAACKEHYAAEUIOwCAIoQdAEARwg4AoIjGkR7AS6Gvry89PT3p7u5Od3d31q1dm53bt6e/ry/NLS0ZNmJEXjZuXDo7O9PZ2Zn29va0tLQc6WEDAM/B/L6/poGBgYEjPYjDZePGjVm0aFF+ec892bF1awZ6e3PM9u0Z3dOTIb29aR4YSH9TU3Y3GtnU3p4tI0akqdHI8JEjc8asWZkxY0bGjh17pHcDAHgK8/uzKxl2q1evzp133JFlS5dmyLZtmbRiZcb39GT01q0Z0tf3rOvtbmn
...[remaining base64-encoded PNG data omitted]", + "image/png": "", "text/plain": [ "
" ] @@ -2195,7 +2302,7 @@ }, { "cell_type": "code", - "execution_count": 30, + "execution_count": 18, "metadata": {}, "outputs": [ { @@ -2234,68 +2341,68 @@ " \n", " \n", " 0\n", - " 0.474805\n", - " 20.0\n", + " 0.940426\n", + " 98.0\n", " NaN\n", " NaN\n", " <tpot2.search_spaces.pipelines.graph.GraphPipe...\n", - " 1.727144e+09\n", - " 1.727144e+09\n", + " 1.727192e+09\n", + " 1.727192e+09\n", " None\n", " NaN\n", - " [('LogisticRegression_1', 'FastICA_1'), ('Fast...\n", + " [('DecisionTreeClassifier_1', 'SelectPercentil...\n", " \n", " \n", " 1\n", - " 0.962983\n", - " 78.0\n", + " 0.954244\n", + " 70.0\n", " NaN\n", " NaN\n", " <tpot2.search_spaces.pipelines.graph.GraphPipe...\n", - " 1.727144e+09\n", - " 1.727144e+09\n", + " 1.727192e+09\n", + " 1.727192e+09\n", " None\n", " NaN\n", - " [('DecisionTreeClassifier_1', 'PCA_1'), ('Quan...\n", + " [('DecisionTreeClassifier_1', 'ColumnOneHotEnc...\n", " \n", " \n", " 2\n", - " 0.962310\n", - " 57.0\n", + " 0.967942\n", + " 13.0\n", " NaN\n", " NaN\n", " <tpot2.search_spaces.pipelines.graph.GraphPipe...\n", - " 1.727144e+09\n", - " 1.727144e+09\n", + " 1.727192e+09\n", + " 1.727192e+09\n", " None\n", " NaN\n", - " [('DecisionTreeClassifier_1', 'SelectFwe_1')]\n", + " [('KNeighborsClassifier_1', 'SelectFwe_2'), ('...\n", " \n", " \n", " 3\n", - " 0.956908\n", - " 66.0\n", + " NaN\n", + " NaN\n", " NaN\n", " NaN\n", " <tpot2.search_spaces.pipelines.graph.GraphPipe...\n", - " 1.727144e+09\n", - " 1.727144e+09\n", - " None\n", + " 1.727192e+09\n", + " 1.727192e+09\n", + " INVALID\n", " NaN\n", - " [('DecisionTreeClassifier_1', 'SelectFwe_2'), ...\n", + " [('LogisticRegression_1', 'FeatureAgglomeratio...\n", " \n", " \n", " 4\n", - " 0.879195\n", - " 15.0\n", + " 0.953421\n", + " 80.0\n", " NaN\n", " NaN\n", " <tpot2.search_spaces.pipelines.graph.GraphPipe...\n", - " 1.727144e+09\n", - " 1.727144e+09\n", + " 1.727192e+09\n", + " 1.727192e+09\n", " None\n", " NaN\n", - " [('DecisionTreeClassifier_1', 'SelectFwe_1')]\n", + 
" [('DecisionTreeClassifier_1', 'SelectPercentil...\n", " \n", " \n", "\n", @@ -2303,35 +2410,35 @@ ], "text/plain": [ " roc_auc_score complexity_scorer Parents Variation_Function \\\n", - "0 0.474805 20.0 NaN NaN \n", - "1 0.962983 78.0 NaN NaN \n", - "2 0.962310 57.0 NaN NaN \n", - "3 0.956908 66.0 NaN NaN \n", - "4 0.879195 15.0 NaN NaN \n", + "0 0.940426 98.0 NaN NaN \n", + "1 0.954244 70.0 NaN NaN \n", + "2 0.967942 13.0 NaN NaN \n", + "3 NaN NaN NaN NaN \n", + "4 0.953421 80.0 NaN NaN \n", "\n", " Individual Submitted Timestamp \\\n", - "0 #sk-container-id-1 {\n", - " /* Definition of color scheme common for light and dark mode */\n", - " --sklearn-color-text: black;\n", - " --sklearn-color-line: gray;\n", - " /* Definition of color scheme for unfitted estimators */\n", - " --sklearn-color-unfitted-level-0: #fff5e6;\n", - " --sklearn-color-unfitted-level-1: #f6e4d2;\n", - " --sklearn-color-unfitted-level-2: #ffe0b3;\n", - " --sklearn-color-unfitted-level-3: chocolate;\n", - " /* Definition of color scheme for fitted estimators */\n", - " --sklearn-color-fitted-level-0: #f0f8ff;\n", - " --sklearn-color-fitted-level-1: #d4ebff;\n", - " --sklearn-color-fitted-level-2: #b3dbfd;\n", - " --sklearn-color-fitted-level-3: cornflowerblue;\n", - "\n", - " /* Specific color for light theme */\n", - " --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n", - " --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, white)));\n", - " --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n", - " --sklearn-color-icon: #696969;\n", - "\n", - " @media (prefers-color-scheme: dark) {\n", - " /* Redefinition of color scheme for dark theme */\n", - " --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n", - " 
--sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, #111)));\n", - " --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n", - " --sklearn-color-icon: #878787;\n", - " }\n", - "}\n", - "\n", - "#sk-container-id-1 {\n", - " color: var(--sklearn-color-text);\n", - "}\n", - "\n", - "#sk-container-id-1 pre {\n", - " padding: 0;\n", - "}\n", - "\n", - "#sk-container-id-1 input.sk-hidden--visually {\n", - " border: 0;\n", - " clip: rect(1px 1px 1px 1px);\n", - " clip: rect(1px, 1px, 1px, 1px);\n", - " height: 1px;\n", - " margin: -1px;\n", - " overflow: hidden;\n", - " padding: 0;\n", - " position: absolute;\n", - " width: 1px;\n", - "}\n", - "\n", - "#sk-container-id-1 div.sk-dashed-wrapped {\n", - " border: 1px dashed var(--sklearn-color-line);\n", - " margin: 0 0.4em 0.5em 0.4em;\n", - " box-sizing: border-box;\n", - " padding-bottom: 0.4em;\n", - " background-color: var(--sklearn-color-background);\n", - "}\n", - "\n", - "#sk-container-id-1 div.sk-container {\n", - " /* jupyter's `normalize.less` sets `[hidden] { display: none; }`\n", - " but bootstrap.min.css set `[hidden] { display: none !important; }`\n", - " so we also need the `!important` here to be able to override the\n", - " default hidden behavior on the sphinx rendered scikit-learn.org.\n", - " See: https://github.com/scikit-learn/scikit-learn/issues/21755 */\n", - " display: inline-block !important;\n", - " position: relative;\n", - "}\n", - "\n", - "#sk-container-id-1 div.sk-text-repr-fallback {\n", - " display: none;\n", - "}\n", - "\n", - "div.sk-parallel-item,\n", - "div.sk-serial,\n", - "div.sk-item {\n", - " /* draw centered vertical line to link estimators */\n", - " background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));\n", - " background-size: 2px 100%;\n", - " background-repeat: no-repeat;\n", - " 
background-position: center center;\n", - "}\n", - "\n", - "/* Parallel-specific style estimator block */\n", - "\n", - "#sk-container-id-1 div.sk-parallel-item::after {\n", - " content: \"\";\n", - " width: 100%;\n", - " border-bottom: 2px solid var(--sklearn-color-text-on-default-background);\n", - " flex-grow: 1;\n", - "}\n", - "\n", - "#sk-container-id-1 div.sk-parallel {\n", - " display: flex;\n", - " align-items: stretch;\n", - " justify-content: center;\n", - " background-color: var(--sklearn-color-background);\n", - " position: relative;\n", - "}\n", - "\n", - "#sk-container-id-1 div.sk-parallel-item {\n", - " display: flex;\n", - " flex-direction: column;\n", - "}\n", - "\n", - "#sk-container-id-1 div.sk-parallel-item:first-child::after {\n", - " align-self: flex-end;\n", - " width: 50%;\n", - "}\n", - "\n", - "#sk-container-id-1 div.sk-parallel-item:last-child::after {\n", - " align-self: flex-start;\n", - " width: 50%;\n", - "}\n", - "\n", - "#sk-container-id-1 div.sk-parallel-item:only-child::after {\n", - " width: 0;\n", - "}\n", - "\n", - "/* Serial-specific style estimator block */\n", - "\n", - "#sk-container-id-1 div.sk-serial {\n", - " display: flex;\n", - " flex-direction: column;\n", - " align-items: center;\n", - " background-color: var(--sklearn-color-background);\n", - " padding-right: 1em;\n", - " padding-left: 1em;\n", - "}\n", - "\n", - "\n", - "/* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is\n", - "clickable and can be expanded/collapsed.\n", - "- Pipeline and ColumnTransformer use this feature and define the default style\n", - "- Estimators will overwrite some part of the style using the `sk-estimator` class\n", - "*/\n", - "\n", - "/* Pipeline and ColumnTransformer style (default) */\n", - "\n", - "#sk-container-id-1 div.sk-toggleable {\n", - " /* Default theme specific background. 
It is overwritten whether we have a\n", - " specific estimator or a Pipeline/ColumnTransformer */\n", - " background-color: var(--sklearn-color-background);\n", - "}\n", - "\n", - "/* Toggleable label */\n", - "#sk-container-id-1 label.sk-toggleable__label {\n", - " cursor: pointer;\n", - " display: block;\n", - " width: 100%;\n", - " margin-bottom: 0;\n", - " padding: 0.5em;\n", - " box-sizing: border-box;\n", - " text-align: center;\n", - "}\n", - "\n", - "#sk-container-id-1 label.sk-toggleable__label-arrow:before {\n", - " /* Arrow on the left of the label */\n", - " content: \"▸\";\n", - " float: left;\n", - " margin-right: 0.25em;\n", - " color: var(--sklearn-color-icon);\n", - "}\n", - "\n", - "#sk-container-id-1 label.sk-toggleable__label-arrow:hover:before {\n", - " color: var(--sklearn-color-text);\n", - "}\n", - "\n", - "/* Toggleable content - dropdown */\n", - "\n", - "#sk-container-id-1 div.sk-toggleable__content {\n", - " max-height: 0;\n", - " max-width: 0;\n", - " overflow: hidden;\n", - " text-align: left;\n", - " /* unfitted */\n", - " background-color: var(--sklearn-color-unfitted-level-0);\n", - "}\n", - "\n", - "#sk-container-id-1 div.sk-toggleable__content.fitted {\n", - " /* fitted */\n", - " background-color: var(--sklearn-color-fitted-level-0);\n", - "}\n", - "\n", - "#sk-container-id-1 div.sk-toggleable__content pre {\n", - " margin: 0.2em;\n", - " border-radius: 0.25em;\n", - " color: var(--sklearn-color-text);\n", - " /* unfitted */\n", - " background-color: var(--sklearn-color-unfitted-level-0);\n", - "}\n", - "\n", - "#sk-container-id-1 div.sk-toggleable__content.fitted pre {\n", - " /* unfitted */\n", - " background-color: var(--sklearn-color-fitted-level-0);\n", - "}\n", - "\n", - "#sk-container-id-1 input.sk-toggleable__control:checked~div.sk-toggleable__content {\n", - " /* Expand drop-down */\n", - " max-height: 200px;\n", - " max-width: 100%;\n", - " overflow: auto;\n", - "}\n", - "\n", - "#sk-container-id-1 
input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {\n", - " content: \"▾\";\n", - "}\n", - "\n", - "/* Pipeline/ColumnTransformer-specific style */\n", - "\n", - "#sk-container-id-1 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {\n", - " color: var(--sklearn-color-text);\n", - " background-color: var(--sklearn-color-unfitted-level-2);\n", - "}\n", - "\n", - "#sk-container-id-1 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n", - " background-color: var(--sklearn-color-fitted-level-2);\n", - "}\n", - "\n", - "/* Estimator-specific style */\n", - "\n", - "/* Colorize estimator box */\n", - "#sk-container-id-1 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {\n", - " /* unfitted */\n", - " background-color: var(--sklearn-color-unfitted-level-2);\n", - "}\n", - "\n", - "#sk-container-id-1 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n", - " /* fitted */\n", - " background-color: var(--sklearn-color-fitted-level-2);\n", - "}\n", - "\n", - "#sk-container-id-1 div.sk-label label.sk-toggleable__label,\n", - "#sk-container-id-1 div.sk-label label {\n", - " /* The background is the default theme color */\n", - " color: var(--sklearn-color-text-on-default-background);\n", - "}\n", - "\n", - "/* On hover, darken the color of the background */\n", - "#sk-container-id-1 div.sk-label:hover label.sk-toggleable__label {\n", - " color: var(--sklearn-color-text);\n", - " background-color: var(--sklearn-color-unfitted-level-2);\n", - "}\n", - "\n", - "/* Label box, darken color on hover, fitted */\n", - "#sk-container-id-1 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {\n", - " color: var(--sklearn-color-text);\n", - " background-color: var(--sklearn-color-fitted-level-2);\n", - "}\n", - "\n", - "/* Estimator label */\n", - "\n", - "#sk-container-id-1 div.sk-label label {\n", - " font-family: 
monospace;\n", - " font-weight: bold;\n", - " display: inline-block;\n", - " line-height: 1.2em;\n", - "}\n", - "\n", - "#sk-container-id-1 div.sk-label-container {\n", - " text-align: center;\n", - "}\n", - "\n", - "/* Estimator-specific */\n", - "#sk-container-id-1 div.sk-estimator {\n", - " font-family: monospace;\n", - " border: 1px dotted var(--sklearn-color-border-box);\n", - " border-radius: 0.25em;\n", - " box-sizing: border-box;\n", - " margin-bottom: 0.5em;\n", - " /* unfitted */\n", - " background-color: var(--sklearn-color-unfitted-level-0);\n", - "}\n", - "\n", - "#sk-container-id-1 div.sk-estimator.fitted {\n", - " /* fitted */\n", - " background-color: var(--sklearn-color-fitted-level-0);\n", - "}\n", - "\n", - "/* on hover */\n", - "#sk-container-id-1 div.sk-estimator:hover {\n", - " /* unfitted */\n", - " background-color: var(--sklearn-color-unfitted-level-2);\n", - "}\n", - "\n", - "#sk-container-id-1 div.sk-estimator.fitted:hover {\n", - " /* fitted */\n", - " background-color: var(--sklearn-color-fitted-level-2);\n", - "}\n", - "\n", - "/* Specification for estimator info (e.g. 
\"i\" and \"?\") */\n", - "\n", - "/* Common style for \"i\" and \"?\" */\n", - "\n", - ".sk-estimator-doc-link,\n", - "a:link.sk-estimator-doc-link,\n", - "a:visited.sk-estimator-doc-link {\n", - " float: right;\n", - " font-size: smaller;\n", - " line-height: 1em;\n", - " font-family: monospace;\n", - " background-color: var(--sklearn-color-background);\n", - " border-radius: 1em;\n", - " height: 1em;\n", - " width: 1em;\n", - " text-decoration: none !important;\n", - " margin-left: 1ex;\n", - " /* unfitted */\n", - " border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n", - " color: var(--sklearn-color-unfitted-level-1);\n", - "}\n", - "\n", - ".sk-estimator-doc-link.fitted,\n", - "a:link.sk-estimator-doc-link.fitted,\n", - "a:visited.sk-estimator-doc-link.fitted {\n", - " /* fitted */\n", - " border: var(--sklearn-color-fitted-level-1) 1pt solid;\n", - " color: var(--sklearn-color-fitted-level-1);\n", - "}\n", - "\n", - "/* On hover */\n", - "div.sk-estimator:hover .sk-estimator-doc-link:hover,\n", - ".sk-estimator-doc-link:hover,\n", - "div.sk-label-container:hover .sk-estimator-doc-link:hover,\n", - ".sk-estimator-doc-link:hover {\n", - " /* unfitted */\n", - " background-color: var(--sklearn-color-unfitted-level-3);\n", - " color: var(--sklearn-color-background);\n", - " text-decoration: none;\n", - "}\n", - "\n", - "div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,\n", - ".sk-estimator-doc-link.fitted:hover,\n", - "div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,\n", - ".sk-estimator-doc-link.fitted:hover {\n", - " /* fitted */\n", - " background-color: var(--sklearn-color-fitted-level-3);\n", - " color: var(--sklearn-color-background);\n", - " text-decoration: none;\n", - "}\n", - "\n", - "/* Span, style for the box shown on hovering the info icon */\n", - ".sk-estimator-doc-link span {\n", - " display: none;\n", - " z-index: 9999;\n", - " position: relative;\n", - " font-weight: normal;\n", - " right: .2ex;\n", - " 
padding: .5ex;\n", - " margin: .5ex;\n", - " width: min-content;\n", - " min-width: 20ex;\n", - " max-width: 50ex;\n", - " color: var(--sklearn-color-text);\n", - " box-shadow: 2pt 2pt 4pt #999;\n", - " /* unfitted */\n", - " background: var(--sklearn-color-unfitted-level-0);\n", - " border: .5pt solid var(--sklearn-color-unfitted-level-3);\n", - "}\n", - "\n", - ".sk-estimator-doc-link.fitted span {\n", - " /* fitted */\n", - " background: var(--sklearn-color-fitted-level-0);\n", - " border: var(--sklearn-color-fitted-level-3);\n", - "}\n", - "\n", - ".sk-estimator-doc-link:hover span {\n", - " display: block;\n", - "}\n", - "\n", - "/* \"?\"-specific style due to the `` HTML tag */\n", - "\n", - "#sk-container-id-1 a.estimator_doc_link {\n", - " float: right;\n", - " font-size: 1rem;\n", - " line-height: 1em;\n", - " font-family: monospace;\n", - " background-color: var(--sklearn-color-background);\n", - " border-radius: 1rem;\n", - " height: 1rem;\n", - " width: 1rem;\n", - " text-decoration: none;\n", - " /* unfitted */\n", - " color: var(--sklearn-color-unfitted-level-1);\n", - " border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n", - "}\n", - "\n", - "#sk-container-id-1 a.estimator_doc_link.fitted {\n", - " /* fitted */\n", - " border: var(--sklearn-color-fitted-level-1) 1pt solid;\n", - " color: var(--sklearn-color-fitted-level-1);\n", - "}\n", - "\n", - "/* On hover */\n", - "#sk-container-id-1 a.estimator_doc_link:hover {\n", - " /* unfitted */\n", - " background-color: var(--sklearn-color-unfitted-level-3);\n", - " color: var(--sklearn-color-background);\n", - " text-decoration: none;\n", - "}\n", - "\n", - "#sk-container-id-1 a.estimator_doc_link.fitted:hover {\n", - " /* fitted */\n", - " background-color: var(--sklearn-color-fitted-level-3);\n", - "}\n", - "" - ], - "text/plain": [ - "RandomForestClassifier(bootstrap=False, max_features=0.0109527542096,\n", - " min_samples_leaf=15, min_samples_split=4,\n", - " n_estimators=128)" - ] - }, - 
"execution_count": 1, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "from ConfigSpace import ConfigurationSpace\n", "from ConfigSpace import ConfigurationSpace, Integer, Float, Categorical, Normal\n", @@ -501,441 +69,9 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled hyperparameters\n", - "{'bootstrap': True, 'criterion': 'entropy', 'max_features': 0.8366498702446, 'min_samples_leaf': 11, 'min_samples_split': 20, 'n_estimators': 128}\n" - ] - }, - { - "data": { - "text/html": [ - "
RandomForestClassifier(criterion='entropy', max_features=0.8366498702446,\n",
-       "                       min_samples_leaf=11, min_samples_split=20,\n",
-       "                       n_estimators=128)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" - ], - "text/plain": [ - "RandomForestClassifier(criterion='entropy', max_features=0.8366498702446,\n", - " min_samples_leaf=11, min_samples_split=20,\n", - " n_estimators=128)" - ] - }, - "execution_count": 2, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "rf_configspace = ConfigurationSpace(\n", " space = {\n", @@ -963,9 +99,9 @@ "source": [ "# TPOT Search spaces\n", "\n", - "TPOT allows you to both hyperparameter search spaces for individual methods as well as pipeline structure search spaces. For example, TPOT can create linear pipelines, trees, or graphs. \n", + "TPOT allows you to create hyperparameter search spaces for individual methods and pipeline structure search spaces. For example, TPOT can create linear pipelines, trees, or graphs. \n", "\n", - "TPOT search spaces are found in the `search_spaces` module. There are two primary kinds of search spaces, node and pipeline. Node search spaces specify the search space of a single sklearn `BaseEstimator`. Pipeline search spaces define the possible structures for a group of node search spaces. These take in node search spaces and produce a pipeline using nodes from that search space. Since sklearn Pipelines are also `BaseEstimator`, pipeline search spaces are also technically node search spaces. Meaning that pipeline search spaces can take in other pipeline search spaces in order to define more complex structures. The primary differentiating factor bewteen node and pipeline search spaces is that pipeline search spaces must take in another search space as input to feed its individual nodes. Therefore, all search spaces eventually end in a node search space at the lowest level. Note that parameters for pipeline search spaces can differ, some take in only a single search space, some take in a list, or some take in multiple defined parameters.\n", + "TPOT search spaces are found in the `search_spaces` module. 
There are two primary kinds of search spaces, node and pipeline. Node search spaces specify a single sklearn `BaseEstimator` search space. Pipeline search spaces define the possible structures for a group of node search spaces. These take in node search spaces and produce a pipeline using nodes from that search space. Since sklearn Pipelines are also `BaseEstimator`, pipeline search spaces are also technically node search spaces. This means that pipeline search spaces can take in other pipeline search spaces in order to define more complex structures. The primary differentiating factor between node and pipeline search spaces is that pipeline search spaces must take in another search space as input to feed its individual nodes. Therefore, all search spaces eventually end in a node search space at the lowest level. Note that parameters for pipeline search spaces can differ, some take in only a single search space, some take in a list, or some take in multiple defined parameters.\n", "\n", "## node search spaces\n", "\n", @@ -1004,7 +140,7 @@ "source": [ "## Node Search Space Examples\n", "\n", - "Node search spaces represent the smallest unit of an sklearn pipeline. All node search spaces create and optimize a single node, or estimator object. For example this could be a KNeighborsClassifier or a FeatureSetSelector.\n", + "Node search spaces represent the smallest unit of an sklearn pipeline. All node search spaces create and optimize a single node which exports a single estimator object. 
For example this could be a KNeighborsClassifier or a FeatureSetSelector.\n", "\n", "### EstimatorNode\n", "\n", @@ -1017,7 +153,7 @@ }, { "cell_type": "code", - "execution_count": 3, + "execution_count": 11, "metadata": {}, "outputs": [], "source": [ @@ -1053,20 +189,9 @@ }, { "cell_type": "code", - "execution_count": 4, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 4, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "knn_individual = knn_node.generate()\n", "knn_individual" @@ -1074,18 +199,9 @@ }, { "cell_type": "code", - "execution_count": 5, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled hyperparameters\n", - "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 8, 'p': 2, 'weights': 'uniform'}\n" - ] - } - ], + "outputs": [], "source": [ "print(\"sampled hyperparameters\")\n", "print(knn_individual.hyperparameters)" @@ -1100,18 +216,9 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "mutated hyperparameters\n", - "{'metric': 'minkowski', 'n_jobs': 1, 'n_neighbors': 1, 'p': 3, 'weights': 'distance'}\n" - ] - } - ], + "outputs": [], "source": [ "knn_individual.mutate() # mutate the individual\n", "print(\"mutated hyperparameters\")\n", @@ -1127,25 +234,9 @@ }, { "cell_type": "code", - "execution_count": 7, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "original hyperparameters for individual 1\n", - "{'metric': 'minkowski', 'n_jobs': 1, 'n_neighbors': 8, 'p': 1, 'weights': 'uniform'}\n", - "original hyperparameters for individual 2\n", - "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 7, 'p': 3, 'weights': 'distance'}\n", - "\n", - "post crossover 
hyperparameters for individual 1\n", - "{'metric': 'minkowski', 'n_jobs': 1, 'n_neighbors': 8, 'p': 3, 'weights': 'uniform'}\n", - "post crossover hyperparameters for individual 2\n", - "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 7, 'p': 3, 'weights': 'distance'}\n" - ] - } - ], + "outputs": [], "source": [ "knn_individual1 = knn_node.generate()\n", "knn_individual2 = knn_node.generate()\n", @@ -1175,427 +266,9 @@ }, { "cell_type": "code", - "execution_count": 8, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
KNeighborsClassifier(n_jobs=1, n_neighbors=8, p=3)
" - ], - "text/plain": [ - "KNeighborsClassifier(n_jobs=1, n_neighbors=8, p=3)" - ] - }, - "execution_count": 8, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "est = knn_individual1.export_pipeline()\n", "est" @@ -1610,427 +283,9 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
KNeighborsClassifier(n_neighbors=10)
" - ], - "text/plain": [ - "KNeighborsClassifier(n_neighbors=10)" - ] - }, - "execution_count": 9, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "import tpot2\n", "from ConfigSpace import ConfigurationSpace\n", @@ -2079,20 +334,9 @@ }, { "cell_type": "code", - "execution_count": 10, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 10, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "import tpot2\n", "from ConfigSpace import ConfigurationSpace\n", @@ -2186,437 +430,9 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline\n" - ] - }, - { - "data": { - "text/html": [ - "
LogisticRegression(C=49.6823706295106, class_weight='balanced', dual=True,\n",
-       "                   max_iter=1000, n_jobs=1, solver='liblinear')
" - ], - "text/plain": [ - "LogisticRegression(C=49.6823706295106, class_weight='balanced', dual=True,\n", - " max_iter=1000, n_jobs=1, solver='liblinear')" - ] - }, - "execution_count": 11, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "classifier_individual = classifier_node.generate()\n", "\n", @@ -2626,437 +442,9 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "mutated pipeline\n" - ] - }, - { - "data": { - "text/html": [ - "
LogisticRegression(C=997.5163561212358, class_weight='balanced', dual=True,\n",
-       "                   max_iter=1000, n_jobs=1, solver='newton-cg')
" - ], - "text/plain": [ - "LogisticRegression(C=997.5163561212358, class_weight='balanced', dual=True,\n", - " max_iter=1000, n_jobs=1, solver='newton-cg')" - ] - }, - "execution_count": 12, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "print(\"mutated pipeline\")\n", "classifier_individual.mutate()\n", @@ -3091,437 +479,9 @@ }, { "cell_type": "code", - "execution_count": 13, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline 1\n" - ] - }, - { - "data": { - "text/html": [ - "
LogisticRegression(C=2532.59836574515, class_weight='balanced', max_iter=1000,\n",
-       "                   n_jobs=1, penalty='l1', solver='saga')
" - ], - "text/plain": [ - "LogisticRegression(C=2532.59836574515, class_weight='balanced', max_iter=1000,\n", - " n_jobs=1, penalty='l1', solver='saga')" - ] - }, - "execution_count": 13, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "#same pipeline search space as before.\n", "classifier_choice = tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"])\n", @@ -3532,440 +492,9 @@ }, { "cell_type": "code", - "execution_count": 14, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline 2\n" - ] - }, - { - "data": { - "text/html": [ - "
DecisionTreeClassifier(class_weight='balanced', max_depth=20,\n",
-       "                       max_features='log2', min_samples_leaf=2,\n",
-       "                       min_samples_split=4)
" - ], - "text/plain": [ - "DecisionTreeClassifier(class_weight='balanced', max_depth=20,\n", - " max_features='log2', min_samples_leaf=2,\n", - " min_samples_split=4)" - ] - }, - "execution_count": 14, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "print(\"sampled pipeline 2\")\n", "classifier_choice.generate().export_pipeline()" @@ -3973,434 +502,9 @@ }, { "cell_type": "code", - "execution_count": 15, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline 1\n" - ] - }, - { - "data": { - "text/html": [ - "
BernoulliNB(alpha=95.0262026809266)
" - ], - "text/plain": [ - "BernoulliNB(alpha=95.0262026809266)" - ] - }, - "execution_count": 15, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "#search space for all classifiers\n", "classifier_choice = tpot2.config.get_search_space(\"classifiers\")\n", @@ -4411,440 +515,9 @@ }, { "cell_type": "code", - "execution_count": 16, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline 2\n" - ] - }, - { - "data": { - "text/html": [ - "
BaggingClassifier(bootstrap=False, bootstrap_features=True,\n",
-       "                  max_features=0.8753796928965, max_samples=0.8146576017845,\n",
-       "                  n_estimators=17, n_jobs=1)
" - ], - "text/plain": [ - "BaggingClassifier(bootstrap=False, bootstrap_features=True,\n", - " max_features=0.8753796928965, max_samples=0.8146576017845,\n", - " n_estimators=17, n_jobs=1)" - ] - }, - "execution_count": 16, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "print(\"sampled pipeline 2\")\n", "classifier_choice.generate().export_pipeline()" @@ -4861,460 +534,9 @@ }, { "cell_type": "code", - "execution_count": 17, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline\n" - ] - }, - { - "data": { - "text/html": [ - "
Pipeline(steps=[('variancethreshold',\n",
-       "                 VarianceThreshold(threshold=0.0038921831393)),\n",
-       "                ('pca', PCA(n_components=0.7545742110409)),\n",
-       "                ('logisticregression',\n",
-       "                 LogisticRegression(C=85638.13831296022,\n",
-       "                                    class_weight='balanced',\n",
-       "                                    l1_ratio=0.6102894736188, max_iter=1000,\n",
-       "                                    n_jobs=1, penalty='elasticnet',\n",
-       "                                    solver='saga'))])
" - ], - "text/plain": [ - "Pipeline(steps=[('variancethreshold',\n", - " VarianceThreshold(threshold=0.0038921831393)),\n", - " ('pca', PCA(n_components=0.7545742110409)),\n", - " ('logisticregression',\n", - " LogisticRegression(C=85638.13831296022,\n", - " class_weight='balanced',\n", - " l1_ratio=0.6102894736188, max_iter=1000,\n", - " n_jobs=1, penalty='elasticnet',\n", - " solver='saga'))])" - ] - }, - "execution_count": 17, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "selector_choicepipeline = tpot2.config.get_search_space(\"VarianceThreshold\")\n", "transformer_choicepipeline = tpot2.config.get_search_space(\"PCA\")\n", @@ -5341,443 +563,9 @@ }, { "cell_type": "code", - "execution_count": 18, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline\n" - ] - }, - { - "data": { - "text/html": [ - "
Pipeline(steps=[('selectpercentile',\n",
-       "                 SelectPercentile(percentile=97.2182140589731)),\n",
-       "                ('binarizer', Binarizer(threshold=0.7460953779809)),\n",
-       "                ('gaussiannb', GaussianNB())])
" - ], - "text/plain": [ - "Pipeline(steps=[('selectpercentile',\n", - " SelectPercentile(percentile=97.2182140589731)),\n", - " ('binarizer', Binarizer(threshold=0.7460953779809)),\n", - " ('gaussiannb', GaussianNB())])" - ] - }, - "execution_count": 18, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "selector_choicepipeline = tpot2.config.get_search_space(\"selectors\")\n", "transformer_choicepipeline = tpot2.config.get_search_space(\"transformers\")\n", @@ -5795,450 +583,9 @@ }, { "cell_type": "code", - "execution_count": 19, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline\n" - ] - }, - { - "data": { - "text/html": [ - "
Pipeline(steps=[('selectfwe', SelectFwe(alpha=0.018605231348)),\n",
-       "                ('powertransformer', PowerTransformer()),\n",
-       "                ('adaboostclassifier',\n",
-       "                 AdaBoostClassifier(algorithm='SAMME',\n",
-       "                                    learning_rate=0.2576218451608,\n",
-       "                                    n_estimators=260))])
" - ], - "text/plain": [ - "Pipeline(steps=[('selectfwe', SelectFwe(alpha=0.018605231348)),\n", - " ('powertransformer', PowerTransformer()),\n", - " ('adaboostclassifier',\n", - " AdaBoostClassifier(algorithm='SAMME',\n", - " learning_rate=0.2576218451608,\n", - " n_estimators=260))])" - ] - }, - "execution_count": 19, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "print(\"sampled pipeline\")\n", "stc_pipeline.generate().export_pipeline()" @@ -6255,440 +602,9 @@ }, { "cell_type": "code", - "execution_count": 20, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline\n" - ] - }, - { - "data": { - "text/html": [ - "
Pipeline(steps=[('minmaxscaler-1', MinMaxScaler()),\n",
-       "                ('minmaxscaler-2', MinMaxScaler()),\n",
-       "                ('minmaxscaler-3', MinMaxScaler())])
" - ], - "text/plain": [ - "Pipeline(steps=[('minmaxscaler-1', MinMaxScaler()),\n", - " ('minmaxscaler-2', MinMaxScaler()),\n", - " ('minmaxscaler-3', MinMaxScaler())])" - ] - }, - "execution_count": 20, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "import tpot2.config\n", "\n", @@ -6700,449 +616,9 @@ }, { "cell_type": "code", - "execution_count": 21, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline\n" - ] - }, - { - "data": { - "text/html": [ - "
Pipeline(steps=[('powertransformer', PowerTransformer()),\n",
-       "                ('nystroem',\n",
-       "                 Nystroem(gamma=0.9541024274994, kernel='cosine',\n",
-       "                          n_components=12)),\n",
-       "                ('variancethreshold',\n",
-       "                 VarianceThreshold(threshold=0.0109488305621))])
" - ], - "text/plain": [ - "Pipeline(steps=[('powertransformer', PowerTransformer()),\n", - " ('nystroem',\n", - " Nystroem(gamma=0.9541024274994, kernel='cosine',\n", - " n_components=12)),\n", - " ('variancethreshold',\n", - " VarianceThreshold(threshold=0.0109488305621))])" - ] - }, - "execution_count": 21, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "print(\"sampled pipeline\")\n", "linear_feature_engineering.generate().export_pipeline()" @@ -7150,472 +626,9 @@ }, { "cell_type": "code", - "execution_count": 22, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline\n" - ] - }, - { - "data": { - "text/html": [ - "
Pipeline(steps=[('pipeline',\n",
-       "                 Pipeline(steps=[('pca', PCA(n_components=0.543582719063)),\n",
-       "                                 ('quantiletransformer',\n",
-       "                                  QuantileTransformer(n_quantiles=1182)),\n",
-       "                                 ('passkbinsdiscretizer',\n",
-       "                                  PassKBinsDiscretizer(n_bins=13,\n",
-       "                                                       strategy='uniform'))])),\n",
-       "                ('randomforestclassifier',\n",
-       "                 RandomForestClassifier(bootstrap=False,\n",
-       "                                        max_features=0.078312000096,\n",
-       "                                        min_samples_leaf=7, min_samples_split=3,\n",
-       "                                        n_estimators=128))])
" - ], - "text/plain": [ - "Pipeline(steps=[('pipeline',\n", - " Pipeline(steps=[('pca', PCA(n_components=0.543582719063)),\n", - " ('quantiletransformer',\n", - " QuantileTransformer(n_quantiles=1182)),\n", - " ('passkbinsdiscretizer',\n", - " PassKBinsDiscretizer(n_bins=13,\n", - " strategy='uniform'))])),\n", - " ('randomforestclassifier',\n", - " RandomForestClassifier(bootstrap=False,\n", - " max_features=0.078312000096,\n", - " min_samples_leaf=7, min_samples_split=3,\n", - " n_estimators=128))])" - ] - }, - "execution_count": 22, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "full_search_space = tpot2.search_spaces.pipelines.SequentialPipeline([\n", " linear_feature_engineering,\n", @@ -7628,472 +641,9 @@ }, { "cell_type": "code", - "execution_count": 23, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "sampled pipeline\n" - ] - }, - { - "data": { - "text/html": [ - "
Pipeline(steps=[('pipeline',\n",
-       "                 Pipeline(steps=[('passkbinsdiscretizer',\n",
-       "                                  PassKBinsDiscretizer()),\n",
-       "                                 ('robustscaler',\n",
-       "                                  RobustScaler(quantile_range=(0.07166946516,\n",
-       "                                                               0.7478574798356))),\n",
-       "                                 ('zerocount', ZeroCount())])),\n",
-       "                ('baggingclassifier',\n",
-       "                 BaggingClassifier(bootstrap=False, bootstrap_features=True,\n",
-       "                                   max_features=0.1521715848495,\n",
-       "                                   max_samples=0.1213783267153, n_estimators=25,\n",
-       "                                   n_jobs=1))])
" - ], - "text/plain": [ - "Pipeline(steps=[('pipeline',\n", - " Pipeline(steps=[('passkbinsdiscretizer',\n", - " PassKBinsDiscretizer()),\n", - " ('robustscaler',\n", - " RobustScaler(quantile_range=(0.07166946516,\n", - " 0.7478574798356))),\n", - " ('zerocount', ZeroCount())])),\n", - " ('baggingclassifier',\n", - " BaggingClassifier(bootstrap=False, bootstrap_features=True,\n", - " max_features=0.1521715848495,\n", - " max_samples=0.1213783267153, n_estimators=25,\n", - " n_jobs=1))])" - ] - }, - "execution_count": 23, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "print(\"sampled pipeline\")\n", "full_search_space.generate().export_pipeline()" @@ -8110,437 +660,9 @@ }, { "cell_type": "code", - "execution_count": 24, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
FeatureUnion(transformer_list=[('featureagglomeration',\n",
-       "                                FeatureAgglomeration(n_clusters=257,\n",
-       "                                                     pooling_func=<function median at 0x7e7cb00dcf70>)),\n",
-       "                               ('passthrough', Passthrough())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" - ], - "text/plain": [ - "FeatureUnion(transformer_list=[('featureagglomeration',\n", - " FeatureAgglomeration(n_clusters=257,\n", - " pooling_func=)),\n", - " ('passthrough', Passthrough())])" - ] - }, - "execution_count": 24, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "transform_and_passthrough = tpot2.search_spaces.pipelines.UnionPipeline([\n", " tpot2.config.get_search_space(\"transformers\"),\n", @@ -8559,452 +681,9 @@ }, { "cell_type": "code", - "execution_count": 25, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
Pipeline(steps=[('selectfwe', SelectFwe(alpha=0.0043464782007)),\n",
-       "                ('featureunion',\n",
-       "                 FeatureUnion(transformer_list=[('powertransformer',\n",
-       "                                                 PowerTransformer()),\n",
-       "                                                ('passthrough',\n",
-       "                                                 Passthrough())])),\n",
-       "                ('lineardiscriminantanalysis',\n",
-       "                 LinearDiscriminantAnalysis(shrinkage=0.6381012799603,\n",
-       "                                            solver='lsqr'))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" - ], - "text/plain": [ - "Pipeline(steps=[('selectfwe', SelectFwe(alpha=0.0043464782007)),\n", - " ('featureunion',\n", - " FeatureUnion(transformer_list=[('powertransformer',\n", - " PowerTransformer()),\n", - " ('passthrough',\n", - " Passthrough())])),\n", - " ('lineardiscriminantanalysis',\n", - " LinearDiscriminantAnalysis(shrinkage=0.6381012799603,\n", - " solver='lsqr'))])" - ] - }, - "execution_count": 25, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "stc_pipeline2 = tpot2.search_spaces.pipelines.SequentialPipeline([\n", " tpot2.config.get_search_space(\"selectors\"),\n", @@ -9024,472 +703,9 @@ }, { "cell_type": "code", - "execution_count": 26, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
Pipeline(steps=[('featureunion',\n",
-       "                 FeatureUnion(transformer_list=[('pipeline-1',\n",
-       "                                                 Pipeline(steps=[('selectfwe',\n",
-       "                                                                  SelectFwe(alpha=0.000231370784)),\n",
-       "                                                                 ('zerocount',\n",
-       "                                                                  ZeroCount())])),\n",
-       "                                                ('pipeline-2',\n",
-       "                                                 Pipeline(steps=[('selectpercentile',\n",
-       "                                                                  SelectPercentile(percentile=56.9207229949532)),\n",
-       "                                                                 ('rbfsampler',\n",
-       "                                                                  RBFSampler(gamma=0.9667449310006,\n",
-       "                                                                             n_components=45))]))])),\n",
-       "                ('linearsvc', LinearSVC(C=8596.144097926976, penalty='l1'))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" - ], - "text/plain": [ - "Pipeline(steps=[('featureunion',\n", - " FeatureUnion(transformer_list=[('pipeline-1',\n", - " Pipeline(steps=[('selectfwe',\n", - " SelectFwe(alpha=0.000231370784)),\n", - " ('zerocount',\n", - " ZeroCount())])),\n", - " ('pipeline-2',\n", - " Pipeline(steps=[('selectpercentile',\n", - " SelectPercentile(percentile=56.9207229949532)),\n", - " ('rbfsampler',\n", - " RBFSampler(gamma=0.9667449310006,\n", - " n_components=45))]))])),\n", - " ('linearsvc', LinearSVC(C=8596.144097926976, penalty='l1'))])" - ] - }, - "execution_count": 26, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "st_pipeline = tpot2.search_spaces.pipelines.SequentialPipeline([\n", " tpot2.config.get_search_space(\"selectors\"),\n", @@ -9522,427 +738,9 @@ }, { "cell_type": "code", - "execution_count": 27, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
FeatureUnion(transformer_list=[('powertransformer', PowerTransformer())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" - ], - "text/plain": [ - "FeatureUnion(transformer_list=[('powertransformer', PowerTransformer())])" - ] - }, - "execution_count": 27, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "dynamic_transformers = tpot2.search_spaces.pipelines.DynamicUnionPipeline(tpot2.config.get_search_space(\"transformers\"), max_estimators=4)\n", "dynamic_transformers.generate().export_pipeline()" @@ -9957,454 +755,9 @@ }, { "cell_type": "code", - "execution_count": 28, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
FeatureUnion(transformer_list=[('featureunion',\n",
-       "                                FeatureUnion(transformer_list=[('rbfsampler',\n",
-       "                                                                RBFSampler(gamma=0.3736377055485,\n",
-       "                                                                           n_components=3)),\n",
-       "                                                               ('quantiletransformer',\n",
-       "                                                                QuantileTransformer(n_quantiles=955)),\n",
-       "                                                               ('fastica',\n",
-       "                                                                FastICA(algorithm='deflation',\n",
-       "                                                                        n_components=92))])),\n",
-       "                               ('passthrough', Passthrough())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" - ], - "text/plain": [ - "FeatureUnion(transformer_list=[('featureunion',\n", - " FeatureUnion(transformer_list=[('rbfsampler',\n", - " RBFSampler(gamma=0.3736377055485,\n", - " n_components=3)),\n", - " ('quantiletransformer',\n", - " QuantileTransformer(n_quantiles=955)),\n", - " ('fastica',\n", - " FastICA(algorithm='deflation',\n", - " n_components=92))])),\n", - " ('passthrough', Passthrough())])" - ] - }, - "execution_count": 28, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "dynamic_transformers_with_passthrough = tpot2.search_spaces.pipelines.UnionPipeline([\n", " dynamic_transformers,\n", @@ -10416,469 +769,9 @@ }, { "cell_type": "code", - "execution_count": 29, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
Pipeline(steps=[('selectpercentile',\n",
-       "                 SelectPercentile(percentile=1.2220353454141)),\n",
-       "                ('featureunion',\n",
-       "                 FeatureUnion(transformer_list=[('featureunion',\n",
-       "                                                 FeatureUnion(transformer_list=[('rbfsampler',\n",
-       "                                                                                 RBFSampler(gamma=0.0989706913466,\n",
-       "                                                                                            n_components=61)),\n",
-       "                                                                                ('passkbinsdiscretizer',\n",
-       "                                                                                 PassKBinsDiscretizer(n_bins=42))])),\n",
-       "                                                ('passthrough',\n",
-       "                                                 Passthrough())])),\n",
-       "                ('bernoullinb',\n",
-       "                 BernoulliNB(alpha=7.5106513153016, fit_prior=False))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" - ], - "text/plain": [ - "Pipeline(steps=[('selectpercentile',\n", - " SelectPercentile(percentile=1.2220353454141)),\n", - " ('featureunion',\n", - " FeatureUnion(transformer_list=[('featureunion',\n", - " FeatureUnion(transformer_list=[('rbfsampler',\n", - " RBFSampler(gamma=0.0989706913466,\n", - " n_components=61)),\n", - " ('passkbinsdiscretizer',\n", - " PassKBinsDiscretizer(n_bins=42))])),\n", - " ('passthrough',\n", - " Passthrough())])),\n", - " ('bernoullinb',\n", - " BernoulliNB(alpha=7.5106513153016, fit_prior=False))])" - ] - }, - "execution_count": 29, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "stc_pipeline3 = tpot2.search_spaces.pipelines.SequentialPipeline([\n", " tpot2.config.get_search_space(\"selectors\"),\n", @@ -10902,433 +795,9 @@ }, { "cell_type": "code", - "execution_count": 30, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
ExtraTreesClassifier(bootstrap=True, class_weight='balanced',\n",
-       "                     criterion='entropy', max_features=0.5945413838121,\n",
-       "                     min_samples_leaf=4, min_samples_split=8, n_jobs=1)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" - ], - "text/plain": [ - "ExtraTreesClassifier(bootstrap=True, class_weight='balanced',\n", - " criterion='entropy', max_features=0.5945413838121,\n", - " min_samples_leaf=4, min_samples_split=8, n_jobs=1)" - ] - }, - "execution_count": 30, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "SelectFromModel_configspace_part = ConfigurationSpace(\n", " space = {\n", @@ -11342,449 +811,9 @@ }, { "cell_type": "code", - "execution_count": 31, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
SelectFromModel(estimator=ExtraTreesClassifier(bootstrap=True,\n",
-       "                                               class_weight='balanced',\n",
-       "                                               criterion='entropy',\n",
-       "                                               max_features=0.0157364601821,\n",
-       "                                               min_samples_leaf=14,\n",
-       "                                               min_samples_split=6, n_jobs=1),\n",
-       "                threshold=0.947435367985)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" - ], - "text/plain": [ - "SelectFromModel(estimator=ExtraTreesClassifier(bootstrap=True,\n", - " class_weight='balanced',\n", - " criterion='entropy',\n", - " max_features=0.0157364601821,\n", - " min_samples_leaf=14,\n", - " min_samples_split=6, n_jobs=1),\n", - " threshold=0.947435367985)" - ] - }, - "execution_count": 31, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "from sklearn.ensemble import ExtraTreesClassifier\n", "from sklearn.feature_selection import SelectFromModel\n", @@ -11815,430 +844,9 @@ }, { "cell_type": "code", - "execution_count": 32, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
EstimatorTransformer(estimator=KNeighborsClassifier(n_jobs=1, n_neighbors=10,\n",
-       "                                                    p=1))
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" - ], - "text/plain": [ - "EstimatorTransformer(estimator=KNeighborsClassifier(n_jobs=1, n_neighbors=10,\n", - " p=1))" - ] - }, - "execution_count": 32, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "classifiers = tpot2.config.get_search_space(\"classifiers\")\n", "wrapped_estimators = tpot2.search_spaces.pipelines.WrapperPipeline(tpot2.builtin_modules.EstimatorTransformer, {}, classifiers)\n", @@ -12249,24 +857,9 @@ }, { "cell_type": "code", - "execution_count": 33, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "array([[0.3, 0.7],\n", - " [0.5, 0.5],\n", - " [0.8, 0.2],\n", - " [0.7, 0.3],\n", - " [0.5, 0.5]])" - ] - }, - "execution_count": 33, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "import numpy as np\n", "X, y = np.random.rand(100, 10), np.random.randint(0, 2, 100)\n", @@ -12283,24 +876,9 @@ }, { "cell_type": "code", - "execution_count": 34, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "array([[1],\n", - " [1],\n", - " [0],\n", - " [0],\n", - " [1]])" - ] - }, - "execution_count": 34, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "classifiers = tpot2.config.get_search_space(\"classifiers\")\n", "wrapped_estimators_cv = tpot2.search_spaces.pipelines.WrapperPipeline(tpot2.builtin_modules.EstimatorTransformer, {'cross_val_predict_cv':10, 'method':'predict'}, classifiers)\n", @@ -12317,594 +895,9 @@ }, { "cell_type": "code", - "execution_count": 35, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
Pipeline(steps=[('maxabsscaler', MaxAbsScaler()),\n",
-       "                ('featureunion-1',\n",
-       "                 FeatureUnion(transformer_list=[('featureunion',\n",
-       "                                                 FeatureUnion(transformer_list=[('rbfsampler',\n",
-       "                                                                                 RBFSampler(gamma=0.6486019086026,\n",
-       "                                                                                            n_components=82)),\n",
-       "                                                                                ('nystroem',\n",
-       "                                                                                 Nystroem(gamma=0.185797439118,\n",
-       "                                                                                          kernel='additive_chi2',\n",
-       "                                                                                          n_components=87))])),\n",
-       "                                                ('passthrough',\n",
-       "                                                 Passthrough())])),\n",
-       "                ('featureunion-2',\n",
-       "                 Feat...\n",
-       "                                                                                                                              missing=nan,\n",
-       "                                                                                                                              monotone_constraints=None,\n",
-       "                                                                                                                              multi_strategy=None,\n",
-       "                                                                                                                              n_estimators=100,\n",
-       "                                                                                                                              n_jobs=1,\n",
-       "                                                                                                                              nthread=1,\n",
-       "                                                                                                                              num_parallel_tree=None, ...),\n",
-       "                                                                                                      method='predict'))])),\n",
-       "                                                ('passthrough',\n",
-       "                                                 Passthrough())])),\n",
-       "                ('randomforestclassifier',\n",
-       "                 RandomForestClassifier(bootstrap=False, criterion='entropy',\n",
-       "                                        max_features=0.6976552018012,\n",
-       "                                        min_samples_leaf=8,\n",
-       "                                        min_samples_split=16,\n",
-       "                                        n_estimators=128))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" - ], - "text/plain": [ - "Pipeline(steps=[('maxabsscaler', MaxAbsScaler()),\n", - " ('featureunion-1',\n", - " FeatureUnion(transformer_list=[('featureunion',\n", - " FeatureUnion(transformer_list=[('rbfsampler',\n", - " RBFSampler(gamma=0.6486019086026,\n", - " n_components=82)),\n", - " ('nystroem',\n", - " Nystroem(gamma=0.185797439118,\n", - " kernel='additive_chi2',\n", - " n_components=87))])),\n", - " ('passthrough',\n", - " Passthrough())])),\n", - " ('featureunion-2',\n", - " Feat...\n", - " missing=nan,\n", - " monotone_constraints=None,\n", - " multi_strategy=None,\n", - " n_estimators=100,\n", - " n_jobs=1,\n", - " nthread=1,\n", - " num_parallel_tree=None, ...),\n", - " method='predict'))])),\n", - " ('passthrough',\n", - " Passthrough())])),\n", - " ('randomforestclassifier',\n", - " RandomForestClassifier(bootstrap=False, criterion='entropy',\n", - " max_features=0.6976552018012,\n", - " min_samples_leaf=8,\n", - " min_samples_split=16,\n", - " n_estimators=128))])" - ] - }, - "execution_count": 35, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "dynamic_wrapped_classifiers_with_passthrough = tpot2.search_spaces.pipelines.UnionPipeline([\n", " tpot2.search_spaces.pipelines.DynamicUnionPipeline(wrapped_estimators_cv, max_estimators=4),\n", @@ -12945,7 +938,7 @@ }, { "cell_type": "code", - "execution_count": 36, + "execution_count": 44, "metadata": {}, "outputs": [], "source": [ @@ -12961,20 +954,9 @@ }, { "cell_type": "code", - "execution_count": 37, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAnYAAAHWCAYAAAD6oMSKAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8fJSN1AAAACXBIWXMAAA9hAAAPYQGoP6dpAAAamklEQVR4nO3df5DV9X3v8df+QGRBhBUFREET0fgLDKJWMDNhEtskJljHG2NyJ2kbb72JN5o7nTh1dJIab20mqdGpTU2aZqapd6yx0WQuamyaVmsrWINgMCpR/BF+s4KLILsg7O65fwhEBCIICPv28fiL/Z5zvt/P2cPM57nn+6up0Wg0AgBAv9e8vwcAAMDeIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgiNb9PYC3Q29vbzo7O9PR0ZGOjo6sXLEir65fn77e3jS3tGTgoEE5fNSojBw5MiNHjkx7e3taWlr297ABgN/C/L69pkaj0djfg9hXVq9enXnz5uWXc+dmQ1dXGj09GbJ+fQ7t7MyAnp40Nxrpa2rKptbWrGlvz7pBg9LU2pqDBw/OqZMmZeLEiRk+fPj+fhsAwOuY33euZNgtW7Yssx56KC8sWJAB3d0Zu2hxRnd25tCurgzo7d3p6za1tGTN4MF
Z3t6eRWOPzqa2thw7fnymvu99GT169Nv4DgCANzK/v7lSYdfT05OZM2dm9syZGbJqVY5buChHrVqVlr6+3V5Xb3NzlowYkWfHjc26ESNyxtSpmTp1alpb3xF7rwHggGF+33Vlwm7FihW5d8aMrF6yNO9ZsCDjly5N8154a31NTVkwZkx+NX582o8ak49Mn55Ro0bthREDAG/G/L57SoTdwoUL8+M77kjbsuU5ff78DO3u3uvbWNvWljknnpjuI4/MBZ+4KOPGjdvr2wAAfsP8vvv6fdgtXLgwd91+ew5buChnPvVUWt/C17K7qqe5OY+cfFI6x47NhZ/8ZL//8AHgQGV+f2v69XXsVqxYkR/fcUfaFy7K7zz55D790JOkta8vZz/xZNoXLcqP7/inrFixYp9uDwDeiczvb12/Dbuenp7cO2NG2pYtz1lPPbVX9rfviuZGI2c9+VQGLV+Wn8yYkZ6enrdluwDwTmB+3zP9NuxmzpyZ1UuW5vT58/d5yb9Ra19fTn9qfjqXLs2sWbPe1m0DQGXm9z3TL8Nu2bJlmT1zZt6zYME+OZByVxza3Z0TnlmQnz/0UJYvX75fxgAAlZjf91y/DLtZDz2UIatWZfzSpft1HMcvXZohq1Zl5kMP7ddxAEAF5vc91+/CbvXq1XlhwYIct3DR27bffWeaG428e+GivPDMM1m9evV+HQsA9Gfm972j34XdvHnzMqC7O0etWrW/h5IkOXrVqrR2d+fxxx/f30MBgH7L/L539Kuw6+3tzS/nzs3YRYvf0m1E9oWWvr6MW7w4j8+Zk97fcp86AGDHzO97zz4Nu0cffTRXXnnlTh+fMWNGbrrppl1eX2dnZzZ0deWZp57K9MfmZvpjc3PyzIfysblzMv2xufnekiV7NN5nurrymV8+nnMfnZ0PzXk01z77bDb19eXmhQvzf5ct2+nrRr/02rjuueeenH/++ZkyZUpmzJixw+d+5zvfyR133JEkefDBB3PyySfnrLPO2u3fxc5873vfy/jx49PU1JR169bt8foAYE9cd911OeWUU3Lqqadm8uTJeeGFF7Z7zpb5/bN3/OAtbePvlize5ucTH/rPrZ0w/bG52fgWY3HL/N7Z2bnN8nvuuSennHJKmpub88QTT7ylde8ru3zH2+9+97u59NJLd2vlkydPzuTJk3f6+PTp03drfR0dHWn09OT8IUNy4XsnJUmmzf55fjDxtAxuadn6vEajkUaS5qamXV73+t7efH7+U7nu3cdl6vDhaTQauXvlymzchf38gzo7s37dulxzzTV58sknkyQXXnhhli9fnhEjRmzz3M997nNb/3377bfny1/+ci6++OJdHucWvb29aXnde97irLPOyr/8y79k2rRpu71OANibZs2alX//93/PL37xi7S2tmbJkiUZPHjwds/bMr83vcVj6/5uyZL88VFHb/35kNbWzNjcCXvi0K6uNHp60tHRkcMPP3zr8hNOOCF33nnnNnP6gWKXw+6WW27JpZdemnXr1uWyyy7L/Pnz02g08ld/9VeZOnVq1q5dm89//vP55S9/mebm5txyyy3ZuHFjvvWtb+XOO+/MAw88kCuuuCLNzc0ZMGBAHn300Xz/+9/PE088kRtuuCHPP/98PvvZz6azszPHHHNMvv/976e9vT3vf//7c9ZZZ+X+++/PqlWr8t+nTdvpdW3O/K+H8/FRo/Lwyy/nmyeckH9etSo/e+mlbOrry6dGH5lPjh6dJPn24kXbLb975cqcMfTQTB0+PEnS1NSU6Uccsd02bl++PHd2rMirfX05sa0tXzpsRHp7NmX2fz60zV8hPT09+drXvpa77747AwYMyNixY3PrrbfmG9/4Rg477LAMGTIkd9xxR+67777cd999mTx5cubPn5+vfvWrefHFF/OlL30py5cvz8CBA3PTTTdl/PjxufzyyzN8+PA8/vjjmTZtWr74xS9uN74tNzDu6+vLypUrs379+l39iAFgr3r
66afT1ta29QSEgQMHpqenJ7fffnu++c1vZsOGDZk0aVIuvPDCDO567fImfZvn+O8sWZx/fakzGxuvzdWf2jyHf2vRwty3alWa05SPjxqZVRs35ZWenkx/bG4mDR2aa9993A7H8pG5c3LXxNOSJKf/18O57dQJee/QoZn+2NzcduqENDc15dpnn81z618bxzXveldOH3pohqxfn46Ojpxyyilb1zV+/Ph98vvaG3Y57J5++ukkyZ//+Z/nggsuyK233polS5bkvPPOy7x583LdddflmGOOyW233Zbe3t50dXVl7ty5W19/44035sYbb8y5556bNWvWbLf+K664IpdddlkuuuiifP3rX8+1116bm2++Oclr/xFmz56d/3nppXnowQfzB0eM3OEYX+7pyeShh+bKY47Nf6zuzEsbN+VHp703G/v68snH52Vae3ue6e7a4fJnu7tz4g7+inijD48YkYtHjcqLL76Yr69YngfXvJxzBg/O/XMezXkf/Wh++KMfbfOet3jqqadyxA5C8eWXX86tt96aW2+9NUny7W9/e7vnTJ06dbtls2bNyvXXX/9bx/qud73rTd8PAOxrO5r/tnjyySfz3IIF+YNjj01fX19WdKzII93dWdzdnW+NGplNjUYuX7I4Z7e15de9Pfn5mjX58WnvzUHNzXl506YMGzAgP1ixfJtv6LaEXpKcdsghue648Zkw5JDMe+WVNJKc0DY4c9auzXFtbUle+4bvL3/9Qs497LD85YgTsuLVV/PHTz6ZuydNytDO1VnZj24xtstht8XPfvaz/OQnP8lXv/rVJMlLL72UjRs35v777996XFlLS0uGDh26zeumTp2aq666KvPnz8/HP/7xHHroods8Pnv27Nx9991Jkk9/+tM577zztj52/vnnJ0mOGj06s9auTXYSdgc3N2dae3uSZObql3N/Z2d+vva1iFzX05NFG9bvdHnSyK7suf1VV1dueP65vNLTk1d6ezN6wICcM3hw3jV8eB6ZPfvNVwAAbOPVDRvSunHj1p/ndHdnVldXfrF5r1NXX19+9dJLeSzJhSNH5aDm104RGDZgwA7Xt6NdsZOGDs2ctWvTSCP/46ijcu/KlTmurS3vPeSQJMms1S/nPzo7863Fi5IkL/dsysa+vhzU05MNGzbs7be8z+xy2L3nPe9J8trxa/fcc0/Gjh27Wxu66qqr8uEPfzj33ntvzjjjjDzyyCPbPN70uqpqNBrb/Dxw4MAtD6Tvt+x/P7j5N+eCNJJcPnZsLhi5bQT+60udO1z+wvr1eWztK2/6Pq5esCA3jhuXI3p7ctvq1VuPwfvfU6bkR11dWbR48ZusAQB4vUkTJ6Zp7dqtPzeS/GF7e35vc3QlSVOa8tgeHF40aejQfO3559PUlPzhkWNy+/LlmbN2bU4feujmbTbytyednCMPPnib1zU3+tLbj+4bu8tnxX7hC19Iknzwgx/M3/zN32xdPm/evK3Lt+xG7O3tzdrXfUBJ8txzz2XixIm5+uqrc+KJJ253VszkyZNz1113JUn+8R//Me973/u2H+wOThbYmSnDhuXOjhXZsPkU5ee7u/NqX99Ol08//Ij8fM2azHr5teMAGo1GfrhiRbrecIrz+r7ejBs2LD1NTXmwqytJ0tdo5KX16zPyt3zVDADs2JpXXknjdV/OnD5oUH6ydm1e3Xy83ZKenhw8dGimDBuWuzpWbD3L9eVNm5IkLU1N6X2TEy/ePWhQfr1hfV7t68uQ1tYc2zYo/+/FjkzavIdxyrDhue11txCbv/nKEn1NzWlp3e0dnPvNLo/0kksuSZJ85StfyeWXX55TTz01vb29+cAHPpC//uu/zpe//OV87nOfy6mnnpqWlpbtjhW76aab8sADD6SlpSVnnHFGzj777Dz77LNbH7/55pvzR3/0R7nuuusybty4/MM//MN2Yzho4MD07eKZru9vb8+C7q78t3m/SCPJYQMG5DsnnbzT5W0tLbnlpJN
y/fPP5dpnn0tLU/I7w4bl998Qa5cdPTYXzpuXow4emJMPOSQtjaQvyXcffTSrNv8H2+LMM8/MK6+8kkajkQ996EO5+uqrt548cckll+Tyyy/Pxz72sfzu7/5ufvCDH2xz8sSVV16ZX//61+nt7c1FF12UK664Ypvn78xtt92Wb3zjG3nxxRdz+OGH58ILL8yf/dmf7dLvDAD2pnnz5uWqq67aevmtCRMm5IYbbsjDDz+cv/iLv8imTZsyYMCAnHTKKTloydI0Nzdn1MhR+f0kq5YuyRc6OpJG0n7QgPztkWPy/sGD8+S6dfn9XzyW1qamfHzkqHz6yCNzwREj89G5c3LWsGE7PXmiqakp49vaMmbga9/ITTpkaB7o7MxRm7+h+19jx+b/PPdcPjp3TnobjZw9bFi+MuS4bGxtzUFv+Bbvpz/9aS655JKsXLkyH/zgBzNt2rTcfvvt++z3uDuaGo39fN+O3fBv//ZvefqnP825D//X/h7KNl599dX885ln5L7583P//fcnSVpbW3d4uRMAYFsH6vyeJD87+3dywu/9Xj7wgQ/s76Hskv7z3WKSkSNHZs6gQdnU0pIBB9BVoJvb2tK3+Vu4Qw45JCtXrsyf/umfijoA2AUH6vy+qaUl6wYNysiROz5p80DU78KuqbU1awYPzog3HMO3P60ZPDhNra0599xz86lPfept2eb111+fH/7wh9ss+5M/+ZN85jOfeVu2DwB7y4E6v9/W2Zm//+53c+tdd6V183F2F198ca666qr9PLKd61dh197enoMHD87y9vYD6oNffthr42rffKmVt8M111yTa6655m3bHgDsKwfq/H76aRMz6rTTctkXv7jDuz0diPbpvWL3tpaWlpw6aVIWjT06vc0HxtB7m5uz8OijM+H00/vNhw4ABxLz+95zYPz2dsPEiROzqa0tSw6Q49cWjxiRnra2TJgwYX8PBQD6LfP73tHvwm748OE5dvz4PDtu7C5f+mRf6WtqynPjxubY44/P8M33mAUAdp/5fe/od2GXJFPf976sGzEiC8aM2a/jeGbMmKwbMSJTzzlnv44DACowv++5fhl2o0ePzhlTp+ZX48dn7eYb+L7d1rS15enjx+fMc87J6NGj98sYAKAS8/ue65dhlyRTp07N8KPGZM6JJ6bnbT7Qsqe5OXNOOjHtY8ZkypQpb+u2AaAy8/ue6bdh19ramvOmT0/3kUfmkZNPetv2x/c1NeWRk0/K+tFH5iPTp2+9rg0AsOfM73um34ZdkowaNSoXfOKidI4dm4dPOXmfl31Pc3MePuXkdI4dmws+cVFGjRq1T7cHAO9E5ve3rl/dK3ZnFi5cmB/f8U9pW7Ysp8+fn6Hd3Xt9G2va2jLnpBOzfvSRueATF2XcuHF7fRsAwG+Y33dfibBLkhUrVuTeGTOyesnSvGfBgoxfujTNe+Gt9TU15ZkxY/L08ePTPmZMPjJ9er8ueQDoT8zvu6dM2CVJT09PZs6cmdkzZ2bIqlV598JFOXrVqrT09e32unqbm7N4xIg8N25s1o0YkTPPOSdTpkzpt/vcAaC/Mr/vulJht8WyZcsya+bMvPDMM2nt7s64xYsz+qXOHNrVlQG9vTt93aaWlqwZPDjLD2vPwqOPTk9bW449/vhM7aenPANAJeb3N1cy7LZYvXp1Hn/88Tw+Z042dHWl0dOTIevXZ2jn6hzU05PmRl/6mpqzsbU1a9uHZ92gQWlqbc3BgwdnwumnZ8KECf3uitMAUJ35fedKh90Wvb296ezsTEdHRzo6OrJyxYps3LAhvT09aWltzUEHH5zDR43KyJEjM3LkyLS3t/erG/4CwDuR+X1774iwAwB4J+jX17EDAOA3hB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKEL
YAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKELYAQAUIewAAIoQdgAARQg7AIAihB0AQBHCDgCgCGEHAFCEsAMAKOL/A4WAVwMLOeW0AAAAAElFTkSuQmCC", - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], + "outputs": [], "source": [ "est1 = ind.export_pipeline()\n", "est1.plot() #GraphPipelines have a helpful plotting function to visualize the pipeline" @@ -12989,110 +971,9 @@ }, { "cell_type": "code", - "execution_count": 38, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "image/png": "iVBORw0KGgoAAAANSUhEUgAAAnYAAAHWCAYAAAD6oMSKAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8fJSN1AAAACXBIWXMAAA9hAAAPYQGoP6dpAABulElEQVR4nO3deUBU5f4G8GcWdkQYUEAU3BARkE1cAPfSUqPMAjRTyzLNa3rLsm6WXcurLbfyZ5aWlTtii4maLeYKuCC7iEouINsIDIvszPL7Q5vrNLgPHGZ4Pn/l9x3OecbK8/V9zzmvSKPRaEBERERERk8sdAAiIiIiMgw2dkREREQmgo0dERERkYlgY0dERERkItjYEREREZkINnZEREREJoKNHREREZGJYGNHREREZCLY2BERERGZCDZ2RERERCaCjR0RERGRiWBjR0RERGQi2NgRERERmQg2dkREREQmgo0dERERkYlgY0dERERkItjYEREREZkINnZEREREJoKNHREREZGJYGNHREREZCLY2BERERGZCDZ2RERERCaCjR0RERGRiWBjR0RERGQi2NgRERERmQg2dkREREQmgo0dERERkYlgY0dERERkIqRCByAiMiSVSgWFQgG5XA65XI6S4mI01NVBrVJBLJHAwsoKnVxc4OzsDGdnZ8hkMkgkEqFjExEZhEij0WiEDkFEdL/Ky8uRnp6OzJQU1NfUQKNUwrauDh0VCpgplRBrNFCLRGiSSlEpk6HaygoiqRSWNjbwCwqCv78/HBwchP4aRET3hY0dERm1wsJCJMbH42JODsxqa+GedxmuCgU61tTATKW66c81SSSotLFBkUyGPPduaLK2Rg9PT4QNHQpXV9dW/AZERIbDxo6IjJJSqURCQgKSEhJgW1qK3rl56FpaColafdfHUonFyHdywp8e7qh2ckJIWBjCwsIglfJuFSIyLmzsiMjoFBcXY09cHMrzC9A3JweeBQUQG+CPMrVIhBw3N5zx9ISsqxvGRUTAxcXFAImJiFoHGzsiMiq5ubnYERsL68IiBGdnw6621uDnqLK2RrK3N2q7dMHEqEh4eHgY/BxERC2BjR0RGY3c3Fz8EBMDx9w8DDx9GtJ7WHa9U0qxGMd9+kHh7o5JkyezuSMio8D32BGRUSguLsaO2FjIcvMwOCurRZs6AJCq1RhyKguyvDzsiN2O4uLiFj0fEZEhsLEjojZPqVRiT1wcrAuLMOj0aYPcT3cnxBoNBmWdhlVRIX6Oi4NSqWyV8xIR3Ss2dkTU5iUkJKA8vwDB2dktPlP3d1K1GsGns6EoKEBiYmKrnpuI6G6xsSOiNq2wsBBJCQnom5PTIg9K3ImOtbXwOpeDE/HxKCoqEiQDEdGdYGNHRG1aYnw8bEtL4VlQIGiOPgUFsC0tRUJ8vKA5iIhuhY0dEbVZ5eXluJiTg965ea12X93NiDUa9MrNw8Vz51BeXi5oFiKim2FjR0RtVnp6Osxqa9G1tFToKACAbqWlkNbWIiMjQ+goRETNYmNHRG2SSqVCZkoK3PMu39M2YS1BolbD4/JlZCQnQ3WLfWiJiITCxo6I2iSFQoH6mhq4KhRCR9HhWnYtl6KN5SIiA
tjYEdFtDB8+HIcPH9apzZkzB2vWrLntz548eRKvvvrqPZ1XLpdDo1TCvrr6tp999lQmIlJTMDzpBAYfP4aI1BREpKbgbE0NHk9Lvafz30zHmhq8s3w55HL5Hf9M9+7dUd3M95gxYwZ2795905+bOHEiHBwc8MQTT9xTViJqf9jYEdEtRUZGYvv27dpfq1QqxMXFYdKkSbf8OZVKhQEDBuDDDz+8p/PK5XLY1tXd0XvrvvH1Q1xgEOa7e+Cxzp0RFxiEuMAg2Egkd3Qu1V08mGGmUkF0PV9Le+mll7Bx48YWPw8RmQ42dkR0S0888QR++uknqK83WIcOHUKfPn0QFRWFoKAgBAYGIv76K0AOHjyIMWPGIDIyEiNHjsTBgwe1s03Hjh1DaGgoAgMDMWrUKO374N555x0899xzGDZsGHr27Ilt27YBAEqKi7H/jz8wPiUZj6SkYGPhtdedHFQo8GR6GiJSU7A4Jwfq2zRlTWoNXjt3Fg8ln8T8M9n4a3vskUkn8FleLqLS03C8sgI/yIsxKS0Vj6QkY2XuJQBAjUqFmadOYUJKMiakJOPI9adhRWo1Pvn4Y/j5+WH06NGoqakBAKSkpGDgwIHo378/pk2bhvr6er08b731Fry9vTF+/HhcuXLlltlHjhyJDh063PpfEBHRDdjYEdEtOTs7o0+fPjhy5AgAYPv27ZgyZQp27tyJlJQU7Ny5E//85z+1nz9+/Dg+/fRTveXbfv364ciRI0hNTcVzzz2HDz74QDt28eJF7N+/H7///jsWL1587TjHjuFsYSF2BARiV1AQIjp1hqKpCd8WFGCzX3/EBQbBTCzCz6Ult8x/oa4WL3Tthr1BwShrbMLJqirtmL3UDLH+Aehsbo5DinJs9w/AzsAgnK6uQWpVFeLLy2FvJsXuoGDsCgxC4PUmq7qhAb7e3sjMzISbmxt+/PFHAMD06dOxatUqZGRkwMbGBp9//rlOlhMnTuCXX35Beno61q1bx50siMjg2NgR0W1FRUXhu+++g0qlwq5duxAREYHXXnsNfn5+iIiIwOnTp7WfDQsLQ5cuXfSOUV5ejokTJ8LX1xdLly7V+Zlx48ZBKpWiV69eqKioAACczs7GyN69YS6+9seUvZkZ0qqqcLa2Rjtjl1hRgfz6hltm72FlhV7W1hCJROhna4OChv/Noj3s5AQASKyoQOrVKkxMS8Vjaak4X1eLvPp69LGxxsmqKnxw8SLSrl6FrVQKALCUSuHl6QkACA4OxqVLl1BZWYmGhgYMGjQIAPD0009rm+G/JCYmYuLEiTA3N4erqytGjRp1R7//RER3Sip0ACJq+yZNmoT33nsPjz76KPr374+ff/4ZNTU1SE1NhUQigbW1tfazN/7zjd5++22MHz8eL7zwAo4dO4bXX39dO2ZhYaH3eZFIhL8vsmoAjHSQYXmfPnec/a/GEADEIhHUNxzU8oZ78KJcXPAPdw+9n/8pIBAHFQq8d+E8HuvsjKe7dIGZRALJ9SZPIpFApVJpl3i1WTUaiESi29aIiAyJM3ZEdFtOTk7w9vbGK6+8gsjISFRVVcHZ2RlSqRTff/99s/eS/V1VVRW6du0KANi8efNtP+8fEID958+j8fq9fRVNTQjo0AHHKytQ1HBtlq68qQnFDbeesbsTgzva4+fSUlQqmwAAxQ0NKG9qgryhAdYSCSY6O2N6Fzdk11x7slUDwNzSUucY9vb2sLCwQFJSEgBg69atGDp0qM5nwsLCsGPHDjQ2NqK4uBgHDhy47+xERDfijB0R3ZGoqCjMmTMHjz32GJRKJcaPH4+BAwciPDwcjo6Ot/35hQsXYsaMGVi+fDlCQ0Nv+/kxDz2E00eP4rHUVEhFIjzp7IKnu3TBO71748XTp6HUqCEVifGepydcmpnxuxt9bGzwvFtXTM3IhAYa2Egk+MSrL87X1eH9ixcgFolgKRbjP9eXXzViMTq5uOgdZ/369ZgzZw7q6+sREBCAOXPm6IwPH
DgQY8eORf/+/eHl5YVhw4bdMtfYsWORkpKCmpoadO3aFTt27EBISMh9fVciMm0izd/XD4iI2oBTp07h5+++w4RDh2HWhnZ5aJJIsHv4MIx78kn4+voKHYeISAeXYomoTXJ2doZIKkWljY3QUXRU2thAJJXC2dlZ6ChERHq4FEtEbZJMJoOljQ2KZDI43fCKEqEVOV7LJZPJDHbMQYMGoeFv9woePHgQ9vb2BjsHEbUPbOyIqE2SSCTwCwpCWlkZ+uXlQXIHO1C0NJVYjNxu3RAUHAzJHe5qcSeOHz9usGMRUfvGpVgiarP8/f3RZG2N/OvvmzO0qqoqFBYV4cqVK2hSKm/7+ctOTlBaW6N///4tkoeI6H6xsSOiNsvBwQE9PD3xp4c71AZ+/1uTUonqmmoAGihVSigUiltuT6YWiXDewx09+vSBg4ODQbMQERkKGzsiatPChg5FtZMTctzcWvQ8KpUSVbe4l++cmxuqnZwQFh7eojmIiO4HGzsiatNcXV0REhaGM56eqLrJrhb3wkwqhbm57vvvamtrUN/MC48rra1xto8nBoaHw9XV1WAZiIgMjY0dEbV5YWFhcOjqhmRvbyjFhvtjy97eHiKR7vEqKip0lmSVYjGS+3lD5uZ2Ry9WJiISEhs7ImrzpFIpxkdEoLZLFxz36Wew++2kEgns7Ox0amq1CpWVldf+WSTCcZ9+qHPtgnEREZBK+SIBImrb2NgRkVFwcXHBxKhIKNzdcdTXx2AzdzbW1rCw0N33ta6uFjWNjTjq6wOFuzsmRkXCpZktxIiI2hpuKUZERiU3Nxc7YrfDurAQwdnZsKutve9jqtQqXLlSAo3m2rvyauzscHZACDQ9e2DS5Mnw8PC473MQEbUGNnZEZHSKi4uxJy4O5fkF6JuTA8+CAojv84+y2ro6KCorUNinD3L69kWBQoG6piZs2rQJIgO/aoWIqKWwsSMio6RUKpGQkICkhATYlpaiV24eupWW3tMOFSqxGJednJDl3BlyKyskJCUhMTERKpUKMTExiI6OboFvQERkeGzsiMioFRYWIjEhARfPnYO0thYely/DtUyBjjU1MFOpbvpzTRIJKm1sUOQoQ263blBaW8O1Wze895//4OzZs9rPOTg4ICsri685ISKjwMaOiExCeXk5MjIykJGcjPqaGmiUStjW1cFOUQ5zpRJijRpqkRiNUimqZA6otrKCSCqFpY0N+gcHo3///nBwcMD27dsRFRWlc+wJEyYgLi6OS7JE1OaxsSMik6JSqaBQKCCXyyGXy1FSXIzG+nqolEpIpFKYW1qik4sLnJ2d4ezsDJlMBolEonOM6OhoxMbG6tS++eYbPPPMM635VYiI7hobOyKivykrK4OPjw/kcrm2Zmdnh8zMTLi7uwuYjIjo1vgeOyKiv3F0dMSXX36pU6uqqsLMmTPBvwsTUVvGxo6IqBkRERGYPn26Tm3fvn1Ys2aNQImIiG6PS7FERDdRUVEBPz8/5Ofna2vW1tbIyMhAr169BExGRNQ8ztgREd2Evb09vv76a51abW0tnnnmGahu8SoVIiKhsLEjIrqFMWPGYPbs2Tq1I0eOYOXKlQIlIiK6OS7FEhHdRnV1Nfr374+LFy9qaxYWFkhNTYW3t7eAyYiIdHHGjojoNmxtbfHtt9/qvKC4oaEB06dPh1KpFDAZEZEuNnZERHdg+PDhmD9/vk4tKSkJH3zwgUCJiIj0cSmWiOgO1dXVITAwUGcvWTMzMyQlJcHf31/AZERE13DGjojoDllZWWHDhg0Qi//3R2dTUxOmT5+OxsZGAZMREV3Dxo6I6C4MGjQIixYt0qmlp6fj3XffFSgREdH/cCmWiOguNTQ0ICQkBJmZmdqaRCLB0aNHERISImAyImrv2NgREd2D1NRUDBw4UOepWG9vb6SkpMDS0lLAZETUnnEplojoHgQGBuKtt97SqWVnZ+vViIhaE2fsiIjuUVNTE4YMGYLk5GRtTSQS4fDhw
wgPDxcwGRG1V2zsiIjuQ1ZWFoKCgnSeiu3VqxfS09NhY2MjYDIiao+4FEtEdB98fHzw3nvv6dTOnz+v9+QsEVFr4IwdEdF9UqlUGDZsGBITE3Xq+/btw+jRowVKRUTtERs7IiIDyMnJgb+/P+rq6rQ1d3d3ZGZmws7OTsBkRNSecCmWiMgAPD098f777+vU8vLy8PLLLwuUiIjaI87YEREZiFqtxgMPPIADBw7o1Hfv3o3x48cLlIqI2hM2dkREBnTp0iX4+fmhurpaW3N1dcWpU6cgk8kETEZE7QGXYomIDKh79+745JNPdGpFRUWYN2+eQImIqD3hjB0RkYFpNBqMHz8ee/fu1al///33mDRpkkCpiKg9YGNHRNQCCgoK4Ovri4qKCm3NyckJWVlZ6Ny5s3DBiMikcSmWiKgFuLm5YdWqVTq10tJSzJkzB/z7NBG1FDZ2REQt5KmnnsLEiRN1aj/++CO2bt0qUCIiMnVciiUiakFXrlyBj48PSktLtTV7e3tkZWWhS5cuAiYjIlPEGTsiohbUuXNnfPHFFzq1iooKPPfcc1ySJSKDY2NHRNTCnnjiCUyePFmntnfvXnz99dcCJSIiU8WlWCKiVqBQKODj44Pi4mJtzdbWFpmZmejevbtwwYjIpHDGjoioFchkMqxbt06nVl1djWeffRZqtVqgVERkatjYERG1kvHjx+PZZ5/VqR04cACff/65QImIyNRwKZaIqBVVVlbCz88Ply9f1tasrKyQnp4OT09PAZMRkSngjB0RUSvq2LEjvvnmG51aXV0dZsyYAZVKJVAqIjIVbOyIiFrZAw88gBdffFGnlpiYiI8//ligRERkKrgUS0QkgOrqagQEBOD8+fPamrm5OVJSUuDj4yNgMiIyZpyxIyISgK2tLdavXw+RSKStNTY2Yvr06WhqahIwGREZMzZ2REQCCQ8Px8svv6xTS05OxooVKwRKRETGjkuxREQCqqurQ1BQEM6cOaOtSaVSnDhxAoGBgQImIyJjxBk7IiIBWVlZYePGjZBIJNqaUqnE9OnT0dDQIGAyIjJGbOyIiAQWEhKC119/XaeWmZmJf//73wIlIiJjxaVYIqI2oLGxESEhIcjIyNDWxGIxEhMTMWjQIAGTEZExYWNHRNRGpKenIyQkROepWC8vL6SmpsLKykrAZERkLLgUS0TURvj7+2PJkiU6tbNnz+LNN98UKBERGRvO2BERtSFKpRKhoaFISkrS1kQiEQ4ePIhhw4YJmIyIjAEbOyKiNiY7OxuBgYE6T8X26NEDGRkZsLW1FTAZEbV1XIolImpjvL29sWzZMp3axYsX8eqrrwqUiIiMBWfsiIjaIJVKhREjRiA+Pl6n/uuvv2LMmDECpSKito6NHRFRG3X+/Hn0798ftbW12lrXrl2RmZkJe3t74YIRUZvFpVgiojaqV69e+PDDD3Vq+fn5+Oc//ylQIiJq6zhjR0TUhqnVaowZMwZ//PGHTn3nzp2IiIgQKBURtVVs7IiI2ri8vDz4+vri6tWr2pqzszOysrLg6OgoYDIiamu4FEtE1Ma5u7vj008/1anJ5XLMnTtXmEBE1GZxxo6IyAhoNBo88sgj2LNnj049NjYWkZGRAqUioraGjR0RkZEoKiqCj48PysvLtTVHR0dkZWXB2dlZwGRE1FZwKZaIyEi4urpi9erVOrWysjLMmjUL/Ds6EQFs7IiIjEp0dDQmTZqkU4uLi8OmTZsESkREbQmXYomIjExJSQl8fHxQUlKirXXs2BGnTp1C165dBUxGRELjjB0RkZHp1KkT1q5dq1OrrKzEzJkzuSRL1M6xsSMiMkITJ07E1KlTdWq//fYbvvrqK4ESEVFbwKVYIiIjVV5eDl9fXxQWFmprNjY2yMjIQM+ePQVMRkRC4YwdEZGRcnBwwNdff61Tq6mpwbPPPgu1Wi1QKiISEhs7IiIj9tBDD+H555/XqR06dAirVq0SKBERCYlLsURERu7q1avw8
/NDbm6utmZpaYm0tDR4eXkJmIyIWhtn7IiIjFyHDh3w7bff6tTq6+sxY8YMKJVKgVIRkRDY2BERmYCRI0di3rx5OrVjx47ho48+EigREQmBS7FERCaitrYWAQEByMnJ0dbMzc1x8uRJ+Pn5CZiMiFoLZ+yIiEyEtbU11q9fD7H4f3+0NzY2Yvr06WhqahIwGRG1FjZ2REQmJDQ0FAsXLtSppaamYtmyZQIlIqLWxKVYIiITU19fj+DgYJw+fVpbk0gkOH78OIKDgwVMRkQtjTN2REQmxtLSEhs3boREItHWVCoVpk+fjvr6egGTEVFLY2NHRGSCgoOD8eabb+rUsrKysGTJEoESEVFr4FIsEZGJamxsxODBg5GamqqticViHDlyBKGhoQImI6KWwsaOiMiEZWZmIjg4WOepWE9PT6SlpcHa2lrAZETUErgUS0Rkwvz8/LB06VKdWk5ODt544w2BEhFRS+KMHRGRiVMqlQgPD8fx48d16vv378fIkSMFSkVELYGNHRFRO3D27FkEBAToPBXr4eGBzMxMdOjQQcBkRGRIXIolImoHvLy8sHz5cp1abm4uXnnlFYESEVFL4IwdEVE7oVarMWrUKBw6dEinvnfvXjz00EPQaDQQiUQCpSMiQ+CMHRFROyEWi/HNN9/AxsZGp/7MM89g3rx5cHBwQO/evXHs2DGBEhLR/eKMHRFRO7N27VrMnj37puMDBgxAUlJSKyYiIkNhY0dE1M5oNBqMGTMG+/bta3ZcLBajvr4eZmZmUKlUUCgUkMvlkMvlKCkuRkNdHdQqFcQSCSysrNDJxQXOzs5wdnaGTCbT2cqMiFqXVOgARETUuoqKipCfn3/TcbVajdOnT6O8vByZKSmor6mBRqmEbV0dOioUsFIqIdZooBaJ0CSV4qxMhmQrK4ikUlja2MAvKAj+/v5wcHBoxW9FRABn7IiI2p2oqChs37692TEXFxeEh4Yi0M8P1k1NcM+7DFeFAh1ramCmUt30mE0SCSptbFAkkyHPvRuarK3Rw9MTYUOHwtXVtaW+ChH9DRs7IqJ2ZuTIkTh48KBOTSKRIDQ0FGEhIXCqrkbfgkL0unoVErX6ro+vEouR7+SEPz3cUe3khJCwMISFhUEq5SIRUUtjY0dE1M788ssvePTRR9HY2AgA6Ny5MyLGj4ebgwM8z5xBl3Pn0MHKGh07dryv86hFIuS4ueGMpydkXd0wLiICLi4uhvgKRHQTbOyIiNqhM2fO4LXXXkN6ejoiH3sMrrW18E5OhnVVFQBAKpWic6fOBjlXlbU1kr29UdulCyZGRcLDw8MgxyUifWzsiIjaqdzcXMRs2ICOFy7A6+hRSG64h04ikcK5s2EaOwBQisU47tMPCnd3TJo8mc0dUQvhC4qJiNqh4uJi7IiNhUtRMUZeuAiZjS2A/+06YWtra9DzSdVqDDmVBVleHnbEbkdxcbFBj09E17CxIyJqZ5RKJfbExcG6sAiDTp+GRKOBjY0NXFxcYG/vgM6dnWFjbW3w84o1GgzKOg2rokL8HBcHpVJp8HMQtXds7IiI2pmEhASU5xcgODsb0hueehWLRLC2soK0BV8wLFWrEXw6G4qCAiQmJrbYeYjaKzZ2RETtSGFhIZISEtA3Jwd2tbWCZOhYWwuvczk4ER+PoqIiQTIQmSo2dkRE7UhifDxsS0vhWVAgaI4+BQWwLS1FQny8oDmITA0bOyKidqK8vBwXc3LQOzcPYoFfiCDWaNArNw8Xz51DeXm5oFmITAkbOyKidiI9PR1mtbXoWloqdBQAQLfSUkhra5GRkSF0FCKTwcaOiKgdUKlUyExJgXve5XvaJqwlSNRqeFy+jIzkZKhusQ8tEd05NnZERAKQSqUICAiAr68vnnzySdS28IMMH374IZZ/+CFe37kT3vFHEJGagojUFMRduWLwc52orMC4lGQ8kZZ228+6lilQX1MDhUJh8Bw3qq6uxujRo2Fra4uFCxe26LmIhMQdm
YmIBGBvb4+0643PU089hTVr1uDll19usfM9/PDDENfV4ZGDhxCamIC4wCCdcZVGA4lIdJOfvju7S0rwYrdumHAHW5J1rKmBqrERcrkcnTp1uu9zq1QqSJp5XYuZmRmWLFmCrKwsnD9//r7PQ9RWsbEjIhLY0KFDkZGRgdLSUjzzzDPIzc2FTCbD+vXr4ebmBh8fH5w7dw7nzp2Dl5cXCgsL4ezsDE9PT/z555+4cuUKXnjhBeTn58PS0hLr1q1D3759MWPGDDg6OiI5ORm9evVCsKOjznvr8uvrMef0afTv0AEZV6vwQ0Ag5mVn40pjIxo1arzk7oGxTk7az3nb2iDj6lV42djgU6++EIlEeP/iBexXKGAuEuFhp05wtbDA3tJSxJdX4ERlJf7VoycW//knztRUw0Isxru9PdHP1hb/l5uL0qZG5NbVw/xqFQ6cOAFPT0+cOnUKBQUF2LhxI1auXImUlBRMmjQJy5cvBwB8++23+OKLL1BfX4/HHnsMS5cuxaVLl/Doo49i4MCBOH78OJKSkmBhYaHze2xhYYFhw4bhwoULrfrvlqi1sbEjIhKQUqnE3r178dBDD+Gdd97B0KFDsWvXLsTGxuKll15CXFwc3NzccPHiRcTHxyMoKAjx8fHw8vKCj48PRCIRFixYgLfeegvBwcFISkrCggUL8MsvvwAALl++jAMHDiB261aoDhzQO/+ftTX40MsLfW08AQDv9+kDezMzXFUq8UR6GsY4OgIALtTV4tO+fdHTygpPZ2biZFUVeltb4+fSUhwYEAKxSISrSiU6SKU4VlmBh5ycMFLmiK/z82ErkWB3UDDSqqqw6Nw57Aq6Nlt4rqYWG/38kOrtja/OncXVq1dx8OBBbNmyBY888giSk5Ph6uoKLy8vvPLKK7hy5Qp+/vlnHD16FCKRCI8++iiOHj0KV1dXZGVlYdOmTfjqq69a6d8cUdvExo6ISAAVFRUICAgAcG3GbubMmRg4cCB+/vlnAEBkZCTmz58PAAgLC0N8fDzi4+OxaNEixMfHo6SkBGFhYQCA/fv3Izs7u9nzPPHEExCJRGioq4NVM1t4dbeyQl8bG+2v1xcW4I+ya/e7FTU0oKSpCQDQw8oKva5vM9bP1gYFDfUItLNDB4kEb+ScwwOOjhgpc9Q7/smqKjzftSsAIMDODg1qNa5ezzHaUQZzsRjmSiXUKhUiIiIAAH5+fvD09ISHhwcAwNPTE5cvX0Z8fDyOHj2K4OBgANfumzt//jxcXV3Rp08f9O/f/45+74lMGRs7IiIB3HiP3c2Irt/zFhYWhp07dyInJwdfffUV1qxZg9LSUrz44ovazyYnJzd7b5n19WZMrVI1++46qxt+5lhFBVKqqvCdvz8sJRKMTT6JxutLt+bi/z1rJxaJoNYAUpEIPwYEIqGiHDuvXEHclStY5d3vlt9JAw3+upPPUnzt3GKNGhq1Wrt8KhaLdZZSxWIxVCoVNBoNZs2ahbffflvnmJcuXdJ+T6L2jk/FEhG1EeHh4di6dSsA4Pvvv8fAgQMBAKGhodi7dy86d+4MiUQCKysrJCQkYMCAAQCA4cOHY+3atQAAtVqNzMxMvWOLJRKob/NwRLVKBXupGSwlEqRfvYpLdXW3/HyNSoWrSiVGyhzxeo+eyK6p0fvMADs77Cq59uRt+tWrsJJIYCvVnVNQi8QQiW9/ORo1ahRiY2O1LzTOz89HWVnZbX+OqD3hjB0RURvxzjvvYMaMGdi4caP24QkA6NixIzp27Khdeh00aBAqKiq0s1qrVq3C7NmzsWbNGiiVSkybNg1+fn46x7awskKT9NZ/5A91cMCWokJEpKagr40N+ljb3PLzNSoV5pzOQqP62kzgq9176H3mKVdXLP4zB4+kJMNcLMYKzz56n2mUSiFuZrbx73x9fbFo0SKMGDECarUaHTp0wLZt2277c3/x8fFBUVERmpqasG3bNpw8eRIuLi53/PNExkCk0Qi8rwwREbW4P
/74A2d//RUPHj0mdBQ9vw8ZDK+xYzF69GihoxAZPS7FEhG1A87Ozqi2skLTHcyMtaYmiQTVVlZwdnYWOgqRSeBSLBFRO+Ds7AyRVIpKGxs4VVXpjStVKlytqkKTUglra2vY2tx6GdZQKm1sIJJKDdbYlZWV6c38WVtbIzEx0SDHJ2rr2NgREbUDMpkMljY2KJLJ9Bq7uvo6VFRUQqO59gRsVVUlzM3MYG5u3uK5ihyv5ZLJZAY5nqOj422fNiYyZVyKJSJqByQSCfyCgpDn3g2q60+gajQaVFRWory8XNvU/UWtVjd3GINSicXI7dYN/YODm31VCxHdPTZ2RETthL+/P5qsrZHv5IQmpRIlpaWordV/RYmZmTksLC1bPM9lJycora35YmEiA2JjR0TUTjg4OKC7pyeyu3TBldJSKJVNep+xsrKGo6Mjbv3Gu/unFolw3sMdPfr0gYODQwufjaj9YGNHRNROVFVVYc/PP6PQ3AwFfTx1xkQiMeztHeBgbw/xbV5kbAjn3NxQ7eSEsPDwFj8XUXvCxo6IqB04efIkgoKCsGHDBiQkJSGnb1/U2tkBAMykZujk5ARrKyuDnU+lVqO+vh7NvSq10toaZ/t4YmB4OFxdXQ12TiJiY0dEZNI0Gg0+/fRThIaG4vz58wCAxMREFJSXIzs4GJa2HeDUyQnS2+xKcTfq6ushl8uhKFegqLgYNTU1+Ku9U4rFSO7nDZmbG0JDQw12TiK6ho0dEZGJKisrQ0REBP75z3+iqel/99OpVCocPHIEDd26ITt0CDQiw14Krl69CmhbOQ0qqypRWlqCuqYmHPfphzrXLhgXEWHQZpKIruH/VUREJujIkSOYPHkyCgoK9MaGDBmCmJgYAMAPMTE4CmBQ1mlIDfSKk+bu0KtXq5HaxxNyOzuMCR3CPVqJWghn7IiITIhKpcK7776LESNGNNvUvf766zh06BA8PDzg4eGBSZMno6J7DxwJDESVtbVBMphbWOj8usbODmnDhuOigwPWb92KUaNGYffu3QY5FxHpEmmau7OViIiMTlFREaZOnYr9+/frjXXu3BmbNm3CmDFj9MaKi4uxJy4O5fkF6JuTA8+CAojv49JQW1eHiopyqEUiFPbpg5y+fVGgUCDu559x5coVAMCgQYNw7Nixez4HETWPS7FERCbgl19+wbRp01BSUqI3Nnr0aGzevPmmy58uLi6Y/uyzSEhIQJKlBfJdXdArNw/dSkshuYflWZGZGeQeHrjcuzdKbW2RkJSExMREqFQq7Wf47jqilsEZOyIiI9bU1ITFixfjgw8+0BuTSCRYunQpFi1adMdbdhUWFiIxIQEXz52DtLYWHpcvw7VMgY41NTC7oTHTyyGRoNLGBkWOMlzq2hWljY04d/EiEhITUVxcrPPZbt264cCBA+jVq9fdfVkiui02dkRERurSpUuIjo7G8ePH9ca6deuGrVu3IvweXwBcXl6OjIwMZCQno76mBhqlErZ1dbBTlMNcqYRYo4ZaJEajVIoqmQOqrawgkkphaWMDv6AgREVFaZdd/27YsGE4cOAAxGLe5k1kaGzsiIiM0A8//ICZM2eisrJSbywiIgLffvstZDLZfZ9HpVJBoVBALpdDLpejpLgYjfX1UCmVkEilMLe0RCcXFzg7O8PZ2RkymQwSiQTDhg3DkSNHbnrczz77DHPnzr3vfESki40dEZERqa+vx8svv4wvvvhCb8zc3Bwffvgh5s2bB1ErbAt2K0lJSXjyySchl8sRHR2N33//XecpXWtra6Snp6N3794CpiQyPWzsiIiMxJkzZxAVFYWMjAy9sd69eyM2NhZBQUECJLs5pVIJqVSKffv24cEHH9QZCwsLw6FDh+74/j8iuj3e4EBEZAQ2bNiA4ODgZpu6KVOmICUlpc01dQC0u0s88MADePHFF3XGEhIS8MknnwgRi8hkccaOiKgNu3r1KubOnYtNmzbpjVlbW+Ozzz7DjBkzBF96vRPV1
dXw9/fHhQsXtDULCwukpKSgX79+AiYjMh1s7IiI2qi0tDRERkYiJydHb8zX1xfbt2+Ht7e3AMnu3ZEjRzB8+HDceOkZMGAAjh49yr1jiQyAS7FERG2MRqPBZ599hkGDBjXb1L3wwgs4ceKE0TV1ADB06FD885//1KmdPHkSK1asECgRkWnhjB0RURtSXl6OmTNnYseOHXpjdnZ2WLduHZ588kkBkhlOXV0dgoKCcObMGW1NKpUiKSkJAQEBwgUjMgFs7IiI2ojExERMnjwZeXl5emMhISHYtm0bevbsKUAywztx4gSGDBkC9Q1blvXv3x8nTpyAhYWFgMmIjBuXYomIBKZWq7FixQoMGzas2aZu4cKFiI+PN5mmDgAGDhyIN954Q6eWkZGBpUuXCpSIyDRwxo6ISEByuRxPP/00fv/9d70xJycnbNiwAePGjRMgWctrbGxESEiIzitcxGIxEhMTMWjQIAGTERkvNnZERALZt28fpk6dCrlcrjc2YsQIbN68GW5ubgIkaz3p6ekICQlBU1OTtubl5YXU1FRYWVkJmIzIOHEploiolSmVSrz55psYM2aMXlMnFovxzjvvYN++fSbf1AGAv78/3n77bZ3a2bNnsXjxYoESERk3ztgREbWivLw8TJkyBQkJCXpjXbp0wdatWzF8+HABkglHqVQiNDQUSUlJ2ppIJMKhQ4cwdOhQAZMRGR82dkRErWTnzp145plnUF5erjc2btw4rF+/Hp06dRIgmfCys7MRGBiIhoYGba1nz55IT0+Hra2tgMmIjAuXYomIWlhDQwNeeuklPPbYY3pNnZmZGf773/9i165d7bapAwBvb28sW7ZMp3bhwgW89tprAiUiMk6csSMiakE5OTmIiopCamqq3liPHj0QGxuLkJAQAZK1PSqVCiNGjEB8fLxO/bfffsODDz4oUCoi48LGjoiohWzZsgWzZ89GdXW13lhkZCS+/PJLdOzYUYBkbdf58+fRv39/1NbWamtdu3bFqVOn+HtFdAe4FEtEZGA1NTV49tlnMXXqVL2mztLSEmvXrsW2bdvYqDSjV69e+PDDD3Vq+fn5evvLElHzOGNHRGRAmZmZiIqKQnZ2tt6Yt7c3YmNj4efnJ0Ay46FWqzFmzBj88ccfOvVdu3ZhwoQJAqUiMg5s7IiIDECj0eDLL7/EggULUF9frzc+c+ZMrFy5EjY2NgKkMz55eXnw9fXF1atXtTUXFxecOnUKjo6OAiYjatu4FEtEdJ8qKioQFRWF2bNn6zV1tra22Lp1K9atW8em7i64u7vj008/1akVFxfjH//4hzCBiIwEZ+yIiO7DiRMnEBUVhUuXLumNBQUFITY2Fr179279YCZAo9HgkUcewZ49e3Tq27dvx5NPPilQKqK2jY0dEdE9UKvV+Pjjj/HGG29AqVTqjc+fPx/vv/8+LCwsBEhnOoqKiuDj46Pz/j9HR0dkZWXB2dlZwGREbROXYomI7lJJSQkmTJiAV199Va+pk8lk2LlzJz799FM2dQbg6uqK1atX69TKysowe/ZscF6CSB8bOyKiu3Dw4EEEBARg7969emPh4eFIS0tDRESEAMlMV3R0NCZNmqRT++mnn7B582aBEhG1XVyKJSK6AyqVCkuXLsW7776rN1MkEonw5ptvYsmSJZBKpQIlNG0lJSXw8fFBSUmJttaxY0ecOnUKXbt2FTAZUdvCxo6I6DYKCgowZcoUHD58WG/MxcUFmzdvxujRowVI1r7s2LEDjz/+uE5t7Nix2Lt3L0QikUCpiNoWLsUSEd3Cnj174O/v32xTN2bMGKSlpbGpayUTJ07E1KlTdWq//vorvvrqK4ESEbU9nLEjImpGY2Mj3njjDXz88cd6YxKJBMuWLcOrr74KsZh/P25N5eXl8PX1RWFhobZma2uLjIwM9OjRQ8BkRG0DGzsior+5cOECoqOjkZSUpDfm4eGBmJgYDBkyRIBkBAB79+7FuHHjdGrDhw/H/v372
WhTu8f/A4iIbhAbG4vAwMBmm7rHH38cqampbOoE9vDDD+O5557TqR06dAifffaZQImI2g7O2BERAaitrcWCBQuavV/LwsICH3/8MebMmcOb9NuIqqoq9O/fH7m5udqalZUV0tLS0KdPHwGTEQmLjR0RtXunT59GZGQksrKy9Mb69OmD2NhYBAQEtH4wuqUDBw5g1KhROrXBgwcjPj4eEolEoFREwuJSLBG1WxqNBl9//TUGDBjQbFM3bdo0JCcns6lro0aOHIl58+bp1I4dO4aPPvpIoEREwuOMHRG1S1VVVZg9ezZiYmL0xmxsbPD5559j2rRpAiSju1FTU4OAgAD8+eef2pq5uTmSk5Ph6+srYDIiYbCxI6J2Jzk5GVFRUTh//rzemL+/P2JjY+Hl5SVAMroXiYmJGDp0KNRqtbYWGBiI48ePw8zMTMBkRK2PS7FE1G5oNBp8+umnGDJkSLNN3dy5c3Hs2DE2dUYmNDQUr7zyik4tNTUV//nPfwRKRCQcztgRUbtQVlaGZ555Brt27dIbs7e3x9dff623XRUZj/r6egQHB+P06dPamlQqxfHjxxEUFCRgMqLWxcaOiEzekSNHMGXKFOTn5+uNDR48GDExMejevXvrByODSk5OxqBBg6BSqbQ1Hx8fJCcnw8LCQsBkRK2HS7FEZLJUKhXeffddjBgxotmmbtGiRTh8+DCbOhMRHByMN998U6eWlZWFJUuWCJSIqPVxxo6ITFJRURGmTp2K/fv364116tQJmzZtwtixYwVIRi2psbERgwYNQlpamrYmFosRHx/PHUOoXWBjR0Qm55dffsG0adNQUlKiNzZ69Ghs2rQJrq6uAiSj1pCZmYng4GA0NTVpa56enkhLS4O1tbWAyYhaHpdiichkNDU1YdGiRXj44Yf1mjqJRIJly5bh119/ZVNn4vz8/PDvf/9bp5aTk4N//etfAiUiaj2csSMik3Dp0iVER0fj+PHjemPdunXD1q1bER4eLkAyEoJSqUR4eLjefw8HDhzAiBEjhAlF1ArY2BGR0fvhhx8wc+ZMVFZW6o1FRETg22+/hUwmEyAZCens2bMICAhAfX29tta9e3dkZGSgQ4cOAiYjajlciiUio1VfX48XX3wRTzzxhF5TZ25ujpUrV+Knn35iU9dOeXl5Yfny5Tq1S5cuYeHChQIlImp5nLEjIqN05swZREVFISMjQ2+sd+/eiI2N5YtpCWq1GqNGjcKhQ4d06r/88gufiiaTxMaOiIzOhg0b8OKLL6K2tlZvbMqUKVizZg2X2kjrwoUL6N+/P2pqarQ1Nzc3nDp1Cvb29sIFI2oBXIolIqNx9epVTJs2DTNmzNBr6qytrfHNN99g8+bNbOpIR8+ePfHf//5Xp1ZQUID58+cLlIio5XDGjoiMQlpaGiIjI5GTk6M35uvri+3bt8Pb21uAZGQMNBoNHnroIfz222869R07duCxxx4TJhRRC2BjR0RtmkajwerVq/HKK6+gsbFRb/yFF17AJ598AisrKwHSkTG5fPky/Pz8dB606dy5M7KysuDk5CRgMiLD4VIsEbVZ5eXlmDRpEubNm6fX1NnZ2WH79u1Ys2YNmzq6I926dcPKlSt1aleuXMGcOXPAOQ4yFZyxI6I2KTExEZMnT0ZeXp7eWEhICLZt24aePXsKkIyMmUajwWOPPYa4uDidekxMDKKjowVKRWQ4bOyIqE1Rq9X44IMPsHjxYqhUKr3xhQsXYtmyZTA3NxcgHZmC4uJi+Pj4QKFQaGsymQynTp3idnNk9LgUS0Rthlwux0MPPYQ33nhDr6lzcnLCnj178OGHH7Kpo/vi4uKCL774QqemUCgwa9YsLsmS0WNjR0Rtwr59++Dv74/ff/9db2zEiBFIS0vDuHHjBEhGpigyMhKRkZE6td27d2PDhg0CJSIyDC7FEpGglEollixZguXLl+vNlojFYrz99ttYvHgxJBKJQAnJVJWWlsLX1xdyuVxbs7Ozw
6lTp9CtWzcBkxHdOzZ2RCSYvLw8TJkyBQkJCXpjXbp0wdatWzF8+HABklF7ERcXh0cffVSn9sADD+C3336DSCQSKBXRveNSLBEJYufOnQgICGi2qRs3bhzS0tLY1FGLi4iIwPTp03Vq+/btw5o1awRKRHR/OGNHRK2qoaEBr776KlatWqU3ZmZmhhUrVmDBggUQi/n3TmodFRUV8PX1RUFBgbZmY2OD9PR09OrVS8BkRHePjR0RtZqcnBxERUUhNTVVb6xHjx6IjY1FSEiIAMmovfvtt98wduxYndrQoUNx8OBB/iWDjAr/ayWiVrFlyxYEBQU129RFRkYiNTWVTR0JZsyYMXjhhRd0akeOHNHbqYKoreOMHRG1qJqaGsybNw/ffvut3pilpSVWrlyJ559/njeqk+CuXr0Kf39/XLx4UVuzsLBAWloa+vbtK2AyojvHxo6IWkxmZiaioqKQnZ2tN+bt7Y3Y2Fj4+fkJkIyoeYcOHcKIESN0agMHDkRCQgKkUqkwoYjuApdiicjgNBoN1q5di4EDBzbb1M2cORNJSUls6qjNGT58OBYsWKBTO3HiBD744ANhAhHdJc7YEZFBVVRUYNasWfjuu+/0xmxtbfHll19i8uTJAiQjujN1dXUICAjAuXPntDUzMzMkJSXB399fwGREt8fGjogM5sSJE4iKisKlS5f0xoKCghAbG4vevXu3fjCiu3Ts2DGEhYVBrVZra/7+/jhx4gT3KqY2jUuxRHTf1Go1PvroI4SFhTXb1M2fPx+JiYls6shoDB48GK+99ppOLT09He+++65AiYjuDGfsiOi+lJSUYPr06di7d6/emEwmw7fffouIiAgBkhHdn4aGBgwYMACnTp3S1iQSCY4ePcpX81CbxcaOiO7ZwYMH8dRTT6GwsFBvLDw8HFu3buVm6mTUUlNTMXDgQCiVSm3N29sbKSkpsLS0FDAZUfO4FEtEd02lUmHJkiUYNWqUXlMnEomwePFiHDhwgE0dGb3AwEC89dZbOrXs7Gy9GlFbwRk7IrorBQUFmDJlCg4fPqw35uLigs2bN2P06NECJCNqGU1NTRgyZAiSk5O1NZFIhMOHDyM8PFzAZET62NgR0R3bs2cPpk+fjrKyMr2xMWPGYOPGjXB2dhYgGVHLysrKQlBQEBobG7W1Xr16IT09HTY2NgImI9LFpVgiuq3Gxka88sormDBhgl5TJ5FIsGLFCuzdu5dNHZksHx8fvPfeezq18+fPY9GiRQIlImoeZ+yI6JYuXLiA6OhoJCUl6Y15eHggJiYGQ4YMESAZUetSqVQYNmwYEhMTder79u3j7QfUZrCxI6Kbio2NxaxZs1BVVaU39vjjj2PdunVwcHAQIBmRMHJycuDv74+6ujptzd3dHZmZmbCzsxMwGdE1XIolIj21tbWYNWsWoqOj9Zo6CwsLrF69Gt9//z2bOmp3PD098f777+vU8vLy8PLLLwuUiEgXZ+yISMfp06cRGRmJrKwsvbE+ffogNjYWAQEBrR+MqI1Qq9V44IEHcODAAZ367t27MX78eIFSEV3Dxo6IAAAajQbffPMN5s2bp7PM9Jdp06Zh9erVsLW1FSAdUdty6dIl+Pn5obq6WltzdXXFqVOnIJPJBExG7R2XYokIVVVVeOqpp/Dcc8/pNXU2NjbYsGEDNmzYwKaO6Lru3bvjk08+0akVFRVh3rx5AiUiuoYzdkTtXHJyMqKionD+/Hm9MX9/f8TGxsLLy0uAZERtm0ajwfjx4/X2Sf7+++8xadIkgVJRe8fGjqid0mg0WLlyJV577TU0NTXpjc+dOxcfffQR98MkuoWCggL4+vqioqJCW3NyckJWVhY6d+4sXDBqt7gUS9QOlZWV4dFHH8U///lPvabO3t4eP/zwAz777DM2dUS34ebmhlWrVunUSktLMXv2bHDehITAGTuidubIkSOYMmUK8vPz9cYGDx6MmJgYdO/evfWDERkpjUaDSZMmYceOHTr1z
Zs346mnntKpqVQqKBQKyOVyyOVylBQXo6GuDmqVCmKJBBZWVujk4gJnZ2c4OztDJpNBIpG05tchI8fGjqidUKlUWL58OZYsWQK1Wq03vmjRIrz77rswMzMTIB2Rcbty5Qp8fHxQWlqqrdnb2+PUqVNwc3NDeXk50tPTkZmSgvqaGmiUStjW1aGjQgEzpRJijQZqkQhNUikqZTJUW1lBJJXC0sYGfkFB8Pf353sj6Y6wsSNqB4qKijB16lTs379fb6xTp07YtGkTxo4dK0AyItPx/fff48knn9SpTZo0CVGRkbiUkwOz2lq4512Gq0KBjjU1MFOpbnqsJokElTY2KJLJkOfeDU3W1ujh6YmwoUPh6ura0l+FjBgbOyIT9+uvv+Lpp59GSUmJ3tjo0aOxadMmXiiIDGTKlCmIiYmBRCJBaGgowkJC0KWxCd6FhehaWgpJM7Plt6MSi5Hv5IQ/PdxR7eSEkLAwhIWFQSqVtsA3IGPHxo7IRDU1NWHx4sX44IMP9MbEYjGWLl2K119/nffvEBmQQqHA0KFDETpoENwcHOB55gzccv6Es5MTpPf5/5paJEKOmxvOeHpC1tUN4yIi4OLiYqDkZCrY2BGZoEuXLmHy5Mk4duyY3ljXrl0RExOD8PBwAZIRmbbc3FzEbNgA88uX4Z2cDOvrey2bm1vA0dERIgOco8raGsne3qjt0gUToyLh4eFhgKOSqeDrTohMzI8//ojAwMBmm7qIiAikpaWxqSNqAbm5ufghJgauRcUIPZGkbeoAoLGxAbU1NQY5j11tLYampsL+0kX8EBOD3NxcgxyXTAMbOyITUV9fj7lz52LSpEk6L0sFADMzM3z66af46aef4OjoKExAIhNWXFyMHbGxkOXmYXBWFhw6dIBErLv0WlVVBeUtHpi4G1K1GkNOZUGWl4cdsdtRXFxskOOS8WNjR2QCzp49i8GDB+Pzzz/XG+vduzeOHj2K+fPnQyQyxEIQEd1IqVRiT1wcrAuLMOj0aYg1GohFItjb2+t8TgMNKsrLYaj7n8QaDQZlnYZVUSF+jouDUqk00JHJmLGxIzJyGzZsQHBwMNLT0/XGpkyZguTkZAQHBwuQjKh9SEhIQHl+AYKzsyG94alXCwsL2Fjb6Hy2samx2S387pVUrUbw6WwoCgqQmJhosOOS8WJjR2SkqqurMW3aNMyYMQM1f7t3x8rKCl9//TU2b94MOzs7gRISmb7CwkIkJSSgb04O7Gpr9cY72NlBIvnba0kM/Mxix9paeJ3LwYn4eBQVFRn02GR82NgRGaG0tDQEBwdj06ZNemO+vr44efIknn32WS69ErWwxPh42JaWwrOgoNlxsUh0fVuwa82dtbUNzM3NDZ6jT0EBbEtLkRAfb/Bjk3FhY0dkRDQaDVavXo3Bgwfj3LlzeuOzZs3CiRMn0K9fPwHSEbUv5eXluJiTg965eRDfYhbOTCqFc+fOcHXtAvuOHVski1ijQa/cPFw8dw7l5eUtcg4yDmzsiIxEeXk5Jk2ahH/84x9oaGjQGbOzs0NsbCzWrl0LKysrgRIStS/p6ekwq61F1xv2h72Vlp4/71ZaCmltLTIyMlr4TNSWsbEjMgJHjx5FQEAAduzYoTcWEhKC1NRUREZGCpCMqH1SqVTITEmBe97le9omrCVI1Gp4XL6MjORkqAz0WhUyPmzsiNowtVqNFStWYOjQocjLy9Mbf+WVVxAfH4+ePXsKkI7IdIlEIixevFj764ULF2L9+vXaXysUCtTX1MBVobjlcdYXFKCxFRs/17JruRR/yzV37lx07twZAwYMaLUsJAw2dkRtlFwux8MPP4w33nhD72/fjo6O2L17Nz766KMWuRGbqL2ztbXFli1bUHXD7hE3ksvl0CiVsK+uvuVxNhQWoKmZ++80Gg3ULbCjZ8eaGmiUSsjlcp36lClTsHfvXoOfj9oe6e0/QkSt7Y8//sDUqVObfZv88OHDsWXLFri5uQmQjKh9sLCww
FNPPYUvvvgCixYt0tZzcnIwY8YMLF26FLZ1dTiuKMN3xcX4r1dfLDp3FlnV1ZCIRHjGzQ11KjWuNDYiOj0N3Syt8Hm/fhh47CiedHHB0YoKfNTHCzHFRUisqIBEJMKiHj0QZu8ApUaDFRcuIPVqFZo0Gsxzd8eDjk74US7HAUUZ6tVqnK+txT/cPZBfX4/fy0rhZG6Otf18YA7Atq4Ocrkcvr6+2txhYWG4dOlS6/9GUqvjjB1RG6JUKrF48WI8+OCDek2dWCzGkiVL8Mcff7CpI2oF8+fPx5dffon6+nptzdPTE2KxGJlpaeioUOAn+RVM7OyM7Jpq5Nc3YG/wAOwOCsYYRydM7dIFnc3Nsc0/AJ9ff1K9QqnEALuO+DEgEOdqa5BbV49dgUH43LsfFufkoEGtxnfFxXCztMQPAYHY6tcf/710Sbuc+2dtLf6vrze29PfH0vN/wtPGGruCgtFRKsXB68uvdopylHCLsXaLM3ZEbcTly5cxefJkJCQk6I116dIFW7ZswYgRI1o/GFE71alTJ0yYMAHffPONTn3GjBmI++knTLKzQ9rVKqzo0wfVKiWuNDbgnfN/4gGZI8IdHJo9pqVYjJEyGQAguaoKj3TqBLFIhK6WluhuZYULtbVIqChHTm0tdly5tpxap1ajuPHak/BD7O1hJZHASiKBmViM0bJrez972dig4PrT8uZKpU4zSu0LZ+yI2oC4uDj4+/s329SNGzcOaWlpbOqIBLBw4UKsXLlSZx/WyMhIJKek4PilS3jA0RESkQgdpWbYFRSMQR074puCfKy4eKHZ41mKb37Z1eDaQxsaAO/19kRcYBDiAoNwKGQg3C2vvcbI/IafF93wazFE2nv2xBo1VNw3tt1iY0ckoIaGBsyfPx+PPvqo3ktFpVIpPvroI+zatQudOnUSKCFR+9atWzeEhYXhhx9+0NY6dOgAd3d3xKSm4rHOzgAARVMTNBoNHnbqhLnu7siuvrbNn41EgpqbvHok2M4Oe0pLoNZoUFBfj7y6OvSwskKovT1iiougut6onb7NAxp/pxaJIZFyQa694r95IoHk5OQgOjoaKSkpemM9evTAtm3bMHDgQAGSEdGNFi1ahA0bNujUwsPCUJiTAy8bGwCAvKEBr+ecg1oDSEUi/Ov6K4giXVzwdGYGellZa++z+8sYR6dry7GpKZCIRHjX0xMWYjGiXVxxub4ej6amQAOgu5UVVnvf+W4yjVIpzC0tdWrPPfcc9uzZg7KyMnTt2hWrVq3CxIkT7+F3g9o6kUbTAs9bE9EtbdmyBbNnz0Z1M38Tf/LJJ/HVV1+hYwttPURE9+/pp59GQ1ERljU0Ch1Fz+9DBsNr7FiMHj1a6CgkAC7FErWimpoaPPvss5g6dapeU2dpaYm1a9ciNjaWTR1RG/bwww8jLS0N/UNC0CSRCB1HR5NEgmorKzg7OwsdhQTCGTuiVpKZmYmoqChkZ2frjXl7eyM2NhZ+fn4CJCOiu1VSUoL1a9Yg/NhxON3kJcZCeC7nHHJEIsgcHSG9fp/d1q1b0a/fnS/lknHjPXZELUyj0eDLL7/EggULmn0FwcyZM7Fy5UrYXL9Xh4jaPplMBksbGxTJZG2qsZs/ZgwKAgLw4vz5kLSx2URqHVyKJWpBFRUViIqKwuzZs/WaOltbW2zduhXr1q1jU0dkZCQSCfyCgpDn3g2qW7zCpDWpxGLkduuG/sHBbOrasbbxXyORCTpx4gQCAwPx3Xff6Y0FBQUhNTUVkydPFiAZERmCv78/mqytke/kJHQUAMBlJycora3Rv39/oaOQgNjYERmYWq3GRx99dNO9GefPn4/ExET07t279cMRkcE4ODigh6cn/vRwh1okEjSLWiTCeQ939OjTBw432fWC2gc2dkQGVFJSggkTJuDVV1/VeVM9cO2enJ07d+LTTz+FhYWFQAmJyJDChg5FtZMTcgTev/mcmxuqnZwQFh4uaA4SHh+eIDKQgwcP4qmnn
kJhYaHeWHh4OLZu3Ypu3boJkIyIWoqrqytCwsKQVN8AV4UCdrW1LXIepUqFxsZGWFpYQPy3e/oqra1xto8nBoaHw9XVtUXOT8aDM3ZE90mlUmHJkiUYNWqUXlMnEomwePFiHDhwgE0dkYkKCwuDQ1c3JHt7Q9kCD1LU1dfjypUrqKgoR7FcjqvV1fjrPWVKsRjJ/bwhc3NDaGiowc9NxoczdkT3oaCgAFOmTMHhw4f1xlxcXLB582a+/Z3IxEmlUoyPiMC2ikocb2zAkFNZEBvwFbFXr14FtK2cBlevVqGurha2He2RGhyEOtcueDQiQvveOmrfOGNHdI/27NkDf3//Zpu6MWPGIC0tjU0dUTvh4uKCiVGRULi746ivj0Fn7kTNPJjRoNEg3qsPznfoAL/gILi4uBjsfGTc2NgR3aXGxka88sormDBhAsrKynTGJBIJVqxYgb1793JLH6J2xsPDA5MmT0ZF9x44EhiIKmtrgxz37w9b1djZIW3YcFx0cMCGmBiMHTsW27dvN8i5yPhxSzGiu3DhwgVER0cjKSlJb8zDwwMxMTEYMmSIAMmIqK0oLi7Gnrg4lOcXoG9ODjwLCu5rabauvg7l5eVQi0Qo7NMHOX37okChQNzPP+PKlSsArr1TLy0tzUDfgIwZF+SJ7lBsbCxmzZqFqma2D3r88cexbt06vj+KiODi4oLpzz6LhIQEJFlaIN/VBb1y89CttBQStfruD2hmDrmHBy737o1SW1skJCUhMTERKpVK+5HOnTsb8BuQMeOMHdFt1NbWYsGCBfjqq6/0xiwsLPDxxx9jzpw5zd4HQ0TtW2FhIRITEnDx3DlIa2vhcfkyXMsU6FhTA7MbGrO/a5JIUGljgyJHGS517YbSxgacu3gRCYmJKC4u1vmsh4cH9u/fj549e7b01yEjwMaO6BZOnz6NyMhIZGVl6Y316dMHsbGxCAgIaP1gRGRUysvLkZGRgYzkZNTX1ECjVMK2rg52inKYK5UQa9RQi8RolEpRJXNAtZUVRFIpLG1s4BcUhKeffhr5+fnNHjskJARHjx7l/rAEgI0dUbM0Gg2++eYbzJs3D3V1dXrj06ZNw+rVq2FraytAOiIyViqVCgqFAnK5HHK5HCXFxWisr4dKqYREKoW5pSU6ubjA2dkZzs7OkMlkkEgkGDNmDH7//febHnfFihVYtGhRK34TaqvY2BH9TVVVFWbPno2YmBi9MRsbG3z++eeYNm2aAMmIqL3KyMjA5MmTkZeXhylTpmDfvn24cOGCdtzc3BzJycnw9fUVMCW1BWzsiG6QnJyMqKgonD9/Xm/M398fsbGx8PLyEiAZEdG11QSRSITExEQMHToU6hsexggMDMTx48dhZmYmYEISGt9jR4Rrf1h++umnGDJkSLNN3dy5c3Hs2DE2dUQkqL8e0goNDcUrr7yiM5aamoply5YJEYvaEM7YUbtXVlaGZ555Brt27dIbs7e3x9dff43HH39cgGRERDdXX1+P4OBgnD59WluTSqU4duwYgoODBUxGQmJjR+3akSNHMGXKlGafNhs8eDBiYmLQvXv31g9GRHQHkpOTMWjQIJ132vn4+CA5OVlvxwpqH7gUS+2SSqXCe++9hxEjRjTb1C1atAiHDx9mU0dEbVpwcDDefPNNnVpWVhaWLFkiUCISGmfsqN0pKirC1KlTsX//fr2xTp06YdOmTRg7dqwAyYiI7l5jYyMGDRqks6WYWCxGfHw8tzhsh9jYUbvy66+/4umnn0ZJSYne2OjRo7Fp0ya4uroKkIyI6N5lZmYiODgYTU1N2pqnpyfS0tJgbW0tYDJqbVyKpXahqakJixYtwkMPPaTX1InFYrz33nv49ddf2dQRkVHy8/PDv//9b51aTk4O3njjDYESkVA4Y0cm79KlS5g8eTKOHTumN9a1a1fExMQgPDxcgGRERIajVCoRHh6O48eP69T379+PkSNHCpSKWhsbOzJpP/74I2bOnImKigq9s
YiICHzzzTdwdHRs/WBERC3g7NmzCAgIQH19vbbWvXt3ZGRkoEOHDgImo9bCpVgySfX19Zg7dy4mTZqk19SZmZnh008/xU8//cSmjohMipeXF5YvX65Tu3TpEhYuXChQImptnLEjk3PmzBlERUUhIyNDb6x3797Ytm0bX95JRCZLrVZj5MiROHz4sE597969eOihhwRKRa2FjR2ZlA0bNuDFF19EbW2t3tiUKVPwxRdfwM7OToBkRESt58KFC+jfvz9qamq0NTc3N2RmZsLBwUHAZNTSuBRLJuHq1auYNm0aZsyYodfUWVlZ4euvv8bmzZvZ1BFRu9CzZ0989NFHOrWCggLMnz9foETUWjhjR0YvLS0NkZGRyMnJ0Rvz9fVFbGws+vXrJ0AyIiLhaDQajB07Fr///rtOfceOHXjssceECUUtjo0dGS2NRoPVq1fjlVdeQWNjo974Cy+8gE8++QRWVlYCpCMiEt7ly5fh6+uLqqoqba1z587IysqCk5OTgMmopXAploxSeXk5Jk2ahHnz5uk1dXZ2dti+fTvWrFnDpo6I2rVu3brh//7v/3RqV65cwZw5c8B5HdPEGTsyOomJiZg8eTLy8vL0xkJCQrBt2zb07NlTgGRERG2PRqPBo48+il27dunUY2JiEB0dLVAqails7MhoqNVqfPDBB1i8eDFUKpXe+MKFC7Fs2TKYm5sLkI6IqO0qLi6Gj48PFAqFtiaTyXDq1ClupWhiuBRLRkEul+Ohhx7CG2+8odfUOTk5Yc+ePfjwww/Z1BERNcPFxQVffPGFTk2hUGDWrFlckjUxbOyozdu3bx/8/f31nuwCgBEjRiAtLQ3jxo0TIBkRkfGIjIxEZGSkTm337t1Yv369MIGoRXApltospVKJJUuWYPny5Xp/oxSLxXj77bexePFiSCQSgRISERmX0tJS+Pr6Qi6Xa2t2dnbIzMyEu7u7gMnIUNjYUZuUl5eHKVOmICEhQW+sS5cu2Lp1K4YPHy5AMiIi4xYXF4dHH31Up/bAAw/gt99+g0gkEigVGQqXYqnN2blzJwICAppt6saNG4e0tDQ2dURE9ygiIgLTp0/Xqe3btw9r1qwRKBEZEmfsqM1oaGjAq6++ilWrVumNmZmZYcWKFViwYAHEYv59hIjoflRUVMDX1xcFBQXamrW1NTIyMtCrVy8Bk9H9YmNHbUJOTg6ioqKQmpqqN9ajRw/ExsYiJCREgGRERKbpt99+w9ixY3VqQ4cOxYEDB3jvshHj1AcJbsuWLQgKCmq2qYuMjERqaiqbOiIiAxszZgxeeOEFndqRI0ewcuVKgRKRIXDGjgRTU1ODefPm4dtvv9Ubs7S0xMqVK/H888/zZl4iohZy9epV+Pv74+LFi9qahYUFUlNT4e3tLWAyulds7EgQmZmZiIqKQnZ2tt6Yt7c3YmNj4efnJ0AyIqL25dChQxgxYoROLSQkBImJiZBKpcKEonvGpVhqVRqNBmvXrsXAgQObbepmzpyJpKQkNnVERK1k+PDhWLBggU4tKSkJH3zwgTCB6L5wxo5aTUVFBWbNmoXvvvtOb8zW1hZffvklJk+eLEAyIqL2ra6uDgEBATh37py2ZmZmhqSkJPj7+wuYjO4WGztqFSdOnEBUVBQuXbqkNxYUFITY2Fj07t279YMREREA4NixYwgLC4NardbW/P39ceLECe7DbUS4FEstSq1W46OPPkJYWFizTd38+fORmJjIpo6ISGCDBw/Ga6+9plNLT0/Hu+++K1AiuhecsaMWU1JSgunTp2Pv3r16YzKZDN9++y0iIiIESEZERM1paGjAgAEDcOrUKW1NIpHg6NGjfO2UkWBjRy3i4MGDeOqpp1BYWKg3Fh4ejq1bt6Jbt24CJCMioltJTU3FwIEDoVQqtTVvb2+kpKTA0tJSwGR0J7gUSwalUqmwZMkSjBo1Sq+pE4lEWLx4MQ4cOMCmjoiojQoMDMRbb72lU8vOztarUdvEGTsymIKCA
kyZMgWHDx/WG3NxccHmzZsxevRoAZIREdHdaGpqwpAhQ5CcnKytiUQiHD58GOHh4QImo9thY0cGsWfPHkyfPh1lZWV6Y2PGjMHGjRvh7OwsQDIiIroXWVlZCAoKQmNjo7bWq1cvpKenw8bGRsBkdCtciqX70tjYiFdeeQUTJkzQa+okEglWrFiBvXv3sqkjIjIyPj4+ek/Enj9/HosWLRIoEd0JztjRPbtw4QKio6ORlJSkN+bh4YGYmBgMGTJEgGRERGQIKpUKQ4cOxdGjR3Xq+/bt4601bRQbO7onsbGxmDVrFqqqqvTGHn/8caxbtw4ODg4CJCMiIkPKycmBv78/6urqtDV3d3dkZmbCzs5OwGTUHC7F0l2pra3FrFmzEB0drdfUWVhYYPXq1fj+++/Z1BERmQhPT0+8//77OrW8vDy8/PLLAiWiW+GMHd2xrKwsREVFISsrS2+sT58+iI2NRUBAQOsHIyKiFqVWq/HAAw/gwIEDOvXdu3dj/PjxAqWi5rCxo9vSaDT4+uuv8dJLL+lMxf9l2rRpWL16NWxtbQVIR0REreHSpUvw8/NDdXW1tubq6opTp05BJpMJmIxuxKVYuqWqqipMmTIFzz//vF5TZ2Njgw0bNmDDhg1s6oiITFz37t3x8ccf69SKioowb948gRJRczhjRzd18uRJREdH4/z583pj/v7+iI2NhZeXlwDJiIhICBqNBuPGjcMvv/yiU//+++8xadIkgVLRjdjYkR6NRoOVK1fitddeQ1NTk974P/7xD3z44YfcM5CIqB0qKCiAr68vKioqtDUnJydkZWWhc+fOwgUjAFyKpb8pKytDREQE/vnPf+o1dfb29vjxxx+xatUqNnVERO2Um5sbVq1apVMrLS3FnDlzwLki4XHGjrSOHDmCyZMno6CgQG9syJAhiImJgYeHhwDJiIioLdFoNHj88cfx008/6dQ3b96Mp556SphQBICNHeHam8X/85//4J133oFardYbf/3117F06VKYmZkJkI6IiNoiuVwOX19flJaWamv29vbIyspCly5dBEzWvnEptp0rKirCmDFj8Pbbb+s1dZ07d8avv/6K5cuXs6kjIiIdzs7O+OKLL3RqFRUVeO6557gkKyA2du3YL7/8An9/f+zfv19vbPTo0UhPT8eYMWMESEZERMbgiSeewOTJk3Vqe/fuxTfffCNQIuJSbDvU1NSExYsX44MPPtAbk0gkWLp0KRYtWgSJRCJAOiIiMiYKhQI+Pj4oLi7W1jp06IDMzEzely0ANnbtzKVLlxAdHY3jx4/rjXXr1g1bt25FeHi4AMmIiMhY7dmzBxMmTNCpjRo1Cr///jvEYi4Otib+brcjP/zwAwICAppt6iIiIpCWlsamjoiI7tr48ePx7LPP6tT279+Pzz//XKBE7Rdn7NqB+vp6vPzyy3o3uQKAubk5PvzwQ8ybNw8ikUiAdEREZAoqKyvh5+eHy5cva2vW1tZIS0uDp6engMnaFzZ2Ju7MmTOIiopCRkaG3ljv3r0RGxuLoKAgAZIREZGp2bdvHx588EGdWmhoKA4fPsz7tlsJl2JN2IYNGxAcHNxsUzdlyhSkpKSwqSMiIoN54IEH8OKLL+rUEhMT8cknnwiUqP3hjJ0Junr1KubOnYtNmzbpjVlbW+Ozzz7DjBkzuPRKREQGV11djYCAAJw/f15bs7CwQEpKCvr16ydgsvaBjZ2JSUtLQ2RkJHJycvTGfH19sX37dnh7ewuQjIiI2ov4+HgMGzZM50XFAwYMQGJiIl9438K4FGsiNBoNPvvsMwwaNKjZpu6FF17AiRMn2NQREVGLCw8Px8svv6xTO3nyJFasWCFQovaDM3YmoLy8HDNnzsSOHTv0xuzs7LBu3To8+eSTAiQjIqL2qq6uDkFBQThz5oy2JpVKkZSUhICAAOGCmTg2dkYuMTERkydPRl5ent5YSEgItm3bhp49ewqQjIiI2rsTJ04gNDQUKpVKW/Pz8
0NSUhIsLCwETGa6uBRrpNRqNVasWIFhw4Y129QtXLgQ8fHxbOqIiEgwAwcOxOuvv65Ty8zMxNKlSwVKZPo4Y2eE5HI5nn76afz+++96Y05OTtiwYQPGjRsnQDIiIiJdjY2NCAkJ0Xn1llgsRmJiIgYNGiRgMtPExs7I7Nu3D1OnToVcLtcbGzFiBDZv3gw3NzcBkhERETUvPT0dISEhaGpq0ta8vLyQmpoKKysrAZOZHi7FGgmlUok333wTY8aM0WvqxGIx3nnnHezbt49NHRERtTn+/v5YsmSJTu3s2bN48803BUpkujhjZwTy8vIwZcoUJCQk6I116dIFW7duxfDhwwVIRkREdGeUSiVCQ0ORlJSkrYlEIhw8eBDDhg0TMJlpYWPXxu3cuRPPPPMMysvL9cbGjRuH9evXo1OnTgIkIyIiujvZ2dkIDAxEQ0ODttajRw9kZGTA1tZWwGSmg0uxbVRDQwNeeuklPPbYY3pNnZmZGf773/9i165dbOqIiMhoeHt7Y9myZTq1ixcv4rXXXhMokenhjF0blJOTg6ioKKSmpuqN9ejRA7GxsQgJCREgGRER0f1RqVQYMWIE4uPjdeq//fYbHnzwQYFSmQ42dm3Mli1bMHv2bFRXV+uNRUZG4ssvv0THjh0FSEZERGQY58+fR//+/VFbW6utde3aFadOneI17j5xKbaNqKmpwbPPPoupU6fqNXWWlpZYu3Yttm3bxv/giYjI6PXq1QsffvihTi0/Px8LFiwQJpAJ4YxdG5CRkYGoqCid/fT+4u3tjdjYWPj5+QmQjIiIqGWo1WqMGTMGf/zxh049Li4OjzzyiECpjB8bOwFpNBqsXbsWCxYs0HlC6C8zZ87EypUrYWNjI0A6IiKilpWXlwdfX19cvXpVW3N2dkZWVhYcHR0FTGa8uBQrkIqKCkRGRmLOnDl6TV2HDh2wdetWrFu3jk0dERGZLHd3d3z66ac6Nblcjn/84x/CBDIBnLETwPHjxxEdHY1Lly7pjQUHB2Pbtm3o3bt36wcjIiJqZRqNBo888gj27NmjU9++fTuefPJJgVIZLzZ2rUitVuO///0v/vWvf0GpVOqNL1iwACtWrICFhYUA6YiIiIRRVFQEHx8fnfe2Ojo6IisrC87OzgImMz5cim0lJSUlmDBhAl577TW9pk4mkyEuLg6ffPIJmzoiImp3XF1dsXr1ap1aWVkZXnjhBXD+6e6wsWsFBw4cgL+/P/bu3as3Fh4ejrS0ND4BRERE7Vp0dDQmTZqkU9u5cyc2b94sUCLjxKXYFqRUKvHuu+/i3Xff1fsbh0gkwptvvoklS5ZAKpUKlJCIiKjtKCkpgY+PD0pKSrS1jh074tSpU+jatauAyYwHG7sWkp+fj6eeegqHDx/WG3NxccGWLVswatQoAZIRERG1XTt27MDjjz+uUxs7diz27t0LkUgkUCrj0S4aO5VKBYVCAblcDrlcjpLiYjTU1UGtUkEskcDCygqdXFzg7OwMZ2dnyGQySCSSez7f7t27MWPGDJSVlemNjR07Fhs3bkTnzp3v5ysRERGZrKefflpvCXbt2rWYNWuWTq21r+/GwKQbu/LycqSnpyMzJQX1NTXQKJWwratDR4UCZkolxBoN1CIRmqRSVMpkqLaygkgqhaWNDfyCguDv7w8HB4c7Pl9jYyNef/11fPLJJ3pjUqkUy5Ytw8KFCyEW89ZGIiKimykvL4evry8KCwu1NVtbW2RkZKBHjx6tfn03JibZ2BUWFiIxPh4Xc3JgVlsL97zLcFUo0LGmBmYq1U1/rkkiQaWNDYpkMuS5d0OTtTV6eHoibOhQuLq6/u9zTU3YuHEjioqKMH36dHTr1g3nz59HdHQ0Tp48qXdcDw8PbNu2DYMHD26R70tERGRq9u7di3HjxunUxo8fj+lPP41Lf/7ZItd3U2BSjZ1SqURCQgKSEhJgW1qK3rl56FpaColafdfHUonFyHdywp8e7qh2ckJIWBjCw
sIglUoxd+5cfP755wCu3S/31ltv4fXXX9fZEuUvjz/+ONatW2eyfzMgIiJqKc8//zzWrVsHiUSC0NBQhIWEoEtjI7wLi1rk+m4KTKaxKy4uxp64OJTnF6BvTg48CwogNsBXU4tEyHFzwxlPT8i6uqGvry/Cw8Nv+14dCwsLfPLJJ5g9ezZv9iQiIroHVVVVGD58OAYEBsLNwQGeZ87A7VwOnDs5QSq5v0bs79f3cRERcHFxMVBy4ZhEY5ebm4sdsbGwLixCcHY27GprDX6OKmtrJHt7I9/SAhtiYpCXl3fTz3p5eSE2Nhb+/v4Gz0FERNRe5ObmYtvGjTDLy4N3cjKsq6oAAOZm5nB0coIhpk3+ur7XdumCiVGR8PDwMMBRhWP0jV1ubi5+iImBY24eBp4+Dek9TMveqTq1Gkd69USeoyO2/fhjs83d9OnT8dlnn8HW1rbFchAREZm6G6/vfY8dRf3fbney62BnsGutUizGcZ9+ULi7Y9LkyUbd3Bl1Y1dcXIxtGzfC/uIlDMnKMsjS660oFArUNjbg9JAhuOjggE3btuHKlSvacXt7e/z5559wdHRs0RxERESm7O/Xd5FajSslJVCpbtySU4ROnTrBzED3xqlFIhz19UFF9x6Inva00S7LGu17N5RKJfbExcG6sAiDTp9u8aauobEB9Q31EGs08D5+HF1q6xAxbpzO+3AqKiqwadOmFs1BRERkypq7votEIjjY2wM6i68aVFSUw1BXf7FGg0FZp2FVVIif4+L09nU3Fkbb2CUkJKA8vwDB2dktuvz6l8rKKu0/S1Qq9E0+CTeZDKGhoTqf4zvqiIiI7t3Nru/m5uawtbHR+WxTUxMaGxsNdm6pWo3g09lQFBQgMTHRYMdtTUbZhRQWFiIpIQF9c3Ja5EGJO2FTVQXPM2cQFhKina596KGH8NxzzwmSh4iIyNjd7vrewa4DpFKzFs3QsbYWXudycCI+HkVFRS16rpZglI1dYnw8bEtL4VlQ0Grn7NixI/43BSyCmdQMvS/no0tjIxb/618oKirC3r17YW1t3WqZiIiITMntru8iiCCTyWAmNQMggrW1DczNzQ2eo09BAWxLS5EQH2/wY7c0o3sbX3l5OS7m5CAwN6/F76u7kYW5OVxdXbX7z/3V4vUtLEKaiwssLCxaLQsREZGpudPru1QiQadOnVo0i1ijQa/cPKQ5OqK8vNyoNhkwuhm79PR0mNXWomtpaaufWwRAckNTBwDdSkshra1FRkZGq+chIiIyFUJe35tjrNd3o2rsVCoVMlNS4J53+Z62EWkJErUaHpcvIyM5Gapb7FNHREREzeP13XBatLE7efIkXn311ZuOx8XF4ZNPPrnj4ykUCtTX1ODc6dOISE1BRGoKfBLi8UhKMiJSU7AuP/++8p6rqcG0zAw8eDIJDyWfxDt//okmtRr/l5uLTYWFN/0517JruWJjYzFq1Cj0798f27Zta/aza9asQWxsLADg0KFD8PHxwaBBg+769+Jm1q1bB09PT4hEIlRXV9/38YiIiO7H0qVL4evrCz8/PwwYMAAXL17U+8xf1/dnY5u/dt7OV/mXdX7tHX9E2ydEpKag8R6bxb+u7wqFQqe+e/du+Pr6QiwW49SpU/d07JZyx/fYffnll5g1a9ZdHXzAgAEYMGDATccjIiLu6nhyuRwapRKP2tpiUmAQAGBk0gls8w+AzQ3vk9NoNNAAEN/FHq11KhXmZJ/G0l69EebgAI1Gg10lJWi8zX18GgAWZWWovVqN//znP8jKygIAPPXUUxg5ciScnZ11Pj979mztP8fExOCtt95CdHT0Hef8i0ql0nmH3l8GDRqE3377DSNHjrzrYxIRERlSYmIiDh48iLS0NEilUuTn58Pmb68sAf53fRfd473zX+Xn4/mu3bS/7iCVIu56n3A/OtbUQKNUQi6X69zX5+Xlhe+//17nmt5W3HFj9/nnn2PWrFmorq7Giy++iOzsbGg0Gqxcu
RJhYWGoqqrCnDlzkJmZCbFYjM8//xyNjY347LPP8P333+PAgQN46aWXIBaLYWZmhpMnT2L9+vU4deoUPvroI1y4cAHPPvssFAoFunfvjvXr10Mmk2HEiBEYNGgQ9u/fj9LSUjw1cuRN31s38NhRPOnigqMVFfivlxd+KS3F72VlaFKrMcW1Cya7ugIAvricp1ffVVKCELuOCLt+g6RIJEJE585654gpKsL38mI0qNXoa2WFhY5OUKuUOBkfr/O3ELVajXfffRe//PILzMzM0LVrV6xduxYrV66Eg4MDbG1tsW3bNvz888+Ii4tDQEAAzp07h3/9618oLS3Fm2++ieLiYlhYWGD58uXo1asXXn31VTg4OODUqVMYOnQo5syZo5fPxsYGGo0GSqUSFy9ebPZ/ICIiotaQnp4Oc3NznS04Gxsb8eOPP+Kzzz5DQ0MD+vfvj0mTJsGmpgYAoLy+7LkmPx/7FWVo1GgwxcUVU7p0AQB8lpeLvaWlEEOEJ12cUdrYhKtKJSJSUxBkZ4d3evVuNsu4lGT84B8AAAg+dhRb/Poj0M4OEakp2OLXH2KRCO/8+SfO1117zcqbPXsi2K4jbOvqIJfL4evrqz2Wp6enwX+vDOWOG7uzZ88CAN577z1MnDgRGzduRH5+PsaPH4/09HQsXboU3bt3x5YtW6BSqVBTU4OUlBTtz3/88cf4+OOP8eCDD6KyslLv+C+99BJefPFFREZG4v3338c777yD//u//wMAWFhYICkpCS/MmoX4Q4cwvbOz3s8DQIVSiQF2HfFq9x44XK5AWWMTfgwIRKNajckZ6Rgpk+FcbU2z9T9ra+F9B03Qw05OiHJxwZUrV/BBcREOV1Ui3MYG+5NPYvyECfjuxx+1n129erX2n8+cOYNevXrpHa+yshIxMTGIiYkBAHz99dd6nxkzZoxe7fjx4/joo49umbV///63/T5EREQtrbnr31+ys7NxOTcX03v0gFqtxpUrchyvrUVBbS0+c3FBk0aDeQX5GGxtjVy1CicqK7EjIBDmYjEqmppgb2aGbcVFOjN0fzV6ABDQoQOW9vZEf9sOSL96FRoAXtY2SK6qQu/rryjrIJXiw0sX8aCjIz508kJxQwOez8rCrqAg2CnKUVJc3KK/P4Z01687+f333/Hzzz/j3//+NwCgrKwMjY2N2L9/P+Li4gBce3LUzs5O5+fCwsLw+uuvIzs7G08++eT198L9T1JSEnbt2gUAePrppzF+/Hjt2KOPPgoA6OrqisSqKuAmjZ2lWIyRMhkAIKG8AvsVCpyoutZEViuVyKuvu2kd0OBOVm7P1NTgowvncVWpxFWVCq5mZgi3sUFPBwccT0q6/QGIiIhIR0N9PaQ37CCRXFuLxJoapNXVAQBq1GqcUSiQBmCSswvMr+/yZG/W/MuKm1uKDbKzQ3JVFTTQ4LmuXbGnpAS9ra0R2KEDACCxvAKHFQp8dvna7GKFsgmNajXMlUrU19cb+Bu3nDtu7Pr27Qvg2v1ru3fvhru7+12d6PXXX8fDDz+MPXv2ICQkBMePH9cZF93QVWmu7wv3F+074jQaqG+x/m55w3ZeGgDz3N0x8W/3uO0rUzRbv1hXh9Sqq7f9Hv/KycHH3bujs7IJW8rLtffgLQgNxY81Nci7fPk2RyAiIqIbBfn7Q1T1v607NQBmyGQYe73pAgCRSIy0+9htKsjODssvXIBIBMzo4oaYoiIkV1Uh2K7j9XNqsLafD7pYWur8nFijhsqI9o2946di//GPfwAAHnjgAZ0lxvT0dG39iy++AHDtxv6qG/4FAcD58+fh7++Pf/3rX/D29tZ7KmbAgAH44YcfAABbt27F0KFD9cM287DAzYTa2+N7eTHqr6/VX6itRYNafdN6RKfOOFFZicSKcgDXmsvviotR87dHnOvUKnjY20MlFuPQ9fsB1BoNyurq4NzMPXlERER0a5VXr0Jzw+RMsJUVfq6qQsP1e+oLlCpY2dkh1N4eP
8iLtU+5VjQ1AQAkIhFUt3nwopeVFS7V16FBrYatVIoe1lbYeUWOoOsrjKH2DthywxZi2dffLKEWiSGRGs9+DnecdObMmQCAt99+G/PmzYOfnx9UKhVGjx6NVatW4a233sLs2bPh5+cHiUSibfL+8sknn+DAgQOQSCQICQnBkCFD8Oeff2rH/+///g/PPPMMli5dCg8PD2zYsEEvg7mFBdR3+KTrCJkMObU1eCI9DRoAjmZmWNPP56Z1a4kEn/frh2UXzuOdP89DIgIG29vjsb81ay92c8ektDR0tbSAr50dpBoN1AC+PHkSpdf/A/tLaGgoKisrodFoMH78ePz73//Ge++9B0dHR8yZMwezZs3CY489hnHjxmHTpk04ffo0li9fDrlcjpdeegkXL16EUqnElClTsHDhQp3P38z69evx3nvvQS6Xo3PnzoiOjsayZcvu6PeMiIjIkFJSUvDyyy/j6tVrK2KBgYFYtWoVjhw5gnfeeQdKpRJSqRR+/v4wz8uDWCyGq4srJgIozc/HPLn8f9fqLl0wwtoaWdXVeCwtFVKRCE86u+DpLl0wsbMzJqQkY5C9/U0fnhCJRPC0toabxbUZuaAOdjigUKDr9Rm6ue7uePf8eUxISYZKo8EQe3u8bdsbjVIpzP82i/frr79i5syZKCkpwQMPPICRI0dq75UXmkijacV9ue7TH3/8gbO//ooHjx4TOoqOxqZG/DJgAH7Ozsb+/fsBXFs+LioqMqptSIiIiITQVq/vAPD7kMHwGjsWo0ePFjrKHTGeuUUAzs7OSLayQpNEArM29BZokaUVVI6OmDt3Lrp06YKSkhIsXLiQTR0REdEdaKvX9yaJBNVWVnrvpG3LjK6xE0mlqLSxgdPf7uETUqWNDURSKYYOHYrHH3+8Vc65bNkyfPfddzq1l19+GdOmTWuV8xMRERlKW72+b1Eo8O2XX2LjDz9Aev0+u+joaLz++usCJ7s5o2rsZDIZLG1sUCSTtal/8UWO13LJrr9qpTW8+eabePPNN1vtfERERC2lrV7fgwP84RIQgBfnz292t6e2qEX3ijU0iUQCv6Ag5Ll3g0rcNqKrxGLkduuG/sHBRvMvnYiIqC3h9d1w2sbv3l3w9/dHk7U18p2chI4CALjs5ASltTV3eSAiIroPvL4bhtE1dg4ODujh6Yk/Pdzv+NUnLUUtEuG8hzt69OnDByWIiIjuA6/vhmF0jR0AhA0dimonJ+S4uQma45ybG6qdnBAWHi5oDiIiIlPA6/v9M8rGztXVFSFhYTjj6Ymq6xv4trZKa2uc7eOJgeHhcHV1FSQDERGRKeH1/f4ZZWMHAGFhYXDo6oZkb28oW/lGS6VYjOR+3pC5uSE0NLRVz01ERGTKeH2/P0bb2EmlUoyPiEBtly447tOv1dbj1SIRjvv0Q51rF4yLiNC+14aIiIjuH6/v98doGzsAcHFxwcSoSCjc3XHU16fFO3ulWIyjvj5QuLtjYlQkXFxcWvR8RERE7RGv7/fOqPaKvZnc3FzsiN0O68JCBGdnw6621uDnqLS2RnI/b9S5dsHEqEh4eHgY/BxERET0P7y+3z2TaOwAoLi4GHvi4lCeX4C+OTnwLCiA2ABfTS0S4ZybG8728YTMzQ3jIiKMupMnIiIyJry+3x2TaewAQKlUIiEhAUkJCbAtLUWv3Dx0Ky2FRK2+62OpxGJcdnLCeQ93VDs5YWB4OEJDQ412zZ2IiMhY8fp+50yqsftLYWEhEhMScPHcOUhra+Fx+TJcyxToWFMDM5Xqpj/XJJGg0sYGRY4y5HbrBqW1NXr06YMwI33kmYiIyJTw+n57JtnY/aW8vBwZGRnISE5GfU0NNEolbOvqYKcoh7lSCbFGDbVIjEapFFUyB1RbWUEklcLSxgb9g4PRv39/o3vjNBERkanj9f3mTLqx+4tKpYJCoYBcLodcLkdJcTEa6+uhUiohkUphbmmJTi4ucHZ2hrOzM2QymVFt+EtERNQe8
fqur100dkRERETtgVG/x46IiIiI/oeNHREREZGJYGNHREREZCLY2BERERGZCDZ2RERERCaCjR0RERGRiWBjR0RERGQi2NgRERERmQg2dkREREQmgo0dERERkYlgY0dERERkItjYEREREZkINnZEREREJoKNHREREZGJYGNHREREZCLY2BERERGZCDZ2RERERCaCjR0RERGRiWBjR0RERGQi2NgRERERmQg2dkREREQmgo0dERERkYlgY0dERERkItjYEREREZkINnZEREREJoKNHREREZGJYGNHREREZCL+H+m2TEIUXhWCAAAAAElFTkSuQmCC", - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "image/png": "", - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "image/png": "iVBORw0KGgoAAAANSUhEUgAAAnYAAAHWCAYAAAD6oMSKAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8fJSN1AAAACXBIWXMAAA9hAAAPYQGoP6dpAACuxElEQVR4nOzdeVhUZfsH8O8sbMM+IMMmaIrIJii5gUumuYtLamqv4pZZvplZvWa9ldliWb5plmlu4S8zTbNIJc1dwAVBVlFAkX0bZthm2Gbm/P5QJ4/jgjDDGeD+XJdXcc/MOfeg4nee55zn4TEMw4AQQgghhLR5fK4bIIQQQggh+kHBjhBCCCGknaBgRwghhBDSTlCwI4QQQghpJyjYEUIIIYS0ExTsCCGEEELaCQp2hBBCCCHtBAU7QgghhJB2goIdIYQQQkg7QcGOEEIIIaSdoGBHCCGEENJOULAjhBBCCGknKNgRQgghhLQTFOwIIYQQQtoJCnaEEEIIIe0EBTtCCCGEkHaCgh0hhBBCSDtBwY4QQgghpJ2gYEcIIYQQ0k5QsCOEEEIIaSco2BFCCCGEtBMU7AghhBBC2gkKdoQQQggh7QQFO0IIIYSQdoKCHSGEEEJIO0HBjhBCCCGknaBgRwghhBDSTlCwI4QQQghpJ4RcN0AIIfqkVqshk8lQUlKCkpISlBUXo762Fhq1GnyBAGYWFujk7AyJRAKJRAKxWAyBQMB124QQohc8hmEYrpsghJCWksvlSEpKQkpCAuoUCjAqFaxqa2Erk8FEpQKfYaDh8dAoFKJSLEaNhQV4QiHMLS0R0KcPAgMDYW9vz/XbIISQFqFgRwhp0woLCxEbHY3szEyYKJXwyM2Di0wGW4UCJmr1Q1/XKBCg0tISRWIxcj06o1EkQlcvL4QOHgwXF5dWfAeEEKI/FOwIIW2SSqVCTEwM4mJiYCWVontOLtylUgg0mic+lprPR76jI7I8PVDj6Ii+oaEIDQ2FUEhXqxBC2hYKdoSQNqe4uBiHIyMhzy9Az8xMeBUUgK+HH2UaHg+Zbm645uUFsbsbxoaFwdnZWQ8dE0JI66BgRwhpU3JycnBw716ICosQnJ4OG6VS7+eoEokQ7+MDpasrJr8wHZ6enno/ByGEGAIFO0JIm5GTk4MDe/bAIScX/a5ehbAZ065NpeLzcdHPFzIPDzw/cyaFO0JIm0Dr2BFC2oTi4mIc3LsX4pxcDEhLM2ioAwChRoOBqWkQ5+bi4N59KC4uNuj5CCFEHyjYEUKMnkqlwuHISIgKi9D/6lW9XE/XFHyGQf+0q7AoKsSRyEioVKpWOS8hhDQXBTtCiNGLiYmBPL8AwenpBh+pu59Qo0Hw1XTICgoQGxvbqucmhJAnRcGOEGLUCgsLERcTg56ZmQa5UaIpbJVKeGdk4lJ0NIqKijjpgRBCmoKCHSHEqMVGR8NKKoVXQQGnffQoKICVVIqY6GhO+yCEkEehYEcIMVpyuRzZmZnonpPbatfVPQyfYdAtJxfZGRmQy+Wc9kIIIQ9DwY4QYrSSkpJgolTCXSrluhUAQGepFEKlEsnJyVy3QgghD0TBjhBilNRqNVISEuCRm9esbcIMQaDRwDMvD8nx8VA/Yh9aQgjhCgU7QohRkslkqFMo4CKTcd0Ki0v57b5kRtYXIYQAFOwIIa3ou+++Q1BQkPaXt7c3BAIBampqdJ5bUlICRqWC3QMeu1+DRoNPb97A8MtxGJcQjzkpyUhvwuuexG8lJShvaICtQgFGpUJJSYnOc5YsWQInJyc8/fTTej03IYQ0FQU7QkirWbJkCRITE7W/QkJC8J///AdWVlY6zy0pKYFVba123Tr1I26e+OpWNmrVGhwNfhqH+wRjdXcvyFSNeu39t5ISlDc2wkSthlVt7Q
OD3axZsxAVFaXX8xJCyJMQct0AIaRj+v3335GYmIiLFy+ipqYGr776KtLT08EwDDZs2ICy4mIcP34c54qKkFNbh+4iEaZIJPggKxMNGg16WlrhEy8vqBkGv5eW4lTffhDyeACALhYW6GJhAQD4IT8Pf5SWggdgkXtnhDk54WJFBX4qKsRGH18AwGvpV/EvF1f0t7NDvwvnMUUiQbRcDrGJCTb7+iFaLkdqTTVeu5YOa4EQb3t7o+wBW4yFhobi1q1brfUtJIQQHTRiRwhpdSUlJXjttdfw008/wdTUFJ988gkmT56MuLg4/P7773j11VdRX1sLgUaDDIUSW/388H63bliRcR0fdOuGP/sEQyTgY3dRIXLr6uBiZgZLgUDnPCnV1Ygqk+K3oN74KaAXNuTmoKS+/pG9VahUGGovxqE+wZCYmuFYuRQjHR3hb2WNjT19sD8oCKYqFRrq6gz17SGEkGajYEcIaXULFy7EG2+8AT8/PwDA33//jQ8//BBBQUEYP348ysvL0VBfDx7DYLiDGKZ8PqpVKjRoGARa2wAAJjpJcLmyCgzDgPeQ88RXVWGkowPM+HzYmZhgoK0dUh5z7Z2lQICBdnYAAH8rKxTU6QZBPqOBmvaNJYQYIQp2hJBW9cMPP0ChUOCNN97Q1hiGwaFDh7TX3uXn58PUzAwMjwdz/u2RuPuvsGPAgMcDPC0sUFhfD2UTlh9hAPAACHg83LuASoPmn6Ob8P6JiXwe74HX9ml4fAiEdCULIcT4ULAjhLSaGzduYPXq1YiIiADvngA1YsQIfPfdd9qvk5KSYGZhATX/nx9RNkIhTPk8JFdXAwD+LCvD0za2EAkEmOjkhC+yb2pDWJZSiXNyOYJtbPB3eTkaNBpUqhpxsbICAdbWcDUzQ5ZSCRXDQNrQgCvVVY/t3VIggOJOeGwQCmFqbq6X7wkhhOgTfeQkhLSatWvXQqlUYsKECax6REQE1q9fj4CAAKjVagwfPhyTJk1CvYUFUPlP6Pq8Rw98mJWlvXlilosLAOCtLl3xRfZNPHf5MkQCPsQmJnjvqW7wtrTEaEdHTE68Ah6ApR6ecDI1BQAMtRdjfEI8uotE8LHUvSv3flMkEryTmQFrgRAvDxwAb2dnnecsXLgQhw8fRnl5Odzd3bFx40ZMnjy5Bd8xQgh5MjyG4XgDRkIIeYDU1FQc+fVXjD9zFiZGtMtDo0CAQ0OHYOy0afD39+e6HUIIYaGpWEKIUZJIJOAJhai0tOS6FZZKS0vwhEJIJBKuWyGEEB00FUsIMUpisRjmlpYoEovhWPX4a+BaS5GDGD/t24fdBw6w6j///DN8fX056ooQQm6jYEcIMUoCgQABffogsbwcvrm5EGg0j3+Rgan5fOR07oxvFizA0KFDuW6HEEJ00FQsIcRoBQYGolEkQr6j4yOf16hSobSsFIVFRahqwh2uzZXn6AiVSIRevXoZ7ByEENISFOwIIUbL3t4eXb28kOXpAQ3vwcsQaxgGMpkMKpUKAIOamho0GmDxYA2PhxueHujaowfs7e31fnxCCNEHCnaEEKMWOngwahwdkenm9sDHq6qqoFYbfheIDDc31Dg6InTQIIOfixBCmouCHSHEqLm4uKBvaCiueXmhSiRiPVZXXw+lUsGqmZqawaQZu0JoGA3qGxp0drgAgEqRCNd7eKHfoEFwubN2HiGEGCMKdoQQoxcaGgp7dzfE+/hAdWc3Cg3DoKKigvU8Ho8Puzv7vD6J2ro6FBeXoLxciuKiIihra7WPqfh8xPv6QOzmhpCQkJa8DUIIMTgKdoQQoycUCjEuLAxKV1dc9POFhsdDZWUlNBr2wsU2NjYQCgRPfPzqqirc3Y2WAYOKCjmk5VLUqVS46OeLWhdXjA0Lg5D2hyWEGDkKdoSQNsHZ2RmTX5gOmYcHont6o6ahnvW4mZk5LO+bqm2yB9yYUatW46xXd2Tb2iLkmaFwfsAWYoQQYmwo2BFC2gxPT08MHzMGGZ
aWSBwyBEobGwB3p2Btm31cszv7x96lsLFB4pChyLa3x87duzFs2DCcPHmyRb0TQkhroL1iCSFtBsMwmDZtGs6dO4ewcePgZm8Pr2vX4FtaBitz82YfV6lUoqKyAhoeD4U9eiCzZ08UyGSIPHIEpaWlAIAhQ4bgzJkz+norhBBiEHTBCCGkzfjll19w4M5WXjt37UJISAiY0FBU19ejW04uOkulzduhwtQUJZ6eyOveHVIrK8TExSE2NhZq9T/X8FlZWenrbRBCiMHQiB0hpE0oLCyEv78/5HK5tubg4IDTp0/jWno6sjMyIFQq4ZmXB5dyGWwVCpio1Q89XqNAgEpLSxQ5iHHL3R3ShgZkZGcjJjYWxcXFrOe6ubnh1KlT8PLyMtj7I4QQfaARO0KI0WMYBosWLWKFOgD4/vvv4e/vrw18ycnJSI6Pxw2FAoxKBavaWtjI5DBVqcBnNNDw+GgQClEltkeNhQV4QiHMLS3Ru3dvTJs2DeXl5Q88v7+/P7p3794ab5UQQlqERuwIIUZvx44dWLBgAav2wgsv4JdfftF5rlqthkwmQ0lJCUpKSlBWXIyGujqoVSoIhEKYmpujk7MzJBIJJBIJxGIxBAIB+vbti8uXLz+0h507d2Lu3Ln6fmuEEKJXFOwIIUYtJycHAQEBqK6u1tYkEgnS0tLg4OCgt/OcPXsWU6dORUVFBaZMmYKTJ0+irKxM+7iNjQ1SU1PRuXNnvZ2TEEL0jYIdIcRoaTQajBw5EidOnGDVIyMjMWHCBL2fj2EY1NbWQiQSITIyEhMnTmQ9PmLECBw7dgy8B6x7RwghxoDWsSOEGK3NmzfrhLq5c+caJNQBAI/Hg+jOIsdhYWEIDw9nPX78+HFs3rzZIOcmhBB9oBE7QohRysrKQmBgIJRKpbbm7u6O1NRU2No2fzHiJ1FRUQF/f38UFBRoa5aWlkhKSkK3bt1apQdCCHkSNGJHCDE6arUa8+bNY4U6ANi+fXurhToAsLOzw44dO1g1hUKBefPmQdOc9fIIIcTAKNgRQozO+vXrER0dzaotXrwYI0eObPVeRo4ciZdffplVO3fuHNavX9/qvRBCyOPQVCwhxKikp6ejd+/eqK+v19a6du2K5ORkznZ/qK6uRmBgILKzs7U1MzMzJCYmomfPnpz0RAghD0IjdoQQo6FSqRAeHs4KdTweDz/++COnW3pZW1tj586drLth6+vrER4eDpVKxVlfhBByPwp2hBCj8cUXXyAuLo5VW7ZsGYYMGcJRR/8YOnQoXn/9dVbt0qVLWLt2LUcdEUKILpqKJYQYhaSkJPTt2xeNjY3amre3N65cuQILCwsOO/tHbW0tevfujevXr2trJiYmuHz5Mnr16sVhZ4QQchuN2BFCONfQ0IA5c+awQh2fz0dERITRhDoAsLCwQEREBPj8f350NjY2Ys6cOWhoaOCwM0IIuY2CHSGEc6tXr0ZycjKrtmLFCvTv35+jjh6uf//++M9//sOqJSUl4ZNPPuGoI0II+QdNxRJCOHXp0iWEhIRArVZrawEBAYiLi4OZmRmHnT1cfX09nn76aaSmpmprAoEA58+fR9++fVnPVavVkMlkKCkpQUlJCcqKi1FfWwuNWg2+QAAzCwt0cnaGRCKBRCKBWCyGQCBo7bdECGknKNgRQjhTW1uLPn364Nq1a9qaUChEXFwcgoKCuGusCa5cuYJ+/fqx7or18fFBQkICzM3NIZfLkZSUhJSEBNQpFGBUKljV1sJWJoOJSgU+w0DD46FRKESlWIwaCwvwhEKYW1oioE8fBAYGwt7ensN3SAhpiyjYEUI489Zbb2HdunWs2urVq/H+++9z1NGTWb16NT788ENW7T//+Q/6BgcjOzMTJkolPHLz4CKTwVahgMk9o5L3axQIUGlpiSKxGLkendEoEqGrlxdCBw+Gi4uLod8KIaSdoGBHCOHEuXPnMHToUNz7Iyg4OBjnz5+HiYkJh501XWNjIwYOHIj4+H
gIBAKEhIQgtG9fuKvU8M7Ph7tUCkEzth5T8/nId3RElqcHahwd0Tc0FKGhoRAKhQZ4F4SQ9oSCHSGk1dXU1CAwMBA3b97U1szMzBAfHw8/Pz8OO3tyaWlpGDVqFMaMHAk3e3t4XbuGzjduwtnRkbWgcXNoeDxkurnhmpcXxO5uGBsWBmdnZz11Tghpj+jjHyGk1a1YsYIV6gDg448/bnOhDgCsrKzw8vz5sCwqgs+pUxBVVYEBUFVVBVtb2xYdm88w8M7Ph4tMhvgqH/xSUYnJL0yHp6enfponhLQ7NGJHCGlVx48fx3PPPceqhYSE4OzZs23ubtCcnBwc2LMH4pxcdD93Duq6WtbjDg4OMDPVz529Kj4fF/18IfPwwPMzZ1K4I4Q8EAU7QkirqaysREBAAPLy8rQ1CwsLJCUlwcvLi8POnlxxcTF+2bULdtm3MDAtDRqVCmWlpWDwz49UgUCATp2cwG/hlOxdGh4P5/39UNGlK2bMmU3TsoQQHbRAMSGk1SxfvpwV6gBg7dq1bS7UqVQqHI6MhKiwCP2vXgWfYSAUCGBjY8N6nlqtRlVVpd7Oy2cY9E+7CouiQhyJjGQttUIIIQAFO0JIKzl06BB27NjBqg0bNgyvvvoqRx01X0xMDOT5BQhOT4fwnrteRZaWML1v6lWpVLIWX24poUaD4KvpkBUUIDY2Vm/HJYS0DxTsCCEGV15ejpdeeolVs7a2xo4dO1j7rrYFhYWFiIuJQc/MTNgolazHeADs7OzA47Hfk0qPwQ4AbJVKeGdk4lJ0NIqKivR6bEJI29a2fqISQtqk1157DcXFxaza//73P3Tp0oWbhlogNjoaVlIpvAoKHvi4UCC4He5w+7o6U1MzmJqa6r2PHgUFsJJKERMdrfdjE0LaLlruhBBiUPv378eePXtYtTFjxmDBggUcddR8crkc2ZmZ6J2TC/4j7juzMDeHqUQCtVoNExMT6OfWCTY+w6BbTi4SHRwgl8tp+zFCCAAasSOEGFBpaSleeeUVVs3Ozg7btm1r8eK9XEhKSoKJUgl3qfSxzxXw+TA1UKi7q7NUCqFSieTkZAOehRDSllCwI4QYBMMwePnllyG9LwR9++23cHV15air5lOr1UhJSIBHbl6ztgkzBIFGA8+8PCTHx+v1Bg1CSNtFwY4QYhC7d+/G77//zqpNnjwZs2bN4qahFpLJZKhTKOAik3HdCotL+e2+ZEbWFyGEGxTsCCF6V1BQgNdee41Vc3R0xObNm/U2BSsUChEUFAR/f39MmzYNyvvuUNW39evX4+vvvsO8M6fhE30OYVcSEHYlAZGlpXo/16XKCoxNiMfUxMTHPtdWoQCjUqGkpETvfdwrMTERAwYMgL+/P4KDg3H69GmDno8Q0jx08wQhRK8YhsHChQtRUVHBqm/evBlOTk56O4+dnR0S7wSfF198EZs3b8by5cv1dvz7DR06FG6NjXju/AX0u3Aekb37sB5XMwwEegqth8rK8Grnzhjf6fHfLxO1GiKFAiUlJfD392/xudVq9QO3drO0tMTu3bvRrVs3XLt2DWPHjtXZ75cQwj0KdoQQvdq+fTv++usvVm3WrFl4/vnnDXbOwYMHIzk5GVKpFPPmzUNOTg7EYjF+/PFHuLm5wc/PDxkZGcjIyIC3tzcKCwshkUjg5eWFrKwslJaW4uWXX0Z+fj7Mzc2xbds29OzZE3PnzoWDgwPi4+PhLJFggrU167z5dXV45epV9LK2RnJ1FQ4E9cZr6ekobWhAA6PBUg9PjHJ01D7Px8oSydXV8La0xHrvnuDxePgi+yZOymQw5fEwxrETXMzMECWVIlpegUuVlXi361P4b1YWrilqYMbn4+PuXvC1ssI3OTmQNjYgp7YOVrJy/HrpEg4ePIjU1FQUFBRg165d2LBhAxISEvD8889jzZo1AICdO3fi+++/R11dHSZNmoTVq1fj1q1bmDhxIvr164eLFy8iLi4OZmbshZbv3R3E29sbNT
U1Dw2BhBDuULAjhOjNrVu38MYbb7BqLi4u2Lhxo8HOqVKpEBUVhdGjR2PVqlUYPHgw/vzzT+zduxdLly5FZGQk3NzckJ2djejoaPTp0wfR0dHw9vaGn58feDweli1bhvfffx/BwcGIi4vDsmXLtOE0Ly8Pp06dwq7t22HygDXjspQKfOntjZ6Wt4PPFz16wM7EBNUqFaYmJWKkgwMA4GatEut79sRTFhaYnZKCy1VV6C4S4YhUilNP9wWfx0O1SgVroRAXKisw2tERw8QO2J6fDyuBAIf6BCOxqgorMjLwZ5/bo4UZCiV2BQQgzcsLGzMzUF1djdOnT2P37t2YMGEC4uPj4eLiAm9vb7z55psoLS3FkSNHcP78efB4PEycOBHnz5+Hi4sL0tLS8H//93/YunXrY7/nBw8eRHBwMIU6QowQBTtCiF5oNBrMnz8fNTU1rPrWrVshFov1fr6KigoEBQUBuD1it2DBAvTr1w9HjhwBAEyfPh2vv/46ACA0NBTR0dGIjo7GihUrEB0djbKyMoSGhgIATp48ifT09AeeZ+rUqeDxeNCo1Q9cu66LhQV6Wlpqv/6xsAAnym/fyFBUX4+yxkYAQFcLC3QTiQAAvlaWKKivQ28bG1gLBFiZmYERDg4YJnbQOf7lqiq85O4OAAiysUG9RoPqO3vEDncQw5TPB5/RgNFoEBYWBgAICAiAl5cXPD09AdwebcvLy0N0dDTOnz+P4OBgAEBNTQ1u3LgBFxcX9OjRA7169Xrs9/3GjRtYsWIFoqKiHvtcQkjro2BHCNGL7777DqdOnWLVFixYgHHjxhnkfPdeY/cwd2/UCA0NxR9//IHMzExs3boVmzdvhlQqZe1TGx8f/8ARKNGdMMYXCKB5wDV0Fve85kJFBRKqqvBrYCDMBQKMir+MhjtLo5jes3Uan8eDhgGEPB5+C+qNmAo5/igtRWRpKTb6+D7yPTFgtGvjmfNvn1vD44PH52unT/n3/P/dr9VqNRiGwaJFi/DBBx+wjnnr1i3t+3wUmUyGSZMmYcuWLejevftjn08IaX10VywhpMUyMjKwYsUKVs3DwwP/+9//WrWPQYMG4eeffwZwe8eLfv36AQBCQkIQFRUFJycnCAQCWFhYICYmBk8//TSA2zdGbNmyBcDtkceUlBSdY5tZWKBR+OjPwjVqNeyEJjAXCJBUXY1btbWPfL5CrUa1SoVhYge80/UppCsUOs952sYGf5bdvvM2qboaFgIBrO7ro0EoBL8J06LPPvss9u7dC7lcDgDIz89HeXn5Y18HAA0NDZg8eTKWL1+OZ599tkmvIYS0PhqxI4S0iFqtxty5c1F7X4jZsWMHbGxsWrWXVatWYe7cudi1a5f25gkAsLW1ha2trXbqtX///qioqNCOam3cuBGLFy/G5s2boVKpMGfOHAQEBLCO3cnZGdcfM6U82N4eu4sKEXYlAT0tLdFDZPnI5yvUarxyNQ0NmttTvG936arznBddXPDfrExMSIiHKZ+Pz7166DynSmwPcwuLR54LAPz9/bFixQo888wz0Gg0sLa2xi+//PLY1wHAvn37cOHCBVRWVmLDhg0AgBMnTsDBQXf6mBDCHR7DPGLDQ0IIeYy1a9fqjNYtWbIE3377LUcdGUZqaiqO/Porxp85CxMj2uWhUSDAoaFDMHbaNL0sd0IIadtoKpYQ0mxpaWl4//33WbVu3brhiy++4Kgjw5FIJOAJhai0fPQoXGurtLQETyiERCLhuhVCiBGgqVhCSLM0NjZizpw5aGho0NZ4PB4iIiJgaWThRx/EYjHMLS1RJBbDsaqK63a0ihxu96WvO4/Ly8sxfPhwVk0kEiE2NlYvxyeEGBYFO0JIs6xZswYJCQms2ptvvqm9jq29EQgECOjTB4nl5fDNzYXgzt2u96utq0N1VRXA48HOzg6mJiYG60nN5yOnc2f00eOacg4ODo+925gQYrxoKpYQ8sQSEhLw8ccfs2o+Pj46tfYmMD
AQjSIR8h0dH/h4bV0d5HI5VGoVVKpGnW3V9C3P0REqkahJ688RQjoGCnaEkCdSX1+P8PBwqO4skgvcHs2KiIiAubk5h50Znr29Pbp6eSHL00NnTbv6hoY7y4j8cz+a5iGjevqg4fFww9MDXXv0gL29vcHOQwhpWyjYEUKeyKpVq5CamsqqrVy5En379uWoo9YVOngwahwdkenmpq01NjZCJpPh3lAHAFZWVgbrI8PNDTWOjggdNMhg5yCEtD0U7AghTXbhwgWsXbuWVQsMDNS5M7Y9c3FxQd/QUFzz8kKVSASVWo1ymQwMwx6dE4ksYdXCm0gethpVpUiE6z280G/QILi4uLToHISQ9oWCHSGkSZRKJcLDw1nTiyYmJti1axdMTU057Kz1hYaGwt7dDZd79kRpRQU0Gva6dubmFrC1tW328RsaG1FUVISi4iIUl5Sg/p47j1V8PuJ9fSB2c0NISEizz0EIaZ8o2BFCmuS9995DRkYGq7Zq1aoOeeG+UCjEsBEjkGMiROrTwazr7UxNzWBvbwfdXWWbrrKiAsydaV2NRo3ycinkFRVo1Ghw0c8XtS6uGBsWBuFjtjgjhHQ8tPMEIeSxzpw5g2eeeYZV69evH2JiYjpkuGhoaMD48eNx/fp1zJgyBR7l5fC5eBFmPD4cHR3B57Uk1gGlZaWsm1MAQC0QIH3AAJR37oxpL74IHx+fFp2DENI+UbAjhDxSdXU1AgMDkZ2dra2Zm5vjypUr6NmzJ4edcUOj0eBf//oX9uzZAwDw8PDAtEmT4FZbh9DsbNjX1bX4HBWVlVAqFdqvFTY2uBb8NApFFth38CAaGxsRFRWFoKCgFp+LENK+0FQsIeSR3n77bVaoA4BPP/20Q4Y6hmHw1ltvaUMdAOTm5uLPv/6CqZ8vzg4ciOvu7jpLoTwpkzuLGmt4POR7e+PSsGFIVzXi/375BXl5eSguLsZbb73VonMQQtqnjjeHQghpsqNHj2LLli2s2uDBg/H6669z1BG3vvrqK3z99desmkgkwv/93/+hT58+iImJQZy5GfJdnNEtJxedpdKH7lDxKDwTE5R4eiKve3dIrawQExeH2NhYqNX/3KRx/1QtIYQANBVLCHmIiooK+Pv7o6CgQFsTiURITk5Gt27dOOyMG7t27UJ4eDirJhQK8eeff2L06NHaWmFhIWJjYpCdkQGhUgnPvDy4lMtgq1DARK2+/7BajQIBKi0tUeQgxi13d0gbGpCRnY2Y2FgUFxeznmttbY1jx45hwIAB+n2ThJA2j0bsCCEPtGzZMlaoA26PWHXEUBcVFYX58+fr1Hfs2MEKdQDg6uqKqdOmQS6XIzk5Gcnx8bihUIBRqWBVWwsbmRymKhX4jAYaHh8NQiGqxPaosbAATyiEuaUlgoKCMG3atDuLHusaNWoUhTpCyAPRiB0hREdkZCQmTpzIqo0YMQLHjh0Dr4XXj7U1Fy9exLPPPgulUsmqf/nll026zk2tVkMmk6GkpAQlJSUoKy5GQ10d1CoVBEIhTM3N0cnZGRKJBBKJBGKxGAKBAL169UJKSspDjxsZGYkJEya0+P0RQtoXCnaEEBapVAp/f3+UlJRoazY2NkhJSYGHhweHnbW+69evIzQ0FOXl5az68uXLsW7dOoOe++DBg5gxYwYaGhrQr18/pKWlQaH4505ZiUSCtLQ0ODg4GLQPQkjbQnfFEkJYlixZwgp1ALB+/foOF+oKCwsxatQonVD34osv4ssvvzT4+SdPnozCwkJkZGTg4sWL+Oabb1iPl5SU4N///rfB+yCEtC00YkcI0dq7dy9mzJjBqo0fPx6RkZEdagq2oqICQ4YM0ZkKHTlyJP78809OtlBjGAYTJkzA4cOHWfV9+/Zh2rRprd4PIcQ4UbAjhAAAiouL4efnx7pg397eHmlpaR1qo/m6ujqMGjUKZ8+eZdWffvppnDp1ClZWVhx1BhQVFcHPzw9yuVxbc3BwQFpaGiQSCWd9EUKMB0
3FEkLAMAxefvllnbswN23a1KFCnVqtxosvvqgT6ry8vHD48GFOQx0AuLi44LvvvmPVysvL8fLLL4M+oxNCAAp2hBDcXqMtMjKSVZs6dSpeeOEFjjpqfQzDYMmSJfjtt99YdWdnZxw9ehROTk4cdcY2Y8YMPP/886zaH3/8gZ9++omjjgghxoSmYgnp4PLy8hAQEIDKykptzcnJCampqejUqROHnbWujz76CKtWrWLVrK2tcfbsWaPbk7WsrAx+fn4oKyvT1mxtbZGamgp3d3cOOyOEcI1G7AjpwBiGwcKFC1mhDgC2bNnSoULdli1bdEKdqakp/vjjD6MLdQDQqVMnna3eKisrsXDhQpqSJaSDo2BHSAf2ww8/4NixY6za7NmzMWnSJG4a4sDBgwfx6quvsmo8Hg+7d+/GsGHDOOrq8SZPnox//etfrNrRo0exdetWjjoihBgDmoolpIO6efMmevXqxVr01tXVFampqbC3t+ews9Zz9uxZjBw5EvX19az6t99+iyVLlnDUVdPJ5XL4+/ujsLBQW7OyskJycjK6du3KYWeEEK7QiB0hHZBGo8G8efNYoQ4Atm/f3mFCXUpKCsLCwnRC3X//+982EeqA28vRbNu2jVWrqanBvHnzoNFoOOqKEMIlCnaEdEDffPONzpIeL730ks6G9u1VTk4ORo8erXNt4cKFC7F69WqOumqeMWPGYOHChazamTNn8O2333LUESGESzQVS0gHc/36dQQFBaGurk5b69KlC5KTk2Ftbc1hZ61DKpVi0KBBuH79OqseFhaGAwcOQCgUctRZ81VVVaFXr17IycnR1iwsLJCYmIgePXqwnqtWqyGTyVBSUoKSkhKUFRejvrYWGrUafIEAZhYW6OTsDIlEAolEArFYDIFA0NpviRDSTBTsCOlAVCoVBg0ahIsXL7LqJ0+eNOobBfRFoVBg+PDhOu8/NDQUf//9NywsLDjqrOVOnTqFZ599llUbMGAAoqOjIRAIIJfLkZSUhJSEBNQpFGBUKljV1sJWJoOJSgU+w0DD46FRKESlWIwaCwvwhEKYW1oioE8fBAYGdphpekLaMgp2hHQgn3/+OVauXMmqLV26FBs2bOCoo9bT2NiISZMm4ciRI6y6r68vzp07B7FYzFFn+rN06VJs3LiRVfvss8/g1a0bsjMzYaJUwiM3Dy4yGWwVCpio1Q89VqNAgEpLSxSJxcj16IxGkQhdvbwQOnhwh9qNhJC2hoIdIR1ESkoKgoOD0djYqK15eXkhMTERIpGIw84Mj2EYzJs3DxEREax6586dERsb224W9VUoFAgKCkJWVhYEAgFCQkIwqG8/uKtV6JGXD3epFIJm3FSh5vOR7+iILE8P1Dg6om9oKEJDQ9vktDUh7R0FO0I6gIaGBgwYMABXrlzR1vh8Ps6dO4eQkBAOO2sd77zzDr744gtWzd7eHjExMfDx8eGoK8OIjY3F888/j/FjxsDN3h5e167BMzsbTg6O4LXw2BoeD5lubrjm5QWxuxvGhoXB2dlZL30TQvSDPm4R0gF8+umnrFAHAG+99VaHCHXr16/XCXUWFhY4dOhQuwt1AODm5obFCxbAuqQEPqdOQVRVBTWAmppqWFu17OYYPsPAOz8fLjIZ4qt88EtFJSa/MB2enp76aZ4Q0mI0YkdIOxcfH4/+/ftDfc/1VH5+frh8+TLMzc057Mzw9uzZg1mzZrFqAoEABw8exIQJEzjqynBycnJwYM8eiHNy8NTZc2Dq6+55lIdOjo4wMTHRy7lUfD4u+vlC5uGB52fOpHBHiJGgdewIacfq6uowZ84cVqgTCASIiIho96Hu77//Rnh4uE5969at7TLUFRcX4+DevRDn5GJg2lU4WFsDrMlXBvKKCujrk7xQo8HA1DSIc3NxcO8+FBcX6+nIhJCWoGBHSDv24Ycf4urVq6zaf//7XwQHB3PUUeuIj4/HlClTWDeKALfvEJ03bx5HXRmOSqXC4chIiAqL0P
/qVfAZBqYmJrC2srrveY2orq7S23n5DIP+aVdhUVSII5GRUKlUejs2IaR5KNgR0k7Fxsbiyy+/ZNV69+6N9957j6OOWkdWVhbGjBmDmpoaVv21117DO++8w1FXhhUTEwN5fgGC09MhvOeuVytra52p15oaBdR63G5MqNEg+Go6ZAUFiI2N1dtxCSHNQ8GOkHZIoVAgPDwc915Ca2pqil27duntGitjVFxcjFGjRqGsrIxVnz59OtavXw8er6X3hRqfwsJCxMXEoGdmJmyUStZjPAB2dva4f0pWdd9IZkvZKpXwzsjEpehoFBUV6fXYhJAnQ8GOkHZo5cqVyMrKYtVWr14Nf39/jjoyvKqqKowdOxY3b95k1Z999lns2rULfH77/HEXGx0NK6kUXgUFD3zcRCiEra0t7oY7gUAIUzMzvffRo6AAVlIpYqKj9X5sQkjT0XInhLQzJ0+e1Nl9YMCAAXjrrbc46sjw6uvrMXnyZJ0lXXr37o2DBw/CzABBxhjI5XJkZ2aid04u+I9Y4MBSJIKJiQnUajXMzMxavJ7dg/AZBt1ycpHo4AC5XE7bjxHCkfb5EZaQDqqqqgrz589n1SwsLBAREdFuN3LXaDSYM2cOTp48yap37doVR44cgY2NDUedGV5SUhJMlEq4S6WPfa6piQkszM3BN+B0dGepFEKlEsnJyQY7ByHk0SjYEdKOvPnmm8jJyWHV1qxZgx49enDUkWExDINly5Zh3759rHqnTp1w7Nixdr0rglqtRkpCAjxy85q1TZghCDQaeOblITk+nrXEDiGk9VCwI6SdiIqKwrZt21i1oUOH4rXXXuOoI8P7/PPPdaadraysEBUVhe7du3PUVeuQyWSoUyjgIpNx3QqLS/ntvmRG1hchHQXtPEFIOyCXy+Hv74/CwkJtzcrKCsnJyejatSuHnRnOzp07daadTUxMcPjwYTz33HMcdWV4QqEQ/v7+qKurQ1VFBaK8ekDUxNfm19UhuboaYzt1AgBcrKjAT0WF2OjjCwA4JpViY24u1GDAAzDLxQUvurgCABRqNQZevIAVXbtqaw+SXKvEG8XFMLOwgI2NDdatW4dnnnmmBe+YEPIk6OYJQtqBpUuXskIdAKxbt67dhrpDhw7hpZde0qlHRES061AHAHZ2dkhMTMSJEydw/ehRiM5faPJrC+rqECUt0wa7e12tqcFXt25hu78/Opubo06txpF7rt07WV4OX0srHCkre2SwswEPC8aMwaAZM+Dm5vbAO5UJIYZDU7GEtHEHDx7ETz/9xKqNGjXqgcGnPTh//jymT5+ucw3X119/jZkzZ3LUVesrKy6GrUyG3NpazExOwqQrCZiWlIisO2vZXVcoMOlKAsLu/CpvaMDXOTmIrahA2JUEHChhbwG2oyAfizt3Ruc7W82ZCwSYIpFoHz8iLcMyT09IGxtRUl//0L66WFigG3O7P29vb9TU1ND1doS0IhqxI6QNKysrw8svv8yq2draYtu2be1yMd709HSMHz8etbW1rPqKFSuwbNkybppqZRUVFQgKCoKsvBzeIhHWOTgiwj8Apnw+Eqqq8L9bt7DJ1xd7i4sw08UFLzi7oE6tBp/Hwxuenqyp14sVFdrjZimVWODm/sBz1qhUuKZQoJ+tLUY6OOJouRRzXN0e2qOpSoW6ujocPHgQwcHB7faObEKMEQU7QtoohmHwyiuv6Oyy8M0338Dd/cH/QLdl+fn5GDVqlM5F+eHh4VizZg1HXbW+u1OxO7dsgXVsLBoyMvBR1g1cVyjAB9Bw57Lp3tY2+DYvFxWNKozp5AgPc4tHHpcBHvph4O/ycjwjFoPP42GMoyNW37zxyGDHZzQoKirCts8/R1RUVHPfKiGkGSjYEdJG/fLLLzhw4ACrNnHiRMyePZujjgxHLpdj9OjRyMvLY9XHjh2LrVu3tsvRycfhCwTQ8Hj4saAQ7mbmWNfDG9LGRkxPSgQATHByQi9ra5ySyRCekoJv74zSPU
x3kQhXa2rQ09JS57EoqRSpNdU4fSdUlzY0oLi+Hs4PWfi5qqER327ejIhdu9r93cmEGBu6xo6QNqiwsBBLlixh1RwcHLBly5Z2F3Jqa2sRFhaGtLQ0Vr1///7Yt29fu9779lHMLCzQKBRCoVbBydQUPB4Pf5SWah/PrauFh7k55rq5IcTOHllKJSyFAigecr3bfDd3bMnPQ35dHQCgXqPBnqIiVKlUuKqowbl+/XGqbz+c6tsP893cEfWQRZEbNBqsPX0K48aOxbPPPqv/N04IeSQasSOkjWEYBosWLYJcLmfVv//+e0juudi9PVCpVJgxYwai79t/1NvbG4cOHYLlA0aXOopOzs64LhZjhrMLXruWjj/LShFiZ6d9/EiZFJFlpRDyeHAzM8NzDg4w4fGgYhiEXUlAuKsr3M3Mtc/3s7LCcs8ueOVqGlQMAyGPh1kurvi7XIpBdvYQ3POB4TkHB3x88wbmuelOx0ZJpbhRVgbVyZMICgoCAJw4cQIODg4G+14QQv5B69gR0sbs2LEDCxYsYNVeeOEF/PLLLxx1ZBh3A+z9iy67uroiNjYWnp6eHHVmHFJTU3Hk118x/sxZmBjRXaeNAgEODR2CsdOmwd/fn+t2COlwaCqWkDYkJydH5+5PiUSC7777jpuGDOjDDz/UCXW2trb466+/OnyoA27/vvOEQlQa2ahlpaUleEJhuxs9JqStoKlYQtoIjUaDBQsWoLq6mlXfunVru5vm2rRpEz7++GNWzczMDJGRkQgICOCoK+MiFothbmmJIrEYjlVVrX5+eWMjwlNTWDULPh//nTQJ5paWEIvFrd4TIYSCHSFtxubNm3HixAlWbe7cuZgwYQJHHRnG/v378e9//5tV4/P52LNnD4YMGcJRV8ZHIBAgoE8fJJaXwzc3FwKNplXPb29igsjefVg1NZ+PqM6d0YfWriOEMzQVS0gbkJWVhbfffptVc3d3x/r167lpyEBOnz6NF198Efdf+rtp0yZMnjyZo66MV2BgIBpFIuQ7OnLdCgAgz9ERKpEIvXr14roVQjosCnaEGDm1Wo158+ZBeWerqLu2b98OW1tbjrrSv6SkJEycOBENDQ2s+qpVq3R21yC32dvbo6uXF7I8PaDheJkbDY+HG54e6NqjB+zt7TnthZCOjIIdIUZu/fr1Ost9LF68GCNHjuSoI/3Lzs7G6NGjUXXftWKLFy/GBx98wFFXbUPo4MGocXRE5gOWHmlNGW5uqHF0ROigQZz2QUhHR8udEGLE0tPT0bt3b9Tfs+l6165dkZycDCsrKw4705+ysjKEhoYiMzOTVZ8yZQr27dtH12o1wZkzZxB34iSGXbwIm/tGdltDpUiE0wP6o9/w4XQdJCEcoxE7QoyUSqVCeHg4K9TxeDz8+OOP7SbU1dTUYNy4cTqhbsiQIdi9ezeFuiYKDQ2Fvbsb4n18oOLr/8e6Sq1GcUkJCouKUFpWBtU96+ap+HzE+/pA7OaGkJAQvZ+bEPJkKNgRYqS++OILxMXFsWrLli1rNyMiDQ0NmDp1qs57DAgIwB9//AFzc/OHvJLcTygUYlxYGJSurrjo56v36+0q5HJoNGoADFSqRpSWlqKquhoqABf9fFHr4oqxYWEQCmmhBUK4RlOxhBihpKQk9O3bF42Njdqat7c3rly5AgsLCw470w+NRoPw8HD89NNPrLqnpydiY2Ph6urKUWdtW05ODg7s2QNxbi76p12FUE9LoJSWlkKlVrFqaoEA1wYMgLxLF8wMD8dTTz2ll3MRQlqGRuwIMTINDQ2YM2cOK9Tx+XxERES0i1AHACtWrNAJdQ4ODjh69CiFuhbw9PTE8zNnoqJLV5zr3RtVIpFejmtiasr6WmFjg8QhQ3HTzg5bdu7E5MmTkZOTo5dzEUJahoIdIUZm9erVSE5OZtVWrFiB/v37c9SRfq1btw5fffUVqyYSiXD48GF4e3tz1FX74enpiRlzZkPg64NT/fvjurt7i6dmTe
8EOw2Ph3xvb1waNgzpqkb83y+/IC8vD8nJyViyZIk+2ieEtBBNxRJiRC5duoSQkBCo77k4PSAgAHFxcTAzM+OwM/346aefMHv2bFZNKBQiMjISY8aM4air9kmlUiEmJgZxMTGwkkrRLScXnaXSZu1QoWxsxHVLEfK6d4fUygoxcXGIjY1l/Tnt27cvLl26pM+3QAhpBgp2hBiJ2tpa9OnTB9euXdPWhEIh4uLiEBQUxF1jenL06FGMHz8eKhX7Wq2IiAjMmTOHo67av8LCQsTGxCA7IwNCpRKeeXlwKZfBVqGAyT3B7H6NAgEqLS1R5CDGLXd3SBsakJGdjZjYWBQXF7Oey+fzceDAAUyaNMnA74YQ8jh0CxMhRuL9999nhToA+OCDD9pFqIuLi8Pzzz+vE+rWrl1Loc7AXF1dMXXaNMjlciQnJyM5Ph43FAowKhWsamthI5PDVKUCn9FAw+OjQShEldgeNRYW4AmFMLe0RFBQEKZOnQq5XP7Ac8ycOZNCHSFGgkbsCDEC586dw9ChQ1l7pAYHB+P8+fMwMTHhsLOWy8jIQGhoKKRSKav+xhtvYN26deBxvBVWR6NWqyGTyVBSUoKSkhKUFRejoa4OapUKAqEQpubm6OTsDIlEAolEArFYDIFAAG9vb2RkZDzwmDweD2fPnsUg2nWCEM5RsCOEYzU1NQgMDMTNmze1NTMzM8THx8PPz4/DzlquqKgIISEhuHXrFqs+a9Ys/N///R/4BlhMlxjG1q1bsWjRIgCAu7s7iouLWSOw3bp1Q1JSEiwtLblqkRACuiuWEM6tWLGCFeoA4OOPP27zoa6yshJjxozRCXUjR47Ezp07KdS1MS+99BKSk5Nx9OhRZGVl4bPPPmM9fuPGDaxYsYKj7gghd9GIHSEcOn78OJ577jlWLSQkBGfPnm3T22nV1dVhzJgxOH36NKseHByMU6dOwdrampvGiN6o1WoMGTIEsbGxrPrx48cxfPhwjroihFCwI4QjlZWVCAgIQF5enrZmYWGBpKQkeHl5cdhZy6jVasyYMQP79+9n1bt3746YmBg4OTlx1BnRt8zMTAQGBqK2tlZb8/DwQHJyMmxtbTnsjJCOi+ZCCOHI8uXLWaEOuH2XaFsOdQzDYOnSpTqhTiKR4OjRoxTq2hkvLy988cUXrFpubi6WL1/OUUeEEBqxI4QDhw4dwoQJE1i1YcOG4fjx42362rNPPvkE77//PqtmbW2NM2fOoHfv3hx1RQxJo9FgxIgROHXqFKt+6NAhjBs3jqOuCOm4KNgR0srKy8vh7+/PWuTV2toaycnJ6NKlC3eNtdC9d03eZWpqiqioKDz77LMcdUVaw61btxAQEICamhptzdnZGWlpaRCLxRx2RkjH03aHBghpo1577TWdlfv/97//telQ98cff2Dx4sWsGo/Hw08//UShrgPo0qULvv76a1atuLgYr732GkcdEdJx0YgdIa1o//79mDZtGqs2ZswYHD58uM0u1BsdHY3nnnsOdXV1rPrGjRvx73//m6OuSGtjGAbjxo1DVFQUq75//348//zzHHVFSMdDwY6QVlJaWgo/Pz/WDgx2dnZIS0uDq6srh501X2pqKgYPHoyKigpW/b333sMnn3zCTVOEMwUFBfD392f9eXB0dERaWhrdOENIK6GpWEJaAcMwePnll3W21fr222/bbKjLzc3F6NGjdULdggUL8PHHH3PTFOGUm5sbNm7cyKpJpVIsXrwYNIZASOugETtCWsFPP/2E2bNns2pTpkzB/v372+QUbHl5OQYPHoz09HRWfcKECfjtt98gFAo56oxwjWEYPP/88zh48CCr/tNPP+HFF1/kqCtCOg4KdoQYWEFBAfz8/FBZWamtteXpKaVSieHDh+PChQusekhICP7++2+IRCKOOiPG4mGXHaSmpsLNzY31XLVaDZlMhpKSEpSUlKCsuBj1tbXQqNXgCwQws7BAJ2dnSCQSSCQSiMXiNr0rCyGGRs
GOEANiGAZjx47FX3/9xaofOHAAU6ZM4air5lOpVJg8eTIOHTrEqvv6+uLcuXO0tAXRetyNQnK5HElJSUhJSECdQgFGpYJVbS1sZTKYqFTgMww0PB4ahUJUisWosbAATyiEuaUlAvr0QWBgIOzt7Tl6d4QYLwp2hBjQg9Z2mzVrFnbv3s1RR83HMAwWLFiAnTt3suru7u6IjY1F586dOeqMGKtZs2Zhz549rNp3330HJ0dHZGdmwkSphEduHlxkMtgqFDBRqx96rEaBAJWWligSi5Hr0RmNIhG6enkhdPBguLi4GPqtENJmULAjxEAetGiri4sLUlNT2+TI1rvvvos1a9awavb29jh37hz8/Pw46ooYM5lMBj8/PxQXF0MgECAkJASD+vWDu1qNHrl5cJdKIdBonvi4aj4f+Y6OyPL0QI2jI/qGhiI0NJSu7SQEFOwIMQiNRoPhw4fj9OnTrPrhw4cxduxYbppqgY0bN2Lp0qWsmrm5OU6cOIGQkBCOuiJtweHDhzF//nyEjRsHN3t7eF27hi63ctBJLEZLbxvS8HjIdHPDNS8viN3dMDYsDM7Oznrpm5C2ioIdIQbwoCC0YMECbNu2jaOOmm/v3r2YOXMma7kKPp+PgwcPIiwsjMPOSFuQk5ODHVu2wLasDD7x8RBVVQEAbG1sYWlpqZdzVIlEiPfxgdLVFZNfmA5PT0+9HJeQtoiCHSF6lpGRgaCgINTW1mprHh4eSElJgY2NDYedPbkTJ05gzJgxaGxsZNW3bduGBQsWcNQVaStycnJwYM8eiHNy0PXMGaChQfsYDzx0cnKCUE93uKr4fFz084XMwwPPz5xJ4Y50WLRAMSF6pFarMXfuXFaoA4AdO3a0uVB35coVTJ48WSfUffLJJxTqyGMVFxfj4N69EOfkYmDaVThYW7MeZ8CgQi6HvkYWhBoNBqamQZybi4N79+nsx0xIR0HBjhA9WrduHc6fP8+qLVmyBMOHD+eoo+a5ceMGxowZg+rqalZ9yZIlePfddznqirQVKpUKhyMjISosQv+rV8FnGJiZmsFSxJ56bWhsgEKh0Nt5+QyD/mlXYVFUiCORkVCpVHo7NiFtBU3FEqInaWlp6NOnDxrumW7q1q0bkpKS9HYtUWsoKSlBaGgobty4wapPmzYNe/bsocVhyWOdOXMGcSdOYtjFi7BRKrV1hmFQWlYGtfrewMWDs7MEfJ7+xhkqRSKcHtAf/YYPx5AhQ/R2XELaAhqxI0QPGhsbMWfOHFao4/F4iIiIaFOhrrq6GmPHjtUJdcOGDcP//d//Uagjj1VYWIi4mBj0zMxkhTrg9t8Jezs7gHU/LIOGBvZ0f0vZKpXwzsjEpehoFBUV6fXYhBg7CnaE6MGaNWuQkJDAqr355psIDQ3lqKMn19DQgClTpui8j8DAQBw8eBBmZmYcdUbaktjoaFhJpfAqKHjg46amprCystJ+zefxYWpqqvc+ehQUwEoqRUx0tN6PTYgxo9UcCWmhhIQEfPzxx6yaj4+PTs2YaTQazJ07F8ePH2fVu3btiqioKNja2nLUGWlL5HI5sjMz0TsnF/xHXOVjY20NExMTqFUqWIgswOe1dEU7XXyGQbecXCQ6OEAul9P2Y6TDoBE7Qlqgvr4e4eHhrIu0BQIBIiIiYG5uzmFnTccwDN58802drZ86deqEo0eP0nZNpMmSkpJgolTCXSp97HMtzM1hZWUFAd9w0/udpVIIlUokJycb7ByEGBsKdoS0wKpVq5CamsqqrVy5En379uWooyf35ZdfYv369ayapaUljhw5Ai8vL26aIm2OWq1GSkICPHLzmrVNmCEINBp45uUhOT4e6kfsQ0tIe0LBjpBmunDhAtauXcuqBQYG4v333+eooycXERGBFStWsGpCoRC//fYbnn76aY66Im2RTCZDnUIBF5mM61ZYXMpv9yUzsr4IMRQKdoQ0g1KpRHh4ODT3jEyYmJhg165dBrkQ3BCOHDnywIWGf/
zxR4wcOZKDjogxEAqFCAoKgp+fHyZMmICKigoAwK1btyASiRAUFITAwEAMGTIEubm5AG7/menZsye+/u47zDtzGl/eygYA7C0uwoSE+Du/EpB4ZzsxQ7hYUYHX0q/q1G0VCjAqFUpKSpp0nEOHDsHf3x98Pl9nNJ6QtoCCHSHN8N577yEjI4NVW7VqFXr16sVRR0/mwoULmDZtms701Lp16/Diiy9y1BUxBnZ2dkhMTERaWhrs7Ozw3XffaR/z9fVFYmIikpKSMHHiRNYU/vDhw/HfOXPwZ+8+eLtLVxTX1yOisBD7AoPwZ59g7AoIgAsHd1abqNWwqq3VCXYPm5r19vbG/v37af070mbRXbGEPKEzZ87oXJPWr18//Oc//+GmoSd07do1jBs3Dsr71hh7++23sXz5co66IsYoNDQUSUlJD3ysurqadbe0UqGA7T3TneWNjbDg82HGvz1+YG9ion1sQ04OzsplqNVoMFzsgDe7dAEADIu7hDAnJ8TIKyDk8fDfbk9hbXY28uvq8U7Xrhjp6IjfSkpwQlYOhVqNovp6hLu6YpaLK6s3hVqNVVlZuFF7+8/4ZDtbiAMDsWrVKhQXFyMrKwu+vr745ptvdN4XXVdK2joKdoQ8gerqasybN49VMzc3R0REBIRC4//rVFBQgFGjRulcbzR79mx8/vnnHHVFjJFarcbff/+N+fPna2tXr15FUFAQKioqwDAMLl++rH3s7LlzSBUKYdHQgOWeXTDY3h6WAgGGX45DqJ09xnRyRKjd7SVHwl1d8bqnJzQMg5fS0pBeUwOfO2vbeZib442gILyXmYlPb95EhH8A8urqsOzaNYx0dAQApNbU4FDvPgCASYlX8KzYgdX7prxcPOfggC8dvVFcX49/xcTg40mTAAApKSk4depUm7lkgpAnZfz/EhFiRN5++21kZ2ezap9++il69uzJUUdNJ5fLMXr0aO11UXeNHj0a27dvB59PV2YQoKKiAkFBQcjPz4efnx9GjRqlfczX11cb5r766iusXLkS27ZtAwCE9O+Pha6uCLz5z9+PCP8AJFRXIUZegbeuX8dyzy6Y5uyM85UV2Jqfj0aNBmWNjbhRq9QGu7shzdtSBHsTIUz5fHQTiVDaUK897mA7e1jf+SA10NYOKTXVsBH8889ZrLwCZ2UyfJt3+896DcOgvq4OADBx4kQKdaRdo2BHSBMdPXoUW7ZsYdUGDx6M119/naOOmq62thYTJ07UuRi8X79++PXXX2FyzzQZ6djuXmOnVCrx3HPPYdOmTVi6dKnO88aPH48dO3Zov+bx+dDct9Awj8dDsI0tgm1s4WUpwsGSUoQ5OeHTmzdxIDAIEjMzfJCViQbNP4sZm975gMEDD6b37B/LsI7L7oUHdoEBgy2+fnC9s5bklW7dUHfn/0UiUdO/GYS0QfQRnZAmqKio0LmDVCQSYefOnUa/f6parcasWbNw7tw5Vr1Hjx44fPgwa3snQu4SiUTYsGED1q1bx1qA+67Y2Fg89dRT2q8FQiEa77kcoaS+HldrarRfZyiUcDU3Q71GAx5uX3NX0diIM3L5E/d2Ti5HjUqFGpUKFyor4H/fn+EQO3vsvmeP2MzKSpi2kQXDCWkpGrEjpAmWLVuGgvv2vvzqq6/QrVs3jjpqGoZh8Oqrr+L3339n1V1cXHD06FE43rlmiZAHefrppxEQEIADBw6gf//+2mvsGIaBtbW1dhoWAESWlqgUi7VfqxgGn928CWljA0x4PLiZm+MzLy/YCIUIc3LC+IQEdDY3R5C19RP31cfGBq9fu4a8ujrMd3ODs5kZcmprtY8v8fDAxzduYHxCPNQMA/eePTHA2RnXMjMfe+yjR49iwYIFKCsrw4gRIzBs2DCdXVkIMWY8hnnEhn6EEERGRmLixIms2ogRI3Ds2DHwDLDHpT6tWrUKH330EatmY2ODc+fOtZmlWUjbkJqaiiO//orxZ87CxIC7PPxWUoIMpQLvdH3q8U
8G0CgQ4NDQIRg7bRr8/f0N1hchxoKmYgl5BKlUikWLFrFqNjY22L59u9GHus2bN+uEOjMzM0RGRlKoI3onkUjAEwpRaWnJdSsslZaW4AmFkEgkXLdCSKugqVhCHmHJkiU6C5uuX78eHh4eHHXUNL/99hteffVVVo3H42H37t0YOnQoR12R9kwsFsPc0hJFYjEcDbjDxJQnDGhFDrf7Et8zTQwAO3fuxIYNG1i1GTNm4J133mlxj4RwiaZiCXmIvXv3YsaMGaza+PHjERkZadSjdWfOnMGoUaNQX1/Pqn///fdYvHgxR12RjuD06dNI/PtvjI6OgeCe7fa4oubzETUoFH1GjqQPNKTDoKlYQh6guLhYZ8TL3t4eP/zwg1GHuuTkZEycOFEn1H3wwQcU6ojBBQYGolEkQr6R3JST5+gIlUhElx6QDoWCHSH3YRgGL7/8ss7uDJs2bYKLiwtHXT3erVu3MHr0aFRWVrLqixYtwqpVq7hpinQo9vb26OrlhSxPD5017VqbhsfDDU8PdO3RA/b29pz2QkhromBHyH127dqFyMhIVm3q1Kl44YUXOOro8aRSKUaNGoWie9buAoBJkybhu+++M+pRRtK+hA4ejBpHR2S6uXHaR4abG2ocHRE6aBCnfRDS2ijYEXKPvLw8nZ0knJycsGnTJqMNRwqFAuPGjUNGRgarPnjwYPz8889tYg9b0n64uLigb2gornl5oYqjXR4qRSJc7+GFfoMGGfUoOyGGQMGOkDsYhsHChQt1pjK3bNmCTp06cdTVozU2NmLatGm4dOkSq+7v748//vgDFhYWHHVGOrLQ0FDYu7sh3scHKgPsQaxhGJRJpSgqLoZMJoPmnnsAVXw+4n19IHZzQ0hIiN7PTYixo2BHyB0//PADjh07xqrNnj0bkyZN4qahx7gbRKOiolh1Dw8P/PXXX3RdEeGMUCjEuLAwKF1dcdHPV+/X28nlcjQ2NoBhNKirr0NJSQkUSiXUPB4u+vmi1sUVY8PCaLSadEi03AkhAG7evIlevXpBoVBoa66urkhNTTXagLRixQqsXbuWVROLxYiJiUHPnj056oqQf+Tk5ODAnj0Q5+aif9pVCPW0BEpJaSnUavb+tWqBANcHDkTFU09hVng4PD099XIuQtoaGrEjHZ5Go8G8efNYoQ4Atm/fbrSh7uuvv9YJdRYWFjh8+DCFOmI0PD098fzMmajo0hXnevfW2zV3piYmrK8VNjZIHDIUN2xt8f22bXjllVd07monpKOgETvS4a1fvx5vvPEGq/bSSy/hhx9+4KijR/v555/x4osvsmoCgQB//PEHxo0bx1FXhDxccXExDkdGQp5fgJ6ZmfAqKAC/Bf/0KJRKVFZWQMPjobBHD2T27IkCmQyRR46gtLQUADB9+nTs3btXX2+BkDaDgh3p0K5fv46goCDU1dVpa126dEFycjKsra057OzBjh07hvHjx6OxsZFV37lzJ+bOnctNU4Q0gUqlQkxMDOJiYmAllaJbTi46S6XN2qFC0diIDEsR8rp3h9TKCjFxcYiNjYVardY+JzAwEImJiXp8B4S0DXRlKemwVCoVwsPDWaEOAHbs2GGUoe7y5cuYMmWKTqhbs2YNhTpi9IRCIYYOHQovLy/ExsQg0cEBqUolPPPy4FIug61CAZN7gtn9GgUCVFpaoshBjGx3d5Q3NCAjOxsxkZEoLi7WeT7ttEI6KhqxIx3W559/jpUrV7JqS5cu1dkY3BhkZmYiNDQUZWVlrPrrr7+Or7/+2mjX2CPkYeRyOZKTk5EcH486hQKMSgWr2lrYyOQwVanAZzTQ8PhoEApRJbZHjYUFeEIhzC0t4RsYiKlTp6KiouKBx541axZ2797dum+IECNBwY50SCkpKQgODmaNfnl5eSExMREijhZVfZji4mKEhIQgOzubVZ8xYwZ2794NvgHWCSOktajVashkMpSUlKCkpARlxcVoqKuDWqWCQCiEqbk5Ojk7QyKRQC
KRQCwWQyAQoGvXrrh169YDj2lqaor4+Hj4+/u37pshxAhQsCMdTkNDAwYMGIArV65oa3w+H9HR0Rg4cCCHnemqqqrC0KFDda4VGj58OA4fPgwzMzNuGiOEY5999hnee+89AICVlRUUCgXu/eesd+/euHjxIkzuu4OWkPaOPuqTDufTTz9lhToAePvtt40u1NXX12Py5Mk6oa5Pnz747bffKNSRDm3lypU4dOgQtmzZghs3buDtt99mPX7lyhV8+umnHHVHCHdoxI50KJcvX8aAAQNYd8/5+fkhPj7eqIKSWq3GzJkz8euvv7Lq3bp1Q0xMDCQSCUedEWKc6urqEBwcjKtXr2prQqEQFy5cQHBwMIedEdK6aMSOdBh1dXUIDw9nhTqhUIiIiAijCnUMw2DZsmU6oc7JyQlHjx6lUEfIA5ibm2PXrl0QCATa2t073+vr6znsjJDWRcGOdBgffPAB69M8ALz33ntG92l+zZo1+Pbbb1k1KysrREVFoVu3bhx1RYjxCw4O1l53d1daWho+/PBDjjoipPXRVCzpEGJjYzFo0CCjv7h6+/btWLhwIatmYmKCI0eOYMSIERx1RUjb0ZZujiLEECjYkXZPoVAgKCgIWVlZ2pqpqSkuX76MgIAADjtj+/PPPzFp0iRo7luJf8+ePZgxYwZHXRHS9rSl5YwI0TeaiiXt3sqVK1mhDgBWr15tVKEuNjYW06dP1wl1GzZsoFBHyBMKCAjA6tWrWbXMzEydBckJaY9oxI60aydPnsTw4cNZtQEDBiA6Opp1kTWX0tLSMHjwYMjlclb9nXfewZo1azjqipC2TaVSYdCgQbh48SKrfvLkSQwbNoyjrggxPAp2pN2qqqpCr169kJOTo61ZWFggMTERPXr04LCzf+Tl5SEkJAT5+fms+ty5c7Fjxw7aKoyQFrh+/TqCgoJY+0F36dIFycnJRrkfNCH6QFOxpN168803WaEOuH3HqbGEOplMhtGjR+uEunHjxuGHH36gUEdIC3l7e+uMet+6dQtvvfUWRx0RYng0YkfapaioKIwdO5ZVGzp0KE6ePGkUe6sqlUo899xziI2NZdUHDBiAEydO0AXehOiJRqPBs88+izNnzrDqUVFRGD16NEddEWI43P8LR4ieyeVynSVDrKyssHPnTs5C3f79+9G3b1+MHz8eqampmDFjhk6o69mzJw4dOkShjhA94vP52LFjBywtLVn1hQsX6lzXSkh7QCN2pN2ZPXs2fvrpJ1Zty5YtWLRoUbOOp1arIZPJUFJSgpKSEpQVF6O+thYatRp8gQBmFhbo5OwMiUQCiUQCsVjMujEjNzcXXl5eaGhoAHB7qZW7/3+Xm5sbYmNj4eHh0aweCSGPtmXLFixevJhVmz17Nnbt2sVRR4QYBgU70q4cPHgQU6ZMYdVGjRqFqKioJ75mTS6XIykpCSkJCahTKMCoVLCqrYWtTAYTlQp8hoGGx0OjUIhKsRg1FhbgCYUwt7REQJ8+CAwMhL29PTZu3IilS5c+9Dx2dnY4d+4c/P39m/WeCSGPxzAMRo8ejWPHjrHqBw8exKRJk7hpihADoGBH2o2ysjL4+fmhrKxMW7O1tUVqairc3d2bfJzCwkLERkcjOzMTJkolPHLz4CKTwVahgMk9+8zer1EgQKWlJYrEYuR6dEajSISuXl7Y++uv2L9//wNfIxQKcfLkSQwePLjpb5QQ0ix5eXkICAhAZWWltubk5IS0tDQ4Ojpy2Bkh+kPX2JF2gWEYvPLKK6xQBwDffPNNk0OdSqXCmTNn8POPP0J64QJ6J1zB6OgYBNy6BceqqkeGOgAwUavhWFWFgFu3MDo6Br0TrkB64QK8PD0xePDgB66bp1KpcO3ataa/UUJIs3Xu3BkbNmxg1UpLS/HKK6+AxjhIe0EjdqRd2LNnD2bNmsWqTZw4EQcPHmzSFGxxcTEOR0ZCnl+AnpmZ8CooAF8PfzVqGxuQ4uCAzJ49USCTIfLIEZSWlr
[base64-encoded PNG image data omitted]", - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "image/png": "", - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "image/png": "", - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "image/png": "", - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "image/png": "", - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "image/png": "", - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "image/png": "", - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "image/png": "", - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], + "outputs": [], "source": [ "for i in range(0,50):\n", " ind.mutate()\n", @@ -13119,20 +1000,9 @@ }, { "cell_type": "code", - "execution_count": 39, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "image/png": "", - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], + "outputs": [], "source": [ "tree_search_space = tpot2.search_spaces.pipelines.TreePipeline(\n", " root_search_space= tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"]),\n", @@ -13166,445 +1036,9 @@ }, { "cell_type": "code", - "execution_count": 46, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
FeatureUnion(transformer_list=[('featureunion',\n",
-       "                                FeatureUnion(transformer_list=[('powertransformer',\n",
-       "                                                                PowerTransformer()),\n",
-       "                                                               ('passkbinsdiscretizer',\n",
-       "                                                                PassKBinsDiscretizer(n_bins=11,\n",
-       "                                                                                     strategy='uniform'))])),\n",
-       "                               ('passthrough', Passthrough())])
" - ], - "text/plain": [ - "FeatureUnion(transformer_list=[('featureunion',\n", - " FeatureUnion(transformer_list=[('powertransformer',\n", - " PowerTransformer()),\n", - " ('passkbinsdiscretizer',\n", - " PassKBinsDiscretizer(n_bins=11,\n", - " strategy='uniform'))])),\n", - " ('passthrough', Passthrough())])" - ] - }, - "execution_count": 46, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "from tpot2.search_spaces.pipelines import *\n", "from tpot2.config import get_search_space\n", @@ -13628,430 +1062,9 @@ }, { "cell_type": "code", - "execution_count": 47, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
FeatureUnion(transformer_list=[('skiptransformer', SkipTransformer()),\n",
-       "                               ('passthrough', Passthrough())])
" - ], - "text/plain": [ - "FeatureUnion(transformer_list=[('skiptransformer', SkipTransformer()),\n", - " ('passthrough', Passthrough())])" - ] - }, - "execution_count": 47, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "final_transformers_layer =UnionPipeline([\n", " ChoicePipeline([\n", @@ -14067,482 +1080,9 @@ }, { "cell_type": "code", - "execution_count": 52, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
FeatureUnion(transformer_list=[('featureunion',\n",
-       "                                FeatureUnion(transformer_list=[('estimatortransformer-1',\n",
-       "                                                                EstimatorTransformer(cross_val_predict_cv=10,\n",
-       "                                                                                     estimator=RandomForestClassifier(criterion='entropy',\n",
-       "                                                                                                                      max_features=0.0291036830622,\n",
-       "                                                                                                                      min_samples_leaf=10,\n",
-       "                                                                                                                      min_samples_split=20,\n",
-       "                                                                                                                      n_estimators=128),\n",
-       "                                                                                     method='predict')),\n",
-       "                                                               ('estimatortransformer-2',\n",
-       "                                                                EstimatorTransformer(cross_val_predict_cv=10,\n",
-       "                                                                                     estimator=QuadraticDiscriminantAnalysis(reg_param=0.6791389504331),\n",
-       "                                                                                     method='predict')),\n",
-       "                                                               ('estimatortransformer-3',\n",
-       "                                                                EstimatorTransformer(cross_val_predict_cv=10,\n",
-       "                                                                                     estimator=QuadraticDiscriminantAnalysis(reg_param=0.8087868529112),\n",
-       "                                                                                     method='predict'))])),\n",
-       "                               ('passthrough', Passthrough())])
" - ], - "text/plain": [ - "FeatureUnion(transformer_list=[('featureunion',\n", - " FeatureUnion(transformer_list=[('estimatortransformer-1',\n", - " EstimatorTransformer(cross_val_predict_cv=10,\n", - " estimator=RandomForestClassifier(criterion='entropy',\n", - " max_features=0.0291036830622,\n", - " min_samples_leaf=10,\n", - " min_samples_split=20,\n", - " n_estimators=128),\n", - " method='predict')),\n", - " ('estimatortransformer-2',\n", - " EstimatorTransformer(cross_val_predict_cv=10,\n", - " estimator=QuadraticDiscriminantAnalysis(reg_param=0.6791389504331),\n", - " method='predict')),\n", - " ('estimatortransformer-3',\n", - " EstimatorTransformer(cross_val_predict_cv=10,\n", - " estimator=QuadraticDiscriminantAnalysis(reg_param=0.8087868529112),\n", - " method='predict'))])),\n", - " ('passthrough', Passthrough())])" - ] - }, - "execution_count": 52, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "inner_estimators_layer = UnionPipeline([\n", " ChoicePipeline([\n", @@ -14557,482 +1097,9 @@ }, { "cell_type": "code", - "execution_count": 53, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
Pipeline(steps=[('robustscaler',\n",
-       "                 RobustScaler(quantile_range=(0.1562687943568,\n",
-       "                                              0.8028910581685))),\n",
-       "                ('featureunion-1',\n",
-       "                 FeatureUnion(transformer_list=[('skiptransformer',\n",
-       "                                                 SkipTransformer()),\n",
-       "                                                ('passthrough',\n",
-       "                                                 Passthrough())])),\n",
-       "                ('featureunion-2',\n",
-       "                 FeatureUnion(transformer_list=[('skiptransformer',\n",
-       "                                                 SkipTransformer()),\n",
-       "                                                ('passthrough',\n",
-       "                                                 Passthrough())])),\n",
-       "                ('baggingclassifier',\n",
-       "                 BaggingClassifier(bootstrap_features=True,\n",
-       "                                   max_features=0.1392808422872,\n",
-       "                                   max_samples=0.5344888038724, n_estimators=3,\n",
-       "                                   n_jobs=1, oob_score=True))])
" - ], - "text/plain": [ - "Pipeline(steps=[('robustscaler',\n", - " RobustScaler(quantile_range=(0.1562687943568,\n", - " 0.8028910581685))),\n", - " ('featureunion-1',\n", - " FeatureUnion(transformer_list=[('skiptransformer',\n", - " SkipTransformer()),\n", - " ('passthrough',\n", - " Passthrough())])),\n", - " ('featureunion-2',\n", - " FeatureUnion(transformer_list=[('skiptransformer',\n", - " SkipTransformer()),\n", - " ('passthrough',\n", - " Passthrough())])),\n", - " ('baggingclassifier',\n", - " BaggingClassifier(bootstrap_features=True,\n", - " max_features=0.1392808422872,\n", - " max_samples=0.5344888038724, n_estimators=3,\n", - " n_jobs=1, oob_score=True))])" - ] - }, - "execution_count": 53, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "final_linear_pipeline = SequentialPipeline([\n", " get_search_space(\"scalers\"),\n", @@ -15055,7 +1122,7 @@ }, { "cell_type": "code", - "execution_count": 55, + "execution_count": 52, "metadata": {}, "outputs": [], "source": [ @@ -15075,464 +1142,9 @@ }, { "cell_type": "code", - "execution_count": 62, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/distributed/node.py:182: UserWarning: Port 8787 is already in use.\n", - "Perhaps you already have a cluster running?\n", - "Hosting the HTTP server on port 40273 instead\n", - " warnings.warn(\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 60%|██████ | 3/5 [00:44<00:29, 14.98s/it]\n", - "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/preprocessing/_data.py:2785: UserWarning: n_quantiles (688) is greater than the total number of samples (284). n_quantiles is set to n_samples.\n", - " warnings.warn(\n" - ] - }, - { - "data": { - "text/html": [ - "
TPOTEstimator(classification=True, cv=5, early_stop=2, generations=5,\n",
-       "              max_eval_time_mins=10, n_jobs=4,\n",
-       "              scorers=['roc_auc_ovr',\n",
-       "                       <function complexity_scorer at 0x7e7bacf5b9a0>],\n",
-       "              scorers_weights=[1.0, -1.0],\n",
-       "              search_space=<tpot2.search_spaces.pipelines.sequential.SequentialPipeline object at 0x7e7ba3b078b0>,\n",
-       "              verbose=2)
" - ], - "text/plain": [ - "TPOTEstimator(classification=True, cv=5, early_stop=2, generations=5,\n", - " max_eval_time_mins=10, n_jobs=4,\n", - " scorers=['roc_auc_ovr',\n", - " ],\n", - " scorers_weights=[1.0, -1.0],\n", - " search_space=,\n", - " verbose=2)" - ] - }, - "execution_count": 62, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "selected_search_space = all_search_spaces[\"stc_pipeline\"] #change this to select a different search space\n", "\n", @@ -15543,7 +1155,6 @@ " classification = True,\n", " cv = 5,\n", " search_space = selected_search_space,\n", - " population_size= 50,\n", " generations = 5,\n", " max_eval_time_mins = 10,\n", " early_stop = 2,\n", @@ -15556,17 +1167,9 @@ }, { "cell_type": "code", - "execution_count": 68, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "auroc score 0.9845976642022929\n" - ] - } - ], + "outputs": [], "source": [ "# score the model\n", "auroc_scorer = sklearn.metrics.get_scorer(\"roc_auc\")\n", @@ -15577,449 +1180,9 @@ }, { "cell_type": "code", - "execution_count": 69, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
Pipeline(steps=[('variancethreshold',\n",
-       "                 VarianceThreshold(threshold=0.0003237844275)),\n",
-       "                ('quantiletransformer', QuantileTransformer(n_quantiles=688)),\n",
-       "                ('baggingclassifier',\n",
-       "                 BaggingClassifier(bootstrap_features=True,\n",
-       "                                   max_features=0.2631592196919,\n",
-       "                                   max_samples=0.488886320861, n_estimators=72,\n",
-       "                                   n_jobs=1))])
" - ], - "text/plain": [ - "Pipeline(steps=[('variancethreshold',\n", - " VarianceThreshold(threshold=0.0003237844275)),\n", - " ('quantiletransformer', QuantileTransformer(n_quantiles=688)),\n", - " ('baggingclassifier',\n", - " BaggingClassifier(bootstrap_features=True,\n", - " max_features=0.2631592196919,\n", - " max_samples=0.488886320861, n_estimators=72,\n", - " n_jobs=1))])" - ] - }, - "execution_count": 69, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "#plot the best pipeline\n", "if isinstance(est.fitted_pipeline_, tpot2.GraphPipeline):\n", @@ -16047,6 +1210,134 @@ "Rather than create your own search space, you can simply pass the string into the `search_space` param. Alternatively, you can access tpot2.config.template_search_spaces.get_template_search_spaces directly which offers a few more customizable options for each template including `cross_val_predict_cv` and whether or not stacked classifiers/regressors are allowed. Or you can copy the code and customize it manually!" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Transformer-only pipelines - imputation optimization example\n", + "\n", + "Pipelines don't necessarily need to end in a classifier or regressor. Transformer only pipelines are possible as long as you have a custom objective function to match. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import sklearn\n", + "import sklearn.datasets\n", + "import numpy as np\n", + "import tpot2\n", + "\n", + "#in practice, cross validation is likely better, but this simple example is fine for demonstration purposes\n", + "def rmse_obective(est, X, missing_add=.2, rng=1, fitted=False):\n", + " rng = np.random.default_rng(rng)\n", + " X_missing = X.copy()\n", + " missing_idx = rng.random(X.shape) < missing_add\n", + " X_missing[missing_idx] = np.nan\n", + " \n", + " if not fitted:\n", + " est.fit(X_missing)\n", + " \n", + " X_filled = est.transform(X_missing)\n", + " return np.sqrt(np.mean((X_filled[missing_idx] - X[missing_idx])**2))\n", + "\n", + "from sklearn.impute import SimpleImputer\n", + "\n", + "X, y = sklearn.datasets.load_diabetes(return_X_y=True)\n", + "\n", + "imp = SimpleImputer(strategy=\"mean\")\n", + "\n", + "rmse_obective(imp, X)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import tpot2.search_spaces\n", + "from ConfigSpace import ConfigurationSpace, Integer, Float, Categorical, Normal\n", + "\n", + "#set up an imputation search space that includes simple imputer, knn imputer, and iterative imputer (with an optimized ExtraTreesRegressor)\n", + "\n", + "simple_imputer = tpot2.config.get_search_space(\"SimpleImputer\")\n", + "knn_imputer = tpot2.config.get_search_space(\"KNNImputer\")\n", + "\n", + "space = ConfigurationSpace({ 'initial_strategy' : Categorical('initial_strategy', \n", + " ['mean', 'median', \n", + " 'most_frequent', 'constant']),\n", + " 'n_nearest_features' : Integer('n_nearest_features', \n", + " bounds=(1, X.shape[1])),\n", + " 'imputation_order' : Categorical('imputation_order', \n", + " ['ascending', 'descending', \n", + " 'roman', 'arabic', 'random']),\n", + "})\n", + "\n", + "# This optimizes both the iterative imputer parameters and the 
ExtraTreesRegressor parameters\n", + "iterative_imputer_sp = tpot2.search_spaces.pipelines.WrapperPipeline(\n", + " method = sklearn.impute.IterativeImputer,\n", + " space = space,\n", + " estimator_search_space = tpot2.config.get_search_space(\"ExtraTreesRegressor\"),\n", + ")\n", + "#this is equivalent to\n", + "# iterative_imputer_sp = tpot2.config.get_search_space(\"IterativeImputer_learned_estimators\")\n", + "\n", + "imputation_search_space = tpot2.search_spaces.pipelines.ChoicePipeline(\n", + " search_spaces = [simple_imputer, knn_imputer, iterative_imputer_sp],\n", + ")\n", + "imputation_search_space.generate().export_pipeline()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from functools import partial\n", + "\n", + "final_objective = partial(rmse_obective, X=X, missing_add=.2)\n", + "\n", + "est = tpot2.TPOTEstimator(\n", + " scorers = [],\n", + " scorers_weights = [],\n", + " other_objective_functions = [final_objective],\n", + " other_objective_functions_weights = [-1],\n", + " objective_function_names = [\"rmse\"],\n", + " classification = True,\n", + " search_space = imputation_search_space,\n", + " generations = 5,\n", + " max_eval_time_mins = 60*5,\n", + " verbose = 3,\n", + " n_jobs=20,\n", + ")\n", + "\n", + "est.fit(X, y=y)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# score the model\n", + "rmse_score = final_objective(est, fitted=True)\n", + "print(\"final rmse score\", rmse_score)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "est.fitted_pipeline_" + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -16105,10 +1396,10 @@ " classification = True,\n", " cv = 5,\n", " search_space = search_space,\n", - " population_size= 10,\n", " generations = 5,\n", " max_eval_time_mins = 60*5,\n", " verbose = 2,\n", + " n_jobs=20,\n", ")\n", "\n", 
"est.fit(X_train, y_train)" diff --git a/tpot2/config/get_configspace.py b/tpot2/config/get_configspace.py index a044c215..da680211 100644 --- a/tpot2/config/get_configspace.py +++ b/tpot2/config/get_configspace.py @@ -355,6 +355,9 @@ def get_configspace(name, n_classes=3, n_samples=1000, n_features=100, random_st return imputers.simple_imputer_cs case "IterativeImputer": return imputers.get_IterativeImputer_config_space(n_features=n_features, random_state=random_state) + case "IterativeImputer_no_estimator": + return imputers.get_IterativeImputer_config_space_no_estimator(n_features=n_features, random_state=random_state) + case "KNNImputer": return imputers.get_KNNImputer_config_space(n_samples=n_samples) @@ -449,12 +452,10 @@ def get_node(name, n_classes=3, n_samples=100, n_features=100, random_state=None ext = get_node("ExtraTreesRegressor", n_classes=n_classes, n_samples=n_samples, random_state=random_state) return WrapperPipeline(estimator_search_space=ext, method=SelectFromModel, space=sfm_sp) # TODO Add IterativeImputer with more estimator methods - ''' - if name == "IterativeImputer_learnedestimators": - iteative_sp = get_configspace(name="IterativeImputer", n_classes=n_classes, n_samples=n_samples, random_state=random_state) - regessor_searchspace = get_search_space(["LinearRegression", ..], n_classes=n_classes, n_samples=n_samples, random_state=random_state) - return WrapperPipeline(estimator_search_space=regressor_searchspace, method=ItartiveImputer, space=iteative_sp) - ''' + if name == "IterativeImputer_learned_estimators": + iteative_sp = get_configspace(name="IterativeImputer_no_estimator", n_features=n_features, random_state=random_state) + regressor_searchspace = get_node("ExtraTreesRegressor", n_classes=n_classes, n_samples=n_samples, random_state=random_state) + return WrapperPipeline(estimator_search_space=regressor_searchspace, method=IterativeImputer, space=iteative_sp) #these are nodes that have special search spaces which require custom 
parsing of the hyperparameters if name == "IterativeImputer": configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, random_state=random_state) diff --git a/tpot2/config/imputers.py b/tpot2/config/imputers.py index b991bf8d..34582b92 100644 --- a/tpot2/config/imputers.py +++ b/tpot2/config/imputers.py @@ -42,6 +42,25 @@ def get_IterativeImputer_config_space(n_features, random_state): cs.add([sampling_condition]) return cs +def get_IterativeImputer_config_space_no_estimator(n_features, random_state): + space = { 'initial_strategy' : Categorical('initial_strategy', + ['mean', 'median', + 'most_frequent', 'constant']), + 'n_nearest_features' : Integer('n_nearest_features', + bounds=(1, n_features)), + 'imputation_order' : Categorical('imputation_order', + ['ascending', 'descending', + 'roman', 'arabic', 'random']), + } + + if random_state is not None: + #This is required because configspace doesn't allow None as a value + space['random_state'] = random_state + + cs = ConfigurationSpace(space=space) + + return cs + def get_KNNImputer_config_space(n_samples): space = { 'n_neighbors': Integer('n_neighbors', bounds=(1, max(n_samples,100))), diff --git a/tpot2/tpot_estimator/estimator.py b/tpot2/tpot_estimator/estimator.py index 1e53ee64..e9e5e1ed 100644 --- a/tpot2/tpot_estimator/estimator.py +++ b/tpot2/tpot_estimator/estimator.py @@ -84,9 +84,6 @@ def __init__(self, generations_until_end_budget = 1, stepwise_steps = 5, - - - #dask parameters n_jobs=1, memory_limit = None, From 1fceb223ec870c1d477e9979c4f094269b95221d Mon Sep 17 00:00:00 2001 From: perib Date: Tue, 24 Sep 2024 17:14:06 -0700 Subject: [PATCH 18/44] removed unused param in estimator, add extra rng check so different sub-search spaces get different random seeds --- tpot2/config/get_configspace.py | 3 +++ tpot2/search_spaces/pipelines/choice.py | 1 + tpot2/search_spaces/pipelines/dynamic_linear.py | 1 + tpot2/search_spaces/pipelines/dynamicunion.py | 1 + 
tpot2/search_spaces/pipelines/sequential.py | 1 + tpot2/search_spaces/pipelines/tree.py | 1 + tpot2/search_spaces/pipelines/union.py | 1 + tpot2/search_spaces/pipelines/wrapper.py | 1 + tpot2/tpot_estimator/estimator.py | 10 ---------- tpot2/tpot_estimator/steady_state_estimator.py | 9 --------- 10 files changed, 10 insertions(+), 19 deletions(-) diff --git a/tpot2/config/get_configspace.py b/tpot2/config/get_configspace.py index da680211..b0fcd453 100644 --- a/tpot2/config/get_configspace.py +++ b/tpot2/config/get_configspace.py @@ -131,6 +131,8 @@ "classifiers_sklearnex" : ["RandomForestClassifier_sklearnex", "LogisticRegression_sklearnex", "KNeighborsClassifier_sklearnex", "SVC_sklearnex","NuSVC_sklearnex"], "regressors_sklearnex" : ["LinearRegression_sklearnex", "Ridge_sklearnex", "Lasso_sklearnex", "ElasticNet_sklearnex", "SVR_sklearnex", "NuSVR_sklearnex", "RandomForestRegressor_sklearnex", "KNeighborsRegressor_sklearnex"], + "genetic encoders" : ["DominantEncoder", "RecessiveEncoder", "HeterosisEncoder", "UnderDominanceEncoder", "OverDominanceEncoder"], + } @@ -456,6 +458,7 @@ def get_node(name, n_classes=3, n_samples=100, n_features=100, random_state=None iteative_sp = get_configspace(name="IterativeImputer_no_estimator", n_features=n_features, random_state=random_state) regressor_searchspace = get_node("ExtraTreesRegressor", n_classes=n_classes, n_samples=n_samples, random_state=random_state) return WrapperPipeline(estimator_search_space=regressor_searchspace, method=IterativeImputer, space=iteative_sp) + #these are nodes that have special search spaces which require custom parsing of the hyperparameters if name == "IterativeImputer": configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, random_state=random_state) diff --git a/tpot2/search_spaces/pipelines/choice.py b/tpot2/search_spaces/pipelines/choice.py index da1fcfd0..24f597a4 100644 --- a/tpot2/search_spaces/pipelines/choice.py +++ b/tpot2/search_spaces/pipelines/choice.py @@ 
-50,4 +50,5 @@ def __init__(self, search_spaces : List[SklearnIndividualGenerator] ) -> None: """ def generate(self, rng=None): + rng = np.random.default_rng(rng) return ChoicePipelineIndividual(self.search_spaces, rng=rng) \ No newline at end of file diff --git a/tpot2/search_spaces/pipelines/dynamic_linear.py b/tpot2/search_spaces/pipelines/dynamic_linear.py index e58005d3..dedecc55 100644 --- a/tpot2/search_spaces/pipelines/dynamic_linear.py +++ b/tpot2/search_spaces/pipelines/dynamic_linear.py @@ -148,4 +148,5 @@ def __init__(self, search_space : SklearnIndividualGenerator, max_length: int ) """ def generate(self, rng=None): + rng = np.random.default_rng(rng) return DynamicLinearPipelineIndividual(self.search_space, self.max_length, rng=rng) \ No newline at end of file diff --git a/tpot2/search_spaces/pipelines/dynamicunion.py b/tpot2/search_spaces/pipelines/dynamicunion.py index 4f1f94ac..53b91d9d 100644 --- a/tpot2/search_spaces/pipelines/dynamicunion.py +++ b/tpot2/search_spaces/pipelines/dynamicunion.py @@ -160,4 +160,5 @@ def __init__(self, search_space : SklearnIndividualGenerator, max_estimators=Non self.allow_repeats = allow_repeats def generate(self, rng=None): + rng = np.random.default_rng(rng) return DynamicUnionPipelineIndividual(self.search_space, max_estimators=self.max_estimators, allow_repeats=self.allow_repeats, rng=rng) \ No newline at end of file diff --git a/tpot2/search_spaces/pipelines/sequential.py b/tpot2/search_spaces/pipelines/sequential.py index 3fae8a52..dde5fd6e 100644 --- a/tpot2/search_spaces/pipelines/sequential.py +++ b/tpot2/search_spaces/pipelines/sequential.py @@ -145,4 +145,5 @@ def __init__(self, search_spaces : List[SklearnIndividualGenerator] ) -> None: self.search_spaces = search_spaces def generate(self, rng=None): + rng = np.random.default_rng(rng) return SequentialPipelineIndividual(self.search_spaces, rng=rng) \ No newline at end of file diff --git a/tpot2/search_spaces/pipelines/tree.py 
b/tpot2/search_spaces/pipelines/tree.py index 83988656..649bbae3 100644 --- a/tpot2/search_spaces/pipelines/tree.py +++ b/tpot2/search_spaces/pipelines/tree.py @@ -47,4 +47,5 @@ def __init__(self, root_search_space : SklearnIndividualGenerator, self.crossover_same_depth = crossover_same_depth def generate(self, rng=None): + rng = np.random.default_rng(rng) return TreePipelineIndividual(self.search_space, self.leaf_search_space, self.inner_search_space, self.min_size, self.max_size, self.crossover_same_depth, rng=rng) \ No newline at end of file diff --git a/tpot2/search_spaces/pipelines/union.py b/tpot2/search_spaces/pipelines/union.py index a8fd392b..7121cb08 100644 --- a/tpot2/search_spaces/pipelines/union.py +++ b/tpot2/search_spaces/pipelines/union.py @@ -78,4 +78,5 @@ def __init__(self, search_spaces : List[SklearnIndividualGenerator] ) -> None: self.search_spaces = search_spaces def generate(self, rng=None): + rng = np.random.default_rng(rng) return UnionPipelineIndividual(self.search_spaces, rng=rng) \ No newline at end of file diff --git a/tpot2/search_spaces/pipelines/wrapper.py b/tpot2/search_spaces/pipelines/wrapper.py index 1b5807c8..f1908764 100644 --- a/tpot2/search_spaces/pipelines/wrapper.py +++ b/tpot2/search_spaces/pipelines/wrapper.py @@ -148,4 +148,5 @@ def __init__( self.wrapped_param_name = wrapped_param_name def generate(self, rng=None): + rng = np.random.default_rng(rng) return WrapperPipelineIndividual(method=self.method, space=self.space, estimator_search_space=self.estimator_search_space, hyperparameter_parser=self.hyperparameter_parser, wrapped_param_name=self.wrapped_param_name, rng=rng) \ No newline at end of file diff --git a/tpot2/tpot_estimator/estimator.py b/tpot2/tpot_estimator/estimator.py index e9e5e1ed..9a3224fd 100644 --- a/tpot2/tpot_estimator/estimator.py +++ b/tpot2/tpot_estimator/estimator.py @@ -46,7 +46,6 @@ def __init__(self, memory = None, categorical_features = None, - subsets = None, preprocessing = False, 
population_size = 50, initial_population_size = None, @@ -174,14 +173,6 @@ def __init__(self, - None : If None, TPOT2 will automatically use object columns in pandas dataframes as objects for one hot encoding in preprocessing. - List of categorical features. If X is a dataframe, this should be a list of column names. If X is a numpy array, this should be a list of column indices - subsets : str or list, default=None - Sets the subsets that the FeatureSetSeletor will select from if set as an option in one of the configuration dictionaries. - - str : If a string, it is assumed to be a path to a csv file with the subsets. - The first column is assumed to be the name of the subset and the remaining columns are the features in the subset. - - list or np.ndarray : If a list or np.ndarray, it is assumed to be a list of subsets. - - None : If None, each column will be treated as a subset. One column will be selected per subset. - If subsets is None, each column will be treated as a subset. One column will be selected per subset. - preprocessing : bool or BaseEstimator/Pipeline, EXPERIMENTAL A pipeline that will be used to preprocess the data before CV. Note that the parameters for these steps are not optimized. Add them to the search space to be optimized. @@ -382,7 +373,6 @@ def __init__(self, self.memory = memory self.categorical_features = categorical_features - self.subsets = subsets self.preprocessing = preprocessing self.validation_strategy = validation_strategy diff --git a/tpot2/tpot_estimator/steady_state_estimator.py b/tpot2/tpot_estimator/steady_state_estimator.py index b86b9f8f..d4819e3d 100644 --- a/tpot2/tpot_estimator/steady_state_estimator.py +++ b/tpot2/tpot_estimator/steady_state_estimator.py @@ -196,14 +196,6 @@ def __init__(self, - None : If None, TPOT2 will automatically use object columns in pandas dataframes as objects for one hot encoding in preprocessing. - List of categorical features. If X is a dataframe, this should be a list of column names. 
If X is a numpy array, this should be a list of column indices - subsets : str or list, default=None - Sets the subsets that the FeatureSetSeletor will select from if set as an option in one of the configuration dictionaries. - - str : If a string, it is assumed to be a path to a csv file with the subsets. - The first column is assumed to be the name of the subset and the remaining columns are the features in the subset. - - list or np.ndarray : If a list or np.ndarray, it is assumed to be a list of subsets. - - None : If None, each column will be treated as a subset. One column will be selected per subset. - If subsets is None, each column will be treated as a subset. One column will be selected per subset. - memory: Memory object or string, default=None If supplied, pipeline will cache each transformer after calling fit with joblib.Memory. This feature @@ -426,7 +418,6 @@ def __init__(self, self.memory = memory self.categorical_features = categorical_features - self.subsets = subsets self.preprocessing = preprocessing self.validation_strategy = validation_strategy self.validation_fraction = validation_fraction From 602e89e12ef99a6a1fec8270de8e975c27d7df87 Mon Sep 17 00:00:00 2001 From: perib Date: Tue, 24 Sep 2024 17:14:20 -0700 Subject: [PATCH 19/44] edits to tutorial 2, rewrote tutorial 3 --- Tutorial/2_Search_Spaces.ipynb | 18042 +++++++++++++++++++++++- Tutorial/3_Feature_Set_Selector.ipynb | 3720 ++++- 2 files changed, 21232 insertions(+), 530 deletions(-) diff --git a/Tutorial/2_Search_Spaces.ipynb b/Tutorial/2_Search_Spaces.ipynb index b07626c4..fffbffdc 100644 --- a/Tutorial/2_Search_Spaces.ipynb +++ b/Tutorial/2_Search_Spaces.ipynb @@ -28,9 +28,441 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 21, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled hyperparameters\n", + "{'bootstrap': False, 'criterion': 'gini', 'max_features': 0.0696410090574, 
'min_samples_leaf': 7, 'min_samples_split': 8, 'n_estimators': 128}\n" + ] + }, + { + "data": { + "text/html": [ + "
RandomForestClassifier(bootstrap=False, max_features=0.0696410090574,\n",
+       "                       min_samples_leaf=7, min_samples_split=8,\n",
+       "                       n_estimators=128)
" + ], + "text/plain": [ + "RandomForestClassifier(bootstrap=False, max_features=0.0696410090574,\n", + " min_samples_leaf=7, min_samples_split=8,\n", + " n_estimators=128)" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "from ConfigSpace import ConfigurationSpace\n", "from ConfigSpace import ConfigurationSpace, Integer, Float, Categorical, Normal\n", @@ -69,9 +501,441 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 22, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled hyperparameters\n", + "{'bootstrap': False, 'criterion': 'entropy', 'max_features': 0.2320810853841, 'min_samples_leaf': 19, 'min_samples_split': 12, 'n_estimators': 128}\n" + ] + }, + { + "data": { + "text/html": [ + "
RandomForestClassifier(bootstrap=False, criterion='entropy',\n",
+       "                       max_features=0.2320810853841, min_samples_leaf=19,\n",
+       "                       min_samples_split=12, n_estimators=128)
" + ], + "text/plain": [ + "RandomForestClassifier(bootstrap=False, criterion='entropy',\n", + " max_features=0.2320810853841, min_samples_leaf=19,\n", + " min_samples_split=12, n_estimators=128)" + ] + }, + "execution_count": 22, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "rf_configspace = ConfigurationSpace(\n", " space = {\n", @@ -153,7 +1017,7 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 23, "metadata": {}, "outputs": [], "source": [ @@ -189,9 +1053,20 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 24, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 24, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "knn_individual = knn_node.generate()\n", "knn_individual" @@ -199,9 +1074,18 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 25, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled hyperparameters\n", + "{'metric': 'minkowski', 'n_jobs': 1, 'n_neighbors': 7, 'p': 1, 'weights': 'uniform'}\n" + ] + } + ], "source": [ "print(\"sampled hyperparameters\")\n", "print(knn_individual.hyperparameters)" @@ -216,9 +1100,18 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 26, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "mutated hyperparameters\n", + "{'metric': 'minkowski', 'n_jobs': 1, 'n_neighbors': 7, 'p': 2, 'weights': 'uniform'}\n" + ] + } + ], "source": [ "knn_individual.mutate() # mutate the individual\n", "print(\"mutated hyperparameters\")\n", @@ -234,9 +1127,25 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 27, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "original hyperparameters for individual 
1\n", + "{'metric': 'minkowski', 'n_jobs': 1, 'n_neighbors': 3, 'p': 3, 'weights': 'uniform'}\n", + "original hyperparameters for individual 2\n", + "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 7, 'p': 3, 'weights': 'distance'}\n", + "\n", + "post crossover hyperparameters for individual 1\n", + "{'metric': 'minkowski', 'n_jobs': 1, 'n_neighbors': 7, 'p': 3, 'weights': 'uniform'}\n", + "post crossover hyperparameters for individual 2\n", + "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 7, 'p': 3, 'weights': 'distance'}\n" + ] + } + ], "source": [ "knn_individual1 = knn_node.generate()\n", "knn_individual2 = knn_node.generate()\n", @@ -266,9 +1175,427 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 28, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
KNeighborsClassifier(n_jobs=1, n_neighbors=7, p=3)
" + ], + "text/plain": [ + "KNeighborsClassifier(n_jobs=1, n_neighbors=7, p=3)" + ] + }, + "execution_count": 28, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "est = knn_individual1.export_pipeline()\n", "est" @@ -283,9 +1610,427 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 29, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
KNeighborsClassifier(n_neighbors=10)
" + ], + "text/plain": [ + "KNeighborsClassifier(n_neighbors=10)" + ] + }, + "execution_count": 29, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "import tpot2\n", "from ConfigSpace import ConfigurationSpace\n", @@ -334,9 +2079,20 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 30, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 30, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "import tpot2\n", "from ConfigSpace import ConfigurationSpace\n", @@ -430,9 +2186,437 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 31, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline\n" + ] + }, + { + "data": { + "text/html": [ + "
DecisionTreeClassifier(max_depth=11, max_features='sqrt', min_samples_leaf=20,\n",
+       "                       min_samples_split=10)
" + ], + "text/plain": [ + "DecisionTreeClassifier(max_depth=11, max_features='sqrt', min_samples_leaf=20,\n", + " min_samples_split=10)" + ] + }, + "execution_count": 31, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "classifier_individual = classifier_node.generate()\n", "\n", @@ -442,9 +2626,437 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 32, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "mutated pipeline\n" + ] + }, + { + "data": { + "text/html": [ + "
KNeighborsClassifier(metric='euclidean', n_jobs=1, n_neighbors=6,\n",
+       "                     weights='distance')
" + ], + "text/plain": [ + "KNeighborsClassifier(metric='euclidean', n_jobs=1, n_neighbors=6,\n", + " weights='distance')" + ] + }, + "execution_count": 32, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "print(\"mutated pipeline\")\n", "classifier_individual.mutate()\n", @@ -455,11 +3067,142 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "TPOT2 also comes with predefined search spaces. The current search spaces were adapted from a combination of the original TPOT package as well as the search spaces used in [AutoSklearn](https://github.com/automl/auto-sklearn/tree/development/autosklearn/pipeline/components). The helper function `tpot2.config.get_search_space` takes in a string or a list of strings, and returns either a EstimatorNode or a ChoicePipeline,respectively. \n", + "#### Built in search spaces for EstimatorNode and ChoicePipeline" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "TPOT2 also comes with predefined hyperparameter search spaces. The current search spaces were adapted from a combination of the original TPOT package as well as the search spaces used in [AutoSklearn](https://github.com/automl/auto-sklearn/tree/development/autosklearn/pipeline/components). The helper function `tpot2.config.get_search_space` takes in a string or a list of strings, and returns either a EstimatorNode or a ChoicePipeline (including all methods in the list), respectively. 
\n", + "\n", + "| String | Corresponding Method |\n", + "| --- | ----- |\n", + "| SGDClassifier | |\n", + "| RandomForestClassifier | |\n", + "| ExtraTreesClassifier | |\n", + "| GradientBoostingClassifier | |\n", + "| MLPClassifier | |\n", + "| DecisionTreeClassifier | |\n", + "| XGBClassifier | |\n", + "| KNeighborsClassifier | |\n", + "| SVC | |\n", + "| LogisticRegression | |\n", + "| LGBMClassifier | |\n", + "| LinearSVC | |\n", + "| GaussianNB | |\n", + "| BernoulliNB | |\n", + "| MultinomialNB | |\n", + "| ExtraTreesRegressor | |\n", + "| RandomForestRegressor | |\n", + "| GradientBoostingRegressor | |\n", + "| BaggingRegressor | |\n", + "| DecisionTreeRegressor | |\n", + "| KNeighborsRegressor | |\n", + "| XGBRegressor | |\n", + "| ZeroCount | |\n", + "| ColumnOneHotEncoder | |\n", + "| Binarizer | |\n", + "| FastICA | |\n", + "| FeatureAgglomeration | |\n", + "| MaxAbsScaler | |\n", + "| MinMaxScaler | |\n", + "| Normalizer | |\n", + "| Nystroem | |\n", + "| PCA | |\n", + "| PolynomialFeatures | |\n", + "| RBFSampler | |\n", + "| RobustScaler | |\n", + "| StandardScaler | |\n", + "| SelectFwe | |\n", + "| SelectPercentile | |\n", + "| VarianceThreshold | |\n", + "| SGDRegressor | |\n", + "| Ridge | |\n", + "| Lasso | |\n", + "| ElasticNet | |\n", + "| Lars | |\n", + "| LassoLars | |\n", + "| LassoLarsCV | |\n", + "| RidgeCV | |\n", + "| SVR | |\n", + "| LinearSVR | |\n", + "| AdaBoostRegressor | |\n", + "| ElasticNetCV | |\n", + "| AdaBoostClassifier | |\n", + "| MLPRegressor | |\n", + "| GaussianProcessRegressor | |\n", + "| HistGradientBoostingClassifier | |\n", + "| HistGradientBoostingRegressor | |\n", + "| AddTransformer | |\n", + "| mul_neg_1_Transformer | |\n", + "| MulTransformer | |\n", + "| SafeReciprocalTransformer | |\n", + "| EQTransformer | |\n", + "| NETransformer | |\n", + "| GETransformer | |\n", + "| GTTransformer | |\n", + "| LETransformer | |\n", + "| LTTransformer | |\n", + "| MinTransformer | |\n", + "| MaxTransformer | |\n", + "| 
ZeroTransformer | |\n", + "| OneTransformer | |\n", + "| NTransformer | |\n", + "| PowerTransformer | |\n", + "| QuantileTransformer | |\n", + "| ARDRegression | |\n", + "| QuadraticDiscriminantAnalysis | |\n", + "| PassiveAggressiveClassifier | |\n", + "| LinearDiscriminantAnalysis | |\n", + "| DominantEncoder | |\n", + "| RecessiveEncoder | |\n", + "| HeterosisEncoder | |\n", + "| UnderDominanceEncoder | |\n", + "| OverDominanceEncoder | |\n", + "| GaussianProcessClassifier | |\n", + "| BaggingClassifier | |\n", + "| LGBMRegressor | |\n", + "| Passthrough | |\n", + "| SkipTransformer | |\n", + "| PassKBinsDiscretizer | |\n", + "| SimpleImputer | |\n", + "| IterativeImputer | |\n", + "| KNNImputer | |\n", + "| MDR | |\n", + "| ContinuousMDR | |\n", + "| ReliefF | |\n", + "| SURF | |\n", + "| SURFstar | |\n", + "| MultiSURF | |\n", + "| LinearRegression_sklearnex | |\n", + "| Ridge_sklearnex | |\n", + "| Lasso_sklearnex | |\n", + "| ElasticNet_sklearnex | |\n", + "| SVR_sklearnex | |\n", + "| NuSVR_sklearnex | |\n", + "| RandomForestRegressor_sklearnex | |\n", + "| KNeighborsRegressor_sklearnex | |\n", + "| RandomForestClassifier_sklearnex | |\n", + "| KNeighborsClassifier_sklearnex | |\n", + "| SVC_sklearnex | |\n", + "| NuSVC_sklearnex | |\n", + "| LogisticRegression_sklearnex | |\n", + "\n", + "Some methods require a wrapped estimator. 
To account for both regression and classification, these have been grouped separately with their own special strings.\n", + "\n", + "| Wrapper Special String | Notes |\n", + "| :--- | :----: |\n", + "| RFE_classification | RFE with learned ExtraTreesClassifier |\n", + "| RFE_regression | RFE with learned ExtraTreesRegressor |\n", + "| SelectFromModel_classification | SelectFromModel with learned ExtraTreesClassifier |\n", + "| SelectFromModel_regression | SelectFromModel with learned ExtraTreesRegressor |\n", + "| IterativeImputer_learned_estimators | IterativeImputer with learned ExtraTreesRegressor |\n", "\n", - "strings can correspond to individual methods. There are also special strings that return predefined lists of methods. \n", "\n", - "| Special String | Included methods |\n", + "There are also special strings that include predefined lists of methods. These will return a ChoicePipeline of the included methods.\n", + "\n", + "| List Special String | Included methods |\n", "| :--- | :----: |\n", "| \"selectors\" | [\"SelectFwe\", \"SelectPercentile\", \"VarianceThreshold\",] |\n", "| \"selectors_classification\" | [\"SelectFwe\", \"SelectPercentile\", \"VarianceThreshold\", \"RFE_classification\", \"SelectFromModel_classification\"] |\n", @@ -474,14 +3217,450 @@ "| \"skrebate\" | [\"ReliefF\", \"SURF\", \"SURFstar\", \"MultiSURF\"] |\n", "| \"genetic_encoders\" | [\"DominantEncoder\", \"RecessiveEncoder\", \"HeterosisEncoder\", \"UnderDominanceEncoder\", \"OverDominanceEncoder\"] |\n", "| \"classifiers_sklearnex\" | [\"RandomForestClassifier_sklearnex\", \"LogisticRegression_sklearnex\", \"KNeighborsClassifier_sklearnex\", \"SVC_sklearnex\",\"NuSVC_sklearnex\"] |\n", - "| \"regressors_sklearnex\" | 
[\"LinearRegression_sklearnex\", \"Ridge_sklearnex\", \"Lasso_sklearnex\", \"ElasticNet_sklearnex\", \"SVR_sklearnex\", \"NuSVR_sklearnex\", \"RandomForestRegressor_sklearnex\", \"KNeighborsRegressor_sklearnex\"] |\n", + "| \"genetic encoders\" | [\"DominantEncoder\", \"RecessiveEncoder\", \"HeterosisEncoder\", \"UnderDominanceEncoder\", \"OverDominanceEncoder\"] |" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Here are some examples of how to get search spaces using the `get_search_space` function. " ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 33, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline 1\n" + ] + }, + { + "data": { + "text/html": [ + "
DecisionTreeClassifier(max_depth=3, max_features='sqrt', min_samples_leaf=16,\n",
+       "                       min_samples_split=8)
" + ], + "text/plain": [ + "DecisionTreeClassifier(max_depth=3, max_features='sqrt', min_samples_leaf=16,\n", + " min_samples_split=8)" + ] + }, + "execution_count": 33, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "#same pipeline search space as before.\n", "classifier_choice = tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"])\n", @@ -492,9 +3671,434 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 34, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline 2\n" + ] + }, + { + "data": { + "text/html": [ + "
LogisticRegression(C=203.4209981734027, max_iter=1000, n_jobs=1, solver='saga')
" + ], + "text/plain": [ + "LogisticRegression(C=203.4209981734027, max_iter=1000, n_jobs=1, solver='saga')" + ] + }, + "execution_count": 34, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "print(\"sampled pipeline 2\")\n", "classifier_choice.generate().export_pipeline()" @@ -502,9 +4106,434 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 35, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline 1\n" + ] + }, + { + "data": { + "text/html": [ + "
GaussianNB()
" + ], + "text/plain": [ + "GaussianNB()" + ] + }, + "execution_count": 35, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "#search space for all classifiers\n", "classifier_choice = tpot2.config.get_search_space(\"classifiers\")\n", @@ -515,14 +4544,881 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 36, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline 2\n" + ] + }, + { + "data": { + "text/html": [ + "
MultinomialNB(alpha=0.2214451695279)
" + ], + "text/plain": [ + "MultinomialNB(alpha=0.2214451695279)" + ] + }, + "execution_count": 36, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "print(\"sampled pipeline 2\")\n", "classifier_choice.generate().export_pipeline()" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### A note on reproducibility \n", + "Many sklearn estimators, like RandomForestClassifier, are stochastic and require a random_state parameter in order to have deterministic results. If you want TPOT runs to be reproducible, it is important that the estimators used by TPOT have a random state set. TPOT will not automatically set this value. This can either be set manually in each search space, or by passing in the random state to the `get_search_space` function. For example: " + ] + }, + { + "cell_type": "code", + "execution_count": 75, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
RandomForestClassifier(bootstrap=False, max_features=0.5976648428162,\n",
+       "                       min_samples_leaf=4, min_samples_split=7,\n",
+       "                       n_estimators=128, random_state=1)
" + ], + "text/plain": [ + "RandomForestClassifier(bootstrap=False, max_features=0.5976648428162,\n", + " min_samples_leaf=4, min_samples_split=7,\n", + " n_estimators=128, random_state=1)" + ] + }, + "execution_count": 75, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "reproducible_random_forest = tpot2.config.get_search_space(\"RandomForestClassifier\", random_state=1)\n", + "reproducible_random_forest.generate().export_pipeline()" + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -534,9 +5430,450 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 37, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline\n" + ] + }, + { + "data": { + "text/html": [ + "
Pipeline(steps=[('variancethreshold',\n",
+       "                 VarianceThreshold(threshold=0.0061644734724)),\n",
+       "                ('pca', PCA(n_components=0.5803735556718)),\n",
+       "                ('logisticregression',\n",
+       "                 LogisticRegression(C=0.0331002885417, class_weight='balanced',\n",
+       "                                    max_iter=1000, n_jobs=1, solver='saga'))])
" + ], + "text/plain": [ + "Pipeline(steps=[('variancethreshold',\n", + " VarianceThreshold(threshold=0.0061644734724)),\n", + " ('pca', PCA(n_components=0.5803735556718)),\n", + " ('logisticregression',\n", + " LogisticRegression(C=0.0331002885417, class_weight='balanced',\n", + " max_iter=1000, n_jobs=1, solver='saga'))])" + ] + }, + "execution_count": 37, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "selector_choicepipeline = tpot2.config.get_search_space(\"VarianceThreshold\")\n", "transformer_choicepipeline = tpot2.config.get_search_space(\"PCA\")\n", @@ -563,9 +5900,443 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 38, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline\n" + ] + }, + { + "data": { + "text/html": [ + "
Pipeline(steps=[('selectpercentile',\n",
+       "                 SelectPercentile(percentile=82.0501639184698)),\n",
+       "                ('zerocount', ZeroCount()),\n",
+       "                ('multinomialnb', MultinomialNB(alpha=0.7116498874199))])
" + ], + "text/plain": [ + "Pipeline(steps=[('selectpercentile',\n", + " SelectPercentile(percentile=82.0501639184698)),\n", + " ('zerocount', ZeroCount()),\n", + " ('multinomialnb', MultinomialNB(alpha=0.7116498874199))])" + ] + }, + "execution_count": 38, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "selector_choicepipeline = tpot2.config.get_search_space(\"selectors\")\n", "transformer_choicepipeline = tpot2.config.get_search_space(\"transformers\")\n", @@ -583,9 +6354,469 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 39, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline\n" + ] + }, + { + "data": { + "text/html": [ + "
Pipeline(steps=[('variancethreshold',\n",
+       "                 VarianceThreshold(threshold=0.0090024083095)),\n",
+       "                ('nystroem',\n",
+       "                 Nystroem(gamma=0.326846805684, kernel='chi2',\n",
+       "                          n_components=49)),\n",
+       "                ('mlpclassifier',\n",
+       "                 MLPClassifier(activation='identity', alpha=0.0009288789905,\n",
+       "                               early_stopping=True,\n",
+       "                               hidden_layer_sizes=[265, 265],\n",
+       "                               learning_rate='invscaling',\n",
+       "                               learning_rate_init=0.0366758440485,\n",
+       "                               n_iter_no_change=32))])
" + ], + "text/plain": [ + "Pipeline(steps=[('variancethreshold',\n", + " VarianceThreshold(threshold=0.0090024083095)),\n", + " ('nystroem',\n", + " Nystroem(gamma=0.326846805684, kernel='chi2',\n", + " n_components=49)),\n", + " ('mlpclassifier',\n", + " MLPClassifier(activation='identity', alpha=0.0009288789905,\n", + " early_stopping=True,\n", + " hidden_layer_sizes=[265, 265],\n", + " learning_rate='invscaling',\n", + " learning_rate_init=0.0366758440485,\n", + " n_iter_no_change=32))])" + ] + }, + "execution_count": 39, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "print(\"sampled pipeline\")\n", "stc_pipeline.generate().export_pipeline()" @@ -602,9 +6833,437 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 40, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline\n" + ] + }, + { + "data": { + "text/html": [ + "
Pipeline(steps=[('zerocount-1', ZeroCount()), ('zerocount-2', ZeroCount()),\n",
+       "                ('minmaxscaler', MinMaxScaler())])
" + ], + "text/plain": [ + "Pipeline(steps=[('zerocount-1', ZeroCount()), ('zerocount-2', ZeroCount()),\n", + " ('minmaxscaler', MinMaxScaler())])" + ] + }, + "execution_count": 40, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "import tpot2.config\n", "\n", @@ -616,9 +7275,479 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 41, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline\n" + ] + }, + { + "data": { + "text/html": [ + "
Pipeline(steps=[('selectpercentile',\n",
+       "                 SelectPercentile(percentile=61.5466222112372)),\n",
+       "                ('robustscaler',\n",
+       "                 RobustScaler(quantile_range=(0.0479806149183,\n",
+       "                                              0.9674592383627))),\n",
+       "                ('selectfrommodel',\n",
+       "                 SelectFromModel(estimator=ExtraTreesClassifier(bootstrap=True,\n",
+       "                                                                criterion='entropy',\n",
+       "                                                                max_features=0.3260066399479,\n",
+       "                                                                min_samples_leaf=6,\n",
+       "                                                                min_samples_split=8,\n",
+       "                                                                n_jobs=1),\n",
+       "                                 threshold=0.0001984121028))])
" + ], + "text/plain": [ + "Pipeline(steps=[('selectpercentile',\n", + " SelectPercentile(percentile=61.5466222112372)),\n", + " ('robustscaler',\n", + " RobustScaler(quantile_range=(0.0479806149183,\n", + " 0.9674592383627))),\n", + " ('selectfrommodel',\n", + " SelectFromModel(estimator=ExtraTreesClassifier(bootstrap=True,\n", + " criterion='entropy',\n", + " max_features=0.3260066399479,\n", + " min_samples_leaf=6,\n", + " min_samples_split=8,\n", + " n_jobs=1),\n", + " threshold=0.0001984121028))])" + ] + }, + "execution_count": 41, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "print(\"sampled pipeline\")\n", "linear_feature_engineering.generate().export_pipeline()" @@ -626,9 +7755,471 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 42, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline\n" + ] + }, + { + "data": { + "text/html": [ + "
Pipeline(steps=[('pipeline',\n",
+       "                 Pipeline(steps=[('binarizer',\n",
+       "                                  Binarizer(threshold=0.4321512765788)),\n",
+       "                                 ('pca', PCA(n_components=0.6918117427918)),\n",
+       "                                 ('passkbinsdiscretizer',\n",
+       "                                  PassKBinsDiscretizer(n_bins=42))])),\n",
+       "                ('extratreesclassifier',\n",
+       "                 ExtraTreesClassifier(class_weight='balanced',\n",
+       "                                      criterion='entropy',\n",
+       "                                      max_features=0.169455524505,\n",
+       "                                      min_samples_leaf=8, min_samples_split=14,\n",
+       "                                      n_jobs=1))])
" + ], + "text/plain": [ + "Pipeline(steps=[('pipeline',\n", + " Pipeline(steps=[('binarizer',\n", + " Binarizer(threshold=0.4321512765788)),\n", + " ('pca', PCA(n_components=0.6918117427918)),\n", + " ('passkbinsdiscretizer',\n", + " PassKBinsDiscretizer(n_bins=42))])),\n", + " ('extratreesclassifier',\n", + " ExtraTreesClassifier(class_weight='balanced',\n", + " criterion='entropy',\n", + " max_features=0.169455524505,\n", + " min_samples_leaf=8, min_samples_split=14,\n", + " n_jobs=1))])" + ] + }, + "execution_count": 42, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "full_search_space = tpot2.search_spaces.pipelines.SequentialPipeline([\n", " linear_feature_engineering,\n", @@ -641,9 +8232,506 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 43, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "sampled pipeline\n" + ] + }, + { + "data": { + "text/html": [ + "
Pipeline(steps=[('pipeline',\n",
+       "                 Pipeline(steps=[('rfe',\n",
+       "                                  RFE(estimator=ExtraTreesClassifier(bootstrap=True,\n",
+       "                                                                     criterion='entropy',\n",
+       "                                                                     max_features=0.0135775754498,\n",
+       "                                                                     min_samples_leaf=8,\n",
+       "                                                                     min_samples_split=6,\n",
+       "                                                                     n_jobs=1),\n",
+       "                                      step=0.7236899597647)),\n",
+       "                                 ('columnonehotencoder', ColumnOneHotEncoder()),\n",
+       "                                 ('featureagglomeration',\n",
+       "                                  FeatureAgglomeration(linkage='average',\n",
+       "                                                       metric='l2',\n",
+       "                                                       n_clusters=150))])),\n",
+       "                ('sgdclassifier',\n",
+       "                 SGDClassifier(alpha=0.0009821180851, eta0=0.1666104101354,\n",
+       "                               l1_ratio=0.7504578619487, loss='modified_huber',\n",
+       "                               n_jobs=1, penalty='elasticnet'))])
" + ], + "text/plain": [ + "Pipeline(steps=[('pipeline',\n", + " Pipeline(steps=[('rfe',\n", + " RFE(estimator=ExtraTreesClassifier(bootstrap=True,\n", + " criterion='entropy',\n", + " max_features=0.0135775754498,\n", + " min_samples_leaf=8,\n", + " min_samples_split=6,\n", + " n_jobs=1),\n", + " step=0.7236899597647)),\n", + " ('columnonehotencoder', ColumnOneHotEncoder()),\n", + " ('featureagglomeration',\n", + " FeatureAgglomeration(linkage='average',\n", + " metric='l2',\n", + " n_clusters=150))])),\n", + " ('sgdclassifier',\n", + " SGDClassifier(alpha=0.0009821180851, eta0=0.1666104101354,\n", + " l1_ratio=0.7504578619487, loss='modified_huber',\n", + " n_jobs=1, penalty='elasticnet'))])" + ] + }, + "execution_count": 43, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "print(\"sampled pipeline\")\n", "full_search_space.generate().export_pipeline()" @@ -660,9 +8748,430 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 44, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
FeatureUnion(transformer_list=[('zerocount', ZeroCount()),\n",
+       "                               ('passthrough', Passthrough())])
" + ], + "text/plain": [ + "FeatureUnion(transformer_list=[('zerocount', ZeroCount()),\n", + " ('passthrough', Passthrough())])" + ] + }, + "execution_count": 44, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "transform_and_passthrough = tpot2.search_spaces.pipelines.UnionPipeline([\n", " tpot2.config.get_search_space(\"transformers\"),\n", @@ -681,9 +9190,454 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 45, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
Pipeline(steps=[('selectfwe', SelectFwe(alpha=0.0104695394381)),\n",
+       "                ('featureunion',\n",
+       "                 FeatureUnion(transformer_list=[('quantiletransformer',\n",
+       "                                                 QuantileTransformer(n_quantiles=93)),\n",
+       "                                                ('passthrough',\n",
+       "                                                 Passthrough())])),\n",
+       "                ('svc',\n",
+       "                 SVC(C=0.5015595860816, coef0=0.5773095995375, kernel='sigmoid',\n",
+       "                     max_iter=3000, probability=True))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + ], + "text/plain": [ + "Pipeline(steps=[('selectfwe', SelectFwe(alpha=0.0104695394381)),\n", + " ('featureunion',\n", + " FeatureUnion(transformer_list=[('quantiletransformer',\n", + " QuantileTransformer(n_quantiles=93)),\n", + " ('passthrough',\n", + " Passthrough())])),\n", + " ('svc',\n", + " SVC(C=0.5015595860816, coef0=0.5773095995375, kernel='sigmoid',\n", + " max_iter=3000, probability=True))])" + ] + }, + "execution_count": 45, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "stc_pipeline2 = tpot2.search_spaces.pipelines.SequentialPipeline([\n", " tpot2.config.get_search_space(\"selectors\"),\n", @@ -703,9 +9657,516 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 46, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
Pipeline(steps=[('featureunion',\n",
+       "                 FeatureUnion(transformer_list=[('pipeline-1',\n",
+       "                                                 Pipeline(steps=[('selectfwe',\n",
+       "                                                                  SelectFwe(alpha=0.0013091622594)),\n",
+       "                                                                 ('nystroem',\n",
+       "                                                                  Nystroem(gamma=0.2527764721894,\n",
+       "                                                                           kernel='additive_chi2',\n",
+       "                                                                           n_components=40))])),\n",
+       "                                                ('pipeline-2',\n",
+       "                                                 Pipeline(steps=[('variancethreshold',\n",
+       "                                                                  VarianceThreshold(threshold=0.0130961185337)),\n",
+       "                                                                 ('featureagglomeration',\n",
+       "                                                                  Fe...ge='average',\n",
+       "                                                                                       metric='l2',\n",
+       "                                                                                       n_clusters=293,\n",
+       "                                                                                       pooling_func=<function median at 0x73f3c1bda370>))]))])),\n",
+       "                ('histgradientboostingclassifier',\n",
+       "                 HistGradientBoostingClassifier(early_stopping=False,\n",
+       "                                                l2_regularization=3.1669452e-06,\n",
+       "                                                learning_rate=0.1262523910078,\n",
+       "                                                max_features=0.8008565064114,\n",
+       "                                                max_leaf_nodes=1504,\n",
+       "                                                min_samples_leaf=32, tol=0.0001,\n",
+       "                                                validation_fraction=None))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + ], + "text/plain": [ + "Pipeline(steps=[('featureunion',\n", + " FeatureUnion(transformer_list=[('pipeline-1',\n", + " Pipeline(steps=[('selectfwe',\n", + " SelectFwe(alpha=0.0013091622594)),\n", + " ('nystroem',\n", + " Nystroem(gamma=0.2527764721894,\n", + " kernel='additive_chi2',\n", + " n_components=40))])),\n", + " ('pipeline-2',\n", + " Pipeline(steps=[('variancethreshold',\n", + " VarianceThreshold(threshold=0.0130961185337)),\n", + " ('featureagglomeration',\n", + " Fe...ge='average',\n", + " metric='l2',\n", + " n_clusters=293,\n", + " pooling_func=))]))])),\n", + " ('histgradientboostingclassifier',\n", + " HistGradientBoostingClassifier(early_stopping=False,\n", + " l2_regularization=3.1669452e-06,\n", + " learning_rate=0.1262523910078,\n", + " max_features=0.8008565064114,\n", + " max_leaf_nodes=1504,\n", + " min_samples_leaf=32, tol=0.0001,\n", + " validation_fraction=None))])" + ] + }, + "execution_count": 46, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "st_pipeline = tpot2.search_spaces.pipelines.SequentialPipeline([\n", " tpot2.config.get_search_space(\"selectors\"),\n", @@ -738,9 +10199,433 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 47, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
FeatureUnion(transformer_list=[('quantiletransformer',\n",
+       "                                QuantileTransformer(n_quantiles=81)),\n",
+       "                               ('columnonehotencoder', ColumnOneHotEncoder())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + ], + "text/plain": [ + "FeatureUnion(transformer_list=[('quantiletransformer',\n", + " QuantileTransformer(n_quantiles=81)),\n", + " ('columnonehotencoder', ColumnOneHotEncoder())])" + ] + }, + "execution_count": 47, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "dynamic_transformers = tpot2.search_spaces.pipelines.DynamicUnionPipeline(tpot2.config.get_search_space(\"transformers\"), max_estimators=4)\n", "dynamic_transformers.generate().export_pipeline()" @@ -755,9 +10640,454 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 48, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                FeatureUnion(transformer_list=[('rbfsampler-1',\n",
+       "                                                                RBFSampler(gamma=0.6219125014396,\n",
+       "                                                                           n_components=64)),\n",
+       "                                                               ('powertransformer',\n",
+       "                                                                PowerTransformer()),\n",
+       "                                                               ('rbfsampler-2',\n",
+       "                                                                RBFSampler(gamma=0.3345729157827,\n",
+       "                                                                           n_components=23))])),\n",
+       "                               ('passthrough', Passthrough())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + ], + "text/plain": [ + "FeatureUnion(transformer_list=[('featureunion',\n", + " FeatureUnion(transformer_list=[('rbfsampler-1',\n", + " RBFSampler(gamma=0.6219125014396,\n", + " n_components=64)),\n", + " ('powertransformer',\n", + " PowerTransformer()),\n", + " ('rbfsampler-2',\n", + " RBFSampler(gamma=0.3345729157827,\n", + " n_components=23))])),\n", + " ('passthrough', Passthrough())])" + ] + }, + "execution_count": 48, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "dynamic_transformers_with_passthrough = tpot2.search_spaces.pipelines.UnionPipeline([\n", " dynamic_transformers,\n", @@ -769,9 +11099,504 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 49, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
Pipeline(steps=[('variancethreshold',\n",
+       "                 VarianceThreshold(threshold=0.1286612361721)),\n",
+       "                ('featureunion',\n",
+       "                 FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                                 FeatureUnion(transformer_list=[('binarizer',\n",
+       "                                                                                 Binarizer(threshold=0.736067585858)),\n",
+       "                                                                                ('rbfsampler',\n",
+       "                                                                                 RBFSampler(gamma=0.1436440722816,\n",
+       "                                                                                            n_components=44)),\n",
+       "                                                                                ('quantiletransformer',\n",
+       "                                                                                 QuantileTransformer(n_quantiles=100))])),\n",
+       "                                                ('passthrough',\n",
+       "                                                 Passthrough())])),\n",
+       "                ('histgradientboostingclassifier',\n",
+       "                 HistGradientBoostingClassifier(early_stopping=True,\n",
+       "                                                l2_regularization=2.047622e-07,\n",
+       "                                                learning_rate=0.0164428425279,\n",
+       "                                                max_features=0.3325348714186,\n",
+       "                                                max_leaf_nodes=1940,\n",
+       "                                                min_samples_leaf=78,\n",
+       "                                                n_iter_no_change=12, tol=0.0001,\n",
+       "                                                validation_fraction=None))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + ], + "text/plain": [ + "Pipeline(steps=[('variancethreshold',\n", + " VarianceThreshold(threshold=0.1286612361721)),\n", + " ('featureunion',\n", + " FeatureUnion(transformer_list=[('featureunion',\n", + " FeatureUnion(transformer_list=[('binarizer',\n", + " Binarizer(threshold=0.736067585858)),\n", + " ('rbfsampler',\n", + " RBFSampler(gamma=0.1436440722816,\n", + " n_components=44)),\n", + " ('quantiletransformer',\n", + " QuantileTransformer(n_quantiles=100))])),\n", + " ('passthrough',\n", + " Passthrough())])),\n", + " ('histgradientboostingclassifier',\n", + " HistGradientBoostingClassifier(early_stopping=True,\n", + " l2_regularization=2.047622e-07,\n", + " learning_rate=0.0164428425279,\n", + " max_features=0.3325348714186,\n", + " max_leaf_nodes=1940,\n", + " min_samples_leaf=78,\n", + " n_iter_no_change=12, tol=0.0001,\n", + " validation_fraction=None))])" + ] + }, + "execution_count": 49, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "stc_pipeline3 = tpot2.search_spaces.pipelines.SequentialPipeline([\n", " tpot2.config.get_search_space(\"selectors\"),\n", @@ -795,9 +11620,430 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 50, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
ExtraTreesClassifier(max_features=0.2286391649712, min_samples_leaf=13,\n",
+       "                     min_samples_split=8, n_jobs=1)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + ], + "text/plain": [ + "ExtraTreesClassifier(max_features=0.2286391649712, min_samples_leaf=13,\n", + " min_samples_split=8, n_jobs=1)" + ] + }, + "execution_count": 50, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "SelectFromModel_configspace_part = ConfigurationSpace(\n", " space = {\n", @@ -811,9 +12057,443 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 51, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
SelectFromModel(estimator=ExtraTreesClassifier(bootstrap=True,\n",
+       "                                               criterion='entropy',\n",
+       "                                               max_features=0.0311518006465,\n",
+       "                                               min_samples_split=19, n_jobs=1),\n",
+       "                threshold=0.0012368197842)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + ], + "text/plain": [ + "SelectFromModel(estimator=ExtraTreesClassifier(bootstrap=True,\n", + " criterion='entropy',\n", + " max_features=0.0311518006465,\n", + " min_samples_split=19, n_jobs=1),\n", + " threshold=0.0012368197842)" + ] + }, + "execution_count": 51, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "from sklearn.ensemble import ExtraTreesClassifier\n", "from sklearn.feature_selection import SelectFromModel\n", @@ -844,9 +12524,463 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 52, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
EstimatorTransformer(estimator=HistGradientBoostingClassifier(early_stopping=True,\n",
+       "                                                              l2_regularization=0.000117454825,\n",
+       "                                                              learning_rate=0.122899142038,\n",
+       "                                                              max_features=0.5654219816525,\n",
+       "                                                              max_leaf_nodes=1048,\n",
+       "                                                              min_samples_leaf=1,\n",
+       "                                                              n_iter_no_change=17,\n",
+       "                                                              tol=0.0001,\n",
+       "                                                              validation_fraction=0.3473838441178))
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + ], + "text/plain": [ + "EstimatorTransformer(estimator=HistGradientBoostingClassifier(early_stopping=True,\n", + " l2_regularization=0.000117454825,\n", + " learning_rate=0.122899142038,\n", + " max_features=0.5654219816525,\n", + " max_leaf_nodes=1048,\n", + " min_samples_leaf=1,\n", + " n_iter_no_change=17,\n", + " tol=0.0001,\n", + " validation_fraction=0.3473838441178))" + ] + }, + "execution_count": 52, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "classifiers = tpot2.config.get_search_space(\"classifiers\")\n", "wrapped_estimators = tpot2.search_spaces.pipelines.WrapperPipeline(tpot2.builtin_modules.EstimatorTransformer, {}, classifiers)\n", @@ -857,9 +12991,24 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 53, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/plain": [ + "array([[0.94963523, 0.05036477],\n", + " [0.91791762, 0.08208238],\n", + " [0.16108516, 0.83891484],\n", + " [0.05483536, 0.94516464],\n", + " [0.05482495, 0.94517505]])" + ] + }, + "execution_count": 53, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "import numpy as np\n", "X, y = np.random.rand(100, 10), np.random.randint(0, 2, 100)\n", @@ -876,9 +13025,52 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 54, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", + " warnings.warn(\n", + "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). 
Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", + " warnings.warn(\n", + "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", + " warnings.warn(\n", + "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", + " warnings.warn(\n", + "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", + " warnings.warn(\n", + "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", + " warnings.warn(\n", + "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", + " warnings.warn(\n", + "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", + " warnings.warn(\n", + "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). 
Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", + " warnings.warn(\n", + "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", + " warnings.warn(\n", + "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", + " warnings.warn(\n" + ] + }, + { + "data": { + "text/plain": [ + "array([[0],\n", + " [0],\n", + " [0],\n", + " [1],\n", + " [1]])" + ] + }, + "execution_count": 54, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "classifiers = tpot2.config.get_search_space(\"classifiers\")\n", "wrapped_estimators_cv = tpot2.search_spaces.pipelines.WrapperPipeline(tpot2.builtin_modules.EstimatorTransformer, {'cross_val_predict_cv':10, 'method':'predict'}, classifiers)\n", @@ -895,9 +13087,555 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 55, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
Pipeline(steps=[('standardscaler', StandardScaler()),\n",
+       "                ('featureunion-1',\n",
+       "                 FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                                 FeatureUnion(transformer_list=[('powertransformer',\n",
+       "                                                                                 PowerTransformer()),\n",
+       "                                                                                ('nystroem-1',\n",
+       "                                                                                 Nystroem(gamma=0.4484025909592,\n",
+       "                                                                                          kernel='polynomial',\n",
+       "                                                                                          n_components=13)),\n",
+       "                                                                                ('nystroem-2',\n",
+       "                                                                                 Nystroem(gamma=0.9023618026452,\n",
+       "                                                                                          kernel='additive_chi2',\n",
+       "                                                                                          n_component...\n",
+       "                                                                                 EstimatorTransformer(cross_val_predict_cv=10,\n",
+       "                                                                                                      estimator=GaussianNB(),\n",
+       "                                                                                                      method='predict')),\n",
+       "                                                                                ('estimatortransformer-3',\n",
+       "                                                                                 EstimatorTransformer(cross_val_predict_cv=10,\n",
+       "                                                                                                      estimator=BaggingClassifier(bootstrap=False,\n",
+       "                                                                                                                                  bootstrap_features=True,\n",
+       "                                                                                                                                  max_features=0.248985416426,\n",
+       "                                                                                                                                  max_samples=0.8328766080285,\n",
+       "                                                                                                                                  n_estimators=42,\n",
+       "                                                                                                                                  n_jobs=1),\n",
+       "                                                                                                      method='predict'))])),\n",
+       "                                                ('passthrough',\n",
+       "                                                 Passthrough())])),\n",
+       "                ('gaussiannb', GaussianNB())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + ], + "text/plain": [ + "Pipeline(steps=[('standardscaler', StandardScaler()),\n", + " ('featureunion-1',\n", + " FeatureUnion(transformer_list=[('featureunion',\n", + " FeatureUnion(transformer_list=[('powertransformer',\n", + " PowerTransformer()),\n", + " ('nystroem-1',\n", + " Nystroem(gamma=0.4484025909592,\n", + " kernel='polynomial',\n", + " n_components=13)),\n", + " ('nystroem-2',\n", + " Nystroem(gamma=0.9023618026452,\n", + " kernel='additive_chi2',\n", + " n_component...\n", + " EstimatorTransformer(cross_val_predict_cv=10,\n", + " estimator=GaussianNB(),\n", + " method='predict')),\n", + " ('estimatortransformer-3',\n", + " EstimatorTransformer(cross_val_predict_cv=10,\n", + " estimator=BaggingClassifier(bootstrap=False,\n", + " bootstrap_features=True,\n", + " max_features=0.248985416426,\n", + " max_samples=0.8328766080285,\n", + " n_estimators=42,\n", + " n_jobs=1),\n", + " method='predict'))])),\n", + " ('passthrough',\n", + " Passthrough())])),\n", + " ('gaussiannb', GaussianNB())])" + ] + }, + "execution_count": 55, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "dynamic_wrapped_classifiers_with_passthrough = tpot2.search_spaces.pipelines.UnionPipeline([\n", " tpot2.search_spaces.pipelines.DynamicUnionPipeline(wrapped_estimators_cv, max_estimators=4),\n", @@ -938,7 +13676,7 @@ }, { "cell_type": "code", - "execution_count": 44, + "execution_count": 56, "metadata": {}, "outputs": [], "source": [ @@ -954,9 +13692,20 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 57, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], "source": [ "est1 = ind.export_pipeline()\n", "est1.plot() #GraphPipelines have a helpful plotting function to visualize the pipeline" @@ -971,9 +13720,110 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 58, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], "source": [ "for i in range(0,50):\n", " ind.mutate()\n", @@ -1000,9 +13850,20 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 59, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], "source": [ "tree_search_space = tpot2.search_spaces.pipelines.TreePipeline(\n", " root_search_space= tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"]),\n", @@ -1036,9 +13897,436 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 60, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                FeatureUnion(transformer_list=[('binarizer',\n",
+       "                                                                Binarizer(threshold=0.1286154935127))])),\n",
+       "                               ('passthrough', Passthrough())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + ], + "text/plain": [ + "FeatureUnion(transformer_list=[('featureunion',\n", + " FeatureUnion(transformer_list=[('binarizer',\n", + " Binarizer(threshold=0.1286154935127))])),\n", + " ('passthrough', Passthrough())])" + ] + }, + "execution_count": 60, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "from tpot2.search_spaces.pipelines import *\n", "from tpot2.config import get_search_space\n", @@ -1062,9 +14350,439 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 61, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                FeatureUnion(transformer_list=[('quantiletransformer',\n",
+       "                                                                QuantileTransformer(n_quantiles=98,\n",
+       "                                                                                    output_distribution='normal'))])),\n",
+       "                               ('passthrough', Passthrough())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + ], + "text/plain": [ + "FeatureUnion(transformer_list=[('featureunion',\n", + " FeatureUnion(transformer_list=[('quantiletransformer',\n", + " QuantileTransformer(n_quantiles=98,\n", + " output_distribution='normal'))])),\n", + " ('passthrough', Passthrough())])" + ] + }, + "execution_count": 61, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "final_transformers_layer =UnionPipeline([\n", " ChoicePipeline([\n", @@ -1080,9 +14798,442 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 62, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                FeatureUnion(transformer_list=[('estimatortransformer-1',\n",
+       "                                                                EstimatorTransformer(estimator=BernoulliNB(alpha=0.0316290799363))),\n",
+       "                                                               ('estimatortransformer-2',\n",
+       "                                                                EstimatorTransformer(estimator=GaussianNB()))])),\n",
+       "                               ('passthrough', Passthrough())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + ], + "text/plain": [ + "FeatureUnion(transformer_list=[('featureunion',\n", + " FeatureUnion(transformer_list=[('estimatortransformer-1',\n", + " EstimatorTransformer(estimator=BernoulliNB(alpha=0.0316290799363))),\n", + " ('estimatortransformer-2',\n", + " EstimatorTransformer(estimator=GaussianNB()))])),\n", + " ('passthrough', Passthrough())])" + ] + }, + "execution_count": 62, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "inner_estimators_layer = UnionPipeline([\n", " ChoicePipeline([\n", @@ -1097,9 +15248,479 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 63, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
Pipeline(steps=[('robustscaler',\n",
+       "                 RobustScaler(quantile_range=(0.1503060406741,\n",
+       "                                              0.8118816788829))),\n",
+       "                ('featureunion-1',\n",
+       "                 FeatureUnion(transformer_list=[('skiptransformer',\n",
+       "                                                 SkipTransformer()),\n",
+       "                                                ('passthrough',\n",
+       "                                                 Passthrough())])),\n",
+       "                ('featureunion-2',\n",
+       "                 FeatureUnion(transformer_list=[('skiptransformer',\n",
+       "                                                 SkipTransformer()),\n",
+       "                                                ('passthrough',\n",
+       "                                                 Passthrough())])),\n",
+       "                ('sgdclassifier',\n",
+       "                 SGDClassifier(alpha=0.0002054334005, eta0=0.5702721028736,\n",
+       "                               l1_ratio=0.984925401959, loss='modified_huber',\n",
+       "                               n_jobs=1, penalty='elasticnet'))])
" + ], + "text/plain": [ + "Pipeline(steps=[('robustscaler',\n", + " RobustScaler(quantile_range=(0.1503060406741,\n", + " 0.8118816788829))),\n", + " ('featureunion-1',\n", + " FeatureUnion(transformer_list=[('skiptransformer',\n", + " SkipTransformer()),\n", + " ('passthrough',\n", + " Passthrough())])),\n", + " ('featureunion-2',\n", + " FeatureUnion(transformer_list=[('skiptransformer',\n", + " SkipTransformer()),\n", + " ('passthrough',\n", + " Passthrough())])),\n", + " ('sgdclassifier',\n", + " SGDClassifier(alpha=0.0002054334005, eta0=0.5702721028736,\n", + " l1_ratio=0.984925401959, loss='modified_huber',\n", + " n_jobs=1, penalty='elasticnet'))])" + ] + }, + "execution_count": 63, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "final_linear_pipeline = SequentialPipeline([\n", " get_search_space(\"scalers\"),\n", @@ -1122,7 +15743,7 @@ }, { "cell_type": "code", - "execution_count": 52, + "execution_count": 64, "metadata": {}, "outputs": [], "source": [ @@ -1142,9 +15763,452 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 65, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Generation: 100%|██████████| 5/5 [00:55<00:00, 11.02s/it]\n" + ] + }, + { + "data": { + "text/html": [ + "
TPOTEstimator(classification=True, cv=5, early_stop=2, generations=5,\n",
+       "              max_eval_time_mins=10, n_jobs=4,\n",
+       "              scorers=['roc_auc_ovr',\n",
+       "                       <function complexity_scorer at 0x73f2b05de710>],\n",
+       "              scorers_weights=[1.0, -1.0],\n",
+       "              search_space=<tpot2.search_spaces.pipelines.sequential.SequentialPipeline object at 0x73f3bd5db070>,\n",
+       "              verbose=2)
" + ], + "text/plain": [ + "TPOTEstimator(classification=True, cv=5, early_stop=2, generations=5,\n", + " max_eval_time_mins=10, n_jobs=4,\n", + " scorers=['roc_auc_ovr',\n", + " ],\n", + " scorers_weights=[1.0, -1.0],\n", + " search_space=,\n", + " verbose=2)" + ] + }, + "execution_count": 65, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "selected_search_space = all_search_spaces[\"stc_pipeline\"] #change this to select a different search space\n", "\n", @@ -1167,9 +16231,17 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 66, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "auroc score 0.9991040371034047\n" + ] + } + ], "source": [ "# score the model\n", "auroc_scorer = sklearn.metrics.get_scorer(\"roc_auc\")\n", @@ -1180,9 +16252,439 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 67, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
Pipeline(steps=[('selectfwe', SelectFwe(alpha=0.0012303337733)),\n",
+       "                ('powertransformer', PowerTransformer()),\n",
+       "                ('lineardiscriminantanalysis',\n",
+       "                 LinearDiscriminantAnalysis(shrinkage=0.2481023005204,\n",
+       "                                            solver='eigen'))])
" + ], + "text/plain": [ + "Pipeline(steps=[('selectfwe', SelectFwe(alpha=0.0012303337733)),\n", + " ('powertransformer', PowerTransformer()),\n", + " ('lineardiscriminantanalysis',\n", + " LinearDiscriminantAnalysis(shrinkage=0.2481023005204,\n", + " solver='eigen'))])" + ] + }, + "execution_count": 67, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "#plot the best pipeline\n", "if isinstance(est.fitted_pipeline_, tpot2.GraphPipeline):\n", @@ -1221,9 +16723,20 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 68, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/plain": [ + "0.04690299241236334" + ] + }, + "execution_count": 68, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "import sklearn\n", "import sklearn.datasets\n", @@ -1254,9 +16767,427 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 69, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
SimpleImputer(strategy='most_frequent')
" + ], + "text/plain": [ + "SimpleImputer(strategy='most_frequent')" + ] + }, + "execution_count": 69, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "import tpot2.search_spaces\n", "from ConfigSpace import ConfigurationSpace, Integer, Float, Categorical, Normal\n", @@ -1293,9 +17224,553 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 70, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/home/perib/Projects/common/Projects/TPOT_Dev/tpot2/tpot2/tpot_estimator/estimator.py:512: UserWarning: Labels are not encoded as ints from 0 to N. For compatibility with some classifiers such as sklearn, TPOT has encoded y with the sklearn LabelEncoder. When using pipelines outside the main TPOT estimator class, you can encode the labels with est.label_encoder_\n", + " warnings.warn(\"Labels are not encoded as ints from 0 to N. For compatibility with some classifiers such as sklearn, TPOT has encoded y with the sklearn LabelEncoder. 
When using pipelines outside the main TPOT estimator class, you can encode the labels with est.label_encoder_\")\n", + "Generation: 20%|██ | 1/5 [00:13<00:53, 13.43s/it]" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Generation: 1\n", + "Best rmse score: 0.03548267518369124\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Generation: 40%|████ | 2/5 [00:24<00:35, 11.80s/it]" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Generation: 2\n", + "Best rmse score: 0.033924799732331146\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Generation: 60%|██████ | 3/5 [00:38<00:25, 12.85s/it]" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Generation: 3\n", + "Best rmse score: 0.033924799732331146\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Generation: 80%|████████ | 4/5 [00:56<00:14, 14.86s/it]" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Generation: 4\n", + "Best rmse score: 0.033924799732331146\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Generation: 100%|██████████| 5/5 [01:18<00:00, 15.68s/it]" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Generation: 5\n", + "Best rmse score: 0.033924799732331146\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "\n" + ] + }, + { + "data": { + "text/html": [ + "
TPOTEstimator(classification=True, generations=5, max_eval_time_mins=300,\n",
+       "              n_jobs=20, objective_function_names=['rmse'],\n",
+       "              other_objective_functions=[functools.partial(<function rmse_obective at 0x73f2ac180d30>, X=array([[ 0.03807591,  0.05068012,  0.06169621, ..., -0.00259226,\n",
+       "         0.01990749, -0.01764613],\n",
+       "       [-0.00188202, -0.04464164, -0.05147406, ..., -0.03949338,\n",
+       "        -0.06833155, -0...\n",
+       "        -0.04688253,  0.01549073],\n",
+       "       [-0.04547248, -0.04464164,  0.03906215, ...,  0.02655962,\n",
+       "         0.04452873, -0.02593034],\n",
+       "       [-0.04547248, -0.04464164, -0.0730303 , ..., -0.03949338,\n",
+       "        -0.00422151,  0.00306441]]), missing_add=0.2)],\n",
+       "              other_objective_functions_weights=[-1], scorers=[],\n",
+       "              scorers_weights=[],\n",
+       "              search_space=<tpot2.search_spaces.pipelines.choice.ChoicePipeline object at 0x73f236dd6cb0>,\n",
+       "              verbose=3)
" + ], + "text/plain": [ + "TPOTEstimator(classification=True, generations=5, max_eval_time_mins=300,\n", + " n_jobs=20, objective_function_names=['rmse'],\n", + " other_objective_functions=[functools.partial(, X=array([[ 0.03807591, 0.05068012, 0.06169621, ..., -0.00259226,\n", + " 0.01990749, -0.01764613],\n", + " [-0.00188202, -0.04464164, -0.05147406, ..., -0.03949338,\n", + " -0.06833155, -0...\n", + " -0.04688253, 0.01549073],\n", + " [-0.04547248, -0.04464164, 0.03906215, ..., 0.02655962,\n", + " 0.04452873, -0.02593034],\n", + " [-0.04547248, -0.04464164, -0.0730303 , ..., -0.03949338,\n", + " -0.00422151, 0.00306441]]), missing_add=0.2)],\n", + " other_objective_functions_weights=[-1], scorers=[],\n", + " scorers_weights=[],\n", + " search_space=,\n", + " verbose=3)" + ] + }, + "execution_count": 70, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "from functools import partial\n", "\n", @@ -1320,9 +17795,17 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 71, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "final rmse score 0.0329883939925428\n" + ] + } + ], "source": [ "# score the model\n", "rmse_score = final_objective(est, fitted=True)\n", @@ -1331,9 +17814,438 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 72, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
IterativeImputer(estimator=ExtraTreesRegressor(bootstrap=True,\n",
+       "                                               max_features=0.9638745327086,\n",
+       "                                               min_samples_split=13),\n",
+       "                 n_nearest_features=6)
" + ], + "text/plain": [ + "IterativeImputer(estimator=ExtraTreesRegressor(bootstrap=True,\n", + " max_features=0.9638745327086,\n", + " min_samples_split=13),\n", + " n_nearest_features=6)" + ] + }, + "execution_count": 72, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "est.fitted_pipeline_" ] @@ -1347,9 +18259,445 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 73, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Generation: 100%|██████████| 5/5 [01:31<00:00, 18.21s/it]\n", + "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/neural_network/_multilayer_perceptron.py:690: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n", + " warnings.warn(\n" + ] + }, + { + "data": { + "text/html": [ + "
TPOTEstimator(classification=True, cv=5, generations=5, max_eval_time_mins=300,\n",
+       "              n_jobs=20, scorers=['roc_auc'], scorers_weights=[1],\n",
+       "              search_space=<tpot2.search_spaces.pipelines.sequential.SequentialPipeline object at 0x73f2a93816c0>,\n",
+       "              verbose=2)
" + ], + "text/plain": [ + "TPOTEstimator(classification=True, cv=5, generations=5, max_eval_time_mins=300,\n", + " n_jobs=20, scorers=['roc_auc'], scorers_weights=[1],\n", + " search_space=,\n", + " verbose=2)" + ] + }, + "execution_count": 73, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "from tpot2.search_spaces.pipelines import *\n", "from tpot2.config import get_search_space\n", @@ -1407,9 +18755,485 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 74, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
Pipeline(steps=[('minmaxscaler', MinMaxScaler()),\n",
+       "                ('variancethreshold',\n",
+       "                 VarianceThreshold(threshold=0.0008618210477)),\n",
+       "                ('featureunion-1',\n",
+       "                 FeatureUnion(transformer_list=[('skiptransformer',\n",
+       "                                                 SkipTransformer()),\n",
+       "                                                ('passthrough',\n",
+       "                                                 Passthrough())])),\n",
+       "                ('featureunion-2',\n",
+       "                 FeatureUnion(transformer_list=[('skiptransformer',\n",
+       "                                                 SkipTransformer()),\n",
+       "                                                ('passthrough',\n",
+       "                                                 Passthrough())])),\n",
+       "                ('mlpclassifier',\n",
+       "                 MLPClassifier(activation='identity', alpha=0.003748165278,\n",
+       "                               hidden_layer_sizes=[119],\n",
+       "                               learning_rate='adaptive',\n",
+       "                               learning_rate_init=0.0167284771456,\n",
+       "                               n_iter_no_change=32))])
" + ], + "text/plain": [ + "Pipeline(steps=[('minmaxscaler', MinMaxScaler()),\n", + " ('variancethreshold',\n", + " VarianceThreshold(threshold=0.0008618210477)),\n", + " ('featureunion-1',\n", + " FeatureUnion(transformer_list=[('skiptransformer',\n", + " SkipTransformer()),\n", + " ('passthrough',\n", + " Passthrough())])),\n", + " ('featureunion-2',\n", + " FeatureUnion(transformer_list=[('skiptransformer',\n", + " SkipTransformer()),\n", + " ('passthrough',\n", + " Passthrough())])),\n", + " ('mlpclassifier',\n", + " MLPClassifier(activation='identity', alpha=0.003748165278,\n", + " hidden_layer_sizes=[119],\n", + " learning_rate='adaptive',\n", + " learning_rate_init=0.0167284771456,\n", + " n_iter_no_change=32))])" + ] + }, + "execution_count": 74, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "est.fitted_pipeline_" ] diff --git a/Tutorial/3_Feature_Set_Selector.ipynb b/Tutorial/3_Feature_Set_Selector.ipynb index 82bcf6c4..5341cf0e 100644 --- a/Tutorial/3_Feature_Set_Selector.ipynb +++ b/Tutorial/3_Feature_Set_Selector.ipynb @@ -4,20 +4,21 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Special Feature Selection nodes in TPOT2\n", + "# Genetic Feature Selection nodes in TPOT2\n", "\n", - "TPOT2 can use evolutionary algorithms to optimize feature selection simultaneously with pipeline optimization. There are two node search spaces included.\n", + "TPOT2 can use evolutionary algorithms to optimize feature selection simultaneously with pipeline optimization. It includes two node search spaces with different feature selection strategies: FSSNode and GeneticFeatureSelectorNode. \n", "\n", - "1. FSSNode - (Feature Set Selector) This node is useful if you have predefined groups of features that you want to select from. For example, one group could include the first x columns, the next group could include the next y columns, etc. 
Each FeatureSetSelector Node will select a single group to be passed to the next step in the pipeline. This node is also useful if you want to select individual columns at a time, this will be used in tutorial 4 to create a symbolic regression search space. \n", + "1. FSSNode - (Feature Set Selector) This node is useful if you have a list of predefined feature sets you want to select from. Each FeatureSetSelector Node will select a single group of features to be passed to the next step in the pipeline. Note that FSSNode does not create its own subset of features and does not mix/match multiple predefined feature sets.\n", "\n", - "2. GeneticFeatureSelectorNode - Whereas FSSNode selects from a predefine list of subsets of features, this node instead uses evolutionary algorithms to optimize a novel subset from scratch. This is useful where there is no predefined grouping of features.\n", + "2. GeneticFeatureSelectorNode—Whereas the FSSNode selects from a predefined list of subsets of features, this node uses evolutionary algorithms to optimize a novel subset of features from scratch. This is useful where there is no predefined grouping of features. \n", "\n", + "This tutorial focuses on FSSNode. See Tutorial 5 for more information on GeneticFeatureSelectorNode.\n", "\n", "It may also be beneficial to pair these search spaces with a secondary objective function to minimize complexity. 
That would encourage TPOT to try to produce the simplest pipeline with the fewest number of features.\n", "\n", "tpot2.objectives.number_of_nodes_objective - This can be used as an other_objective_function that counts the number of nodes.\n", "\n", - "tpot2.objectives.complexity_scorer - This is a scorer that can be used in the scorers parameter that tries to count the total number of learned parameters (number of coefficients, number of nodes in decision trees, etc.).\n" + "tpot2.objectives.complexity_scorer - This is a scorer that tries to count the total number of learned parameters (number of coefficients, number of nodes in decision trees, etc.).\n" ] }, { @@ -30,10 +31,10 @@ "The FeatureSetSelector is a subclass of sklearn.feature_selection.SelectorMixin that simply returns the manually specified columns. The parameter sel_subset specifies the name or index of the column that it selects. The transform function then simply indexes and returns the selected columns. You can also optionally name the group with the name parameter, though this is only for note keeping and does is not used by the class.\n", "\n", "\n", - "sel_subset: list or int\n", - " If X is a dataframe, items in sel_subset list must correspond to column names\n", - " If X is a numpy array, items in sel_subset list must correspond to column indexes\n", - " int: index of a single column\n", + " sel_subset: list or int\n", + " If X is a dataframe, items in sel_subset list must correspond to column names\n", + " If X is a numpy array, items in sel_subset list must correspond to column indexes\n", + " int: index of a single column\n", "\n", "\n" ] @@ -95,19 +96,22 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "To use the FSS with TPOT2, you can simply pass it in to the configuration dictionary. Note that the FSS is only well defined when used in the leaf nodes of the graph. 
This is because downstream nodes will receive different transformations of the data such that the original indexes no longer correspond to the same columns in the raw data.\n", + "# FSSNode\n", "\n", - "TPOT2 includsing the string \"feature_set_selector\" in the leaf_config_dict parameter will include the FSS in the search space of the pipeline. By default, each FSS node will select a single column. You can also group columns into sets so that each node selects a set of features rather than a single feature.\n", + "The `FSSNode` is a node search space that simply selects one feature set from a list of feature sets. This works identically to the EstimatorNode, but provides an easier interface for defining the feature sets.\n", "\n", + "Note that the FSS is only well defined when used as the first step in a pipeline. This is because downstream nodes will receive different transformations of the data such that the original indexes no longer correspond to the same columns in the transformed data.\n", "\n", + "The `FSSNode` takes in a single parameter `subsets` which defines the groups of features. There are four ways of defining the subsets. \n", "\n", - "subsets : str or list, default=None\n", - " Sets the subsets that the FeatureSetSeletor will select from if set as an option in one of the configuration dictionaries.\n", - " - str : If a string, it is assumed to be a path to a csv file with the subsets. \n", - " The first column is assumed to be the name of the subset and the remaining columns are the features in the subset.\n", - " - list or np.ndarray : If a list or np.ndarray, it is assumed to be a list of subsets.\n", - " - None : If None, each column will be treated as a subset. One column will be selected per subset.\n", - " If subsets is None, each column will be treated as a subset. 
One column will be selected per subset.\n", + "\n", + " subsets : str, list, or dict, default=None\n", + " Sets the subsets that the FeatureSetSelector will select from if set as an option in one of the configuration dictionaries. \n", + " Features are defined by column names if using a Pandas data frame, or ints corresponding to indexes if using numpy arrays.\n", + " - str : If a string, it is assumed to be a path to a csv file with the subsets. \n", + " The first column is assumed to be the name of the subset and the remaining columns are the features in the subset.\n", + " - list or np.ndarray : If a list or np.ndarray, it is assumed to be a list of subsets (i.e., a list of lists).\n", + " - dict : A dictionary where keys are the names of the subsets and the values are the list of features.\n", + " - None : If None, each column will be treated as a subset. One column will be selected per subset.\n", "\n", "\n", "Lets say you want to have three groups of features, each with three columns each. The following examples are equivalent:\n", @@ -117,13 +121,10 @@ "sel_subsets=simple_fss.csv\n", "\n", "\n", - "\\# simple_fss.csv\n", - "\n", - "group_one, 1,2,3\n", - "\n", - "group_two, 4,5,6\n", - "\n", - "group_three, 7,8,9\n", + " \\# simple_fss.csv\n", + " group_one, 1,2,3\n", + " group_two, 4,5,6\n", + " group_three, 7,8,9\n", "\n", "\n", "### dict\n", @@ -138,14 +139,18 @@ "### list\n", "\n", "\n", - "sel_subsets = [[1,2,3],[4,5,6],[7,8,9]]\n", - "\n", - "\n", - "\n", - "(As the FSS is just another transformer, you could also pass it in with the standard configuration dictionary format (described in tutorial 2), in which you would have to define your own function that returns a hyperparameter. Similar to the params_LogisticRegression function below. 
)\n", - "\n", + "sel_subsets = [[1,2,3],\n", + "[4,5,6],\n", + "[7,8,9]]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Examples\n", "\n", - "(In the future, FSS will be treated as a special case node with its own mutation/crossover functions to make it more efficient when there are large numbers of features.)" + "For these examples, we create a dummy dataset where the first six columns are informative and the rest are uninformative." ] }, { @@ -183,68 +188,86 @@ " g\n", " h\n", " i\n", + " j\n", + " k\n", + " l\n", " \n", " \n", " \n", " \n", " 0\n", - " -2.170854\n", - " 1.245354\n", - " 2.139022\n", - " 0.335394\n", - " 0.459081\n", - " 0.700336\n", - " 0.578917\n", - " 0.092662\n", - " 0.161226\n", + " 2.607470\n", + " -1.799163\n", + " -1.345319\n", + " -3.155746\n", + " -1.731663\n", + " 0.699546\n", + " 0.513305\n", + " 0.507864\n", + " 0.357858\n", + " 0.439859\n", + " 0.695061\n", + " 0.449589\n", " \n", " \n", " 1\n", - " -1.249092\n", - " 0.278109\n", - " -0.498371\n", - " 0.381443\n", - " 0.551928\n", - " 0.478524\n", - " 0.656872\n", - " 0.975068\n", - " 0.497428\n", + " 0.045796\n", + " -2.830673\n", + " 1.578201\n", + " -0.098472\n", + " -0.665334\n", + " -0.130451\n", + " 0.022118\n", + " 0.808068\n", + " 0.158917\n", + " 0.328156\n", + " 0.349374\n", + " 0.927755\n", " \n", " \n", " 2\n", - " -0.997527\n", - " 1.527997\n", - " -1.360814\n", - " 0.438920\n", - " 0.257216\n", - " 0.995995\n", - " 0.411837\n", - " 0.044339\n", - " 0.073172\n", + " 0.490722\n", + " -2.026190\n", + " -1.848381\n", + " -1.112946\n", + " -1.620822\n", + " 3.430459\n", + " 0.166742\n", + " 0.504127\n", + " 0.942156\n", + " 0.556877\n", + " 0.024859\n", + " 0.430831\n", " \n", " \n", " 3\n", - " 1.511913\n", - " -1.374412\n", - " 2.422807\n", - " 0.805676\n", - " 0.051917\n", - " 0.640761\n", - " 0.094881\n", - " 0.753452\n", - " 0.214523\n", + " -1.859338\n", + " -0.196734\n", + " 1.525634\n", + " 0.244376\n", + " -0.685690\n", + " 
1.995038\n", + " 0.055226\n", + " 0.751830\n", + " 0.983152\n", + " 0.702334\n", + " 0.750200\n", + " 0.294415\n", " \n", " \n", " 4\n", - " -1.120579\n", - " 1.033842\n", - " -1.099884\n", - " 0.059472\n", - " 0.682245\n", - " 0.605932\n", - " 0.745800\n", - " 0.824254\n", - " 0.903524\n", + " -0.056101\n", + " 1.386592\n", + " 1.552356\n", + " 1.446347\n", + " -0.984449\n", + " 0.742441\n", + " 0.631411\n", + " 0.217660\n", + " 0.124121\n", + " 0.814294\n", + " 0.131921\n", + " 0.917958\n", " \n", " \n", "\n", @@ -252,18 +275,18 @@ ], "text/plain": [ " a b c d e f g \\\n", - "0 -2.170854 1.245354 2.139022 0.335394 0.459081 0.700336 0.578917 \n", - "1 -1.249092 0.278109 -0.498371 0.381443 0.551928 0.478524 0.656872 \n", - "2 -0.997527 1.527997 -1.360814 0.438920 0.257216 0.995995 0.411837 \n", - "3 1.511913 -1.374412 2.422807 0.805676 0.051917 0.640761 0.094881 \n", - "4 -1.120579 1.033842 -1.099884 0.059472 0.682245 0.605932 0.745800 \n", - "\n", - " h i \n", - "0 0.092662 0.161226 \n", - "1 0.975068 0.497428 \n", - "2 0.044339 0.073172 \n", - "3 0.753452 0.214523 \n", - "4 0.824254 0.903524 " + "0 2.607470 -1.799163 -1.345319 -3.155746 -1.731663 0.699546 0.513305 \n", + "1 0.045796 -2.830673 1.578201 -0.098472 -0.665334 -0.130451 0.022118 \n", + "2 0.490722 -2.026190 -1.848381 -1.112946 -1.620822 3.430459 0.166742 \n", + "3 -1.859338 -0.196734 1.525634 0.244376 -0.685690 1.995038 0.055226 \n", + "4 -0.056101 1.386592 1.552356 1.446347 -0.984449 0.742441 0.631411 \n", + "\n", + " h i j k l \n", + "0 0.507864 0.357858 0.439859 0.695061 0.449589 \n", + "1 0.808068 0.158917 0.328156 0.349374 0.927755 \n", + "2 0.504127 0.942156 0.556877 0.024859 0.430831 \n", + "3 0.751830 0.983152 0.702334 0.750200 0.294415 \n", + "4 0.217660 0.124121 0.814294 0.131921 0.917958 " ] }, "execution_count": 2, @@ -277,11 +300,18 @@ "from sklearn.linear_model import LogisticRegression\n", "import numpy as np\n", "import pandas as pd\n", + "import tpot2\n", + "import sklearn.datasets\n", 
+ "from sklearn.linear_model import LogisticRegression\n", + "import numpy as np\n", + "from tpot2.search_spaces.nodes import *\n", + "from tpot2.search_spaces.pipelines import *\n", + "from tpot2.config import get_search_space\n", "\n", "\n", - "X, y = sklearn.datasets.make_classification(n_samples=1000, n_features=3, n_informative=3, n_redundant=0, n_repeated=0, n_classes=2, n_clusters_per_class=2, weights=None, flip_y=0.01, class_sep=1.0, hypercube=True, shift=0.0, scale=1.0, shuffle=True, random_state=None)\n", + "X, y = sklearn.datasets.make_classification(n_samples=1000, n_features=6, n_informative=6, n_redundant=0, n_repeated=0, n_classes=2, n_clusters_per_class=2, weights=None, flip_y=0.01, class_sep=1.0, hypercube=True, shift=0.0, scale=1.0, shuffle=True, random_state=None)\n", "X = np.hstack([X, np.random.rand(X.shape[0],6)]) #add six uninformative features\n", - "X = pd.DataFrame(X, columns=['a','b','c','d','e','f','g','h','i']) # a, b ,c the rest are uninformative\n", + "X = pd.DataFrame(X, columns=['a','b','c','d','e','f','g','h','i', 'j', 'k', 'l']) # a through f are informative; the rest are uninformative\n", "X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, train_size=0.75, test_size=0.25)\n", "\n", "X.head()" @@ -291,83 +321,49 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Feature Set Selector\n", - "\n", - "In this configuration, each FSS node considers a single column.\n", - "\n", - "The root node is a logistic regression and there are no other intermediate transformers. 
An additional objective function is included that seeks to minimize the number of leave nodes (i.e the number of selected features)" + "Let's say that, based on either prior knowledge or interest, we know that the features can be grouped as follows:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "/home/ribeirop/miniconda3/envs/tpot2env/lib/python3.10/site-packages/distributed/node.py:182: UserWarning: Port 8787 is already in use.\n", - "Perhaps you already have a cluster running?\n", - "Hosting the HTTP server on port 39005 instead\n", - " warnings.warn(\n", - "Generation: 100%|██████████| 5/5 [04:09<00:00, 49.87s/it]\n", - "/home/ribeirop/miniconda3/envs/tpot2env/lib/python3.10/site-packages/sklearn/preprocessing/_data.py:2762: UserWarning: n_quantiles (842) is greater than the total number of samples (750). n_quantiles is set to n_samples.\n", - " warnings.warn(\n", - "/home/ribeirop/miniconda3/envs/tpot2env/lib/python3.10/site-packages/sklearn/preprocessing/_data.py:2762: UserWarning: n_quantiles (1803) is greater than the total number of samples (750). 
n_quantiles is set to n_samples.\n", - " warnings.warn(\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "0.9390338164251208\n" - ] - } - ], + "outputs": [], "source": [ - "import tpot2\n", - "import sklearn.datasets\n", - "from sklearn.linear_model import LogisticRegression\n", - "import numpy as np\n", - "\n", "subsets = { \"group_one\" : ['a','b','c',],\n", " \"group_two\" : ['d','e','f'],\n", " \"group_three\" : ['g','h','i'],\n", - " }\n", - "\n", - "fss_search_space = tpot2.search_spaces.nodes.FSSNode(subsets=subsets)\n", - "graph_search_space = tpot2.search_spaces.pipelines.GraphPipeline(\n", - " root_search_space= tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"]),\n", - " leaf_search_space = None, \n", - " inner_search_space = tpot2.config.get_search_space([\"transformers\"]),\n", - " max_size = 10,\n", - ")\n", - "\n", - "combined_search_space = tpot2.search_spaces.pipelines.SequentialPipeline([fss_search_space, graph_search_space])\n", - "\n", - "\n", - "est = tpot2.TPOTEstimator(population_size=10,generations=5, \n", - " scorers=['roc_auc_ovr'],\n", - " scorers_weights=[1],\n", - " n_jobs=32,\n", - " classification=True,\n", - " search_space = combined_search_space,\n", - " verbose=1,\n", - " )\n", - "\n", - "\n", - "scorer = sklearn.metrics.get_scorer('roc_auc_ovo')\n", - "\n", - "est.fit(X_train, y_train)\n", - "print(scorer(est, X_test, y_test))" + " \"group_four\" : ['j','k','l'],\n", + " }" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can create an FSSNode that will select from these subsets. Each node in a pipeline only selects one subset." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, + "outputs": [], + "source": [ + "fss_search_space = FSSNode(subsets=subsets)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If we randomly sample from this search space, we can see that we get a single selector that selects one of the predefined sets. In this case, it selects group two, which includes ['d', 'e', 'f']. (A random seed was set in the generate function so that the same group would be selected when rerunning the notebook.)" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, "outputs": [ { "data": { @@ -776,98 +772,20 @@ " /* fitted */\n", " background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
Pipeline(steps=[('featuresetselector',\n",
-       "                 FeatureSetSelector(name='group_one',\n",
-       "                                    sel_subset=['a', 'b', 'c'])),\n",
-       "                ('graphpipeline',\n",
-       "                 GraphPipeline(graph=<networkx.classes.digraph.DiGraph object at 0x7c2c2831c100>))])
" + "
FeatureSetSelector(name='group_two', sel_subset=['d', 'e', 'f'])
" ], "text/plain": [ - "Pipeline(steps=[('featuresetselector',\n", - " FeatureSetSelector(name='group_one',\n", - " sel_subset=['a', 'b', 'c'])),\n", - " ('graphpipeline',\n", - " GraphPipeline(graph=))])" + "FeatureSetSelector(name='group_two', sel_subset=['d', 'e', 'f'])" ] }, - "execution_count": 4, + "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "est.fitted_pipeline_" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Note that if you want to include multiple subsets, you can instead include the node as a leaf in the graph search space. This will produce a pipeline where all leaves as FSSNodes and all FSSNodes appear in the leaves (to prevent inner nodes from also being FSSNodes). Since the graph search space allows for multiple leaves, this pipeline can select multiple feature sets. " - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "/home/ribeirop/miniconda3/envs/tpot2env/lib/python3.10/site-packages/distributed/node.py:182: UserWarning: Port 8787 is already in use.\n", - "Perhaps you already have a cluster running?\n", - "Hosting the HTTP server on port 42397 instead\n", - " warnings.warn(\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 100%|██████████| 5/5 [00:22<00:00, 4.46s/it]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "0.8384541062801932\n" - ] - } - ], - "source": [ - "import tpot2\n", - "import sklearn.datasets\n", - "from sklearn.linear_model import LogisticRegression\n", - "import numpy as np\n", - "\n", - "\n", - "graph_search_space = tpot2.search_spaces.pipelines.GraphPipeline(\n", - " root_search_space= tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"]),\n", - " leaf_search_space = 
tpot2.search_spaces.nodes.FSSNode(subsets=X_train.columns.tolist()), \n", - " inner_search_space = tpot2.config.get_search_space([\"transformers\"]),\n", - " max_size = 10,\n", - ")\n", - "\n", - "est = tpot2.TPOTEstimator(population_size=10,generations=5, \n", - " scorers=['roc_auc_ovr'],\n", - " scorers_weights=[1],\n", - " n_jobs=32,\n", - " classification=True,\n", - " search_space = graph_search_space ,\n", - " verbose=1,\n", - " )\n", - "\n", - "\n", - "scorer = sklearn.metrics.get_scorer('roc_auc_ovo')\n", - "\n", - "est.fit(X_train, y_train)\n", - "print(scorer(est, X_test, y_test))" + "fss_selector = fss_search_space.generate(rng=1).export_pipeline()\n", + "fss_selector" ] }, { @@ -877,31 +795,136 @@ "outputs": [ { "data": { - "image/png": "", + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
" + ], "text/plain": [ - "
" + " d e f\n", + "221 2.819930 -0.677731 -2.883151\n", + "121 -0.068495 -1.398128 1.239476\n", + "83 2.225071 -1.902156 1.425918\n", + "243 2.592917 0.764790 -2.120585\n", + "992 2.329279 2.362780 -3.780878\n", + ".. ... ... ...\n", + "254 1.850187 -2.177608 -4.088455\n", + "19 -0.949434 1.062798 -3.421324\n", + "956 2.105221 -0.169633 1.743979\n", + "150 2.171954 -1.343100 -0.346960\n", + "629 0.348571 -1.171049 0.854003\n", + "\n", + "[750 rows x 3 columns]" ] }, + "execution_count": 6, "metadata": {}, - "output_type": "display_data" + "output_type": "execute_result" } ], "source": [ - "est.fitted_pipeline_.plot()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Other examples" + "fss_selector.set_output(transform=\"pandas\") #by default sklearn selectors return numpy arrays. this will make it return pandas dataframes\n", + "fss_selector.fit(X_train)\n", + "fss_selector.transform(X_train)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "## dictionary" + "We can now use this when defining our pipelines. 
\n", + "For this first example, we will construct a simple linear pipeline where the first step is a feature set selector, and the second is a classifier" ] }, { @@ -913,125 +936,2913 @@ "name": "stderr", "output_type": "stream", "text": [ - "Generation: 100%|██████████| 5/5 [00:44<00:00, 8.81s/it]\n" + "Generation: 100%|██████████| 5/5 [00:30<00:00, 6.12s/it]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ - "0.9549114331723028\n" + "0.8989320638188367\n" ] } ], "source": [ - "import tpot2\n", - "import pandas as pd\n", - "import numpy as np\n", - "from sklearn.linear_model import LogisticRegression\n", - "import sklearn\n", - "\n", - "subsets = { \"group_one\" : ['a','b','c'],\n", - " \"group_two\" : ['d','e','f'],\n", - " \"group_three\" : ['g','h','i'],\n", - " }\n", "\n", - "fss_search_space = tpot2.search_spaces.nodes.FSSNode(subsets=subsets)\n", - "graph_search_space = tpot2.search_spaces.pipelines.GraphPipeline(\n", - " root_search_space= tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"]),\n", - " leaf_search_space = None, \n", - " inner_search_space = tpot2.config.get_search_space([\"transformers\"]),\n", - " max_size = 10,\n", - ")\n", - "\n", - "combined_search_space = tpot2.search_spaces.pipelines.SequentialPipeline([fss_search_space, graph_search_space])\n", + "classification_search_space = get_search_space([\"RandomForestClassifier\"])\n", + "fss_and_classifier_search_space = SequentialPipeline([fss_search_space, classification_search_space])\n", "\n", "\n", - "est = tpot2.TPOTEstimator(population_size=10,generations=5, \n", - " scorers=['roc_auc_ovr'],\n", - " scorers_weights=[1],\n", + "est = tpot2.TPOTEstimator(generations=5, \n", + " scorers=[\"roc_auc_ovr\", tpot2.objectives.complexity_scorer],\n", + " scorers_weights=[1.0, -1.0],\n", " n_jobs=32,\n", " classification=True,\n", - " search_space = combined_search_space,\n", + " search_space = 
fss_and_classifier_search_space,\n", " verbose=1,\n", " )\n", "\n", "\n", "scorer = sklearn.metrics.get_scorer('roc_auc_ovo')\n", - "\n", "est.fit(X_train, y_train)\n", "print(scorer(est, X_test, y_test))" ] }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## list" - ] - }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 100%|██████████| 5/5 [09:02<00:00, 108.52s/it]\n", - "/home/ribeirop/miniconda3/envs/tpot2env/lib/python3.10/site-packages/sklearn/decomposition/_fastica.py:595: UserWarning: n_components is too large: it will be set to 3\n", - " warnings.warn(\n", - "/home/ribeirop/miniconda3/envs/tpot2env/lib/python3.10/site-packages/sklearn/decomposition/_fastica.py:128: ConvergenceWarning: FastICA did not converge. Consider increasing tolerance or the maximum number of iterations.\n", - " warnings.warn(\n", - "/home/ribeirop/miniconda3/envs/tpot2env/lib/python3.10/site-packages/sklearn/decomposition/_fastica.py:595: UserWarning: n_components is too large: it will be set to 24\n", - " warnings.warn(\n", - "/home/ribeirop/miniconda3/envs/tpot2env/lib/python3.10/site-packages/sklearn/linear_model/_sag.py:350: ConvergenceWarning: The max_iter was reached which means the coef_ did not converge\n", - " warnings.warn(\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "0.9765539452495974\n" - ] - } - ], - "source": [ - "import tpot2\n", - "import pandas as pd\n", - "import numpy as np\n", - "from sklearn.linear_model import LogisticRegression\n", - "import sklearn\n", - "\n", - "subsets = [['a','b','c'],['d','e','f'],['g','h','i']]\n", - "\n", - "fss_search_space = tpot2.search_spaces.nodes.FSSNode(subsets=subsets)\n", - "graph_search_space = tpot2.search_spaces.pipelines.GraphPipeline(\n", - " root_search_space= tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", 
\"DecisionTreeClassifier\"]),\n", - " leaf_search_space = None, \n", - " inner_search_space = tpot2.config.get_search_space([\"transformers\"]),\n", - " max_size = 10,\n", - ")\n", - "\n", - "combined_search_space = tpot2.search_spaces.pipelines.SequentialPipeline([fss_search_space, graph_search_space])\n", - "\n", - "\n", - "est = tpot2.TPOTEstimator(population_size=10,generations=5, \n", - " scorers=['roc_auc_ovr'],\n", - " scorers_weights=[1],\n", - " n_jobs=32,\n", - " classification=True,\n", - " search_space = combined_search_space,\n", - " verbose=1,\n", - " )\n", - "\n", + "data": { + "text/html": [ + "
Pipeline(steps=[('featuresetselector',\n",
+       "                 FeatureSetSelector(name='group_two',\n",
+       "                                    sel_subset=['d', 'e', 'f'])),\n",
+       "                ('randomforestclassifier',\n",
+       "                 RandomForestClassifier(class_weight='balanced',\n",
+       "                                        max_features=0.7587731972584,\n",
+       "                                        min_samples_leaf=7,\n",
+       "                                        min_samples_split=12,\n",
+       "                                        n_estimators=128))])
" + ], + "text/plain": [ + "Pipeline(steps=[('featuresetselector',\n", + " FeatureSetSelector(name='group_two',\n", + " sel_subset=['d', 'e', 'f'])),\n", + " ('randomforestclassifier',\n", + " RandomForestClassifier(class_weight='balanced',\n", + " max_features=0.7587731972584,\n", + " min_samples_leaf=7,\n", + " min_samples_split=12,\n", + " n_estimators=128))])" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "est.fitted_pipeline_" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "With this setup, TPOT is able to identify one of the subsets used, but the performance is not optimal. In this case we happen to know that multiple feature sets are required. If we want to include multiple feature sets in our pipelines, we will have to modify our search space. There are three options for this.\n", + "\n", + "1. UnionPipeline - This allows you to have a fixed number of feature sets selected. If you use a UnionPipeline with two FSSNodes, you will always select two feature sets that are simply concatenated together.\n", + "2. DynamicUnionPipeline - This space allows multiple FSSNodes to be selected. Unlike UnionPipeline, you don't have to specify the number of selected sets; TPOT will identify the number of sets that is optimal. Additionally, with DynamicUnionPipeline, the same feature set cannot be selected twice. Note that while DynamicUnionPipeline can select multiple feature sets, it never mixes two feature sets together.\n", + "3. GraphSearchPipeline - When FSSNode is set as the leaf_search_space, GraphSearchPipeline can also select multiple FSSNodes, which act as inputs to the rest of the pipeline." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### UnionPipeline + FSSNode example" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [], + "source": [ + "union_fss_space = UnionPipeline([fss_search_space, fss_search_space])" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
FeatureUnion(transformer_list=[('featuresetselector-1',\n",
+       "                                FeatureSetSelector(name='group_two',\n",
+       "                                                   sel_subset=['d', 'e', 'f'])),\n",
+       "                               ('featuresetselector-2',\n",
+       "                                FeatureSetSelector(name='group_three',\n",
+       "                                                   sel_subset=['g', 'h',\n",
+       "                                                               'i']))])
" + ], + "text/plain": [ + "FeatureUnion(transformer_list=[('featuresetselector-1',\n", + " FeatureSetSelector(name='group_two',\n", + " sel_subset=['d', 'e', 'f'])),\n", + " ('featuresetselector-2',\n", + " FeatureSetSelector(name='group_three',\n", + " sel_subset=['g', 'h',\n", + " 'i']))])" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# this union search space will always select exactly two fss_search_space\n", + "selector1 = union_fss_space.generate(rng=1).export_pipeline()\n", + "selector1" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
" + ], + "text/plain": [ + " d e f g h i\n", + "221 2.819930 -0.677731 -2.883151 0.420393 0.930431 0.223210\n", + "121 -0.068495 -1.398128 1.239476 0.583717 0.872967 0.355258\n", + "83 2.225071 -1.902156 1.425918 0.521714 0.624303 0.526185\n", + "243 2.592917 0.764790 -2.120585 0.748397 0.099224 0.640145\n", + "992 2.329279 2.362780 -3.780878 0.233820 0.054463 0.025899\n", + ".. ... ... ... ... ... ...\n", + "254 1.850187 -2.177608 -4.088455 0.955017 0.363270 0.010256\n", + "19 -0.949434 1.062798 -3.421324 0.094215 0.837656 0.856361\n", + "956 2.105221 -0.169633 1.743979 0.266365 0.446727 0.356117\n", + "150 2.171954 -1.343100 -0.346960 0.712673 0.983344 0.873176\n", + "629 0.348571 -1.171049 0.854003 0.746499 0.101473 0.609367\n", + "\n", + "[750 rows x 6 columns]" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "selector1.set_output(transform=\"pandas\") \n", + "selector1.fit(X_train)\n", + "selector1.transform(X_train)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### DynamicUnionPipeline + FSSNode example\n", + "The dynamic union pipeline may select a variable number of feature sets." + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
FeatureUnion(transformer_list=[('featuresetselector',\n",
+       "                                FeatureSetSelector(name='group_three',\n",
+       "                                                   sel_subset=['g', 'h',\n",
+       "                                                               'i']))])
" + ], + "text/plain": [ + "FeatureUnion(transformer_list=[('featuresetselector',\n", + " FeatureSetSelector(name='group_three',\n", + " sel_subset=['g', 'h',\n", + " 'i']))])" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "dynamic_fss_space = DynamicUnionPipeline(fss_search_space)\n", + "dynamic_fss_space.generate(rng=1).export_pipeline()" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
FeatureUnion(transformer_list=[('featuresetselector-1',\n",
+       "                                FeatureSetSelector(name='group_one',\n",
+       "                                                   sel_subset=['a', 'b', 'c'])),\n",
+       "                               ('featuresetselector-2',\n",
+       "                                FeatureSetSelector(name='group_four',\n",
+       "                                                   sel_subset=['j', 'k',\n",
+       "                                                               'l']))])
" + ], + "text/plain": [ + "FeatureUnion(transformer_list=[('featuresetselector-1',\n", + " FeatureSetSelector(name='group_one',\n", + " sel_subset=['a', 'b', 'c'])),\n", + " ('featuresetselector-2',\n", + " FeatureSetSelector(name='group_four',\n", + " sel_subset=['j', 'k',\n", + " 'l']))])" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "dynamic_fss_space.generate(rng=3).export_pipeline()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### GraphSearchPipeline + FSSNode example\n", + "\n", + "FSSNodes must be set as the leaf search space as they act as the inputs to the pipeline.\n", + "\n", + "Here is an example pipeline from this search space that utilizes two feature sets." + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAnYAAAHWCAYAAAD6oMSKAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8fJSN1AAAACXBIWXMAAA9hAAAPYQGoP6dpAABljUlEQVR4nO3deXiM5/4G8HuW7JFNZJEIEUEkkQgREkv8pC21HFprHRSlpVXqKGptY6eKqq2l6KKl2h45xVFKqUSJhGy2kEjIRkwW2TPL74/WHEOGIJl3ZnJ/rutc1zlfz8zcoadze56Z9xWpVCoViIiIiMjgiYUOQERERER1g8WOiIiIyEiw2BEREREZCRY7IiIiIiPBYkdERERkJFjsiIiIiIwEix0RERGRkWCxIyIiIjISLHZERERERoLFjoiIiMhIsNgRERERGQkWOyIiIiIjwWJHREREZCRY7IiIiIiMBIsdERERkZFgsSMiIiIyEix2REREREaCxY6IiIjISLDYERERERkJFjsiIiIiI8FiR0RERGQkWOyIiIiIjASLHREREZGRYLEjIiIiMhIsdkRERERGgsWOiIiIyEiw2BEREREZCRY7IiIiIiMhFToAEVFdUigUkMlkyMvLQ15eHu7k5qKyvBxKhQJiiQRmFhZo4uICZ2dnODs7w8HBARKJROjYRER1QqRSqVRChyAiel4FBQVISEhAUnw8KkpLoZLLYV1eDluZDCZyOcQqFZQiEaqlUhQ5OKDEwgIiqRTmVlbwDwpCQEAA7O3thf4xiIieC4sdERm07OxsxJw6hfTUVJiUlcEj8yZcZTLYlpbCRKHQ+rhqiQRFVlbIcXBApkczVFtawtPbG2Hdu8PV1VWHPwERUd1hsSMigySXyxEdHY3Y6GhY5+ejVUYm3PPzIVEqn/q5FGIxbjk64lpzD5Q4OiI4LAxhYWGQSvlpFSIyLCx2RGRwcnNzcSAqCgW3stA2NRXeWVkQ18G/ypQiEVLd3HDZ2xsO7m54eeBAuLi41EFiIiLdYLEjIoOSkZGBn/fsgWV2DjpeugSbsrI6f41iS0vE+figrGlTDB4+DM2bN6/z1yAiqg
UV+O3uXWRWVOCrnGwAwBftfLGgpRcWXruG73NyIFepMMjJGW2srLQ+T5VUClNzcwDA9u3bsWjRIuTm5qJNmzYYNWoUVq1aVcufnqj+iFQqge/XQkRE9LfffvsNVw4fxgun/3zk16rlcpSVlkKhVMLKygpmpqY6zXakaxe0eekl9O7dW6evS/Q0uGNHRER6w9nZGXEWFqiWSGDy910equVy3Lt3DxUV5ep1lRUVcHJ2huQJu3t1pVoiQYmFBZydnXXyekTPisWOiIj0hrOzM0RSKYqsrGBz9y5KSu7V+E1UFVRQKOSQiOt3127zzUwcys+HQixByZXL+OrHHzFr1iz1lz2I9A2LHRER6Q0HBwdAIsF1Cws0y7+jdZ2JiSlMpCb1nmdyMw9MbuaBJM8WyAoMxJRp0yCRSOr9dYmeFb8VS0REeiEmJgb9+vXDL4cP44ZbUyhqOGYVicRoZN0IjRs3fuQ6c/VFIRYjo1kztO/YkaWO9B6LHRERCerEiROIiIhAWFgYDh8+jISEBJSZmOCuu7t6jVgkRqNGNnB2dkajRo0g1lGpA4Cbjo6QW1qiffv2OntNomfFo1giItI5lUqFY8eOITIyEidPntT4taKiIqSmp6OxtzecsrJgY2kFSysrnZa5+5QiEa4394Bn69awt7fX+esTPS3u2BERkc6oVCocPnwY3bp1Q0RExCOl7r6Lly+j3MkJxUFBsLa2FqTUAcBVNzeUODoirFs3QV6f6Gmx2BERUb1TqVQ4cOAAunTpgj59+iAmJqbGdU2bNsW6desQGxuL7hERuOLdGsWWlvWWS6FUoLyiXH27sgcVWVriSmtvdO7WDa6urvWWgagu8SiWiIjqjUqlQlRUFCIjIxEfH691nbu7Oz744AOMHz8e5n/f3SEsLAzXrlxBXLEPup8/D6lSWafZyisqUFBQAEAFQAQbGxtYWVlBBEAuFiOunQ8c3NwQGhpap69LVJ+4Y0dERHVOqVRi37596NChAwYNGqS11DVv3hxbt27FtWvXMGXKFHWpAwCpVIp+AweirGlTnPFtB2UdH8feu3cPUN+ZVoXi4iLcuXMH5VVVOOPbDuWuTfHywIGQ8v6wZEBY7IiIqM4oFAp8//33aN++PYYOHYqEhIQa17Vs2RLbt29HamoqJk2aBDMzsxrXubi4YPDwYZB5eOC0ny/kdXiniZoul1KpUuKPNq1x3cYGHUI6w8XFpc5ej0gXeK9YIiJ6bnK5HN9//z2WLFmCK1euaF3n7e2N+fPn47XXXnuqnbCMjAz8vGcvLLOz0fHSJdiUlT135uJ791BSck/9v0ttbHC5YydkW1pg788/IycnB3v37sXgwYOf+7WIdIXFjoiInll1dTW+/fZbLF26FNeuXdO6zsfHB/Pnz8fw4cOf+SK/ubm5OBAVhYJbWWibmgrvrCyIn+MtrKy8HIWFBVCKRMhu3RqpbdsiSyZD1MGDuH37NgCgY8eOOHfu3DO/BpGu8YMDRET01KqqqvDVV19h2bJlSE9P17rOz88PCxYswKuvvvrcd21wcXHB2PHjER0djVhzM9xydYFXRiaa5edD8gxfrBCZmiCveXPcbNUK+dbWiI6NRUxMDBQKhXqNo6Pjc2Um0jXu2BERUa1VVlZix44dWL58OTIzM7WuCwwMxMKFC/GPf/wD4jr8XNx92dnZiImORvrVq5CWlaH5zZtwvSuDbWkpTB4oZg+rlkhQZGWFnMYOuOHujvyqKlxNT0d0TAxyc3M11np4eODYsWPw8vKq8/xE9YXFjoiInqi8vBzbtm3DypUrkZWVpXVdp06dsHDhQvTv318n93ItKChAYmIiEuPiUFFaCpVcDuvyctjICmAql0OsUkIpEqNKKkWxgz1KLCwgkkphbmUF/6AgjBw58pFCd19oaCj++OOPeimmRPWFxY6IiLQqKyvD1q1bsWrVKq0FCABCQkKwaNEi9OnTRyeF7mEKhQIymQx5eXnIy8vDndxcVFVUQC
GXQyKVwtTcHE1cXODs7AxnZ2c4ODhAIpGgV69e+P3337U+7yeffIL33ntPdz8I0XNisSMiokeUlJRg8+bN+Pjjj9VfJKhJt27dsHDhQkRERAhS6J5XfHw8hg8fjuzsbIwcORJHjhzROGI2MzPDhQsX0LZtWwFTEtUeix0REakVFxdj48aNWLNmDe7evat1XXh4OBYuXIjw8HCDLHQPUygUkEgkOHHiBMLDwzV+rXPnzoiOjuaFiskg8IMDRESEwsJCREZGokWLFpg7d67WUvfCCy/g5MmTOH78OHr16mUUpQ6A+hu7PXv2xLRp0zR+7ezZs1i1apUQsYieGnfsiIgaMJlMhnXr1mH9+vUoLi7Wuq5v375YsGABunbtqsN0wigrK0OHDh1w9epV9czExATnzp1D+/btBUxG9GQsdkREDVB+fj4++eQTbNiwASUlJVrXDRgwAAsWLEBwcLAO0wnvzz//RFhYGJQPXB8vICAAZ8+ehampqYDJiB6PR7FERA1IXl4eZs2ahRYtWmD58uVaS93gwYMRHx+PqKioBlfqAKBLly6YNWuWxiwhIQFLliwRKBFR7XDHjoioAcjJycHq1auxZcsWlJeX17hGJBJhyJAhmD9/Po8c8dfFmDt16oTk5GT1TCKR4PTp0w2y7JJhYLEjIjJit27dwsqVK/HFF1+gsrKyxjVisRgjRozAvHnz0K5dOx0n1G/x8fEICQmBXC5Xz3x8fBAfHw9zc3MBkxHVjEexRERGKCMjA5MnT4aXlxc+++yzGkudRCLBmDFjcPHiRXz77bcsdTUICgrCggULNGaXLl16ZEakL7hjR0RkRNLS0rB8+XLs3LlTY5fpQVKpFGPGjMEHH3yAVq1a6Tih4amurkbXrl0RFxennolEIvzxxx8ICwsTMBnRo1jsiIiMQGpqKpYtW4avv/4aCoWixjUmJiYYN24c5syZA09PTx0nNGwpKSkICgpCVVWVeubl5YWEhARYWVkJmIxIE49iiYgM2OXLlzF69Gi0bdsWO3furLHUmZqaYsqUKbh27Rq2bt3KUvcMfH19sXjxYo3Z9evXMXv2bIESEdWMO3ZERAYoJSUFS5YswZ49e6DtX+Pm5uaYNGkSZs2aBTc3Nx0nND4KhQLdu3fH6dOnNeZHjx5F7969BUpFpInFjojIgCQkJGDx4sX48ccfta6xsLDAlClTMHPmTLi4uOgwnfFLTU1FQECAxiVjPDw8kJSUBBsbGwGTEf2FR7FERAYgLi4OgwYNQmBgoNZSZ2VlhdmzZ+PGjRv4+OOPWerqgbe3N1auXKkxy8zMxIwZMwRKRKSJO3ZERHrszJkzWLx4MQ4cOKB1TaNGjfDuu+9i+vTpcHR01GG6hkmpVCIiIgLHjx/XmP/yyy/o16+fQKmI/sJiR0Skh6Kjo7F48WIcPnxY6xpbW1tMnz4d06ZNg729vQ7T0Y0bN+Dv769xSzZXV1ckJyfDwcFBwGTU0PEolohIj5w4cQK9e/dGt27dtJY6e3t7LF68GBkZGfjwww9Z6gTQokULfPLJJxqznJwcTJ06VaBERH/hjh0RkcBUKhWOHTuGyMhInDx5Uus6R0dHzJw5E1OmTEGjRo10mJBqolKp8PLLL+O///2vxnzfvn149dVXBUpFDR2LHRGRQFQqFX799VdERkYiJiZG6zonJyfMmjULb731Fi+Gq2eysrLg5+eHwsJC9czR0REpKSlwcnISLhg1WDyKJSLSMZVKhQMHDqBLly7o06eP1lLn6uqKdevWIT09Hf/6179Y6vSQm5sbNmzYoDHLz8/H5MmTtV5fkKg+cceOiEhHVCoVoqKiEBkZifj4eK3r3N3dMWfOHEyYMAHm5uY6TEjPQqVS4ZVXXsG///1vjfk333yDUaNGCROKGiwWOyKieqZUKvHTTz9hyZIlSEhI0LquefPm+OCDD/D666/DzMxMhwnpeeXl5cHPzw/5+fnqmZ2dHVJSUtC0aVMBk1FDw6NYIq
J6olAo8P3336N9+/YYOnSo1lLXsmVLbN++HampqXjzzTdZ6gyQs7MzNm/erDErLCzEG2+8wSNZ0ikWOyKiOiaXy/HNN9/A19cXI0eOREpKSo3rvL29sWvXLly5cgXjx4+HiYmJjpNSXRoyZAhGjhypMTt06BC+/PJLgRJRQ8SjWCKiOlJdXY1vv/0WS5cuxbVr17Su8/Hxwfz58zF8+HBIJBIdJqT6JpPJ4Ovri9zcXPWsUaNGSEpKQvPmzQVMRg0Fd+yIiJ5TVVUVtm3bhjZt2mDcuHFaS52fnx/27NmDpKQkvPbaayx1RsjBwQFffPGFxuzevXsYP348lEqlQKmoIWGxIyJ6RpWVldiyZQu8vb0xceJEpKen17guMDAQP/74IxISEjBs2DAWOiPXv39/jBs3TmN27NgxbNq0SaBE1JDwKJaI6CmVl5dj27ZtWLlyJbKysrSu69SpExYuXIj+/ftDJBLpMCEJraioCP7+/rh586Z6ZmlpiQsXLsDb21vAZGTsWOyIiGqprKwMW7duxapVqzQ+Q/WwkJAQLFq0CH369GGha8COHj2KF154QWMWGhqKkydPcteW6g2PYomInqCkpASrV6+Gp6cnZsyYobXUhYWF4ddff8Xp06fRt29flroGLiIiAlOmTNGYxcTEYO3atQIlooaAO3ZERFoUFxdj48aNWLNmDe7evat1XXh4OBYuXIjw8HCWOdJQUlKCgIAApKWlqWdmZmaIj49Hu3btBExGxorFjojoIYWFhfj000+xbt06FBQUaF0XERGBBQsWoEePHjpMR4bmjz/+QM+ePTUuVNypUyfExMTw2oVU53gUS0T0N5lMhoULF6J58+ZYtGiR1lLXt29fxMTE4MiRIyx19ETdu3fHe++9pzE7d+4cVqxYIVAiMmbcsSOiBi8/Px+ffPIJNmzYgJKSEq3rBgwYgAULFiA4OFiH6cgYlJeXIygoCJcvX1bPpFIpYmNjERgYKFwwMjosdkTUYOXl5WHNmjXYtGkTSktLta4bPHgw5s+fj6CgIB2mI2Nz9uxZdO3aVeNCxf7+/oiNjeX9ganO8CiWiBqcnJwczJgxA56enli9enWNpU4kEmHo0KFISEjATz/9xFJHz61z58744IMPNGZJSUmIjIwUKBEZI+7YEVGDcevWLaxcuRJffPEFKisra1wjFosxfPhwzJs3D76+vjpOSMauqqoKwcHBSExMVM/EYjFiYmIQEhIiYDIyFix2RGT0MjIysGLFCnz55ZeoqqqqcY1YLMY///lPzJ07F23atNFxQmpIEhISEBwcjOrqavWsTZs2OH/+PCwsLARMRsaAR7FEZLTS0tIwceJEtGrVClu2bKmx1EmlUowfPx5XrlzBrl27WOqo3gUEBGDhwoUasytXrmDevHkCJSJjwh07IjI6qampWLZsGb7++msoFIoa15iYmGDcuHGYM2cOPD09dZyQGjq5XI7Q0FDExsaqZyKRCL///jsvoUPPhcWOiIzG5cuXsXTpUuzevVvjm4cPMjU1xRtvvIHZs2fDw8NDxwmJ/ufSpUvo0KGDxuc9PT09kZiYCGtrawGTkSHjUSwRGbzk5GSMGDEC7dq1wzfffFNjqTM3N8e7776LtLQ0bNy4kaWOBOfj44OlS5dqzNLT0zFr1iyBEpEx4I4dERmshIQELF68GD/++KPWNRYWFpg8eTJmzpwJV1dXHaYjejKFQoGePXsiOjpaY/7rr7/ihRdeECgVGTIWOyIyOHFxcVi8eDH279+vdY2VlRXeeecdzJgxA05OTjpMR/R0rl27hoCAAJSVlaln7u7uSE5Ohq2trYDJyBDxKJaIDMaZM2fQv39/dOrUSWupa9SoEebNm4cbN25gxYoVLHWk91q1aoVVq1ZpzG7duoXp06cLE4gMGnfsiEjvRUdHY/HixTh8+LDWNba2tpg+fTqmTZsGe3t7HaYjen5KpRIvvvgifvvtN415VFQUBgwYIFAqMkQsdkSkt0
6cOIHIyEgcO3ZM6xp7e3vMmDEDU6dO5bEVGbTMzEz4+fnh3r176pmzszNSUlLQuHFjAZORIeFRLBHpFZVKhd9++w09e/ZEeHi41lLn6OiI5cuXIyMjA/Pnz2epI4Pn4eGBdevWaczy8vLwzjvvCBOIDBJ37IhIL6hUKvz666+IjIxETEyM1nVOTk54//338dZbb/FaX2R0VCoV+vfvj4MHD2rM9+7di6FDhwqUigwJix0RCUqlUuHgwYOIjIzE2bNnta5zdXXF7NmzMXHiRFhaWuowIZFuZWdnw8/PDwUFBepZ48aNkZKSAmdnZwGTkSHgUSwRCUKlUmH//v3o1KkT+vfvr7XUubu747PPPkNaWhqmTZvGUkdGr2nTpvjss880Znfv3sWbb74J7sXQk3DHjoh0SqlU4qeffsKSJUuQkJCgdZ2Hhwfmzp2L119/HWZmZjpMSCQ8lUqFoUOHPnLx7a+++gqjR48WKBUZAhY7ItIJhUKBH374AUuWLEFKSorWdS1btsTcuXMxevRomJqa6jAhkX65c+cOfH19cefOHfXM1tYWycnJcHd3FzAZ6TMexRJRvZLL5fjmm2/g6+uLkSNHai113t7e2LlzJy5fvowJEyaw1FGD16RJE2zZskVjVlRUhDfeeINHsqQVix0R1Yvq6mrs3LkTPj4+GD16NK5cuVLjurZt2+Kbb77BxYsXMXbsWJiYmOg4KZH+euWVVzBq1CiN2eHDh/HFF18IlIj0HY9iiahOVVVV4auvvsKyZcuQnp6udZ2fnx8WLFiAV199FRKJRIcJiQxLQUEB/Pz8kJ2drZ5ZW1sjMTERnp6eAiYjfcQdOyKqE5WVldi8eTO8vb0xceJEraUuICAAP/74IxISEjBs2DCWOqInsLe3x7Zt2zRmJSUlGDduHJRKpUCpSF+x2BHRcykvL8eGDRvg5eWFKVOmIDMzs8Z1HTt2xP79+3H+/Hm88sorEIv5rx+i2urbty/eeOMNjdmJEyceuSwKEY9iieiZlJWVYevWrVi1ahVyc3O1rgsJCcHChQvRt29fiEQiHSYkMi7FxcXw9/fX+MuThYUFLly4gNatWwuYjPQJix0RPZWSkhJs3rwZH3/8MW7fvq11XVhYGBYtWoSIiAgWOqI6cuzYMfTu3Vtj1qVLF5w6dYofayAAPIololoqLi7G8uXL0aJFC8yaNUtrqQsPD8exY8fwxx9/4IUXXmCpI6pD//d//4d33nlHY/bnn3/i448/FigR6Rvu2BHRYxUWFuLTTz/FunXrNO5d+bCIiAgsWLAAPXr00GE6ooantLQUgYGBuHbtmnpmamqKuLg4+Pn5CZiM9AGLHRHVSCaTYd26dVi/fj2Ki4u1ruvTpw8WLFiA0NBQHaYjatiio6PRvXt3jQsVd+jQAWfOnOG1IBs4HsUSkYb8/HzMnTsXzZs3x+LFi7WWugEDBuDMmTM4dOgQSx2RjoWFhWHmzJkas/Pnz2PZsmUCJSJ9wR07IgIA5OXlYc2aNdi0aRNKS0u1rhs8eDDmz5+PoKAgHaYjoodVVFSgY8eOuHjxonomlUpx5swZ/v+zAWOxI2rgcnJysHr1amzZsgXl5eU1rhGJRBgyZAjmz5+P9u3b6zghEWlz7tw5dOnSBQqFQj3z9fVFXFwczMzMBExGQuFRLFEDdevWLUydOhWenp5Yu3ZtjaVOJBJh5MiRSEpKwt69e1nqiPRMp06dMHfuXI1ZSkoKFi1aJFAiEhp37IgamIyMDKxYsQJffvklqqqqalwjFosxatQozJ07F23bttVxQiJ6GlVVVQgJCcGFCxfUM7FYjFOnTqFr167CBSNBsNgRNRBpaWlYvnw5du7cCblcXuMaiUSCMWPGYO7cuWjVqpWOExLRs0pKSkLHjh1RXV2tnnl7e+PChQuwtLQUMBnpGo9iiYxcamoqxo0bh9atW2Pbtm01ljoTExNMmjQJqamp+PLLL1nqiAyMv78/PvroI41ZamrqI8
e0ZPy4Y0dkpC5fvoylS5di9+7dUCqVNa4xNTXFG2+8gdmzZ8PDw0PHCYmoLsnlcnTr1g1nzpzRmB8/fhzh4eHChCKdY7EjMjLJyclYsmQJ9u7dC23/9zY3N8ekSZMwa9YsuLm56TghEdWXK1euIDAwEBUVFepZixYtkJiYiEaNGgmYjHSFR7FERiIhIQFDhgyBv78/9uzZU2Ops7CwwHvvvYe0tDSsX7+epY7IyLRp0wbLly/XmN24ceORixmT8eKOHZGBi4uLw+LFi7F//36ta6ysrPD222/jX//6F5ycnHSYjoh0TalUolevXjh58qTG/NChQ+jTp49AqUhXWOyIDNSZM2ewePFiHDhwQOuaRo0a4d1338X06dPh6Oiow3REJKS0tDS0b99e4y4ybm5uSEpKgr29vYDJqL7xKJbIwERHR6NPnz7o0qWL1lJna2uLRYsW4caNG1iyZAlLHVED07JlS3z88ccas6ysLEybNk2gRKQr3LEjMhAnTpxAZGQkjh07pnWNvb09ZsyYgalTp8LW1laH6YhI36hUKrz00ks4cuSIxvznn3/GoEGDhAlF9Y7FjkiPqVQqHDt2DJGRkY98XuZBjo6O+Ne//oUpU6bAxsZGhwmJSJ/dvHkTfn5+KC4uVs+cnJyQkpLCnXwjxaNYIj2kUqlw+PBhdOvWDREREVpLnZOTE1avXo309HTMmTOHpY6INDRr1gyffvqpxuz27duYPHmy1sshkWHjjh2RHlGpVDh48CAiIyNx9uxZretcXV0xa9YsTJo0ibcLIqLHUqlU+Mc//oH//Oc/GvPvvvsOI0aMECgV1RcWOyI9oFKpEBUVhcjISMTHx2td5+7ujjlz5mDChAkwNzfXYUIiMmS5ubnw9fWFTCZTzxwcHJCcnAxXV1cBk1Fd41EskYCUSiX27duHDh06YNCgQVpLnYeHB7Zs2YJr167h7bffZqkjoqfi4uKCTZs2acxkMhkmTZrEI1kjwx07IgEoFAr88MMPWLJkCVJSUrSua9myJebOnYvRo0fD1NRUhwmJyBgNHz4ce/fu1Zjt2LEDr7/+ujCBqM6x2BHpkFwux/fff48lS5bgypUrWtd5e3tj3rx5eO2112BiYqLDhERkzPLz8+Hr64vbt2+rZzY2NkhOTkazZs0ETEZ1hUexRDpQXV2NnTt3wsfHB6NHj9Za6tq2bYtvvvkGFy9exNixY1nqiKhOOTo64osvvtCYFRcXY/z48TySNRIsdkT1qKqqCtu2bUObNm0wbtw4XLt2rcZ1fn5+2LNnD5KTkzFq1ChIpVIdJyWihmLgwIEYM2aMxuzo0aPYsmWLQImoLvEolqgeVFZWYseOHVi+fDkyMzO1rgsICMDChQsxaNAgiMX8exYR6UZhYSH8/PyQlZWlnllZWSEhIQFeXl4CJqPnxXcSojpUXl6ODRs2wMvLC5MnT9Za6jp27Ij9+/fj/PnzeOWVV1jqiEin7OzssH37do1ZaWkpxo0bB6VSKVAqqgt8NyGqA2VlZVi7di1atmyJd999V+NvwQ8KCQnBgQMHEBsbi4EDB0IkEuk4KRHRX1566SW8+eabGrM//vgD69evFygR1QUexRI9h5KSEmzevBkff/yxxrfMHhYWFoZFixYhIiKCZY6I9Ma9e/fQvn173LhxQz0zMzPDhQsX0LZtW+GC0TNjsSN6BsXFxdi4cSPWrFmDu3fval0XHh6OhQsXIjw8nIWOiPTSiRMnEB4erjHr3LkzoqOj+UUuA8SjWKKnUFhYiMWLF6NFixaYO3eu1lIXERGBEydO4Pjx4+jVqxdLHRHprZ49e2LatGkas7Nnz2LVqlUCJaLnwR07olqQyWRYv3491q9fj6KiIq3r+vTpgwULFiA0NFSH6YiInk9ZWRk6dOiAq1evqmcmJiY4d+4c2rdvL2AyelosdkSPkZ+fj08++QSfffYZ7t27p3XdgAEDMH/+fHTu3FmH6YiI6s6ff/6JsLAwjW/FBg
QE4OzZs7yloQHhUSxRDfLy8jBr1iy0aNECy5cv11rqBg8ejLi4OERFRbHUEZFB69KlC2bNmqUxS0hIwJIlSwRKRM+CO3ZED8jJycHq1auxZcsWlJeX17hGJBJhyJAhmD9/Po8oiMioVFZWolOnTkhOTlbPJBIJTp8+jeDgYAGTUW2x2BEBuHXrFlatWoXPP/8clZWVNa4RiUQYMWIE5s2bB19fXx0nJCLSjfj4eISEhEAul6tnPj4+iI+Ph7m5uYDJqDZ4FEsNWkZGBqZMmQIvLy9s2LChxlInFosxevRoXLx4Ebt372apIyKjFhQUhPnz52vMLl26hAULFgiUiJ4Gd+yoQUpLS8Py5cuxc+dOjb+VPkgikWDMmDGYO3cuWrVqpeOERETCqa6uRpcuXRAfH6+eiUQi/PHHHwgLCxMwGT0Jix01KKmpqVi2bBm+/vprKBSKGteYmJhg3LhxmDNnDjw9PXWckIhIP6SkpCAoKAhVVVXqmZeXFxISEmBlZSVgMnocHsVSg3D58mWMHj0abdu2xc6dO2ssdaamppgyZQquXbuGrVu3stQRUYPm6+uLxYsXa8yuX7+O2bNnC5SIaoM7dmTUUlJSsGTJEuzZswfa/lE3NzfHpEmTMGvWLLi5uek4IRGR/lIoFOjevTtOnz6tMT969Ch69+4tUCp6HBY7Mkr3r720b98+rWssLCwwefJkzJw5E66urjpMR0RkOFJTUxEQEKBxCSgPDw8kJSXBxsZGwGRUEx7FklGJi4vDoEGDEBgYqLXUWVlZYfbs2bhx4wbWrFnDUkdE9Bje3t5YuXKlxiwzMxMzZswQKBE9DnfsyCicPXsWkZGROHDggNY1jRo1wrvvvovp06fD0dFRh+mIiAybUqlEREQEjh8/rjH/5Zdf0K9fP4FSUU1Y7MigxcTEIDIyEocPH9a6xtbWFtOnT8e0adNgb2+vw3RERMbjxo0b8Pf3R0lJiXrm6uqK5ORkODg4CJiMHsSjWDJIJ0+eREREBMLCwrSWOnt7eyxevBgZGRn48MMPWeqIiJ5DixYt8Mknn2jMcnJyMHXqVIESUU24Y0cGQ6VS4fjx44iMjMSJEye0rnN0dMS//vUvvP3222jUqJEOExIRGTeVSoWXX34Z//3vfzXm+/btw6uvvipQKnoQix3pPZVKhSNHjiAyMhLR0dFa1zk5OeH999/HW2+9BWtrax0mJCJqOLKysuDn54fCwkL1zNHRESkpKXBychIuGAHgUSzpMZVKhQMHDqBLly546aWXtJY6V1dXrFu3Dunp6Zg5cyZLHRFRPXJzc8OGDRs0Zvn5+Zg8ebLW64WS7nDHjvSOSqVCVFQUIiMjNe5T+DB3d3fMmTMHEyZMgLm5uQ4TEhE1bCqVCq+88gr+/e9/a8y/+eYbjBo1SphQBIDFjvSIUqnEzz//jMWLFyMhIUHrOg8PD8ydOxevv/46zMzMdJiQiIjuy8vLg5+fH/Lz89UzOzs7pKSkoGnTpgIma9h4FEuCUygU2LNnD9q3b48hQ4ZoLXUtW7bEtm3bkJqaijfffJOljohIQM7Ozti8ebPGrLCwEG+88QaPZAXEHTsSjFwux/fff4+lS5fi8uXLWtd5e3tj/vz5eO211yCVSnWYkIiInuS1117Dd999pzHbtm0bJkyYIFCiho3FjnSuuroa3377LZYuXYpr165pXde2bVssWLAAw4cPh0Qi0WFCIiKqLZlMBl9fX+Tm5qpnjRo1QlJSEpo3by5gsoaJR7GkM1VVVdi2bRvatGmDcePGaS11fn5+2LNnD5KTk/Haa6+x1BER6TEHBwd88cUXGrN79+5h/PjxUCqVAqVquFjsqN5VVlZiy5Yt8Pb2xsSJE5Genl7juoCAAPz4449ISEjAsGHDWOiIiAxE//79MW7cOI3ZsWPHsGnTJoESNVw8iqV6U1FRgW3btmHFihXIysrSuq5jx45YuHAhBgwYAJFIpMOERERUV4qKiuDv74+bN2
+qZ5aWlrhw4QK8vb0FTNawsNhRnSsrK8PWrVuxevVq5OTkaF0XEhKCRYsWoU+fPix0RERG4OjRo3jhhRc0ZqGhoTh58iRPYXSER7FUZ0pKSrB69Wp4enpixowZWktdWFgYfv31V5w+fRp9+/ZlqSMiMhIRERGYMmWKxiwmJgZr164VKFHDwx07em7FxcXYuHEj1qxZg7t372pdFx4ejoULFyI8PJxljojISJWUlCAgIABpaWnqmZmZGeLj49GuXTsBkzUMLHb0zAoLC7FhwwasXbsWBQUFWtdFRERgwYIF6NGjhw7TERGRUP744w/07NlT40LFnTp1QkxMDExMTARMZvx4FEtPTSaTYdGiRWjRogUWLlyotdT16dMH0dHROHLkCEsdEVED0r17d7z33nsas3PnzmHFihUCJWo4uGNHtZafn4+1a9diw4YNuHfvntZ1AwYMwPz589G5c2cdpiMiIn1SXl6OoKAgjTsLSaVSxMbGIjAwULhgRo7Fjp4oLy8Pa9aswaZNm1BaWqp13eDBgzF//nwEBQXpMB0REemrs2fPomvXrhoXKvb390dsbCzv911PeBRLWuXk5GDGjBnw9PTE6tWrayx1IpEIQ4cORUJCAn766SeWOiIiUuvcuTM++OADjVlSUhIiIyMFSmT8uGNHj7h16xZWrVqFzz//HJWVlTWuEYvFGD58OObNmwdfX18dJyQiIkNRVVWF4OBgJCYmqmdisRgxMTEICQkRMJlxYrEjtczMTKxYsQLbt29HVVVVjWvEYjH++c9/Yu7cuWjTpo2OExIRkSFKSEhAcHAwqqur1bM2bdrg/PnzsLCwEDCZ8eFRLCE9PR2TJk1Cq1atsHnz5hpLnVQqxfjx43HlyhXs2rWLpY6IiGotICAAixYt0phduXIF8+bNEyiR8eKOXQOWmpqKZcuW4euvv4ZCoahxjYmJCcaNG4c5c+bA09NTxwmJiMhYyOVyhIaGIjY2Vj0TiUT4/fffeUmsOsRi1wBdvnwZS5cuxe7duzW+qfQgU1NTvPHGG5g9ezY8PDx0nJCIiIzRpUuX0KFDB43Pb3t6eiIxMRHW1tYCJjMePIptQFJSUjBy5Ei0a9cO33zzTY2lztzcHO+++y7S0tKwceNGljoiIqozPj4+WLp0qcYsPT0d77//vkCJjA937BqAhIQELFmyBPv27dO6xsLCApMnT8bMmTPh6uqqw3RERNSQKBQKhIeH49SpUxrzw4cP48UXXxQolfFgsTNi8fHxWLx4Mf79739rXWNlZYV33nkHM2bMgJOTk+7CERFRg3X9+nW0b98eZWVl6pm7uzuSkpJgZ2cnXDAjwKNYI3T27FkMGDAAHTt21FrqGjVqhHnz5uHGjRtYsWIFSx0REemMl5cXVq9erTG7devWI/eXpafHHTsjEhMTg8jISBw+fFjrGltbW0yfPh3Tpk2Dvb29DtMRERH9j1KpxIsvvojffvtNY75//34MHDhQoFSGj8XOCJw8eRKRkZGP/J/jQfb29pgxYwamTp0KW1tbHaYjIiKqWWZmJvz8/HDv3j31zNnZGSkpKWjcuLGAyQwXj2INlEqlwrFjxxAeHo6ePXtqLXWOjo5YsWIFMjIyMH/+fJY6IiLSGx4eHli3bp3GLC8vD2+//bYwgYwAd+wMjEqlwpEjRxAZGYno6Git65ycnDBr1iy89dZbsLKy0mFCIiKi2lOpVBgwYAAOHDigMd+zZw+GDRsmUCrDxWJnIFQqFQ4dOoTIyEicOXNG6zpXV1fMnj0bEydOhKWlpQ4TEhERPZucnBz4+vqioKBAPWvcuDFSUlLg7OwsYDLDw6NYPadSqbB//34EBwejX79+Wkudu7s7PvvsM6SlpWHatGksdUREZDBcXV2xceNGjdndu3cxadIkcP/p6XDHTk8plUr8/PPPWLx4MRISErSua968OT744AO8/vrrMDMz02FCIiKiuqNSqTB06FD8+OOPGvNdu3ZhzJgxAqUyPC
x2ekahUGDfvn1YvHgxUlJStK5r2bIl5s2bh9GjR8PExESHCYmIiOrHnTt34Ovrizt37qhntra2SE5Ohru7u4DJDAePYvWEXC7Ht99+Cz8/P4wYMUJrqfP29sauXbtw5coVjB8/nqWOiIiMRpMmTbB161aNWVFRESZMmMAj2VpqEDt2CoUCMpkMeXl5yMvLw53cXFSWl0OpUEAskcDMwgJNXFzg7OwMZ2dnODg4QCKR6CRbdXU1du/ejaVLlyI1NVXrurZt22LBggUYPny4zrIREREJYfTo0fjmm280Zlu3bsWkSZM0Zvr8/i4Uoy52BQUFSEhIQFJ8PCpKS6GSy2FdXg5bmQwmcjnEKhWUIhGqpVIUOTigxMICIqkU5lZW8A8KQkBAQL3dnaGqqgpff/01li1bhrS0NK3r/Pz8sGDBArz66qtG/w8jERER8Nf7t5+fH7Kzs9UzKysrJCUlwdPTU6/f34VmlMUuOzsbMadOIT01FSZlZfDIvAlXmQy2paUwUSi0Pq5aIkGRlRVyHByQ6dEM1ZaW8PT2Rlj37nB1da2TbJWVldixYweWL1+OzMxMresCAwOxYMECDBo0CGIxT8yJiKhhOXToEF5++WWNWb9+/TB29GjcuHZN797f9YVRFTu5XI7o6GjERkfDOj8frTIy4Z6fD4lS+dTPpRCLccvREdeae6DE0RHBYWEICwuDVCp9pmwVFRXYtm0bVqxYgaysLK3rOnXqhIULF6J///4QiUTP9FpERETGYOLEidi2bRskEglCQ0MRFhyMplVV8MnO0Zv3d31jNMUuNzcXB6KiUHArC21TU+GdlQVxHfxoSpEIqW5uuOztDQd3N7w8cCBcXFxq/fiysjJ8/vnnWLVqFXJycrSuCwkJwaJFi9CnTx8WOiIiIgDFxcXo2bMnOnXoADd7e3hfvgy3q6lwdnR87iL2vO/v+sooil1GRgZ+3rMHltk56HjpEmzKyur8NYotLRHn44Oypk0xePgwNG/eHMBfhfLw4cPw9/dHUFCQen1JSQm2bNmC1atX4/bt21qfNywsDIsWLUJERAQLHRER0QMyMjLw/VdfwSQzEz5xcbAsLgYAmJqYorGjI+riXVPb+7uhMvhil5GRgR+/+w6NMzLR+eJFSJ9hW7a25GIxzvi2g8zDA6+OHImqqip0794deXl5AP66r13fvn2xceNGrFmzBvn5+VqfKzw8HAsXLkR4eDgLHRER0UMefH9v++dpVNy7p/HrNo1sYG1tXSev9fD7uyGXO4Mudrm5ufj+q69gl34DXVNS6uTo9UmUIhFO+/mioHkL/PTLf3Dq1Cn1r9nY2EAsFqOwsFDr4yMiIrBgwQL06NGj3rMSEREZooff30VKJW7fuQOFQv7AKhGaNGkCkzr6bNz99/fCFp4YMWa0wR7LGuzXLeVyOQ5ERcEyOwchFy/qpNQBgFilQkjKRUgybqBtq1YalyApLi7WWur69u2LmJgYHDlyhKWOiIhIi5re30UiEezt7ACNw1cVCgsLUFfv/vff3y1ysnEwKgpyufzJD9JDBlvsoqOjUXArCx0vXarX49eaVJWWotWff8LNwQGhoaGPXTtgwACcPXsWBw8eRNeuXXWUkIiIyDBpe383NTWFtZWVxtrq6mpUV1XV2WtLlUp0vHgJsqwsxMTE1Nnz6pJBFrvs7GzERkejbWpqvXxR4nGqqqtRWFgIq+JieF++jLDg4Bq3awcPHoy4uDhERUUhODhYpxmJiIgM0ZPe3xvZNIJUqnkrzbo+r7MtK0Obq6k4e+rUY69moa8MstjFnDoF6/x8eD/menD1QQXg7t27uP+PUdOrV+FYUoKwh3bt3NzcsG/fPo1vyRIREdHjPen9XQQRHBwcYCI1ASCCpaUVTE1N6zxH66wsWOfnI/qBz9EbCoMrdgUFBUhPTUWrjEydfa7uPnl1NVSq/20Li1UqNLt2Da09PWFra6ueZ2Vl4caNGzrNRkREZMhq+/4ulUjQpE
kTNHV1hZ2tbZ1c8uRhYpUKXhmZSL96FQUFBfXwCvXH4IpdQkICTMrK4P6YS4nUF6nJX39DeJDjzZuwlMsREBCgnjk5OcHd3V3H6YiIiAyXkO/vNWmWnw9pWRkSExOFjvJUDOr+GQqFAknx8fDIvPlMtxF5XiIAjo0bo6i4GCqVChKJGCKRCC1uZaFbSAisra3h6uqK9957r162homIiIyR0O/vNZEolWh+8yYS4+LQrVs3jatg6LN63bE7d+4c3n//fa2/HhUVhbVr19b6+WQyGSpKS3H14kUMPB+Pgefj4Rt9CgPi4zDwfDy23br1XHmvlpZiTFIiXjgXiz5x5/DhtWuoVirxaUYGvs7OBvDXt3KaODrCqUkTNHZoDAd7B3iVl8OuUSOMGjUKaWlpGDlyJL7//vsaX2PLli3Ys2cPAODEiRPw9fVFSEjIU/9eaLNt2zZ4e3tDJBKhpKTkuZ+PiIgaBqlUisDAQPV/qp7h26arVq16pte+//7uKpNpzD/LzEC/+Dj0j4/DKxfO42ZFxWOf54tbN5/r8Z3/PK3xv13v/pVL9lCuhz3rz/2wGzduICwsDObm5vjss8+e6TlqfYHizz//HJMmTXqmF6krycnJOPjDDxjw+wn1V6B7xZ7FL0EdYfVAk1apVFABED/FHR3KFQr0Px+PSK9WCLO3h0qlwn/u3EHvxo2x/dYt2JuYYHTTpo88TgWgpLoa/+nRAz8cPICUlBQAgFgsRnZ2NpydnbW+5ltvvYXw8HCMGDGi1jnvUygUNf7tISkpCdbW1ujVqxeSk5Pr7KrcRERk3BwdHR97x6T6eg6FQoFLly498v4eX1yMdRk38KWfP6QiEXIrK2EhEcP2oW/FPqjzn6dxtkvXOnk8AFRLJPilZw+8PHQo/Pz86uzn1vYeLpPJkJqaiqioKLi6uuKdd96p9XPeV+uj2E2bNmHSpEkoKSnBlClTcOnSJahUKqxfvx5hYWEoLi7G5MmTkZSUBLFYjE2bNqGqqgqfffYZ9u3bh+PHj+Pdd9+FWCyGiYkJzp07h507dyI5ORkff/wx0tLSMH78eMhkMrRo0QI7d+6Eg4MDwsPDERISgmPHjiE/Px+jevXSet26zn+exlAXF5wuLMSaNm3w3/x8HLl7F9VKJV5zbYqRrq4AgM03Mx+Z/+fOHQTb2CLM3h4AIBKJMNDJ6ZHX+C4nB/vyclGpVKKthQVmNnaEUiHHuVOnkJ6erl6nVCqxePFi/Pe//4WJiQnc3d2xdetWrF+/Hvb29rC2tsb333+PgwcPIioqCoGBgbh69Srmzp2L/Px8zJs3D7m5uTAzM8Py5cvh5eWF999/H/b29khOTkb37t0xefLkR/JZWVlBpVJBLpcjPT0dVg9d84eIiKgmSqUSaWlpGrPjx4/js88+Q2VlJdq3b48lS5ZALBZj7ty5SE5ORlVVFUaNGoXRo0fj448/RmFhIdq1a4eOHTti4sSJePvtt7F//34AwLJly9C6dWsMGTIEPXr0wNChQ3HixAm89957+PPPP3Hgp5+wvbgYve0dMNXDA3kVFWgkkQBKJVRiMVzMzNS5fpfJsPFmJiqVSrS3boTIVq2wLiMD9+RyDDwfjyAbG3S1tYONVArp35s8T3r8w5tB97uCLCUZ2UVFWLduHQAgMjISe/fuhUQiwRtvvIHc3FwUFhYiMDAQYWFh2LhxI1auXImvv/4aIpEIc+bMwahRo/D7779j2bJlsLOzQ25uLk6ePPnIn4GDgwNCQkJw6NChZ/5zrPWOnYWFBcrLyzFnzhyEhIRg8ODBuHXrFvr164eEhATMnDkTZmZmWLp0KRQKBUpLSxEfH68udgMGDMC7776LF154AUVFRbC1tdUodv3798eYMWMwbNgwrFy5EllZWfj0008RHh6OHj16IDIyEm9OmoQrJ07gC6f/7YI9uGPX+tQf2NrOF70cHHCyQIaTsgLM9/JClVKJkYkJ2OjTDlfLSmucf5mVBTczM4
x1c3vkZ/80I0O9Y1dYXQ0bqRS3b9/GqtwcdLGyQjcrK7x88yb69e+PH3766Zn/MIiIiBqiZu7umBsWhjaxsZiXm4t/2tvD09QUb2dlQaFSIdjSCoNdXNC5SRPIqqvx3uXL+NzXF2ZiMT66fg0dbWzQv4mTxo5biVyOEYkJkKtU6GZnj4FOTmjfqFGtHv9gh4hu1QpLTp7A7ydO4MKFC1i7di0OHjwIMzMzyGQyODg4aOzYnTt3Dm+++Saio6NRVlaG4OBg/PHHH7h69Sr+8Y9/4NKlS2hawwnggz788EM4OjrW747dfUeOHMHBgwfx0UcfAfjrum5VVVU4duwYoqKiAAASiQQ2NjYajwsLC8OcOXNw6dIlDB06VOPyIAAQGxuL//znPwCA0aNHo1+/fupf+8c//gEAcHd1RUxxMeBU8/GmuViMXg4OAIDogkIck8lwtrgIwF9/wJkV5VrngAq1Obm9XFqKj9Ou455cjnsKBVxNTNDNygot7e1xJjb2yU9AREREGu7KZFj23//CtLwc5Uolsqqr4Wtuji/c3XGhvBxx5eV4M/UqPpVKUa1U4kpZKYYmXAAAVCqVcDY1e+Q5raVS/LtDEM4UFiKmsBDjkpOwvq0Pqmrx+Ae7QnlKMkolEly/fh3Hjh3DuHHjYPb37p/D353jQadOncKrr74Kc3NzmJubo3fv3oiNjYWtrS3CwsKeWOqeV62LXdu2bQH89fm1X375BR4eHk/1QnPmzEHfvn1x4MABBAcH48yZMxq/LnqgVan+vi/cffd/A6FSQfmYDUZz8f++C6ICMNXDA4Mf+ozb0buyGufp5eU4X3zviT/H3NRUfNKiBZzk1fi2oABVf+eZHhqKn0pLkXnz5hOegYiIiB7k07o13vb0RMuHLi0iFYnQydISnSwtYSuR4pjsLrrZ2aOXvQOWt279xOeVikQIs7dHmL097E1M8FstH/9gh0ho6Yl7oaHo2bOn+lj5aTzYaSwtLZ/68U+r1t+Kvb8dGBERgY0bN6rnCQkJ6vnmzZsB/PWhwOLiYo3HX79+HQEBAZg7dy58fHw0Po8GAJ06dcKPP/4IANi9eze6d+/+aNin+KpxqJ0d9uXlokKhAACklZWhUqnUOh/YxAlni4oQU/jXhQhVKhV+yM1F6d/r7itXKtDczg4KsRgnSksBAEqVCnfLy+Fcw2fyiIiI6PGup6ejRC4HANyWy1GkUCCzqgpZ1dUAALFIjGyo4GZmjsBGjXCmqBA5lZUAgILqauT+/d8lIhEUf2+4pJWVIbO8HMBf7+mpZaVPfPx9D3YFpUiM2/n5qKioQEREBHbs2IHKv9ff/7asRCKB4u++0K1bN/z000+orKxEQUEBjh8/rtNbi9Z6x27ChAkAgIULF2Lq1Knw9/eHQqFA7969sWHDBixYsABvvfUW/P39IZFI1CXvvrVr1+L48eOQSCQIDg5G165dce3aNfWvf/rppxg3bhwiIyPRvHlz7Nq165EMpmZmUNbym67hDg5ILSvFkIQLUAFobGKCLe18tc4tJRJsatcOS9Ou48Nr1yERAV3s7DDoobI2pZkHXr1wAe7mZvCzsYFUpYISwOfnziH/738A7wsNDUVRURFUKhX69euHjz76CEuWLEHjxo0xefJkTJo0CYMGDcLLL7+Mr7/+GhcvXsTy5cuRl5eHd999F+np6ZDL5Xjttdcwc+ZMjfXa7Ny5E0uWLEFeXh6cnJwwYsQILF26tFa/Z0RE1HA1a9YMNx86dfr111/x4YcfQi6XQyqVYuPGjejQoQMmTJiACxcuoFWrVigtLcWsWbPQo0cPzJs3D4cOHUKPHj2wbt06fPrpp/jiiy/g5eUFKysr9OnTB6NHj0bbtm1x7tw59ZUb3ps2Dat//BEWFRWwkkjwSZs2kFfLsTjtOkrkCkAE+FpZ45+urjCXSPBhq1aYcvEi5ColpCIxlnh7w8XMDIOdnNE/Pg4hdnYY4uyMyOvXUfJ34arN4+97sC
uUXroE26aueGf6dLz88suIi4tDUFAQTExM8MYbb+Cdd97B2LFj4e/vj169emHjxo0YOnQoOnbsCJFIhI8++giurq64cuXKE/8MiouL0a5dOxQXF0MikeDjjz9+6jtZ1frLE/rgt99+w5XDh/HC6T+FjqKhqroK/+3UCQcvXcKxY8cA/HV8nJOTA/u/v2VLRERENdPX93cAONK1C9q89BJ69+4tdJRaMag7Tzg7OyPOwgLVEglMHjoiFZLI3AKKxo3x9ttvo2nTprhz5w5mzpzJUkdERFQL+vr+Xi2RoMTC4rHXpNU3BlfsRFIpiqys4PjQZ/iEVGRlBZFUiu7du+OVV17RyWsuXboUP/zwg8ZsxowZGDNmjE5en4iIqK7o+/t7XRe7pKQkjB49WmPWqlUr7Nu377mf26CKnYODA8ytrJDj4KBXf/A5jf/KVdPXnuvLvHnzMG/ePJ29HhERUX1paO/v/v7+uHDhQp0+5331eq/YuiaRSOAfFIRMj2ZQiPUjukIsRkazZmjfsaPB3CCYiIhIn/D9ve7ox+/eUwgICEC1pSVuOToKHQUAcNPREXJLS7Rv317oKERERAaL7+91w+CKnb29PTy9vXGtuUetL31SX5QiEa4394Bn69b8ogQREdFz4Pt73TC4YgcAYd27o8TREak13NdVl666uaHE0RFh3boJmoOIiMgY8P39+RlksXN1dUVwWBgue3ujWAe356hJkaUlrrT2Rudu3eDq6ipIBiIiImPC9/fnZ5DFDgDCwsJg7+6GOB8fyHX8QUu5WIy4dj5wcHNDaGioTl+biIjImPH9/fkYbLGTSqXoN3Agypo2xRnfdjo7j1eKRDjj2w7lrk3x8sCBkEoN6ooxREREeo3v78/HYIsdALi4uGDw8GGQeXjgtJ9vvTd7uViM036+kHl4YPDwYXBxcanX1yMiImqI+P7+7AzqXrHaZGRk4Oc9e2GZnY2Oly7Bpqyszl+jyNISce18UO7aFIOHD0Pz5s3r/DWIiIjof/j+/vSMotgBQG5uLg5ERaHgVhbapqbCOysL4jr40ZQiEa66ueFKa284uLnh5YEDDbrJExERGRK+vz8doyl2ACCXyxEdHY3Y6GhY5+fDKyMTzfLzIVEqn/q5FGIxbjo64npzD5Q4OqJzt24IDQ012DN3IiIiQ8X399ozqmJ3X3Z2NmKio5F+9SqkZWVofvMmXO/KYFtaChOFQuvjqiUSFFlZIaexAzKaNYPc0hKerVsjzEC/8kxERGRM+P7+ZEZZ7O4rKChAYmIiEuPiUFFaCpVcDuvyctjICmAql0OsUkIpEqNKKkWxgz1KLCwgkkphbmWF9h07on379gZ3xWkiIiJjx/d37Yy62N2nUCggk8mQl5eHvLw83MnNRVVFBRRyOSRSKUzNzdHExQXOzs5wdnaGg4ODQd3wl4iIqCHi+/ujGkSxIyIiImoIDPo6dkRERET0Pyx2REREREaCxY6IiIjISLDYERERERkJFjsiIiIiI8FiR0RERGQkWOyIiIiIjASLHREREZGRYLEjIiIiMhIsdkRERERGgsWOiIiIyEiw2BEREREZCRY7IiIiIiPBYkdERERkJFjsiIiIiIwEix0RERGRkWCxIyIiIjISLHZERERERoLFjoiIiMhIsNgRERERGQkWOyIiIiIjwWJHREREZCRY7IiIiIiMBIsdERERkZFgsSMiIiIyEix2REREREaCxY6IiIjISPw/UaPnfRLk2TkAAAAASUVORK5CYII=", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "graph_search_space = tpot2.search_spaces.pipelines.GraphSearchPipeline(\n", + " leaf_search_space = fss_search_space,\n", + " inner_search_space = tpot2.config.get_search_space([\"transformers\"]),\n", + " root_search_space= tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"]),\n", + " max_size = 10,\n", + ")\n", + "\n", + "graph_search_space.generate(rng=4).export_pipeline().plot()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Optimize with TPOT\n", + "\n", + "For this example, we will optimize the DynamicUnion search space" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Generation: 100%|██████████| 5/5 [00:34<00:00, 6.96s/it]\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0.9916366443643849\n" + ] + } + ], + "source": [ + "import tpot2\n", + "import sklearn.datasets\n", + "from sklearn.linear_model import LogisticRegression\n", + "import numpy as np\n", + "\n", + "\n", + "final_classification_search_space = SequentialPipeline([dynamic_fss_space, classification_search_space])\n", + "\n", + "est = tpot2.TPOTEstimator(generations=5, \n", + " scorers=[\"roc_auc_ovr\", tpot2.objectives.complexity_scorer],\n", + " scorers_weights=[1.0, -1.0],\n", + " n_jobs=32,\n", + " classification=True,\n", + " search_space = final_classification_search_space,\n", + " verbose=1,\n", + " )\n", + "\n", + "\n", + "scorer = sklearn.metrics.get_scorer('roc_auc_ovo')\n", + "\n", + "est.fit(X_train, y_train)\n", + "print(scorer(est, X_test, y_test))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can see that this pipeline performed slightly better and correctly identified group one and group two as the feature sets used in the generative 
equation." + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
Pipeline(steps=[('featureunion',\n",
+       "                 FeatureUnion(transformer_list=[('featuresetselector-1',\n",
+       "                                                 FeatureSetSelector(name='group_one',\n",
+       "                                                                    sel_subset=['a',\n",
+       "                                                                                'b',\n",
+       "                                                                                'c'])),\n",
+       "                                                ('featuresetselector-2',\n",
+       "                                                 FeatureSetSelector(name='group_two',\n",
+       "                                                                    sel_subset=['d',\n",
+       "                                                                                'e',\n",
+       "                                                                                'f']))])),\n",
+       "                ('randomforestclassifier',\n",
+       "                 RandomForestClassifier(bootstrap=False,\n",
+       "                                        class_weight='balanced',\n",
+       "                                        max_features=0.3314976075207,\n",
+       "                                        min_samples_leaf=3,\n",
+       "                                        min_samples_split=15,\n",
+       "                                        n_estimators=128))])
" + ], + "text/plain": [ + "Pipeline(steps=[('featureunion',\n", + " FeatureUnion(transformer_list=[('featuresetselector-1',\n", + " FeatureSetSelector(name='group_one',\n", + " sel_subset=['a',\n", + " 'b',\n", + " 'c'])),\n", + " ('featuresetselector-2',\n", + " FeatureSetSelector(name='group_two',\n", + " sel_subset=['d',\n", + " 'e',\n", + " 'f']))])),\n", + " ('randomforestclassifier',\n", + " RandomForestClassifier(bootstrap=False,\n", + " class_weight='balanced',\n", + " max_features=0.3314976075207,\n", + " min_samples_leaf=3,\n", + " min_samples_split=15,\n", + " n_estimators=128))])" + ] + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "est.fitted_pipeline_" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Other examples" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## dictionary" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
            d         e         f
221  2.819930 -0.677731 -2.883151
121 -0.068495 -1.398128  1.239476
83   2.225071 -1.902156  1.425918
243  2.592917  0.764790 -2.120585
992  2.329279  2.362780 -3.780878
...       ...       ...       ...
254  1.850187 -2.177608 -4.088455
19  -0.949434  1.062798 -3.421324
956  2.105221 -0.169633  1.743979
150  2.171954 -1.343100 -0.346960
629  0.348571 -1.171049  0.854003
\n", + "

750 rows × 3 columns

\n", + "
" + ], + "text/plain": [ + " d e f\n", + "221 2.819930 -0.677731 -2.883151\n", + "121 -0.068495 -1.398128 1.239476\n", + "83 2.225071 -1.902156 1.425918\n", + "243 2.592917 0.764790 -2.120585\n", + "992 2.329279 2.362780 -3.780878\n", + ".. ... ... ...\n", + "254 1.850187 -2.177608 -4.088455\n", + "19 -0.949434 1.062798 -3.421324\n", + "956 2.105221 -0.169633 1.743979\n", + "150 2.171954 -1.343100 -0.346960\n", + "629 0.348571 -1.171049 0.854003\n", + "\n", + "[750 rows x 3 columns]" + ] + }, + "execution_count": 17, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "import tpot2\n", + "import pandas as pd\n", + "import numpy as np\n", + "from sklearn.linear_model import LogisticRegression\n", + "import sklearn\n", "\n", - "scorer = sklearn.metrics.get_scorer('roc_auc_ovo')\n", + "subsets = { \"group_one\" : ['a','b','c'],\n", + " \"group_two\" : ['d','e','f'],\n", + " \"group_three\" : ['g','h','i'],\n", + " }\n", "\n", - "est.fit(X_train, y_train)\n", - "print(scorer(est, X_test, y_test))" + "fss_search_space = tpot2.search_spaces.nodes.FSSNode(subsets=subsets)\n", + "\n", + "selector = fss_search_space.generate(rng=1).export_pipeline()\n", + "selector.set_output(transform=\"pandas\")\n", + "selector.fit(X_train)\n", + "selector.transform(X_train)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## list" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
            d         e         f
221  2.819930 -0.677731 -2.883151
121 -0.068495 -1.398128  1.239476
83   2.225071 -1.902156  1.425918
243  2.592917  0.764790 -2.120585
992  2.329279  2.362780 -3.780878
...       ...       ...       ...
254  1.850187 -2.177608 -4.088455
19  -0.949434  1.062798 -3.421324
956  2.105221 -0.169633  1.743979
150  2.171954 -1.343100 -0.346960
629  0.348571 -1.171049  0.854003
\n", + "

750 rows × 3 columns

\n", + "
" + ], + "text/plain": [ + " d e f\n", + "221 2.819930 -0.677731 -2.883151\n", + "121 -0.068495 -1.398128 1.239476\n", + "83 2.225071 -1.902156 1.425918\n", + "243 2.592917 0.764790 -2.120585\n", + "992 2.329279 2.362780 -3.780878\n", + ".. ... ... ...\n", + "254 1.850187 -2.177608 -4.088455\n", + "19 -0.949434 1.062798 -3.421324\n", + "956 2.105221 -0.169633 1.743979\n", + "150 2.171954 -1.343100 -0.346960\n", + "629 0.348571 -1.171049 0.854003\n", + "\n", + "[750 rows x 3 columns]" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "import tpot2\n", + "import pandas as pd\n", + "import numpy as np\n", + "from sklearn.linear_model import LogisticRegression\n", + "import sklearn\n", + "\n", + "subsets = [['a','b','c'],['d','e','f'],['g','h','i']]\n", + "\n", + "fss_search_space = tpot2.search_spaces.nodes.FSSNode(subsets=subsets)\n", + "\n", + "selector = fss_search_space.generate(rng=1).export_pipeline()\n", + "selector.set_output(transform=\"pandas\")\n", + "selector.fit(X_train)\n", + "selector.transform(X_train)" ] }, { @@ -1045,22 +3856,127 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": 19, "metadata": {}, "outputs": [ { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 100%|██████████| 5/5 [00:41<00:00, 8.27s/it]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "0.9754589371980676\n" - ] + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
def
2212.819930-0.677731-2.883151
121-0.068495-1.3981281.239476
832.225071-1.9021561.425918
2432.5929170.764790-2.120585
9922.3292792.362780-3.780878
............
2541.850187-2.177608-4.088455
19-0.9494341.062798-3.421324
9562.105221-0.1696331.743979
1502.171954-1.343100-0.346960
6290.348571-1.1710490.854003
\n", + "

750 rows × 3 columns

\n", + "
" + ], + "text/plain": [ + " d e f\n", + "221 2.819930 -0.677731 -2.883151\n", + "121 -0.068495 -1.398128 1.239476\n", + "83 2.225071 -1.902156 1.425918\n", + "243 2.592917 0.764790 -2.120585\n", + "992 2.329279 2.362780 -3.780878\n", + ".. ... ... ...\n", + "254 1.850187 -2.177608 -4.088455\n", + "19 -0.949434 1.062798 -3.421324\n", + "956 2.105221 -0.169633 1.743979\n", + "150 2.171954 -1.343100 -0.346960\n", + "629 0.348571 -1.171049 0.854003\n", + "\n", + "[750 rows x 3 columns]" + ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" } ], "source": [ @@ -1079,61 +3995,42 @@ "'''\n", "\n", "fss_search_space = tpot2.search_spaces.nodes.FSSNode(subsets=subsets)\n", - "graph_search_space = tpot2.search_spaces.pipelines.GraphPipeline(\n", - " root_search_space= tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"]),\n", - " leaf_search_space = None, \n", - " inner_search_space = tpot2.config.get_search_space([\"transformers\"]),\n", - " max_size = 10,\n", - ")\n", - "\n", - "combined_search_space = tpot2.search_spaces.pipelines.SequentialPipeline([fss_search_space, graph_search_space])\n", - "\n", - "\n", - "est = tpot2.TPOTEstimator(population_size=10,generations=5, \n", - " scorers=['roc_auc_ovr'],\n", - " scorers_weights=[1],\n", - " n_jobs=32,\n", - " classification=True,\n", - " search_space = combined_search_space,\n", - " verbose=1,\n", - " )\n", "\n", - "\n", - "scorer = sklearn.metrics.get_scorer('roc_auc_ovo')\n", - "\n", - "est.fit(X_train, y_train)\n", - "print(scorer(est, X_test, y_test))" + "selector = fss_search_space.generate(rng=1).export_pipeline()\n", + "selector.set_output(transform=\"pandas\")\n", + "selector.fit(X_train)\n", + "selector.transform(X_train)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "note that all of the above is the same when using numpy X, but the column names are now int indeces" + "All of the above is the same when using 
numpy data, but the column names are replaced by integer indexes." ] }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "[[-0.51289317 -2.65333383 -0.68124034 ... 0.66358004 0.25051107\n", - " 0.23444287]\n", - " [-2.51236119 -2.04422708 -0.40301026 ... 0.19208918 0.18041725\n", - " 0.33265205]\n", - " [ 1.49065721 -3.24328369 1.58423463 ... 0.678225 0.27643945\n", - " 0.78710293]\n", + "[[ 1.01731337 -0.93465307 -3.24039768 ... 0.39357783 0.5321955\n", + " 0.60973674]\n", + " [ 1.89576966 2.25608154 0.26643722 ... 0.7163676 0.06128163\n", + " 0.90206017]\n", + " [ 0.92384181 -0.1537421 -3.93631929 ... 0.75042522 0.99312028\n", + " 0.66668663]\n", " ...\n", - " [ 0.07953312 -1.10920624 1.0985733 ... 0.68578896 0.87562184\n", - " 0.28616797]\n", - " [-2.43045085 0.42769074 2.57608083 ... 0.0447371 0.31649605\n", - " 0.52711618]\n", - " [-2.50001651 -0.46482725 2.0546322 ... 0.34574358 0.96130892\n", - " 0.93289141]]\n" + " [-0.97817968 0.74965196 -2.18630603 ... 0.6798269 0.61574192\n", + " 0.99335315]\n", + " [ 2.30789516 0.97168873 -0.64014737 ... 0.77912995 0.80199131\n", + " 0.29313876]\n", + " [-0.18979565 0.24979476 -2.08789801 ... 
0.62763484 0.98615568\n", + " 0.46710305]]\n" ] } ], @@ -1155,22 +4052,24 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 21, "metadata": {}, "outputs": [ { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 100%|██████████| 5/5 [00:34<00:00, 6.92s/it]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "0.8762676829662166\n" - ] + "data": { + "text/plain": [ + "array([[-2.28091426, -1.08355329, -2.53836197],\n", + " [ 3.39870077, -2.05411876, -1.40671745],\n", + " [-0.34207793, 1.60073506, -1.42660153],\n", + " ...,\n", + " [-0.68208863, 2.89047595, -0.97955455],\n", + " [-0.28393195, -0.62493036, 0.21022641],\n", + " [-1.21225128, 1.97437192, 1.27051581]])" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" } ], "source": [ @@ -1186,30 +4085,9 @@ " }\n", "\n", "fss_search_space = tpot2.search_spaces.nodes.FSSNode(subsets=subsets)\n", - "graph_search_space = tpot2.search_spaces.pipelines.GraphPipeline(\n", - " root_search_space= tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"]),\n", - " leaf_search_space = None, \n", - " inner_search_space = tpot2.config.get_search_space([\"transformers\"]),\n", - " max_size = 10,\n", - ")\n", - "\n", - "combined_search_space = tpot2.search_spaces.pipelines.SequentialPipeline([fss_search_space, graph_search_space])\n", - "\n", - "\n", - "est = tpot2.TPOTEstimator(population_size=10,generations=5, \n", - " scorers=['roc_auc_ovr'],\n", - " scorers_weights=[1],\n", - " n_jobs=32,\n", - " classification=True,\n", - " search_space = combined_search_space,\n", - " verbose=1,\n", - " )\n", - "\n", - "\n", - "scorer = sklearn.metrics.get_scorer('roc_auc_ovo')\n", - "\n", - "est.fit(X_train, y_train)\n", - "print(scorer(est, X_test, y_test))" + "selector = fss_search_space.generate(rng=1).export_pipeline()\n", + "selector.fit(X_train)\n", + "selector.transform(X_train)" ] 
} ], From a21aa7b8f44ae599b7c325573a68e708cdaf2527 Mon Sep 17 00:00:00 2001 From: perib Date: Tue, 24 Sep 2024 17:22:12 -0700 Subject: [PATCH 20/44] add fancy search space --- Tutorial/3_Feature_Set_Selector.ipynb | 585 ++++++++++++++++++++++++++ 1 file changed, 585 insertions(+) diff --git a/Tutorial/3_Feature_Set_Selector.ipynb b/Tutorial/3_Feature_Set_Selector.ipynb index 5341cf0e..a175a587 100644 --- a/Tutorial/3_Feature_Set_Selector.ipynb +++ b/Tutorial/3_Feature_Set_Selector.ipynb @@ -3537,6 +3537,591 @@ "est.fitted_pipeline_" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Getting Fancy\n", + "\n", + "If you want to get fancy, you can combine more search spaces in order to set up unique preprocessing pipelines per feature set. Here's an example:" + ] + }, + { + "cell_type": "code", + "execution_count": 36, + "metadata": {}, + "outputs": [], + "source": [ + "dynamic_transformers = DynamicUnionPipeline(get_search_space(\"all_transformers\"), max_estimators=4)\n", + "dynamic_transformers_with_passthrough = tpot2.search_spaces.pipelines.UnionPipeline([\n", + " dynamic_transformers,\n", + " tpot2.config.get_search_space(\"Passthrough\")],\n", + " )\n", + "multi_step_engineering = DynamicLinearPipeline(dynamic_transformers_with_passthrough, max_length=4)\n", + "fss_engineering_search_space = SequentialPipeline([fss_search_space, multi_step_engineering])\n", + "union_fss_engineering_search_space = DynamicUnionPipeline(fss_engineering_search_space)\n", + "\n", + "final_fancy_search_space = SequentialPipeline([union_fss_engineering_search_space, classification_search_space])" + ] + }, + { + "cell_type": "code", + "execution_count": 40, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
Pipeline(steps=[('featureunion',\n",
+       "                 FeatureUnion(transformer_list=[('pipeline-1',\n",
+       "                                                 Pipeline(steps=[('featuresetselector',\n",
+       "                                                                  FeatureSetSelector(name='group_one',\n",
+       "                                                                                     sel_subset=[0,\n",
+       "                                                                                                 1,\n",
+       "                                                                                                 2])),\n",
+       "                                                                 ('pipeline',\n",
+       "                                                                  Pipeline(steps=[('featureunion',\n",
+       "                                                                                   FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                                                                                                   FeatureUnion(transformer_list=[('pca',\n",
+       "                                                                                                                                                   PCA(n_components=0.93113403057))])),\n",
+       "                                                                                                                  ('passthrough...\n",
+       "                                                                                   FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                                                                                                   FeatureUnion(transformer_list=[('quantiletransformer',\n",
+       "                                                                                                                                                   QuantileTransformer(n_quantiles=87)),\n",
+       "                                                                                                                                                  ('columnonehotencoder',\n",
+       "                                                                                                                                                   ColumnOneHotEncoder())])),\n",
+       "                                                                                                                  ('passthrough',\n",
+       "                                                                                                                   Passthrough())]))]))]))])),\n",
+       "                ('randomforestclassifier',\n",
+       "                 RandomForestClassifier(class_weight='balanced',\n",
+       "                                        criterion='entropy',\n",
+       "                                        max_features=0.021545996678,\n",
+       "                                        min_samples_leaf=11,\n",
+       "                                        n_estimators=128))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + ], + "text/plain": [ + "Pipeline(steps=[('featureunion',\n", + " FeatureUnion(transformer_list=[('pipeline-1',\n", + " Pipeline(steps=[('featuresetselector',\n", + " FeatureSetSelector(name='group_one',\n", + " sel_subset=[0,\n", + " 1,\n", + " 2])),\n", + " ('pipeline',\n", + " Pipeline(steps=[('featureunion',\n", + " FeatureUnion(transformer_list=[('featureunion',\n", + " FeatureUnion(transformer_list=[('pca',\n", + " PCA(n_components=0.93113403057))])),\n", + " ('passthrough...\n", + " FeatureUnion(transformer_list=[('featureunion',\n", + " FeatureUnion(transformer_list=[('quantiletransformer',\n", + " QuantileTransformer(n_quantiles=87)),\n", + " ('columnonehotencoder',\n", + " ColumnOneHotEncoder())])),\n", + " ('passthrough',\n", + " Passthrough())]))]))]))])),\n", + " ('randomforestclassifier',\n", + " RandomForestClassifier(class_weight='balanced',\n", + " criterion='entropy',\n", + " max_features=0.021545996678,\n", + " min_samples_leaf=11,\n", + " n_estimators=128))])" + ] + }, + "execution_count": 40, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "final_fancy_search_space.generate(rng=3).export_pipeline()" + ] + }, { "cell_type": "markdown", "metadata": {}, From 9ff1187b876d34274af14e579225357a425ec0fb Mon Sep 17 00:00:00 2001 From: perib Date: Tue, 24 Sep 2024 18:33:19 -0700 Subject: [PATCH 21/44] fssnode now always selects a new feature set when mutation is called - previously it could sometimes select the same set and not actually change --- tpot2/search_spaces/nodes/fss_node.py | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/tpot2/search_spaces/nodes/fss_node.py b/tpot2/search_spaces/nodes/fss_node.py index a5dfe2b8..386e9450 100644 --- a/tpot2/search_spaces/nodes/fss_node.py +++ b/tpot2/search_spaces/nodes/fss_node.py @@ -47,7 +47,9 @@ def __init__( self, def mutate(self, rng=None): rng = np.random.default_rng(rng) - self.selected_subset_name = rng.choice(self.names_list) + #get list of names 
not including the current one + names = [name for name in self.names_list if name != self.selected_subset_name] + self.selected_subset_name = rng.choice(names) self.sel_subset = self.subset_dict[self.selected_subset_name] From 8a80f5741fa557264c54649054a314aa2c875067 Mon Sep 17 00:00:00 2001 From: perib Date: Tue, 24 Sep 2024 18:34:04 -0700 Subject: [PATCH 22/44] reorder tutorials, fill in feature set selector tutorial and genetic feature set selector --- Tutorial/2_Search_Spaces.ipynb | 510 +++ Tutorial/3_Feature_Set_Selector.ipynb | 3216 ++++++++++++----- Tutorial/4_Genetic_Feature_Selection.ipynb | 2330 ++++++++++++ Tutorial/5_Genetic_Feature_Selection.ipynb | 626 ---- ...phPipeline.ipynb => 5_GraphPipeline.ipynb} | 0 ...bolic_Regression_and_Classification.ipynb} | 0 .../nodes/genetic_feature_selection.py | 29 +- 7 files changed, 5183 insertions(+), 1528 deletions(-) create mode 100644 Tutorial/4_Genetic_Feature_Selection.ipynb delete mode 100644 Tutorial/5_Genetic_Feature_Selection.ipynb rename Tutorial/{6_GraphPipeline.ipynb => 5_GraphPipeline.ipynb} (100%) rename Tutorial/{4_Symbolic_Regression_and_Classification.ipynb => 6_Symbolic_Regression_and_Classification.ipynb} (100%) diff --git a/Tutorial/2_Search_Spaces.ipynb b/Tutorial/2_Search_Spaces.ipynb index fffbffdc..7eb6755a 100644 --- a/Tutorial/2_Search_Spaces.ipynb +++ b/Tutorial/2_Search_Spaces.ipynb @@ -16712,6 +16712,516 @@ "Rather than create your own search space, you can simply pass the string into the `search_space` param. Alternatively, you can access tpot2.config.template_search_spaces.get_template_search_spaces directly which offers a few more customizable options for each template including `cross_val_predict_cv` and whether or not stacked classifiers/regressors are allowed. Or you can copy the code and customize it manually!" ] }, + { + "cell_type": "code", + "execution_count": 76, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
Pipeline(steps=[('passthrough', Passthrough()),\n",
+       "                ('variancethreshold',\n",
+       "                 VarianceThreshold(threshold=0.0033508168395)),\n",
+       "                ('featureunion-1',\n",
+       "                 FeatureUnion(transformer_list=[('skiptransformer',\n",
+       "                                                 SkipTransformer()),\n",
+       "                                                ('passthrough',\n",
+       "                                                 Passthrough())])),\n",
+       "                ('featureunion-2',\n",
+       "                 FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                                 FeatureUnion(transformer_list=[('estimatortransformer',\n",
+       "                                                                                 EstimatorTransformer(estimator=AdaBoostClassifier(algorithm='SAMME',\n",
+       "                                                                                                                                   learning_rate=0.0473135874378,\n",
+       "                                                                                                                                   n_estimators=436)))])),\n",
+       "                                                ('passthrough',\n",
+       "                                                 Passthrough())])),\n",
+       "                ('linearsvc', LinearSVC(C=0.012266617842, penalty='l1'))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + ], + "text/plain": [ + "Pipeline(steps=[('passthrough', Passthrough()),\n", + " ('variancethreshold',\n", + " VarianceThreshold(threshold=0.0033508168395)),\n", + " ('featureunion-1',\n", + " FeatureUnion(transformer_list=[('skiptransformer',\n", + " SkipTransformer()),\n", + " ('passthrough',\n", + " Passthrough())])),\n", + " ('featureunion-2',\n", + " FeatureUnion(transformer_list=[('featureunion',\n", + " FeatureUnion(transformer_list=[('estimatortransformer',\n", + " EstimatorTransformer(estimator=AdaBoostClassifier(algorithm='SAMME',\n", + " learning_rate=0.0473135874378,\n", + " n_estimators=436)))])),\n", + " ('passthrough',\n", + " Passthrough())])),\n", + " ('linearsvc', LinearSVC(C=0.012266617842, penalty='l1'))])" + ] + }, + "execution_count": 76, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "linear_search_space = tpot2.config.template_search_spaces.get_template_search_spaces(\"linear\")\n", + "linear_search_space.generate().export_pipeline()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "linear_est = tpot2.TPOTEstimator(population_size=10,generations=5, \n", + " scorers=['roc_auc_ovr',tpot2.objectives.complexity_scorer],\n", + " scorers_weights=[1,-1],\n", + " n_jobs=32,\n", + " classification=True,\n", + " search_space = linear_search_space,\n", + " verbose=1,\n", + " )\n", + "\n", + "#alternatively, you can use the template search space to generate a pipeline\n", + "linear_est = tpot2.TPOTEstimator(population_size=10,generations=5, \n", + " scorers=['roc_auc_ovr',tpot2.objectives.complexity_scorer],\n", + " scorers_weights=[1,-1],\n", + " n_jobs=32,\n", + " classification=True,\n", + " search_space = \"linear\",\n", + " verbose=1,\n", + " )" + ] + }, { "cell_type": "markdown", "metadata": {}, diff --git a/Tutorial/3_Feature_Set_Selector.ipynb b/Tutorial/3_Feature_Set_Selector.ipynb index a175a587..47aeb176 100644 --- 
a/Tutorial/3_Feature_Set_Selector.ipynb +++ b/Tutorial/3_Feature_Set_Selector.ipynb @@ -196,78 +196,78 @@ " \n", " \n", " 0\n", - " 2.607470\n", - " -1.799163\n", - " -1.345319\n", - " -3.155746\n", - " -1.731663\n", - " 0.699546\n", - " 0.513305\n", - " 0.507864\n", - " 0.357858\n", - " 0.439859\n", - " 0.695061\n", - " 0.449589\n", + " -0.988411\n", + " -3.270714\n", + " -1.816697\n", + " 0.384124\n", + " 1.258591\n", + " -1.577232\n", + " 0.101273\n", + " 0.657975\n", + " 0.770880\n", + " 0.882366\n", + " 0.637714\n", + " 0.002812\n", " \n", " \n", " 1\n", - " 0.045796\n", - " -2.830673\n", - " 1.578201\n", - " -0.098472\n", - " -0.665334\n", - " -0.130451\n", - " 0.022118\n", - " 0.808068\n", - " 0.158917\n", - " 0.328156\n", - " 0.349374\n", - " 0.927755\n", + " -0.531157\n", + " -1.298541\n", + " -2.630749\n", + " 0.036662\n", + " -2.097307\n", + " -1.711751\n", + " 0.894172\n", + " 0.727579\n", + " 0.211429\n", + " 0.223319\n", + " 0.496683\n", + " 0.840040\n", " \n", " \n", " 2\n", - " 0.490722\n", - " -2.026190\n", - " -1.848381\n", - " -1.112946\n", - " -1.620822\n", - " 3.430459\n", - " 0.166742\n", - " 0.504127\n", - " 0.942156\n", - " 0.556877\n", - " 0.024859\n", - " 0.430831\n", + " -0.896734\n", + " -1.805453\n", + " -2.736948\n", + " -0.310169\n", + " 1.802988\n", + " -0.269441\n", + " 0.765178\n", + " 0.341713\n", + " 0.847770\n", + " 0.696190\n", + " 0.824104\n", + " 0.297523\n", " \n", " \n", " 3\n", - " -1.859338\n", - " -0.196734\n", - " 1.525634\n", - " 0.244376\n", - " -0.685690\n", - " 1.995038\n", - " 0.055226\n", - " 0.751830\n", - " 0.983152\n", - " 0.702334\n", - " 0.750200\n", - " 0.294415\n", + " 1.637719\n", + " -0.930537\n", + " -0.229303\n", + " 0.198907\n", + " 1.184137\n", + " -0.411545\n", + " 0.870378\n", + " 0.811312\n", + " 0.142528\n", + " 0.707361\n", + " 0.201967\n", + " 0.867956\n", " \n", " \n", " 4\n", - " -0.056101\n", - " 1.386592\n", - " 1.552356\n", - " 1.446347\n", - " -0.984449\n", - " 0.742441\n", - " 
0.631411\n", - " 0.217660\n", - " 0.124121\n", - " 0.814294\n", - " 0.131921\n", - " 0.917958\n", + " -1.709777\n", + " -2.701615\n", + " 0.297434\n", + " -0.909832\n", + " 1.436884\n", + " 0.120985\n", + " 0.866854\n", + " 0.352461\n", + " 0.690270\n", + " 0.172950\n", + " 0.056518\n", + " 0.806867\n", " \n", " \n", "\n", @@ -275,18 +275,18 @@ ], "text/plain": [ " a b c d e f g \\\n", - "0 2.607470 -1.799163 -1.345319 -3.155746 -1.731663 0.699546 0.513305 \n", - "1 0.045796 -2.830673 1.578201 -0.098472 -0.665334 -0.130451 0.022118 \n", - "2 0.490722 -2.026190 -1.848381 -1.112946 -1.620822 3.430459 0.166742 \n", - "3 -1.859338 -0.196734 1.525634 0.244376 -0.685690 1.995038 0.055226 \n", - "4 -0.056101 1.386592 1.552356 1.446347 -0.984449 0.742441 0.631411 \n", + "0 -0.988411 -3.270714 -1.816697 0.384124 1.258591 -1.577232 0.101273 \n", + "1 -0.531157 -1.298541 -2.630749 0.036662 -2.097307 -1.711751 0.894172 \n", + "2 -0.896734 -1.805453 -2.736948 -0.310169 1.802988 -0.269441 0.765178 \n", + "3 1.637719 -0.930537 -0.229303 0.198907 1.184137 -0.411545 0.870378 \n", + "4 -1.709777 -2.701615 0.297434 -0.909832 1.436884 0.120985 0.866854 \n", "\n", " h i j k l \n", - "0 0.507864 0.357858 0.439859 0.695061 0.449589 \n", - "1 0.808068 0.158917 0.328156 0.349374 0.927755 \n", - "2 0.504127 0.942156 0.556877 0.024859 0.430831 \n", - "3 0.751830 0.983152 0.702334 0.750200 0.294415 \n", - "4 0.217660 0.124121 0.814294 0.131921 0.917958 " + "0 0.657975 0.770880 0.882366 0.637714 0.002812 \n", + "1 0.727579 0.211429 0.223319 0.496683 0.840040 \n", + "2 0.341713 0.847770 0.696190 0.824104 0.297523 \n", + "3 0.811312 0.142528 0.707361 0.201967 0.867956 \n", + "4 0.352461 0.690270 0.172950 0.056518 0.806867 " ] }, "execution_count": 2, @@ -821,34 +821,34 @@ " \n", " \n", " \n", - " 221\n", - " 2.819930\n", - " -0.677731\n", - " -2.883151\n", + " 28\n", + " -2.393671\n", + " 2.653494\n", + " 1.336840\n", " \n", " \n", - " 121\n", - " -0.068495\n", - " -1.398128\n", - " 1.239476\n", 
+ " 540\n", + " -1.598037\n", + " -2.639941\n", + " -1.787062\n", " \n", " \n", - " 83\n", - " 2.225071\n", - " -1.902156\n", - " 1.425918\n", + " 980\n", + " -1.562249\n", + " 1.573867\n", + " -0.135207\n", " \n", " \n", - " 243\n", - " 2.592917\n", - " 0.764790\n", - " -2.120585\n", + " 812\n", + " 0.084835\n", + " 1.809188\n", + " -1.525609\n", " \n", " \n", - " 992\n", - " 2.329279\n", - " 2.362780\n", - " -3.780878\n", + " 117\n", + " 0.647414\n", + " 1.437139\n", + " 1.873279\n", " \n", " \n", " ...\n", @@ -857,34 +857,34 @@ " ...\n", " \n", " \n", - " 254\n", - " 1.850187\n", - " -2.177608\n", - " -4.088455\n", + " 630\n", + " 0.102721\n", + " 0.463829\n", + " -0.220689\n", " \n", " \n", - " 19\n", - " -0.949434\n", - " 1.062798\n", - " -3.421324\n", + " 963\n", + " -0.530709\n", + " 0.353686\n", + " 0.621369\n", " \n", " \n", - " 956\n", - " 2.105221\n", - " -0.169633\n", - " 1.743979\n", + " 943\n", + " 3.850193\n", + " 0.948248\n", + " -2.042764\n", " \n", " \n", - " 150\n", - " 2.171954\n", - " -1.343100\n", - " -0.346960\n", + " 930\n", + " 1.051634\n", + " 1.240570\n", + " -1.477092\n", " \n", " \n", - " 629\n", - " 0.348571\n", - " -1.171049\n", - " 0.854003\n", + " 116\n", + " -0.126476\n", + " -1.599799\n", + " -0.610169\n", " \n", " \n", "\n", @@ -893,17 +893,17 @@ ], "text/plain": [ " d e f\n", - "221 2.819930 -0.677731 -2.883151\n", - "121 -0.068495 -1.398128 1.239476\n", - "83 2.225071 -1.902156 1.425918\n", - "243 2.592917 0.764790 -2.120585\n", - "992 2.329279 2.362780 -3.780878\n", + "28 -2.393671 2.653494 1.336840\n", + "540 -1.598037 -2.639941 -1.787062\n", + "980 -1.562249 1.573867 -0.135207\n", + "812 0.084835 1.809188 -1.525609\n", + "117 0.647414 1.437139 1.873279\n", ".. ... ... 
...\n", - "254 1.850187 -2.177608 -4.088455\n", - "19 -0.949434 1.062798 -3.421324\n", - "956 2.105221 -0.169633 1.743979\n", - "150 2.171954 -1.343100 -0.346960\n", - "629 0.348571 -1.171049 0.854003\n", + "630 0.102721 0.463829 -0.220689\n", + "963 -0.530709 0.353686 0.621369\n", + "943 3.850193 0.948248 -2.042764\n", + "930 1.051634 1.240570 -1.477092\n", + "116 -0.126476 -1.599799 -0.610169\n", "\n", "[750 rows x 3 columns]" ] @@ -923,55 +923,13 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "We can now use this when defining our pipelines. \n", - "For this first example, we will construct a simple linear pipeline where the first step is a feature set selector, and the second is a classifier" + "Under the hood, mutation will randomly select another feature set and crossover will swap the feature sets selected by two individuals" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 100%|██████████| 5/5 [00:30<00:00, 6.12s/it]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "0.8989320638188367\n" - ] - } - ], - "source": [ - "\n", - "classification_search_space = get_search_space([\"RandomForestClassifier\"])\n", - "fss_and_classifier_search_space = SequentialPipeline([fss_search_space, classification_search_space])\n", - "\n", - "\n", - "est = tpot2.TPOTEstimator(generations=5, \n", - " scorers=[\"roc_auc_ovr\", tpot2.objectives.complexity_scorer],\n", - " scorers_weights=[1.0, -1.0],\n", - " n_jobs=32,\n", - " classification=True,\n", - " search_space = fss_and_classifier_search_space,\n", - " verbose=1,\n", - " )\n", - "\n", - "\n", - "scorer = sklearn.metrics.get_scorer('roc_auc_ovo')\n", - "est.fit(X_train, y_train)\n", - "print(scorer(est, X_test, y_test))" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, "outputs": [ { "data": { @@ -1380,77 +1338,25 @@ " /* fitted */\n", " 
background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
Pipeline(steps=[('featuresetselector',\n",
-       "                 FeatureSetSelector(name='group_two',\n",
-       "                                    sel_subset=['d', 'e', 'f'])),\n",
-       "                ('randomforestclassifier',\n",
-       "                 RandomForestClassifier(class_weight='balanced',\n",
-       "                                        max_features=0.7587731972584,\n",
-       "                                        min_samples_leaf=7,\n",
-       "                                        min_samples_split=12,\n",
-       "                                        n_estimators=128))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
FeatureSetSelector(name='group_two', sel_subset=['d', 'e', 'f'])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "Pipeline(steps=[('featuresetselector',\n", - " FeatureSetSelector(name='group_two',\n", - " sel_subset=['d', 'e', 'f'])),\n", - " ('randomforestclassifier',\n", - " RandomForestClassifier(class_weight='balanced',\n", - " max_features=0.7587731972584,\n", - " min_samples_leaf=7,\n", - " min_samples_split=12,\n", - " n_estimators=128))])" + "FeatureSetSelector(name='group_two', sel_subset=['d', 'e', 'f'])" ] }, - "execution_count": 8, + "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "est.fitted_pipeline_" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "With this setup TPOT is able to identify one of the subsets used, but the performance is not optimal. In this case we happen to know that multiple feature sets are required. If we want to include multiple features in our pipelines, we will have to modify our search space. There are three options for this.\n", - "\n", - "1. UnionPipeline - This allows you to have a fixed number of feature sets selected. If you use a UnionPipeline with two FSSNodes, you will always select two feature sets that are simply concatenated together.\n", - "2. DynamicUnionPipeline - This space allows multiple FSSNodes to be selected. Unlike UnionPipeline you don't have to specify the number of selected sets, TPOT will identify the number of sets that are optimal. Additionally, with DynamicUnionPipeline, the same feature set cannot be selected twice. Note that while DynamicUnionPipeline can select multiple feature sets, it never mixes two feature sets together.\n", - "3. GraphSearchPipeline - When set as the leave_search_space, GraphSearchPipeline can also select multiple FSSNodes which act as an input to the rest of the pipeline." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### UnionPipeline + FSSNode example" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [], - "source": [ - "union_fss_space = UnionPipeline([fss_search_space, fss_search_space])" + "ind1 = fss_search_space.generate(rng=1)\n", + "ind1.export_pipeline()" ] }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 8, "metadata": {}, "outputs": [ { @@ -1860,219 +1766,74 @@ " /* fitted */\n", " background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
FeatureUnion(transformer_list=[('featuresetselector-1',\n",
-       "                                FeatureSetSelector(name='group_two',\n",
-       "                                                   sel_subset=['d', 'e', 'f'])),\n",
-       "                               ('featuresetselector-2',\n",
-       "                                FeatureSetSelector(name='group_three',\n",
-       "                                                   sel_subset=['g', 'h',\n",
-       "                                                               'i']))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
FeatureSetSelector(name='group_three', sel_subset=['g', 'h', 'i'])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "FeatureUnion(transformer_list=[('featuresetselector-1',\n", - " FeatureSetSelector(name='group_two',\n", - " sel_subset=['d', 'e', 'f'])),\n", - " ('featuresetselector-2',\n", - " FeatureSetSelector(name='group_three',\n", - " sel_subset=['g', 'h',\n", - " 'i']))])" + "FeatureSetSelector(name='group_three', sel_subset=['g', 'h', 'i'])" ] }, - "execution_count": 10, + "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "# this union search space will always select exactly two fss_search_space\n", - "selector1 = union_fss_space.generate(rng=1).export_pipeline()\n", - "selector1" + "ind1.mutate()\n", + "ind1.export_pipeline()" ] }, { - "cell_type": "code", - "execution_count": 11, + "cell_type": "markdown", "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
defghi
2212.819930-0.677731-2.8831510.4203930.9304310.223210
121-0.068495-1.3981281.2394760.5837170.8729670.355258
832.225071-1.9021561.4259180.5217140.6243030.526185
2432.5929170.764790-2.1205850.7483970.0992240.640145
9922.3292792.362780-3.7808780.2338200.0544630.025899
.....................
2541.850187-2.177608-4.0884550.9550170.3632700.010256
19-0.9494341.062798-3.4213240.0942150.8376560.856361
9562.105221-0.1696331.7439790.2663650.4467270.356117
1502.171954-1.343100-0.3469600.7126730.9833440.873176
6290.348571-1.1710490.8540030.7464990.1014730.609367
\n", - "

750 rows × 6 columns

\n", - "
" - ], - "text/plain": [ - " d e f g h i\n", - "221 2.819930 -0.677731 -2.883151 0.420393 0.930431 0.223210\n", - "121 -0.068495 -1.398128 1.239476 0.583717 0.872967 0.355258\n", - "83 2.225071 -1.902156 1.425918 0.521714 0.624303 0.526185\n", - "243 2.592917 0.764790 -2.120585 0.748397 0.099224 0.640145\n", - "992 2.329279 2.362780 -3.780878 0.233820 0.054463 0.025899\n", - ".. ... ... ... ... ... ...\n", - "254 1.850187 -2.177608 -4.088455 0.955017 0.363270 0.010256\n", - "19 -0.949434 1.062798 -3.421324 0.094215 0.837656 0.856361\n", - "956 2.105221 -0.169633 1.743979 0.266365 0.446727 0.356117\n", - "150 2.171954 -1.343100 -0.346960 0.712673 0.983344 0.873176\n", - "629 0.348571 -1.171049 0.854003 0.746499 0.101473 0.609367\n", - "\n", - "[750 rows x 6 columns]" - ] - }, - "execution_count": 11, - "metadata": {}, - "output_type": "execute_result" - } - ], "source": [ - "selector1.set_output(transform=\"pandas\") \n", - "selector1.fit(X_train)\n", - "selector1.transform(X_train)" + "We can now use this when defining our pipelines. \n", + "For this first example, we will construct a simple linear pipeline where the first step is a feature set selector, and the second is a classifier" ] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": 9, "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Generation: 100%|██████████| 5/5 [00:30<00:00, 6.11s/it]\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0.90263107355483\n" + ] + } + ], "source": [ - "#### DynamicUnionPipeline + FSSNode example\n", - "The dynamic union pipeline may select a variable number of feature sets." 
+ "\n", + "classification_search_space = get_search_space([\"RandomForestClassifier\"])\n", + "fss_and_classifier_search_space = SequentialPipeline([fss_search_space, classification_search_space])\n", + "\n", + "\n", + "est = tpot2.TPOTEstimator(generations=5, \n", + " scorers=[\"roc_auc_ovr\", tpot2.objectives.complexity_scorer],\n", + " scorers_weights=[1.0, -1.0],\n", + " n_jobs=32,\n", + " classification=True,\n", + " search_space = fss_and_classifier_search_space,\n", + " verbose=1,\n", + " )\n", + "\n", + "\n", + "scorer = sklearn.metrics.get_scorer('roc_auc_ovr')\n", + "est.fit(X_train, y_train)\n", + "print(scorer(est, X_test, y_test))" ] }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 10, "metadata": {}, "outputs": [ { @@ -2482,44 +2243,84 @@ " /* fitted */\n", " background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
FeatureUnion(transformer_list=[('featuresetselector',\n",
-       "                                FeatureSetSelector(name='group_three',\n",
-       "                                                   sel_subset=['g', 'h',\n",
-       "                                                               'i']))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
Pipeline(steps=[('featuresetselector',\n",
+       "                 FeatureSetSelector(name='group_one',\n",
+       "                                    sel_subset=['a', 'b', 'c'])),\n",
+       "                ('randomforestclassifier',\n",
+       "                 RandomForestClassifier(criterion='entropy',\n",
+       "                                        max_features=0.4070021568844,\n",
+       "                                        min_samples_leaf=4, min_samples_split=3,\n",
+       "                                        n_estimators=128))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "FeatureUnion(transformer_list=[('featuresetselector',\n", - " FeatureSetSelector(name='group_three',\n", - " sel_subset=['g', 'h',\n", - " 'i']))])" + "Pipeline(steps=[('featuresetselector',\n", + " FeatureSetSelector(name='group_one',\n", + " sel_subset=['a', 'b', 'c'])),\n", + " ('randomforestclassifier',\n", + " RandomForestClassifier(criterion='entropy',\n", + " max_features=0.4070021568844,\n", + " min_samples_leaf=4, min_samples_split=3,\n", + " n_estimators=128))])" ] }, - "execution_count": 12, + "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "dynamic_fss_space = DynamicUnionPipeline(fss_search_space)\n", - "dynamic_fss_space.generate(rng=1).export_pipeline()" + "est.fitted_pipeline_" ] }, { - "cell_type": "code", - "execution_count": 13, + "cell_type": "markdown", "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
FeatureUnion(transformer_list=[('featuresetselector-1',\n",
-       "                                FeatureSetSelector(name='group_one',\n",
-       "                                                   sel_subset=['a', 'b', 'c'])),\n",
+       "                                FeatureSetSelector(name='group_two',\n",
+       "                                                   sel_subset=['d', 'e', 'f'])),\n",
        "                               ('featuresetselector-2',\n",
-       "                                FeatureSetSelector(name='group_four',\n",
-       "                                                   sel_subset=['j', 'k',\n",
-       "                                                               'l']))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "FeatureUnion(transformer_list=[('featuresetselector-1',\n", - " FeatureSetSelector(name='group_one',\n", - " sel_subset=['a', 'b', 'c'])),\n", - " ('featuresetselector-2',\n", - " FeatureSetSelector(name='group_four',\n", - " sel_subset=['j', 'k',\n", - " 'l']))])" + "Pipeline(steps=[('featureunion',\n", + " FeatureUnion(transformer_list=[('featuresetselector-1',\n", + " FeatureSetSelector(name='group_one',\n", + " sel_subset=['a',\n", + " 'b',\n", + " 'c'])),\n", + " ('featuresetselector-2',\n", + " FeatureSetSelector(name='group_two',\n", + " sel_subset=['d',\n", + " 'e',\n", + " 'f']))])),\n", + " ('randomforestclassifier',\n", + " RandomForestClassifier(bootstrap=False,\n", + " class_weight='balanced',\n", + " max_features=0.4909664847192,\n", + " min_samples_leaf=2, min_samples_split=4,\n", + " n_estimators=128))])" ] }, - "execution_count": 13, + "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "dynamic_fss_space.generate(rng=3).export_pipeline()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### GraphSearchPipeline + FSSNode example\n", - "\n", - "FSSNodes must be set as the leaf search space as they act as the inputs to the pipeline.\n", - "\n", - "Here is an example pipeline from this search space that utilizes two feature sets." - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [ - { - "data": { - "image/png": "", - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], - "source": [ - "graph_search_space = tpot2.search_spaces.pipelines.GraphSearchPipeline(\n", - " leaf_search_space = fss_search_space,\n", - " inner_search_space = tpot2.config.get_search_space([\"transformers\"]),\n", - " root_search_space= tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"]),\n", - " max_size = 10,\n", - ")\n", - "\n", - "graph_search_space.generate(rng=4).export_pipeline().plot()" + "est.fitted_pipeline_" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "#### Optimize with TPOT\n", - "\n", - "For this example, we will optimize the DynamicUnion search space" - ] - }, - { - "cell_type": "code", - "execution_count": 15, - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 100%|██████████| 5/5 [00:34<00:00, 6.96s/it]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "0.9916366443643849\n" - ] - } - ], - "source": [ - "import tpot2\n", - "import sklearn.datasets\n", - "from sklearn.linear_model import LogisticRegression\n", - "import numpy as np\n", - "\n", - "\n", - "final_classification_search_space = SequentialPipeline([dynamic_fss_space, classification_search_space])\n", - "\n", - "est = tpot2.TPOTEstimator(generations=5, \n", - " scorers=[\"roc_auc_ovr\", tpot2.objectives.complexity_scorer],\n", - " scorers_weights=[1.0, -1.0],\n", - " n_jobs=32,\n", - " classification=True,\n", - " search_space = final_classification_search_space,\n", - " verbose=1,\n", - " )\n", + "### Combining with existing search spaces\n", "\n", + "As with all search spaces, FSSNode can be combined with any other search space. 
\n", "\n", - "scorer = sklearn.metrics.get_scorer('roc_auc_ovo')\n", - "\n", - "est.fit(X_train, y_train)\n", - "print(scorer(est, X_test, y_test))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We can see that this pipeline performed slightly better and correctly identified group one and group two as the feature sets used in the generative equation." + "You can also pair this with the existing prebuilt templates, for example:" ] }, { "cell_type": "code", - "execution_count": 16, + "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
Pipeline(steps=[('featureunion',\n",
-       "                 FeatureUnion(transformer_list=[('featuresetselector-1',\n",
-       "                                                 FeatureSetSelector(name='group_one',\n",
-       "                                                                    sel_subset=['a',\n",
-       "                                                                                'b',\n",
-       "                                                                                'c'])),\n",
-       "                                                ('featuresetselector-2',\n",
-       "                                                 FeatureSetSelector(name='group_two',\n",
-       "                                                                    sel_subset=['d',\n",
-       "                                                                                'e',\n",
-       "                                                                                'f']))])),\n",
-       "                ('randomforestclassifier',\n",
-       "                 RandomForestClassifier(bootstrap=False,\n",
-       "                                        class_weight='balanced',\n",
-       "                                        max_features=0.3314976075207,\n",
-       "                                        min_samples_leaf=3,\n",
-       "                                        min_samples_split=15,\n",
-       "                                        n_estimators=128))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
Pipeline(steps=[('featuresetselector',\n",
+       "                 FeatureSetSelector(name='group_two', sel_subset=[3, 4, 5])),\n",
+       "                ('pipeline',\n",
+       "                 Pipeline(steps=[('maxabsscaler', MaxAbsScaler()),\n",
+       "                                 ('rfe',\n",
+       "                                  RFE(estimator=ExtraTreesClassifier(max_features=0.0390676831531,\n",
+       "                                                                     min_samples_leaf=8,\n",
+       "                                                                     min_samples_split=14,\n",
+       "                                                                     n_jobs=1),\n",
+       "                                      step=0.753983388654)),\n",
+       "                                 ('featureunion-1',\n",
+       "                                  FeatureUnion(transformer_list=[('f...\n",
+       "                                                                  FeatureUnion(transformer_list=[('powertransformer',\n",
+       "                                                                                                  PowerTransformer()),\n",
+       "                                                                                                 ('pca',\n",
+       "                                                                                                  PCA(n_components=0.9286371732844))])),\n",
+       "                                                                 ('passthrough',\n",
+       "                                                                  Passthrough())])),\n",
+       "                                 ('featureunion-2',\n",
+       "                                  FeatureUnion(transformer_list=[('skiptransformer',\n",
+       "                                                                  SkipTransformer()),\n",
+       "                                                                 ('passthrough',\n",
+       "                                                                  Passthrough())])),\n",
+       "                                 ('kneighborsclassifier',\n",
+       "                                  KNeighborsClassifier(n_jobs=1, n_neighbors=21,\n",
+       "                                                       weights='distance'))]))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "Pipeline(steps=[('featureunion',\n", - " FeatureUnion(transformer_list=[('featuresetselector-1',\n", - " FeatureSetSelector(name='group_one',\n", - " sel_subset=['a',\n", - " 'b',\n", - " 'c'])),\n", - " ('featuresetselector-2',\n", - " FeatureSetSelector(name='group_two',\n", - " sel_subset=['d',\n", - " 'e',\n", - " 'f']))])),\n", - " ('randomforestclassifier',\n", - " RandomForestClassifier(bootstrap=False,\n", - " class_weight='balanced',\n", - " max_features=0.3314976075207,\n", - " min_samples_leaf=3,\n", - " min_samples_split=15,\n", - " n_estimators=128))])" + "Pipeline(steps=[('featuresetselector',\n", + " FeatureSetSelector(name='group_two', sel_subset=[3, 4, 5])),\n", + " ('pipeline',\n", + " Pipeline(steps=[('maxabsscaler', MaxAbsScaler()),\n", + " ('rfe',\n", + " RFE(estimator=ExtraTreesClassifier(max_features=0.0390676831531,\n", + " min_samples_leaf=8,\n", + " min_samples_split=14,\n", + " n_jobs=1),\n", + " step=0.753983388654)),\n", + " ('featureunion-1',\n", + " FeatureUnion(transformer_list=[('f...\n", + " FeatureUnion(transformer_list=[('powertransformer',\n", + " PowerTransformer()),\n", + " ('pca',\n", + " PCA(n_components=0.9286371732844))])),\n", + " ('passthrough',\n", + " Passthrough())])),\n", + " ('featureunion-2',\n", + " FeatureUnion(transformer_list=[('skiptransformer',\n", + " SkipTransformer()),\n", + " ('passthrough',\n", + " Passthrough())])),\n", + " ('kneighborsclassifier',\n", + " KNeighborsClassifier(n_jobs=1, n_neighbors=21,\n", + " weights='distance'))]))])" ] }, - "execution_count": 16, + "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "est.fitted_pipeline_" + "linear_search_space = tpot2.config.template_search_spaces.get_template_search_spaces(\"linear\", classification=True)\n", + "fss_and_linear_search_space = SequentialPipeline([fss_search_space, linear_search_space])\n", + "\n", + "# est = tpot2.TPOTEstimator( \n", + "# population_size=32,\n", + "# 
generations=10, \n", + "# scorers=[\"roc_auc_ovr\", tpot2.objectives.complexity_scorer],\n", + "# scorers_weights=[1.0, -1.0],\n", + "# other_objective_functions=[number_of_selected_features],\n", + "# other_objective_functions_weights = [-1],\n", + "# objective_function_names = [\"Number of selected features\"],\n", + "\n", + "# n_jobs=32,\n", + "# classification=True,\n", + "# search_space = fss_and_linear_search_space,\n", + "# verbose=2,\n", + "# )\n", + "\n", + "fss_and_linear_search_space.generate(rng=1).export_pipeline()" ] }, { @@ -3548,7 +4969,7 @@ }, { "cell_type": "code", - "execution_count": 36, + "execution_count": 19, "metadata": {}, "outputs": [], "source": [ @@ -3566,13 +4987,13 @@ }, { "cell_type": "code", - "execution_count": 40, + "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
Pipeline(steps=[('featureunion',\n",
+       "
Pipeline(steps=[('featureunion',\n",
        "                 FeatureUnion(transformer_list=[('pipeline-1',\n",
        "                                                 Pipeline(steps=[('featuresetselector',\n",
        "                                                                  FeatureSetSelector(name='group_one',\n",
-       "                                                                                     sel_subset=[0,\n",
-       "                                                                                                 1,\n",
-       "                                                                                                 2])),\n",
+       "                                                                                     sel_subset=['a',\n",
+       "                                                                                                 'b',\n",
+       "                                                                                                 'c'])),\n",
        "                                                                 ('pipeline',\n",
        "                                                                  Pipeline(steps=[('featureunion',\n",
        "                                                                                   FeatureUnion(transformer_list=[('featureunion',\n",
        "                                                                                                                   FeatureUnion(transformer_list=[('pca',\n",
        "                                                                                                                                                   PCA(n_components=0.93113403057))])),\n",
-       "                                                                                                                  ('passthrough...\n",
+       "                                                                                                                  ('passt...\n",
        "                                                                                   FeatureUnion(transformer_list=[('featureunion',\n",
        "                                                                                                                   FeatureUnion(transformer_list=[('quantiletransformer',\n",
        "                                                                                                                                                   QuantileTransformer(n_quantiles=87)),\n",
@@ -4001,19 +5422,19 @@
        "                                        criterion='entropy',\n",
        "                                        max_features=0.021545996678,\n",
        "                                        min_samples_leaf=11,\n",
-       "                                        n_estimators=128))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "Pipeline(steps=[('maxabsscaler', MaxAbsScaler()),\n", - " ('selectfwe', SelectFwe(alpha=0.0142080454732)),\n", + "Pipeline(steps=[('passthrough', Passthrough()),\n", + " ('selectfwe', SelectFwe(alpha=0.0012275167982)),\n", " ('featureunion-1',\n", - " FeatureUnion(transformer_list=[('featureunion',\n", - " FeatureUnion(transformer_list=[('passkbinsdiscretizer',\n", - " PassKBinsDiscretizer(n_bins=8))])),\n", + " FeatureUnion(transformer_list=[('skiptransformer',\n", + " SkipTransformer()),\n", " ('passthrough',\n", " Passthrough())])),\n", " ('featureunion-2',\n", @@ -863,13 +911,12 @@ " SkipTransformer()),\n", " ('passthrough',\n", " Passthrough())])),\n", - " ('logisticregression',\n", - " LogisticRegression(C=462.7983711938423,\n", - " class_weight='balanced', max_iter=1000,\n", - " n_jobs=1, solver='saga'))])" + " ('adaboostclassifier',\n", + " AdaBoostClassifier(learning_rate=0.9052253032837,\n", + " n_estimators=273))])" ] }, - "execution_count": 6, + "execution_count": 10, "metadata": {}, "output_type": "execute_result" } @@ -881,22 +928,22 @@ }, { "cell_type": "code", - "execution_count": 7, + "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "array([0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0,\n", - " 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1,\n", - " 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0,\n", - " 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1,\n", - " 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0,\n", - " 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1,\n", - " 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1])" + "array([1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0,\n", + " 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1,\n", + " 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1,\n", + " 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 
1, 1, 0, 0,\n", + " 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1,\n", + " 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1,\n", + " 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1])" ] }, - "execution_count": 7, + "execution_count": 11, "metadata": {}, "output_type": "execute_result" } @@ -916,7 +963,7 @@ }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 12, "metadata": {}, "outputs": [], "source": [ @@ -953,7 +1000,7 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": 13, "metadata": {}, "outputs": [ { @@ -962,7 +1009,7 @@ "['roc_auc_score', 'complexity_scorer']" ] }, - "execution_count": 9, + "execution_count": 13, "metadata": {}, "output_type": "execute_result" } @@ -974,7 +1021,7 @@ }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 14, "metadata": {}, "outputs": [ { @@ -1014,73 +1061,73 @@ " \n", " \n", " 0\n", - " 0.994779\n", - " 38.8\n", + " 0.964012\n", + " 1745.5\n", " NaN\n", " NaN\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 0.0\n", - " 1.727192e+09\n", - " 1.727192e+09\n", + " 1.727568e+09\n", + " 1.727568e+09\n", " None\n", " NaN\n", - " (Normalizer(norm='l1'), Passthrough(), Feature...\n", + " (Normalizer(norm='l1'), SelectPercentile(perce...\n", " \n", " \n", " 1\n", - " 0.884608\n", - " 12.0\n", + " NaN\n", + " NaN\n", " NaN\n", " NaN\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 0.0\n", - " 1.727192e+09\n", - " 1.727192e+09\n", - " None\n", + " 1.727568e+09\n", + " 1.727568e+09\n", + " INVALID\n", " NaN\n", - " (MinMaxScaler(), SelectPercentile(percentile=6...\n", + " (MaxAbsScaler(), SelectFromModel(estimator=Ext...\n", " \n", " \n", " 2\n", - " 0.995994\n", - " 277.0\n", + " NaN\n", + " NaN\n", " NaN\n", " NaN\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 0.0\n", - " 1.727192e+09\n", - " 1.727192e+09\n", - " None\n", + " 1.727568e+09\n", + " 1.727568e+09\n", + " INVALID\n", " NaN\n", - " (MaxAbsScaler(), 
Passthrough(), FeatureUnion(t...\n", + " (MaxAbsScaler(), VarianceThreshold(threshold=0...\n", " \n", " \n", " 3\n", - " 0.969714\n", - " 97.0\n", + " NaN\n", + " NaN\n", " NaN\n", " NaN\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 0.0\n", - " 1.727192e+09\n", - " 1.727192e+09\n", - " None\n", + " 1.727568e+09\n", + " 1.727568e+09\n", + " INVALID\n", " NaN\n", - " (RobustScaler(quantile_range=(0.1797291876324,...\n", + " (Normalizer(norm='l1'), RFE(estimator=ExtraTre...\n", " \n", " \n", " 4\n", - " 0.977700\n", - " 10.0\n", + " 0.991667\n", + " 24030.0\n", " NaN\n", " NaN\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 0.0\n", - " 1.727192e+09\n", - " 1.727192e+09\n", + " 1.727568e+09\n", + " 1.727568e+09\n", " None\n", " NaN\n", - " (RobustScaler(quantile_range=(0.0723902721638,...\n", + " (RobustScaler(quantile_range=(0.1798922078332,...\n", " \n", " \n", " ...\n", @@ -1097,106 +1144,93 @@ " ...\n", " \n", " \n", - " 295\n", - " 0.994663\n", - " 17.0\n", - " (245, 2)\n", - " ind_crossover\n", + " 345\n", + " 0.992793\n", + " 4374.0\n", + " (237, 237)\n", + " ind_mutate\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 5.0\n", - " 1.727192e+09\n", - " 1.727192e+09\n", + " 6.0\n", + " 1.727568e+09\n", + " 1.727568e+09\n", " None\n", - " 1.0\n", - " (MinMaxScaler(), SelectPercentile(percentile=3...\n", + " NaN\n", + " (Passthrough(), SelectFwe(alpha=0.022268001122...\n", " \n", " \n", - " 296\n", - " NaN\n", - " NaN\n", - " (190, 190)\n", + " 346\n", + " 0.520972\n", + " 9.0\n", + " (128, 128)\n", " ind_mutate\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 5.0\n", - " 1.727192e+09\n", - " 1.727192e+09\n", - " INVALID\n", + " 6.0\n", + " 1.727568e+09\n", + " 1.727568e+09\n", + " None\n", " NaN\n", - " (RobustScaler(quantile_range=(0.0790382918495,...\n", + " (MaxAbsScaler(), RFE(estimator=ExtraTreesClass...\n", " \n", " \n", - " 297\n", - " 0.995224\n", - " 231.0\n", - " (168, 232)\n", + " 347\n", + " 
NaN\n", + " NaN\n", + " (109, 85)\n", " ind_crossover\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 5.0\n", - " 1.727192e+09\n", - " 1.727192e+09\n", - " None\n", + " 6.0\n", + " 1.727568e+09\n", + " 1.727568e+09\n", + " INVALID\n", " NaN\n", - " (MaxAbsScaler(), SelectFwe(alpha=0.03220532277...\n", + " (StandardScaler(), SelectPercentile(percentile...\n", " \n", " \n", - " 298\n", - " 0.974412\n", - " 10.0\n", - " (93, 205)\n", - " ind_mutate , ind_mutate , ind_crossover\n", + " 348\n", + " 0.976466\n", + " 21.0\n", + " (296, 128)\n", + " ind_crossover , ind_mutate\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 5.0\n", - " 1.727192e+09\n", - " 1.727192e+09\n", + " 6.0\n", + " 1.727568e+09\n", + " 1.727568e+09\n", " None\n", " NaN\n", - " (MaxAbsScaler(), Passthrough(), FeatureUnion(t...\n", + " (Passthrough(), RFE(estimator=ExtraTreesClassi...\n", " \n", " \n", - " 299\n", - " 0.932393\n", - " 8.0\n", - " (234, 234)\n", - " ind_mutate\n", + " 349\n", + " 0.990725\n", + " 14.0\n", + " (297, 213)\n", + " ind_crossover\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 5.0\n", - " 1.727192e+09\n", - " 1.727192e+09\n", + " 6.0\n", + " 1.727568e+09\n", + " 1.727568e+09\n", " None\n", " NaN\n", - " (Normalizer(norm='l1'), VarianceThreshold(thre...\n", + " (MinMaxScaler(), SelectFwe(alpha=0.00016890355...\n", " \n", " \n", "\n", - "

300 rows × 11 columns

\n", + "

350 rows × 11 columns

\n", "
" ], "text/plain": [ - " roc_auc_score complexity_scorer Parents \\\n", - "0 0.994779 38.8 NaN \n", - "1 0.884608 12.0 NaN \n", - "2 0.995994 277.0 NaN \n", - "3 0.969714 97.0 NaN \n", - "4 0.977700 10.0 NaN \n", - ".. ... ... ... \n", - "295 0.994663 17.0 (245, 2) \n", - "296 NaN NaN (190, 190) \n", - "297 0.995224 231.0 (168, 232) \n", - "298 0.974412 10.0 (93, 205) \n", - "299 0.932393 8.0 (234, 234) \n", - "\n", - " Variation_Function \\\n", - "0 NaN \n", - "1 NaN \n", - "2 NaN \n", - "3 NaN \n", - "4 NaN \n", - ".. ... \n", - "295 ind_crossover \n", - "296 ind_mutate \n", - "297 ind_crossover \n", - "298 ind_mutate , ind_mutate , ind_crossover \n", - "299 ind_mutate \n", + " roc_auc_score complexity_scorer Parents Variation_Function \\\n", + "0 0.964012 1745.5 NaN NaN \n", + "1 NaN NaN NaN NaN \n", + "2 NaN NaN NaN NaN \n", + "3 NaN NaN NaN NaN \n", + "4 0.991667 24030.0 NaN NaN \n", + ".. ... ... ... ... \n", + "345 0.992793 4374.0 (237, 237) ind_mutate \n", + "346 0.520972 9.0 (128, 128) ind_mutate \n", + "347 NaN NaN (109, 85) ind_crossover \n", + "348 0.976466 21.0 (296, 128) ind_crossover , ind_mutate \n", + "349 0.990725 14.0 (297, 213) ind_crossover \n", "\n", " Individual Generation \\\n", "0 " ] @@ -1276,7 +1310,7 @@ }, { "data": { - "image/png": "", + "image/png": "", "text/plain": [ "
" ] @@ -1308,7 +1342,7 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 16, "metadata": {}, "outputs": [ { @@ -1347,231 +1381,164 @@ " \n", " \n", " \n", - " 211\n", - " 0.997632\n", - " 231.0\n", - " (2, 2)\n", - " ind_mutate\n", + " 330\n", + " 0.995556\n", + " 4373.0\n", + " (237, 52)\n", + " ind_crossover\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 4.0\n", - " 1.727192e+09\n", - " 1.727192e+09\n", + " 6.0\n", + " 1.727568e+09\n", + " 1.727568e+09\n", " None\n", " 1.0\n", - " (MaxAbsScaler(), SelectFwe(alpha=0.01420804547...\n", + " (Passthrough(), SelectFwe(alpha=0.001227516798...\n", " \n", " \n", - " 251\n", - " 0.996697\n", - " 213.9\n", - " (211, 211)\n", + " 144\n", + " 0.995000\n", + " 68.6\n", + " (61, 61)\n", " ind_mutate\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 5.0\n", - " 1.727192e+09\n", - " 1.727192e+09\n", - " None\n", - " 1.0\n", - " (MaxAbsScaler(), VarianceThreshold(threshold=0...\n", - " \n", - " \n", - " 261\n", - " 0.996394\n", - " 182.4\n", - " (211, 182)\n", - " ind_crossover\n", - " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 5.0\n", - " 1.727192e+09\n", - " 1.727192e+09\n", + " 2.0\n", + " 1.727568e+09\n", + " 1.727568e+09\n", " None\n", " 1.0\n", - " (MaxAbsScaler(), SelectFwe(alpha=0.01420804547...\n", + " (RobustScaler(quantile_range=(0.2808423658106,...\n", " \n", " \n", - " 132\n", - " 0.996313\n", - " 33.0\n", - " (85, 18)\n", - " ind_crossover\n", + " 320\n", + " 0.994059\n", + " 31.0\n", + " (184, 184)\n", + " ind_mutate\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 2.0\n", - " 1.727192e+09\n", - " 1.727192e+09\n", + " 6.0\n", + " 1.727568e+09\n", + " 1.727568e+09\n", " None\n", " 1.0\n", - " (StandardScaler(), SelectFwe(alpha=0.000377258...\n", + " (MaxAbsScaler(), SelectFwe(alpha=0.01352548659...\n", " \n", " \n", - " 190\n", - " 0.996207\n", - " 27.5\n", - " (141, 122)\n", - " ind_mutate , ind_mutate , ind_crossover\n", + " 
161\n", + " 0.994028\n", + " 23.2\n", + " (123, 123)\n", + " ind_mutate\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 3.0\n", - " 1.727192e+09\n", - " 1.727192e+09\n", + " 1.727568e+09\n", + " 1.727568e+09\n", " None\n", " 1.0\n", - " (MaxAbsScaler(), VarianceThreshold(threshold=0...\n", + " (MaxAbsScaler(), SelectFromModel(estimator=Ext...\n", " \n", " \n", - " 250\n", - " 0.995593\n", - " 24.3\n", - " (173, 173)\n", + " 297\n", + " 0.992577\n", + " 13.0\n", + " (193, 193)\n", " ind_mutate\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", " 5.0\n", - " 1.727192e+09\n", - " 1.727192e+09\n", + " 1.727568e+09\n", + " 1.727568e+09\n", " None\n", " 1.0\n", - " (MinMaxScaler(), SelectFwe(alpha=0.00034913463...\n", + " (MaxAbsScaler(), SelectFwe(alpha=0.00098089598...\n", " \n", " \n", - " 295\n", - " 0.994663\n", - " 17.0\n", - " (245, 2)\n", - " ind_crossover\n", + " 306\n", + " 0.991165\n", + " 8.0\n", + " (167, 167)\n", + " ind_mutate\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 5.0\n", - " 1.727192e+09\n", - " 1.727192e+09\n", + " 6.0\n", + " 1.727568e+09\n", + " 1.727568e+09\n", " None\n", " 1.0\n", - " (MinMaxScaler(), SelectPercentile(percentile=3...\n", + " (MaxAbsScaler(), SelectFwe(alpha=0.00057722163...\n", " \n", " \n", - " 227\n", - " 0.990767\n", - " 11.0\n", - " (188, 168)\n", + " 106\n", + " 0.965015\n", + " 7.0\n", + " (11, 85)\n", " ind_crossover\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 4.0\n", - " 1.727192e+09\n", - " 1.727192e+09\n", - " None\n", - " 1.0\n", - " (Passthrough(), SelectFwe(alpha=0.000109999882...\n", - " \n", - " \n", - " 226\n", - " 0.989583\n", - " 9.0\n", - " (144, 90)\n", - " ind_mutate , ind_mutate , ind_crossover\n", - " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 4.0\n", - " 1.727192e+09\n", - " 1.727192e+09\n", - " None\n", - " 1.0\n", - " (StandardScaler(), VarianceThreshold(threshold...\n", - " \n", - " \n", - " 213\n", - " 0.976500\n", - " 
8.1\n", - " (151, 151)\n", - " ind_crossover , ind_mutate\n", - " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 4.0\n", - " 1.727192e+09\n", - " 1.727192e+09\n", + " 2.0\n", + " 1.727568e+09\n", + " 1.727568e+09\n", " None\n", " 1.0\n", - " (MaxAbsScaler(), SelectFwe(alpha=0.03007503043...\n", + " (StandardScaler(), SelectPercentile(percentile...\n", " \n", " \n", - " 290\n", - " 0.960114\n", + " 195\n", + " 0.945486\n", " 6.0\n", - " (240, 240)\n", + " (25, 25)\n", " ind_mutate\n", " <tpot2.search_spaces.pipelines.sequential.Sequ...\n", - " 5.0\n", - " 1.727192e+09\n", - " 1.727192e+09\n", + " 3.0\n", + " 1.727568e+09\n", + " 1.727568e+09\n", " None\n", " 1.0\n", - " (MinMaxScaler(), VarianceThreshold(threshold=0...\n", + " (MaxAbsScaler(), SelectFwe(alpha=0.00098089598...\n", " \n", " \n", "\n", "
" ], "text/plain": [ - " roc_auc_score complexity_scorer Parents \\\n", - "211 0.997632 231.0 (2, 2) \n", - "251 0.996697 213.9 (211, 211) \n", - "261 0.996394 182.4 (211, 182) \n", - "132 0.996313 33.0 (85, 18) \n", - "190 0.996207 27.5 (141, 122) \n", - "250 0.995593 24.3 (173, 173) \n", - "295 0.994663 17.0 (245, 2) \n", - "227 0.990767 11.0 (188, 168) \n", - "226 0.989583 9.0 (144, 90) \n", - "213 0.976500 8.1 (151, 151) \n", - "290 0.960114 6.0 (240, 240) \n", - "\n", - " Variation_Function \\\n", - "211 ind_mutate \n", - "251 ind_mutate \n", - "261 ind_crossover \n", - "132 ind_crossover \n", - "190 ind_mutate , ind_mutate , ind_crossover \n", - "250 ind_mutate \n", - "295 ind_crossover \n", - "227 ind_crossover \n", - "226 ind_mutate , ind_mutate , ind_crossover \n", - "213 ind_crossover , ind_mutate \n", - "290 ind_mutate \n", + " roc_auc_score complexity_scorer Parents Variation_Function \\\n", + "330 0.995556 4373.0 (237, 52) ind_crossover \n", + "144 0.995000 68.6 (61, 61) ind_mutate \n", + "320 0.994059 31.0 (184, 184) ind_mutate \n", + "161 0.994028 23.2 (123, 123) ind_mutate \n", + "297 0.992577 13.0 (193, 193) ind_mutate \n", + "306 0.991165 8.0 (167, 167) ind_mutate \n", + "106 0.965015 7.0 (11, 85) ind_crossover \n", + "195 0.945486 6.0 (25, 25) ind_mutate \n", "\n", " Individual Generation \\\n", - "211
Pipeline(steps=[('minmaxscaler', MinMaxScaler()),\n",
-       "                ('variancethreshold',\n",
-       "                 VarianceThreshold(threshold=0.0004675292341)),\n",
+       "
Pipeline(steps=[('maxabsscaler', MaxAbsScaler()),\n",
+       "                ('selectfwe', SelectFwe(alpha=0.0009808959816)),\n",
        "                ('featureunion-1',\n",
-       "                 FeatureUnion(transformer_list=[('featureunion',\n",
-       "                                                 FeatureUnion(transformer_list=[('quantiletransformer',\n",
-       "                                                                                 QuantileTransformer(n_quantiles=104))])),\n",
+       "                 FeatureUnion(transformer_list=[('skiptransformer',\n",
+       "                                                 SkipTransformer()),\n",
        "                                                ('passthrough',\n",
        "                                                 Passthrough())])),\n",
        "                ('featureunion-2',\n",
@@ -2016,13 +1981,12 @@
        "                                                ('passthrough',\n",
        "                                                 Passthrough())])),\n",
        "                ('kneighborsclassifier',\n",
-       "                 KNeighborsClassifier(n_jobs=1, n_neighbors=1))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
MaxAbsScaler()
SelectFwe(alpha=0.0009808959816)
FeatureUnion(transformer_list=[('skiptransformer', SkipTransformer()),\n",
+       "                               ('passthrough', Passthrough())])
SkipTransformer()
Passthrough()
FeatureUnion(transformer_list=[('skiptransformer', SkipTransformer()),\n",
+       "                               ('passthrough', Passthrough())])
SkipTransformer()
Passthrough()
KNeighborsClassifier(n_jobs=1, n_neighbors=1, p=1, weights='distance')
" ], "text/plain": [ - "Pipeline(steps=[('minmaxscaler', MinMaxScaler()),\n", - " ('variancethreshold',\n", - " VarianceThreshold(threshold=0.0004675292341)),\n", + "Pipeline(steps=[('maxabsscaler', MaxAbsScaler()),\n", + " ('selectfwe', SelectFwe(alpha=0.0009808959816)),\n", " ('featureunion-1',\n", - " FeatureUnion(transformer_list=[('featureunion',\n", - " FeatureUnion(transformer_list=[('quantiletransformer',\n", - " QuantileTransformer(n_quantiles=104))])),\n", + " FeatureUnion(transformer_list=[('skiptransformer',\n", + " SkipTransformer()),\n", " ('passthrough',\n", " Passthrough())])),\n", " ('featureunion-2',\n", @@ -2053,10 +2014,11 @@ " ('passthrough',\n", " Passthrough())])),\n", " ('kneighborsclassifier',\n", - " KNeighborsClassifier(n_jobs=1, n_neighbors=1))])" + " KNeighborsClassifier(n_jobs=1, n_neighbors=1, p=1,\n", + " weights='distance'))])" ] }, - "execution_count": 13, + "execution_count": 17, "metadata": {}, "output_type": "execute_result" } @@ -2081,12 +2043,12 @@ }, { "cell_type": "code", - "execution_count": 14, + "execution_count": 18, "metadata": {}, "outputs": [ { "data": { - "image/png": "", + "image/png": "", "text/plain": [ "
" ] @@ -2195,7 +2157,7 @@ }, { "cell_type": "code", - "execution_count": 15, + "execution_count": 19, "metadata": {}, "outputs": [], "source": [ @@ -2306,21 +2268,23 @@ }, { "cell_type": "code", - "execution_count": 16, + "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ - "Evaluations: : 25it [00:10, 2.47it/s]\n" + "Evaluations: : 113it [00:21, 5.15it/s]\n", + "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/linear_model/_sag.py:349: ConvergenceWarning: The max_iter was reached which means the coef_ did not converge\n", + " warnings.warn(\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ - "0.9824771007566706\n" + "0.9957890070921986\n" ] } ], @@ -2348,6 +2312,7 @@ " max_eval_time_mins=15,\n", " max_time_mins=30,\n", " early_stop=10, #In TPOTEstimatorSteadyState, since there are no generations, early_stop is the number of pipelines to evaluate before stopping.\n", + " n_jobs=30,\n", " verbose=2)\n", "\n", "\n", @@ -2360,20 +2325,9 @@ }, { "cell_type": "code", - "execution_count": 17, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAnYAAAHWCAYAAAD6oMSKAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8fJSN1AAAACXBIWXMAAA9hAAAPYQGoP6dpAAB9fUlEQVR4nO3dd1iVZR8H8O8ZbESWAoK4QkM2iAPclpoaaZa4MS1LrSw1s9dMG5YNy7JlpjlKxYZJjjK3gIPYICkqArIUWbI54/1DO/l0cHLg4Ry+n+t6ryt/NzzPF+v1/Lzv57lviVqtVoOIiIiI9J5U7ABEREREpBts7IiIiIgMBBs7IiIiIgPBxo6IiIjIQLCxIyIiIjIQbOyIiIiIDAQbOyIiIiIDwcaOiIiIyECwsSMiIiIyEGzsiIiIiAwEGzsiIiIiA8HGjoiIiMhAsLEjIiIiMhBs7IiIiIgMBBs7IiIiIgPBxo6IiIjIQLCxIyIiIjIQbOyIiIiIDAQbOyIiIiIDwcaOiIiIyECwsSMiIiIyEGzsiIiIiAwEGzsiIiIiA8HGjoiIiMhAsLEjIiIiMhBs7IiIiIgMBBs7IiIiIgPBxo6IiIjIQMjFDkBEpEtKpRJFRUUoKChAQUEBruTno6aqCiqlElKZDCZmZmjj6AgHBwc4ODjA1tYWMplM7NhERDohUavVarFDEBE1VHFxMRITE5EcF4fqigqoFQpYVlWhdVERjBQKSNVqqCQS1MnlKLW1RbmZGSRyOUwtLODl7w8fHx/Y2NiI/WMQETUIGzsi0mu5ubmIjoxERno6jCor4ZqVDaeiIrSuqICRUnnL76uTyVBqYYE8W1tkubZHnbk5Orm5IbhfPzg5OTXhT0BEpDts7IhILykUCkRFRSEmKgqWhYV4IDMLLoWFkKlU93wtpVSKS/b2ONfBFeX29ggMDkZwcDDkcj6tQkT6hY0dEemd/Px87I6IQPGlHDyYng63nBxIdfBHmUoiQbqzM/52c4OtizNGhITA0dFRB4mJiJoGGzsi0iuZmZnYER4O89w8BKSlwaqyUuf3KDM3R6y7OyrbtcOY0HHo0KGDzu9BRNQY2NgRkd7IzMzEz1u3wi4zCz1Pn4b8PpZd75ZCKsVJj+4ocnXF2AkT2NwRkV7gPnZEpBfy8/OxIzwctplZ6J2a2qhNHQDIVSr0SUmFbVYWdoRvR35+fqPej4hIF9jYEVGzp1AosDsiAua5eeh1+rROnqe7G1K1Gr1ST8MsLxd7IiKgUCia5L5ERPeLjR0RNXtRUVEovpSDgLS0Rp+p+y+5SoWA02koyslBdHR0k96biOhesbEjomYtNzcXMVFReDA9vVFelLgbrSsr0e1sOk5FRiIvL0+UDEREd4ONHRE1a9GRkbAsLIRbTo6oObrm5MCysBBRkZGi5iAiuh02dkTUbBUXFyMjPR0PZGY12XN1tyJVq9ElMwsZZ8+iuLhY1CxERLfCxo6Imq3ExEQYVVbCpbBQ7CgAgPaFhZBXViIpKUnsKERE9WJjR0TNklKpRHJcHFyzsu/rmLDGIFOp0CE7G0mxsVDe5hxaIiKxsLEjomapqKgI1RUVcCoqEjuKgNPV67mKmlkuIiKAjR0R3cGAAQNw9OhRQW3WrFn4+uuv7/i9f/31F1555ZX7um9BQQHUCgWsy8vv+LXTU5IREh+HATGn0PvkCYTExyEkPg5nKirweEL8fd3/VlpXVGDZe++hoKDgrr+nY8eOKK/n55g2bRp27dpV7/dkZ2dj4MCB6N69O7y9vfHjjz/ed2YiajnkYgcgouZt3Lhx2L59O/r37w/g+hJpREQE3nrrrdt+n1KpRI8ePdCjR4/7um9BQQEsq6ruat+69Z5eAIBfCgpwtrICizp1BgBcqq6+q3sp1WrIJJK7+lojpRKSG/k8PT3v6nvuh1wux6pVq+Dr64vLly/D398fI0aMgIWFRaPdk4j0H2fsiOi2nnjiCfz6669Q3Wiwjhw5gq5
duyI0NBT+/v7w8/ND5I0tQA4fPoyhQ4di3LhxGDRoEA4fPownnngCAHDixAkEBQXBz88PgwcP1uwHt2zZMjz99NPo378/OnfujG3btgEAruTn4+CBAxgZF4tH4+KwKff6dieHi4rwZGICQuLj8Hp6OlR3eFu2TqXGwrNnMDz2L8z9Ow3/HI89KOYUPs/KRGhiAk6WluDngnyMTYjHo3Gx+DTzIgCgQqnEjJQUjIqLxai4WBy78TasRKXCJx9/DC8vLwwZMgQVFRUAgLi4OPTs2RPe3t6YOnUqqutpLJcsWQJ3d3eMHDkSly9fvmVuJycn+Pr6AgDatm0LW1tbLv8S0R2xsSOi23JwcEDXrl1x7NgxAMD27dsxceJE7Ny5E3Fxcdi5cydefvllzdefPHkSq1at0lq+7d69O44dO4b4+Hg8/fTT+OCDDzRjGRkZOHjwIP7880+8/vrr169z4gTO5OZih68ffvP3R0ibtiiqq8N3OTn43ssbEX7+MJJKsKfwym3zX6iqxLMu7bHXPwBXa+vwV1mZZsxaboRwH1+0NTbGkaJibPfxxU4/f5wur0B8WRkii4thbSTHLv8A/ObnD79WrQAA5TU18HR3R3JyMpydnfHLL78AAMLCwrB69WokJSXBwsICX375pSDLqVOn8PvvvyMxMRHffvvtXZ9k8ddff0GlUqF9+/Z39fVE1HKxsSOiOwoNDcWPP/4IpVKJ3377DSEhIVi4cCG8vLwQEhKC06dPa742ODgY7dq107pGcXExxowZA09PT7z11luC7xkxYgTkcjm6dOmCkpISAMDptDQMeuABGEuv/zFlbWSEhLIynKms0MzYRZeU4FJ1zW2zdzIzQxdzc0gkEnS3tEBOzb+zaI/Y2wMAoktKEH+tDGMS4jE6IR7nqyqRVV2Nrhbm+KusDB9kZCDh2jVYyq8/vWIql6ObmxsAICAgABcvXkRpaSlqamrQq1cvAMCUKVM0zfA/oqOjMWbMGBgbG8PJyQmDBw++4+/91atXMXXqVHzzzTd3/FoiIj5jR0R3NHbsWLzzzjt47LHH4O3tjT179qCiogLx8fGQyWQwNzfXfO3N/3yzN954AyNHjsSzzz6LEydOYNGiRZoxExMTra+XSCT47yKrGsAgG1u817XrXWf/pzEEAKlEAtVNFzWVyTT/HOroiOddO2h9/6++fjhcVIR3LpzH6LYOmNKuHYxkMshuNHkymQxKpVKzxKvJqlZD8p/n9uqr3U5NTQ3GjBmD1157DUFBQXf9fUTUcnHGjojuyN7eHu7u7pg/fz7GjRuHsrIyODg4QC6X46effqr3WbL/Kisrg4uLCwDg+++/v+PX+/j64uD586i98WxfSV0dfFu1wsnSEuTVXJ+lK66rQ37N7Wfs7kbv1tbYU1iIUkUdACC/pgbFdXUoqKmBuUyGMQ4OCGvnjLSK62+2qgEYm5oKrmFtbQ0TExPExMQAALZs2YJ+/foJviY4OBg7duxAbW0t8vPzcejQoVtmUqvVmDZtGgYPHowpU6Y0+GckopaBM3ZEdFdCQ0Mxa9YsjB49GgqFAiNHjkTPnj3Rt29f2NnZ3fH7FyxYgGnTpuG99967q9mnocOH4/Tx4xgdHw+5RIInHRwxpV07LHvgAcw+fRoKtQpyiRTvuLnBsZ4Zv3vR1cICzzi7YHJSMtRQw0ImwyfdHsT5qiq8n3EBUokEplIp3r2x/KqWStHG0VHrOhs2bMCsWbNQXV0NX19fzJo1SzDes2dPDBs2DN7e3ujWrZvmTeP6REVFITw8HN7e3vj1118BAJs3b4aXl1eDflYiMmwS9X/XD4iImoGUlBTs+fFHjDpyFEbN6JSHOpkMuwb0x4gnn2zU7U6IiO4Hl2KJqFlycHCARC5HaTPbt63UwgISuRwODg5iRyEi0sKlWCJqlmxtbWFqYYE8W1vY37RFidjy7K7nsrW11dk1e/XqhZr/PCt4+PBhWFtb6+weRNQysLEjomZJJpP
By98fCVevontWFmR3cQJFY1NKpchs3x7+AQGQ3fRGbUOdPHlSZ9ciopaNS7FE1Gz5+Pigztwcl27sN6drZdfKkJuXh8tXLqNOobjj12fb20Nhbg5vb+9GyUNE1FBs7Iio2bKxsUEnNzec6+AK1T3s/3Y36hQKlJeXA1BDoVCgqKjotseTqSQSnO/gik5du8LGxkanWYiIdIWNHRE1a8H9+qHc3h7pzs6Neh+lUoGy2zzLd9bZGeX29gju27dRcxARNQQbOyJq1pycnBAYHIy/3dxQdotTLe6HkVwOY2Ph/neVlRWormfD41Jzc5zp6oaeffvCyclJZxmIiHSNjR0RNXvBwcGwcXFGrLs7FFLd/bFlbW0NiUR4vZKSEsGSrEIqRWx3d9g6O/NYLyJq9tjYEVGzJ5fLMTIkBJXt2uGkR3edPW8nl8lgZWUlqKlUSpSWll7/Z4kEJz26o8qpHUaEhEAu50YCRNS8sbEjIr3g6OiIMaHjUOTqiuOeHjqbubMwN4eJifDc16qqSlTU1uK4pweKXF0xJnQcHOs5QoyIqLnhkWJEpFcyMzOxI3w7zHNzEZCWBqvKygZfU6lU4vKVK1Crr++VV2FlhTM9AqHu3AljJ0xAhw4dGnwPIqKmwMaOiPROfn4+dkdEoPhSDh5MT4dbTg6kDfyjrLKqCkWlJcjt2hXpDz6InKIiVNXVYfPmzZDoeKsVIqLGwsaOiPSSQqFAVFQUYqKiYFlYiC6ZWWhfWHhfJ1QopVJk29sj1aEtCszMEBUTg+joaCiVSmzduhXjx49vhJ+AiEj32NgRkV7Lzc1FdFQUMs6ehbyyEh2ys+F0tQitKypgpFTe8vvqZDKUWlggz84Wme3bQ2FuDqf27fH28uU4e/as5utsbGyQmprKbU6ISC+wsSMig1BcXIykpCQkxcaiuqICaoUCllVVsCoqhrFCAalaBZVEilq5HGW2Nig3M4NELoephQW8AwLg7e0NGxsbbN++HaGhoYJrjxo1ChEREVySJaJmj40dERkUpVKJoqIiFBQUoKCgAFfy81FbXQ2lQgGZXA5jU1O0cXSEg4MDHBwcYGtrC5lMJrhGaGgotm/fLqitX78eTz31VFP+KERE94yNHRHRfxQWFsLT0xMFBQWampWVFZKTk+Hq6ipiMiKi2+M+dkRE/2Fvb49vvvlGUCsrK8OMGTPAvwsTUXPGxo6IqB4hISEICwsT1Pbv34+vv/5apERERHfGpVgiolsoKSmBp6cncnJyNDVzc3MkJSWhS5cuIiYjIqofZ+yIiG7B2toa69evF9QqKyvx1FNPQXmbrVSIiMTCxo6I6DaGDh2KZ599VlA7duwYPv30U5ESERHdGpdiiYju4Nq1a/Dx8UFGRoamZmJigvj4eLi7u4uYjIhIiDN2RER30KpVK3z33XeCWk1NDcLCwqBQKERKRUSkjY0dEdFdGDBgAF566SVBLSYmBh988IE4gYiI6sGlWCKiu1RVVQVfX1/BWbJGRkaIiYmBj4+PiMmIiK7jjB0R0V0yMzPDxo0bIZX++0dnXV0dwsLCUFtbK2IyIqLr2NgREd2D3r17Y+HChYJaYmIi3n77bZESERH9i0uxRET3qKamBj169EBKSoqmJpPJcPz4cQQGBoqYjIhaOjZ2RET3IT4+Hj179hS8Fevu7o64uDiYmpqKmIyIWjIuxRIR3Qc/Pz8sWbJEUEtLS9OqERE1Jc7YERHdp7q6OvTp0wexsbGamkQiwdGjR9G3b18RkxFRS8XGjoioAVJTU+Hv7y94K7ZLly5ITEyEhYWFiMmIqCXiUiwRUQN4eHhovRF7/vx5vPrqqyIlIqKWjDN2REQNpFQq0a9fPxw/flxQ379/P4YMGSJSKiJqidjYERHpQHp6Onx8fFBVVaWpubq6Ijk5GVZWViImI6KWhEuxREQ64Obmhvfff19Qy8rKwrx580RKREQtEWfsiIh0RKVS4aGHHsKhQ4cE9V27dmHkyJEipSK
iloSNHRGRDl28eBFeXl4oLy/X1JycnJCSkgJbW1sRkxFRS8ClWCIiHerYsSM+/vhjQS0vLw8vvPCCSImIqCXhjB0RkY6p1WqMGDECv//+u6D+008/YezYsSKlIqKWgI0dEVEjyMnJgaenJ0pKSjQ1e3t7pKamom3btuIFIyKDxqVYIqJG4OzsjNWrVwtqhYWFmDVrFvj3aSJqLGzsiIgayaRJkzB69GhB7ZdffsGWLVvECUREBo9LsUREjaigoACenp4oLCzU1KytrZGamop27dqJmIyIDBFn7IiIGpGDgwO++uorQa2kpARPP/00l2SJSOfY2BERNbInnngCEyZMENT27t2L9evXi5SIiAwVl2KJiJpAUVERPDw8kJ+fr6m1atUKycnJ6NChg4jJiMiQcMaOiKgJ2NraYu3atYLatWvXMH36dKhUKpFSEZGhYWNHRNRERo0ahenTpwtqBw8exJdffilSIiIyNFyKJSJqQqWlpfDy8kJ2dramZmZmhsTERLi5uYmYjIgMAWfsiIiaUOvWrbVemqiqqsK0adOgVCpFSkVEhoKNHRFRE3vooYcwe/ZsQS06Ohoff/yxSImIyFBwKZaISATl5eXw9fXF+fPnNTVjY2PExcXBw8NDxGREpM84Y0dEJAJLS0ts2LABEolEU6utrUVYWBjq6upETEZE+oyNHRGRSPr27Yt58+YJarGxsVixYoVIiYhI33EplohIRFVVVfD398fff/+tqcnlcpw6dQp+fn4iJiMifcQZOyIiEZmZmWHjxo2QyWSamkKhQFhYGGpqakRMRkT6iI0dEZHIevbsiUWLFglqycnJePPNN0VKRET6ikuxRETNQG1tLQIDA5GUlKSpSaVSREdHo1evXiImIyJ9wsaOiKiZSExMRGBgoOCt2G7duiE+Ph5mZmYiJiMifcGlWCKiZsLHxwdLly4V1M6cOYPFixeLlIiI9A1n7IiImhGFQoGgoCDExMRoahKJBIcPH0b//v1FTEZE+oCNHRFRM5OWlgY/Pz/BW7GdOnVCUlISLC0tRUxGRM0dl2KJiJoZd3d3LF++XFDLyMjAK6+8IlIiItIXnLEjImqGlEolBg4ciMjISEH9jz/+wNChQ0VKRUTNHRs7IqJm6vz58/D29kZlZaWm5uLiguTkZFhbW4sXjIiaLS7FEhE1U126dMGHH34oqF26dAkvv/yySImIqLnjjB0RUTOmUqkwdOhQHDhwQFDfuXMnQkJCREpFRM0VGzsiomYuKysLnp6euHbtmqbm4OCA1NRU2NnZiZiMiJobLsUSETVzrq6uWLVqlaBWUFCAOXPmiBOIiJotztgREekBtVqNRx99FLt37xbUw8PDMW7cOJFSEVFzw8aOiEhP5OXlwcPDA8XFxZqanZ0dUlNT4eDgIGIyImouuBRLRKQnnJyc8MUXXwhqV69excyZM8G/oxMRwMaOiEivjB8/HmPHjhXUIiIisHnzZpESEVFzwqVYIiI9c+XKFXh4eODKlSuaWuvWrZGSkgIXFxcRkxGR2DhjR0SkZ9q0aYM1a9YIaqWlpZgxYwaXZIlaODZ2RER6aMyYMZg8ebKgtm/fPqxdu1akRETUHHAplohITxUXF8PT0xO5ubmamoWFBZKTk9GpUycRkxGRWDhjR0Skp2xsbPDtt98KahUVFXjqqaegUqlESkVEYmJjR0Skxx555BE8/fTTgtqRI0ewevVqkRIRkZi4FEtEpOfKysrg7e2NzMxMTc3U1BQJCQno1q2biMmIqKlxxo6ISM9ZWVnhu+++E9Sqq6sxbdo0KJVKkVIRkRjY2BERGYBBgwbhhRdeENROnDiBjz76SKRERCQGLsUSERmIiooK+Pr64ty5c5qasbExYmNj4enpKWIyImoqnLEjIjIQFhYW2LhxI6TSf/9or62txdSpU1FXVydiMiJqKmzsiIgMSFBQEObPny+oxcfHY/ny5SIlIqKmxKVYIiIDU11djYCAAJw+fVpTk8v
lOHHiBAICAkRMRkSNjTN2REQGxtTUFJs2bYJMJtPUFAoFwsLCUFNTI2IyImpsbOyIiAxQQEAAFi9eLKilpqZi6dKlIiUioqbApVgiIgNVW1uL3r17Iz4+XlOTSqWIjIxEnz59RExGRI2FjR0RkQFLTk5GQECA4K1YNzc3JCQkwNzcXMRkRNQYuBRLRGTAvLy88NZbbwlq6enpeO2110RKRESNiTN2REQGTqFQoG/fvjh58qSgfvDgQQwaNEikVETUGNjYERG1AGfOnIGvry+qq6s1tQ4dOiA5ORmtWrUSMRkR6RKXYomIWoBu3brhvffeE9QyMzO1NjMmIv3GGTsiohZCpVJh8ODBOHLkiKC+d+9eDB8+XKRURKRLbOyIiFqQCxcuwNvbGxUVFZpau3btkJKSAhsbGxGTEZEucCmWiKgF6dy5M1auXCmo5ebmYu7cuSIlIiJd4owdkciUSiWKiopQUFCAgoICXMnPR01VFVRKJaQyGUzMzNDG0REODg5wcHCAra2t4KgoonulVqsxfPhw7Nu3T1DfsWMHRo8eLU4oItIJNnZEIikuLkZiYiKS4+JQXVEBtUIBy6oqtC4qgpFCAalaDZVEgjq5HKW2tig3M4NELoephQW8/P3h4+PDpTO6b9nZ2fDy8kJpaamm1rZtW6SkpKBNmzYiJiOihmBjR9TEcnNzER0ZiYz0dBhVVsI1KxtORUVoXVEBI6Xylt9XJ5Oh1MICeba2yHJtjzpzc3Ryc0Nwv35wcnJqwp+ADMXGjRsxbdo0Qe2JJ57A9u3bIZFIxAlFRA3Cxo6oiSgUCkRFRSEmKgqWhYV4IDMLLoWFkKlU93wtpVSKS/b2ONfBFeX29ggMDkZwcDDkcnkjJCdDpVarMXr0aERERAjqW7duxfjx40VKRUQNwcaOqAnk5+djd0QEii/l4MH0dLjl5ECqg//rqSQSpDs74283N9i6OGNESAgcHR11kJhaivz8fHh4eKCoqEhTs7GxQWpqKmeCifQQGzuiRpaZmYkd4eEwz81DQFoarCordX6PMnNzxLq7o7JdO4wJHYcOHTro/B5kuLZv347Q0FBBbdSoUYiIiOCSLJGeYWNH1IgyMzPx89atsMvMQs/TpyG/j2XXu6WQSnHSozuKXF0xdsIENnd0T0JDQ7F9+3ZBbf369XjqqadESkRE94ONHVEjyc/Px7ZNm2CdcRF9UlN1svR6JyqJBMc9PVDSsRPGT53CZVm6a4WFhfD09ERBQYGmZmVlheTkZLi6uoqYjIjuBTcoJmoECoUCuyMiYJ6bh16nTzdJUwcAUrUavVJPwywvF3siIqBQKJrkvqT/7O3t8c033whqZWVlmDFjBvj3fyL9wcaOqBFERUWh+FIOAtLSGnX5tT5ylQoBp9NQlJOD6OjoJr036beQkBCEhYUJavv378fXX38tUiIiulds7Ih0LDc3FzFRUXgwPb1RXpS4G60rK9HtbDpORUYiLy9PlAykn1atWgVnZ2dBbcGCBTh//rxIiYjoXrCxI9Kx6MhIWBYWwi0nR9QcXXNyYFlYiKjISFFzkH6xtrbG+vXrBbXKyko89dRTUN5mA20iah7Y2BHpUHFxMTLS0/FAZlaTPVd3K1K1Gl0ys5Bx9iyKi4tFzUL6ZejQoXj22WcFtWPHjuHTTz8VKRER3S02dkQ6lJiYCKPKSrgUFoodBQDQvrAQ8spKJCUliR2F9MyHH36ITp06CWr/+9//kJaWJlIiIrobbOyIdESpVCI5Lg6uWdn3dUxYY5CpVOiQnY2k2Fguo9E9adWqFb777jtBraamBmFhYXzbmqgZY2NHek0ul8PX1xeenp548sknUamDlxUuXryIHj16aH799ttvY/jw4aitrcXAgQPx4IMPwtfXFx4eHtiyZQuA6y9MjBs3DtUVFXC66WimuzEo5hQejYvFyLhYjIqLxZrsbChvLOMeuHoV3zXwWT2nq0WorqgQHBkFAIcPH8apU6c0v37jjTd
w7NixBt3rVubMmYO2bdsKfl+p+RswYABeeuklQS0mJgYffPCBOIGI6I7Y2JFes7a2RkJCAlJSUmBsbKzzbRk+/fRT/Pnnn9ixYweMjY0BAD/99BMSEhJw4MABzJs3DwDQrl07vPnmm1ArFLAuL7/n+2zz8cVu/wBs9vLGydISrMrMBAAMsbPDU/95Q/FeKNVqtK6ogFqhEGw8C2g3dm+99Rb69et33/cCcMtZwYkTJ2Lv3r0NujaJ491330XXrl0FtWXLliExMVGkRER0O2zsyGD069cP586dw4kTJxAUFAQ/Pz8MHjxYs93HoUOH4OXlBR8fH83MUXJyMvz9/eHr6wtfX19cvnxZc73vvvsOP/zwA3bt2gUzMzOt+5WXl6NVq1YArs/yjR07FpZVVYjIy8Pcv9MwLSUZD/0Vg3WXLgEAKpRKzEhJwagbM3PH6nmhwcbICG8+4IateXlQq9X4paAAKzIuAAB2XbmM4bF/4dG4OMw6nXo9g0KB+Wf+xqNxsQiJj0NcWRlOlpTgqZRkzP07DVOSk1BbW4uff/oJY8eORY8ePRAVFYXs7Gx8/fXXWLFiBXx9fZGcnIxp06Zh165d+OuvvzS/Hw8++KDmOatTp06hX79+8Pf3x9ixY1F+o4Ht2LEj3nrrLQQFBeHw4cP1/rsJDg6GnZ3dPf87JfGZmZlh48aNkEr//bioq6tDWFgYamtrRUxGRPWRix2ASBcUCgX27t2L4cOHo3v37jh27BhkMhm2bNmCDz74AJ988gk+/vhjfPzxx3j44YdRWloKAPjmm28wa9YsPPPMM6iqqoJMJkNlZSXOnj2LpUuXIj4+HlZWVoJ7PfHEEzAyMkJ6errgGaS6ujq0LirCNQBnKyrws68fFGo1hsX+hSnt2iGyuBjWRnKs8/SEWq1GxS1mt9qbmgIArtbVCepfZ2fj6+4e6Ghmhms3nnH6PDsLziamWNntQSjValQplUgtL0fitWvY6x8ABxMTfHgxA326PYjeT01D3wEDMHLkSCQmJuK5556Dvb09nn/+ecF9evTogYSEBADA1KlT0aNHD9TW1mLBggWIiIiAjY0NPvzwQ3z++edYtGgRAMDOzo6bIRuw3r17Y+HChVixYoWmlpiYiLfffhtvv/22iMmI6L/Y2JFeKykpga+vL4DrM3YzZsxAXl4eJk+ejAsXLkChUKBDhw4Ars8aLVq0CGlpaXjyySfRunVr9OnTB2+99RauXr2KcePGoXPnzgAAJycnmJiYYPfu3Zg6dargnj/99BM8PT2RkZGBgQMHYtSoUQAAtUoFoxsNVx9ra5jLZACAtsbGuFpXh64W5ng3owwfZGTgYTs7+P2nYbyZGtpbpfhbWeGNc+kY1aYthtvbAwBOlJTg6+4eAACZRAJLuVzztQ4mJgCA6OISXIuLxY6zZ2Bja4urV6/e1UzLmjVrUF1djRdffBHJyclISkrCoEGDAEDzvOE/nnzyyTtej/TbsmXLsGvXLqSkpGhq7733HkJCQhAYGChiMiK6GZdiSa/984xdQkICVq9eDWNjY7zxxhsYOXIkUlJSsGHDBtTU1AAAFi1ahPXr16O8vByBgYHIycnBxIkTsWvXLpiammLw4MGIi4sDcP2NwD179mDZsmX4888/6713p06d4OTkpNn+Qa1Wa/auM75p2UomkUCpVqOTmTl+9fWDm7k53rlwHptzc+u97qXqakglEtgZGQnqb3Z5AC936Ijs6mqMjo9D9W3ecjW76f5qqLFo0CAsW7wYCQkJuHTpkuZ5wVuJi4vDl19+iXXr1ml+Nn9/f83v9enTp/Hll19qvt7c3Py21yP9Z2Jigk2bNkEu/3c+QKlUIiwsDNXV1SImI6KbsbEjg1NWVgYXFxcAwPfff6+pnz9/Hj4+Pvjf//4Hd3d3ZGRk4MKFC+jSpQteeuklPPzwwzh9+rTm611cXPDrr79i+vTpmqXJmxUWFuLChQtwdXUFAEgkEqgkklvmKqipgbl
MhjEODghr54y0Cu2XLErq6rD0/DlMcnKC5D/Xyq6uhp+VFeZ16AAjqRQlCgWCrG2w5cYzhEq1GuX1bEMRZG2D38+ehezGB/I/D723atUK165d085QUoKwsDD88MMPmmcIH3zwQWRmZmp+HyoqKnDu3Llb/qxkmPz8/LBkyRJBLS0tTatGROJhY0cGZ8GCBXjppZfQt29fwUzSJ598Ag8PD3h7e8PV1RV9+vRBeHg4PD09NS9OjBkzRnAtb29vbNiwAaNHj0bmjTdVn3jiCfj6+qJ///5Yvnw5HB0dAQASqRR18ls/3XC2shJjE+IREh+H7/NyMf2mt13HJyZgZFwsJicnoVfr1njBtYPW96/IyLj+4kV8HIbZ2cPRxASz27dHTk01RsXFYkxCPNLr2e5ljqsrimtq8PqyZejevTu+/fZbAMCjjz6KrVu3al6e+MfOnTuRnZ2NyZMnw9fXFyNGjICxsTG2bduG2bNnw9vbG3369Lmnxu7pp59Gnz59kJSUBBcXF+zYseOuv5eal9deew0BAQGC2sqVKxHJo+uImgWJWi3yuUdEBuLAgQM488cfePj4CbGjaPmzT290GzYMQ4YMETsKGYDU1FT4+/sLntXs0qULEhMTYWFhIWIyIuKMHZGOODg4oNzMDHU3XppoLupkMpSbmcHBwUHsKGQgPDw8tN6GPX/+PF599VWREhHRP9jYEemIg4MDJHI5SptoxqK8ohyXr1xBcXExFLd5kaLUwgISubxJGrsxY8Zo9sD75383P7dIhmP+/Pno06ePoPbFF1/gwIEDIiUiIoDbnRDpjK2tLUwtLJBnawv7srJGvVdtbS3KbtxDoahDdU0NrK1bw8xUeyPlPLvruWxtbRs1EwA+O9eCyGQybNy4ET4+PqiqqtLUp0+fjuTkZK39H4moaXDGjkhHZDIZvPz9keXaHkpp4/5fS6VSCX6tVqtQXFyMktJS3PzYrFIqRWb79vAOCICsmS0Rk/5zc3PD+++/L6hlZWVpjtojoqbHxo5Ih3x8fFBnbo5LNzYQbiwmpqYwMtLei66ysgJXCgtRd2Pbk2x7eyjMzeHt7d2oeajlmjNnjmbj6n+sW7cOu3fvFikRUcvGxo5Ih2xsbNDJzQ3nOrjedk+7hpLg+jFeZmbaGwMrFHUovHIF5VVVON/BFZ26doWNjU2jZaGWTSqVYv369bC0tBTUn3nmGRQVFYmUiqjlYmNHpGPB/fqh3N4e6TftU9cYpBIJbKytYW1tA4lE+H9lNdQ47dAW+aam8PXza9QcRB07dsTHH38sqOXl5eGFF14QKRFRy8XGjkjHnJycEBgcjL/d3FD2n6O21Go1qqurofzPM3INYW5mhjb29jCS/3sEWYWVFdIffBAHIyMxfPhwxMbG6ux+RPV5+umnMXz4cEFty5Yt+Pnnn0VKRNQysbEjagTBwcGwcXFGrLs7FFIp1Lh+DFdefj6KiotQUFCAKh2erymXy2Hfxh4W5hZQymT4O6AHcoqKEB0djfPnz6NPnz5YtWoVuB85NRaJRIJvv/0W1tbWgvpzzz2Hy5cvixOKqAViY0fUCORyOUaGhKCyXTtEP9gNlwsLUVpWCuCfxkpd7zmtDSGBBK2srZExYADyLS0QsWcPlDf2t6urq8PLL7+Mxx57DFevXtXpfYn+4ezsjNWrVwtqhYWFmDVrFv9SQdRE2NgRNZLS0lKcSojHWXNzJAb2gPI/243o+tUKhVSK454eKO3cGWPHj0fnzp21vua3336Dr68vjh07puO7E103adIkrTOXf/nlF2zZskWkREQtC8+KJWoEu3btwuOPP466ujq4urriydGj0a6yEu6xsTC/sbGwpYWlzjZxLTU3R2x3d1Q5tcOY0HHo0KED6urqsGTJEq19xoDrbzK++eabeO2117i/Henc5cuX4eHhgcLCQk3N2toaKSkpcG7kl4qIWjo2dkSNoFevXjh16pTm123btkXIyJFwtrGB299/o93Zs7B
tbQ1zM+2TIu6FSiLBWWdnnOnqBltnZ4wICYGjo6Pga/744w9MmTIFV65c0fr+wYMH4/vvv4eTk1ODchD9108//YQnn3xSUHvkkUewe/duSBpxKyCilo6NHVEjeOSRR/D7778LajKZDEFBQQgODIR9eTk88gvQsaQEsvt4Q1YplSLb3h7nO7ii3N4ePfv2RVBQEOTy+k8JzMvLw5QpU+o9x7NNmzbYvHkzhg0bds85iG5n4sSJ2Lp1q6C2du1aPP300yIlIjJ8bOyIGsH58+cxaNAgZGdna405OjoiOCgIgb6+MK6uRofsbDhdLULrigoY3XjZoT51MhlKLSyQZ2eLzPbtoTA3R6euXRHct+9dzbgplUqsWLECb7zxhtaRZACwcOFCvPPOOzAyMqrnu4nuXVFRETw8PJCfn6+pWVpaIjk5GR07dhQvGJEBY2NH1AhUKhUGDhx4y5cUzMzMkJOTg6SkJCTFxqK6ogJqhQKWVVWwKiqGsUIBqVoFlUSKWrkcZbY2KDczg0Quh6mFBbwDAuDt7X1fJ0pERkZiwoQJuHTpktZY7969sXXrVn7oks7s3r0bo0aNEtQGDRqE/fv3Q9rIZyoTtURs7Igaweeff37bXff79euHo0ePArg+k1ZUdH1vu4KCAlzJz0dtdTWUCgVkcjmMTU3RxtERDg4OcHBwgK2tbYNfeLh69SqmT5+OiIgIrTFra2usW7cOjz/+eIPuQfSPGTNmYP369YLa6tWr8fzzz4uUiMhwsbEj0rH09HT4+PigqqpKU3NxccGQIUMQHh4OBwcH/PjjjwgMDBQx5fVTMD777DO88sorqKur0xqfPXs2Vq5cCVNTUxHSkSEpLS2Fl5eX4NEEMzMzJCYmws3NTcRkRIaHjR2RDimVSvTv3x/R0dGC+v79+zFkyBAoFIpbvuAgltjYWIwfPx7nzp3TGvPx8UF4eDi6desmQjIyJPv378fDDz8sqAUFBeHo0aPccodIh/iAA5EOffzxx1pN3Zw5czBkyBAAaHZNHQAEBAQgNjYWEydO1BpLTExEQEAANm7cKEIyMiQPPfQQZs+eLahFR0fj448/FikRkWHijB2RjqSmpsLf3x+1tbWaWpcuXZCYmAgLCwsRk90dtVqN7777Ds8//7xgGfkfU6ZMwZdffglLS0sR0pEhKC8vh6+vL86fP6+pGRsbIy4uDh4eHiImIzIcnLEj0oG6ujqEhYUJmjqJRIINGzboRVMHXM87ffp0/PXXX/D09NQa37x5MwICApCQkND04cggWFpaYsOGDYINimtraxEWFlbvc55EdO/Y2BHpwIoVKxAbGyuozZs3D3379hUp0f3r3r07Tp06hZkzZ2qNnT17Fr1798YXX3zBQ93pvvTt2xfz5s0T1GJjY7FixQqREhEZFi7FEjVQfHw8evbsCYVCoam5u7sjLi5O798o3b59O5555hmU3Tjf9mZjxozBunXr7msvPWrZqqqq4O/vj7///ltTk8vlOHXqFPz8/ERMRqT/2NgRNUBNTQ0CAwORnJysqclkMhw/flz07Ux05cKFCxg/fjxiYmK0xlxdXbFt2zb06dNHhGSkz06dOoWgoCAobzptxcvLCzExMTAxMRExGZF+41IsUQO8+eabgqYOAF577TWDaeoAoHPnzoiMjMT8+fO1xrKystCvXz+sWLGi3mPKiG6lZ8+eWLRokaCWnJyMN998U6RERIaBM3ZE9+nkyZMICgoSNDQ+Pj44deoUjI2NRUzWeHbv3o2wsDBcvXpVa2zo0KHYtGkTHBwcREhG+qi2thaBgYFISkrS1KRSKaKjo9GrVy8RkxHpLzZ2RPehqqoKfn5+OHPmjKZmZGSEmJgY+Pj4iJis8eXk5GDSpEk4cuSI1pijoyO+//57zb59RHeSmJiIwMBAwVux3bp1Q3x8PMzMzERMRqSfuBRLdB8WL14saOoAYOnSpQbf1AGAs7MzDhw4gKVLl2od4p6fn4+HH34Yr7/+uuB
lEqJb8fHxwdKlSwW1M2fOYPHixSIlItJvnLEjukdHjx7FwIEDBdt9BAYGIjo6ulmeLNGYDh8+jEmTJiE3N1drLDg4GFu3bkX79u1FSEb6RKFQICgoSPCCjkQiweHDh9G/f38RkxHpHzZ2RPegvLwc3t7eyMjI0NRMTEwQHx8Pd3d3EZOJ58qVK5g2bRr27NmjNWZjY4MNGzYgJCREhGSkT9LS0uDn54eamhpNrVOnTkhKSuJpJ0T3gEuxRPfglVdeETR1APDuu++22KYOANq0aYPffvsNH330kdaMZXFxMR577DHMnTtX8IFN9F/u7u5Yvny5oJaRkYFXXnlFpERE+okzdkR3ad++fRg2bJig1q9fPxw6dAgymUykVM3LqVOnMH78eK3mFwD8/f2xbds2uLm5iZCM9IFSqcTAgQMRGRkpqP/xxx8YOnSoSKmI9AsbO6K7UFJSAi8vL1y6dElTMzc3R1JSErp06SJisuantLQUzzzzDH788UetMUtLS6xZswYTJ04UIRnpg/Pnz8Pb2xuVlZWamouLC5KTk2FtbS1eMCI9waVYorvw8ssvC5o6APjoo4/Y1NWjdevWCA8Px5o1a7SOVCsvL8ekSZMwY8YMVFRUiJSQmrMuXbrgww8/FNQuXbqEl19+WaRERPqFM3ZEdxAREYHHHntMUHvooYewb98+SCQSkVLph+TkZISGhiItLU1rzN3dHeHh4fDy8hIhGTVnKpUKQ4cOxYEDBwT1nTt38kUcojtgY0d0G1evXoWHhwcKCgo0NSsrKyQnJ8PV1VXEZPqjoqICL774ItavX681ZmpqilWrVmHmzJlskkkgKysLnp6euHbtmqbm4OCA1NRU2NnZiZiMqHnjUizRbcyZM0fQ1AHAqlWr2NTdAwsLC6xbtw4//PCD1rYV1dXVeO655xAaGorS0lKRElJz5OrqilWrVglqBQUFmDNnjjiBiPQEZ+yIbmH79u0IDQ0V1EaNGoWIiAjOLt2n9PR0jB8/HnFxcVpjnTp1wrZt29CzZ08RklFzpFar8eijj2L37t2Cenh4OMaNGydSKqLmjY0dUT0KCgrg4eEhOOzexsYGqampcHJyEjGZ/qupqcHChQvx2WefaY3J5XKsWLECL7/8stZxZdQy5eXlwcPDA8XFxZqanZ0dUlNT4eDgIGIyouaJf3IS/YdarcbMmTMFTR0AfPnll2zqdMDExASffvopdu7cCRsbG8GYQqHAggUL8Oijj+LKlSsiJaTmxMnJCV988YWgdvXqVcycOROclyDSxsaO6D82b96MiIgIQe2JJ57QWpalhgkJCUFiYiKCg4O1xvbs2QNfX18cPny46YNRszN+/HiMHTtWUIuIiMDmzZtFSkTUfHEplugmly5dgqenp+BB/rZt2yIlJQVt2rQRMZnhUigUWLZsGd59912tGRipVIolS5ZgyZIlPN2jhbty5Qo8PDwEM7mtW7dGSkoKXFxcRExG1Lxwxo7oBrVajRkzZmi9nblmzRo2dY1ILpfjnXfewb59+7SemVKpVHjzzTcxZMgQ5OTkiJSQmoM2bdpgzZo1glppaSlmzJjBJVmim7CxI7rhm2++wb59+wS1KVOmYPTo0eIEamEeeughJCYm1nsm6JEjR+Dr64s9e/aIkIyaizFjxmDy5MmC2r59+7B27VqREhE1P1yKJQJw4cIFeHt7C465ateuHVJSUrQe8KfGpVKp8OGHH2Lx4sVQKpVa4/Pnz8e7774LY2NjEdKR2IqLi+Hp6Ync3FxNzcLCAsnJyejUqZOIyYiaB87YUYunUqkwffp0rbNL161bx6ZOBFKpFK+++iqOHTtW70bQK1euRN++fXHhwgUR0pHYbGxs8O233wpqFRUVeOqpp6BSqURKRdR8sLGjFm/16tU4cuSIoPbMM89g+PDhIiUiAOjTpw8SEhIwZswYrbGYmBj4+flh+/btIiQjsT3yyCN4+umnBbUjR45g9erVIiUiaj64FEst2pk
zZ+Dr64vq6mpNrWPHjkhKSkKrVq1ETEb/UKvV+PLLLzFv3jzU1tZqjc+cOROrVq2CmZmZCOlILGVlZfD29kZmZqamZmpqioSEBHTr1k3EZETi4owdtVgKhQLTpk0TNHUAsH79ejZ1zYhEIsGcOXNw8uRJdO3aVWv8m2++Qc+ePXH69GkR0pFYrKys8N133wlq1dXVmDZtWr3PZhK1FGzsqMX66KOPcOLECUHtxRdfxKBBg0RKRLfj6+uL2NhYTJkyRWssJSUFPXr0wPr167n1RQsyaNAgvPDCC4LaiRMn8NFHH4mUiEh8XIqlFik5ORk9evQQLO25ubkhISEB5ubmIiaju7Fp0ybMnj1b64UXAJg4cSK++uorWFlZiZCMmlplZSV8fX2Rnp6uqRkbGyM2Nhaenp4iJiMSBxs7anHq6urQq1cvxMfHa2pSqRTHjh1DUFCQiMnoXpw5cwbjxo1DUlKS1tgDDzyAbdu2ISAgQIRk1NSio6PRr18/wVuxfn5+OHnyJIyMjERMRtT0uBRLLc7y5csFTR0ALFiwgE2dnunWrRtOnjyJ2bNna42dO3cOffr0waeffsql2RYgKCgICxYsENTi4+OxfPlykRIRiYczdtSixMbGolevXoKHqz08PPDXX3/B1NRUxGTUED///HO9x8EBQEhICNavXw87OzsRklFTqa6uRkBAgOAlGplMhpMnT3LmlloUNnbUYlRXV6NHjx5ITU3V1PgHv+G4ePEiJkyYoPVCDAC4uLhgy5Yt6NevnwjJqKnwL25EXIqlFmTp0qWCpg4AXn/9dTZ1BqJjx444evQoXn31Va2xS5cuYeDAgXjnnXe4FYYBCwgIwOLFiwW11NRULF26VKRERE2PM3bUIhw/fhx9+/blw9UtxB9//IEpU6bgypUrWmODBw/G999/DycnJxGSUWOrra1F7969+XIUtVhs7Mjg3Wo7hL/++gteXl4iJqPGlJeXh8mTJ+PgwYNaY23atMHmzZsxbNgwEZJRY0tOTkZAQADq6uo0NW5nRC0Fl2LJ4L322muCpg4A3nzzTTZ1Bs7JyQn79u3DO++8A6lU+EfdlStXMHz4cLz66quCD38yDF5eXnjrrbcEtfT0dLz22msiJSJqOpyxI4N26NAhDB48WFDr3bs3jh07BrlcLlIqamqRkZGYMGECLl26pDXWu3dvbN26FR07dmz6YNRoFAoF+vbti5MnTwrqBw8e5OkyZNDY2JHBunbtGry8vHhIOAEArl69iunTpyMiIkJrzNraGuvWrcPjjz8uQjJqLGfOnIGvr6/gPOgOHTogOTmZ50GTweJSLBms+fPnC5o6AFixYgWbuhbKzs4Ov/76K1atWqX1wkxJSQnGjh2LOXPmCJoA0m/dunXDe++9J6hlZmZi/vz5IiUianycsSOD9Pvvv+ORRx4R1AYMGICDBw9qPW9FLU9sbCzGjx+Pc+fOaY35+PggPDycfwEwECqVCoMHD8aRI0cE9b1792L48OEipSJqPGzsyOAUFxfD09MTubm5mpqlpSWSkpLQqVMnEZNRc1JWVoZZs2Zhy5YtWmMWFhb44osvEBYWJkIy0rULFy7A29sbFRUVmlq7du2QkpICGxsbEZMR6R6nLsjgzJ07V9DUAcDKlSvZ1JGAlZUVvv/+e6xbtw5mZmaCsYqKCkybNg1Tp05FeXm5SAlJVzp37oyVK1cKarm5uZg7d65IiYgaD2fsyKDs2LFD6wH4YcOGYe/evZBIJCKloubu9OnTCA0NRUpKitZY165dER4eDl9f36YPRjqjVqsxfPhw7Nu3T1DfsWMHRo8eLU4ookbAxo4MxpUrV+Dh4SE4baB169ZISUmBi4uLiMlIH1RVVeGll17CN998ozVmYmKClStXYvbs2fwLgh7Lzs6Gl5cXSktLNbW2bdsiJSUFbdq0ETEZke5wKZYMglqtxqxZs7SOkPrss8/Y1NFdMTMzw5o1axAeHg4rKyvBWE1NDZ5//nm
", - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], + "outputs": [], "source": [ "fitted_pipeline = est.fitted_pipeline_ # access best pipeline directly\n", "fitted_pipeline.plot()" @@ -2381,147 +2335,9 @@ }, { "cell_type": "code", - "execution_count": 18, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n",
- " | roc_auc_score | complexity_scorer | Parents | Variation_Function | Individual | Submitted Timestamp | Completed Timestamp | Eval Error | Pareto_Front | Instance\n",
- "0 | 0.940426 | 98.0 | NaN | NaN | <tpot2.search_spaces.pipelines.graph.GraphPipe... | 1.727192e+09 | 1.727192e+09 | None | NaN | [('DecisionTreeClassifier_1', 'SelectPercentil...\n",
- "1 | 0.954244 | 70.0 | NaN | NaN | <tpot2.search_spaces.pipelines.graph.GraphPipe... | 1.727192e+09 | 1.727192e+09 | None | NaN | [('DecisionTreeClassifier_1', 'ColumnOneHotEnc...\n",
- "2 | 0.967942 | 13.0 | NaN | NaN | <tpot2.search_spaces.pipelines.graph.GraphPipe... | 1.727192e+09 | 1.727192e+09 | None | NaN | [('KNeighborsClassifier_1', 'SelectFwe_2'), ('...\n",
- "3 | NaN | NaN | NaN | NaN | <tpot2.search_spaces.pipelines.graph.GraphPipe... | 1.727192e+09 | 1.727192e+09 | INVALID | NaN | [('LogisticRegression_1', 'FeatureAgglomeratio...\n",
- "4 | 0.953421 | 80.0 | NaN | NaN | <tpot2.search_spaces.pipelines.graph.GraphPipe... | 1.727192e+09 | 1.727192e+09 | None | NaN | [('DecisionTreeClassifier_1', 'SelectPercentil...\n",
- "
" - ], - "text/plain": [ - " roc_auc_score complexity_scorer Parents Variation_Function \\\n", - "0 0.940426 98.0 NaN NaN \n", - "1 0.954244 70.0 NaN NaN \n", - "2 0.967942 13.0 NaN NaN \n", - "3 NaN NaN NaN NaN \n", - "4 0.953421 80.0 NaN NaN \n", - "\n", - " Individual Submitted Timestamp \\\n", - "0 " ] @@ -78,7 +78,7 @@ "import matplotlib.pyplot as plt\n", "import tpot2\n", "\n", - "population_size=60\n", + "population_size=30\n", "initial_population_size=100\n", "population_scaling = .5\n", "generations_until_end_population = 50\n", @@ -107,7 +107,7 @@ }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 2, "metadata": {}, "outputs": [ { @@ -122,7 +122,7 @@ "name": "stderr", "output_type": "stream", "text": [ - "Generation: 2%|▏ | 1/50 [00:08<07:17, 8.93s/it]" + "Generation: 2%|▏ | 1/50 [00:25<20:56, 25.64s/it]" ] }, { @@ -130,14 +130,14 @@ "output_type": "stream", "text": [ "Generation: 1\n", - "Best roc_auc_score score: 1.0\n" + "Best roc_auc_score score: 0.9909340659340659\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "Generation: 4%|▍ | 2/50 [00:15<06:00, 7.50s/it]" + "Generation: 4%|▍ | 2/50 [00:47<18:41, 23.37s/it]" ] }, { @@ -145,14 +145,14 @@ "output_type": "stream", "text": [ "Generation: 2\n", - "Best roc_auc_score score: 1.0\n" + "Best roc_auc_score score: 0.9918914418914418\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "Generation: 6%|▌ | 3/50 [00:25<06:51, 8.76s/it]" + "Generation: 6%|▌ | 3/50 [01:25<23:39, 30.21s/it]" ] }, { @@ -160,14 +160,14 @@ "output_type": "stream", "text": [ "Generation: 3\n", - "Best roc_auc_score score: 1.0\n" + "Best roc_auc_score score: 0.9925990675990676\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "Generation: 8%|▊ | 4/50 [00:33<06:22, 8.30s/it]" + "Generation: 8%|▊ | 4/50 [02:19<30:20, 39.58s/it]" ] }, { @@ -175,14 +175,14 @@ "output_type": "stream", "text": [ "Generation: 4\n", - "Best roc_auc_score score: 1.0\n" + "Best roc_auc_score score: 
0.9925990675990676\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "Generation: 10%|█ | 5/50 [00:48<08:13, 10.97s/it]" + "Generation: 10%|█ | 5/50 [02:56<28:55, 38.56s/it]" ] }, { @@ -190,14 +190,14 @@ "output_type": "stream", "text": [ "Generation: 5\n", - "Best roc_auc_score score: 1.0\n" + "Best roc_auc_score score: 0.9933816183816184\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "Generation: 12%|█▏ | 6/50 [01:09<10:27, 14.26s/it]" + "Generation: 12%|█▏ | 6/50 [03:46<31:15, 42.63s/it]" ] }, { @@ -205,14 +205,14 @@ "output_type": "stream", "text": [ "Generation: 6\n", - "Best roc_auc_score score: 1.0\n" + "Best roc_auc_score score: 0.9933816183816184\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "Generation: 14%|█▍ | 7/50 [01:20<09:23, 13.10s/it]" + "Generation: 14%|█▍ | 7/50 [04:41<33:15, 46.42s/it]" ] }, { @@ -220,14 +220,14 @@ "output_type": "stream", "text": [ "Generation: 7\n", - "Best roc_auc_score score: 1.0\n" + "Best roc_auc_score score: 0.9934065934065934\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "Generation: 16%|█▌ | 8/50 [01:29<08:14, 11.77s/it]" + "Generation: 16%|█▌ | 8/50 [05:19<30:39, 43.79s/it]" ] }, { @@ -235,14 +235,14 @@ "output_type": "stream", "text": [ "Generation: 8\n", - "Best roc_auc_score score: 1.0\n" + "Best roc_auc_score score: 0.9948468198468199\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "Generation: 18%|█▊ | 9/50 [01:37<07:13, 10.57s/it]" + "Generation: 18%|█▊ | 9/50 [05:49<27:01, 39.55s/it]" ] }, { @@ -250,14 +250,14 @@ "output_type": "stream", "text": [ "Generation: 9\n", - "Best roc_auc_score score: 1.0\n" + "Best roc_auc_score score: 0.9948468198468199\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "Generation: 20%|██ | 10/50 [01:49<07:23, 11.09s/it]" + "Generation: 20%|██ | 10/50 [07:01<33:03, 49.59s/it]" ] }, { @@ -265,14 +265,14 @@ "output_type": "stream", "text": [ "Generation: 10\n", - "Best roc_auc_score 
score: 1.0\n" + "Best roc_auc_score score: 0.9948468198468199\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "Generation: 22%|██▏ | 11/50 [02:01<07:26, 11.45s/it]" + "Generation: 22%|██▏ | 11/50 [07:43<30:38, 47.14s/it]" ] }, { @@ -280,14 +280,14 @@ "output_type": "stream", "text": [ "Generation: 11\n", - "Best roc_auc_score score: 1.0\n" + "Best roc_auc_score score: 0.9972905525846703\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "Generation: 24%|██▍ | 12/50 [02:10<06:45, 10.67s/it]" + "Generation: 24%|██▍ | 12/50 [08:35<30:52, 48.75s/it]" ] }, { @@ -295,14 +295,14 @@ "output_type": "stream", "text": [ "Generation: 12\n", - "Best roc_auc_score score: 1.0\n" + "Best roc_auc_score score: 0.9976114081996436\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "Generation: 26%|██▌ | 13/50 [02:18<06:07, 9.92s/it]" + "Generation: 26%|██▌ | 13/50 [09:10<27:31, 44.64s/it]" ] }, { @@ -310,14 +310,14 @@ "output_type": "stream", "text": [ "Generation: 13\n", - "Best roc_auc_score score: 1.0\n" + "Best roc_auc_score score: 0.9976114081996436\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "Generation: 28%|██▊ | 14/50 [02:27<05:42, 9.51s/it]" + "Generation: 28%|██▊ | 14/50 [10:06<28:49, 48.04s/it]" ] }, { @@ -325,14 +325,14 @@ "output_type": "stream", "text": [ "Generation: 14\n", - "Best roc_auc_score score: 1.0\n" + "Best roc_auc_score score: 0.9976114081996436\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "Generation: 30%|███ | 15/50 [03:03<10:13, 17.53s/it]" + "Generation: 30%|███ | 15/50 [11:08<30:21, 52.05s/it]" ] }, { @@ -340,14 +340,14 @@ "output_type": "stream", "text": [ "Generation: 15\n", - "Best roc_auc_score score: 1.0\n" + "Best roc_auc_score score: 0.9979411764705883\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "Generation: 32%|███▏ | 16/50 [03:11<08:13, 14.52s/it]" + "Generation: 32%|███▏ | 16/50 [12:28<34:16, 60.49s/it]" ] }, { @@ -355,14 +355,14 @@ 
"output_type": "stream", "text": [ "Generation: 16\n", - "Best roc_auc_score score: 1.0\n" + "Best roc_auc_score score: 0.9985249554367203\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "Generation: 34%|███▍ | 17/50 [03:20<07:08, 12.98s/it]" + "Generation: 34%|███▍ | 17/50 [14:09<40:01, 72.78s/it]" ] }, { @@ -370,14 +370,14 @@ "output_type": "stream", "text": [ "Generation: 17\n", - "Best roc_auc_score score: 1.0\n" + "Best roc_auc_score score: 0.999108734402852\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "Generation: 36%|███▌ | 18/50 [03:30<06:26, 12.09s/it]" + "Generation: 36%|███▌ | 18/50 [14:49<33:37, 63.03s/it]" ] }, { @@ -385,14 +385,14 @@ "output_type": "stream", "text": [ "Generation: 18\n", - "Best roc_auc_score score: 1.0\n" + "Best roc_auc_score score: 0.999108734402852\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "Generation: 38%|███▊ | 19/50 [03:39<05:44, 11.11s/it]" + "Generation: 38%|███▊ | 19/50 [15:12<26:14, 50.78s/it]" ] }, { @@ -400,14 +400,14 @@ "output_type": "stream", "text": [ "Generation: 19\n", - "Best roc_auc_score score: 1.0\n" + "Best roc_auc_score score: 0.999108734402852\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "Generation: 40%|████ | 20/50 [03:47<05:06, 10.21s/it]" + "Generation: 40%|████ | 20/50 [15:55<24:16, 48.53s/it]" ] }, { @@ -415,471 +415,37 @@ "output_type": "stream", "text": [ "Generation: 20\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 42%|████▏ | 21/50 [03:58<05:01, 10.39s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 21\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 44%|████▍ | 22/50 [04:07<04:42, 10.08s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 22\n", - "Best roc_auc_score score: 1.0\n" 
- ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 46%|████▌ | 23/50 [04:15<04:13, 9.40s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 23\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 48%|████▊ | 24/50 [04:22<03:49, 8.83s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 24\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 50%|█████ | 25/50 [04:29<03:26, 8.26s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 25\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 52%|█████▏ | 26/50 [04:39<03:26, 8.62s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 26\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 54%|█████▍ | 27/50 [04:51<03:41, 9.62s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 27\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 56%|█████▌ | 28/50 [04:59<03:21, 9.16s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 28\n", - "Best roc_auc_score score: 1.0\n" + "Best roc_auc_score score: 0.999108734402852\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - "Generation: 58%|█████▊ | 29/50 [05:14<03:53, 11.11s/it]" + "Generation: 40%|████ | 20/50 [16:15<24:23, 48.79s/it]" ] }, { "name": "stdout", "output_type": "stream", "text": [ - "Generation: 29\n", - "Best roc_auc_score score: 1.0\n" + "KeyboardInterrupt\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ - 
"Generation: 60%|██████ | 30/50 [05:24<03:33, 10.69s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 30\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 62%|██████▏ | 31/50 [05:34<03:19, 10.52s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 31\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 64%|██████▍ | 32/50 [05:41<02:47, 9.32s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 32\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 66%|██████▌ | 33/50 [05:52<02:49, 10.00s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 33\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 68%|██████▊ | 34/50 [06:00<02:26, 9.18s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 34\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 70%|███████ | 35/50 [06:12<02:31, 10.08s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 35\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 72%|███████▏ | 36/50 [06:20<02:14, 9.60s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 36\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 74%|███████▍ | 37/50 [06:29<02:02, 9.39s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 37\n", - "Best 
roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 76%|███████▌ | 38/50 [06:36<01:44, 8.74s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 38\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 78%|███████▊ | 39/50 [06:44<01:30, 8.26s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 39\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 80%|████████ | 40/50 [06:51<01:20, 8.04s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 40\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 82%|████████▏ | 41/50 [07:03<01:21, 9.11s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 41\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 84%|████████▍ | 42/50 [07:10<01:07, 8.47s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 42\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 86%|████████▌ | 43/50 [07:17<00:57, 8.27s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 43\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 88%|████████▊ | 44/50 [07:26<00:49, 8.24s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 44\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 90%|█████████ | 45/50 [07:33<00:39, 
7.86s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 45\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 92%|█████████▏| 46/50 [07:52<00:45, 11.34s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 46\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 94%|█████████▍| 47/50 [08:13<00:42, 14.12s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 47\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 96%|█████████▌| 48/50 [08:51<00:42, 21.41s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 48\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 98%|█████████▊| 49/50 [08:58<00:17, 17.13s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 49\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Generation: 100%|██████████| 50/50 [09:11<00:00, 11.02s/it]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Generation: 50\n", - "Best roc_auc_score score: 1.0\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "\n" + "\n", + "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/linear_model/_sag.py:349: ConvergenceWarning: The max_iter was reached which means the coef_ did not converge\n", + " warnings.warn(\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ - "total time: 555.5690805912018\n" + "total time: 983.1856114864349\n" ] } ], @@ -898,26 +464,18 @@ "from sklearn.linear_model import LogisticRegression\n", "import 
sklearn\n", "\n", - "X, y = sklearn.datasets.load_iris(return_X_y=True)\n", - "\n", - "graph_search_space = tpot2.search_spaces.pipelines.GraphSearchPipeline(\n", - " root_search_space= tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"]),\n", - " leaf_search_space = tpot2.config.get_search_space(\"selectors\"), \n", - " inner_search_space = tpot2.config.get_search_space([\"transformers\"]),\n", - " max_size = 10,\n", - " )\n", + "X, y = sklearn.datasets.load_breast_cancer(return_X_y=True)\n", "\n", "est = tpot2.TPOTEstimator(\n", " scorers = [\"roc_auc_ovr\"],\n", " scorers_weights = [1],\n", " classification = True,\n", " cv = 5,\n", - " search_space = graph_search_space,\n", + " search_space = 'linear-light',\n", " generations = 50,\n", - " max_eval_time_mins = 60*5,\n", - " verbose = 3,\n", - "\n", + " max_time_mins=None,\n", "\n", + " verbose = 3,\n", " population_size=population_size,\n", " initial_population_size=initial_population_size,\n", " population_scaling = population_scaling,\n", @@ -940,32 +498,32 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### CV early pruning\n", + "## CV early pruning\n", "\n", - "2. Most often, we will be evaluating pipelines using cross validation. However, we can often tell within the first few folds whether or not the pipeline is going have a reasonable change of outperforming the previous best pipelines. For example, if the best score so far is .92 AUROC and the average score of the first five folds of our current pipeline is only around .61, we can be reasonably confident that the next five folds are unlikely to this pipeline ahead of the others. We can save a significant amount of compute by not computing the rest of the folds. There are two strategies that TPOT can use to accomplish this (More information on these strategies in Tutorial 8).\n", + "Most often, we will be evaluating pipelines using cross validation. 
However, we can often tell within the first few folds whether or not the pipeline is going to have a reasonable chance of outperforming the previous best pipelines. For example, if the best score so far is .92 AUROC and the average score of the first five folds of our current pipeline is only around .61, we can be reasonably confident that the next five folds are unlikely to put this pipeline ahead of the others. We can save a significant amount of compute by not computing the rest of the folds. There are two strategies that TPOT can use to accomplish this (More information on these strategies in Tutorial 8).\n", " 1. Threshold Pruning: Pipelines must achieve a score above a predefined percentile threshold (based on previous pipeline scores) to proceed in each cross-validation (CV) fold.\n", " 2. Selection Pruning: Within each population, only the top N% of pipelines (ranked by performance in the previous CV fold) are selected to evaluate in the next fold.\"\n", "\n", "\n", "We can further reduce computational load by terminating the evaluation of individual pipelines early if the first few CV scores are not promising. Note that this is different than early stopping of the full algorithm. In this section we will cover:\n", "\n", - "`threshold_evaluation_early_stop`\n", + "`threshold_evaluation_pruning`\n", "\n", "`threshold_evaluation_scaling`\n", "\n", "`min_history_threshold`\n", "\n", - "`selection_evaluation_early_stop`\n", + "`selection_evaluation_pruning`\n", "\n", "`selection_evaluation_scaling`\n", "\n", "Threshold early stopping uses previous scores to identify and terminate the cross validation evaluation of poorly performing pipelines. We calculate the percentile scores from the previously evaluated pipelines. 
A pipeline must reach the given percentile each fold for the next to be evaluated, otherwise the pipeline is discarded.\n", "\n", - "The `threshold_evaluation_early_stop` parameter is a list that specifies the starting and ending percentiles to use as a threshold for the evaluation early stopping. W The `threshold_evaluation_scaling` parameter is a float that controls the rate at which the threshold moves from the start to end percentile. The `min_history_threshold` parameter specifies the minimum number of previous scores needed before using threshold early stopping. This ensures that the algorithm has enough historical data to make an informed decision about when to stop evaluating pipelines.\n", + "The `threshold_evaluation_pruning` parameter is a list that specifies the starting and ending percentiles to use as a threshold for the evaluation early stopping. The `threshold_evaluation_scaling` parameter is a float that controls the rate at which the threshold moves from the start to end percentile. The `min_history_threshold` parameter specifies the minimum number of previous scores needed before using threshold early stopping. This ensures that the algorithm has enough historical data to make an informed decision about when to stop evaluating pipelines.\n", "\n", "Selection early stopping uses a selection algorithm after each fold to select which algorithms will be evaluated for the next fold. For example, after evaluating 100 individuals on fold 1, we may want to only evaluate the best 50 for the remaining folds.\n", "\n", - "The `selection_evaluation_early_stop` parameter is a list that specifies the lower and upper percentage of the population size to select each round of CV. This is used to determine which individuals to evaluate in the next generation. 
The `selection_evaluation_scaling` parameter is a float that controls the rate at which the selection threshold moves from the start to end percentile.\n", + "The `selection_evaluation_pruning` parameter is a list that specifies the lower and upper percentage of the population size to select each round of CV. This is used to determine which individuals to evaluate in the next generation. The `selection_evaluation_scaling` parameter is a float that controls the rate at which the selection threshold moves from the start to end percentile.\n", "\n", "By manipulating these parameters, we can control how the algorithm selects individuals to evaluate in the next generation and when to stop evaluating pipelines that are not performing well.\n", "\n", @@ -976,12 +534,12 @@ }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 3, "metadata": {}, "outputs": [ { "data": { - "image/png": "", + "image/png": "", "text/plain": [ "
" ] @@ -993,15 +551,18 @@ "source": [ "import matplotlib.pyplot as plt\n", "import tpot2\n", + "import time\n", + "import sklearn\n", + "import sklearn.datasets\n", "\n", - "threshold_evaluation_early_stop = [30, 90]\n", + "threshold_evaluation_pruning = [30, 90]\n", "threshold_evaluation_scaling = .5\n", - "cv = 5\n", + "cv = 10\n", "\n", "#Population and budget use stepwise\n", "fig, ax1 = plt.subplots()\n", "\n", - "interpolated_values = tpot2.utils.beta_interpolation(start=threshold_evaluation_early_stop[0], end=threshold_evaluation_early_stop[-1], n=cv, n_steps=cv, scale=threshold_evaluation_scaling)\n", + "interpolated_values = tpot2.utils.beta_interpolation(start=threshold_evaluation_pruning[0], end=threshold_evaluation_pruning[-1], n=cv, n_steps=cv, scale=threshold_evaluation_scaling)\n", "ax1.step(list(range(len(interpolated_values))), interpolated_values, label=f\"threshold\")\n", "ax1.set_xlabel(\"fold\")\n", "ax1.set_ylabel(\"percentile\")\n", @@ -1011,7 +572,7 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": null, "metadata": {}, "outputs": [ { @@ -1019,43 +580,37 @@ "output_type": "stream", "text": [ "/home/perib/Projects/common/Projects/TPOT_Dev/tpot2/tpot2/tpot_estimator/estimator.py:423: UserWarning: Both generations and max_time_mins are set. TPOT will terminate when the first condition is met.\n", - " warnings.warn(\"Both generations and max_time_mins are set. TPOT will terminate when the first condition is met.\")\n", - "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/linear_model/_sag.py:349: ConvergenceWarning: The max_iter was reached which means the coef_ did not converge\n", - " warnings.warn(\n" + " warnings.warn(\"Both generations and max_time_mins are set. 
TPOT will terminate when the first condition is met.\")\n" ] }, { - "name": "stdout", + "name": "stderr", "output_type": "stream", "text": [ - "total time: 98.39543533325195\n" + "Generation: 20%|██ | 1/5 [00:32<02:11, 32.94s/it]" ] } ], "source": [ - "graph_search_space = tpot2.search_spaces.pipelines.GraphSearchPipeline(\n", - " root_search_space= tpot2.config.get_search_space([\"KNeighborsClassifier\", \"LogisticRegression\", \"DecisionTreeClassifier\"]),\n", - " leaf_search_space = tpot2.config.get_search_space(\"selectors\"), \n", - " inner_search_space = tpot2.config.get_search_space([\"transformers\"]),\n", - " max_size = 10,\n", - " )\n", + "X, y = sklearn.datasets.load_breast_cancer(return_X_y=True)\n", "\n", "\n", "est = tpot2.TPOTEstimator( \n", - " generations=5,\n", + " generations=50,\n", + " max_time_mins=None,\n", " scorers=['roc_auc_ovr'],\n", " scorers_weights=[1],\n", " classification=True,\n", - " search_space = graph_search_space,\n", + " search_space = 'linear-light',\n", " n_jobs=32,\n", " cv=cv,\n", " \n", " # budget_range = [.3,1],\n", " # generations_until_end_budget=4,\n", "\n", - " threshold_evaluation_early_stop = threshold_evaluation_early_stop,\n", + " threshold_evaluation_pruning = threshold_evaluation_pruning,\n", " threshold_evaluation_scaling = threshold_evaluation_scaling,\n", - " verbose=0)\n", + " verbose=2)\n", "\n", "\n", "start = time.time()\n", @@ -1065,7 +620,7 @@ }, { "cell_type": "code", - "execution_count": 7, + "execution_count": 3, "metadata": {}, "outputs": [ { @@ -1083,14 +638,14 @@ "import matplotlib.pyplot as plt\n", "import tpot2\n", "\n", - "selection_evaluation_early_stop = [.1, 1]\n", + "selection_evaluation_pruning = [.1, 1]\n", "selection_evaluation_scaling = .5\n", "cv = 5\n", "\n", "#Population and budget use stepwise\n", "fig, ax1 = plt.subplots()\n", "\n", - "interpolated_values = tpot2.utils.beta_interpolation(start=selection_evaluation_early_stop[0], end=selection_evaluation_early_stop[-1], n=cv, 
n_steps=cv, scale=selection_evaluation_scaling)\n", + "interpolated_values = tpot2.utils.beta_interpolation(start=selection_evaluation_pruning[0], end=selection_evaluation_pruning[-1], n=cv, n_steps=cv, scale=selection_evaluation_scaling)\n", "ax1.step(list(range(len(interpolated_values))), interpolated_values, label=f\"threshold\")\n", "ax1.set_xlabel(\"fold\")\n", "ax1.set_ylabel(\"percent to select\")\n", @@ -1100,24 +655,18 @@ }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 4, "metadata": {}, "outputs": [ { - "name": "stderr", - "output_type": "stream", - "text": [ - "/home/perib/Projects/common/Projects/TPOT_Dev/tpot2/tpot2/tpot_estimator/estimator.py:423: UserWarning: Both generations and max_time_mins are set. TPOT will terminate when the first condition is met.\n", - " warnings.warn(\"Both generations and max_time_mins are set. TPOT will terminate when the first condition is met.\")\n", - "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/linear_model/_sag.py:349: ConvergenceWarning: The max_iter was reached which means the coef_ did not converge\n", - " warnings.warn(\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "total time: 73.06747031211853\n" + "ename": "NameError", + "evalue": "name 'time' is not defined", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", + "Cell \u001b[0;32mIn[4], line 16\u001b[0m\n\u001b[1;32m 1\u001b[0m est \u001b[38;5;241m=\u001b[39m tpot2\u001b[38;5;241m.\u001b[39mTPOTEstimator( \n\u001b[1;32m 2\u001b[0m generations\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m5\u001b[39m,\n\u001b[1;32m 3\u001b[0m scorers\u001b[38;5;241m=\u001b[39m[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mroc_auc_ovr\u001b[39m\u001b[38;5;124m'\u001b[39m],\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 12\u001b[0m \n\u001b[1;32m 
13\u001b[0m verbose\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m0\u001b[39m)\n\u001b[0;32m---> 16\u001b[0m start \u001b[38;5;241m=\u001b[39m \u001b[43mtime\u001b[49m\u001b[38;5;241m.\u001b[39mtime()\n\u001b[1;32m 17\u001b[0m est\u001b[38;5;241m.\u001b[39mfit(X, y)\n\u001b[1;32m 18\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtotal time: \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mtime\u001b[38;5;241m.\u001b[39mtime()\u001b[38;5;241m-\u001b[39mstart\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n", + "\u001b[0;31mNameError\u001b[0m: name 'time' is not defined" ] } ], @@ -1126,15 +675,15 @@ "\n", "\n", "est = tpot2.TPOTEstimator( \n", - " generations=5,\n", - " scorers=['roc_auc_ovr'],\n", + " generations=50,\n", + " max_time_mins=None,\n", " scorers_weights=[1],\n", " classification=True,\n", - " search_space = graph_search_space,\n", + " search_space = \"linear-light\",\n", " n_jobs=32,\n", " cv=cv,\n", "\n", - " selection_evaluation_early_stop = selection_evaluation_early_stop,\n", + " selection_evaluation_pruning = selection_evaluation_pruning,\n", " selection_evaluation_scaling = selection_evaluation_scaling,\n", "\n", " verbose=0)\n", @@ -1201,11 +750,12 @@ ], "source": [ "est = tpot2.TPOTEstimator( \n", - " generations=5,\n", + " generations=50,\n", + " max_time_mins=None,\n", " scorers=['roc_auc_ovr'],\n", " scorers_weights=[1],\n", " classification=True,\n", - " search_space = graph_search_space,\n", + " search_space = 'linear-light',\n", " n_jobs=32,\n", " cv=cv,\n", "\n", @@ -1217,10 +767,10 @@ " budget_range = budget_range,\n", " generations_until_end_budget=generations_until_end_budget,\n", " \n", - " threshold_evaluation_early_stop = threshold_evaluation_early_stop,\n", + " threshold_evaluation_pruning = threshold_evaluation_pruning,\n", " threshold_evaluation_scaling = threshold_evaluation_scaling,\n", "\n", - " selection_evaluation_early_stop = 
selection_evaluation_early_stop,\n", + " selection_evaluation_pruning = selection_evaluation_pruning,\n", " selection_evaluation_scaling = selection_evaluation_scaling,\n", "\n", " verbose=0)\n", diff --git a/tpot2/evolvers/base_evolver.py b/tpot2/evolvers/base_evolver.py index 3cb0a3ce..54995803 100644 --- a/tpot2/evolvers/base_evolver.py +++ b/tpot2/evolvers/base_evolver.py @@ -97,10 +97,10 @@ def __init__( self, generations_until_end_budget = 1, stepwise_steps = 5, - threshold_evaluation_early_stop = None, + threshold_evaluation_pruning = None, threshold_evaluation_scaling = .5, min_history_threshold = 20, - selection_evaluation_early_stop = None, + selection_evaluation_pruning = None, selection_evaluation_scaling = .5, evaluation_early_stop_steps = None, final_score_strategy = "mean", @@ -154,7 +154,7 @@ def __init__( self, Maximum time to evaluate a single individual. If none or inf, there will be no time limit per evaluation. n_jobs : int, default=1 Number of processes to run in parallel. - memory_limit : str, default="4GB" + memory_limit : str, default=None Memory limit for each job. See Dask [LocalCluster documentation](https://distributed.dask.org/en/stable/api.html#distributed.Client) for more information. client : dask.distributed.Client, default=None A dask client to use for parallelization. If not None, this will override the n_jobs and memory_limit parameters. If None, will create a new client with num_workers=n_jobs and memory_limit=memory_limit. @@ -186,7 +186,7 @@ def __init__( self, The number of generations to run before reaching the max budget. stepwise_steps : int, default=1 The number of staircase steps to take when interpolating the budget and population size. - threshold_evaluation_early_stop : list [start, end], default=None + threshold_evaluation_pruning : list [start, end], default=None Starting and ending percentile to use as a threshold for the evaluation early stopping. 
The evolver will interpolate between these values over the evaluation_early_stop_steps. Values between 0 and 100. At each step of the evaluation, a threshold is calculated based on the previous evaluations. All individuals that are below the performance threshold are not evaluated for further steps. @@ -196,7 +196,7 @@ def __init__( self, Must be greater than zero. Higher numbers will move the threshold to the end faster. min_history_threshold : int, default=0 The minimum number of previous scores needed before using threshold early stopping. - selection_evaluation_early_stop : list, default=None + selection_evaluation_pruning : list, default=None A lower and upper percent of the population size to select each round of CV. Values between 0 and 1. Selects a percentage of the population to evaluate at each step of the evaluation. @@ -243,9 +243,9 @@ def __init__( self, self.rng = np.random.default_rng(rng) - if threshold_evaluation_early_stop is not None or selection_evaluation_early_stop is not None: + if threshold_evaluation_pruning is not None or selection_evaluation_pruning is not None: if evaluation_early_stop_steps is None: - raise ValueError("evaluation_early_stop_steps must be set when using threshold_evaluation_early_stop or selection_evaluation_early_stop") + raise ValueError("evaluation_early_stop_steps must be set when using threshold_evaluation_pruning or selection_evaluation_pruning") self.individual_generator = individual_generator self.population_size = population_size @@ -292,11 +292,11 @@ def __init__( self, self.generation = 0 - self.threshold_evaluation_early_stop =threshold_evaluation_early_stop + self.threshold_evaluation_pruning =threshold_evaluation_pruning self.threshold_evaluation_scaling = max(0.00001,threshold_evaluation_scaling ) self.min_history_threshold = min_history_threshold - self.selection_evaluation_early_stop = selection_evaluation_early_stop + self.selection_evaluation_pruning = selection_evaluation_pruning 
self.selection_evaluation_scaling = max(0.00001,selection_evaluation_scaling ) self.evaluation_early_stop_steps = evaluation_early_stop_steps self.final_score_strategy = final_score_strategy @@ -616,21 +616,21 @@ def evaluate_population(self,): #Get the current thresholds per step self.thresholds = None - if self.threshold_evaluation_early_stop is not None: + if self.threshold_evaluation_pruning is not None: old_data = self.population.evaluated_individuals[self.objective_names] old_data = old_data[old_data[self.objective_names].notnull().all(axis=1)] if len(old_data) >= self.min_history_threshold: self.thresholds = np.array([get_thresholds(old_data[obj_name], - start=self.threshold_evaluation_early_stop[0], - end=self.threshold_evaluation_early_stop[1], + start=self.threshold_evaluation_pruning[0], + end=self.threshold_evaluation_pruning[1], scale=self.threshold_evaluation_scaling, n=self.evaluation_early_stop_steps) for obj_name in self.objective_names]).T #Get the selectors survival rates per step - if self.selection_evaluation_early_stop is not None: - lower = self.cur_population_size*self.selection_evaluation_early_stop[0] - upper = self.cur_population_size*self.selection_evaluation_early_stop[1] + if self.selection_evaluation_pruning is not None: + lower = self.cur_population_size*self.selection_evaluation_pruning[0] + upper = self.cur_population_size*self.selection_evaluation_pruning[1] #survival_counts = self.cur_population_size*(scipy.special.betainc(1,self.selection_evaluation_scaling,np.linspace(0,1,self.evaluation_early_stop_steps))*(upper-lower)+lower) survival_counts = np.array(beta_interpolation(start=lower, end=upper, scale=self.selection_evaluation_scaling, n=self.evaluation_early_stop_steps, n_steps=self.evaluation_early_stop_steps)) diff --git a/tpot2/evolvers/steady_state_evolver.py b/tpot2/evolvers/steady_state_evolver.py index bea6dd2d..6a3730f8 100644 --- a/tpot2/evolvers/steady_state_evolver.py +++ b/tpot2/evolvers/steady_state_evolver.py @@ 
-144,7 +144,7 @@ def __init__( self, Maximum time to evaluate a single individual. If none or inf, there will be no time limit per evaluation. n_jobs : int, default=1 Number of processes to run in parallel. - memory_limit : str, default="4GB" + memory_limit : str, default=None Memory limit for each job. See Dask [LocalCluster documentation](https://distributed.dask.org/en/stable/api.html#distributed.Client) for more information. client : dask.distributed.Client, default=None A dask client to use for parallelization. If not None, this will override the n_jobs and memory_limit parameters. If None, will create a new client with num_workers=n_jobs and memory_limit=memory_limit. diff --git a/tpot2/tpot_estimator/estimator.py b/tpot2/tpot_estimator/estimator.py index 7202175d..ab51ee94 100644 --- a/tpot2/tpot_estimator/estimator.py +++ b/tpot2/tpot_estimator/estimator.py @@ -63,9 +63,9 @@ def __init__(self, early_stop = None, scorers_early_stop_tol = 0.001, other_objectives_early_stop_tol =None, - threshold_evaluation_early_stop = None, + threshold_evaluation_pruning = None, threshold_evaluation_scaling = .5, - selection_evaluation_early_stop = None, + selection_evaluation_pruning = None, selection_evaluation_scaling = .5, min_history_threshold = 20, @@ -175,7 +175,7 @@ def __init__(self, - List of categorical features. If X is a dataframe, this should be a list of column names. If X is a numpy array, this should be a list of column indices preprocessing : bool or BaseEstimator/Pipeline, - EXPERIMENTAL + EXPERIMENTAL - will be changed in future versions A pipeline that will be used to preprocess the data before CV. Note that the parameters for these steps are not optimized. Add them to the search space to be optimized. - bool : If True, will use a default preprocessing pipeline which includes imputation followed by one hot encoding. - Pipeline : If an instance of a pipeline is given, will use that pipeline as the preprocessing pipeline. 
@@ -232,7 +232,7 @@ def __init__(self, -int If an int is given, it will be used as the tolerance for all objectives - threshold_evaluation_early_stop : list [start, end], default=None + threshold_evaluation_pruning : list [start, end], default=None starting and ending percentile to use as a threshold for the evaluation early stopping. Values between 0 and 100. @@ -240,7 +240,7 @@ def __init__(self, A scaling factor to use when determining how fast we move the threshold moves from the start to end percentile. Must be greater than zero. Higher numbers will move the threshold to the end faster. - selection_evaluation_early_stop : list, default=None + selection_evaluation_pruning : list, default=None A lower and upper percent of the population size to select each round of CV. Values between 0 and 1. @@ -290,7 +290,7 @@ def __init__(self, n_jobs : int, default=1 Number of processes to run in parallel. - memory_limit : str, default="4GB" + memory_limit : str, default=None Memory limit for each job. See Dask [LocalCluster documentation](https://distributed.dask.org/en/stable/api.html#distributed.Client) for more information. 
client : dask.distributed.Client, default=None @@ -403,10 +403,10 @@ def __init__(self, self.budget_scaling = budget_scaling self.generations_until_end_budget = generations_until_end_budget self.stepwise_steps = stepwise_steps - self.threshold_evaluation_early_stop =threshold_evaluation_early_stop + self.threshold_evaluation_pruning =threshold_evaluation_pruning self.threshold_evaluation_scaling = threshold_evaluation_scaling self.min_history_threshold = min_history_threshold - self.selection_evaluation_early_stop = selection_evaluation_early_stop + self.selection_evaluation_pruning = selection_evaluation_pruning self.selection_evaluation_scaling = selection_evaluation_scaling self.warm_start = warm_start self.verbose = verbose @@ -623,7 +623,7 @@ def objective_function(pipeline_individual, - if self.threshold_evaluation_early_stop is not None or self.selection_evaluation_early_stop is not None: + if self.threshold_evaluation_pruning is not None or self.selection_evaluation_pruning is not None: evaluation_early_stop_steps = self.cv else: evaluation_early_stop_steps = None @@ -698,11 +698,11 @@ def ind_generator(rng): max_eval_time_mins = self.max_eval_time_mins, periodic_checkpoint_folder = self.periodic_checkpoint_folder, - threshold_evaluation_early_stop = self.threshold_evaluation_early_stop, + threshold_evaluation_pruning = self.threshold_evaluation_pruning, threshold_evaluation_scaling = self.threshold_evaluation_scaling, min_history_threshold = self.min_history_threshold, - selection_evaluation_early_stop = self.selection_evaluation_early_stop, + selection_evaluation_pruning = self.selection_evaluation_pruning, selection_evaluation_scaling = self.selection_evaluation_scaling, evaluation_early_stop_steps = evaluation_early_stop_steps, diff --git a/tpot2/tpot_estimator/steady_state_estimator.py b/tpot2/tpot_estimator/steady_state_estimator.py index 57b3df28..8dcbfe5d 100644 --- a/tpot2/tpot_estimator/steady_state_estimator.py +++ 
b/tpot2/tpot_estimator/steady_state_estimator.py @@ -56,7 +56,7 @@ def __init__(self, early_stop = None, - early_stop_seconds = None, + early_stop_mins = None, scorers_early_stop_tol = 0.001, other_objectives_early_stop_tol = None, max_time_mins=None, @@ -251,7 +251,7 @@ def __init__(self, early_stop : int, default=None Number of evaluated individuals without improvement before early stopping. Counted across all objectives independently. Triggered when all objectives have not improved by the given number of individuals. - early_stop_seconds : float, default=None + early_stop_mins : float, default=None Number of seconds without improvement before early stopping. All objectives must not have improved for the given number of seconds for this to be triggered. scorers_early_stop_tol : @@ -277,7 +277,7 @@ def __init__(self, n_jobs : int, default=1 Number of processes to run in parallel. - memory_limit : str, default="4GB" + memory_limit : str, default=None Memory limit for each job. See Dask [LocalCluster documentation](https://distributed.dask.org/en/stable/api.html#distributed.Client) for more information. client : dask.distributed.Client, default=None @@ -314,7 +314,7 @@ def __init__(self, stepwise_steps : int, default=1 The number of staircase steps to take when scaling the budget and population size. - threshold_evaluation_early_stop : list [start, end], default=None + threshold_evaluation_pruning : list [start, end], default=None starting and ending percentile to use as a threshold for the evaluation early stopping. Values between 0 and 100. @@ -325,7 +325,7 @@ def __init__(self, min_history_threshold : int, default=0 The minimum number of previous scores needed before using threshold early stopping. - selection_evaluation_early_stop : list, default=None + selection_evaluation_pruning : list, default=None A lower and upper percent of the population size to select each round of CV. Values between 0 and 1. 
@@ -426,7 +426,7 @@ def __init__(self, self.initial_population_size = initial_population_size self.early_stop = early_stop - self.early_stop_seconds = early_stop_seconds + self.early_stop_mins = early_stop_mins self.scorers_early_stop_tol = scorers_early_stop_tol self.other_objectives_early_stop_tol = other_objectives_early_stop_tol self.max_time_mins = max_time_mins @@ -699,7 +699,7 @@ def ind_generator(rng): early_stop_tol = self.early_stop_tol, early_stop= self.early_stop, - early_stop_seconds = self.early_stop_seconds, + early_stop_mins = self.early_stop_mins, budget_range = self.budget_range, budget_scaling = self.budget_scaling, diff --git a/tpot2/tpot_estimator/templates/tpottemplates.py b/tpot2/tpot_estimator/templates/tpottemplates.py index 5b007950..15870747 100644 --- a/tpot2/tpot_estimator/templates/tpottemplates.py +++ b/tpot2/tpot_estimator/templates/tpottemplates.py @@ -159,7 +159,7 @@ def __init__( self, 6. evaluations progress bar. (Temporary: This used to be 2. Currently, using evaluation progress bar may prevent some instances were we terminate a generation early due to it reaching max_time_mins in the middle of a generation OR a pipeline failed to be terminated normally and we need to manually terminate it.) - memory_limit : str, default="4GB" + memory_limit : str, default=None Memory limit for each job. See Dask [LocalCluster documentation](https://distributed.dask.org/en/stable/api.html#distributed.Client) for more information. client : dask.distributed.Client, default=None @@ -422,7 +422,7 @@ def __init__( self, 6. evaluations progress bar. (Temporary: This used to be 2. Currently, using evaluation progress bar may prevent some instances were we terminate a generation early due to it reaching max_time_mins in the middle of a generation OR a pipeline failed to be terminated normally and we need to manually terminate it.) - memory_limit : str, default="4GB" + memory_limit : str, default=None Memory limit for each job. 
See Dask [LocalCluster documentation](https://distributed.dask.org/en/stable/api.html#distributed.Client) for more information. client : dask.distributed.Client, default=None From d4128846a8f4466e9989bf60be2effa30414c766 Mon Sep 17 00:00:00 2001 From: perib Date: Sun, 29 Sep 2024 07:47:57 -0700 Subject: [PATCH 33/44] pass through memory/kwards to all inner search spaces --- Tutorial/2_Search_Spaces.ipynb | 4425 ++++++++--------- tpot2/builtin_modules/estimatortransformer.py | 3 +- .../search_spaces/pipelines/dynamic_linear.py | 2 +- tpot2/search_spaces/pipelines/dynamicunion.py | 2 +- tpot2/search_spaces/pipelines/graph.py | 2 +- tpot2/search_spaces/pipelines/sequential.py | 2 +- tpot2/search_spaces/pipelines/union.py | 2 +- 7 files changed, 2144 insertions(+), 2294 deletions(-) diff --git a/Tutorial/2_Search_Spaces.ipynb b/Tutorial/2_Search_Spaces.ipynb index fcc6a8e9..782fbb19 100644 --- a/Tutorial/2_Search_Spaces.ipynb +++ b/Tutorial/2_Search_Spaces.ipynb @@ -28,7 +28,7 @@ }, { "cell_type": "code", - "execution_count": 21, + "execution_count": 1, "metadata": {}, "outputs": [ { @@ -36,7 +36,7 @@ "output_type": "stream", "text": [ "sampled hyperparameters\n", - "{'bootstrap': False, 'criterion': 'gini', 'max_features': 0.0696410090574, 'min_samples_leaf': 7, 'min_samples_split': 8, 'n_estimators': 128}\n" + "{'bootstrap': False, 'criterion': 'entropy', 'max_features': 0.1574830347299, 'min_samples_leaf': 10, 'min_samples_split': 6, 'n_estimators': 128}\n" ] }, { @@ -446,19 +446,19 @@ " /* fitted */\n", " background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
RandomForestClassifier(bootstrap=False, max_features=0.0696410090574,\n",
-       "                       min_samples_leaf=7, min_samples_split=8,\n",
-       "                       n_estimators=128)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
RandomForestClassifier(bootstrap=False, criterion='entropy',\n",
+       "                       max_features=0.1574830347299, min_samples_leaf=10,\n",
+       "                       min_samples_split=6, n_estimators=128)
" ], "text/plain": [ - "RandomForestClassifier(bootstrap=False, max_features=0.0696410090574,\n", - " min_samples_leaf=7, min_samples_split=8,\n", - " n_estimators=128)" + "RandomForestClassifier(bootstrap=False, criterion='entropy',\n", + " max_features=0.1574830347299, min_samples_leaf=10,\n", + " min_samples_split=6, n_estimators=128)" ] }, - "execution_count": 21, + "execution_count": 1, "metadata": {}, "output_type": "execute_result" } @@ -501,7 +501,7 @@ }, { "cell_type": "code", - "execution_count": 22, + "execution_count": 2, "metadata": {}, "outputs": [ { @@ -509,7 +509,7 @@ "output_type": "stream", "text": [ "sampled hyperparameters\n", - "{'bootstrap': False, 'criterion': 'entropy', 'max_features': 0.2320810853841, 'min_samples_leaf': 19, 'min_samples_split': 12, 'n_estimators': 128}\n" + "{'bootstrap': True, 'criterion': 'entropy', 'max_features': 0.2601475241557, 'min_samples_leaf': 17, 'min_samples_split': 3, 'n_estimators': 128}\n" ] }, { @@ -919,19 +919,19 @@ " /* fitted */\n", " background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
RandomForestClassifier(bootstrap=False, criterion='entropy',\n",
-       "                       max_features=0.2320810853841, min_samples_leaf=19,\n",
-       "                       min_samples_split=12, n_estimators=128)
" + "
RandomForestClassifier(criterion='entropy', max_features=0.2601475241557,\n",
+       "                       min_samples_leaf=17, min_samples_split=3,\n",
+       "                       n_estimators=128)
" ], "text/plain": [ - "RandomForestClassifier(bootstrap=False, criterion='entropy',\n", - " max_features=0.2320810853841, min_samples_leaf=19,\n", - " min_samples_split=12, n_estimators=128)" + "RandomForestClassifier(criterion='entropy', max_features=0.2601475241557,\n", + " min_samples_leaf=17, min_samples_split=3,\n", + " n_estimators=128)" ] }, - "execution_count": 22, + "execution_count": 2, "metadata": {}, "output_type": "execute_result" } @@ -1017,7 +1017,7 @@ }, { "cell_type": "code", - "execution_count": 23, + "execution_count": 3, "metadata": {}, "outputs": [], "source": [ @@ -1053,16 +1053,16 @@ }, { "cell_type": "code", - "execution_count": 24, + "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "" + "" ] }, - "execution_count": 24, + "execution_count": 4, "metadata": {}, "output_type": "execute_result" } @@ -1074,7 +1074,7 @@ }, { "cell_type": "code", - "execution_count": 25, + "execution_count": 5, "metadata": {}, "outputs": [ { @@ -1082,7 +1082,7 @@ "output_type": "stream", "text": [ "sampled hyperparameters\n", - "{'metric': 'minkowski', 'n_jobs': 1, 'n_neighbors': 7, 'p': 1, 'weights': 'uniform'}\n" + "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 9, 'p': 1, 'weights': 'uniform'}\n" ] } ], @@ -1100,7 +1100,7 @@ }, { "cell_type": "code", - "execution_count": 26, + "execution_count": 6, "metadata": {}, "outputs": [ { @@ -1108,7 +1108,7 @@ "output_type": "stream", "text": [ "mutated hyperparameters\n", - "{'metric': 'minkowski', 'n_jobs': 1, 'n_neighbors': 7, 'p': 2, 'weights': 'uniform'}\n" + "{'metric': 'minkowski', 'n_jobs': 1, 'n_neighbors': 3, 'p': 3, 'weights': 'distance'}\n" ] } ], @@ -1127,7 +1127,7 @@ }, { "cell_type": "code", - "execution_count": 27, + "execution_count": 7, "metadata": {}, "outputs": [ { @@ -1135,14 +1135,14 @@ "output_type": "stream", "text": [ "original hyperparameters for individual 1\n", - "{'metric': 'minkowski', 'n_jobs': 1, 'n_neighbors': 3, 'p': 3, 'weights': 'uniform'}\n", + 
"{'metric': 'minkowski', 'n_jobs': 1, 'n_neighbors': 6, 'p': 2, 'weights': 'distance'}\n", "original hyperparameters for individual 2\n", - "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 7, 'p': 3, 'weights': 'distance'}\n", + "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 4, 'p': 2, 'weights': 'uniform'}\n", "\n", "post crossover hyperparameters for individual 1\n", - "{'metric': 'minkowski', 'n_jobs': 1, 'n_neighbors': 7, 'p': 3, 'weights': 'uniform'}\n", + "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 6, 'p': 2, 'weights': 'uniform'}\n", "post crossover hyperparameters for individual 2\n", - "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 7, 'p': 3, 'weights': 'distance'}\n" + "{'metric': 'euclidean', 'n_jobs': 1, 'n_neighbors': 4, 'p': 2, 'weights': 'uniform'}\n" ] } ], @@ -1175,7 +1175,7 @@ }, { "cell_type": "code", - "execution_count": 28, + "execution_count": 8, "metadata": {}, "outputs": [ { @@ -1585,13 +1585,13 @@ " /* fitted */\n", " background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
KNeighborsClassifier(n_jobs=1, n_neighbors=7, p=3)
" + "
KNeighborsClassifier(metric='euclidean', n_jobs=1, n_neighbors=6)
" ], "text/plain": [ - "KNeighborsClassifier(n_jobs=1, n_neighbors=7, p=3)" + "KNeighborsClassifier(metric='euclidean', n_jobs=1, n_neighbors=6)" ] }, - "execution_count": 28, + "execution_count": 8, "metadata": {}, "output_type": "execute_result" } @@ -1610,7 +1610,7 @@ }, { "cell_type": "code", - "execution_count": 29, + "execution_count": 9, "metadata": {}, "outputs": [ { @@ -2026,7 +2026,7 @@ "KNeighborsClassifier(n_neighbors=10)" ] }, - "execution_count": 29, + "execution_count": 9, "metadata": {}, "output_type": "execute_result" } @@ -2079,16 +2079,16 @@ }, { "cell_type": "code", - "execution_count": 30, + "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "" + "" ] }, - "execution_count": 30, + "execution_count": 10, "metadata": {}, "output_type": "execute_result" } @@ -2186,7 +2186,7 @@ }, { "cell_type": "code", - "execution_count": 31, + "execution_count": 11, "metadata": {}, "outputs": [ { @@ -2603,16 +2603,16 @@ " /* fitted */\n", " background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
DecisionTreeClassifier(max_depth=11, max_features='sqrt', min_samples_leaf=20,\n",
-       "                       min_samples_split=10)
" + "
LogisticRegression(C=0.0008500633703, class_weight='balanced', max_iter=1000,\n",
+       "                   n_jobs=1, penalty='l1', solver='saga')
" ], "text/plain": [ - "DecisionTreeClassifier(max_depth=11, max_features='sqrt', min_samples_leaf=20,\n", - " min_samples_split=10)" + "LogisticRegression(C=0.0008500633703, class_weight='balanced', max_iter=1000,\n", + " n_jobs=1, penalty='l1', solver='saga')" ] }, - "execution_count": 31, + "execution_count": 11, "metadata": {}, "output_type": "execute_result" } @@ -2626,7 +2626,7 @@ }, { "cell_type": "code", - "execution_count": 32, + "execution_count": 12, "metadata": {}, "outputs": [ { @@ -3043,16 +3043,16 @@ " /* fitted */\n", " background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
KNeighborsClassifier(metric='euclidean', n_jobs=1, n_neighbors=6,\n",
-       "                     weights='distance')
" + "
LogisticRegression(C=0.1054489422979, class_weight='balanced', max_iter=1000,\n",
+       "                   n_jobs=1, penalty='l1', solver='liblinear')
" ], "text/plain": [ - "KNeighborsClassifier(metric='euclidean', n_jobs=1, n_neighbors=6,\n", - " weights='distance')" + "LogisticRegression(C=0.1054489422979, class_weight='balanced', max_iter=1000,\n", + " n_jobs=1, penalty='l1', solver='liblinear')" ] }, - "execution_count": 32, + "execution_count": 12, "metadata": {}, "output_type": "execute_result" } @@ -3230,7 +3230,7 @@ }, { "cell_type": "code", - "execution_count": 33, + "execution_count": 13, "metadata": {}, "outputs": [ { @@ -3647,16 +3647,13 @@ " /* fitted */\n", " background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
DecisionTreeClassifier(max_depth=3, max_features='sqrt', min_samples_leaf=16,\n",
-       "                       min_samples_split=8)
" + "
KNeighborsClassifier(n_jobs=1, n_neighbors=55, weights='distance')
" ], "text/plain": [ - "DecisionTreeClassifier(max_depth=3, max_features='sqrt', min_samples_leaf=16,\n", - " min_samples_split=8)" + "KNeighborsClassifier(n_jobs=1, n_neighbors=55, weights='distance')" ] }, - "execution_count": 33, + "execution_count": 13, "metadata": {}, "output_type": "execute_result" } @@ -3671,7 +3668,7 @@ }, { "cell_type": "code", - "execution_count": 34, + "execution_count": 14, "metadata": {}, "outputs": [ { @@ -4088,13 +4085,16 @@ " /* fitted */\n", " background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
LogisticRegression(C=203.4209981734027, max_iter=1000, n_jobs=1, solver='saga')
" + "
LogisticRegression(C=0.012915602763, l1_ratio=0.2577823332886, max_iter=1000,\n",
+       "                   n_jobs=1, penalty='elasticnet', solver='saga')
" ], "text/plain": [ - "LogisticRegression(C=203.4209981734027, max_iter=1000, n_jobs=1, solver='saga')" + "LogisticRegression(C=0.012915602763, l1_ratio=0.2577823332886, max_iter=1000,\n", + " n_jobs=1, penalty='elasticnet', solver='saga')" ] }, - "execution_count": 34, + "execution_count": 14, "metadata": {}, "output_type": "execute_result" } @@ -4106,7 +4106,7 @@ }, { "cell_type": "code", - "execution_count": 35, + "execution_count": 15, "metadata": {}, "outputs": [ { @@ -4523,13 +4523,19 @@ " /* fitted */\n", " background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
GaussianNB()
" + "
SGDClassifier(alpha=0.0038384092036, class_weight='balanced',\n",
+       "              eta0=0.7197535254246, l1_ratio=0.8816063677431,\n",
+       "              loss='modified_huber', n_jobs=1, penalty='elasticnet')
" ], "text/plain": [ - "GaussianNB()" + "SGDClassifier(alpha=0.0038384092036, class_weight='balanced',\n", + " eta0=0.7197535254246, l1_ratio=0.8816063677431,\n", + " loss='modified_huber', n_jobs=1, penalty='elasticnet')" ] }, - "execution_count": 35, + "execution_count": 15, "metadata": {}, "output_type": "execute_result" } @@ -4544,7 +4550,7 @@ }, { "cell_type": "code", - "execution_count": 36, + "execution_count": 16, "metadata": {}, "outputs": [ { @@ -4961,13 +4967,13 @@ " /* fitted */\n", " background-color: var(--sklearn-color-fitted-level-3);\n", "}\n", - "
MultinomialNB(alpha=0.2214451695279)
" + "
KNeighborsClassifier(n_jobs=1, n_neighbors=1, p=1, weights='distance')
" ], "text/plain": [ - "MultinomialNB(alpha=0.2214451695279)" + "KNeighborsClassifier(n_jobs=1, n_neighbors=1, p=1, weights='distance')" ] }, - "execution_count": 36, + "execution_count": 16, "metadata": {}, "output_type": "execute_result" } @@ -4987,13 +4993,13 @@ }, { "cell_type": "code", - "execution_count": 75, + "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
RandomForestClassifier(bootstrap=False, max_features=0.5976648428162,\n",
-       "                       min_samples_leaf=4, min_samples_split=7,\n",
-       "                       n_estimators=128, random_state=1)
" + "
RandomForestClassifier(bootstrap=False, criterion='entropy',\n",
+       "                       max_features=0.0121463021153, min_samples_leaf=10,\n",
+       "                       min_samples_split=14, n_estimators=128, random_state=1)
" ], "text/plain": [ - "RandomForestClassifier(bootstrap=False, max_features=0.5976648428162,\n", - " min_samples_leaf=4, min_samples_split=7,\n", - " n_estimators=128, random_state=1)" + "RandomForestClassifier(bootstrap=False, criterion='entropy',\n", + " max_features=0.0121463021153, min_samples_leaf=10,\n", + " min_samples_split=14, n_estimators=128, random_state=1)" ] }, - "execution_count": 75, + "execution_count": 17, "metadata": {}, "output_type": "execute_result" } @@ -5430,7 +5436,7 @@ }, { "cell_type": "code", - "execution_count": 37, + "execution_count": 18, "metadata": {}, "outputs": [ { @@ -5443,7 +5449,7 @@ { "data": { "text/html": [ - "
Pipeline(steps=[('variancethreshold',\n",
-       "                 VarianceThreshold(threshold=0.0061644734724)),\n",
-       "                ('pca', PCA(n_components=0.5803735556718)),\n",
+       "
Pipeline(steps=[('variancethreshold',\n",
+       "                 VarianceThreshold(threshold=0.0008293708451)),\n",
+       "                ('pca', PCA(n_components=0.5048643890372)),\n",
        "                ('logisticregression',\n",
-       "                 LogisticRegression(C=0.0331002885417, class_weight='balanced',\n",
-       "                                    max_iter=1000, n_jobs=1, solver='saga'))])
VarianceThreshold(threshold=0.0008293708451)
PCA(n_components=0.5048643890372)
LogisticRegression(C=7.7606337566295, class_weight='balanced',\n",
+       "                   l1_ratio=0.123465163557, max_iter=1000, n_jobs=1,\n",
+       "                   penalty='elasticnet', solver='saga')
" ], "text/plain": [ "Pipeline(steps=[('variancethreshold',\n", - " VarianceThreshold(threshold=0.0061644734724)),\n", - " ('pca', PCA(n_components=0.5803735556718)),\n", + " VarianceThreshold(threshold=0.0008293708451)),\n", + " ('pca', PCA(n_components=0.5048643890372)),\n", " ('logisticregression',\n", - " LogisticRegression(C=0.0331002885417, class_weight='balanced',\n", - " max_iter=1000, n_jobs=1, solver='saga'))])" + " LogisticRegression(C=7.7606337566295, class_weight='balanced',\n", + " l1_ratio=0.123465163557, max_iter=1000,\n", + " n_jobs=1, penalty='elasticnet',\n", + " solver='saga'))])" ] }, - "execution_count": 37, + "execution_count": 18, "metadata": {}, "output_type": "execute_result" } @@ -5900,7 +5913,7 @@ }, { "cell_type": "code", - "execution_count": 38, + "execution_count": 19, "metadata": {}, "outputs": [ { @@ -5913,7 +5926,7 @@ { "data": { "text/html": [ - "
Pipeline(steps=[('selectpercentile',\n",
-       "                 SelectPercentile(percentile=82.0501639184698)),\n",
-       "                ('zerocount', ZeroCount()),\n",
-       "                ('multinomialnb', MultinomialNB(alpha=0.7116498874199))])
" + "
Pipeline(steps=[('variancethreshold',\n",
+       "                 VarianceThreshold(threshold=0.1215210592814)),\n",
+       "                ('fastica', FastICA(n_components=83)),\n",
+       "                ('baggingclassifier',\n",
+       "                 BaggingClassifier(bootstrap_features=True,\n",
+       "                                   max_features=0.9057563115025,\n",
+       "                                   max_samples=0.2313759070451, n_estimators=89,\n",
+       "                                   n_jobs=1))])
" ], "text/plain": [ - "Pipeline(steps=[('selectpercentile',\n", - " SelectPercentile(percentile=82.0501639184698)),\n", - " ('zerocount', ZeroCount()),\n", - " ('multinomialnb', MultinomialNB(alpha=0.7116498874199))])" + "Pipeline(steps=[('variancethreshold',\n", + " VarianceThreshold(threshold=0.1215210592814)),\n", + " ('fastica', FastICA(n_components=83)),\n", + " ('baggingclassifier',\n", + " BaggingClassifier(bootstrap_features=True,\n", + " max_features=0.9057563115025,\n", + " max_samples=0.2313759070451, n_estimators=89,\n", + " n_jobs=1))])" ] }, - "execution_count": 38, + "execution_count": 19, "metadata": {}, "output_type": "execute_result" } @@ -6354,7 +6380,7 @@ }, { "cell_type": "code", - "execution_count": 39, + "execution_count": 20, "metadata": {}, "outputs": [ { @@ -6367,7 +6393,7 @@ { "data": { "text/html": [ - "
Pipeline(steps=[('variancethreshold',\n",
-       "                 VarianceThreshold(threshold=0.0090024083095)),\n",
-       "                ('nystroem',\n",
-       "                 Nystroem(gamma=0.326846805684, kernel='chi2',\n",
-       "                          n_components=49)),\n",
-       "                ('mlpclassifier',\n",
-       "                 MLPClassifier(activation='identity', alpha=0.0009288789905,\n",
-       "                               early_stopping=True,\n",
-       "                               hidden_layer_sizes=[265, 265],\n",
-       "                               learning_rate='invscaling',\n",
-       "                               learning_rate_init=0.0366758440485,\n",
-       "                               n_iter_no_change=32))])
" + "
Pipeline(steps=[('selectpercentile',\n",
+       "                 SelectPercentile(percentile=25.1697450346144)),\n",
+       "                ('kbinsdiscretizer',\n",
+       "                 KBinsDiscretizer(encode='onehot-dense', n_bins=40,\n",
+       "                                  strategy='uniform')),\n",
+       "                ('lineardiscriminantanalysis',\n",
+       "                 LinearDiscriminantAnalysis(shrinkage=0.755769834898,\n",
+       "                                            solver='eigen'))])
" ], "text/plain": [ - "Pipeline(steps=[('variancethreshold',\n", - " VarianceThreshold(threshold=0.0090024083095)),\n", - " ('nystroem',\n", - " Nystroem(gamma=0.326846805684, kernel='chi2',\n", - " n_components=49)),\n", - " ('mlpclassifier',\n", - " MLPClassifier(activation='identity', alpha=0.0009288789905,\n", - " early_stopping=True,\n", - " hidden_layer_sizes=[265, 265],\n", - " learning_rate='invscaling',\n", - " learning_rate_init=0.0366758440485,\n", - " n_iter_no_change=32))])" + "Pipeline(steps=[('selectpercentile',\n", + " SelectPercentile(percentile=25.1697450346144)),\n", + " ('kbinsdiscretizer',\n", + " KBinsDiscretizer(encode='onehot-dense', n_bins=40,\n", + " strategy='uniform')),\n", + " ('lineardiscriminantanalysis',\n", + " LinearDiscriminantAnalysis(shrinkage=0.755769834898,\n", + " solver='eigen'))])" ] }, - "execution_count": 39, + "execution_count": 20, "metadata": {}, "output_type": "execute_result" } @@ -6833,7 +6845,7 @@ }, { "cell_type": "code", - "execution_count": 40, + "execution_count": 21, "metadata": {}, "outputs": [ { @@ -6846,7 +6858,7 @@ { "data": { "text/html": [ - "
Pipeline(steps=[('zerocount-1', ZeroCount()), ('zerocount-2', ZeroCount()),\n",
-       "                ('minmaxscaler', MinMaxScaler())])
" + "
Pipeline(steps=[('rbfsampler',\n",
+       "                 RBFSampler(gamma=0.1991726671256, n_components=7)),\n",
+       "                ('zerocount', ZeroCount()),\n",
+       "                ('binarizer', Binarizer(threshold=0.5354245073766))])
" ], "text/plain": [ - "Pipeline(steps=[('zerocount-1', ZeroCount()), ('zerocount-2', ZeroCount()),\n", - " ('minmaxscaler', MinMaxScaler())])" + "Pipeline(steps=[('rbfsampler',\n", + " RBFSampler(gamma=0.1991726671256, n_components=7)),\n", + " ('zerocount', ZeroCount()),\n", + " ('binarizer', Binarizer(threshold=0.5354245073766))])" ] }, - "execution_count": 40, + "execution_count": 21, "metadata": {}, "output_type": "execute_result" } @@ -7275,7 +7293,7 @@ }, { "cell_type": "code", - "execution_count": 41, + "execution_count": 22, "metadata": {}, "outputs": [ { @@ -7288,7 +7306,7 @@ { "data": { "text/html": [ - "
Pipeline(steps=[('selectpercentile',\n",
-       "                 SelectPercentile(percentile=61.5466222112372)),\n",
-       "                ('robustscaler',\n",
-       "                 RobustScaler(quantile_range=(0.0479806149183,\n",
-       "                                              0.9674592383627))),\n",
-       "                ('selectfrommodel',\n",
-       "                 SelectFromModel(estimator=ExtraTreesClassifier(bootstrap=True,\n",
-       "                                                                criterion='entropy',\n",
-       "                                                                max_features=0.3260066399479,\n",
-       "                                                                min_samples_leaf=6,\n",
-       "                                                                min_samples_split=8,\n",
-       "                                                                n_jobs=1),\n",
-       "                                 threshold=0.0001984121028))])
" + "
Pipeline(steps=[('selectfwe', SelectFwe(alpha=0.0014251225737)),\n",
+       "                ('powertransformer', PowerTransformer())])
" ], "text/plain": [ - "Pipeline(steps=[('selectpercentile',\n", - " SelectPercentile(percentile=61.5466222112372)),\n", - " ('robustscaler',\n", - " RobustScaler(quantile_range=(0.0479806149183,\n", - " 0.9674592383627))),\n", - " ('selectfrommodel',\n", - " SelectFromModel(estimator=ExtraTreesClassifier(bootstrap=True,\n", - " criterion='entropy',\n", - " max_features=0.3260066399479,\n", - " min_samples_leaf=6,\n", - " min_samples_split=8,\n", - " n_jobs=1),\n", - " threshold=0.0001984121028))])" + "Pipeline(steps=[('selectfwe', SelectFwe(alpha=0.0014251225737)),\n", + " ('powertransformer', PowerTransformer())])" ] }, - "execution_count": 41, + "execution_count": 22, "metadata": {}, "output_type": "execute_result" } @@ -7755,7 +7731,7 @@ }, { "cell_type": "code", - "execution_count": 42, + "execution_count": 23, "metadata": {}, "outputs": [ { @@ -7768,7 +7744,7 @@ { "data": { "text/html": [ - "
Pipeline(steps=[('pipeline',\n",
-       "                 Pipeline(steps=[('binarizer',\n",
-       "                                  Binarizer(threshold=0.4321512765788)),\n",
-       "                                 ('pca', PCA(n_components=0.6918117427918)),\n",
-       "                                 ('passkbinsdiscretizer',\n",
-       "                                  PassKBinsDiscretizer(n_bins=42))])),\n",
-       "                ('extratreesclassifier',\n",
-       "                 ExtraTreesClassifier(class_weight='balanced',\n",
-       "                                      criterion='entropy',\n",
-       "                                      max_features=0.169455524505,\n",
-       "                                      min_samples_leaf=8, min_samples_split=14,\n",
-       "                                      n_jobs=1))])
" + "
Pipeline(steps=[('pipeline',\n",
+       "                 Pipeline(steps=[('nystroem',\n",
+       "                                  Nystroem(gamma=0.3480554902065,\n",
+       "                                           kernel='sigmoid', n_components=20)),\n",
+       "                                 ('binarizer',\n",
+       "                                  Binarizer(threshold=0.6696149189758)),\n",
+       "                                 ('minmaxscaler', MinMaxScaler())])),\n",
+       "                ('multinomialnb', MultinomialNB(alpha=0.0016967794962))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ "Pipeline(steps=[('pipeline',\n", - " Pipeline(steps=[('binarizer',\n", - " Binarizer(threshold=0.4321512765788)),\n", - " ('pca', PCA(n_components=0.6918117427918)),\n", - " ('passkbinsdiscretizer',\n", - " PassKBinsDiscretizer(n_bins=42))])),\n", - " ('extratreesclassifier',\n", - " ExtraTreesClassifier(class_weight='balanced',\n", - " criterion='entropy',\n", - " max_features=0.169455524505,\n", - " min_samples_leaf=8, min_samples_split=14,\n", - " n_jobs=1))])" + " Pipeline(steps=[('nystroem',\n", + " Nystroem(gamma=0.3480554902065,\n", + " kernel='sigmoid', n_components=20)),\n", + " ('binarizer',\n", + " Binarizer(threshold=0.6696149189758)),\n", + " ('minmaxscaler', MinMaxScaler())])),\n", + " ('multinomialnb', MultinomialNB(alpha=0.0016967794962))])" ] }, - "execution_count": 42, + "execution_count": 23, "metadata": {}, "output_type": "execute_result" } @@ -8232,7 +8196,7 @@ }, { "cell_type": "code", - "execution_count": 43, + "execution_count": 24, "metadata": {}, "outputs": [ { @@ -8245,7 +8209,7 @@ { "data": { "text/html": [ - "
Pipeline(steps=[('pipeline',\n",
-       "                 Pipeline(steps=[('rfe',\n",
-       "                                  RFE(estimator=ExtraTreesClassifier(bootstrap=True,\n",
-       "                                                                     criterion='entropy',\n",
-       "                                                                     max_features=0.0135775754498,\n",
-       "                                                                     min_samples_leaf=8,\n",
-       "                                                                     min_samples_split=6,\n",
-       "                                                                     n_jobs=1),\n",
-       "                                      step=0.7236899597647)),\n",
-       "                                 ('columnonehotencoder', ColumnOneHotEncoder()),\n",
-       "                                 ('featureagglomeration',\n",
-       "                                  FeatureAgglomeration(linkage='average',\n",
-       "                                                       metric='l2',\n",
-       "                                                       n_clusters=150))])),\n",
-       "                ('sgdclassifier',\n",
-       "                 SGDClassifier(alpha=0.0009821180851, eta0=0.1666104101354,\n",
-       "                               l1_ratio=0.7504578619487, loss='modified_huber',\n",
-       "                               n_jobs=1, penalty='elasticnet'))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
Pipeline(steps=[('pipeline',\n",
+       "                 Pipeline(steps=[('zerocount', ZeroCount()),\n",
+       "                                 ('variancethreshold',\n",
+       "                                  VarianceThreshold(threshold=0.0020422211173)),\n",
+       "                                 ('binarizer',\n",
+       "                                  Binarizer(threshold=0.9681763702))])),\n",
+       "                ('bernoullinb',\n",
+       "                 BernoulliNB(alpha=0.0816524714629, fit_prior=False))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ "Pipeline(steps=[('pipeline',\n", - " Pipeline(steps=[('rfe',\n", - " RFE(estimator=ExtraTreesClassifier(bootstrap=True,\n", - " criterion='entropy',\n", - " max_features=0.0135775754498,\n", - " min_samples_leaf=8,\n", - " min_samples_split=6,\n", - " n_jobs=1),\n", - " step=0.7236899597647)),\n", - " ('columnonehotencoder', ColumnOneHotEncoder()),\n", - " ('featureagglomeration',\n", - " FeatureAgglomeration(linkage='average',\n", - " metric='l2',\n", - " n_clusters=150))])),\n", - " ('sgdclassifier',\n", - " SGDClassifier(alpha=0.0009821180851, eta0=0.1666104101354,\n", - " l1_ratio=0.7504578619487, loss='modified_huber',\n", - " n_jobs=1, penalty='elasticnet'))])" + " Pipeline(steps=[('zerocount', ZeroCount()),\n", + " ('variancethreshold',\n", + " VarianceThreshold(threshold=0.0020422211173)),\n", + " ('binarizer',\n", + " Binarizer(threshold=0.9681763702))])),\n", + " ('bernoullinb',\n", + " BernoulliNB(alpha=0.0816524714629, fit_prior=False))])" ] }, - "execution_count": 43, + "execution_count": 24, "metadata": {}, "output_type": "execute_result" } @@ -8748,13 +8664,13 @@ }, { "cell_type": "code", - "execution_count": 44, + "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
FeatureUnion(transformer_list=[('zerocount', ZeroCount()),\n",
-       "                               ('passthrough', Passthrough())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
FeatureUnion(transformer_list=[('fastica',\n",
+       "                                FastICA(algorithm='deflation',\n",
+       "                                        n_components=66)),\n",
+       "                               ('passthrough', Passthrough())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "FeatureUnion(transformer_list=[('zerocount', ZeroCount()),\n", + "FeatureUnion(transformer_list=[('fastica',\n", + " FastICA(algorithm='deflation',\n", + " n_components=66)),\n", " ('passthrough', Passthrough())])" ] }, - "execution_count": 44, + "execution_count": 25, "metadata": {}, "output_type": "execute_result" } @@ -9190,13 +9112,13 @@ }, { "cell_type": "code", - "execution_count": 45, + "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
Pipeline(steps=[('selectfwe', SelectFwe(alpha=0.0104695394381)),\n",
+       "
Pipeline(steps=[('variancethreshold',\n",
+       "                 VarianceThreshold(threshold=0.0009494718313)),\n",
        "                ('featureunion',\n",
-       "                 FeatureUnion(transformer_list=[('quantiletransformer',\n",
-       "                                                 QuantileTransformer(n_quantiles=93)),\n",
+       "                 FeatureUnion(transformer_list=[('binarizer',\n",
+       "                                                 Binarizer(threshold=0.8136655878085)),\n",
        "                                                ('passthrough',\n",
        "                                                 Passthrough())])),\n",
-       "                ('svc',\n",
-       "                 SVC(C=0.5015595860816, coef0=0.5773095995375, kernel='sigmoid',\n",
-       "                     max_iter=3000, probability=True))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
VarianceThreshold(threshold=0.0009494718313)
FeatureUnion(transformer_list=[('binarizer',\n",
+       "                                Binarizer(threshold=0.8136655878085)),\n",
+       "                               ('passthrough', Passthrough())])
Binarizer(threshold=0.8136655878085)
Passthrough()
AdaBoostClassifier(learning_rate=0.1727096029044, n_estimators=446)
" ], "text/plain": [ - "Pipeline(steps=[('selectfwe', SelectFwe(alpha=0.0104695394381)),\n", + "Pipeline(steps=[('variancethreshold',\n", + " VarianceThreshold(threshold=0.0009494718313)),\n", " ('featureunion',\n", - " FeatureUnion(transformer_list=[('quantiletransformer',\n", - " QuantileTransformer(n_quantiles=93)),\n", + " FeatureUnion(transformer_list=[('binarizer',\n", + " Binarizer(threshold=0.8136655878085)),\n", " ('passthrough',\n", " Passthrough())])),\n", - " ('svc',\n", - " SVC(C=0.5015595860816, coef0=0.5773095995375, kernel='sigmoid',\n", - " max_iter=3000, probability=True))])" + " ('adaboostclassifier',\n", + " AdaBoostClassifier(learning_rate=0.1727096029044,\n", + " n_estimators=446))])" ] }, - "execution_count": 45, + "execution_count": 26, "metadata": {}, "output_type": "execute_result" } @@ -9657,13 +9581,13 @@ }, { "cell_type": "code", - "execution_count": 46, + "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
Pipeline(steps=[('featureunion',\n",
+       "
Pipeline(steps=[('featureunion',\n",
        "                 FeatureUnion(transformer_list=[('pipeline-1',\n",
-       "                                                 Pipeline(steps=[('selectfwe',\n",
-       "                                                                  SelectFwe(alpha=0.0013091622594)),\n",
-       "                                                                 ('nystroem',\n",
-       "                                                                  Nystroem(gamma=0.2527764721894,\n",
-       "                                                                           kernel='additive_chi2',\n",
-       "                                                                           n_components=40))])),\n",
-       "                                                ('pipeline-2',\n",
        "                                                 Pipeline(steps=[('variancethreshold',\n",
-       "                                                                  VarianceThreshold(threshold=0.0130961185337)),\n",
-       "                                                                 ('featureagglomeration',\n",
-       "                                                                  Fe...ge='average',\n",
-       "                                                                                       metric='l2',\n",
-       "                                                                                       n_clusters=293,\n",
-       "                                                                                       pooling_func=<function median at 0x73f3c1bda370>))]))])),\n",
-       "                ('histgradientboostingclassifier',\n",
-       "                 HistGradientBoostingClassifier(early_stopping=False,\n",
-       "                                                l2_regularization=3.1669452e-06,\n",
-       "                                                learning_rate=0.1262523910078,\n",
-       "                                                max_features=0.8008565064114,\n",
-       "                                                max_leaf_nodes=1504,\n",
-       "                                                min_samples_leaf=32, tol=0.0001,\n",
-       "                                                validation_fraction=None))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
VarianceThreshold(threshold=0.1996640297479)
PowerTransformer()
SelectFwe(alpha=0.0045323854667)
FastICA(n_components=34)
QuadraticDiscriminantAnalysis(reg_param=0.8833282196313)
" + ], "text/plain": [ "Pipeline(steps=[('featureunion',\n", " FeatureUnion(transformer_list=[('pipeline-1',\n", - " Pipeline(steps=[('selectfwe',\n", - " SelectFwe(alpha=0.0013091622594)),\n", - " ('nystroem',\n", - " Nystroem(gamma=0.2527764721894,\n", - " kernel='additive_chi2',\n", - " n_components=40))])),\n", - " ('pipeline-2',\n", " Pipeline(steps=[('variancethreshold',\n", - " VarianceThreshold(threshold=0.0130961185337)),\n", - " ('featureagglomeration',\n", - " Fe...ge='average',\n", - " metric='l2',\n", - " n_clusters=293,\n", - " pooling_func=))]))])),\n", - " ('histgradientboostingclassifier',\n", - " HistGradientBoostingClassifier(early_stopping=False,\n", - " l2_regularization=3.1669452e-06,\n", - " learning_rate=0.1262523910078,\n", - " max_features=0.8008565064114,\n", - " max_leaf_nodes=1504,\n", - " min_samples_leaf=32, tol=0.0001,\n", - " validation_fraction=None))])" + " VarianceThreshold(threshold=0.1996640297479)),\n", + " ('powertransformer',\n", + " PowerTransformer())])),\n", + " ('pipeline-2',\n", + " Pipeline(steps=[('selectfwe',\n", + " SelectFwe(alpha=0.0045323854667)),\n", + " ('fastica',\n", + " FastICA(n_components=34))]))])),\n", + " ('quadraticdiscriminantanalysis',\n", + " QuadraticDiscriminantAnalysis(reg_param=0.8833282196313))])" ] }, - "execution_count": 46, + "execution_count": 27, "metadata": {}, "output_type": "execute_result" } @@ -10199,13 +10079,13 @@ }, { "cell_type": "code", - "execution_count": 47, + "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
FeatureUnion(transformer_list=[('quantiletransformer',\n",
-       "                                QuantileTransformer(n_quantiles=81)),\n",
-       "                               ('columnonehotencoder', ColumnOneHotEncoder())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
FeatureUnion(transformer_list=[('zerocount', ZeroCount()),\n",
+       "                               ('powertransformer', PowerTransformer())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "FeatureUnion(transformer_list=[('quantiletransformer',\n", - " QuantileTransformer(n_quantiles=81)),\n", - " ('columnonehotencoder', ColumnOneHotEncoder())])" + "FeatureUnion(transformer_list=[('zerocount', ZeroCount()),\n", + " ('powertransformer', PowerTransformer())])" ] }, - "execution_count": 47, + "execution_count": 28, "metadata": {}, "output_type": "execute_result" } @@ -10640,13 +10517,13 @@ }, { "cell_type": "code", - "execution_count": 48, + "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
FeatureUnion(transformer_list=[('featureunion',\n",
-       "                                FeatureUnion(transformer_list=[('rbfsampler-1',\n",
-       "                                                                RBFSampler(gamma=0.6219125014396,\n",
-       "                                                                           n_components=64)),\n",
-       "                                                               ('powertransformer',\n",
-       "                                                                PowerTransformer()),\n",
-       "                                                               ('rbfsampler-2',\n",
-       "                                                                RBFSampler(gamma=0.3345729157827,\n",
-       "                                                                           n_components=23))])),\n",
-       "                               ('passthrough', Passthrough())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                FeatureUnion(transformer_list=[('powertransformer',\n",
+       "                                                                PowerTransformer())])),\n",
+       "                               ('passthrough', Passthrough())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ "FeatureUnion(transformer_list=[('featureunion',\n", - " FeatureUnion(transformer_list=[('rbfsampler-1',\n", - " RBFSampler(gamma=0.6219125014396,\n", - " n_components=64)),\n", - " ('powertransformer',\n", - " PowerTransformer()),\n", - " ('rbfsampler-2',\n", - " RBFSampler(gamma=0.3345729157827,\n", - " n_components=23))])),\n", + " FeatureUnion(transformer_list=[('powertransformer',\n", + " PowerTransformer())])),\n", " ('passthrough', Passthrough())])" ] }, - "execution_count": 48, + "execution_count": 29, "metadata": {}, "output_type": "execute_result" } @@ -11099,13 +10958,13 @@ }, { "cell_type": "code", - "execution_count": 49, + "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
Pipeline(steps=[('variancethreshold',\n",
-       "                 VarianceThreshold(threshold=0.1286612361721)),\n",
+       "
Pipeline(steps=[('selectpercentile',\n",
+       "                 SelectPercentile(percentile=3.5688237635159)),\n",
        "                ('featureunion',\n",
        "                 FeatureUnion(transformer_list=[('featureunion',\n",
-       "                                                 FeatureUnion(transformer_list=[('binarizer',\n",
-       "                                                                                 Binarizer(threshold=0.736067585858)),\n",
-       "                                                                                ('rbfsampler',\n",
-       "                                                                                 RBFSampler(gamma=0.1436440722816,\n",
-       "                                                                                            n_components=44)),\n",
-       "                                                                                ('quantiletransformer',\n",
-       "                                                                                 QuantileTransformer(n_quantiles=100))])),\n",
+       "                                                 FeatureUnion(transformer_list=[('featureagglomeration',\n",
+       "                                                                                 FeatureAgglomeration(n_clusters=28,\n",
+       "                                                                                                      pooling_func=<function max at 0x78ec455b4e30>))])),\n",
        "                                                ('passthrough',\n",
        "                                                 Passthrough())])),\n",
-       "                ('histgradientboostingclassifier',\n",
-       "                 HistGradientBoostingClassifier(early_stopping=True,\n",
-       "                                                l2_regularization=2.047622e-07,\n",
-       "                                                learning_rate=0.0164428425279,\n",
-       "                                                max_features=0.3325348714186,\n",
-       "                                                max_leaf_nodes=1940,\n",
-       "                                                min_samples_leaf=78,\n",
-       "                                                n_iter_no_change=12, tol=0.0001,\n",
-       "                                                validation_fraction=None))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
SelectPercentile(percentile=3.5688237635159)
FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                FeatureUnion(transformer_list=[('featureagglomeration',\n",
+       "                                                                FeatureAgglomeration(n_clusters=28,\n",
+       "                                                                                     pooling_func=<function max at 0x78ec455b4e30>))])),\n",
+       "                               ('passthrough', Passthrough())])
FeatureAgglomeration(n_clusters=28,\n",
+       "                     pooling_func=<function max at 0x78ec455b4e30>)
Passthrough()
LogisticRegression(C=9762.07332929782, max_iter=1000, n_jobs=1, solver='saga')
" ], "text/plain": [ - "Pipeline(steps=[('variancethreshold',\n", - " VarianceThreshold(threshold=0.1286612361721)),\n", + "Pipeline(steps=[('selectpercentile',\n", + " SelectPercentile(percentile=3.5688237635159)),\n", " ('featureunion',\n", " FeatureUnion(transformer_list=[('featureunion',\n", - " FeatureUnion(transformer_list=[('binarizer',\n", - " Binarizer(threshold=0.736067585858)),\n", - " ('rbfsampler',\n", - " RBFSampler(gamma=0.1436440722816,\n", - " n_components=44)),\n", - " ('quantiletransformer',\n", - " QuantileTransformer(n_quantiles=100))])),\n", + " FeatureUnion(transformer_list=[('featureagglomeration',\n", + " FeatureAgglomeration(n_clusters=28,\n", + " pooling_func=))])),\n", " ('passthrough',\n", " Passthrough())])),\n", - " ('histgradientboostingclassifier',\n", - " HistGradientBoostingClassifier(early_stopping=True,\n", - " l2_regularization=2.047622e-07,\n", - " learning_rate=0.0164428425279,\n", - " max_features=0.3325348714186,\n", - " max_leaf_nodes=1940,\n", - " min_samples_leaf=78,\n", - " n_iter_no_change=12, tol=0.0001,\n", - " validation_fraction=None))])" + " ('logisticregression',\n", + " LogisticRegression(C=9762.07332929782, max_iter=1000, n_jobs=1,\n", + " solver='saga'))])" ] }, - "execution_count": 49, + "execution_count": 30, "metadata": {}, "output_type": "execute_result" } @@ -11620,13 +11440,13 @@ }, { "cell_type": "code", - "execution_count": 50, + "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
ExtraTreesClassifier(max_features=0.2286391649712, min_samples_leaf=13,\n",
-       "                     min_samples_split=8, n_jobs=1)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
ExtraTreesClassifier(class_weight='balanced', max_features=0.6642237575313,\n",
+       "                     min_samples_leaf=17, min_samples_split=3, n_jobs=1)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "ExtraTreesClassifier(max_features=0.2286391649712, min_samples_leaf=13,\n", - " min_samples_split=8, n_jobs=1)" + "ExtraTreesClassifier(class_weight='balanced', max_features=0.6642237575313,\n", + " min_samples_leaf=17, min_samples_split=3, n_jobs=1)" ] }, - "execution_count": 50, + "execution_count": 31, "metadata": {}, "output_type": "execute_result" } @@ -12057,13 +11877,13 @@ }, { "cell_type": "code", - "execution_count": 51, + "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
SelectFromModel(estimator=ExtraTreesClassifier(bootstrap=True,\n",
-       "                                               criterion='entropy',\n",
-       "                                               max_features=0.0311518006465,\n",
-       "                                               min_samples_split=19, n_jobs=1),\n",
-       "                threshold=0.0012368197842)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
SelectFromModel(estimator=ExtraTreesClassifier(bootstrap=True,\n",
+       "                                               class_weight='balanced',\n",
+       "                                               max_features=0.3007313724684,\n",
+       "                                               min_samples_leaf=12,\n",
+       "                                               min_samples_split=17, n_jobs=1),\n",
+       "                threshold=0.0048046738992)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ "SelectFromModel(estimator=ExtraTreesClassifier(bootstrap=True,\n", - " criterion='entropy',\n", - " max_features=0.0311518006465,\n", - " min_samples_split=19, n_jobs=1),\n", - " threshold=0.0012368197842)" + " class_weight='balanced',\n", + " max_features=0.3007313724684,\n", + " min_samples_leaf=12,\n", + " min_samples_split=17, n_jobs=1),\n", + " threshold=0.0048046738992)" ] }, - "execution_count": 51, + "execution_count": 32, "metadata": {}, "output_type": "execute_result" } @@ -12528,13 +12351,13 @@ }, { "cell_type": "code", - "execution_count": 52, + "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
EstimatorTransformer(estimator=HistGradientBoostingClassifier(early_stopping=True,\n",
-       "                                                              l2_regularization=0.000117454825,\n",
-       "                                                              learning_rate=0.122899142038,\n",
-       "                                                              max_features=0.5654219816525,\n",
-       "                                                              max_leaf_nodes=1048,\n",
-       "                                                              min_samples_leaf=1,\n",
-       "                                                              n_iter_no_change=17,\n",
-       "                                                              tol=0.0001,\n",
-       "                                                              validation_fraction=0.3473838441178))
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
EstimatorTransformer(estimator=SVC(C=140.9223338924506, gamma=0.0007253447995,\n",
+       "                                   max_iter=3000, probability=True,\n",
+       "                                   shrinking=False))
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ - "EstimatorTransformer(estimator=HistGradientBoostingClassifier(early_stopping=True,\n", - " l2_regularization=0.000117454825,\n", - " learning_rate=0.122899142038,\n", - " max_features=0.5654219816525,\n", - " max_leaf_nodes=1048,\n", - " min_samples_leaf=1,\n", - " n_iter_no_change=17,\n", - " tol=0.0001,\n", - " validation_fraction=0.3473838441178))" + "EstimatorTransformer(estimator=SVC(C=140.9223338924506, gamma=0.0007253447995,\n", + " max_iter=3000, probability=True,\n", + " shrinking=False))" ] }, - "execution_count": 52, + "execution_count": 33, "metadata": {}, "output_type": "execute_result" } @@ -12995,20 +12790,20 @@ }, { "cell_type": "code", - "execution_count": 53, + "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "array([[0.94963523, 0.05036477],\n", - " [0.91791762, 0.08208238],\n", - " [0.16108516, 0.83891484],\n", - " [0.05483536, 0.94516464],\n", - " [0.05482495, 0.94517505]])" + "array([[0.5 , 0.5 ],\n", + " [0.50964815, 0.49035185],\n", + " [0.50681558, 0.49318442],\n", + " [0.51565809, 0.48434191],\n", + " [0.52006004, 0.47993996]])" ] }, - "execution_count": 53, + "execution_count": 34, "metadata": {}, "output_type": "execute_result" } @@ -13029,48 +12824,20 @@ }, { "cell_type": "code", - "execution_count": 54, + "execution_count": 35, "metadata": {}, "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", - " warnings.warn(\n", - "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). 
Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", - " warnings.warn(\n", - "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", - " warnings.warn(\n", - "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", - " warnings.warn(\n", - "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", - " warnings.warn(\n", - "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", - " warnings.warn(\n", - "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", - " warnings.warn(\n", - "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", - " warnings.warn(\n", - "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). 
Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", - " warnings.warn(\n", - "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", - " warnings.warn(\n", - "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/svm/_base.py:297: ConvergenceWarning: Solver terminated early (max_iter=3000). Consider pre-processing your data with StandardScaler or MinMaxScaler.\n", - " warnings.warn(\n" - ] - }, { "data": { "text/plain": [ "array([[0],\n", " [0],\n", - " [0],\n", + " [1],\n", " [1],\n", " [1]])" ] }, - "execution_count": 54, + "execution_count": 35, "metadata": {}, "output_type": "execute_result" } @@ -13091,13 +12858,13 @@ }, { "cell_type": "code", - "execution_count": 55, + "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
Pipeline(steps=[('standardscaler', StandardScaler()),\n",
+       "
Pipeline(steps=[('normalizer', Normalizer(norm='max')),\n",
        "                ('featureunion-1',\n",
        "                 FeatureUnion(transformer_list=[('featureunion',\n",
-       "                                                 FeatureUnion(transformer_list=[('powertransformer',\n",
-       "                                                                                 PowerTransformer()),\n",
-       "                                                                                ('nystroem-1',\n",
-       "                                                                                 Nystroem(gamma=0.4484025909592,\n",
-       "                                                                                          kernel='polynomial',\n",
-       "                                                                                          n_components=13)),\n",
-       "                                                                                ('nystroem-2',\n",
-       "                                                                                 Nystroem(gamma=0.9023618026452,\n",
+       "                                                 FeatureUnion(transformer_list=[('rbfsampler',\n",
+       "                                                                                 RBFSampler(gamma=0.7809991844556,\n",
+       "                                                                                            n_components=50)),\n",
+       "                                                                                ('columnonehotencoder',\n",
+       "                                                                                 ColumnOneHotEncoder()),\n",
+       "                                                                                ('nystroem',\n",
+       "                                                                                 Nystroem(gamma=0.3179172515929,\n",
        "                                                                                          kernel='additive_chi2',\n",
-       "                                                                                          n_component...\n",
-       "                                                                                 EstimatorTransformer(cross_val_predict_cv=10,\n",
-       "                                                                                                      estimator=GaussianNB(),\n",
-       "                                                                                                      method='predict')),\n",
-       "                                                                                ('estimatortransformer-3',\n",
-       "                                                                                 EstimatorTransformer(cross_val_predict_cv=10,\n",
-       "                                                                                                      estimator=BaggingClassifier(bootstrap=False,\n",
-       "                                                                                                                                  bootstrap_features=True,\n",
-       "                                                                                                                                  max_features=0.248985416426,\n",
-       "                                                                                                                                  max_samples=0.8328766080285,\n",
-       "                                                                                                                                  n_estimators=42,\n",
-       "                                                                                                                                  n_jobs=1),\n",
+       "                                                                                          n_components=80))])),\n",
+       "                                                ('...\n",
+       "                                                                                                                              class_weight='balanced',\n",
+       "                                                                                                                              eta0=0.4039854095517,\n",
+       "                                                                                                                              l1_ratio=0.0336982783886,\n",
+       "                                                                                                                              learning_rate='constant',\n",
+       "                                                                                                                              loss='modified_huber',\n",
+       "                                                                                                                              n_jobs=1,\n",
+       "                                                                                                                              penalty='elasticnet'),\n",
        "                                                                                                      method='predict'))])),\n",
        "                                                ('passthrough',\n",
        "                                                 Passthrough())])),\n",
-       "                ('gaussiannb', GaussianNB())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
BaggingClassifier(bootstrap=False, bootstrap_features=True,\n",
+       "                  max_features=0.3230075838011, max_samples=0.5802749777364,\n",
+       "                  n_estimators=8, n_jobs=1)
BaggingClassifier(bootstrap=False, bootstrap_features=True,\n",
+       "                  max_features=0.3230075838011, max_samples=0.5802749777364,\n",
+       "                  n_estimators=8, n_jobs=1)
ExtraTreesClassifier(bootstrap=True, criterion='entropy',\n",
+       "                     max_features=0.372253059993, min_samples_leaf=2,\n",
+       "                     min_samples_split=10, n_jobs=1)
ExtraTreesClassifier(bootstrap=True, criterion='entropy',\n",
+       "                     max_features=0.372253059993, min_samples_leaf=2,\n",
+       "                     min_samples_split=10, n_jobs=1)
SGDClassifier(alpha=0.0009170388361, class_weight='balanced',\n",
+       "              eta0=0.4039854095517, l1_ratio=0.0336982783886,\n",
+       "              learning_rate='constant', loss='modified_huber', n_jobs=1,\n",
+       "              penalty='elasticnet')
SGDClassifier(alpha=0.0009170388361, class_weight='balanced',\n",
+       "              eta0=0.4039854095517, l1_ratio=0.0336982783886,\n",
+       "              learning_rate='constant', loss='modified_huber', n_jobs=1,\n",
+       "              penalty='elasticnet')
Passthrough()
MLPClassifier(alpha=0.0867902302825, hidden_layer_sizes=[35],\n",
+       "              learning_rate='invscaling', learning_rate_init=0.0152961651727,\n",
+       "              n_iter_no_change=32)
" ], "text/plain": [ - "Pipeline(steps=[('standardscaler', StandardScaler()),\n", + "Pipeline(steps=[('normalizer', Normalizer(norm='max')),\n", " ('featureunion-1',\n", " FeatureUnion(transformer_list=[('featureunion',\n", - " FeatureUnion(transformer_list=[('powertransformer',\n", - " PowerTransformer()),\n", - " ('nystroem-1',\n", - " Nystroem(gamma=0.4484025909592,\n", - " kernel='polynomial',\n", - " n_components=13)),\n", - " ('nystroem-2',\n", - " Nystroem(gamma=0.9023618026452,\n", + " FeatureUnion(transformer_list=[('rbfsampler',\n", + " RBFSampler(gamma=0.7809991844556,\n", + " n_components=50)),\n", + " ('columnonehotencoder',\n", + " ColumnOneHotEncoder()),\n", + " ('nystroem',\n", + " Nystroem(gamma=0.3179172515929,\n", " kernel='additive_chi2',\n", - " n_component...\n", - " EstimatorTransformer(cross_val_predict_cv=10,\n", - " estimator=GaussianNB(),\n", - " method='predict')),\n", - " ('estimatortransformer-3',\n", - " EstimatorTransformer(cross_val_predict_cv=10,\n", - " estimator=BaggingClassifier(bootstrap=False,\n", - " bootstrap_features=True,\n", - " max_features=0.248985416426,\n", - " max_samples=0.8328766080285,\n", - " n_estimators=42,\n", - " n_jobs=1),\n", + " n_components=80))])),\n", + " ('...\n", + " class_weight='balanced',\n", + " eta0=0.4039854095517,\n", + " l1_ratio=0.0336982783886,\n", + " learning_rate='constant',\n", + " loss='modified_huber',\n", + " n_jobs=1,\n", + " penalty='elasticnet'),\n", " method='predict'))])),\n", " ('passthrough',\n", " Passthrough())])),\n", - " ('gaussiannb', GaussianNB())])" + " ('mlpclassifier',\n", + " MLPClassifier(alpha=0.0867902302825, hidden_layer_sizes=[35],\n", + " learning_rate='invscaling',\n", + " learning_rate_init=0.0152961651727,\n", + " n_iter_no_change=32))])" ] }, - "execution_count": 55, + "execution_count": 36, "metadata": {}, "output_type": "execute_result" } @@ -13680,7 +13452,7 @@ }, { "cell_type": "code", - "execution_count": 56, + "execution_count": 37, "metadata": {}, 
"outputs": [], "source": [ @@ -13696,12 +13468,12 @@ }, { "cell_type": "code", - "execution_count": 57, + "execution_count": 38, "metadata": {}, "outputs": [ { "data": { - "image/png": "", + "image/png": "", "text/plain": [ "
" ] @@ -13724,12 +13496,12 @@ }, { "cell_type": "code", - "execution_count": 58, + "execution_count": 39, "metadata": {}, "outputs": [ { "data": { - "image/png": "iVBORw0KGgoAAAANSUhEUgAAAnYAAAHWCAYAAAD6oMSKAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8fJSN1AAAACXBIWXMAAA9hAAAPYQGoP6dpAABUfklEQVR4nO3deVyVdf7//+dZQFYRRBZRzAU3cN8msbJf1tSUfvRj32xVW2xs1TFLU9AZwC2zzfYps7plWVaTn7SxGp0coUkHDZRccAkUAUVQZBE4y++Piukkp1zgnMPhcb/d5nbTN69znafYcD29rnNdl8Fut9sFAACAZs/o7gAAAABoHBQ7AAAAL0GxAwAA8BIUOwAAAC9BsQMAAPASFDsAAAAvQbEDAADwEhQ7AAAAL0GxAwAA8BIUOwAAAC9BsQMAAPASFDsAAAAvQbEDAADwEhQ7AAAAL0GxAwAA8BIUOwAAAC9BsQMAAPASFDsAAAAvQbEDAADwEhQ7AAAAL0GxAwAA8BIUOwAAAC9BsQMAAPASFDsAAAAvQbEDAADwEhQ7AAAAL0GxAwAA8BIUOwAAAC9hdncAAGhMVqtVpaWlKi4uVnFxsY4XFammulo2q1VGk0mt/P3VLipKkZGRioyMVFhYmEwmk7tjA0CjMNjtdru7QwDAxSorK1NWVpZ2bt+uM5WVslssCqquVkhpqXwsFhntdtkMBtWZzToVFqYKf38ZzGb5BQaqz8CB6tevn0JDQ939xwCAi0KxA9CsHT16VBlbtuhQbq58qqoUm39Y0aWlCqmslI/V6vR1dSaTTgUGqjAsTPmxHVUXEKDOcXFKvOwyRUdHu/BPAACNh2IHoFmyWCxKT0/XtvR0BZWUqFtevjqUlMhks533tqxGo46Eh2t/p1hVhIdrSGKiEhMTZTbzaRUAzQvFDkCzU1RUpHVr16rsSIF65uYqrqBAxkb4UWYzGJQbE6M9cXEK6xCjP4wZo6ioqEZIDACuQbED0Kzk5eXp49WrFXC0UIN271brqqpGf4/ygABl9uqlqvbtNW7CTerUqVOjvwcANAWKHYBmIy8vTx+++67a5uVr6HffyXwBp13PlcVo1DfxvVUaG6vxt9xCuQPQLHAfOwDNQlFRkT5evVphefn6XU5Ok5Y6STLbbLp0V47C8vP18er3VVRU1KTvBwCNgWIHwONZLBatW7tWAUcLNey77xrl83Tnwmi3a1jOd/IvPKr1a9fKYrG45H0B4EJR7AB4vPT0dJUdKdCg3bub/EjdL5ltNg36brdKCwqUkZHh0vcGgPNFsQPg0Y4ePapt6enqmZvbJBdKnIuQqir12JerrVu2qLCw0C0ZAOBcUOwAeLSMLVsUVFKiuIICt+boXlCgoJISpW/Z4tYcAPBrKHYAPFZZWZkO5eaqW16+yz5X54zRblfXvHwd2rdPZWVlbs0CAM5Q7AB4rKysLPlUValDSYm7o0iSOpaUyFxVpezsbHdHAYAGUewAeCSr1aqd27crNv/wBT0mrCmYbDZ1OnxY2ZmZsv7Kc2gBwF0odgA8Umlpqc5UViq6tNTdURxEn/ghV6mH5QIAiWIHwA1SUlKUkJCgPn36aPDgwTp06NBZM8XFxbJbLPrD5xsu6D3+euSww+97bfmXxuzYXv+/2gs8ChhSWSm7xaLi4mKH9U8//VQJCQkyGo3atWvXBW0bAC6W2d0BALQsGRkZ+uc//6lvv/1WZrNZR44cUWBg4FlzxcXFCqquvuD3+euRI5rSoWP974PNZq0dMPCCt/cTH6tVQdXVKi4uVkJCQv16jx49tGbNGk2dOvWi3wMALhTFDoBLFRUVKTQ0V
GbzDz9+OnToIElav369UlJSdObMGQ0dOlQjL7tMIb843fnS4Xx9ceKE6mw23RrdXrdER0uSns/P02clJTLKoP8XFamS2jqdtlg0Zsd2DWzdWn/u2q3BLH/YnqkP+/WXJA3699d6p09fDWjdWmN2bNc7ffrKaDDoz/v360D1D/fPm9uliwa1DlHr0jId/8UjxuLi4hrtewQAF4piB8Clrr76as2fP1+9evXSNddco9tvv12dO3fWU089pX/+85/y8/PTgw8+qH/961+68meP8NpcVqoTtXX6qP8A1dpsuiU7S1eGhWl3ZYW2njqlj/sPkK/RqJN1dWrj46P3igodjtD9VPQkqX9wsFK6xalvULCyTp+WXVKPgEBllperW0CApB+O8C39/pCubttWS8N7qKimRlNycvR/AwfK12LRmTNnXPp9A4BzQbED4FLBwcHasWOHNm3apC+//FJXX3213nzzTWVnZ+t3v/udJKm6ulp9eveWsXXr+tell53UxtJSbS0/JUmqsFiUf6ZaX588pfGRUfI1/vCR4TY+Pg2/bwOnYge2bq3M8nLZZdc9HTpo3fHj6hYQoAHBwZKkjLKT2lxaqucP50uSTlrqVGuzyWi3ycpzYwF4IIodAJczm826+uqrdfXVVys8PFx/+tOfdMMNN2jFihX1M2++9ppsP3vKg13SQ7GxGhcZ6bCtL09c+NWpA1u31qKDB2UwSJPbx+jdwkJllpdrUOuQH9/Trld6x6u9n5/D62wGo0xmfnwC8DxcFQvApfbu3asDBw5Ikux2u3JycvTHP/5RmzZt0uHDP1zJeuLECVVUVanuZ+VpeJs2WlNcpDM/3j/uYFWVamw2DW/TRh8WF9Vf5Xqyrk6SZDIYZP2Np1V09ffX92eqVWOzKchsVucAf31yrFgDfzxSOLxNqN752bNhd1dUSJJqzWb5/qLsAYAnoNgBcKmKigrdfvvtio+PV0JCgmw2mx5++GG99NJLGjt2rPr27atrrrlGJl9fnQoLq3/dyLAwjQwL041Z3+r67Zn684H9strtGhkWpmEhbTT22x0as2O7/u/4cUnSuIhI3fDjnDMGg0FxAQHqHvDDVbkDg1vLJqnDj6XtgdhYnair0w3bM3Vd5n/0QfEPF0yUh4WqXVSUw7Y2bNigDh066Ouvv9aoUaN0yy23NOa3DQDOicFud/MDGAGgAbt27dL6Dz7QDV9tlo8HPeWhzmTSp1dcrj/8v//ncLsTAPAEHLED4JEiIyNlMJt1qoF73LnTqcBAGcxmRf7is34A4AkodgA8UlhYmPwCA1X4s9OxnuCdslI9+/LLuvrqq9W/f3/1799fixcvdncsAJDEVbEAPJTJZFKfgQP17YkT6p2fL9MFPgKsMVmNRnW9/Eq9c801uuKKK9wdBwDOwhE7AB6rX79+qgsI0JHw8CbZfvnpch0tLNSx48dUdw73pTscHi5LQID69u3bJHkA4GJR7AB4rNDQUHWOi9P+TrGyGQyNuu06i0UVFRWS7LJYLCotLZXtV64lsxkMOtApVp27d1doaGijZgGAxkKxA+DREi+7TBXh4cqNiWnS97FaLSovL3f69X0xMaoID1fiiBFNmgMALgbFDoBHi46O1pDERO2Ji1P5j89xbQw+ZrN8fVs5rFVVVepMTc1Zs6cCArS3e5yGjhih6OjoRssAAI2NYgfA4yUmJiq0Q4wye/WSxdh4P7batGkjg8FxeydPnnQ4JWsxGpXZu5fCYmI0fPjwRntvAGgKFDsAHs9sNuv6MWNU1b69vonv3WiftzObTGr94+PDfmKzWXXq1Kkffm0w6Jv43qqObq8/jBkjM8+HBeDhKHYAmoWoqCiNm3CTSmNj9XVCfKMduQsMCFCrVo7Pfa2urlJlba2+TohXaWysxk24SVG/eIQYAHgiHikGoFnJy8vTx6vfV8DRoxq0e7daV1Vd9DatVquOHT8uu/2He+VVtm6tvYOHyN6ls8bfcos6dep00e8BA
K5AsQPQ7BQVFWnd2rUqO1Kgnrm5iisokPEif5RVVVer9NRJHe3eXbk9e6qgtFTVdXV6++23ZWjkW60AQFOh2AFoliwWi9LT07UtPV1BJSXqmpevjiUlF/SECqvRqMPh4cqJjFCxv7/St21TRkaGrFar3n33Xd18881N8CcAgMZHsQPQrB09elQZ6ek6tG+fzFVV6nT4sKJPlCqkslI+VqvT19WZTDoVGKjCtmHK69hRloAARXfsqNQFC7Rv3776udDQUOXk5HCbEwDNAsUOgFcoKytTdna2sjMzdaayUnaLRUHV1WpdWiZfi0VGu002g1G1ZrPKw0JV4e8vg9ksv8BA9R00SH379lVoaKjef/99TZgwwWHbN9xwg9auXcspWQAej2IHwKtYrVaVlpaquLhYxcXFOl5UpNozZ2S1WGQym+Xr56d2UVGKjIxUZGSkwsLCZDKZHLYxYcIEvf/++w5rK1as0J133unKPwoAnDeKHQD8QklJiRISElRcXFy/1rp1a+3cuVOxsbFuTAYAv4772AHAL4SHh+vVV191WCsvL9fdd98t/i0MwJNR7ACgAWPGjNGkSZMc1r788ku9/PLLbkoEAL+NU7EA4MTJkyeVkJCggoKC+rWAgABlZ2era9eubkwGAA3jiB0AONGmTRutWLHCYa2qqkp33nmnrL9yKxUAcBeKHQD8imuuuUZ//OMfHdb+9a9/6dlnn3VTIgBwjlOxAPAbTp8+rX79+unQoUP1a61atdKOHTvUq1cvNyYDAEccsQOA3xAcHKw33njDYa2mpkaTJk2SxWJxUyoAOBvFDgDOwRVXXKHp06c7rG3btk1PPPGEewIBQAM4FQsA56i6ulr9+/d3eJasj4+Ptm3bpn79+rkxGQD8gCN2AHCO/P399eabb8po/O+Pzrq6Ok2aNEm1tbVuTAYAP6DYAcB5+N3vfqfHHnvMYS0rK0upqaluSgQA/8WpWAA4TzU1NRo8eLB27dpVv2YymfT1119ryJAhbkwGoKWj2AHABdixY4eGDh3qcFVsr169tH37dvn5+bkxGYCWjFOxAHABBgwYoOTkZIe13bt3n7UGAK7EETsAuEB1dXW69NJLlZmZWb9mMBi0efNmjRgxwo3JALRUFDsAuAg5OTkaOHCgw1WxXbt2VVZWlgIDA92YDEBLxKlYALgI8fHxSktLc1g7cOCAZs2a5aZEAFoyjtgBwEWyWq26/PLLlZGR4bD+5Zdf6qqrrnJTKgAtEcUOABpBbm6u+vXrp+rq6vq12NhYZWdnKyQkxI3JALQknIoFgEYQFxenJUuWOKzl5+drxowZbkoEoCXiiB0ANBKbzaZRo0Zp06ZNDuuffvqprr/+ejelAtCSUOwAoBF9//336tOnjyoqKurXoqKilJOTo7CwMDcmA9AScCoWABrRJZdcoqefftphraioSA899JCbEgFoSThiBwCNzG636/rrr9dnn33msL5mzRqNHz/eTakAtAQUOwBoAgUFBUpISNDJkyfr18LDw5WTk6OIiAj3BQPg1TgVCwBNICYmRsuXL3dYKykp0dSpU8W/pwE0FYodADSR2267TePGjXNY+/jjj7Vq1So3JQLg7TgVCwBN6NixY4qPj1dJSUn9Wps2bbRr1y7FxMS4MRkAb8QROwBoQhEREXrppZcc1k6ePKkpU6ZwShZAo6PYAUATu/HGG3XLLbc4rH322Wd6/fXX3ZQIgLfiVCwAuEBpaani4+NVVFRUvxYUFKSdO3fqkksucV8wAF6FI3YA4AJhYWF67bXXHNYqKip01113yWazuSkVAG9DsQMAF7n++ut11113Oaxt2rRJL774opsSAfA2nIoFABc6deqU+vTpo8OHD9ev+fv7KysrS3FxcW5MBsAbcMQOAFwoJCREK1ascFirrq7W5MmTZbVa3ZQKgLeg2AGAi40aNUr333+/w1pGRoaeeuopNyUC4C04FQsAblBRUaH+/fvrwIED9Wu+vr7avn274
uPj3ZgMQHPGETsAcIOgoCCtXLlSBoOhfq22tlaTJk1SXV2dG5MBaM4odgDgJiNGjNCMGTMc1jIzM7V48WI3JQLQ3HEqFgDcqLq6WgMHDtSePXvq18xms7Zu3aoBAwa4MRmA5ogjdgDgRv7+/nrzzTdlMpnq1ywWiyZNmqSamho3JgPQHFHsAMDNhg4dqtmzZzus7dy5U3/5y1/clAhAc8WpWADwALW1tRoyZIiys7Pr14xGozIyMjRs2DA3JgPQnFDsAMBDZGVlaciQIQ5XxXbv3l07duxQQECAG5MBaC44FQsAHqJfv36aP3++w9q+ffs0d+5cNyUC0NxwxA4APIjFYtHw4cO1bdu2+jWDwaBNmzbpiiuucGMyAM0BxQ4APMzu3bs1YMAAh6tiO3furOzsbAUFBbkxGQBPx6lYAPAwvXr10sKFCx3WDh06pEcffdRNiQA0FxyxAwAPZLVaNXLkSG3ZssVhfcOGDbrmmmvclAqAp6PYAYCHOnDggPr27auqqqr6tQ4dOmjnzp1q06aN+4IB8FicigUAD9W1a1ctXbrUYe3IkSOaPn26ewIB8HgcsQMAD2az2fT73/9eX375pcP6J598ojFjxrgpFQBPRbEDAA+Xn5+vPn36qLy8vH4tMjJSOTk5atu2rRuTAfA0nIoFAA8XGxurZ555xmGtuLhYDzzwgHsCAfBYHLEDgGbAbrdrzJgx+vTTTx3WV69erZtuuslNqQB4GoodADQThYWFio+PV1lZWf1a27ZttWvXLkVFRbkxGQBPwalYAGgmoqOj9cILLzisnThxQn/84x/Fv9EBSBQ7AGhWbr75Zt14440Oa2vXrtXbb7/tpkQAPAmnYgGgmTl+/Lji4+N1/Pjx+rWQkBDt2rVLHTp0cGMyAO7GETsAaGbatWunV1991WHt1KlTuvvuuzklC7RwFDsAaIbGjh2r22+/3WHt888/P6vwAWhZOBULAM1UWVmZEhISdPTo0fq1wMBAZWdnq0uXLm5MBsBdOGIHAM1UaGioXn/9dYe1yspK3XXXXbLZbG5KBcCdKHYA0Ixde+21mjJlisPaV199peXLl7spEQB34lQsADRzp0+fVp8+fZSXl1e/5ufnp2+//VY9evRwYzIArsYROwBo5oKDg/XGG284rJ05c0aTJ0+WxWJxUyoA7kCxAwAvcOWVV+qhhx5yWPv3v/+tJ5980k2JALgDp2IBwEtUVVWpf//+ys3NrV/z9fXVf/7zH/Xp08eNyQC4CkfsAMBLBAQEaOXKlTIa//ujvba2VpMmTVJtba0bkwFwFYodAHiR4cOHa+bMmQ5rO3bs0MKFC92UCIArcSoWALzMmTNnNHjwYOXk5NSvmUwmffPNNxo0aJAbkwFoahyxAwAv4+fnpzfffFMmk6l+zWq1atKkSTpz5owbkwFoahQ7APBCgwYNUlJSksNaTk6O5s+f76ZEAFyBU7EA4KXq6uo0bNgw7dixo37NYDBoy5YtGj58uBuTAWgqFDsA8GI7d+7U4MGDHa6K7datm7799lsFBga6MRmApsCpWADwYn369NFf/vIXh7X9+/fr8ccfd1MiAE2JI3YA4OUsFosuu+wy/fvf/3ZY37hxo6688ko3pQLQFCh2ANAC7N27V/3793e4KrZTp07Kzs5W69at3ZgMQGPiVCwAtAA9evTQ4sWLHdby8vLOupkxgOaNI3YA0ELYbDb9f//f/6evvvrKYX39+vW67rrr3JQKQGOi2AFAC3Lo0CH16dNHlZWV9Wvt27fXrl27FBoa6sZkABoDp2IBoAXp3Lmzli1b5rB29OhRPfzww25KBKAxccQOAFoYu92ua6+9Vp9//rnD+kcffaRx48a5KRWAxkCxA4AW6MiRI0pISNCpU6fq19q1a6ecnBy1a9fOjckAXAxOxQJAC9ShQwc999xzDmvHjx/XfffdJ/69DzRfHLEDgBbKbrdr7NixWrt2rcP6qlWrdMstt7gpFYCLQbEDgBasq
KhICQkJOnHiRP1aaGiocnJyFB0d7cZkAC4Ep2IBoAWLiorSiy++6LBWVlamKVOmcEoWaIYodgDQwt10002aMGGCw9q6deu0cuVK9wQCcME4FQsA0IkTJxQfH6/i4uL6teDgYO3atUuxsbFuTAbgfHDEDgCgtm3b6tVXX3VYO336tO6++27ZbDY3pQJwvih2AABJ0pgxYzRp0iSHtS+//FIvv/yymxIBOF+cigUA1Dt58qT69OmjI0eO1K8FBAQoOztbXbt2dWMyAOeCI3YAgHpt2rTR66+/7rBWVVWlyZMny2q1uikVgHNFsQMAOLjmmms0depUh7UtW7bo2WefdVMiAOeKU7EAgLNUVFSob9++OnToUP1aq1attGPHDvXq1cuNyQD8Go7YAQDOEhQUpDfeeEMGg6F+raamRpMmTZLFYnFjMgC/hmIHAGjQFVdcoWnTpjmsbdu2TUuWLHFTIgC/hVOxAACnqqurNWDAAO3du7d+zcfHR9u2bVO/fv3cmAxAQzhiBwBwyt/fX2+++aaMxv/uLurq6jRx4kTV1ta6MRmAhlDsAAC/atiwYZo1a5bDWnZ2tlJTU92UCIAznIoFAPymmpoaDRkyRDt37qxfM5lM+vrrrzVkyBA3JgPwcxQ7AMA52bFjh4YOHepwVWyvXr2UmZkpf39/NyYD8BNOxQIAzsmAAQOUnJzssLZ79+6z1gC4D0fsAADnrK6uTpdeeqkyMzPr1wwGgzZv3qwRI0a4MRkAiWIHADhPOTk5GjhwoMNVsV27dlVWVpYCAwPdmAwAp2IBAOclPj5eaWlpDmsHDhw468pZAK7HETsAwHmzWq26/PLLlZGR4bD+5Zdf6qqrrnJTKgAUOwDABcnNzVW/fv1UXV1dvxYbG6vs7GyFhIS4MRnQcnEqFgBwQeLi4s56bmx+fr5mzJjhpkQAOGIHALhgNptNo0aN0qZNmxzWP/30U11//fVuSgW0XBQ7AMBF+f7779WnTx9VVFTUr0VFRSknJ0dhYWFuTAa0PJyKBQBclEsuuURPP/20w1pRUZEeeughNyUCWi6O2AEALprdbtf111+vzz77zGF9zZo1Gj9+vJtSAS0PxQ4A0CgKCgqUkJCgkydP1q+Fh4crJydHERER7gsGtCCcigUANIqYmBgtX77cYa2kpERTp04VxxAA16DYAQAazW233aZx48Y5rH388cdatWqVmxIBLQunYgEAjerYsWOKj49XSUlJ/VqbNm20a9cuxcTEuDEZ4P04YgcAaFQRERF66aWXHNZOnjypKVOmcEoWaGIUOwBAo7vxxht1yy23OKx99tlnev31192UCGgZOBULAGgSpaWlio+PV1FRUf1aUFCQdu7cqUsuucR9wQAvxhE7AECTCAsL02uvveawVlFRobvuuks2m81NqQDvRrEDADSZ66+/XnfddZfD2qZNm/Tiiy+6KRHg3TgVCwBoUqdOnVKfPn10+PDh+jV/f39lZWUpLi7OjckA78MROwBAkwoJCdGKFSsc1qqrqzV58mRZrVY3pQK8E8UOANDkRo0apfvvv99hLSMjQ0899ZSbEgHeiVOxAACXqKioUP/+/XXgwIH6NV9fX23fvl3x8fFuTAZ4D47YAQBcIigoSCtXrpTBYKhfq62t1aRJk1RXV+fGZID3oNgBAFxmxIgRmjFjhsNaZmamFi9e7KZEgHfhVCwAwKWqq6s1cOBA7dmzp37NbDZr69atGjBggBuTAc0fR+wAAC7l7++vN998UyaTqX7NYrFo0qRJqqmpcWMyoPmj2AEAXG7o0KGaPXu2w9rOnTv1l7/8xU2JAO/AqVgAgFvU1tZqyJAhys7Orl8zGo3KyMjQsGHD3JgMaL4odgAAt8nKytKQIUMcrort0aOHduzYIX9/fzcmA5onTsUCANymX79+mj9/vsPa3r17NXfuXDclApo3jtgBANzKYrFo+PDh2rZtW/2awWDQpk2bdMUVV7gxGdD8UOwAAG63e/duD
RgwwOGq2M6dOys7O1tBQUFuTAY0L5yKBQC4Xa9evbRw4UKHtUOHDunRRx91UyKgeeKIHQDAI1itVo0cOVJbtmxxWN+wYYOuueYaN6UCmheKHQDAYxw4cEB9+/ZVVVVV/VqHDh20c+dOtWnTxn3BgGaCU7EAAI/RtWtXLV261GHtyJEjmj59unsCAc0MR+wAAB7FZrPp97//vb788kuH9U8++URjxoyR1Wp1eBwZgP/iiB0AwKMYjUa9/vrrat26tcP6lClTdP/996tNmzaKjY0967N4ADhiBwDwUG+88Ybuuusup1/v37+/duzY4cJEgOej2AEAPJLdbtf111+vzz77rMGvGwwGVVdXq1WrVrJarSotLVVxcbGKi4t1vKhINdXVslmtMppMauXvr3ZRUYqMjFRkZKTCwsI4nQuvZHZ3AAAAGlJYWKiDBw86/brdbldOTo7Ky8u1c/t2namslN1iUVB1tUJKS+Vvschot8tmMKjObNbesDBl+vvLYDbLLzBQfQYOVL9+/RQaGurCPxXQtDhiBwDwSDfffLNWr17d4NeioqI0YvhwDejTRwF1dYrNP6zo0lKFVFbKx2p1us06k0mnAgNVGBam/NiOqgsIUOe4OCVedpmio6Ob6o8CuAxH7AAAHqmkpOSsNZPJpOHDhytxyBCFV1So538y1fX0aZlstnPapo/VqvDycoWXl6t3fr6OhIdr/4kTemf/fg1JTFRiYqLMZnaNaL44YgcA8EhffvmlRo8erTNnzkiSIiIiNOb66xUTGqq4PXvUft8+BfkHqE1IyEW9j81gUG5MjPbExSmsQ4z+MGaMoqKiGuOPALgcxQ4A4LH279+vWbNm6T//+Y9uGjtW0VVV6pWZqYDyckmSyWRWZEREo7xXeUCAMnv1UlX79ho34SZ16tSpUbYLuBLFDgDg0fLy8vTuW28p5MAB9fj6a5l+9hm6xix2kmQxGvVNfG+VxsZq/C23UO7Q7HCDYgCAxyoqKtLHq1cr6mihrjx4SGFBwZIM9V8PCgpq1Pcz22y6dFeOwvLz9fHq91VUVNSo2weaGsUOAOCRLBaL1q1dq4CjhRr23Xcy2e0KDAhQdFSUQkPDFBERqcCAgEZ/X6PdrmE538m/8KjWr10ri8XS6O8BNBWKHQDAI6Wnp6vsSIEG7d4t88+uejUYDPL385O5CW8wbLbZNOi73SotKFBGRkaTvQ/Q2Ch2AACPc/ToUW1LT1fP3Fy1rqpyS4aQqir12JerrVu2qLCw0C0ZgPNFsQMAeJyMLVsUVFKiuIICt+boXlCgoJISpW/Z4tYcwLmi2AEAPEpZWZkO5eaqW16+jG6+cYPRblfXvHwd2rdPZWVlbs0CnAuKHQDAo2RlZcmnqkodGnjyhDt0LCmRuapK2dnZ7o4C/CaKHQDAY1itVu3cvl2x+YfP+TFhTc1ks6nT4cPKzsyU9VeeQwt4AoodAMBlDAaDkpKS6n8/c+ZMrVy5sv73paWlOlNZqejS0l/dzsqCAtW6sPhFn/ghV+kvcj3wwAOKiIjQ4MGDXZYF+DUUOwCAywQFBemdd95R+Y+PBPul4uJi2S0Wtamo+NXtvHm0QHUNfP7ObrfL1gSfywuprJTdYlFxcbHD+q233qrPPvus0d8PuFBmdwcAALQcrVq10m233aaXXnpJs2bNql/Pzc3V5MmTlZKSoqDqan1TekIfFBVpWY+emrVvr3IqKmQyGHRnTIyqrTYdq63VzVnfqqOfv17s3VtD//21/l9UlL4+eVJPdu+hd4sKlXHypEwGg2Z17qzENqGy2O1afPCgdpwuV53drodiY3V123B9VFysTaUndMZm04GqKj0Y20lHzpzRFydKFO7rq1d6x8tXUlB1tYqLi5WQkFCfOzExUd9//73rv5GAExQ7AIBLTZs2Tb/73e80bdq0+rW4uDgZjUbt/PZbtSst1QfFxzQuIlK7Kyt05EyNPhv0w6nO0xaLgs1mvV5wRO/166/AH29Sf
NJi0eDWIXr0ks76e8lx5VWf0f8NGKijNTW6Y2e2/j5osD4qLlaMn5+SunZVhcWiG7O+1RWhYZKk/VVV+qj/AJ20WHRd5n+0qHt3PdxpkKbv2a1/lpbqmvBwtS4t03EeMQYPR7EDALhUu3btdMMNN2jFihUO65MnT9bav/1N41u31reny7W4e3dVWC06VlujPx/Yr1FhbTUiNLTBbfoZjboy7IeSlllertHt2sloMKiDn58u8ffXwaoqpZ8sU25VlT4+9sPp1GqbTUW1NZKkS9u0kb/JJH+TST5Go64KaytJ6hEYqIKaH2Z8LRadOXOmSb4nQGOh2AEAXG7mzJkaNWqUrrvuuvq1m266SfPnzVOHSy7RqLZtZTIYFGL20f8NHKTNZaVaUXBEW06WaXbnLmdtz8/o/CPjdv1w0YZdUlq3OA0JCXH4+n9Olcv3Z683SPW/N8pQ/5k9o90mK8+NhYfj4gkAgMt17NhRiYmJ+vDDD+vXgoODFRsbq3d37NDYiEhJUmldnex2u64Lb6cHYmO1u6JSkhRoMqnSya1HBrVurXUlx2Wz21Vw5ozyq6vV2d9fw9u00btFhbL+WNS++40LNH7JZjDKZOZ4CDwbxQ4A4BazZs3S0aNHHdZGJCaqdUCAegQGSpKKa2p0285sjd6+XWkHDurB2FhJ0k1RUbpjZ7bu/+67s7Z7TdtwdfTz0+gd23Xf7u+UGhenVkajbo6KVoSvr/5nx3Zdvz1TLxzOP6+8tWazfP38HNbuueceXXrppcrOzlaHDh308ccfn9c2gcZmsNvd/LwWAAB+dMcdd6imsFALamrP+lqdxaKqykpZbTYFBgaqla+vS7N9cenv1OP3v9dVV13l0vcFzgdH7AAAHuG6667Tt99+q75Dhqjux6tdpR8KXWlZmY4fP6bKqkqdOVOt0hMnZHXhDYrrTCZV+PsrMjLSZe8JXAiO2AEAPMbx48e18uWXNeLf36j1iROqqDjt9ErU8PBw+fq45qjdPbn7lGswKKxtW5l//JzdqlWr1Lt3b5e8P3Cu+BQoAMBjhIWFSSaTDvj7q2PJcadzPj6+8jH7uCzXtGuuUUH//rp/2jSZfnY0EfA0nIoFAHiEjIwMXX/99fp0wwZ9H9Ne1gZuYWIwGBUcFKy2bdvKYDC4JJfVaFRex47qO2gQpQ4ej2IHAHCrr776SqNGjVJiYqI2bNigrKwsVfn46ESHDvUzRoNRwcGtFRkZqeDgYBldVOok6XB4uCwBAerbt6/L3hO4UJyKBQC4nN1u18aNG5WSkqLNmzc7fO3UqVPKPXRIbePiFFFQoNYBgQoIDHRpmfuJzWDQgU6x6ty9u0KdPPUC8CQcsQMAuIzdbteGDRs0YsQIjRo16qxS95Pv9uxRdUSEygcOVFBQkFtKnSTti4lRRXi4EkeMcMv7A+eLYgcAaHJ2u13r1q3T7373O1177bXKyMhocK59+/Z65plntG3bNl02apT2xnVXeUCAi9P+4FRAgPZ2j9PQESMUHR3tlgzA+aLYAQCajN1u1yeffKLBgwfrhhtu0NatWxuc69Chg1544QUdOHBA06ZNU0BAgBITExXaIUaZvXrJ8ivPgm0KFqNRmb17KSwmRsOHD3fpewMXg2IHAGh0NptNa9as0YABAzR27Fht3769wblOnTrplVde0f79+3X//ffL72eP7DKbzbp+zBhVtW+vb+J7y+ai07E2g0HfxPdWdXR7/WHMmPr71gHNATcoBgA0GqvVqg8++EBpaWnKyclxOtelSxfNnTtXd9xxh3x8fv1+dHl5efrw3XcVlp+vYTnfydyET5ywGI36Jr63SmNjNf6WW9SpU6cmey+gKVDsAAAXzWKx6L333lNaWpr27t3rdC4uLk5JSUm69dZbz+tIWF5enj5e/b4Cjh7VoN271bqqqjFiOzgVEKDM3r1UHd1e4ybcRKlDs0SxAwBcsLq6Or3zzjtasGCB9u/f73SuV69eSkpK0oQJEy74J
r9FRUVat3atyo4UqGduruIKCmRshF2YzWDQvpgY7e0ep7CYGP1hzBhFRUVd9HYBd6DYAQDOW21trd566y0tXLhQhw4dcjqXkJCg5ORkjR8/vlGe2mCxWJSenq5t6ekKKilR17x8dSwpkekCTs9ajUYdDg/XgU6xqggP19ARIzR8+HA+U4dmjWIHADhnNTU1euONN7Ro0SLl5+c7nevfv7/mzZun//mf/5GxCa5oPXr0qDLS03Vo3z6Zq6rU6fBhRZ8oVUhlpXysVqevqzOZdCowUIVtw5TXsaMsAQHq3L27ErmlCbwExQ4A8Juqq6v12muvacmSJSooKHA6N3jwYM2bN0833HCDS57lWlZWpuzsbGVnZupMZaXsFouCqqvVurRMvhaLjHabbAajas1mlYeFqsLfXwazWX6Bgeo7aJD69u3LEyXgVSh2AACnqqqq9Morr+iJJ55QUVGR07lhw4Zp/vz5uvbaa11S6H7JarWqtLRUxcXFKi4u1vGiItWeOSOrxSKT2SxfPz+1i4pSZGSkIiMjFRYW1iinhgFPQ7EDAJyloqJCL730kp588kkdO3bM6dyIESM0b948jRo1yi2FDoAjPiEKAKhXXl6uF154QcuWLdOJEyeczo0cOVLz5s3TyJEjKXSAB6HYAQB08uRJPffcc3rmmWdUVlbmdO7qq69WcnKyLrvsMhemA3CuKHYA0IKVlpbqmWee0bPPPqvy8nKnc9ddd52Sk5N16aWXujAdgPNFsQOAFqikpERPPfWUli9froqKCqdzo0ePVnJysoYMGeLCdAAuFMUOAFqQ4uJiLVu2TC+++KIqKyudzo0bN07JyckaMGCAC9MBuFgUOwBoAQoLC7V06VK9/PLLqq6ubnDGYDDoxhtvVFJSkvr27evihAAaA8UOALzYkSNHtGTJEv31r39VTU1NgzNGo1E333yz5s6dq969e7s4IYDGRLEDAC+Ul5enxYsXa8WKFaqtrW1wxmQy6bbbbtOcOXPUo0cPFycE0BQodgDgRQ4ePKhFixZp5cqVslgsDc6YzWZNnDhRjz/+uLp16+bihACaEsUOALxAbm6uFi5cqLfffltWq7XBGR8fH915552aPXu2Onfu7OKEAFyBYgcAzdiePXu0YMECrVq1SjabrcEZX19f3XPPPZo1a5ZiY2NdnBCAK1HsAKAZysnJUVpamlavXi1nj/z28/PTvffeq8cee0wxMTEuTgjAHSh2ANCMZGVlKTU1VR9++KHTGX9/f91///2aOXOmoqKiXJgOgLtR7ACgGcjMzFRqaqo++eQTpzOBgYF68MEHNWPGDEVERLgwHQBPQbEDAA/2zTffKDU1VevWrXM6ExwcrIcffljTp09XeHi4C9MB8DQUOwDwQOnp6UpNTdWGDRuczoSEhGj69OmaNm2aQkNDXZgOgKei2AGAB/nqq6+UkpKijRs3Op0JDQ3VjBkz9NBDDykkJMSF6QB4OoodALiZ3W7Xxo0blZKSos2bNzudCw8P18yZM3X//fcrODjYhQkBNBcUOwBwE7vdrs8//1wpKSnKyMhwOhcREaHHHntMU6dOVWBgoAsTAmhuKHYA4GJ2u13r169XSkqKtm7d6nQuOjpas2bN0pQpUxQQEODChACaK4odALiI3W7X2rVrlZKSou3btzud69Chg2bPnq27775bfn5+LkwIoLmj2AFAE7PZbProo4+UlpamrKwsp3OdOnXS448/rsmTJ6tVq1YuTAjAW1DsAKCJWK1WffDBB0pLS1NOTo7TuS5dumju3Lm644475OPj48KEALwNxQ4AGpnFYtF7772ntLQ07d271+lcXFyckpKSdOutt8ps5scxgIvHTxIAaCR1dXV65513tGDBAu3fv9/pXK9evZSUlKQJEybIZDK5MCEAb0exA4CLVFtbq7feeksLFy7UoUOHnM4lJCQoOTlZ48ePp9ABaBIUOwC4QDU1NXrjjTe0aNEi5efnO53r37+/kpOTNXbsW
BmNRhcmBNDSUOwA4DxVV1frtdde05IlS1RQUOB0bvDgwZo3b55uuOEGGQwGFyYE0FJR7ADgHFVVVemVV17RE088oaKiIqdzw4YN0/z583XttddS6AC4FMUOAH5DRUWFXnrpJT355JM6duyY07nExETNnz9fo0aNotABcAuKHQA4UV5erhdeeEHLli3TiRMnnM6NHDlS8+bN08iRIyl0ANyKYgcAv3Dy5Ek999xzeuaZZ1RWVuZ0btSoUUpOTtbll1/uwnQA4BzFDgB+VFpaqmeeeUbPPvusysvLnc5dd911Sk5O1qWXXurCdADw2yh2AFq8kpISPfXUU1q+fLkqKiqczo0ePVrJyckaMmSIC9MBwLmj2AFosYqLi7Vs2TK9+OKLqqysdDo3btw4JSUlaeDAgS5MBwDnj2IHoMUpLCzU0qVL9fLLL6u6urrBGYPBoBtvvFFJSUnq27evixMCwIWh2AFoMY4cOaIlS5bor3/9q2pqahqcMRqNmjBhgubOnav4+HgXJwSAi0OxA+D18vLytHjxYq1YsUK1tbUNzhiNRt1+++2aM2eOevTo4eKEANA4KHYAvNbBgwe1aNEirVy5UhaLpcEZs9msiRMn6vHHH1e3bt1cnBAAGhfFDoDXyc3N1cKFC/X222/LarU2OOPj46M777xTs2fPVufOnV2cEACaBsUOgNfYs2ePFixYoFWrVslmszU44+vrq3vuuUezZs1SbGysixMCQNOi2AFo9nbt2qW0tDS9//77stvtDc74+fnp3nvv1WOPPaaYmBgXJwQA16DYAWi2srKylJqaqg8//NDpjL+/v+677z7NnDlT0dHRLkwHAK5HsQPQ7GRmZio1NVWffPKJ05nAwEA9+OCDmjFjhiIiIlyYDgDch2IHoNn45ptvlJqaqnXr1jmdCQ4O1sMPP6zp06crPDzchekAwP0odgA8Xnp6ulJTU7VhwwanMyEhIZo+fbqmTZum0NBQF6YDAM9BsQPgsb766iulpKRo48aNTmdCQ0M1Y8YMPfTQQwoJCXFhOgDwPBQ7AB7Fbrdr48aNSklJ0ebNm53OhYeH65FHHtEDDzyg4OBgFyYEAM9FsQPgEex2uz7//HOlpKQoIyPD6VxERIQeffRRTZ06VUFBQS5MCACej2IHwK3sdrvWr1+vlJQUbd261elcdHS0Zs2apSlTpiggIMCFCQGg+aDYAXALu92utWvXKiUlRdu3b3c616FDB82ePVt33323/Pz8XJgQAJofih0Al7LZbProo4+UlpamrKwsp3OxsbGaM2eOJk+erFatWrkwIQA0XxQ7AC5htVr1wQcfKC0tTTk5OU7nunTpojlz5uiOO+6Qr6+vCxMCQPNHsQPQpCwWi9577z2lpaVp7969Tufi4uI0d+5c3XrrrfLx8XFhQgDwHhQ7AE2irq5O77zzjhYsWKD9+/c7nevZs6eSkpI0YcIEmc38SAKAi8FPUQCNqra2Vm+99ZYWLlyoQ4cOOZ1LSEhQcnKyxo8fL5PJ5MKEAOC9KHYAGkVNTY1WrFihxYsXKz8/3+lcv379NG/ePI0dO1ZGo9GFCQHA+1HsAFyU6upqvfbaa1qyZIkKCgqczg0aNEjz5s3T6NGjZTAYXJgQAFoOih2AC1JVVaVXXnlFTzzxhIqKipzODRs2TPPmzdN1111HoQOAJkaxA3BeKioq9NJLL+nJJ5/UsWPHnM4lJiZq/vz5GjVqFIUOAFyEYgfgnJSXl+uFF17QsmXLdOLECadzI0eO1Lx58zRy5EgKHQC4GMUOwK86efKknnvuOT3zzDMqKytzOjdq1CglJyfr8ssvd2E6AMDPUewANKi0tFTPPPOMnn32WZWXlzudu/baa5WcnKzhw4e7MB0AoCEUOwAOSkpK9NRTT2n58uWqqKhwOjd69GglJSVp6NChLkwHAPg1FDsAkqTi4mItW7ZML774oiorK53OjRs3TklJSRo4cKAL0wEAzgXFDmjhCgsLtXTpU
r388suqrq5ucMZgMOjGG29UUlKS+vbt6+KEAIBzRbEDWqgjR45oyZIl+utf/6qampoGZwwGg26++WbNnTtX8fHxLk4IADhfFDughcnLy9PixYu1YsUK1dbWNjhjNBp12223ac6cOerZs6eLEwIALhTFDmghDh48qEWLFmnlypWyWCwNzphMJk2cOFFz5sxRt27dXJwQAHCxKHaAl8vNzdXChQv19ttvy2q1Njjj4+OjO++8U7Nnz1bnzp1dnBAA0FgodoCX2rNnjxYsWKBVq1bJZrM1OOPr66t77rlHs2bNUmxsrIsTAgAaG8UO8DK7du1SWlqa3n//fdnt9gZn/Pz8dO+99+qxxx5TTEyMixMCAJoKxQ7wEllZWUpNTdWHH37odMbf319Tp07Vo48+qujoaBemAwC4AsUOaOYyMzOVmpqqTz75xOlMYGCgHnjgAT3yyCOKiIhwYToAgCtR7IBm6ptvvlFqaqrWrVvndCY4OFgPP/ywpk+frvDwcBemAwC4A8UOaGbS09OVmpqqDRs2OJ0JCQnR9OnT9fDDDyssLMyF6QAA7kSxA5qJr776SikpKdq4caPTmdDQUM2YMUMPPfSQQkJCXJgOAOAJKHaAB7Pb7dq4caNSUlK0efNmp3Ph4eF65JFHdP/996t169YuTAgA8CQUO8AD2e12ff7550pJSVFGRobTuYiICD366KOaOnWqgoKCXJgQAOCJKHaAB7Hb7Vq/fr1SUlK0detWp3PR0dF67LHHdO+99yogIMCFCQEAnoxiB3gAu92utWvXKiUlRdu3b3c616FDB82ePVt33323/Pz8XJgQANAcUOwAN7LZbProo4+UlpamrKwsp3OxsbGaM2eOJk+erFatWrkwIQCgOaHYAW5gtVr1wQcfKC0tTTk5OU7nunTpojlz5uiOO+6Qr6+vCxMCAJojih3gQhaLRe+9957S0tK0d+9ep3NxcXGaO3eubr31Vvn4+LgwIQCgOaPYAS5QV1end955RwsWLND+/fudzvXs2VNJSUmaMGGCzGb+7wkAOD/sOYAmVFtbq7feeksLFy7UoUOHnM4lJCQoOTlZ48ePl8lkcmFCAIA3odgBTaCmpkZvvPGGFi1apPz8fKdz/fr107x58zR27FgZjUYXJgQAeCOKHdCIqqur9dprr2nJkiUqKChwOjdo0CDNmzdPo0ePlsFgcGFCAIA3o9gBjaCqqkqvvPKKnnjiCRUVFTmdGzZsmObNm6frrruOQgcAaHQUO+AiVFRU6KWXXtKTTz6pY8eOOZ1LTEzU/PnzNWrUKAodAKDJUOyAC1BeXq4XXnhBy5Yt04kTJ5zOjRw5UvPmzdPIkSMpdACAJkexA87DyZMntXz5cj399NMqKytzOjdq1CglJyfr8ssvd2E6AEBLR7EDzkFpaameffZZPfvsszp16pTTuWuvvVbJyckaPny4C9MBAPADih3wK0pKSvTUU0/p+eef1+nTp53OjR49WklJSRo6dKgL0wEA4IhiBzSguLhYy5Yt04svvqjKykqnc+PGjVNSUpIGDhzownQAADSMYgf8TGFhoZYuXaqXX35Z1dXVDc4YDAbdeOONSkpKUt++fV2cEAAA5yh2gKQjR47oiSee0KuvvqqampoGZwwGg26++WbNnTtX8fHxLk4IAMBvo9ihRcvLy9OSJUv0+uuvq7a2tsEZo9Go2267TXPmzFHPnj1dnBAAgHNHsUOLdPDgQS1atEgrV66UxWJpcMZkMmnixImaM2eOunXr5uKEAACcP4odWpTc3FwtXLhQb7/9tqxWa4MzPj4+uvPOOzV79mx17tzZxQkBALhwFDu0CHv27NGCBQu0atUq2Wy2Bmd8fX11zz33aNasWYqNjXVxQgAALh7FDl4tJydHaWlpWr16tex2e4Mzfn5+uvfee/XYY48pJibGxQkBAGg8FDt4paysLKWlpWnNmjVOZ/z9/XXfffdp5syZio6OdmE6AACaBsUOXiUzM1Opq
an65JNPnM4EBgbqwQcf1IwZMxQREeHCdAAANC2KHbzC1q1blZKSonXr1jmdCQ4O1sMPP6zp06crPDzchekAAHANih2atYyMDKWkpGjDhg1OZ0JCQjR9+nRNmzZNoaGhLkwHAIBrUezQLG3evFkpKSn6xz/+4XQmNDRUM2bM0EMPPaSQkBAXpgMAwD0odmg27Ha7Nm3apJSUFH311VdO58LDw/XII4/ogQceUHBwsAsTAgDgXhQ7eDy73a4vvvhCKSkpSk9PdzoXERGhRx99VFOnTlVQUJALEwIA4BkodvBYdrtd69evV0pKirZu3ep0Ljo6WrNmzdKUKVMUEBDgwoQAAHgWih08jt1u19q1a5WSkqLt27c7nevQoYNmz56tu+++W35+fi5MCACAZ6LYwWPYbDZ9/PHHSk1NVVZWltO52NhYzZkzR5MnT1arVq1cmBAAAM9GsYPbWa1WrVmzRqmpqcrJyXE616VLF82ZM0d33HGHfH19XZgQAIDmgWIHt7FYLHrvvfe0YMEC7dmzx+lcXFyckpKSdOutt8ps5j9ZAACcYS8Jl6urq9M777yjBQsWaP/+/U7nevbsqeTkZE2YMEEmk8mFCQEAaJ4odnCZ2tpavfXWW1q4cKEOHTrkdC4hIUHJyckaP348hQ4AgPNAsUOTq6mp0RtvvKFFixYpPz/f6Vy/fv00b948jR07Vkaj0YUJAQDwDhQ7NJkzZ87otdde0+LFi1VQUOB0btCgQZo3b55Gjx4tg8HgwoQAAHgXih0aXVVVlV555RUtXbpUhYWFTueGDRum+fPn69prr6XQAQDQCCh2aDQVFRV66aWX9OSTT+rYsWNO5xITEzV//nyNGjWKQgcAQCOi2OGilZeX64UXXtCyZct04sQJp3MjR47UvHnzNHLkSAodAABNgGKHC3by5EktX75cTz/9tMrKypzOjRo1SsnJybr88stdmA4AgJaHYofzVlpaqmeffVbPPvusTp065XTu2muvVXJysoYPH+7CdAAAtFwUO5yzkpISPf3001q+fLlOnz7tdG706NFKSkrS0KFDXZgOAABQ7PCbiouLtWzZMr344ouqrKx0Ojdu3DglJSVp4MCBLkwHAAB+QrGDU4WFhVq6dKlefvllVVdXNzhjMBh04403KikpSX379nVxQgAA8HMUO5zlyJEjeuKJJ/Tqq6+qpqamwRmj0agJEyZo7ty5io+Pd3FCAADQEIod6uXn52vx4sV6/fXXVVtb2+CM0WjU7bffrjlz5qhHjx4uTggAAH4NxQ46dOiQFi1apJUrV6qurq7BGbPZrIkTJ+rxxx9Xt27dXJwQAACcC4pdC5abm6uFCxfq7bffltVqbXDGx8dHd955p2bPnq3OnTu7OCEAADgfFLsWaM+ePVqwYIFWrVolm83W4Iyvr6/uuecezZo1S7GxsS5OCAAALgTFrgXJyclRWlqaVq9eLbvd3uCMn5+f7r33Xj322GOKiYlxcUIAAHAxKHYtQFZWltLS0rRmzRqnM/7+/rrvvvs0c+ZMRUdHuzAdAABoLBQ7L7Z9+3alpqbqb3/7m9OZwMBAPfjgg5oxY4YiIiJcFw4AADQ6ip0X2rp1q1JTU/Xpp586nQkODtbDDz+s6dOnKzw83IXpAABAU6HYeZGMjAylpKRow4YNTmdCQkI0ffp0TZs2TaGhoS5MBwAAmhrFzgts3rxZKSkp+sc//uF0JjQ0VDNmzNBDDz2kkJAQF6YDAACuQrFrpux2uzZt2qSUlBR99dVXTufCw8M1c+ZM3X///QoODnZhQgAA4GoUu2bGbrfriy++UEpKitLT053ORURE6LHHHtPUqVMVGBjowoQAAMBdKHbNhN1u12effaaUlBR98803Tueio6M1a9YsTZkyRQEBAS5MCAAA3I1i5+HsdrvWrl2r1NRUZWZmOp3r0KGDZs+erbvvvlt+fn4uTAgAADwFxc5D2Ww2ffzxx0pNTVVWVpbTuU6dOunxxx/X5
", + "image/png": "", "text/plain": [ "
" ] @@ -13739,7 +13511,7 @@ }, { "data": { - "image/png": "", + "image/png": "", "text/plain": [ "
" ] @@ -13749,7 +13521,7 @@ }, { "data": { - "image/png": "", + "image/png": "", "text/plain": [ "
" ] @@ -13759,7 +13531,7 @@ }, { "data": { - "image/png": "", + "image/png": "", "text/plain": [ "
" ] @@ -13769,7 +13541,7 @@ }, { "data": { - "image/png": "", + "image/png": "", "text/plain": [ "
" ] @@ -13779,7 +13551,7 @@ }, { "data": { - "image/png": "", + "image/png": "", "text/plain": [ "
" ] @@ -13789,7 +13561,7 @@ }, { "data": { - "image/png": "", + "image/png": "", "text/plain": [ "
" ] @@ -13799,7 +13571,7 @@ }, { "data": { - "image/png": "", + "image/png": "", "text/plain": [ "
" ] @@ -13809,7 +13581,7 @@ }, { "data": { - "image/png": "", + "image/png": "", "text/plain": [ "
" ] @@ -13819,7 +13591,7 @@ }, { "data": { - "image/png": "", + "image/png": "", "text/plain": [ "
" ] @@ -13854,12 +13626,12 @@ }, { "cell_type": "code", - "execution_count": 59, + "execution_count": 40, "metadata": {}, "outputs": [ { "data": { - "image/png": "", + "image/png": "", "text/plain": [ "
" ] @@ -13901,13 +13673,13 @@ }, { "cell_type": "code", - "execution_count": 60, + "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
FeatureUnion(transformer_list=[('featureunion',\n",
-       "                                FeatureUnion(transformer_list=[('binarizer',\n",
-       "                                                                Binarizer(threshold=0.1286154935127))])),\n",
-       "                               ('passthrough', Passthrough())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" + "
FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                FeatureUnion(transformer_list=[('kbinsdiscretizer',\n",
+       "                                                                KBinsDiscretizer(encode='onehot-dense',\n",
+       "                                                                                 n_bins=9,\n",
+       "                                                                                 strategy='uniform')),\n",
+       "                                                               ('quantiletransformer',\n",
+       "                                                                QuantileTransformer(n_quantiles=697))])),\n",
+       "                               ('passthrough', Passthrough())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ "FeatureUnion(transformer_list=[('featureunion',\n", - " FeatureUnion(transformer_list=[('binarizer',\n", - " Binarizer(threshold=0.1286154935127))])),\n", + " FeatureUnion(transformer_list=[('kbinsdiscretizer',\n", + " KBinsDiscretizer(encode='onehot-dense',\n", + " n_bins=9,\n", + " strategy='uniform')),\n", + " ('quantiletransformer',\n", + " QuantileTransformer(n_quantiles=697))])),\n", " ('passthrough', Passthrough())])" ] }, - "execution_count": 60, + "execution_count": 41, "metadata": {}, "output_type": "execute_result" } @@ -14354,13 +14138,13 @@ }, { "cell_type": "code", - "execution_count": 61, + "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
FeatureUnion(transformer_list=[('featureunion',\n",
+       "
FeatureUnion(transformer_list=[('featureunion',\n",
        "                                FeatureUnion(transformer_list=[('quantiletransformer',\n",
-       "                                                                QuantileTransformer(n_quantiles=98,\n",
-       "                                                                                    output_distribution='normal'))])),\n",
-       "                               ('passthrough', Passthrough())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
QuantileTransformer(n_quantiles=842)
Passthrough()
" ], "text/plain": [ "FeatureUnion(transformer_list=[('featureunion',\n", " FeatureUnion(transformer_list=[('quantiletransformer',\n", - " QuantileTransformer(n_quantiles=98,\n", - " output_distribution='normal'))])),\n", + " QuantileTransformer(n_quantiles=842))])),\n", " ('passthrough', Passthrough())])" ] }, - "execution_count": 61, + "execution_count": 42, "metadata": {}, "output_type": "execute_result" } @@ -14802,13 +14583,13 @@ }, { "cell_type": "code", - "execution_count": 62, + "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
FeatureUnion(transformer_list=[('featureunion',\n",
+       "
FeatureUnion(transformer_list=[('featureunion',\n",
        "                                FeatureUnion(transformer_list=[('estimatortransformer-1',\n",
-       "                                                                EstimatorTransformer(estimator=BernoulliNB(alpha=0.0316290799363))),\n",
+       "                                                                EstimatorTransformer(estimator=LogisticRegression(C=3553.613707181859,\n",
+       "                                                                                                                  max_iter=1000,\n",
+       "                                                                                                                  n_jobs=1,\n",
+       "                                                                                                                  solver='saga'))),\n",
        "                                                               ('estimatortransformer-2',\n",
-       "                                                                EstimatorTransformer(estimator=GaussianNB()))])),\n",
-       "                               ('passthrough', Passthrough())])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
LogisticRegression(C=3553.613707181859, max_iter=1000, n_jobs=1, solver='saga')
LogisticRegression(C=3553.613707181859, max_iter=1000, n_jobs=1, solver='saga')
GaussianNB()
GaussianNB()
MultinomialNB(alpha=0.0128552259108, fit_prior=False)
MultinomialNB(alpha=0.0128552259108, fit_prior=False)
Passthrough()
" ], "text/plain": [ "FeatureUnion(transformer_list=[('featureunion',\n", " FeatureUnion(transformer_list=[('estimatortransformer-1',\n", - " EstimatorTransformer(estimator=BernoulliNB(alpha=0.0316290799363))),\n", + " EstimatorTransformer(estimator=LogisticRegression(C=3553.613707181859,\n", + " max_iter=1000,\n", + " n_jobs=1,\n", + " solver='saga'))),\n", " ('estimatortransformer-2',\n", - " EstimatorTransformer(estimator=GaussianNB()))])),\n", + " EstimatorTransformer(estimator=GaussianNB())),\n", + " ('estimatortransformer-3',\n", + " EstimatorTransformer(estimator=MultinomialNB(alpha=0.0128552259108,\n", + " fit_prior=False)))])),\n", " ('passthrough', Passthrough())])" ] }, - "execution_count": 62, + "execution_count": 43, "metadata": {}, "output_type": "execute_result" } @@ -15252,13 +15051,13 @@ }, { "cell_type": "code", - "execution_count": 63, + "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
Pipeline(steps=[('robustscaler',\n",
-       "                 RobustScaler(quantile_range=(0.1503060406741,\n",
-       "                                              0.8118816788829))),\n",
+       "
Pipeline(steps=[('normalizer', Normalizer(norm='max')),\n",
        "                ('featureunion-1',\n",
-       "                 FeatureUnion(transformer_list=[('skiptransformer',\n",
-       "                                                 SkipTransformer()),\n",
+       "                 FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                                 FeatureUnion(transformer_list=[('columnonehotencoder',\n",
+       "                                                                                 ColumnOneHotEncoder())])),\n",
        "                                                ('passthrough',\n",
        "                                                 Passthrough())])),\n",
        "                ('featureunion-2',\n",
@@ -15675,15 +15473,15 @@
        "                                                 SkipTransformer()),\n",
        "                                                ('passthrough',\n",
        "                                                 Passthrough())])),\n",
-       "                ('sgdclassifier',\n",
-       "                 SGDClassifier(alpha=0.0002054334005, eta0=0.5702721028736,\n",
-       "                               l1_ratio=0.984925401959, loss='modified_huber',\n",
-       "                               n_jobs=1, penalty='elasticnet'))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
Normalizer(norm='max')
FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                FeatureUnion(transformer_list=[('columnonehotencoder',\n",
+       "                                                                ColumnOneHotEncoder())])),\n",
+       "                               ('passthrough', Passthrough())])
ColumnOneHotEncoder()
Passthrough()
FeatureUnion(transformer_list=[('skiptransformer', SkipTransformer()),\n",
+       "                               ('passthrough', Passthrough())])
SkipTransformer()
Passthrough()
BaggingClassifier(bootstrap_features=True, max_features=0.6083887402217,\n",
+       "                  max_samples=0.440010144908, n_estimators=24, n_jobs=1,\n",
+       "                  oob_score=True)
" ], "text/plain": [ - "Pipeline(steps=[('robustscaler',\n", - " RobustScaler(quantile_range=(0.1503060406741,\n", - " 0.8118816788829))),\n", + "Pipeline(steps=[('normalizer', Normalizer(norm='max')),\n", " ('featureunion-1',\n", - " FeatureUnion(transformer_list=[('skiptransformer',\n", - " SkipTransformer()),\n", + " FeatureUnion(transformer_list=[('featureunion',\n", + " FeatureUnion(transformer_list=[('columnonehotencoder',\n", + " ColumnOneHotEncoder())])),\n", " ('passthrough',\n", " Passthrough())])),\n", " ('featureunion-2',\n", @@ -15714,13 +15514,14 @@ " SkipTransformer()),\n", " ('passthrough',\n", " Passthrough())])),\n", - " ('sgdclassifier',\n", - " SGDClassifier(alpha=0.0002054334005, eta0=0.5702721028736,\n", - " l1_ratio=0.984925401959, loss='modified_huber',\n", - " n_jobs=1, penalty='elasticnet'))])" + " ('baggingclassifier',\n", + " BaggingClassifier(bootstrap_features=True,\n", + " max_features=0.6083887402217,\n", + " max_samples=0.440010144908, n_estimators=24,\n", + " n_jobs=1, oob_score=True))])" ] }, - "execution_count": 63, + "execution_count": 44, "metadata": {}, "output_type": "execute_result" } @@ -15785,13 +15586,13 @@ }, { "cell_type": "code", - "execution_count": 76, + "execution_count": 45, "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
Pipeline(steps=[('passthrough', Passthrough()),\n",
+       "
Pipeline(steps=[('passthrough', Passthrough()),\n",
        "                ('variancethreshold',\n",
-       "                 VarianceThreshold(threshold=0.0033508168395)),\n",
+       "                 VarianceThreshold(threshold=0.0014368451974)),\n",
        "                ('featureunion-1',\n",
-       "                 FeatureUnion(transformer_list=[('skiptransformer',\n",
-       "                                                 SkipTransformer()),\n",
+       "                 FeatureUnion(transformer_list=[('featureunion',\n",
+       "                                                 FeatureUnion(transformer_list=[('powertransformer',\n",
+       "                                                                                 PowerTransformer()),\n",
+       "                                                                                ('nystroem',\n",
+       "                                                                                 Nystroem(gamma=0.8842695866347,\n",
+       "                                                                                          kernel='sigmoid',\n",
+       "                                                                                          n_components=7))])),\n",
        "                                                ('passthrough',\n",
-       "                                                 Passthrough())])),\n",
-       "                ('featureunion-2',\n",
+       "                                                 Passth...\n",
        "                 FeatureUnion(transformer_list=[('featureunion',\n",
        "                                                 FeatureUnion(transformer_list=[('estimatortransformer',\n",
-       "                                                                                 EstimatorTransformer(estimator=AdaBoostClassifier(algorithm='SAMME',\n",
-       "                                                                                                                                   learning_rate=0.0473135874378,\n",
-       "                                                                                                                                   n_estimators=436)))])),\n",
+       "                                                                                 EstimatorTransformer(cross_val_predict_cv=5,\n",
+       "                                                                                                      estimator=BaggingClassifier(bootstrap=False,\n",
+       "                                                                                                                                  max_features=0.2031842311627,\n",
+       "                                                                                                                                  max_samples=0.4743985327407,\n",
+       "                                                                                                                                  n_estimators=89,\n",
+       "                                                                                                                                  n_jobs=1)))])),\n",
        "                                                ('passthrough',\n",
        "                                                 Passthrough())])),\n",
-       "                ('linearsvc', LinearSVC(C=0.012266617842, penalty='l1'))])
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
BaggingClassifier(bootstrap=False, max_features=0.2031842311627,\n",
+       "                  max_samples=0.4743985327407, n_estimators=89, n_jobs=1)
BaggingClassifier(bootstrap=False, max_features=0.2031842311627,\n",
+       "                  max_samples=0.4743985327407, n_estimators=89, n_jobs=1)
Passthrough()
BernoulliNB(alpha=4.2777686142181)
" ], "text/plain": [ "Pipeline(steps=[('passthrough', Passthrough()),\n", " ('variancethreshold',\n", - " VarianceThreshold(threshold=0.0033508168395)),\n", + " VarianceThreshold(threshold=0.0014368451974)),\n", " ('featureunion-1',\n", - " FeatureUnion(transformer_list=[('skiptransformer',\n", - " SkipTransformer()),\n", + " FeatureUnion(transformer_list=[('featureunion',\n", + " FeatureUnion(transformer_list=[('powertransformer',\n", + " PowerTransformer()),\n", + " ('nystroem',\n", + " Nystroem(gamma=0.8842695866347,\n", + " kernel='sigmoid',\n", + " n_components=7))])),\n", " ('passthrough',\n", - " Passthrough())])),\n", - " ('featureunion-2',\n", + " Passth...\n", " FeatureUnion(transformer_list=[('featureunion',\n", " FeatureUnion(transformer_list=[('estimatortransformer',\n", - " EstimatorTransformer(estimator=AdaBoostClassifier(algorithm='SAMME',\n", - " learning_rate=0.0473135874378,\n", - " n_estimators=436)))])),\n", + " EstimatorTransformer(cross_val_predict_cv=5,\n", + " estimator=BaggingClassifier(bootstrap=False,\n", + " max_features=0.2031842311627,\n", + " max_samples=0.4743985327407,\n", + " n_estimators=89,\n", + " n_jobs=1)))])),\n", " ('passthrough',\n", " Passthrough())])),\n", - " ('linearsvc', LinearSVC(C=0.012266617842, penalty='l1'))])" + " ('bernoullinb', BernoulliNB(alpha=4.2777686142181))])" ] }, - "execution_count": 76, + "execution_count": 45, "metadata": {}, "output_type": "execute_result" } @@ -16269,7 +16100,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 46, "metadata": {}, "outputs": [], "source": [ @@ -16304,7 +16135,7 @@ }, { "cell_type": "code", - "execution_count": 64, + "execution_count": 47, "metadata": {}, "outputs": [], "source": [ @@ -16324,20 +16155,22 @@ }, { "cell_type": "code", - "execution_count": 65, + "execution_count": 48, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ - "Generation: 100%|██████████| 5/5 [00:55<00:00, 11.02s/it]\n" + 
"Generation: : 8it [01:44, 13.07s/it]\n", + "/home/perib/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/preprocessing/_data.py:2785: UserWarning: n_quantiles (911) is greater than the total number of samples (284). n_quantiles is set to n_samples.\n", + " warnings.warn(\n" ] }, { "data": { "text/html": [ - "
TPOTEstimator(classification=True, cv=5, early_stop=2, generations=5,\n",
-       "              max_eval_time_mins=10, n_jobs=4,\n",
+       "
TPOTEstimator(classification=True, cv=5, early_stop=2, max_time_mins=10,\n",
+       "              n_jobs=4,\n",
        "              scorers=['roc_auc_ovr',\n",
-       "                       <function complexity_scorer at 0x73f2b05de710>],\n",
+       "                       <function complexity_scorer at 0x78eb3afa4160>],\n",
        "              scorers_weights=[1.0, -1.0],\n",
-       "              search_space=<tpot2.search_spaces.pipelines.sequential.SequentialPipeline object at 0x73f3bd5db070>,\n",
-       "              verbose=2)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.