Doc2 #156

Merged
merged 56 commits into from
Oct 10, 2024


perib
Collaborator

@perib perib commented Sep 29, 2024

This PR builds on my previous PR. Because it is branched from that PR, it also includes those changes.

Various edits, additions, and rewrites to all tutorials, including rewritten sections of tutorials 3 and 4.

Added docstrings to various classes and functions. Edited existing documentation to make some ideas clearer.

Some code cleanup, including comments, docstrings, and renamed variables.

renamed params/classes:

  • early_stop_seconds -> early_stop_mins
  • SklearnIndividualGenerator -> SearchSpace (This is the base search space class that users will interact with.)
  • GraphPipeline (the search space) -> GraphSearchPipeline. I realized we had two GraphPipeline classes: one for the search space and one for the sklearn estimator.
  • renamed the cv early stopping params to cv pruning to avoid confusion with early stopping on the entire optimization process

fixes including:

  • replaced configspace.add_hyperparameter(s), configspace.add_condition(s) with "configspace.add" (deprecation fix)
  • fixed mdr search space to correctly create the regression version of the pipeline
  • fixed a bug where max_eval_time_mins was being evaluated as seconds instead of minutes.
  • FSSNode will no longer select the current feature set during mutation. Mutation will always select a new feature set.
  • fixed a bug in the base individual class that prevented reproducibility. added some additional rng = default_rng(rng) checks just in case.
  • removed unused params from tpotestimator
  • a few fixes to the hyperparameter search spaces. Some tree regressors needed to have their criterion options updated to the correct strings for the latest version.
  • fixed an issue where the gradual version of estimator node was accidentally removing categorical/fixed parameters from the params dictionary.
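
The max_eval_time_mins fix in the list above is a unit mismatch: a value given in minutes was treated as seconds, so each evaluation was cut off 60 times sooner than intended. A minimal sketch of the kind of conversion involved is below. The helper name `eval_timeout_seconds`, its handling of None and infinity as "no limit", and the positivity check are assumptions made for illustration; this is not TPOT's actual code.

```python
import math

SECONDS_PER_MINUTE = 60


def eval_timeout_seconds(max_eval_time_mins):
    """Convert a per-evaluation limit in minutes to seconds.

    Hypothetical helper: None or infinity is treated as "no limit"
    and returned as None.
    """
    if max_eval_time_mins is None or math.isinf(max_eval_time_mins):
        return None
    if max_eval_time_mins <= 0:
        raise ValueError("max_eval_time_mins must be positive")
    # The bug class described above: passing the minutes value straight
    # to a timer that expects seconds. Convert explicitly instead.
    return max_eval_time_mins * SECONDS_PER_MINUTE
```

Keeping the unit in both the parameter name and the helper name makes this kind of mismatch easier to spot in review.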

changes including:

  • replaced "PassKBinsDiscretizer" with "KBinsDiscretizer". The former also passed through the input data, but this is unnecessary now that we have feature unions and the standalone passthrough class
  • Changed a few default values to what should normally be used (initial defaults were mostly about making it faster to run rather than what is likely better).
  • default cv is 10
  • default search space is linear and not linear-light
  • removed memory limit by default
  • added genetic encoders and the other autoqtl transformers/selectors to get configspace
  • added an iterative imputer option that also simultaneously learns the parameters of the inner estimator
  • removed cross_val_predict_cv from the estimators. use wrapper pipeline instead (or set in graphsearchpipeline).
  • added cross_val_predict_cv to the template search spaces
  • edited some of the docs, such as contribute.md and issue template.md
  • added **kwargs to the export_pipeline functions. Functions that return an sklearn pipeline or a graphpipeline now have options for passing a memory parameter and cross_val_predict_cv. This can be used to pass through TPOT's memory parameter to the final pipelines.
  • added template search spaces for linear-light, graph-light which are lighter/faster versions of linear and graph. added mdr search space

…eline to GraphSearchPipeline, add cross_val_predict_cv option to templates, param update
@perib
Collaborator Author

perib commented Sep 30, 2024

This should address #112
All random decisions now come from np.random.Generator and nothing should rely on np.random.seed()
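
As a rough illustration of the `rng = default_rng(rng)` pattern mentioned in the PR description, the sketch below normalizes whatever the caller passes (None, an int seed, or an existing Generator) into an `np.random.Generator` and uses it for one random choice. The function `pick_new_feature_set` and its arguments are hypothetical, loosely modelled on the FSSNode mutation change; it is not the code in this PR.

```python
import numpy as np


def pick_new_feature_set(current, options, rng=None):
    """Pick a feature set from `options` that differs from `current`.

    Hypothetical helper for illustration only.
    """
    # default_rng accepts None, an int seed, or an existing Generator.
    # An existing Generator is returned unaltered, so the caller's
    # random stream is advanced rather than replaced.
    rng = np.random.default_rng(rng)
    candidates = [name for name in options if name != current]
    if not candidates:
        return current  # nothing else available to switch to
    return candidates[rng.integers(len(candidates))]
```

Because no call touches the global state set by np.random.seed(), two runs given the same seed make the same choice, and passing one Generator through nested calls keeps the whole run on a single reproducible stream.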

Also addresses #146, baikal has been removed (for now).

The bugs identified in the following issues have also been addressed: #150, #151, #152

@nickotto nickotto merged commit d8ea9ec into EpistasisLab:main Oct 10, 2024
1 check passed
3 participants