Parameter max_models does not work properly sometimes #16376

Open
baobabtree opened this issue Aug 27, 2024 · 5 comments

@baobabtree

baobabtree commented Aug 27, 2024

H2O version, Operating System and Environment

H2O 3.46.0.3
Windows 10, R 4.4.1
wsl2 Ubuntu 24.04 (Windows subsystem for Linux), R 4.4.1, Rstudio Server
I tried on both Windows and wsl2.

Actual behavior

  1. With max_models = 20, I usually get 23 models, including 3 Stacked Ensembles. However, sometimes I get more than 40 models.
    With max_models = 30, I likewise sometimes get more than 60 models.

  2. The models are not evenly distributed.
    Sometimes most of the models are GBM.
    Sometimes most of the models are DeepLearning.
    It varies from time to time.

  3. If I set max_runtime_secs and the run exceeds that time limit, no Stacked Ensembles are built.
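
One way to check points 1 and 2 is to group the leaderboard's model IDs by algorithm and count the Stacked Ensembles separately from the base models. The following is a minimal sketch in base R, assuming AutoML model IDs begin with the algorithm name followed by an underscore (for example GBM_... or StackedEnsemble_...). count_automl_models is a hypothetical helper written for this illustration and is not part of the h2o package.

```r
# Count leaderboard models per algorithm family and separate Stacked Ensembles
# from base models (the ones max_models is documented to limit).
# Assumption: each model ID starts with the algorithm name followed by "_".
count_automl_models <- function(model_ids) {
  # Keep only the text before the first underscore, e.g. "GBM" or "StackedEnsemble".
  family <- sub("_.*$", "", as.character(model_ids))
  is_se <- family == "StackedEnsemble"
  list(
    by_family   = table(family),  # models per algorithm family
    base_models = sum(!is_se),    # compare this number with max_models
    ensembles   = sum(is_se)      # Stacked Ensembles, not counted by max_models
  )
}

# Usage with the `fit` object returned by h2o.automl()
# (needs a running H2O cluster, so it is left commented out here):
# ids <- as.data.frame(fit@leaderboard)$model_id
# counts <- count_automl_models(ids)
# counts$base_models
# counts$by_family
```

If base_models is well above max_models, the extra models are not explained by Stacked Ensembles alone.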

Screenshots

I set max_models = 30 and got 53 models in total.

fit <- h2o.automl(x = x, y = y, training_frame = as.h2o(yx), leaderboard_frame = newyx, nfolds = -1,
                  weights_column = NULL, max_models = 30, stopping_metric = "AUTO", exclude_algos = c("DRF"),
                  exploitation_ratio = -1, keep_cross_validation_predictions = TRUE, sort_metric = "AUTO",
                  monotone_constraints = mono)

[Two screenshots attached]

I attached the logs. This time I got 62 models in total.
logs.zip

baobabtree added the bug label Aug 27, 2024
@tomasfryda
Contributor

Is it possible that the extra models are just Stacked Ensembles? Stacked Ensembles are not included in the max_models. See https://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/algo-params/max_models.html

@baobabtree
Author

> Is it possible that the extra models are just Stacked Ensembles? Stacked Ensembles are not included in the max_models. See https://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/algo-params/max_models.html

There are only 3 Stacked Ensembles. I attached screenshots in the description.

@tomasfryda
Contributor

Thanks, that does look like a bug. Would you be able to provide us with logs?
(h2o.downloadAllLogs(filename = "logs.zip"); https://docs.h2o.ai/h2o/latest-stable/h2o-docs/logs.html)

@baobabtree
Author

> Thanks, that does look like a bug. Would you be able to provide us with logs? (h2o.downloadAllLogs(filename = "logs.zip"); https://docs.h2o.ai/h2o/latest-stable/h2o-docs/logs.html)

I attached the logs in the description.

@tomasfryda
Contributor

Thank you @baobabtree!
