Feature request: Enable distribution of estimation across a grid #38

Open
billdenney opened this issue Dec 14, 2022 · 5 comments

@billdenney
Contributor

When writing up the profile method, I was thinking that it would be useful to be able to send the different parts to be run in parallel and more generally to be able to run them on a grid (or at least not directly in the current R session).

Specific places where this could help are bootstrap and likelihood profiling, but it would be generally helpful, I think.

To do it, I'd not want to support our own grid queueing system; we would build on something that others are already doing. I think that the clustermq library would be the preferred underlying choice.

My current brainstorm for it is that we would make a new function called something like nlmixr2Q (the "Q" mirrors the main command used in the clustermq library). It would take in either multiple models (a list of model specifications) or multiple datasets (a list of data objects), but not both. All other function arguments would be applied to all of the models/datasets.

It would queue things up if clustermq is set up, and it would work just like running nlmixr2 serially if clustermq is not set up.
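
As a rough illustration of that idea, here is a minimal sketch under several assumptions: the name `nlmixr2Q_sketch`, the `fit_fun` argument (a stand-in for `nlmixr2` so the example runs without it), and the use of the `clustermq.scheduler` option as a proxy for "clustermq is set up" are all hypothetical choices, not an existing API.

```r
# Hypothetical sketch: fit many models OR many datasets, but not both.
# `fit_fun` stands in for nlmixr2::nlmixr2; it is called as fit_fun(model, data, ...).

# A data.frame is technically a list, so exclude it when detecting "many".
is_plain_list <- function(x) is.list(x) && !is.data.frame(x)

# Worker function: everything it needs is passed in as arguments.
run_one <- function(m, d, fit_fun, extra) {
  do.call(fit_fun, c(list(m, d), extra))
}

nlmixr2Q_sketch <- function(model, data, ..., fit_fun, n_jobs = 2) {
  many_models <- is_plain_list(model)
  many_data <- is_plain_list(data)
  if (many_models && many_data) {
    stop("Supply multiple models or multiple datasets, not both")
  }
  # Wrap the single-valued argument and recycle it to the common length.
  if (!many_models) model <- list(model)
  if (!many_data) data <- list(data)
  n <- max(length(model), length(data))
  model <- rep_len(model, n)
  data <- rep_len(data, n)
  shared <- list(fit_fun = fit_fun, extra = list(...))

  # Assumption: treat a configured scheduler option as "clustermq is set up".
  use_queue <- !is.null(getOption("clustermq.scheduler")) &&
    requireNamespace("clustermq", quietly = TRUE)

  if (use_queue) {
    # Packages needed by fit_fun on the workers would go in Q()'s `pkgs` argument.
    clustermq::Q(run_one, m = model, d = data, const = shared, n_jobs = n_jobs)
  } else {
    # Serial fallback: same calls, run in the current session.
    Map(run_one, model, data, MoreArgs = shared)
  }
}

# Toy usage with lm() in place of nlmixr2: two models, one dataset.
toy_fit <- function(model, data, ...) lm(model, data = data, ...)
fits <- nlmixr2Q_sketch(list(mpg ~ wt, mpg ~ hp), mtcars, fit_fun = toy_fit)
```

Passing the shared arguments through `const` rather than relying on the enclosing environment keeps the worker function self-contained, which matters once it runs outside the current R session.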

Thoughts?

@billdenney billdenney changed the title Enable distribution of estimation across a grid Feature request: Enable distribution of estimation across a grid Dec 14, 2022
@billdenney
Contributor Author

FYI, I can probably take it on, if there's interest.

@mattfidler
Member

What about future, since it lets you switch to other parallel processing backends?

@mattfidler
Member

I would also say a method to run multiple models on a grid would be helpful for multiple things here.

@billdenney
Contributor Author

I've found future harder to use than clustermq in some limited tests, but I'll check it out in more detail. There are a few discussions (by people whose opinions I trust like Will Landau) where it's stated that clustermq is more efficient, so that gives me a nudge in that direction.

@mattfidler
Member

It would be more efficient (but less flexible) if there is only clustermq. Any front end to multiple backends would have the same property.

I simply worry about the diversity of parallel processing paradigms in R; I think I would simply want a front end to them to allow more than one type.
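
To make the front-end idea concrete, here is a minimal sketch assuming the future.apply package is used as that front end. The helper name `fit_all` is hypothetical, and `lm()` again stands in for a model fit so the example runs without nlmixr2.

```r
# Hypothetical helper: map `fit_one` over a list of inputs through the
# future framework, so the caller chooses the backend with future::plan().
fit_all <- function(inputs, fit_one, ...) {
  if (requireNamespace("future.apply", quietly = TRUE)) {
    # future.seed = TRUE requests parallel-safe random number streams.
    future.apply::future_lapply(inputs, fit_one, ..., future.seed = TRUE)
  } else {
    # Fall back to a plain serial loop if future.apply is not installed.
    lapply(inputs, fit_one, ...)
  }
}

# The caller picks the backend; the fitting code does not change, e.g.:
# future::plan(future::multisession, workers = 2)      # local R sessions
# future::plan(future.batchtools::batchtools_slurm)    # a Slurm cluster

# Toy usage: one fit per subset of the data.
fits_by_cyl <- fit_all(split(mtcars, mtcars$cyl),
                       function(d) lm(mpg ~ wt, data = d))
```

With no plan set, future runs sequentially, so this behaves like the serial case until a backend is selected.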
