Slight AoU docs adjustment [VS-1366] #8955

Merged Aug 19, 2024 (4 commits)

@mcovarr (Collaborator) commented Aug 14, 2024

No description provided.

```diff
- - If re-running the extract workflow with call caching enabled, it may be necessary to increase memory in the `ExtractTask` / `PgenExtractTask` tasks. Due to the way call caching works in Cromwell (i.e. the `memory` attribute is not part of the call caching hashes), it is possible to edit the value of the `memory` runtime attribute of a task without breaking call caching. However, do *not* alter the value of the `memory_gib` input parameter as changing that absolutely will break call caching and will cause tens of thousands of shards to re-run needlessly!
-   - For Echo ACAF VCF extract, the VCF extract workflow was call-caching re-run with `ExtractTask` memory hard-coded to 100 GiB. 9/9 extract shards which did not complete on the initial run of the workflow succeeded on their first (non-preempted) attempt in the second run of the workflow.
-   - For Echo ACAF PGEN extract, the PGEN extract workflow was call-caching re-run with `PgenExtractTask` memory hard-coded to 50 GiB. 20/24 extract shards which did not complete on the initial run of the workflow succeeded on their first (non-preempted) attempt in the second run of the workflow. The remaining 4 shards hit `OutOfMemoryErrors` on their first attempt but succeeded on the second attempt with 50 GiB * 1.5 = 75 GiB of memory thanks to "retry with more memory".
+ - When re-running the extract workflow with call caching enabled, it will be necessary to increase memory in the `ExtractTask` / `PgenExtractTask` tasks. Due to the way call caching works in Cromwell (i.e. the `memory` runtime attribute is not part of the call caching hashes), it is possible to edit the value of the `memory` runtime attribute of a task without breaking call caching. However, do *not* alter the value of the `memory_gib` input parameter as changing that absolutely will break call caching and will cause tens of thousands of shards to re-run needlessly! Both VCF and PGEN extracts can have their memory set to `"50 GiB"` for the call-caching re-run. Most extract shards should finish on the first re-run attempt, but a few stragglers will likely OOM and automatically re-run with more memory.
```
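
The doc lines above rest on two ideas: editing the `memory` runtime attribute does not change a call's cache hash while editing the `memory_gib` input does, and an out-of-memory retry multiplies the memory request (50 GiB * 1.5 = 75 GiB). The sketch below is a toy model of both, not Cromwell's actual hashing code. The function names, the command string, the input value of 12, and the docker image name are all hypothetical and chosen only for illustration.

```python
import hashlib
import json

# Toy model only: Cromwell's real hashing is more involved. The single property
# borrowed from the doc text is that the `memory` runtime attribute is left out
# of the hash, while task inputs such as `memory_gib` are included.
UNHASHED_RUNTIME_ATTRS = {"memory"}  # assumption for this sketch


def toy_call_hash(command, inputs, runtime):
    """Hash the command, the inputs, and every runtime attribute not excluded above."""
    hashed_runtime = {
        key: value
        for key, value in runtime.items()
        if key not in UNHASHED_RUNTIME_ATTRS
    }
    payload = json.dumps(
        {"command": command, "inputs": inputs, "runtime": hashed_runtime},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def retry_memory_gib(initial_gib, multiplier, attempt):
    """Memory for a 1-based attempt number when each OOM retry multiplies memory."""
    return initial_gib * multiplier ** (attempt - 1)


command = "extract --shard 17"  # hypothetical command line
inputs = {"memory_gib": 12}     # hypothetical original input value
runtime = {"memory": "12 GiB", "docker": "example/image:tag"}  # hypothetical

baseline = toy_call_hash(command, inputs, runtime)

# Hard-coding only the runtime attribute leaves the toy hash unchanged (cache hit).
runtime_edit = toy_call_hash(command, inputs, dict(runtime, memory="50 GiB"))

# Changing the input value changes the toy hash (cache miss, shard re-runs).
input_edit = toy_call_hash(command, dict(inputs, memory_gib=50), runtime)

print(baseline == runtime_edit)      # True
print(baseline == input_edit)        # False
print(retry_memory_gib(50, 1.5, 2))  # 75.0
```

In this model the second attempt of a shard started at 50 GiB requests 75 GiB, matching the figure reported for the four PGEN shards that hit out-of-memory errors on their first attempt.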
Collaborator

I'm confused. Are you suggesting to edit the WDL to change the value of the memory runtime attribute? Wouldn't that break caching? I must be missing something.

@gbggrant (Collaborator) left a comment

LGTM. One small suggested update to the documentation.

@mcovarr merged commit 48f4bf5 into ah_var_store on Aug 19, 2024
15 of 21 checks passed
@mcovarr deleted the vs_1366_updates_updates branch on August 19, 2024 at 16:59