Update formatting.
VJalili committed Jun 4, 2024
1 parent 2a100a0 commit 17bbeb9
Showing 1 changed file with 74 additions and 36 deletions.
110 changes: 74 additions & 36 deletions website/docs/advanced/build_inputs.md
Expand Up @@ -53,47 +53,85 @@ You may run the following commands to get these example inputs.
└── test
```

## Building inputs for specific use-cases (Advanced)

### build for batched workflows
### Build for batched workflows

```shell
python scripts/inputs/build_inputs.py inputs/values inputs/templates/test/GATKSVPipelineSingleSample inputs/build/NA19240/test -a '{ "test_batch" : "ref_panel_1kg", "cloud_env": "google_cloud.my_project" }'
python scripts/inputs/build_inputs.py \
inputs/values \
inputs/templates/test/GATKSVPipelineSingleSample \
inputs/build/NA19240/test \
-a '{ "test_batch" : "ref_panel_1kg", "cloud_env": "google_cloud.my_project" }'
```


### Generating a reference panel (Advanced, single-sample mode only)
New reference panels can be generated easily from a single run of the `GATKSVPipelineBatch` workflow. If using a Cromwell server, we recommend copying the outputs to a permanent location by adding the following option to the workflow configuration file:
```
"final_workflow_outputs_dir" : "gs://my-outputs-bucket",
"use_relative_output_paths": false,
### Generating a reference panel

This section only applies to the single-sample mode.
New reference panels can be generated from a single run of the
`GATKSVPipelineBatch` workflow.
If using a Cromwell server, we recommend copying the outputs to a
permanent location by adding the following options to the workflow configuration file:

```json
"final_workflow_outputs_dir" : "gs://my-outputs-bucket",
"use_relative_output_paths": false,
```
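
As a minimal sketch, the two options above could sit in a standalone JSON file such as the one written below. The file name `workflow_options.json` is a placeholder chosen for illustration, and the bucket is the same example value used above; `python3 -m json.tool` is used only to confirm the file parses.

```shell
# Hypothetical file name; use whatever options file your Cromwell setup expects.
cat > workflow_options.json <<'EOF'
{
  "final_workflow_outputs_dir": "gs://my-outputs-bucket",
  "use_relative_output_paths": false
}
EOF

# json.tool exits with a non-zero status if the file is not valid JSON.
python3 -m json.tool workflow_options.json > /dev/null && echo "workflow_options.json parses"
```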

Here is an example of how to generate workflow input jsons from `GATKSVPipelineBatch` workflow metadata:
```
> cromshell -t60 metadata 38c65ca4-2a07-4805-86b6-214696075fef > metadata.json
> python scripts/inputs/create_test_batch.py \
--execution-bucket gs://my-exec-bucket \
--final-workflow-outputs-dir gs://my-outputs-bucket \
metadata.json \
> inputs/values/my_ref_panel.json
> # Define your google project id (for Cromwell inputs) and Terra billing project (for workspace inputs)
> echo '{ "google_project_id": "my-google-project-id", "terra_billing_project_id": "my-terra-billing-project" }' > inputs/values/google_cloud.my_project.json
> # Build test files for batched workflows (google cloud project id required)
> python scripts/inputs/build_inputs.py \
inputs/values \
inputs/templates/test \
inputs/build/my_ref_panel/test \
-a '{ "test_batch" : "ref_panel_1kg", "cloud_env": "google_cloud.my_project" }'
> # Build test files for the single-sample workflow
> python scripts/inputs/build_inputs.py \
inputs/values \
inputs/templates/test/GATKSVPipelineSingleSample \
inputs/build/NA19240/test_my_ref_panel \
-a '{ "single_sample" : "test_single_sample_NA19240", "ref_panel" : "my_ref_panel" }'
> # Build files for a Terra workspace
> python scripts/inputs/build_inputs.py \
inputs/values \
inputs/templates/terra_workspaces/single_sample \
inputs/build/NA12878/terra_my_ref_panel \
-a '{ "single_sample" : "test_single_sample_NA12878", "ref_panel" : "my_ref_panel" }'
```
Note that the inputs to `GATKSVPipelineBatch` may be used as resources for the reference panel and therefore should also be in a permanent location.

1. Get the workflow metadata with cromshell.

```shell
cromshell -t60 metadata 38c65ca4-2a07-4805-86b6-214696075fef > metadata.json
```

2. Run the script to generate the reference panel values from the metadata.

```shell
python scripts/inputs/create_test_batch.py \
--execution-bucket gs://my-exec-bucket \
--final-workflow-outputs-dir gs://my-outputs-bucket \
metadata.json \
> inputs/values/my_ref_panel.json
```

3. Define your Google project ID (for Cromwell inputs) and Terra billing project (for workspace inputs).

```shell
echo '{ "google_project_id": "my-google-project-id", "terra_billing_project_id": "my-terra-billing-project" }' > inputs/values/google_cloud.my_project.json
```

4. Build test files for batched workflows (Google Cloud project ID required).

```shell
python scripts/inputs/build_inputs.py \
inputs/values \
inputs/templates/test \
inputs/build/my_ref_panel/test \
-a '{ "test_batch" : "ref_panel_1kg", "cloud_env": "google_cloud.my_project" }'
```

5. Build test files for the single-sample workflow.

```shell
python scripts/inputs/build_inputs.py \
inputs/values \
inputs/templates/test/GATKSVPipelineSingleSample \
inputs/build/NA19240/test_my_ref_panel \
-a '{ "single_sample" : "test_single_sample_NA19240", "ref_panel" : "my_ref_panel" }'
```

6. Build files for a Terra workspace.

```shell
python scripts/inputs/build_inputs.py \
inputs/values \
inputs/templates/terra_workspaces/single_sample \
inputs/build/NA12878/terra_my_ref_panel \
-a '{ "single_sample" : "test_single_sample_NA12878", "ref_panel" : "my_ref_panel" }'
```

Note that the inputs to `GATKSVPipelineBatch` may be used as resources
for the reference panel and therefore should also be in a permanent location.
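
Because the steps above pass JSON both as a file (step 3) and as an inline string to `-a`, a quoting mistake can be hard to spot. The following is a minimal pre-flight sketch, not part of the GATK-SV scripts, that writes the project file with a here-document and checks that both pieces of JSON parse. The variable names are made up for illustration, and `python3` is assumed to be on the `PATH`.

```shell
set -euo pipefail

# Same path and placeholder values as in step 3.
values_file="inputs/values/google_cloud.my_project.json"
mkdir -p "$(dirname "$values_file")"
cat > "$values_file" <<'EOF'
{
  "google_project_id": "my-google-project-id",
  "terra_billing_project_id": "my-terra-billing-project"
}
EOF

# json.tool exits with a non-zero status on malformed JSON.
python3 -m json.tool "$values_file" > /dev/null
echo "valid JSON: $values_file"

# The string passed to -a in step 4, checked the same way via stdin.
extra_arg='{ "test_batch" : "ref_panel_1kg", "cloud_env": "google_cloud.my_project" }'
printf '%s' "$extra_arg" | python3 -m json.tool > /dev/null
echo "valid JSON: -a argument"
```

This only confirms that the JSON is well formed; it does not check that the keys or values are ones `build_inputs.py` accepts.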
