Add a few additional failures to our notes doc #8980

RoriCremer · 2024-09-15T19:55:46Z

No description provided.

rsasch

I've left some specific changes inline. I also think it would be useful to have a consistent way of attaching each failure to a specific workflow, sub-workflow, and task, for easier use.
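One way to make that attachment consistent would be to record each failure as a small structured entry and render it in a fixed layout. The sketch below is only an illustration: the `FailureEntry` class, its field names, and the rendered layout are hypothetical and not an existing GVS convention. The workflow and task names are taken from the call path quoted later in this review.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FailureEntry:
    """One troubleshooting entry, keyed by where the failure surfaced (hypothetical schema)."""
    workflow: str
    task: str
    message: str
    remedy: str
    sub_workflow: Optional[str] = None

    def location(self) -> str:
        # Join only the parts that are present, so entries without a sub-workflow still render cleanly.
        parts = [self.workflow, self.sub_workflow, self.task]
        return " > ".join(p for p in parts if p)

    def to_markdown(self) -> str:
        # The error text is wrapped in backticks so it can be matched against the log.
        return f"1. {self.location()}: `{self.message}`\n   1. {self.remedy}"


entry = FailureEntry(
    workflow="GvsBulkIngestGenomes",
    sub_workflow="GvsImportGenomes",
    task="GetUningestedSampleIds",
    message="Required file output '/cromwell_root/gvs_ids.csv' does not exist.",
    remedy="Delete the dataset and create a new one.",
)
print(entry.to_markdown())
```

Keeping the location separate from the message means every entry can lead with the same workflow, sub-workflow, task ordering, whatever the error text looks like.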

rsasch · 2024-10-03T16:58:22Z

scripts/variantstore/beta_docs/gvs-troubleshooting.md

+1. GVS is running very slowly!
+   1. If your GVS workflow is running very slowly compared to the example runtimes in the workspace, you may have run GVS on GVCFs that have not been reblocked. Confirm your GVCFs are reblocked.
 1. My workflow failed during ingestion, can I restart it?
   1. If it fails during ingestion, yes, the GvsBeta workflow is restartable and will pick up where it left off.


These two cases seem different from the rest, since they don't involve a particular error. Maybe put them under something like "Ingestion-Specific Issues".

rsasch · 2024-10-03T16:59:00Z

scripts/variantstore/beta_docs/gvs-troubleshooting.md

-   1. The GVS requires that sample names are unique because the sample names are used to name the samples in the VCF, and VCF format requires unique sample names. 
-   2. After deleting or renaming the duplicate sample, you can restart the workflow without any clean up.
-3. `BulkIngestGenomes/GvsBulkIngestGenomes/hash/call-ImportGenomes/GvsImportGenomes/hash/call-GetUningestedSampleIds/gvs_ids.csv Required file output '/cromwell_root/gvs_ids.csv' does not exist.`
+1. Duplicate sample names error: ERROR: The input file ~{sample_names_file} contains the following duplicate entries:


I would bring back the formatting around the actual error message, as it's easier to scan the document to find the text that was in the log.

Suggested change

-1. Duplicate sample names error: ERROR: The input file ~{sample_names_file} contains the following duplicate entries:
+1. `Duplicate sample names error: ERROR: The input file ~{sample_names_file} contains the following duplicate entries:`
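To illustrate why the backticks help with scanning: once the literal error text sits in an inline code span, it can be pulled out of the doc and compared against a log mechanically. The following is a minimal sketch under that assumption. The helper names `known_errors` and `match_log` are hypothetical, and the log excerpt is made up for the example.

```python
import re
from typing import List

# Inline code spans in the troubleshooting doc are treated as literal log text.
CODE_SPAN = re.compile(r"`([^`]+)`")


def known_errors(markdown: str) -> List[str]:
    """Return every backtick-quoted string in the doc, in order of appearance."""
    return CODE_SPAN.findall(markdown)


def match_log(markdown: str, log_text: str) -> List[str]:
    """Return the documented error strings that occur verbatim in the log."""
    return [msg for msg in known_errors(markdown) if msg in log_text]


doc = """
1. AssignIds failure with error message: `BigQuery error in mk operation: Not found: Dataset`
1. Ingest failure with error message: `raise ValueError("vcf column not in table")`
"""
# Made-up log excerpt, for illustration only.
log = 'Traceback ...\n    raise ValueError("vcf column not in table")\n'
print(match_log(doc, log))
```

This only works when the code span holds exactly the text that appears in the log, which is an argument for keeping labels such as "AssignIds failure with error message:" outside the backticks.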

rsasch · 2024-10-03T16:59:46Z

scripts/variantstore/beta_docs/gvs-troubleshooting.md

+1. Duplicate sample names error: ERROR: The input file ~{sample_names_file} contains the following duplicate entries:
+   1. The GVS requires that sample names are unique because the sample names are used to name the samples in the VCF, and VCF format requires unique sample names.
+   1. After deleting or renaming the duplicate sample, you can restart the workflow without any clean up.
+1. BulkIngestGenomes/GvsBulkIngestGenomes/hash/call-ImportGenomes/GvsImportGenomes/hash/call-GetUningestedSampleIds/gvs_ids.csv Required file output '/cromwell_root/gvs_ids.csv' does not exist.


Similar to above.

Suggested change

-1. BulkIngestGenomes/GvsBulkIngestGenomes/hash/call-ImportGenomes/GvsImportGenomes/hash/call-GetUningestedSampleIds/gvs_ids.csv Required file output '/cromwell_root/gvs_ids.csv' does not exist.
+1. During Ingest: `Required file output '/cromwell_root/gvs_ids.csv' does not exist.`
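For the duplicate sample names entry in this hunk, a user could check the sample list before restarting. The sketch below is not part of GVS: `find_duplicates` is a hypothetical helper, the sample names are invented, and trimming whitespace before comparing is an assumption about how names should be matched.

```python
from collections import Counter
from typing import Iterable, List


def find_duplicates(sample_names: Iterable[str]) -> List[str]:
    """Return names that occur more than once, sorted, ignoring blank lines."""
    counts = Counter(name.strip() for name in sample_names if name.strip())
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical sample names; in practice these would be the lines of the sample names file.
names = ["NA12878", "NA12891", "NA12878", "NA12892", ""]
print(find_duplicates(names))
```

An empty result means there is nothing to delete or rename before the workflow is restarted.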

rsasch · 2024-10-03T17:00:03Z

scripts/variantstore/beta_docs/gvs-troubleshooting.md

   1. If you've attempted to run GVS more than once in the same BigQuery dataset, you may see this error. Please delete the dataset and create a new one. We recommend naming the new dataset something different than the one you deleted.
-4. AssignIds failure with error message: `BigQuery error in mk operation: Not found: Dataset`
-   1. This is saying that GVS was unable to find the BigQuery dataset specified in the inputs. If you haven't created a BigQuery dataset prior to running the workflow, you can follow the steps in [the quickstart](./gvs-quickstart.md). If you created it and still see this error, check the naming of the dataset matches your input specified and that the google project in the inputs is correct. Lastly, confirm you have set up the correct permissions for your Terra proxy account following the instructions in the quickstart. 
+1. AssignIds failure with error message: BigQuery error in mk operation: Not found: Dataset


Suggested change

-1. AssignIds failure with error message: BigQuery error in mk operation: Not found: Dataset
+1. AssignIds failure with error message: `BigQuery error in mk operation: Not found: Dataset`

rsasch · 2024-10-03T17:00:36Z

scripts/variantstore/beta_docs/gvs-troubleshooting.md

-   1. This is saying that GVS was unable to find the BigQuery dataset specified in the inputs. If you haven't created a BigQuery dataset prior to running the workflow, you can follow the steps in [the quickstart](./gvs-quickstart.md). If you created it and still see this error, check the naming of the dataset matches your input specified and that the google project in the inputs is correct. Lastly, confirm you have set up the correct permissions for your Terra proxy account following the instructions in the quickstart. 
+1. AssignIds failure with error message: BigQuery error in mk operation: Not found: Dataset
+   1. This is saying that GVS was unable to find the BigQuery dataset specified in the inputs. If you haven't created a BigQuery dataset prior to running the workflow, you can follow the steps in the quickstart. If you created it and still see this error, check the naming of the dataset matches your input specified and that the google project in the inputs is correct. Lastly, confirm you have set up the correct permissions for your Terra proxy account following the instructions in the quickstart.
+1. Ingest failure with error message: raise ValueError("vcf column not in table")


Suggested change

-1. Ingest failure with error message: raise ValueError("vcf column not in table")
+1. Ingest failure with error message: `raise ValueError("vcf column not in table")`

rsasch · 2024-10-03T17:00:48Z

scripts/variantstore/beta_docs/gvs-troubleshooting.md

+   1. (e.g. alternate_bases.AS_RAW_MQ, RAW_MQandDP or RAW_MQ)
+   1. This means that there is at least one incorrectly formatted sample in your data model. Confirm your GVCFs are reblocked. If the incorrectly formatted samples are a small portion of your callset and you wish to just ignore them, simply delete them from the data model and restart the workflow without them. There should be no issue with starting from here as none of these samples were loaded.
+1. Extract failure with OSError: Is a directory. If you point your extract to a directory that doesn’t already exist, it will not be happy about this. Simply make the directory and run the workflow again.
+1. Ingest failure with: Lock table error


Suggested change

-1. Ingest failure with: Lock table error
+1. Ingest failure with: `Lock table error`

This seems incomplete; what is the user to do if they run into this error?

rsasch · 2024-10-03T17:01:09Z

scripts/variantstore/beta_docs/gvs-troubleshooting.md

this seems incomplete

update readme with failure notes

1e29403

rsasch self-requested a review September 26, 2024 13:58

rsasch requested changes Oct 3, 2024
