Strange output by Funcotator #8965

Open
gevro opened this issue Aug 27, 2024 · 2 comments

gevro commented Aug 27, 2024

Hi,
With GATK 4.6.0 and Funcotator data sources v1.8, with output in VCF format, I'm seeing some annotations that contain strange character combinations:
"%7C"
"%20"

For example, for the variant chr11:54942730 C>T (hg38), the gnomAD_genome_AF annotation is:
8.55286e-05_%7C_3.46021e-04

But this should simply be one number.

This looks like a bug in how Funcotator parses the gnomAD data it retrieves from the Google Cloud bucket.
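
For anyone checking their own output: `%7C` and `%20` are the percent-encodings of `|` and a space. Below is a minimal Python sketch, assuming the value is simply percent-encoded with `_` padding around the separator as in the example above. The helper name `split_encoded_values` is hypothetical and not part of GATK. It only shows that the field holds two numbers rather than one.

```python
from urllib.parse import unquote


def split_encoded_values(raw: str) -> list[float]:
    """Decode a percent-encoded annotation value and split it on '|'.

    Assumption: multiple values are joined by an encoded pipe padded with
    underscores, as in the example from this report.
    """
    decoded = unquote(raw)  # "%7C" -> "|", "%20" -> " "
    parts = [part.strip("_ ") for part in decoded.split("|")]
    return [float(part) for part in parts if part]


values = split_encoded_values("8.55286e-05_%7C_3.46021e-04")
print(len(values), values)  # two allele frequencies, not one
```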


gevro commented Aug 27, 2024

As a workaround, how do I fully localize the gnomAD data sources?


gevro commented Aug 27, 2024

Update: I localized gnomAD and got the same result, so the problem is not in the retrieval from the cloud bucket. I then looked at the gnomAD annotation files and found the issue: there is a bug in how the gnomad_genome annotation files were prepared. Some variants appear twice (same position, REF, and ALT), which causes Funcotator to output two allele frequency annotations for each of those variants:

Example:
chr11 54942730 . C T 979.63 PASS AF=8.55286e-05;AF_afr=0;AF_afr_female=0;AF_afr_male=0;AF_amr=0;AF_amr_female=0;AF_amr_male=0;AF_asj=0;AF_asj_female=0;AF_asj_male=0;AF_eas=0.00149477;AF_eas_female=0;AF_eas_male=0.00225734;AF_female=0;AF_fin=0;AF_fin_female=0;AF_fin_male=0;AF_male=0.000154131;AF_nfe=0;AF_nfe_est=0;AF_nfe_female=0;AF_nfe_male=0;AF_nfe_nwe=0;AF_nfe_onf=0;AF_nfe_seu=0;AF_oth=0;AF_oth_female=0;AF_oth_male=0;AF_popmax=0.00149477;AF_raw=0.000164376;OriginalContig=11;OriginalStart=51175001;ReverseComplementedAlleles
chr11 54942730 rs1267687142 C T 1483.06 PASS AF=0.000346021;AF_afr=0;AF_afr_female=0;AF_afr_male=0;AF_amr=0;AF_amr_female=0;AF_amr_male=0;AF_asj=0;AF_asj_female=0;AF_asj_male=0;AF_eas=0.00980392;AF_eas_female=0.00694444;AF_eas_male=0.0113636;AF_female=0.000160256;AF_fin=0;AF_fin_female=0;AF_fin_male=0;AF_male=0.000487211;AF_nfe=0;AF_nfe_est=0;AF_nfe_female=0;AF_nfe_male=0;AF_nfe_nwe=0;AF_nfe_onf=0;AF_nfe_seu=0;AF_oth=0.00221239;AF_oth_female=0;AF_oth_male=0.00438596;AF_popmax=0.00980392;AF_raw=0.000561325;OriginalContig=11;OriginalStart=54710206

This bug would affect every pipeline that uses Funcotator's gnomAD genome annotations for filtering.
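
To check how widespread the duplication is, here is a minimal Python sketch, assuming a tab-separated VCF that is either plain text or gzipped. The names `find_duplicate_sites` and `open_vcf`, and the file path in the usage comment, are hypothetical. It groups records by (CHROM, POS, REF, ALT) and returns every key seen more than once, together with the `AF` value of each copy.

```python
import gzip
from collections import defaultdict


def find_duplicate_sites(lines):
    """Return {(chrom, pos, ref, alt): [AF, ...]} for keys seen more than once."""
    seen = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt = fields[:5]
        info = fields[7] if len(fields) > 7 else ""
        af = None
        for entry in info.split(";"):
            if entry.startswith("AF="):  # exact AF key, not AF_afr etc.
                af = float(entry[3:])
                break
        seen[(chrom, int(pos), ref, alt)].append(af)
    return {key: afs for key, afs in seen.items() if len(afs) > 1}


def open_vcf(path):
    """Open a plain or gzipped VCF in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


# Usage (hypothetical path):
# with open_vcf("gnomad_genome.vcf.gz") as handle:
#     for site, afs in find_duplicate_sites(handle).items():
#         print(site, afs)
```

Holding every site in memory is fine for a single chromosome but may be too heavy for the whole genome file. Since the records are position-sorted, comparing each record only with the previous one would be a lighter alternative.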
