Help with MT-WT epitope match for inframe mutations #1152

Open
KhacDuyNguyen0 opened this issue Sep 24, 2024 · 4 comments · May be fixed by #1155

Comments

@KhacDuyNguyen0

Dear authors,
I appreciate your work in producing such a useful tool.

During my analysis, I came across an inframe insertion variant in the PIK3R1 gene at position 454, where T changes to TQFQEKS.
In more detail:

  • The WildtypeNmer: IEAVGKKLHEYNTQFQEKSREYDRL
  • The Mutant Nmer: IEAVGKKLHEYNTQFQEKSQFQEKSREYDRL

I understand that the matched wildtype epitope should share at least half of its length with the mutant epitope. For the mutant epitope FQEKSQFQE, I believe FQEKSREYD would be an appropriate matched wildtype epitope (the two share their first five residues, FQEKS), but the algorithm reported NA for this case. Similar issues appear with the other mutant epitopes shown in the picture below.
image
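The expectation described above can be sketched in Python. This is only an illustration of the "at least half the length" rule as stated in this comment, not pVACseq's actual implementation: the function names are hypothetical, and it is an assumption that a match is scored by counting consecutive identical residues from either end of the epitope.

```python
def consecutive_matches(a, b):
    """Count identical residues from the left until the first mismatch."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def best_wt_match(mt_epitope, wt_seq):
    """Return the wildtype k-mer that best matches the mutant epitope,
    or None if no k-mer shares at least half of the epitope length.

    Assumption (hypothetical rule): the score is the longer of the
    matching runs counted from the left end and from the right end.
    """
    k = len(mt_epitope)
    min_match = k / 2  # "at least half of its length"
    best, best_score = None, 0
    for i in range(len(wt_seq) - k + 1):
        wt = wt_seq[i:i + k]
        left = consecutive_matches(mt_epitope, wt)
        right = consecutive_matches(mt_epitope[::-1], wt[::-1])
        score = max(left, right)
        if score > best_score:
            best, best_score = wt, score
    return best if best_score >= min_match else None


# PIK3R1 example from this issue
WT_NMER = "IEAVGKKLHEYNTQFQEKSREYDRL"
print(best_wt_match("FQEKSQFQE", WT_NMER))  # FQEKSREYD
```

Under this reading, FQEKSQFQE and FQEKSREYD share a run of five residues from the left, which is more than half of nine, so a match would be expected when the full wildtype Nmer is searched.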

A similar case occurs with a mutation in the NCOR2 gene at positions 1833-1834, where the wildtype and mutant Nmers are as follows:

  • The WildtypeNmer: EHAPIWRPGTEQSSGSSGGGGGSS
  • The Mutant Nmer: EHAPIWRPGTEQSSGSSGSSGGGGGSS

No matched wildtype epitopes are reported for the following mutant epitopes: GSSGSSGGG, SGSSGSSGG, SSGSSGSSG.

I would like to know the reasons and the pairing rules for such cases. Thank you in advance.

Best regards,
Duy

@susannasiebert
Contributor

This is an interesting case. I agree that I would expect these to match as you describe. There might be a bug in our logic. Would you be able to share a VCF file containing just these two variants so that I can debug it further on my end?

@KhacDuyNguyen0
Author

I am sorry for the late response. Here are my VCF files for these two mutations.
inframe_mutations.zip

@susannasiebert
Contributor

A short update: I found the reason for this behavior. When we create the fasta file for making binding predictions, we only include n-1 flanking amino acids so that each n-length substring of the peptide overlaps the mutation position. However, with these particular examples, the insertion is actually a duplication of a longer region and the presumed mutation position T is not where the mutated amino acids start (which is at the end of the duplicated region). So not enough flanking amino acids were included in the fasta file pVACseq creates. You can see this reflected by looking at the .fasta file in the MHC_Class_I subfolder of your run. I'm working on fixing this error by including a longer subsequence for the WT of inframe insertions to account for duplicating insertions.
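To make the explanation above concrete, here is a minimal Python sketch using the PIK3R1 example. It is not the pVACseq code: `flank_window`, `kmers`, and the `extra_right` parameter are hypothetical names, and it assumes the wildtype subsequence is cut as n-1 residues on each side of the reported position T, with the proposed fix modeled as extending the right flank by the insertion length.

```python
def kmers(seq, k):
    """All length-k substrings of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def flank_window(seq, pos, flank, extra_right=0):
    """Subsequence covering `flank` residues on each side of `pos`.

    `extra_right` (hypothetical) lengthens the right flank, which is one
    way to account for insertions that duplicate the following region.
    """
    start = max(0, pos - flank)
    end = min(len(seq), pos + 1 + flank + extra_right)
    return seq[start:end]


WT = "IEAVGKKLHEYNTQFQEKSREYDRL"  # wildtype Nmer from this issue
K = 9                             # epitope length n
POS = WT.index("T")               # presumed mutation position (T -> TQFQEKS)
INSERTED = "QFQEKS"               # inserted residues, a copy of what follows T

narrow = flank_window(WT, POS, K - 1)
wide = flank_window(WT, POS, K - 1, extra_right=len(INSERTED))

print(narrow)                           # GKKLHEYNTQFQEKSRE
print("FQEKSREYD" in kmers(narrow, K))  # False
print("FQEKSREYD" in kmers(wide, K))    # True
```

With only n-1 flanking residues, the window ends before FQEKSREYD is complete, so that wildtype epitope is never a candidate for matching. Lengthening the window by the size of the insertion brings it back into the candidate set.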

@KhacDuyNguyen0
Author

Thank you so much for your support.
