BAM store needs support for .csi indexes #926

Closed
keiranmraine opened this issue Sep 12, 2017 · 10 comments

@keiranmraine
Contributor

I'm pretty sure that it doesn't, but it's worth being aware that *.csi indexes will eventually replace *.bai, even for *.bam files.

samtools/hts-specs#240 (comment)

I'm not aware of any progress on migrating to htslib-based parsing of BAM/CRAM.

@keiranmraine keiranmraine changed the title Does the BAM adaptor support csi index files Does the BAM adaptor support csi index files? Sep 12, 2017
@sagnikbanerjee15

Hello,

I am working with barley, which has very large chromosomes. Could you please suggest a way to visualize the alignments in JBrowse while working around the index issue?

Thank you.

@rbuels rbuels added the help wanted actively soliciting new contributors to take this on! label Jan 25, 2018
@rbuels rbuels changed the title Does the BAM adaptor support csi index files? BAM store needs support for .csi indexes Jan 25, 2018
@rbuels rbuels changed the title BAM store needs support for .csi indexes BAM store needs support for .csi indexes Jan 25, 2018
@rbuels rbuels added the high priority related to a high-level project goal label Jan 25, 2018
@nathandunn
Contributor

FYI: https://samtools.github.io/hts-specs/SAMv1.pdf

5.3 C source code for computing bin number and overlapping bins
The following functions compute bin numbers and overlaps for a BAI-style binning scheme with 6 levels and
a minimum bin size of 2^14 bp. See the CSI specification for generalisations of these functions designed for
binning schemes with arbitrary depth and sizes.
/* calculate bin given an alignment covering [beg,end) (zero-based, half-closed-half-open) */
int reg2bin(int beg, int end)
{
    --end;
    if (beg>>14 == end>>14) return ((1<<15)-1)/7 + (beg>>14);
    if (beg>>17 == end>>17) return ((1<<12)-1)/7 + (beg>>17);
    if (beg>>20 == end>>20) return ((1<<9)-1)/7 + (beg>>20);
    if (beg>>23 == end>>23) return ((1<<6)-1)/7 + (beg>>23);
    if (beg>>26 == end>>26) return ((1<<3)-1)/7 + (beg>>26);
    return 0;
}

/* calculate the list of bins that may overlap with region [beg,end) (zero-based) */
#define MAX_BIN (((1<<18)-1)/7)
int reg2bins(int beg, int end, uint16_t list[MAX_BIN])
{
    int i = 0, k;
    --end;
    list[i++] = 0;
    for (k = 1 + (beg>>26); k <= 1 + (end>>26); ++k) list[i++] = k;
    for (k = 9 + (beg>>23); k <= 9 + (end>>23); ++k) list[i++] = k;
    for (k = 73 + (beg>>20); k <= 73 + (end>>20); ++k) list[i++] = k;
    for (k = 585 + (beg>>17); k <= 585 + (end>>17); ++k) list[i++] = k;
    for (k = 4681 + (beg>>14); k <= 4681 + (end>>14); ++k) list[i++] = k;
    return i;
}
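
For comparison, the generalisation that the excerpt points to can be sketched roughly as follows. This is a minimal sketch modelled on the functions in the CSI specification, not code from JBrowse; the names csi_reg2bin and csi_reg2bins are chosen here for illustration, and min_shift and depth are the two parameters a CSI index stores in its header.

```c
#include <stdint.h>

/* Bin number for an alignment covering [beg,end) (zero-based, half-open)
   in a scheme whose smallest bins span 2^min_shift bases and which has
   `depth` levels below the root bin 0. */
int csi_reg2bin(int64_t beg, int64_t end, int min_shift, int depth)
{
    /* t starts at the offset of the first bin on the deepest level */
    int l, s = min_shift, t = ((1 << depth * 3) - 1) / 7;
    for (--end, l = depth; l > 0; --l, s += 3, t -= 1 << l * 3)
        if (beg >> s == end >> s) return t + (int)(beg >> s);
    return 0;
}

/* List of bins that may overlap region [beg,end). The caller must provide
   room in `bins`; ((1 << (depth + 1) * 3) - 1) / 7 entries is the total
   number of bins in the scheme and therefore always enough. */
int csi_reg2bins(int64_t beg, int64_t end, int min_shift, int depth, int *bins)
{
    int l, t, n, s = min_shift + depth * 3;
    for (--end, l = n = t = 0; l <= depth; s -= 3, t += 1 << l * 3, ++l) {
        int b = t + (int)(beg >> s), e = t + (int)(end >> s), i;
        for (i = b; i <= e; ++i) bins[n++] = i;
    }
    return n;
}
```

With min_shift = 14 and depth = 5 these reduce to the BAI functions above: csi_reg2bin(0, 1, 14, 5) returns 4681, the same bin as reg2bin(0, 1).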

@keiranmraine
Contributor Author

FYI, csi also applies to files that have traditionally used tabix indexing *.tbi:

$ tabix -h
...
Indexing Options:
   ...
   -C, --csi                  generate CSI index for VCF (default is TBI)

@FredericBGA
Contributor

Hello

Large VCF files also need to be indexed with a CSI index, so JBrowse cannot handle them right now.

@rbuels rbuels added this to the 1.15.0 milestone Apr 17, 2018
@cmdcolin
Contributor

Began some basic CSI parsing (for VCF currently) here: https://github.com/GMOD/jbrowse/tree/csi_index

@cmdcolin
Contributor

cmdcolin commented Jun 26, 2018

Woo! Tested, and it displays data at very large coordinates that a tabix TBI index can't handle (when a chromosome is over a gigabase in length).

[Screenshot: screenshot-localhost-2018 06 25-18-48-54]
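
As background for why CSI helps here: the BAI/TBI binning scheme quoted earlier in the thread (6 levels, minimum bin size 2^14 bp) can only address coordinates below 2^29 (536,870,912), whereas a CSI index stores its own min_shift and depth. The sketch below shows that arithmetic under those assumptions; index_max_coord and csi_depth_for_length are hypothetical helper names, not functions from JBrowse or htslib.

```c
#include <stdint.h>

/* Exclusive upper bound on coordinates addressable by a binning scheme
   whose smallest bins span 2^min_shift bases, with `depth` levels below
   the root. Each level up covers 8x (2^3) as many bases per bin. */
int64_t index_max_coord(int min_shift, int depth)
{
    return (int64_t)1 << (min_shift + 3 * depth);
}

/* Smallest depth such that the scheme covers a sequence of max_len bases. */
int csi_depth_for_length(int64_t max_len, int min_shift)
{
    int depth = 0;
    int64_t span = (int64_t)1 << min_shift;
    while (max_len > span) {
        span <<= 3;   /* one more level multiplies the covered span by 8 */
        ++depth;
    }
    return depth;
}
```

For example, index_max_coord(14, 5) is 536870912, the fixed BAI/TBI ceiling, while a one-gigabase chromosome needs csi_depth_for_length(1000000000, 14) = 6 levels.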

@cmdcolin
Contributor

cmdcolin commented Jul 2, 2018

Got CSI working for BAM now also :) woo

@nathanhaigh
Contributor

Oh man...I almost wet myself with excitement! I want to test this out ASAP with wheat! :)

1 happy man at this prospect!

@keiranmraine
Contributor Author

keiranmraine commented Jul 4, 2018

... do I dare say that they are currently discussing/adding *.sbi indexing:

http://github.com/samtools/hts-specs/pull/321

(will help solve the "guessing" about chunks)

@cmdcolin
Contributor

cmdcolin commented Jul 4, 2018

Oh wow haha. Is that an official solution to "bam index index"?

@rbuels rbuels closed this as completed in 34bbb6c Jul 5, 2018