Releases: lh3/miniprot
Miniprot-0.13 (r248)
Notable changes:
- New feature: added option -T to specify a non-standard NCBI translation
  table (#56 and #57). As this is an indexing option, the binary index format
  had to change accordingly. Miniprot will reject indices built with
  previous versions.
- Improvement: properly handle reference deletions involving in-frame stop
  codons (#58). Older versions did not penalize these stop codons. This
  change also improves junction accuracy, especially for distant homologs.
- Bugfix: in the GFF3 output, CDS now includes stop codons (#55). Note that in
  GTF, CDS excludes stop codons.
- Bugfix: suppress an extra amino acid in the --trans or --aln output (#47).
  In rare cases, the bug could cause a memory violation.
(0.13: 6 March 2024, r248)
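To illustrate what a non-standard translation table changes, here is a minimal Python sketch. It is not miniprot's code. It builds codon tables from the compact NCBI amino-acid strings for table 1 (standard) and table 4, in which TGA encodes tryptophan instead of a stop. The helper names `build_table` and `translate` are hypothetical.

```python
# Illustrative only: shows how an NCBI translation table alters translation.
# build_table() and translate() are hypothetical helpers, not miniprot APIs.

BASES = "TCAG"  # base order used by the compact NCBI table strings

NCBI_AA = {
    # Table 1: the standard genetic code
    1: "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
    # Table 4: identical except that TGA encodes W (Trp) instead of a stop
    4: "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
}


def build_table(table_id):
    """Map each of the 64 codons to its amino acid for the given table."""
    codons = (a + b + c for a in BASES for b in BASES for c in BASES)
    return dict(zip(codons, NCBI_AA[table_id]))


def translate(dna, table_id=1):
    """Translate complete codons in frame 0; unknown codons become 'X'."""
    table = build_table(table_id)
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(table.get(dna[i:i + 3], "X") for i in range(0, usable, 3))
```

With the same input, `translate("ATGTGATAA", 1)` treats TGA as a stop while `translate("ATGTGATAA", 4)` reads it as tryptophan. Because a change of this kind alters which protein sequences the reference encodes, the choice of table affects the index.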
Miniprot-0.12 (r237)
Notable changes:
- New feature: added option --no-cs to disable the cs tag. This tag is not as
  useful as the cs tag for nucleotide alignment because it does not encode the
  matching amino acids.
- New feature: output the number of frameshifts and in-frame stop codons in
  the PAF output. Parsing in-frame stop codons from the alignment is otherwise
  non-trivial.
- Bugfix: fixed malformed protein sequences when --gtf and --trans are both
  specified (#45).
(0.12: 24 June 2023, r237)
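As a rough sketch of how the new counts could be read back, the Python snippet below parses the optional `TAG:TYPE:VALUE` fields that follow the 12 mandatory PAF columns. The function name `parse_paf_tags` is hypothetical, the example record is made up, and the tag names `fs` and `st` are assumptions that should be checked against the miniprot manual.

```python
# Illustrative only: generic parser for optional PAF tags.
# The tag names "fs" and "st" below are assumed, not confirmed by these notes.

def parse_paf_tags(line):
    """Return the optional tags of one PAF line as a dict."""
    fields = line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[12:]:  # the first 12 columns are mandatory
        name, typ, value = field.split(":", 2)
        if typ == "i":
            value = int(value)
        elif typ == "f":
            value = float(value)
        tags[name] = value
    return tags


# A made-up record: 12 mandatory columns followed by two integer tags.
example = "\t".join([
    "prot1", "300", "0", "300", "+", "chr1", "100000", "5000", "7000",
    "280", "900", "0", "fs:i:1", "st:i:0",
])
tags = parse_paf_tags(example)
```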
Miniprot-0.11 (r234)
Notable changes:
- New feature: added option --trans to output translated protein sequences. It
  is possible to extract these sequences from the --aln output but the --trans
  output is smaller and more convenient.
- Bugfix: fixed infinite error messages when an invalid option is used.
- Improvement: better error messages for nonexistent query files (#40).
(0.11: 18 April 2023, r234)
Miniprot-0.10 (r225)
Miniprot-0.9 (r223)
Notable change:
- Bugfix: not all query proteins were output with option -u.
(0.9: 9 March 2023, r223)
Miniprot-0.8 (r220)
Notable changes:
- Improvement: slightly improved the sensitivity to distant homologs at a minor
  cost of specificity. On the human-zebrafish dataset, we gained 1.2% junction
  sensitivity at the cost of 0.2% specificity.
- New feature: added option --aln to output residue alignment.
- New feature: added option -I to automatically set the maximum intron size to
  sqrt(GenomeSize) * 3.6, where GenomeSize is the total length of the
  nucleotide sequences. For a small genome, a small threshold leads to
  higher accuracy. This option is not the default because the reference is not
  always a whole genome.
(0.8: 6 March 2023, r220)
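The -I rule above can be worked through with a short Python sketch. The function name `auto_max_intron` is hypothetical, and truncating to an integer is an assumption; miniprot may round or clamp the value differently.

```python
# Illustrative only: the intron-size rule stated for option -I.
import math


def auto_max_intron(genome_size):
    """Return sqrt(genome_size) * 3.6, truncated to an integer (assumed)."""
    return int(math.sqrt(genome_size) * 3.6)


small = auto_max_intron(100_000_000)  # a 100 Mbp genome
```

For a 100 Mbp genome this gives a maximum intron size of 36,000 bp. The threshold grows only with the square root of the input length, so passing a single chromosome instead of a whole genome yields a smaller limit.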
Miniprot-0.7 (r207)
Notable changes:
- Improvement: replaced open syncmers with modimers. This simplifies the code
  and slightly reduces memory at a comparable k-mer sampling rate. This
  changes the index format.
- Improvement: fine-tuned parameters for higher sensitivity at a minor cost of
  junction accuracy: a) only index ORFs >= 30bp; b) reduced max k-mer
  occurrences from 50k to 20k; c) sample k-mers at a rate of 50%; d) reduced
  min number of k-mers from 5 to 3; e) added a bonus chaining score for anchors
  on the same reference block.
- Improvement: adjust the max k-mer occurrence dynamically per protein.
- Improvement: implemented 2-level chaining like minimap2 and minigraph. This
  reduces chaining time.
- Bugfix: fixed a rare off-by-1 memory violation.
- Bugfix: fixed a memory leak.
Overall, miniprot becomes faster at slightly higher peak memory usage. It is
more sensitive to distant homologs, though the junction accuracy of the
additional alignments is usually lower. Importantly, the index format has
changed: miniprot will throw an error if given pre-built indices generated
with older versions.
(0.7: 25 December 2022, r207)
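For readers unfamiliar with the sampling scheme, here is a minimal Python sketch of mod-based k-mer sampling. It assumes that a modimer is a k-mer whose hash is divisible by a chosen modulus, so a modulus of 2 keeps roughly half of all k-mers. The function name `sample_kmers` is hypothetical and `zlib.crc32` is only a stand-in hash; miniprot's actual hash function and alphabet handling are not described in these notes.

```python
# Illustrative only: mod-based k-mer sampling with a stand-in hash.
import zlib


def sample_kmers(seq, k, mod=2):
    """Keep (position, k-mer) pairs whose hash is divisible by `mod`."""
    picked = []
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if zlib.crc32(kmer.encode()) % mod == 0:
            picked.append((i, kmer))
    return picked


protein = "MKTAYIAKQRQISFVKSHFSRQ"
half = sample_kmers(protein, 6, mod=2)      # roughly 50% sampling
everything = sample_kmers(protein, 6, mod=1)  # modulus 1 keeps every k-mer
```

Whether a k-mer is kept depends only on its own hash, not on its neighbours, so the same k-mer is selected or skipped consistently in both the index and the query.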
Miniprot-0.6 (r185)
Notable changes:
- Improvement: for each protein, only output alignments close to the best
  alignment. Also added option --outs to tune the threshold.
- New feature: output GTF with option --gtf.
(0.6: 12 December 2022, r185)
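The filtering idea can be sketched in Python as follows. This assumes the threshold is a score ratio relative to the best alignment of each protein; the notes above do not state how --outs is defined, so the function `filter_near_best`, the `ratio` parameter, and its default value are all hypothetical.

```python
# Illustrative only: keep alignments whose score is close to the best one
# for the same protein. The ratio-based rule is an assumption.
from collections import defaultdict


def filter_near_best(alignments, ratio=0.99):
    """alignments: list of (protein_name, score) pairs."""
    best = defaultdict(int)
    for protein, score in alignments:
        best[protein] = max(best[protein], score)
    return [(p, s) for p, s in alignments if s >= ratio * best[p]]


alns = [("P1", 1000), ("P1", 995), ("P1", 700), ("P2", 300)]
kept = filter_near_best(alns)
```

Here the 700-scoring alignment of P1 is dropped because it falls well below P1's best score, while P2's only alignment is always kept.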
Miniprot-0.5 (r179)
Notable changes:
- Improvement: more detailed splice model considering G|GTR..YYYNYAG|. This is
  not enabled by default. Added option -j to change the splice model.
- Added the miniprot preprint, available at http://arxiv.org/abs/2210.08052
(0.5: 17 October 2022, r179)
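To make the consensus G|GTR..YYYNYAG| concrete, the Python sketch below tests whether an intron and the exon base before it match the pattern, reading `|` as the exon-intron boundaries and using the IUPAC codes R = A/G, Y = C/T, N = any base. This is a yes/no illustration under those assumptions; miniprot scores splice signals rather than requiring an exact match, and the function name `matches_extended_signal` is hypothetical.

```python
# Illustrative only: binary check against the G|GTR..YYYNYAG| consensus.
import re

DONOR = re.compile(r"^GT[AG]")                   # intron starts with GTR
ACCEPTOR = re.compile(r"[CT]{3}[ACGT][CT]AG$")   # intron ends with YYYNYAG


def matches_extended_signal(last_exon_base, intron):
    """True if the exon ends in G and the intron matches both motifs."""
    intron = intron.upper()
    return (
        last_exon_base.upper() == "G"
        and DONOR.search(intron) is not None
        and ACCEPTOR.search(intron) is not None
    )


intron = "GTAAGT" + "A" * 20 + "TTTCTAG"
ok = matches_extended_signal("G", intron)
```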
Miniprot-0.4 (r165)
This version implements a better splice model and puts a little more effort
into aligning terminal exons. It improves both sensitivity and specificity by
a few percent.
Other notable changes:
- Breaking change: changed -C to scale the splice model
- Bugfix: implemented option -w (#12)
- Bugfix: reduced the indexing time for highly fragmented genomes (#10)
- New feature: output a Rank attribute in GFF
(0.4: 5 October 2022, r165)