Added class to URLs in the response #322

Merged
merged 9 commits into from
Apr 24, 2019
50 changes: 45 additions & 5 deletions htsget.md
@@ -162,6 +162,28 @@ The server SHOULD reply with an `UnsupportedFormat` error if the requested forma
[^a]
</td></tr>
<tr markdown="block"><td>

`class`
_optional string_
</td><td>

Request different classes of data.
By default, i.e., when `class` is not specified, the response will represent a complete read or variant data stream, encompassing SAM/CRAM/VCF headers, body data records, and EOF marker.

If `class` is specified, its value MUST be one of the following:
<table>
<tr><td>

`header`
</td><td>

Request the SAM/CRAM/VCF headers only.

Review comment (Member):

@jmarshall What do you think of adding "(and EOF marker)" here? Useful clarification, thanks for bringing it up earlier.


The server SHOULD respond with an `InvalidInput` error if any htsget query parameters other than `format` are specified at the same time as `class=header`.
</td></tr>
</table>
</td></tr>
<tr markdown="block"><td>
`referenceName`
_optional_
</td><td>
@@ -280,6 +302,17 @@ _optional object_
</td><td>
For HTTPS URLs, the server may supply a JSON object containing one or more string key-value pairs which the client MUST supply as headers with any request to the URL. For example, if headers is `{"Range": "bytes=0-1023", "Authorization": "Bearer xxxx"}`, then the client must supply the headers `Range: bytes=0-1023` and `Authorization: Bearer xxxx` with the HTTPS request to the URL.
</td></tr>
<tr markdown="block"><td>

`class`
_optional string_
</td><td>

For file formats whose specification describes a header and a body, the class indicates which of the two will be retrieved when querying this URL. The allowed values are `header` and `body`.

Either all or none of the URLs in the response MUST have a class attribute.
If `class` fields are not supplied, no assumptions can be made about which data blocks contain headers, body records, or parts of both.
</td></tr>
</table>

</td></tr>
@@ -300,24 +333,28 @@ An example of a JSON response is:
"format" : "BAM",
"urls" : [
  {
- "url" : "data:application/vnd.ga4gh.bam;base64,QkFNAQ=="
+ "url" : "data:application/vnd.ga4gh.bam;base64,QkFNAQ==",
+ "class" : "header"
  },
  {
- "url" : "https://htsget.blocksrv.example/sample1234/header"
+ "url" : "https://htsget.blocksrv.example/sample1234/header",
+ "class" : "header"
  },
  {
  "url" : "https://htsget.blocksrv.example/sample1234/run1.bam",
  "headers" : {
  "Authorization" : "Bearer xxxx",
  "Range" : "bytes=65536-1003750"
- }
+ },
+ "class" : "body"
  },
  {
  "url" : "https://htsget.blocksrv.example/sample1234/run1.bam",
  "headers" : {
  "Authorization" : "Bearer xxxx",
  "Range" : "bytes=2744831-9375732"
- }
+ },
+ "class" : "body"
  }
]
}
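
The "either all or none" rule for the `class` field in the ticket can be checked on the client side before any block is fetched. The following is only a minimal sketch in Python, not part of the specification: the function name `split_urls_by_class` and the choice to return `None` for a ticket without `class` fields are assumptions made for illustration.

```python
from __future__ import annotations

from typing import Any

# The two values the proposed text allows for a URL's "class" field.
ALLOWED_CLASSES = {"header", "body"}


def split_urls_by_class(ticket: dict[str, Any]) -> dict[str, list[dict[str, Any]]] | None:
    """Group a ticket's URL objects by their "class" field (hypothetical helper).

    Returns None when no URL carries a class, in which case nothing can be
    assumed about which blocks hold headers or body records.
    Raises ValueError when only some URLs carry a class, or when a class
    value is neither "header" nor "body".
    """
    urls = ticket["htsget"]["urls"]
    classes = [entry.get("class") for entry in urls]

    if all(c is None for c in classes):
        return None
    if any(c is None for c in classes):
        raise ValueError("either all or none of the URLs must have a class")

    unknown = set(classes) - ALLOWED_CLASSES
    if unknown:
        raise ValueError(f"unexpected class value(s): {sorted(unknown)}")

    grouped: dict[str, list[dict[str, Any]]] = {"header": [], "body": []}
    for entry in urls:
        grouped[entry["class"]].append(entry)  # order within each class is preserved
    return grouped
```

Applied to the example response above, this would place the first two URL objects under `header` and the two ranged `run1.bam` requests under `body`.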
@@ -335,7 +372,10 @@ An example of a JSON response is:
3. Client fetches the data blocks using the URLs and headers.
4. Client concatenates data blocks to produce local blob.

- While the blocks must be finally concatenated in the given order, the client may fetch them in parallel.
+ While the blocks must be finally concatenated in the given order, the client may fetch them in parallel and/or reuse cached data from URLs that have previously been downloaded.
+
+ When making a series of requests to fetch reads or variants within different regions of the same `<id>` resource, clients may wish to avoid re-fetching the SAM/CRAM/VCF headers each time, especially if they are large.
+ If the ticket contains `class` fields, the client may reuse previously downloaded and parsed headers rather than re-fetching the `header`-class URLs.
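
The header reuse described here could look roughly like the Python sketch below. It is an illustration under several assumptions, not a reference client: `HeaderCachingClient`, `read_block`, and the injected `fetch(url, headers)` callable are hypothetical names, the cache is keyed on the resource id and the ticket's `format`, and `header`-class URLs are assumed to appear contiguously before the `body`-class URLs so that substituting the cached bytes preserves the concatenation order.

```python
from __future__ import annotations

import base64
from typing import Any, Callable
from urllib.parse import unquote_to_bytes

# Hypothetical transport hook: takes a URL and request headers, returns the bytes.
Fetch = Callable[[str, dict], bytes]


def read_block(entry: dict[str, Any], fetch: Fetch) -> bytes:
    """Return the bytes for one URL object, decoding data: URIs locally."""
    url = entry["url"]
    if url.startswith("data:"):
        meta, _, payload = url[len("data:"):].partition(",")
        if meta.endswith(";base64"):
            return base64.b64decode(payload)
        return unquote_to_bytes(payload)
    return fetch(url, entry.get("headers", {}))


class HeaderCachingClient:
    """Concatenates ticket blocks in order, reusing cached header bytes."""

    def __init__(self, fetch: Fetch) -> None:
        self._fetch = fetch
        self._header_cache: dict[tuple, bytes] = {}

    def assemble(self, resource_id: str, ticket: dict[str, Any]) -> bytes:
        urls = ticket["htsget"]["urls"]
        # Reuse is only attempted when every URL carries a class field.
        has_classes = bool(urls) and all("class" in u for u in urls)
        key = (resource_id, ticket["htsget"].get("format"))
        cached = self._header_cache.get(key) if has_classes else None

        parts: list[bytes] = []
        header_parts: list[bytes] = []
        cached_emitted = False
        for entry in urls:
            is_header = has_classes and entry["class"] == "header"
            if is_header and cached is not None:
                # Emit the cached header once, in place of the first header URL.
                if not cached_emitted:
                    parts.append(cached)
                    cached_emitted = True
                continue
            block = read_block(entry, self._fetch)
            if is_header:
                header_parts.append(block)
            parts.append(block)

        if has_classes and cached is None and header_parts:
            self._header_cache[key] = b"".join(header_parts)
        return b"".join(parts)
```

With a ticket like the example response, a first call to `assemble` would fetch every block, while a later call for another region of the same resource would fetch only the `body`-class URLs. A ticket without `class` fields always takes the fetch-everything path, since no assumption can be made about which blocks contain headers.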

### HTTPS data block URLs
