From 476afba3a4bdd9d3755a0c5eff31e15d281fc116 Mon Sep 17 00:00:00 2001 From: Cristina Yenyxe Gonzalez Garcia Date: Thu, 25 Apr 2019 00:46:01 +0100 Subject: [PATCH 1/4] htsget: 'class' protocol attributes for header & body (#322) htsget protocol responses may include the class attribute to distinguish which URL parts constitute the BAM/CRAM/VCF header and which the data body. Clients may also request only one or the other. --- htsget.md | 49 ++++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 44 insertions(+), 5 deletions(-) diff --git a/htsget.md b/htsget.md index ef1f4a228..daf5ccb77 100644 --- a/htsget.md +++ b/htsget.md @@ -170,6 +170,27 @@ The server SHOULD reply with an `UnsupportedFormat` error if the requested forma +`class` +_optional string_ + + +Request different classes of data. +By default, i.e., when `class` is not specified, the response will represent a complete read or variant data stream, encompassing SAM/CRAM/VCF headers, body data records, and EOF marker. + +If `class` is specified, its value MUST be one of the following: + + +
+ +`header` + + +Request the SAM/CRAM/VCF headers only. + +The server SHOULD respond with an `InvalidInput` error if any other htsget query parameters other than `format` are specified at the same time as `class=header`. +
+ + `referenceName` _optional_ @@ -307,6 +328,17 @@ _optional object_ For HTTPS URLs, the server may supply a JSON object containing one or more string key-value pairs which the client MUST supply as headers with any request to the URL. For example, if headers is `{"Range": "bytes=0-1023", "Authorization": "Bearer xxxx"}`, then the client must supply the headers `Range: bytes=0-1023` and `Authorization: Bearer xxxx` with the HTTPS request to the URL. + + +`class` +_optional string_ + + +For file formats whose specification describes a header and a body, the class indicates which of the two will be retrieved when querying this URL. The allowed values are `header` and `body`. + +Either all or none of the URLs in the response MUST have a class attribute. +If `class` fields are not supplied, no assumptions can be made about which data blocks contain headers, body records, or parts of both. + @@ -329,24 +361,28 @@ An example of a JSON response is: "format" : "BAM", "urls" : [ { - "url" : "data:application/vnd.ga4gh.bam;base64,QkFNAQ==" + "url" : "data:application/vnd.ga4gh.bam;base64,QkFNAQ==", + "class" : "header" }, { - "url" : "https://htsget.blocksrv.example/sample1234/header" + "url" : "https://htsget.blocksrv.example/sample1234/header", + "class" : "header" }, { "url" : "https://htsget.blocksrv.example/sample1234/run1.bam", "headers" : { "Authorization" : "Bearer xxxx", "Range" : "bytes=65536-1003750" - } + }, + "class" : "body" }, { "url" : "https://htsget.blocksrv.example/sample1234/run1.bam", "headers" : { "Authorization" : "Bearer xxxx", "Range" : "bytes=2744831-9375732" - } + }, + "class" : "body" } ] } @@ -364,7 +400,10 @@ An example of a JSON response is: 3. Client fetches the data blocks using the URLs and headers. 4. Client concatenates data blocks to produce local blob. -While the blocks must be finally concatenated in the given order, the client may fetch them in parallel. 
+While the blocks must be finally concatenated in the given order, the client may fetch them in parallel and/or reuse cached data from URLs that have previously been downloaded. + +When making a series of requests to fetch reads or variants within different regions of the same `<id>` resource, clients may wish to avoid re-fetching the SAM/CRAM/VCF headers each time, especially if they are large. +If the ticket contains `class` fields, the client may reuse previously downloaded and parsed headers rather than re-fetching the `header`-class URLs. ### HTTPS data block URLs From 75aa78ac01a591d0150ad580c48b07836347b730 Mon Sep 17 00:00:00 2001 From: John Marshall Date: Thu, 25 Apr 2019 10:09:00 +0100 Subject: [PATCH 2/4] Restore referenceName item formatting [minor] Restore the blank line before `referenceName` that the previous commit inadvertently removed, which is needed for this markdown formatting to be interpreted when displayed on GitHub via GFM. Also it needs *two* spaces at end-of-line to force a line break. --- htsget.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/htsget.md b/htsget.md index daf5ccb77..7e12cdbe4 100644 --- a/htsget.md +++ b/htsget.md @@ -191,7 +191,8 @@ The server SHOULD respond with an `InvalidInput` error if any other htsget query -`referenceName` + +`referenceName` _optional_ From 45c9feb5e4a7f5f684e29956af9a4406fb35edb1 Mon Sep 17 00:00:00 2001 From: John Marshall Date: Thu, 25 Apr 2019 12:40:21 +0100 Subject: [PATCH 3/4] Format subtable on gh-pages [minor] --- htsget.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/htsget.md b/htsget.md index 7e12cdbe4..aaf57045a 100644 --- a/htsget.md +++ b/htsget.md @@ -179,7 +179,7 @@ By default, i.e., when `class` is not specified, the response will represent a c If `class` is specified, its value MUST be one of the following: -
+
`header` From 52dfd0fc866fc6a073515dbc1d59eb47485c4964 Mon Sep 17 00:00:00 2001 From: Mike Lin Date: Mon, 13 May 2019 18:53:33 -0700 Subject: [PATCH 4/4] set htsget v1.2.0 --- htsget.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/htsget.md b/htsget.md index aaf57045a..af6697dfb 100644 --- a/htsget.md +++ b/htsget.md @@ -4,7 +4,7 @@ title: htsget protocol suppress_footer: true --- -# Htsget retrieval API spec v1.1.1 +# Htsget retrieval API spec v1.2.0 # Design principles
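Note on the series (illustration only, not part of any patch): the client-side behaviour that PATCH 1/4 describes for the response `class` field could be sketched roughly as below. This is a minimal sketch, assuming the client has already parsed the ticket JSON and holds its `urls` array as a list of dicts. The function name `urls_to_fetch` and the `have_header` flag are hypothetical names chosen for this example and do not come from the spec.

```python
from typing import List


def urls_to_fetch(urls: List[dict], have_header: bool) -> List[dict]:
    """Return, in ticket order, the URL entries a client still has to download.

    `urls` is the ticket's "urls" array; `have_header` says whether the client
    already holds the headers for this resource (hypothetical client state).
    """
    classes = [entry.get("class") for entry in urls]

    # "Either all or none of the URLs in the response MUST have a class attribute."
    if any(c is None for c in classes):
        if not all(c is None for c in classes):
            raise ValueError("ticket mixes URLs with and without 'class'")
        # No class fields: no assumptions can be made, so fetch every block.
        return list(urls)

    # The allowed values are `header` and `body`.
    for c in classes:
        if c not in ("header", "body"):
            raise ValueError(f"unexpected class value: {c!r}")

    if have_header:
        # Reuse the previously downloaded headers; skip header-class URLs.
        return [entry for entry in urls if entry["class"] == "body"]
    return list(urls)
```

With the four-URL example ticket from PATCH 1/4, a client that already has the headers would be left with only the two `body` entries, still in their original order. How the reused headers are then combined with the body blocks is left to the client and is not shown here.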