Support streaming bodies in the client #15

evert · 2014-09-15T21:17:46Z

Moved from here:

https://github.com/fruux/sabre-dav/issues/321

Hywan · 2015-01-12T13:29:03Z

Hello :-),

The Sabre\HTTP\Client::parseCurlResult method computes an array containing the response index. This index contains a Sabre\HTTP\Response object. This object extends Sabre\HTTP\Message. On this object, we have the getBody, getBodyAsStream and getBodyAsString methods.

So, because this response from Sabre\HTTP\Client::parseCurlResult is returned by the doRequest method, basically, this feature is already supported.

QED ￭.

evert · 2015-01-12T15:12:21Z

The point of this ticket is, that someone may be using the library to do downloads of very large files.

In those cases, we want to ensure that the entire file is accessed as a stream, and never placed into memory.

So while it's possible right now to convert the string into a stream, the goal is to change the client a bit so it uses a stream under the hood as well.

Hywan · 2015-01-12T15:19:29Z

@evert But the result is already in memory because cURL gives you all the “stuff”: https://github.com/fruux/sabre-http/blob/c55cbc1daa91293cda92ea4b90de79c743c4a149/lib/Client.php#L483. I will check if cURL can gives only few informations in order to create a stream.

Hywan · 2015-01-12T15:30:13Z

An interesting link from a friend of me @pmartin: http://stackoverflow.com/questions/1342583/manipulate-a-string-that-is-30-million-characters-long/1342760#1342760. However, I don't know how it works if we don't want to load the response yet we receive it but later: When reading the stream only.

Hywan · 2015-01-12T15:47:48Z

In case where the user has a stream ready and with the write permission, she can give this stream to the HTTP client and the response will be copied into this stream. This answer to one use-case.

evert · 2015-01-12T15:48:55Z

streams in request bodies already 100% work, this is about turning a HTTP response into a stream.

Hywan · 2015-01-12T15:50:43Z

@evert How does it work. I missed it in the source code?

evert · 2015-01-12T15:51:39Z

https://github.com/fruux/sabre-http/blob/master/lib/Client.php#L405

Hywan · 2015-01-12T15:57:18Z

@evert Yup, it's for sending a request. What you would like to do is for receiving a response, right?

evert · 2015-01-12T15:57:57Z

Indeed, yes!

h44z · 2015-04-12T00:15:57Z

Any news on this? I am very interrested in using streams as im going to down/upload large files >2GB with the webdav client.

Hywan · 2015-04-12T08:24:50Z

@h44z Not from me yet.

evert · 2015-04-13T15:28:32Z

This is still something that interesting for us, but we haven't had time implementing it yet.

evert · 2015-05-19T09:22:52Z

So the problem here is that curl actually doesn't have an easy way for us to just access the stream resource, as far as I can tell.

The only way we can progressively get access to the stream, is by using the CURLOPT_WRITEFUNCTION option, but that only gives us 'bits of the string' as opposed to a full-on stream.

With that function, we could send everything to a temporary stream (php://temp/) which would cache the result in memory, but that only solves part of the problem.

Ideally we'd want the response to return as soon as it starts coming in and not after all the bytes have arrived, and ideally we would want to not have to cache/buffer it anywhere.

I don't see an easy way to do that.

staabm · 2015-05-19T09:48:24Z

CURLOPT_WRITEFUNCTION seems to be invoked when a certain sized chunk was downloaded by CURL. So it looks more or less like a "real" stream.

Which parts of the problem doesnt this solve for you, could you elaborate?

evert · 2015-05-19T10:02:11Z

Well, it would be really nice if we can do something like :

$request = "...";
$response = $client->send($request);
stream_copy_to_stream('php://output', $response->getBodyAsStream());

This should:

Use real stream resources internally
Never buffer anything, anywhere.

staabm · 2015-05-19T10:20:59Z

Absolutely. So we need to change from using a string and use e.g. a php tmp stream in
https://github.com/fruux/sabre-http/blob/master/lib/Client.php#L496

this will be not "real" streaming but it will be way better as what we have ATM.
whats the actual shop stopper? Did I miss something?

evert · 2015-05-19T11:11:55Z

The problem with using the temporary stream is that it only partially solve the goals. It does not:

Avoid a cache/buffer. The entire response will be stored in memory or on disk.
We can't start reading until the entire response is in, because we can't write and read the string at the same time.

evert · 2015-05-19T11:12:16Z

It's better than nothing though.

staabm · 2015-05-19T11:53:46Z

Avoid a cache/buffer. The entire response will be stored in memory or on disk.

right, but thats how streaming works nevertheless... php://memory seems like a good fit for that.

We can't start reading until the entire response is in, because we can't write and read the string at the same time.

hm IIRC we could do this using a non-blocking read/write stream, couldn't we?

staabm · 2015-05-19T12:01:25Z

maybe it would also be a good occasion to use a different http client, e.g. https://github.com/amphp/artax

(so we dont need to workaround curl limitations)

evert · 2015-05-19T12:18:54Z

right, but thats how streaming works nevertheless... php://memory seems like a good fit for that.

That's not really true... If I didn't use curl and used PHP's built-in HTTP stream wrappers, there would be no buffer.

Here's an example (and note that I did stream_copy_to_stream() wrong in my previous code snippet, sorry about that):

stream_copy_to_stream(
   fopen('http://example.org/','r'),
   fopen('php://output','w')
);

If I do the same with a temporary stream, it would look more like this:

$tmp = fopen('php://temp','r+');

stream_copy_to_stream(
   fopen('http://example.org/','r'),
   fopen($tmp,'w')
);
rewind($tmp);

stream_copy_to_stream(
   $tmp,
   fopen('php://output','w')
);

This last example has two passes, and requires a buffer (disk or memory depending on the size) and this is exactly how the curl example with WRITEFUNCTIONwould work as well. This is far from ideal. The use-case I want to solve is indeed the '2GB download' use-case, and if I force people to store the entire thing on disk first that would be sub-optimal.

We can't start reading until the entire response is in, because we can't write and read the string at the same time.

hm IIRC we could do this using a non-blocking read/write stream, couldn't we?

Not at the same time, and not without a buffer. Perhaps with steam_select() and mkfifo() :)

maybe it would also be a good occasion to use a different http client, e.g. https://github.com/amphp/artax

Not a bad idea =)

staabm · 2015-05-19T14:38:42Z

I think the only really good solution will be a different http client.
With artax you can even use single threaded concurrency in case some requests can be made in parallel which could be another perf. win.

evert · 2015-05-19T14:46:21Z

I'll definitely look into it. My preference would go for something lightweight, so maybe artax is that =)

In the future I want to kick of sabre/davclient again, so that will be good timing to dig into that.

In order to resolve sabre-io#82, Remove the substr function in `parseCurlResult()` and register receiveCurlHeader as callback function by using CURLOPT_HEADERFUNCTION instead, To avoid additional memory copy of response body (from $response to $responseBody). This changes is not intended to resolve issue sabre-io#15

In order to resolve sabre-io#82 to avoid additional memory copy of response body (from $response to $responseBody), We deperecated `parseCurlResult` method which use `substr` function to separate HTTP headers and body and create new method `parseCurlResponse` instead. We register `receiveCurlHeader` method as callback function by curl_setopt using `CURLOPT_HEADERFUNCTION` option. This changes is not intended to resolve issue sabre-io#15 See sabre-io#115 (comment)

In order to resolve #82 to avoid additional memory copy of response body (from $response to $responseBody), We deperecated `parseCurlResult` method which use `substr` function to separate HTTP headers and body and create new method `parseCurlResponse` instead. We register `receiveCurlHeader` method as callback function by curl_setopt using `CURLOPT_HEADERFUNCTION` option. This changes is not intended to resolve issue #15 See #115 (comment)

evert added the enhancement label Sep 15, 2014

evert mentioned this issue Aug 30, 2015

stream webdav downloads using http client owncloud/core#18653

Merged

evert mentioned this issue Jun 27, 2017

Question: Allowed memory size #82

Closed

Gasol mentioned this issue Dec 12, 2018

Reduce memory footprint when parsing HTTP result #115

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support streaming bodies in the client #15

Support streaming bodies in the client #15

evert commented Sep 15, 2014

Hywan commented Jan 12, 2015

evert commented Jan 12, 2015

Hywan commented Jan 12, 2015

Hywan commented Jan 12, 2015

Hywan commented Jan 12, 2015

evert commented Jan 12, 2015

Hywan commented Jan 12, 2015

evert commented Jan 12, 2015

Hywan commented Jan 12, 2015

evert commented Jan 12, 2015

h44z commented Apr 12, 2015

Hywan commented Apr 12, 2015

evert commented Apr 13, 2015

evert commented May 19, 2015

staabm commented May 19, 2015

evert commented May 19, 2015

staabm commented May 19, 2015

evert commented May 19, 2015

evert commented May 19, 2015

staabm commented May 19, 2015

staabm commented May 19, 2015

evert commented May 19, 2015

staabm commented May 19, 2015

evert commented May 19, 2015

Support streaming bodies in the client #15

Support streaming bodies in the client #15

Comments

evert commented Sep 15, 2014

Hywan commented Jan 12, 2015

evert commented Jan 12, 2015

Hywan commented Jan 12, 2015

Hywan commented Jan 12, 2015

Hywan commented Jan 12, 2015

evert commented Jan 12, 2015

Hywan commented Jan 12, 2015

evert commented Jan 12, 2015

Hywan commented Jan 12, 2015

evert commented Jan 12, 2015

h44z commented Apr 12, 2015

Hywan commented Apr 12, 2015

evert commented Apr 13, 2015

evert commented May 19, 2015

staabm commented May 19, 2015

evert commented May 19, 2015

staabm commented May 19, 2015

evert commented May 19, 2015

evert commented May 19, 2015

staabm commented May 19, 2015

staabm commented May 19, 2015

evert commented May 19, 2015

staabm commented May 19, 2015

evert commented May 19, 2015