Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First bytes lost when using IncomingMessage & promise #1990

Closed
luissquall opened this issue Jan 4, 2016 · 13 comments
Closed

First bytes lost when using IncomingMessage & promise #1990

luissquall opened this issue Jan 4, 2016 · 13 comments

Comments

@luissquall
Copy link

I'm trying to create a Promise that resolves to a response stream, however, when I pipe the data to another stream the first bytes/chars are removed. Here is a simplified version of what I'm trying to accomplish:

var request = require('request');

new Promise(function(resolve, reject) {
    request.get('http://affenbits.com/').on('response', function(response) {
        resolve(response);
    });
}).then(function(response) {
    response.pipe(process.stdout);
});

Native http.ClientRequest doesn't strip anything:

var http = require('http');

new Promise(function(resolve, reject) {
    var req = http.request({
        hostname: 'affenbits.com'
    }, function(res) {
        resolve(res)
    });

    req.end();
}).then(function(response) {
    response.pipe(process.stdout);
});

Is there any way to avoid this behaviour?

I'm running node v0.12.9 & request 2.67.0

@simov
Copy link
Member

simov commented Feb 5, 2016

@request/promises any thoughts?

@analog-nico
Copy link
Member

Hi @luissquall , request returns a stream object already. Thus you don't need to use .on(...). This code works as intended:

new Promise(function(resolve, reject) {
    resolve(request.get('http://affenbits.com/'));
}).then(function(response) {
    response.pipe(process.stdout);
});

Anyway, the code you use doesn't use the stream but uses the IncomingMessage instead. I wonder why it doesn't work with Request. Let me dig deeper...

@analog-nico
Copy link
Member

Here is what I understand happened with the IncomingMessage in the case of request-promise versus http. Please handle the info with care because I don't know request's internals 100%.

request internally registers the .on('response', self.onRequestResponse.bind(self)) event of http.request(...). onRequestResponse serves the purpose of processing http's IncomingMessage by forwarding its data to the stream returned by request. Most importantly it forwards the IncomingMessage to the 'response' event of request to which the first code snippet registers to.

    request.get('http://affenbits.com/').on('response', function(response) {
        resolve(response);
    })

Thus the promise gets resolved but bear in mind that the .then(...) part is not executed immediately. In between the onRequestResponse function executes the following code:

    responseContent.on('data', function (chunk) {
      self._destdata = true
      self.emit('data', chunk)
    })

By that the IncomingMessage starts to get drained!

Since no one registered to the 'data' event and no one has called .pipe(...) yet, the streamed data gets lost. At least shortly after the streaming begun the .then(...) part gets executed:

.then(function(response) {
    response.pipe(process.stdout);
});

With that we got someone who is consuming the already streaming data. But only the last part - which streams in after .pipe(...) was called - is then printed to stdout.

As you can see the whole thing is pretty involved and not just a bug in request. IMHO .on('response', ...) is not designed to be used in this way by the user. The user should use the stream as suggested in my previous answer.

@luissquall
Copy link
Author

Thank you very much @analog-nico for the deep explanation, it makes sense.

resolve(request.get('http://affenbits.com/')); is not very helpful in my case because I would like to have more fine-grained control over the resolved promise: resolve when the response.statusCode is 200 else fail.

For instance:

request.get('http://affenbits.com/')
    .on('error', function(err) {
        reject(new Error(err));
    }).on('response', function(response) {
        if (response.statusCode == 200) {
            resolve(response);
        } else {
            reject(new Error(response.statusCode));
        }
    });

@analog-nico
Copy link
Member

You are welcome @luissquall .

You may use this code:

var stream = request.get('http://affenbits.com/');
stream.on('error', function (err) {
    reject(err);
});
stream.on('response', function (incomingMessage) {
    if (incomingMessage.statusCode === 200) {
        resolve(stream); // <-- This should do the trick!
    } else {
        reject(new Error(incomingMessage.statusCode));
    }
});

Of course this becomes a lot of boilerplate for a general use case. That is why I created request/request-promise#90 to cover this in Request-Promise as well. It will probably look like:

var rp = require('request-promise');

rp({ uri: 'http://affenbits.com/', resolveWithStream: true })
    .then(function (stream) {
        stream.pipe(process.stdout);
    })
    .catch(function (err) {
        /* Either non-2xx status code or technical error */
    });

Let me know if you would be interested in using Request-Promise with this new feature - namely the resolveWithStream option. Or maybe you would design the new feature a little different...

@luissquall
Copy link
Author

Couldn't make it work :/ Perhaps subscribing to 'response' cancels the original stream, nothing is piped to stdout:

A simplified test:

new Promise(function(resolve, reject) {
    var stream = request.get('http://affenbits.com/');
    stream.on('error', function (err) {
        reject(err);
    });
    stream.on('response', function (incomingMessage) {
        resolve(stream);
    });
}).then(function(response) {
    response.pipe(process.stdout);
});

On the second matter...I think it would be awesome but I'm not sure what 'success' should mean in a generic solution: a 2xx message or network success. resolveWithStream might be a test function.

@analog-nico
Copy link
Member

Sorry, you are right. When you call response.pipe(process.stdout); the stream emits an error with the message You cannot pipe after data has been emitted from the response. The problem is that .pipe(...) is not called synchronously.

All in all I would say streams don't work well together with promises. Proof me wrong but both are two competing approaches to handle asynchronous JavaScript programming. And both have of course their own areas where they are most useful.

I would suggest that you either don't do streaming and just work with promises - have a look at Request-Promise for that - or only use streaming without promises. I assume you are more familiar with promises so use them if you don't request megabytes of data. But if you really have to download megabytes in a single request then set up a clean stream chain so that you reduce the memory footprint of you node app - for what streams were designed for.

@hassansin
Copy link

Since the 'response' event is emitted before the data starts to drain, an easy fix would be to pause the response stream immediately inside 'response' event callback and thus stop emitting 'data' events. But when you pipe the stream inside promise 'then' callback, the stream resumes to flowing mode.

new Promise(function(resolve, reject) {
    request.get('http://affenbits.com/')
        .on('response', function(response) {
            response.pause();
            resolve(response);  
        });
}).then(function(response) {
    response.pipe(process.stdout);
});

@luissquall
Copy link
Author

Thank you @hassansin. Pausing the stream fixes the error.

I think the lib should cover this edge case, you might not want to consume the stream 'instantly', what do you think guys?

@analog-nico
Copy link
Member

@luissquall This topic recently came up in #2075 and @simov suggested to implement that into Request-Promise rather than Request. I can't speak for Request but I already added the issue #90 for Request-Promise.

@simov
Copy link
Member

simov commented Feb 17, 2016

That's a good fix idd. Additional option for pausing the response stream would do it, but I want to see more use cases before adding it to request.

@simov simov added the Promises label Feb 17, 2016
@hassansin
Copy link

I can mention another use use where I needed to verify the fingerprint before starting the stream:

return new Promise(function(resolve, reject) {
    request(options)
    .on('response', function(response) {
      response.pause();
      try {
        verifyFingerprint(response);
        return resolve(response);
      }
      catch (err) {
        return reject(err);
      }
    })
    .on('error', function(err) {
      return reject(err);
    });
  });

@stale
Copy link

stale bot commented Nov 23, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Nov 23, 2018
@stale stale bot closed this as completed Nov 30, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants