Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: added transform option #195

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

feat: added transform option #195

wants to merge 1 commit into from

Conversation

wesleytodd
Copy link
Member

@wesleytodd wesleytodd commented Jun 11, 2020

Prior work:

Adds a transform option. Allows users to transform inflight requests.

Things to note:

  1. As discussed in previous threads, when using the transform option it can invalidate etag and last-modified. I added documentation to that effect and set the defaults to false when transform is passed.
  2. Range requests, again as discussed previously, will behave oddly if you are not aware that your stream might not get a complete file. Also documented with a warning to set acceptRanges: false if you cannot write a transform which handles this kind of request.
  3. Compared to previous PRs, I added the res and opts to the function, this should make it easier to implement transforms which support range and other types of requests.

I think this resolves all the open conversations on the topic I saw from before. Let me know what you all think!

@@ -455,37 +460,37 @@

* update range-parser and fresh

0.1.4 / 2013-08-11
0.1.4 / 2013-08-11
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, my editor removed the trailing white space. This seems like a good thing, but I can make a separate commit if you think it worth it.

@dougwilson
Copy link
Contributor

Thanks @wesleytodd for helping pish this forward. I am generally skeptical that this type of option is truly worth all the complexity added, ultimately. Did you have a specific use case you are trying to achieve by this feature you can describe?

@wesleytodd
Copy link
Member Author

wesleytodd commented Jun 11, 2020

I do have a specific use case. I have an npm registry proxy, it caches the packuments and for now is re-writing the tarball urls before it writes them to disk. This is for robust npm client testing, and so I often open the registry on port 0. So every time I restart I have to blow away the cache because it points to the wrong port. I prefer to re-write on the other end, before sending it from the cache, which would enable me to re-use the cache. I was using serve-static for this, but without this feature in send I would have to extract my whole own file server. With this, I can just modify the json on the fly. Does that make sense?

Copy link
Contributor

@dougwilson dougwilson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not all my comments yet, as it seems when you force push and i was in the middle of the review, all my pending comments are erased... i assumed it was ready to review by the review assignment.

`etag` and `lastModified` and to use the underlying file as it normally would.

*Note on range requests:* When a range request is sent, the read stream will
only read part of the file. If your transform is preforming replacements, or
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would violate the actual http spec, as the range is on the actual recieved output. This will be a large issue for example if the transform doubled the output (maybe encode as hex); the web browser will send a range way past the end of the actual underlying file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think defaulting acceptRanges: false when transform is passed is a good compromise?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If transform starts removing pretty much all the features of this module, at what point is the feature actually just a separate functionality?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, default is not removal. In my use case I actually would force set etag and lastModified to true and acceptRanges to false anyway. To me that is not removing, just normal options setting for a specific use case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, to clarify i mean the removal as in what happens by default when the transform option is enabled. Yes, folks can then understand the foot guns and re enable certain things, but many will use the defaults.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I think the defaults are good mostly with the transform option, especially if we turn off range requests by default. FWIW, range requests are so uncommon in my experience. The only time I have ever used them was building a video player, but that is a very uncommon use case to optimize for.

If you are using the transform option, the default should be no range requests because it means you need to be very careful your transform can handle range requests. The content-length is a little less clear, but it really depends on your transform, so it makes sense to not set it if you use transform. Lastly for last-modified, this also depends on what your transform does so it makes sense to leave it up to the user.

The more I think about this, the more I think this is the right approach to a transform option default as long as it is well documented.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree on some of this; I think we can make a transform work with all the features if we implement it instead along the lines of a virtual fs, as I poked at in another comment, and not like this implementation. This implementation, ultimately, is just a simple solution that ends up having a lot of pitfalls I think don't need to be there if done differently. Range requests are not uncommon in practice; the module implements specifically what is not uncommon around the web. A very common point (besides audio and video files) that use range requests are pdf files too. That is how the web browser can display the first page of a 50mb pdf without waiting around forever. The browsers make use of range requests for quite a variety of file types under the hood.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also generally think it is bad API design for the default setting of one option to change based on what another, different option is set to, as it just adds confusion as to what exactly is going to happen with a set of options, as the defaults are now dynamic to other factors.

A function which returns a stream which can be used to transform the data.
If this option is passed it will change the defaults of `etag` and `lastModified`
to `false`. If the transform stream changes the content length, it is also
responsible for updating the `Content-Length` header or changing to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is likely to be a large foot gun. Is there some way to just make this work without adding such a potential issue? For example, for the intended use cases of this, is it more likely than not that the content length will be different from before?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought hard on this. I don't think there is a good solution which doesn't rely on the user setting it if they are modifying the content. So in my case, I need to buffer the entire file to parse the json and then transform.

In this case, I can calculate the new length, but if you are doing something like escaping content in a CSV you might be able to do it mid stream and in that way you would not know the content length in advance. In this case, you might consider moving to chunked encoding to avoid it. One of the reference implementations just does that for you, but that is also very presumptuous.

TBQH, I feel that this is an advanced feature, requiring advanced understanding. That justifies, to me, documenting it and letting users "do the right thing" for their use case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this feature is being implemented on the "wrong side" of this module's operations. As in, maybe the transformation should happen prior to the stream entering into this module, perhaps solving the virtual fs stuff folks want can also solve this as well, including keeping all the features of this module working an intact, like ranges, etags, etc.

I'm thinking like in a way, the transform is a kind of virtual fs in itself, where the contents, at least, are virtual.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, well if we allowed you to pass in the reference to fs then yeah we could do that. That said, I am strongly opposed to virtual fs implementations. There is so much complexity in that, it is just a really hard trade off compared to such a simple feature such as this. Have you ever dug into things like vinyl-fs? Maybe there are better more modern ones, but either way it is a big dependency to take on.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So to expand on what I mean by a virtual fs is not those enormous things. More like providing an interface that is much simpler to do the little but that this module needs like "hey, give me a readable stream for the file of this name" or some such. In that way, the returned stream in this case could have all the transforms added at that point, which would be below where things like ranging and similar are done. If the "virtual" thing didn't understand (or able to support) ranges, it would just return the full stream, for example. I am just thinking out loud here, to be honest. I just think having the transform part between this module and the fs seems like a better fit to get all the features working, but happy to hear other solutions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, ok then yeah I think I can get on board with that. So we would need fs.stat and fs.createReadStream. I think it would be pretty simple to do that. So in the case of doing a transform you would have to read the metadata and manage your own stat data based on the transform you are going to do. So in my case, I would have to modify the file size in the stat call by reading the whole file and applying the transform, then throw it out. I guess maybe I could cache it in memory, anyway it would push more of the complexity to the end user, but would allow you to be more technically correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. And to also be clear: I don't think we even need to copy those names / those specific interfaces. We can "dumb it down" to what is actually needed for this module to do it's operations and let the "default impl" map to those fs.* calls, but when users want to do something different, they get an API that is easier and only what they need to work with. For example, fs.stat returns a lot of things we don't need -- so no need to make the user implement a fs.stat interface. The same for fs.createReadStream especially, as that take a whole slew of options that this module is just never going to need to send, and hammering it down to a simplified interface also means that it is clear to the user what, exactly, they need to implement to get it working well.

I know we need to fix the TOCTOU issue, so perhaps we can roll those things together. But even if not, we would do well to make this interface not prevent us from resolving that issue, at least :) To be specific, the underlying changes that were being held is to (1) open a fd to the fs path, (2) stat the fd and then (3) read from the fd. I suppose this simplified interface could provide the same mechanical interface or something.

@wesleytodd
Copy link
Member Author

This is not all my comments yet, as it seems when you force push and i was in the middle of the review, all my pending comments are erased... i assumed it was ready to review by the review assignment.

Github really needs to fix that. Sorry I was fixing the build failures. I wont push anymore for a bit to be safe.

@dougwilson
Copy link
Contributor

How does this interact with a http HEAD request, by the way? We want to be sure, even if this is an advanced feature, it doesn't end up violating http itself. Ideally we should prevent the user from making such violations if possible.

@wesleytodd
Copy link
Member Author

wesleytodd commented Jun 11, 2020

How does this interact with a http HEAD request, by the way?

Great question, and one I did not test or think of. Now that you say it, I am confident this would be broken if you did a transform which modified anything represented in the headers values. Would it make sense to, when transform is passed, run the transform and read the file but not send it?

EDIT: also only do that when it is a file request

@dougwilson
Copy link
Contributor

Yea, that could probably work. It is of course hard to know what the transform is going to do, and the transform (as it is currently) cannot make any shortcuts to simply in the case of HEAD, which may be useful. I assume that a transform could change the Content-Type and other aspects, is that right? Ultimately, all the HEAD needs to do is just return the same headers as the same request would have returned if it were a GET method (including Content-Length, if GET would set that).

@wesleytodd
Copy link
Member Author

I assume that a transform could change the Content-Type and other aspects

Yeah, it could for sure do that. With your idea above of a fs wrapper I think we could get the behavior you are describing. To be honest though, we can get it also from this implementation.

If the only requirement is to be technically correct, then I think I prefer the transform interface from here over the extra complexity a user would need to have to get this with building their own fs.createReadStream and fs.stat.

@dougwilson
Copy link
Contributor

If the only requirement is to be technically correct

I mean, I would say that I think the "requirement" is that the user can implement a transform without scracificing all the other features in this module by default. As in, they should be able to easily implement a transform that "just works" with the module and it's feature set that makes this module actually more valuable than fs.createReadStream(...).pipe(res).

then I think I prefer the transform interface from here over the extra complexity a user would need to have to get this with building their own fs.createReadStream and fs.stat.

Ideally I tried to clear this in a comment above; I'm not suggesting the user copy the interface of those two fs methods, as they do more than is necessary for this module. The two most popular requests for this module is transform and virtual fs. I think we can actually solve both at the same time, and to me, that is worth it just based on what the users seem to be wanting.

@dougwilson
Copy link
Contributor

dougwilson commented Jun 12, 2020

I do have a specific use case. I have an npm registry proxy, it caches the packuments and for now is re-writing the tarball urls before it writes them to disk. This is for robust npm client testing, and so I often open the registry on port 0. So every time I restart I have to blow away the cache because it points to the wrong port. I prefer to re-write on the other end, before sending it from the cache, which would enable me to re-use the cache. I was using serve-static for this, but without this feature in send I would have to extract my whole own file server. With this, I can just modify the json on the fly. Does that make sense?

I am definately not following on what your use-case means and how, exactly, the transform of this module actually helps in this case, haha. Are you saying that you have a bunch of files on disk you want to serve up, and then some of those are json, in which you want to read the json, modify it, and then return that as the response? If so, I can show you a solution that does all of that very easily without this functionality even being necessary. But I think I'm just not understanding the underlying ask, perhaps, not sure. Because unless these json files are gigabytes, doing what you are describing as a transform stream seems unnecessarily complicated compare with what is already possible with express today.

@wesleytodd
Copy link
Member Author

If so, I can show you a solution that does all of that very easily without this functionality even being necessary.

Yes, but remember I currently have serve-static in between. I know I can use this module directly to do this, but I could not find a way to also have all of the serve-static logic and the transform.

I also could add this functionality at the serve-static layer for my use case. I only did it here because it looked like there was already some agreed upon progress. If we think it would be better to move this up one level, that would also work for me.

If not, I will take a stab at doing an abstraction over the fs calls tomorrow and see what I come up with.

I think your concerns are valid, I just worry about overcomplicating things for less common use cases. For example, writing a transform stream on a pdf. While I totally understand that there are many specific use cases for range requests, those are not commonly the use cases where you would use a transform stream.

Lastly, I fully agree we should never break spec, and also should "do the right thing" by default. But we cannot also let perfect get in the way of shipping the things that are useful to the end users.

@wesleytodd
Copy link
Member Author

Actually, without support here, wouldn't that just move all of the complexity we are discussing into that layer? I guess that might also not be a perfect solution.

@dougwilson
Copy link
Contributor

Yes, but remember I currently have serve-static in between. I know I can use this module directly to do this, but I could not find a way to also have all of the serve-static logic and the transform.

Adding caveats doesn't help :) I'm sure you can keeping adding them all day until you eventually describe what you're trying to achieve. Is it possible to share the code at all? My understanding above would still work even with your caveat, though.

Actually, without support here, wouldn't that just move all of the complexity we are discussing into that layer? I guess that might also not be a perfect solution.

No, but also I cannot say for sure unless you can share what you are trying to achieve so we can actually work together on a solution. Right now, there is not enough information unless I just accept your solution here for what it is. I want to help make a good, workable solution, but to do this, it needs to start at the goals what what it is that is trying to be accomplished in a meaningful way, so alternative solutions can be thought of and measured against that high-level goal.

@wesleytodd
Copy link
Member Author

I added you on the private repo where I am working on this. One thing to remember, I took up this PR because it was existing work and it also solved my need. There are other ways I can solve my problem, but I wanted to take this opportunity to also move forward the more generic work here. So if we can use my direct use case as an example, but think about how a more generic solution can be added here for others to use, that would be awesome.

@wesleytodd
Copy link
Member Author

Also an aside, this is where having a good real time chat for the team would be nice 😄. Slack even has pretty decent video chat and screen share.

@dougwilson
Copy link
Contributor

The reason those did not land is due to the issues mentioned in those PRs, and this PR doesn't actually resolve those issues. On top of that, I ultimately think it is the wrong solution to the underlying goal, and the initial PRs simply just did what looked easy, but it turns out, it is much more complex and those PRs just didn't want to solve that complexity. Landing something "just because" feels very wrong to me. I think we need to provide good solutions to the issues and now just around the "hey, add this 1-2 lines to your source and then it works, but with all these caveats that are ultimately going to make it hard to use in practice" :)

@wesleytodd
Copy link
Member Author

Yeah I agree completely that if it is hard to use in practice it is not worth landing. My point above is that "what is use in practice"? Because that usually doesn't mean every feature all the time working together. That said, I take your feedback and would rather move on to trying to see what a fs wrapper would look like because that can solve all of the problems well.

@@ -78,6 +78,8 @@ the 4th byte in the stream.

Enable or disable etag generation, defaults to true.

This will default to `false` if the [`transform` option is passed](#transform).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wording is misleading; if the transform option is passed, but as transform: undefined, I assume this will not default to false, but the wording says it will as long as the option is passed (without regard to what value is passed).

@@ -104,6 +106,8 @@ in preferred order.
Enable or disable `Last-Modified` header, defaults to true. Uses the file
system's last modified value.

This will default to `false` if the [`transform` option is passed](#transform).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wording is misleading; if the transform option is passed, but as transform: undefined, I assume this will not default to false, but the wording says it will as long as the option is passed (without regard to what value is passed).

##### transform

A function which returns a stream which can be used to transform the data.
If this option is passed it will change the defaults of `etag` and `lastModified`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wording is misleading; if the transform option is passed, but as transform: undefined, I assume this will not default those to false, but the wording says it will as long as the option is passed (without regard to what value is passed).

@@ -119,6 +123,55 @@ Serve files relative to `path`.
Byte offset at which the stream starts, defaults to 0. The start is inclusive,
meaning `start: 2` will include the 3rd byte in the stream.

##### transform

A function which returns a stream which can be used to transform the data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be useful to point to what, exactly, "a stream" means. I assume it means it needs to be a Node.js ReadableStream ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see it is addressed below, which seems like duplicate wording. I guess that it is explained down below, but maybe it doesn't need to be listed twice, idk on this comment, but left it for posterity.

`Transfer-Encoding: chunked`.

The transform function is given the readable stream from `fs.createReadStream`,
the `response` object and, the `options` passed to `send()`. It must return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of passing in the options to this transform function? Remember that there is those APIs like sendStream.etag(false) that changes the setting, but is not reflected in that options object; a transform implementer may not realize that, so if those are not important, we should strip them out. Or if they are important, they should get added in from the root source of the setting values.

can set `acceptRanges: false` to disallow range requests or you will have
to ensure that your transform logic can handle partial data. For an example
of a transform which can work on range requests, see the test labled
`should transform range requests`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the test, it is honestly hard to tell what exactly is going on there. Is there a better real-world transform like the tobi one that can be used instead? This is mainly because the docs are pointing folks to look there.

read: function () { }
})
stream.pipe(concatStream(function (d) {
var _d = Buffer.from(d.toString('utf8').replace('tobi', 'not tobi'))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that this example does not work with range requests; just above this it says if it doesn't work with them, be sure to turn them off. So the example should probably head the notice from the docs above as a good example :)

// headers for their stream type.
res.removeHeader('Content-Length')

stream = this._transform(stream, res, options)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So since the original stream is overwritten and now lost; does this open open the floor to fd leaks again? Does using a transform stream mean users now need to be careful not to cause them with the fs steam encounters an error, or similar?

@dougwilson
Copy link
Contributor

Yeah I agree completely that if it is hard to use in practice it is not worth landing. My point above is that "what is use in practice"? Because that usually doesn't mean every feature all the time working together. That said, I take your feedback and would rather move on to trying to see what a fs wrapper would look like because that can solve all of the problems well.

Yea, sorry, I am trying to articulate myself over this forum, so please bear with me if it is not quite coming through :) I think the point I am trying to make is that ideally a system does not have "two modes of operation" for example -- if using a transform stream by default will cut off a large amount of functionality this module is providing, at what point is it just a different module. For example, you can certainly get by with just fs.createReadStream(...).pipe(res). It seems that the transform option, as implemented, basically turns off mostly everything in this module that makes it more than that snippet, essentially creating a single API (the send stream) that operates very differently depending on a single option.

But apart from that, I think we can actually implement transform in a way that doesn't require turning off all those features, which is really what I'm trying to drive towards. Of course, users can still turn off features they do not want, but those features add a lot of value, and I think if transform is provided by the platform, it should feel like it's part of the platform, letting you take advantage of all the features the platform is providing (etags for caching, range requests for partial downloads / download resumption, etc.). And from the two main feature requests from this module (transform + virtual fs), I think we can provide an interface that can be both easy to use and solve both things, which would be the biggest win for users and ourselves.

@dougwilson
Copy link
Contributor

dougwilson commented Jun 12, 2020

I added you on the private repo where I am working on this.

Ok, so you are not using Express, I see. That would make the solution more complicated, because the added crutch added by trying to implement everything over again that Express provides, which seems a bit odd, really. I see quite a bit of things in there that is already Express sugar, but the bolt ons actually provide spec footguns that Express takes away :)

Regardless, where is the spot in the code in which you would need to place this functionality, exactly? I assume at the one spot where the serve-static module is used at?

@dougwilson
Copy link
Contributor

Ok, I see where in there. The way you are using the router, you can implement this and reduce your overall complexity a lot. I think were you may get mixed up in that, for your packuments, those are not actually "static" for your use-case, but here you are, trying to use something created for static use-cases (I mean, it's part of the name of the module you imported 😆 ). I see what you mean by wanting to move the transform. I assume I can make a PR to your repo? If so, I would be happy to throw one up tomorrow, if you are willing, to see what I mean. I think it would reduce a lot of the complexity in your impl there, ultimately, and get what you are trying to achieve.

@wesleytodd
Copy link
Member Author

Happy to have a PR, but I don't want you to spend your time if it just to use send directly and do send(...).pipe(transform).pipe(res). I can do that version of it myself. The reason it is the way it is today is because I thought I could get away with it being purely static. When I realized I could not, I figured one way I could solve it was finishing this PR.

@wesleytodd
Copy link
Member Author

Also, happy to hop on a call, because it might be faster. The problem here is that I shared a work in progress, hence the private repo, so it is messy.

Since it is clear this PR will not move forward, I think we should close all of these so that noone else makes the same mistake I did with trying to pick it up again. Then I can work on my project with just send and a new api to implement the transform as discussed above.

@dougwilson
Copy link
Contributor

but I don't want you to spend your time if it just to use send directly and do send(...).pipe(transform).pipe(res).

Gotcha. So I wasn't going to just change it to send(...).pipe(transform).pipe(res), haha. I don't think that would work, anyway, as you cannot just pass any steam as the argument to this module's .pipe, for instance, as this module is not actually a standard Node.js stream: it needs to write headers to the response and such for the features to work, for instance.

Also, happy to hop on a call, because it might be faster. The problem here is that I shared a work in progress, hence the private repo, so it is messy.

I am always happy to do so as well anytime; we don't have a "norm" for that, so would need to ad-hoc it somehow if you like; I think you have my phone number for example. While we were in the TC meeting yesterday, I actually opened for discussion the idea to have a "realtime" communications area, to see what folks think. If that gets going, that could be a method in the future (but for the time being, you can email me to set up something I suppose as a solution, lol).

I think we should close all of these so that noone else makes the same mistake I did with trying to pick it up again.

Hm... I guess that makes sense. I feel bad closing folk's PRs on them, but looking back on at least one (I didn't look at all before responding here), I guess they were stale in that I provided some feedback and OP didn't respond back. I can see from that POV. There is actually a top-level issue about transform that helps, as I think the idea to transform is valid. Just looking at the age of at least one of those PRs, I think once the virtual fs requests came up is when I started to form the new idea about getting them both done together, as I think I tried to articulate above (again, sorry if it's not clear, please be gentle; it's hard to share a large type of design over comments in GH, so probably need to whiteboard them and/or write up a technical document to describe what I'm thinking of to convey it better).

@wesleytodd
Copy link
Member Author

wesleytodd commented Jun 12, 2020

Ok, so I spent a bit this morning looking at how we could provide a "synthetic filesystem". The first challenge will be that we expose the underlying stat data in our api via the file event. We use stat.size, stat.mtime and stat.isDirectory, but then pass the full stat to self.emit('file' ... . So if we were to shoot for non-breaking changes we would need an api which would allow for replicating the entire stat object even if we only use part of it.

I think there are reasonable breaking changes we can make, but since I don't think this is nearly important enough to hold up express@5 for, it would take years to roll out across the board. I am not sure the trade-off would be worth it on that timescale, so I think we should shoot for an api which allows for backward compatibility. If you, @dougwilson, agree with this I will continue my experimentation with this in mind.

@wesleytodd
Copy link
Member Author

OK! I did a bunch more experimentation and I found an interesting solution which enables what I want while also not requiring any changes in this lib. Posting here to see what you think, and if it seems fine to you, I will publish the work as a module and we can include an example of how to use it.

function transformResp (res, transform) {
  transform.pipe(res)
  return new Proxy(res, {
    get: (target, prop) => {
      if (prop === 'write' || prop === 'end') {
        return (d, e, c) => transform[prop](d, e, c)
      }
      return target[prop]
    }
  })
}

And here is an example of using it:

const buf = []
const s = new TransformStream({
  transform: (d, e, cb) => {
    buf.push(d)
    cb()
  },
  flush: (cb) => {
    const json = JSON.parse(Buffer.concat(buf).toString('utf8'))
    // transform json...
    const b = Buffer.from(JSON.stringify(json), 'utf8')
    res.setHeader('content-length', b.length)
    cb(null, b)
  }
})

// Send file
send(req, '/file.json').pipe(transformResp(res, s))

Obviously this is not quite complete, there are other properties you would need to proxy to get the full behavior of the stream, but it illustrates the general point. With this I am able to use the same transform stream in HEAD requests and when I proxy to an upstream server, which was my goal with adding the transform option.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants