Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added file upload support to lambda. Supports multipart. Includes tests. #3676

Closed
wants to merge 32 commits into from

Conversation

k00k
Copy link
Contributor

@k00k k00k commented Jan 14, 2020

Fixes #1419, #1703 file upload using lambda.
Based on original PR #1739

TODO:

  • Update CHANGELOG.md with your change (include reference to issue & this PR)
  • Make sure all of the significant new logic is covered by tests
  • Rebase your changes on master so that they can be merged easily
  • Make sure all tests and linter rules pass

After needing graphql-upload support in apollo-server-lambda, we (me and @smschick) found PR #1739 that was created a while ago and never was updated after @abernix had some comments on the PR. The main thing that @abernix had mentioned was being sure that the file upload process was happening in the right order.

This PR takes those comments and some of the work from the original PR and hopefully solves the issue. Thanks to @charleswong28 for his work on the orig PR.

Please let me know if there's anything you'd like to see. Currently all tests are passing which includes single and multiple file uploads.

@apollo-cla
Copy link

@k00k: Thank you for submitting a pull request! Before we can merge it, you'll need to sign the Apollo Contributor License Agreement here: https://contribute.apollographql.com/

@k00k k00k mentioned this pull request Jan 14, 2020
4 tasks
@k00k k00k requested a review from abernix January 14, 2020 22:27
@jacky-ttt
Copy link

@k00k I tried your pull request. I can upload image to s3 via apollo-server-lambda locally using serverless-s3-local. But when I upload the code to a real aws lambda. The file saved cannot be read correctly. Maybe the encoding is wrong. Have you encounter such problem? Do I need to config something on AWS Lambda?

@k00k
Copy link
Contributor Author

k00k commented Jan 20, 2020

@jacky-ttt I'll take a look. In the meantime, can you share your code?

@k00k
Copy link
Contributor Author

k00k commented Jan 20, 2020

@jacky-ttt After some testing, I believe I was able to replicate the problem you were having. This is a guess, of course, because I don't know exactly all of the steps you did or didn't take. But I was able to upload locally and get "corrupted" files when sent through a real Lambda.

I'm guessing it's likely due to needing to set API Gateway's Binary Media Types. In order to pass binary data via multipart/form-data, you need to add that as a Binary Media Type in your API Gateway's settings for that lambda.

I've published an example repository that takes care of this for you if you deploy with the Serverless Framework. Otherwise, the README outlines how to do it manually.

Please let me know if that does in fact solve your problem.

If anyone else has other/better ideas on how to handle that, please feel free to chime in. But I believe it's likely out of the scope of this repo and thus shouldn't affect the merging of this PR.

@jacky-ttt
Copy link

@k00k Thanks for the quick reply. Sorry for not share some code in the first place. The example repo is really helpful. Your suspicion is correct. What I miss is to use serverless-apigw-binary to add binary media type, multipart/form-data.

@aymericbouzy
Copy link

@abernix I'm very much interested in this feature. I was wondering how much time is usually needed for such a PR to be validated and released? Thanks in advance. And thank you so much for everything that is being done on Apollo Server ❤️

@k00k
Copy link
Contributor Author

k00k commented Jan 24, 2020

@abernix and @martijnwalraven is there anything we can do to help get this PR moving forward?

@mrfoster
Copy link

I am still hitting the following error

This implementation of ApolloServer does not support file uploads because the environment cannot accept multi-part forms

Thrown here: https://github.com/abcya/apollo-server/blob/fb60598286e9f45e55e3ccb197f1ce11b337014a/packages/apollo-server-core/src/ApolloServer.ts#L280

Which is caused by supportUploads always returning false in the superclass: https://github.com/abcya/apollo-server/blob/fb60598286e9f45e55e3ccb197f1ce11b337014a/packages/apollo-server-core/src/ApolloServer.ts#L670

It looks like it should be as simple as adding the following in the ApolloServer subclass.

  supportsUploads(): boolean {
    return true;
  }

I mentioned it before here:
#1739 (comment)

I'm bit puzzled how it could be working at all unless it's still being executed against a very old version.

@k00k
Copy link
Contributor Author

k00k commented Jan 27, 2020

@mrfoster

You're correct, I've added the supportsUpload to the subclass. Good catch.

As to why I and a few others (and the automated tests) didn't have trouble with this originally, I agree, very weird. When you tested, were you supplying an uploads property/obj when instantiating ApolloServer? What version of Node were you testing on? I was mainly testing with 12.14.1 (current Lambda latest).

In any case, please pull and give it a shot now, should be fine.

@mrfoster
Copy link

@k00k Thanks for adding that.

Yes, uploads option is being supplied to set maxFileSize. Using node v10.16.3 with typescript 3.7.4.

I am actually copying apollo-server-lambda files into my project to get it working because I ran into some issues building it. Very much looking forward to using a npm package once it's been accepted.

Thanks to you and @charleswong28 for sorting it, much appreciated.

@aymericbouzy
Copy link

Thanks @abernix! In fact I don't see the Artifacts tab ... but your direct link works! Seems to be a weird bug of CircleCI.

Screenshot 2020-01-31 at 16 35 38

npm install https://59244-56760633-gh.circle-artifacts.com/0/packed/apollo-server-lambda/apollo-server-lambda-2.9.16.tgz

Will try this way and provide feedback 👍

@abernix
Copy link
Member

abernix commented Jan 31, 2020

Interesting! Thanks for the note, @aymericbouzy.

That is indeed peculiar/buggy, more so since it shows the artifact tree without the tab when I go to that link in Incognito mode. It looks like the tab reliably shows, so long as you're logged in as SOME user in their "New UX" experience: https://app.circleci.com/jobs/github/apollographql/apollo-server/59244/artifacts

Looking forward to knowing how your use of it goes! (No hurry.)

@aymericbouzy
Copy link

I managed to make it work 🎉

Some feedback on my "developer experience" :

  1. using the artifact was a breeze 👏

  2. I'm using Serverless Framework : I had to add the following configuration to make it work (with provider aws):

provider:
  apiGateway:
    binaryMediaTypes:
      - "*/*"

I thought this was not very well documented, so including it in the documentation will help people get started I believe. What's particularly hard about this bug is that when this property is missing, the file simply shows up as corrupted, but there are no errors per say, no warning, nothing to point you at what is wrong about your setup. It is not defined this way by default, which may be an issue with Serverless Framework? I don't know why it is this way, but at the moment, this is a real footgun.

  1. Since this feature hides away the implementation details and the file is provided as a stream, it might not be clear for the user that the file is actually not streamed into the AWS Lambda : the handler receives the whole file in the event, and the file is simply provided as a stream to be consumed. But this is not memory efficient in any way (unless I totally misunderstand this) and configuring the limits in terms of file size seems important since AWS Lambda runs in a limited memory environment. However, the way to do this seems undocumented at the time being https://www.apollographql.com/docs/apollo-server/data/file-uploads/ even if the option seems to be available when I read the source.

  2. I had seemingly unrelated CORS errors popping up when I tried this : apparently I shouldn't have been sending Authorization: null header. I fixed it by not sending this header when I had no access-token : this has nothing to do with this PR per say, but I think it's likely people may have issues with CORS when opting in with this feature ? I still don't fully grasp how all this CORS thing work, and maybe the documentation should be improved. https://www.apollographql.com/docs/apollo-server/deployment/lambda/

  3. I would also have appreciated having a direct link to the "client side" of things regarding file uploads in the "server side" documentation because even if I'm a backend developer, I wanted to create a micro client app to test everything was working correctly.

  4. My use case was to transfer the received file to AWS S3. I tried providing the stream as is, however it failed because AWS S3 requires to send the ContentLength and couldn't determine the content length before consuming the stream. It would have greatly helped if I had had access to the size of the file in addition to its mimetype and encoding. It seems to me this information could be provided at no cost, but I might be wrong. I ended up providing it as a Buffer by creating this small util:

const streamToBuffer = stream =>
  new Promise(async resolve => {
    const chunks = []
    stream
      .on("data", data => chunks.push(data))
      .on("end", () => resolve(Buffer.concat(chunks)))
  })

I guess I have to open up an issue on https://github.com/jaydenseric/graphql-upload ?

  1. stream property is now deprecated, and one should use createReadStream() instead: this is not documented correctly at the moment.

Overall : the code works really well. There are a number of loop holes in the documentation that could easily be fixed I guess. I'm willing to suggest some changes to the documentation through some PRs 👍 I can also create separate issues if needed.

@k00k
Copy link
Contributor Author

k00k commented Feb 3, 2020

@aymericbouzy Regarding some of the points that you highlighted, I had made an example repository that talked about the binary media types for API GW. It also has it all working with s3, were you not aware of this example? If you were and it wasn't helpful, please let me know why so I can make it more so.

Copy link
Member

@abernix abernix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left some comments throughout. I'm glad that the current implementation is working for some right now, but I think there needs to be more pattern alignment with other packages in this repository and improvements to TypeScript typings to call it ready.

It's really worth noting that, while I certainly appreciate the efforts here, we are actively working on removing individual integrations, like apollo-server-lambda, apollo-server-express, etc.) as part of the #3184 proposal in the Apollo Server 3.0 roadmap (#2360 ). Those are to be replaced with a more ubiquitous transport.

That's to say, this work may need to be re-visited when that happens. That's not to say that this work will be fruitless though, as there are learnings that will need to be applied to the new approach too, but I'm hopeful that the patterns and duplication across integrations will be substantially reduced.

The first work of that new transport work is appearing (recently) in #3576, but it's still not quite ready. Feel free to tinker with the idea though if you feel so compelled, as it is functional, but it's currently just an HTTP transport and there's no Lambda handler for it just yet. The expectation is that there will be off-the-npm-shelf npm packages, much like the aws-serverless-express I linked above, that will bridge the gap and coerce the special event object into a more generic interface (namely, something like http.IncomingMessage) which can be used in the same way across the various implementations.

Comment on lines +34 to +38
export const NODE_MINOR_VERSION: number = parseInt(
process.versions.node.split('.', 2)[1],
10,
);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only for tests and our tests only use the latest major versions of these servers. Let's remove this and the bits below that utilize it. Node.js 8 was also never LTS until 8.9.

// This translates the arguments from the middleware into graphQL options It
// provides typings for the integration specific behavior, ideally this would
// be propagated with a generic to the super class
createGraphQLServerOptions(
event: APIGatewayProxyEvent,
context: LambdaContext,
context: LambdaContext
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
context: LambdaContext
context: LambdaContext,

): Promise<GraphQLOptions> {
return super.graphQLServerOptions({ event, context });
return super.graphQLServerOptions({ event, context});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave out unrelated formatting changes.

Suggested change
return super.graphQLServerOptions({ event, context});
return super.graphQLServerOptions({ event, context });

@@ -213,19 +225,29 @@ export class ApolloServer extends ApolloServerBase {
}
}

const callbackFilter: APIGatewayProxyCallback = (error, result) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this is a pattern that started in the other PR, but any reason to not just declare the response before the existing representation of callbackFilter, reference response.end() inside of it and leave the arguments passed to the function returned from graphqlLambda the same? (i.e. (event, context, callbackFilter))?

I'm just missing the justification for changing so many lines here.

...requestCorsHeadersObject,
const makeCallbackFilter = (doneFn: Function): APIGatewayProxyCallback => {
return (error, result) => {
doneFn(); // needed to close the response
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per my above comment, why not just call response.end() here?

request.headers = event.headers;
request.headers["content-type"] = contentType;

processFileUploads(request, serverParams.response, serverParams.uploadsConfig)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, okay, based on the casting, I now think you're trying to simulate the behavior of an http.IncomingMessage which extends stream.Readable and events.EventEmitter.

I understand the need to do that, and I fully expect that's what graphql-upload would expect, but I think this PR needs to find a more structured and strongly-typed way to do that.

Perhaps it's worth using something like this strongly-typed http-lambda package (though this seems to be published into an npm namespace I didn't find right away; perhaps https://github.com/awslabs/aws-serverless-express is an alternative) to coerce the event into an http.IncomingMessage?

To be honest, I'd rather we just actually instantiate an http.IncomingMessage by invoking constructor (i.e. new) rather than using a stream.Readable as if it were the right interface. We could create a subclass of stream.Readable that implements the necessary properties, rather than just setting random properties (e.g. headers) onto a stream instance. Does that make sense?

Comment on lines +106 to +113
} else {
// standard query/mutation - no files uploaded
runQuery(JSON.parse(event.body));
}
} else {
// just query string params
runQuery(event.queryStringParameters);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Converting these to else if's at the same depth would be preferable over the if/if/else/else setup.

import {
renderPlaygroundPage,
RenderPageOptions as PlaygroundRenderPageOptions,
} from '@apollographql/graphql-playground-html';

import { graphqlLambda } from './lambdaApollo';
import { Headers } from 'apollo-server-env';
import stream from 'stream'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import only the symbols which are necessary, rather than the whole module.

Suggested change
import stream from 'stream'
import { Writeable } from 'stream'

@@ -213,19 +225,29 @@ export class ApolloServer extends ApolloServerBase {
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll make reference to this again in comments below, but is there a reason that most of the logic that's being added in lambdaApollo.ts (below, in this PR) isn't added here instead (before graphqlLambda is ever kicked off)?

This is more of what I was thinking with my comment on the other PR (https://github.com/apollographql/apollo-server/pull/1739/files#r288265931) and would make this match the other integrations. This is most important from a maintainability standpoint over the the Apollo Server project as a whole since this divergent pattern makes it difficult to grok what's going on (particularly by starting a new pattern which passes two option objects to graphqlLambda which already has an options property for its GraphQLOptions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the reason that everyone (me and the orig PR) has been organizing the code to process the files first, before doing the actual query is because the processFileUploads function's response is actually needed for the query body when it is a multipart form. We're struggling to follow the same design principles of the other integrations (Express, HAPI, Fastify, etc) because those integrations all have the benefit of actual proper middleware implementations at the server level and that's where those implementations process the uploads. The lambda server/proxy doesn't really have that level of middleware as I understand it. And I'm pretty sure we're not going to get into a complete re-write of that part here. However, in those other integrations, middleware acts much the same way as I understand it, by handling the file upload first and then performing the query.

All that said, I did take another stab at it, moving the logic out to ApolloServer.ts. The graphqlLambda function kicks off the chain, and we kinda cheat by passing processEvent as the first arg so that we can await the file upload processing in lambdaApollo.ts. Not sure it really changes anything though, as we still wait for the upload to finish before running the query, but that seems to be the nature of the beast. If you'd like to see this WIP, you can look at the work in this branch: https://github.com/abcya/apollo-server/tree/event-rework/packages/apollo-server-lambda/src
Please ignore the any typings all over the place, was just trying to prove a concept quickly. Would definitely need some work there, just didn't want to put in too much effort without discussion (I'm not convinced myself).

The more time we spend on this, the more I see why the original PR structured it the way he did with the .then(), and I actually think it's a decent approach given the limitations we're up against.

request: {
url: event.path,

const runQuery = (query:any) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the type of this query be a HttpQueryRequest?:

export interface HttpQueryRequest {

@abernix
Copy link
Member

abernix commented Feb 7, 2020

I should add, most of all: I still think this is a valuable addition to the v2 series. If you’re willing to keep working through it, I’m happy to see it through!

@k00k
Copy link
Contributor Author

k00k commented Feb 19, 2020

@abernix any feedback on my reply about organization of logic?

@ghost
Copy link

ghost commented Feb 28, 2020

@abernix any feedback on @k00k 's reply about organization of logic? This is getting stagnant and we'd love to get this implemented... or at least get some closure.

@mathix420
Copy link
Contributor

any news there ?

@erickrawczyk
Copy link

@abernix any way to help move this forward?

@mathix420 mathix420 mentioned this pull request Mar 28, 2020
4 tasks
abernix added a commit that referenced this pull request Apr 10, 2020
Aim to provide file-upload parity support — as already supported within
the other Apollo Server integration packages — via the third-party
`graphql-upload` package.

Co-authored-by: charleswong28 <chung.triniti@gmail.com>
Co-authored-by: Steve Babigian <steve@noisykid.com>
Co-authored-by: Jesse Rosenberger <git@jro.cc>
Closes: #1419
Closes: #1703
Supersedes: #1739
...and therefore...
Closes: #1739
Supersedes: #3676
...and therefore...
Closes: #3676
@abernix
Copy link
Member

abernix commented Apr 10, 2020

Superseded by #3926, which has merged into release-2.13.0. An alpha release has been published on the alpha npm tag which includes #3926, and can be installed with npm install apollo-server-lambda@2.13.0-alpha.0. Please try it out and report any issues on a new issue which references #3926.

@abernix abernix closed this Apr 10, 2020
@mateo2181
Copy link

@jacky-ttt After some testing, I believe I was able to replicate the problem you were having. This is a guess, of course, because I don't know exactly all of the steps you did or didn't take. But I was able to upload locally and get "corrupted" files when sent through a real Lambda.

I'm guessing it's likely due to needing to set API Gateway's Binary Media Types. In order to pass binary data via multipart/form-data, you need to add that as a Binary Media Type in your API Gateway's settings for that lambda.

I've published an example repository that takes care of this for you if you deploy with the Serverless Framework. Otherwise, the README outlines how to do it manually.

Please let me know if that does in fact solve your problem.

If anyone else has other/better ideas on how to handle that, please feel free to chime in. But I believe it's likely out of the scope of this repo and thus shouldn't affect the merging of this PR.

@k00k I followed your comment and follow the steps in your repo but I can't uplod images with my lambda function. I was working with serverless-offline I everything works fine, but I had not success with my function in AWS.

Do you know which can be the problem?

Thanks in advance!

@k00k
Copy link
Contributor Author

k00k commented Apr 29, 2020

@k00k I followed your comment and follow the steps in your repo but I can't uplod images with my lambda function. I was working with serverless-offline I everything works fine, but I had not success with my function in AWS.

Do you know which can be the problem?

If you did either the manual method or used the serverless-apigw-binary plugin, that should do it. Not sure outside of that.

@mathix420
Copy link
Contributor

@k00k I followed your comment and follow the steps in your repo but I can't uplod images with my lambda function. I was working with serverless-offline I everything works fine, but I had not success with my function in AWS.

Do you know which can be the problem?

Thanks in advance!

It can be plenty of things did you watch your lambda logs ? Can you provide us more details

@mateo2181
Copy link

Yes, I had an error connecting with Cloudinary. Now is working perfectly!
Thanks!

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 16, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

File uploads using apollo-server-lambda
9 participants