Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Astro Env #894

Open
wants to merge 42 commits into
base: main
Choose a base branch
from
Open

Astro Env #894

wants to merge 42 commits into from

Conversation

florian-lefebvre
Copy link
Member

Summary

astro:env - Improve DX and security around environment variables in Astro

Links

proposals/0046-astro-env.md Outdated Show resolved Hide resolved
proposals/0046-astro-env.md Outdated Show resolved Hide resolved
proposals/0046-astro-env.md Outdated Show resolved Hide resolved
proposals/0046-astro-env.md Outdated Show resolved Hide resolved
proposals/0046-astro-env.md Outdated Show resolved Hide resolved
proposals/0046-astro-env.md Outdated Show resolved Hide resolved
proposals/0046-astro-env.md Outdated Show resolved Hide resolved
proposals/0046-astro-env.md Outdated Show resolved Hide resolved
Co-authored-by: Nate Moore <natemoo-re@users.noreply.github.com>
@matthewp
Copy link
Contributor

matthewp commented Apr 18, 2024

I agree with @bholmesdev that we should align the APIs. I think I tend to prefer making public/private/client/server properties of the options object rather than chaining. This mostly comes down to the discussion above about what order chaining should happen.

In a fluent API (another name for chaining) the expectation is that they can be chained in any order as they are operating on a common this value. If that's not the case and ordering matters I would find using chaining to be unintuitive.

If we go with an options object we can make the types work by making them be a discriminated union where private and client can't be used together.

@matthewp
Copy link
Contributor

Some questions I have on the client and server options are:

  1. Should it be required that you provide at least one of these?
  2. If not required then what is the default? That it be applied to both?

I think I lean towards making at least one of these be required rather than picking a default, because accidentally bringing in values into the client, even if "public" would be a footgun.

@matthewp
Copy link
Contributor

On the topic of getSecret() in the client, I think it should throw if you call that function. The only reason for it existing at all is because of TypeScript. There's no use-case for calling it on the client, so we should throw.

@bluwy
Copy link
Member

bluwy commented Apr 18, 2024

In a fluent API (another name for chaining) the expectation is that they can be chained in any order as they are operating on a common this value. If that's not the case and ordering matters I would find using chaining to be unintuitive.

I don't think we should strictly follow this. Instead I think we should follow a zod-like API so it feels unified. Zod also has certain cases where certain chains can only be used after another, for example z.string().datetime(). If we opt for an pure-options based config, then it doesn't feel as unified (to me).

Also, Zod supports .optional() and .default() instead of { default/optional: boolean }.

EDIT: The fluent API wiki also doesn't explicitly mention that methods need to / can be in any order.

@Princesseuh
Copy link
Member

Princesseuh commented Apr 18, 2024

No comments from me, looks neat. I like the builder API for the .public() and stuff, always looks cute. I agree with bluwy regarding fluent vs options

@matthewp
Copy link
Contributor

I think we should have the fluent vs. options API discussion externally. It will probably block this RFC from being fully accepted but shouldn't affect implementation very much. I'll start a thread elsewhere.

@Tc-001
Copy link

Tc-001 commented May 10, 2024

Hmmm... one thing that is not quite clear to me is how you would define an env variable for both client and server.

Something like PUBLIC_SOME_THIRD_PARTY_API="https://example.com/api" that gets imported into custom_fetch.ts that then gets used by both server and client

@florian-lefebvre
Copy link
Member Author

I agree it's not super clear but a client variable can be imported on both client and server

@Tc-001
Copy link

Tc-001 commented May 11, 2024

Oh I see!
Then you probably need to change this: https://github.com/withastro/roadmap/pull/894/files#diff-3c58a592f2b3662c3cda9ddfdd25a71de2a989fff0a5fcebfcd805dee36be9baR68

Otherwise looks really nice!

@florian-lefebvre
Copy link
Member Author

I've updated the RFC to match the current implementation PR state and make things clearer

@ematipico
Copy link
Member

ematipico commented May 13, 2024

@florian-lefebvre thank you for this RFC. I think you mostly covered everything. However, something important that is missing is how an adapter gets to provide their "getter" function: process.env VS Deno.env.

Comment on lines 383 to 397
overrideProcessEnv({
getEnv,
variables: [
{
destKey: 'PUBLIC_NETLIFY',
srcKey: 'NETLIFY',
},
{
destKey: 'PUBLIC_DEPLOY_URL',
srcKey: 'DEPLOY_URL',
default: url,
},
],
});
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think an adapter developer should import overrideProcessEnv like this (from the runtime, using a long import). Instead, I think we should provide this function via hook, maybe in astro:config:done (or another hook before the build). As you explained in the RFC, this is required during the dev/build, and we want to have these environment variables available before generating and rendering an actual page.

This way, we can control it, and we will provide this function only if the feature is enabled by the user and supported by the adapter. Win-win.

We could call it setOverrideProcessEnv.

Also, can you explain what default, destKey, and srcKey are? I looked at the source code and adapter PR, and I couldn't understand them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah my bad I just wanted to get everything out of my head so some things related to the adapter may be messy!

About exposing it in a hook, i see a few issues:

  • It can be called in quite a few hooks, from astro:config:setup to astro:build:setup
  • It does nothing that involves astro (eg. settings)

So I don't know it it should be accessible from all those hooks parameters.

About the namings I'm happy to change them, just wanted something quick that worked to test on the adapters PR but basically:

  • destKey is the key that should be set on process.env
  • srcKey is the key passed to getEnv inside the function. For example on netlify, that's important because NETLIFY is exposed but it has to be PUBLIC_NETLIFY with Astro Env, so it helps reducing boilerplate
  • default is meant to be used as a fallback value, for example for DEPLOY_URL in dev

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • It can be called in quite a few hooks, from astro:config:setup to astro:build:setup

Maybe I worded my thoughts badly, sorry! I meant, let's choose only one hook. If it's required for dev too, maybe astro:config:setup makes more sense.

  • It does nothing that involves astro (eg. settings)

Why? It sets an internal runtime function that is used inside the rendering phase (components that can read the environment variables)

  • destKey is the key that should be set on process.env

  • srcKey is the key passed to getEnv inside the function. For example on netlify, that's important because NETLIFY is exposed but it has to be PUBLIC_NETLIFY with Astro Env, so it helps reducing boilerplate

  • default is meant to be used as a fallback value, for example for DEPLOY_URL in dev

Awesome! I am not very good at naming, so I am happy with the one chosen. It would be great if we could document them

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? It sets an internal runtime function that is used inside the rendering phase (components that can read the environment variables)

It's not! It's just setting process.env so it doesn't call any internal astro stuff.

Maybe I worded my thoughts badly, sorry! I meant, let's choose only one hook. If it's required for dev too, maybe astro:config:setup makes more sense.

Well it's not exactly true. Yes astro:config:setup makes sense for dev but I would also expose it another hook that runs in both dev and build (probably astro:config:done) to allow the netlify build env variable inlining

Awesome! I am not very good at naming, so I am happy with the one chosen. It would be great if we could document them

Yes I'll add some

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If all this does is set a variable on process.env then why do we need an API for this at all? Why don't they just set the variable themselves.

"astro:config:done": () => {
  process.env.PUBLIC_NETLIFY = process.env.NETLIFY;
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I started to do but there are a few issues that will be shared any adapters and so I thought a utility would help. For example, if NETLIFY is undefined here, PUBLIC_NETLIFY will actually be "undefined".
I just created this utility as the logic is a bit annoying, can be used several times per adapter (eg. dev and build) and can lead to errors easily.

https://github.com/withastro/astro/blob/feat/astro-env/packages/astro/src/runtime/server/astro-env.ts#L30-L44

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this is a large enough RFC already, I think adding this extra complexity (and maintenance) is not worth it at this time. I don't think this adds enough value to justify the maintenance in Astro. Integrations are advanced usage so they should be able to manage conditionally setting up environment variables in dev, if needed.


##### `app.setGetEnv`

Under the hood, `getSecret` uses `getEnv` from the runtime. It can be overrided by calling `app.setGetEnv`:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why make this part of the App class? The App class is for rendering. Setting environment variables is unrelated to rendering for my hosts.

In addition, doing it this way means that variables can only be accessed inside of a render function. If you try to access a variable in the top-level of a module like so:

import { getSecret } from 'astro:env/server';

const API_URL = getSecret('API_URL');

This will result in the variable being undefined because setGetEnv is not called until the very end of the bundle.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially I added getEnv to app.render options to have it per request but then moved it to a method. I thought calling app.setGetEnv right after the instantiation of App would be enough, where should that go then?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we don't need the function app.setGetEnv, and just expose the function setGetEnv from astro/env.

Users can still call setGetEnv from anywhere inside their code, even during the rendering phase, and at the top level module.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we don't need the function app.setGetEnv, and just expose the function setGetEnv from astro/env.

Alright so we can't use astro/env as it's already taken by a dts (https://github.com/withastro/astro/blob/main/packages/astro/package.json#L34), what export should we use then?

Users can still call setGetEnv from anywhere inside their code, even during the rendering phase, and at the top level module.

As for matthew's example, it will still fail for Cloudflare I believe but I guess it's just a Cloudflare limitation in general.

Users can still call setGetEnv from anywhere inside their code, even during the rendering phase, and at the top level module.

Currently, I'm calling setGetEnv with an implementation that throws (to fail if an adapter does not support getEnv) in the App constructor as I assumed it would run before anything else. Where should I put it instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, top-level usage is probably not possible in Cloudflare. That's just a limitation they've chosen, and shouldn't affect other runtimes.

As for where this lives, since adapters are built through Vite, is there any reason they can't just use a astro:env export? astro:env/setup or something.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only issue I see is typing it, I guess it should not be done through typegen?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure you can add types for this, we have them in packages/astro/client.d.ts

@theoephraim
Copy link

Hello fellow Astronauts!

We over @dmno-dev have been very deep in this specific problem space for the last few months, building out a platform-agnostic configuration tool meant to be a unified system for your entire stack. You can read more at dmno.dev and specifically check out our Astro integration. While our solution may be overkill for some folks, we hope it becomes the go-to solution for anyone working in a monorepo or wanting some tools with a bit more than what is built-in.

Regardless, we are super happy to see the progress on this RFC and for Astro to have amazing first-class env var handling.

Apologies for being a little late to the party, as I know some excellent implementation work by @florian-lefebvre is already underway, but I wanted to share a few thoughts here (that I’ve shared with him already) from what we've learned while building out DMNO.

Please understand this is all meant to be constructive, simply to provoke discussion, and get the best possible outcome for Astro users.


Challenges of hybrid rendering

In any SSR app (output server or hybrid) it can be very difficult to reason about where/when certain code will be run and/or bundled. In fact it often changes because the parent page/component or the site settings rather than what is happening in the component itself.

This presents challenges on two fronts - preventing leaks, and reasoning about if config will be actually dynamic (vary at boot time) or not.

There just is no easy way to enforce that the script part of a component has access to secrets, but only when being run on the server... Or to ensure that a config item you want to vary dynamically at boot time was not used during a pre-render of a page, effectively making it static even though it was not "bundled" into the code.

When the line between client and server can be so blurry, I think it will lead to confusion to say an item is a “server” or “client” item, or vary behaviour depending on how it is accessed.

Instead I think we should rely on a clear config schema to express the user's intent, build the system to do the right thing whenever possible, and to protect the user from anything unexpected.

Config Schema

When we think about how to classify our config items, to me it really boils down to 2 concepts:

  • Is this data "sensitive" (vs "public") (alt terms "secret","private")
    meaning do we need to handle it sensitively and make sure it is never leaked
    This obviously implies you can never access the item from the client. But considering we are in a world of hybrid server and client rendering and items may be used in various code that could be run in both, it’s just not that simple, and it is often not obvious while authoring code whether it will be running in the client or server. The important thing is, we handle the data with care and never leak the value in ANYTHING sent over the wire.

  • Is the item "dynamic" (vs "static") (alt related terms “build” vs “boot” time, "runtime", “bundled”)
    meaning is the value is always loaded at boot time and never bundled into built or pre-rendered code
    Obviously if you are building a static site, everything will be replaced at build time… But if not, you may have items that you want to replace, and some which you may want to dynamically change using env vars at boot time. I do not believe this is as tied to whether something is sensitive as the current proposal and most frameworks (ex: PUBLIC_/NEXT_PUBLIC_) assume, likely because our tooling mostly originated in a SSG, totally static era.

The important part here is that these 2 concepts are fairly easy to reason about and totally decoupled

In DMNO this is accomplished by marking items with sensitive: true/false and dynamic: true/false along with project-level settings to define the default behaviour (ex: default_static vs default_dynamic).

It should be noted that most users probably won't care about this dynamic/static stuff, and the default behaviour can still be set to public_static, which matches what most are used to (sensitive items are dynamic and public items are static).

The Edge cases

  • sensitive+static - it is less common, but I see no reason to not let a user have a sensitive config item affect a build and tree-shaking
    Re: security - yes if someone has access to the source code, they could see a secret that has been "baked" at build time, but in the case of deploying code to a server you control, or publishing a docker container to a private registry, those secrets would not be compromised.

  • public+dynamic - this presents some challenges as they would need to be loaded by the client at runtime, but it certainly possible
    Re: performance - these can be fetched on demand and users can decide to use them sparingly, not on page load

You can read more about DMNO's handling of this in our dynamic config guide.

Access patterns

The current proposal has multiple methods of accessing certain config items which is supposed to help achieve safety. But this means you must think about which access pattern to use, and changing the schema of an item may necessitate changing how it is accessed throughout your code.

By using a combination of proxy objects and build-time rewrites, the user gets to access everything exactly the same way, and it all just works as intended. In DMNO, we inject 2 globals (which could of course be imported from virtual modules instead)

  • DMNO_CONFIG (could be ALL_CONFIG) - includes every config item
  • DMNO_PUBLIC_CONFIG (could be PUBLIC_CONFIG) - includes only non-sensitive items

NOTE - while the extra PUBLIC object is not strictly necessary, I do think it's helpful when browsing what items can be safely used on the client, and it's a bit closer to what folks are used to with the PUBLIC_ prefix.

The injected proxy objects mostly let us give the user more helpful error messages (ex: “that item doesn’t exist” vs “that item exists but it is sensitive”), and it handles loading of any dynamic-public items. Loading these dynamic config items in the client uses a blocking http request so that we can maintain the same DX, rather than needing to do something like await getConfigItem(‘DYNAMIC_THING’).

Protection from footguns

Once we have a full schema to express the user's intent, we can effectively protect the user from doing the wrong thing, without having to think too much:

  • we only replace "static" items at build time, both on the client and the server
  • ALL_CONFIG is not available on the client, although we do inject a proxy object that gives helpful error messages
  • any dynamic+public items accessed on the client load from a blocking http request, on demand via that proxy object
  • We inject a middleware to detect any leaked secrets before it goes out over the wire
  • We can inject a vite build hook to detect any leaked secrets in any built assets
  • We can check if any dynamic config items were used during a pre-render and throw or warn the user

PUBLIC_ prefix

One more note on the current proposal. Once we've introduced a proper schema, I don't think it is helpful to enforce a naming convention. I'd recommend keeping one or the other as it just adds another source of truth and opportunity for confusion. Personally with the rest of the system keeping you safe, I'd recommend relying on the schema, as it means you only update the schema, not where the config is accessed.

In DMNO, our type system allows extendable types to be marked as sensitive - for example any instance of a StripeSecretTokenDataType is always going to be sensitive - so we must use the schema


Hopefully this is helpful, at the very least to kick off some discussion. And please if this sounds interesting, head over to our Astro integration guide and give it a try. We're happy to get deep into the weeds with folks and to help in any way we can!

Also please don't hesitate to reach out by email or on the Astro Discord.

Cheers!

@florian-lefebvre florian-lefebvre changed the title rfc: Astro Env Astro Env May 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants