getZxy with Azure Function app entry point #197

Open

tengl wants to merge 15 commits into main

Conversation

tengl commented Jun 16, 2023

This PR adds code that can be used to deploy protomaps on Azure. See README.md for more info.

It is intended to be deployed in an Azure Function App that is exposed through a CDN.

The pmtiles file can be stored at any URL, but if you use Azure then you probably want to deploy it to Azure Blob Storage. Just make sure the function app has access to the container/file.

This exact code has not been tested. I just used it as a base for a Function App, which has some modifications that I didn't want to include here.

Most of the code should work for any Source and any back-end, with minor modifications.
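
For orientation, a minimal sketch of what such an entry point can look like with the Azure Functions v4 Node.js programming model and the pmtiles library; the route, environment variable, and content type below are illustrative, not necessarily the exact code in this PR:

import { app, HttpRequest, HttpResponseInit, InvocationContext } from "@azure/functions";
import { PMTiles } from "pmtiles";

// Illustrative: PMTILES_PATH points at an HTTP-hosted archive (e.g. a blob URL).
const p = new PMTiles(process.env.PMTILES_PATH || "");

app.http("tiles", {
  methods: ["GET"],
  authLevel: "anonymous",
  route: "tiles/{z}/{x}/{y}.mvt",
  handler: async (request: HttpRequest, context: InvocationContext): Promise<HttpResponseInit> => {
    const z = Number(request.params.z);
    const x = Number(request.params.x);
    const y = Number(request.params.y);
    const tile = await p.getZxy(z, x, y);
    if (!tile) {
      return { status: 204 };
    }
    return {
      status: 200,
      body: new Uint8Array(tile.data),
      headers: { "content-type": "application/x-protobuf" },
    };
  },
});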

bdon (Member) commented Jun 16, 2023

Thanks, this looks like a good starting point. The two vital features needed are

  1. the azure function needs to be able to authenticate with a private Blob Storage bucket - were you able to accomplish this?
  2. The azure function needs to integrate with an edge-cached CDN, which I think in Azure requires Front Door - or are you using a different CDN?

tengl (Author) commented Jun 17, 2023

Thanks, this looks like a good starting point. The two vital features needed are

  1. the azure function needs to be able to authenticate with a private Blob Storage bucket - were you able to accomplish this?
  2. The azure function needs to integrate with an edge-cached CDN, which I think in Azure requires Front Door - or are you using a different CDN?
  1. No, I haven't tried it. I used a public container, but it should be possible with a private container and a SAS token.
  2. I used Azure Classic CDN. It is cheaper and is probably better for static content.

I will try with SAS token next week and get back to you.

bdon (Member) commented Jun 17, 2023

@tengl the preference is for the authentication between Azure Functions and Blob Storage to be role-based, without needing to specify a hardcoded key/secret. In Cloudflare this is done via Workers Bindings and in AWS via IAM roles.
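
For reference, the no-secret pattern on Azure would look roughly like this: the Function App's managed identity is granted a role such as Storage Blob Data Reader on the storage account, and the SDK picks it up via DefaultAzureCredential (a sketch; the environment variable names are illustrative):

import { DefaultAzureCredential } from "@azure/identity";
import { BlobServiceClient } from "@azure/storage-blob";

// No key or SAS token in configuration: DefaultAzureCredential resolves the
// Function App's managed identity when deployed (and e.g. an Azure CLI login locally).
const blobService = new BlobServiceClient(
  `https://${process.env.AZURE_STORAGE_ACCOUNT_NAME}.blob.core.windows.net`,
  new DefaultAzureCredential()
);
const container = blobService.getContainerClient(process.env.AZURE_STORAGE_CONTAINER_NAME || "");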

The principle of the AWS and Cloudflare integrations is to present a well-tested and uniform solution for deploying protomaps within a single provider. It is of course possible to mix-and-match - e.g. Azure Functions serving from an AWS S3 Bucket - but the cost model is less efficient and it becomes difficult to debug across providers. So an integrated solution for Azure will be extremely valuable.

tengl (Author) commented Jun 19, 2023

I see. I guess it should be possible using this tutorial, https://learn.microsoft.com/en-us/azure/app-service/tutorial-connect-app-access-storage-javascript?tabs=azure-portal. I've only browsed through it, so I'm not sure.

We could make an AzureStorageSource and use that instead of the FetchSource. But I will probably not have time to do it this week; we'll see if I can manage next week. Feel free to modify the PR if you have the time.

The SAS token and private container work, as expected. We will go with that solution until we find a way to use the storage blob SDK.

tengl (Author) commented Nov 17, 2023

@bdon I just pushed some updates. I made an AzureStorageSource that should be able to use managed identities. I haven't tested it properly yet, but I will hopefully do it next week.

It still supports FetchSource.
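
For readers following along, a sketch of what such a source might look like (not necessarily the code in this PR; the class shape and names are illustrative):

import { DefaultAzureCredential } from "@azure/identity";
import { BlobClient } from "@azure/storage-blob";
import { RangeResponse, Source } from "pmtiles";

class AzureStorageSource implements Source {
  blobClient: BlobClient;
  key: string;

  constructor(accountName: string, containerName: string, blobName: string) {
    this.key = `${accountName}/${containerName}/${blobName}`;
    this.blobClient = new BlobClient(
      `https://${accountName}.blob.core.windows.net/${containerName}/${blobName}`,
      new DefaultAzureCredential()
    );
  }

  getKey(): string {
    return this.key;
  }

  async getBytes(offset: number, length: number): Promise<RangeResponse> {
    // Ranged download of exactly the requested bytes.
    const resp = await this.blobClient.download(offset, length);
    const chunks: Buffer[] = [];
    for await (const chunk of resp.readableStreamBody!) {
      chunks.push(chunk as Buffer);
    }
    const buf = Buffer.concat(chunks);
    // Copy into a right-sized ArrayBuffer (Node Buffers may share a larger pooled buffer).
    return { data: new Uint8Array(buf).buffer, etag: resp.etag };
  }
}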

tengl (Author) commented Nov 20, 2023

@bdon I pushed some fixes. It seems to work now, at least on one of my pmtiles files.

I've tested two different pmtiles files, both in local debug and on Azure. All combinations work when using PMTILES_PATH with FetchSource, as well as when using AzureStorageSource (Azure Storage managed identities) by specifying the storage details.

A caveat though: it doesn't work if multiple maps are enabled and they mix sources. In that case I get an error saying "incorrect header check" for one of the pmtiles files. It probably has something to do with caching.

A note: using FetchSource and SAS token URLs seems to be a lot faster than using Azure Managed Identities. I'm sure (or at least hope) that it can be optimized, because right now it is too slow.

I consider it done, but there are some fixes for edge cases and optimizations to be made. I won't have the time to continue work on this at the moment.

bdon (Member) commented Nov 21, 2023

A caveat though: it doesn't work if multiple maps are enabled and they mix sources. In that case I get an error saying "incorrect header check" for one of the pmtiles files. It probably has something to do with caching.

What does "mix sources" mean? In the AWS and Cloudflare setups, a single instance of the function can serve multiple PMTiles files, so the cache in global memory needs to key portions of each file based on the filename.

tengl (Author) commented Nov 21, 2023

What does "mix sources" mean? In the AWS and Cloudflare setups, a single instance of the function can serve multiple PMTiles files, so the cache in global memory needs to key portions of each file based on the filename.

I made it a bit more flexible, so that the code can accept any kind of source. You can have one pmtiles file served by a FetchSource and another served by an AzureStorageSource. The file is resolved by appending {map_name} to the parameter. If I only use one type of source for multiple pmtiles files, it works, but if I have one FetchSource and one AzureStorageSource I get the error. I've tried both ResolvedValueCache and SharedPromiseCache, so the problem should not be related to the caches.
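
For context, a sketch of the keying bdon describes: one shared cache, one PMTiles instance per map name, each backed by its own source whose getKey() is unique (helper names are illustrative):

import { PMTiles, ResolvedValueCache, Source } from "pmtiles";

const cache = new ResolvedValueCache();
const instances = new Map<string, PMTiles>();

// Cached header/directory entries are keyed by the source's getKey(), so each
// map name must resolve to a source with a distinct key.
function getPMTiles(mapName: string, makeSource: (name: string) => Source): PMTiles {
  let p = instances.get(mapName);
  if (!p) {
    p = new PMTiles(makeSource(mapName), cache);
    instances.set(mapName, p);
  }
  return p;
}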

tengl (Author) commented Nov 22, 2023

@bdon I fixed it now. There was something wrong with the buffer. I don't really understand what, but if you look at commit 6dd2e61 maybe you can figure it out.

Before that change, the data.buffer.length would always be 8192, even if data.length was 1905.
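
For anyone hitting the same thing: 8192 bytes is the size of Node's shared Buffer pool, and data.buffer is that pooled ArrayBuffer rather than just the bytes that were read, so the exact range has to be sliced out by byteOffset/byteLength. A small illustration (stand-in data, not the PR's code):

import { Buffer } from "node:buffer";

// Small Buffers are carved out of Node's shared 8 KiB pool, so chunk.buffer is
// larger than the chunk itself.
const chunk: Buffer = Buffer.from("not the whole pool");
console.log(chunk.buffer.byteLength, chunk.byteLength); // e.g. 8192 vs 18

// Slice the underlying ArrayBuffer down to exactly the bytes of this chunk.
const exact = chunk.buffer.slice(chunk.byteOffset, chunk.byteOffset + chunk.byteLength);
console.log(exact.byteLength); // 18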

break;
}

if (p_header.etag) {
bdon (Member) commented:

I don't think we want this. If I am reading this correctly it will pass the etag of the entire archive as the etag of every tile; this means the etag is no longer properly content-aware since tiles will almost always be different.

tengl (Author) commented Nov 24, 2023:

No, I've changed that to use the tileData.etag, but haven't pushed the change. I've also added a Last-Modified, which is the reason I haven't pushed it yet. I need #304 to be merged first.

But I don't think this is a problem, because the source pmtiles file will not change, so every tile can have the same ETag. It actually will have the same ETag, at least if you use FetchSource or AzureStorageSource, because the ETag comes from the file in storage, not the bytes. The ETag is unique in combination with the URL, so it will work in browsers. If you change the pmtiles file, the ETag will change for all tiles.

But anyway, by using tileData.etag it will be fine if some other source has unique ETags for each tile.

);

let allowed_origin = "*";
if (typeof process.env["ALLOWED_ORIGINS"] !== "undefined") {
bdon (Member) commented:

This might not interact with a caching CDN in front correctly - if we store the Access-Control-Allow-Origin in the CDN cache, we need to invalidate the entire cache any time we change the set of allowed hosts. For AWS we can work around this by letting CloudFront dynamically handle CORS, for Cloudflare the Workers code can determine the correct header on every invocation. Is there any Azure CDN functionality to do dynamic CORS like CloudFront?

tengl (Author) commented:

Hmm, I need to check that. I used * in our case. I'm pretty sure Azure CDN can set the header dynamically.

Also, I see that the ALLOWED_ORIGINS parameter is used in a different place as well.

tengl (Author) commented:

I asked ChatGPT and got this answer:

  1. Vary Header: If your service correctly implements CORS, it should include a Vary: Origin header in its responses. This header tells caches, including CDNs, that the response varies based on the value of the Origin request header. When a CDN sees this, it should cache separate copies of the resource for each unique value of the Origin header. In this case, a request from example.com would hit one cache entry, and a request from other.com would hit a different cache entry, assuming both are allowed origins.

  2. Caching Without Vary Header: If your service does not include a Vary: Origin header, the CDN might cache the response to the first request and then serve that cached response to the second request, regardless of the origin. This can lead to incorrect Access-Control-Allow-Origin headers being sent to clients and potential CORS errors in the browser.

  3. CDN Configuration: Some CDNs offer more advanced configurations that can help manage CORS headers more effectively. For example, you might be able to configure the CDN to dynamically alter the Access-Control-Allow-Origin header based on the origin of the request. However, this capability depends on the specific CDN and its feature set.

  4. Potential Issues: Without proper handling, you might encounter issues where a user from an unallowed origin receives a cached response indicating they are allowed, or a user from an allowed origin receives a response indicating they are not allowed, due to the CDN serving a cached response intended for a different origin.

  5. Testing and Verification: It's crucial to test how your CDN handles requests from different origins, especially if you are dynamically setting the Access-Control-Allow-Origin header. Verify that the correct headers are returned and that CORS is functioning as expected for users from different domains.

tengl (Author) commented:

In summary, I think that in this case it is probably better to have the CDN handle CORS. That way the tile will be cached in the CDN, based on URL and cache-control headers, and subsequent requests will use the cached entry (I assume).
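
For completeness, a sketch of the per-request approach discussed above, if the function (rather than the CDN) handles CORS; ALLOWED_ORIGINS here is assumed to be a comma-separated list, and Vary: Origin tells a caching CDN to cache per requesting origin:

function corsHeaders(requestOrigin: string | null): Record<string, string> {
  const allowed = (process.env.ALLOWED_ORIGINS || "*").split(",").map((o) => o.trim());
  const headers: Record<string, string> = { vary: "origin" };
  if (allowed.includes("*")) {
    headers["access-control-allow-origin"] = "*";
  } else if (requestOrigin && allowed.includes(requestOrigin)) {
    headers["access-control-allow-origin"] = requestOrigin;
  }
  return headers;
}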

function getAzureDetails(mapName: string) {
let defaultAccountName = process.env.AZURE_STORAGE_ACCOUNT_NAME;
let defaultContainerName = process.env.AZURE_STORAGE_CONTAINER_NAME;
let defaultBlobName = process.env.AZURE_STORAGE_BLOB_NAME;
bdon (Member) commented:

In the AWS and Cloudflare implementations there is a hard constraint of 1 function = 1 bucket. It looks like this behavior is something specific to your use case in Azure? I would strongly prefer as close to 100% parity between AWS/Cloudflare/Azure as possible, meaning there shouldn't be any way to override using ${mapName} - was there a situation where the existing design did not meet your requirements?

tengl (Author) commented Nov 24, 2023:

The reason I did that is that we need to specify a blob name, which is a path that could be maps/basemaps/protomaps/20231124.pmtiles for map name default and maps/custom_maps/rail/europe_20231123.pmtiles for map name rail.

If we want the function to support multiple maps by ${mapName} in the URL, I can't think of an easy way to map that. I would prefer if the Function App didn't decide how we organize our files, like it does in the AWS and Cloudflare cases. A suggestion is to keep the AZURE_STORAGE_BLOB_NAME_${mapName} override, and let the users only specify one specific storage account and container.

I don't have a strong opinion in this case, I'll let you decide the solution. We don't store custom maps in the same folder as basemaps and the files don't have the same name. But we use PMTILES_PATH anyway, because it is faster.

bdon (Member) commented:

Sorry for the late reply (I'm also going offline for a week tomorrow but will prioritize this when I get back)

I would prefer if the Function App didn't decide how we organize our files

Then the easiest thing to do is to map the bucket key directly to the URL. It looks like you want to perform "aliasing", where a presentation name like map1 in the URL hits another key in the storage bucket. If we support this feature in Azure then we need to make it work in CF and AWS as well. This kind of aliasing could happen outside this code function logic in your own application-specific code.

tengl (Author) commented:

Then the easiest thing to do is to map the bucket key directly to the URL. It looks like you want to perform "aliasing", where a presentation name like map1 in the URL hits another key in the storage bucket. If we support this feature in Azure then we need to make it work in CF and AWS as well. This kind of aliasing could happen outside this code function logic in your own application-specific code.

Well, you could say that it is in the application code, because getAzureStorageSource is called in the application code and you can easily implement your own version; the source takes the details as parameters. But I can change it so that the default getAzureStorageSource doesn't support "aliasing" and you would have to implement your own version if you need more advanced logic. I'll also change it so that it just replaces {name} in the blob name, just like the other sources.
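
A sketch of that final shape, matching how the other sources substitute the map name into a configured template (the environment variable and placeholder are illustrative):

// One storage account/container, one blob-name template; {name} is replaced with
// the map name from the URL, with no per-map aliasing in the library code.
function resolveBlobName(mapName: string): string {
  const template = process.env.AZURE_STORAGE_BLOB_NAME || "{name}.pmtiles";
  return template.replace("{name}", mapName);
}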

tengl (Author) commented Dec 7, 2023:

Just pushed some changes to make it more like Cloudflare and AWS implementations. No environment variables (well, almost) are read in the main "library code", but there is a default implementation to get the Azure storage details, so the users can use it or develop their own version if they like.

I removed support for .pbf since I figure no one uses AzureStorageSource yet, so there is no need to be backwards compatible.

No need to rush it, look at it when you have the time.

bdon (Member) commented Feb 9, 2024

Finally revisiting this, some thoughts:

  • It looks like since SAS tokens are the only efficient way to authenticate requests to a storage account / container, we might as well use raw Fetch. The FetchSource from the pmtiles library is, as of v3.0.0, specific to browsers, so we probably want our own implementation that just does something like this:
  async getBytes(
    offset: number,
    length: number,
    signal?: AbortSignal,
    etag?: string
  ): Promise<RangeResponse> {
    const requestHeaders = new Headers();
    requestHeaders.set("range", `bytes=${offset}-${offset + length - 1}`);
    if (etag) requestHeaders.set("if-match", etag);

    const url = `https://${storageAccountName}.blob.core.windows.net/${containerName}/${this.archiveName}${sasToken}`;

    let resp = await fetch(url, {
      signal: signal,
      headers: requestHeaders,
    });

    // (sketch of the rest) surface the bytes and caching headers as a RangeResponse
    if (resp.status >= 300) {
      throw new Error(`Bad response code: ${resp.status}`);
    }

    return {
      data: await resp.arrayBuffer(),
      etag: resp.headers.get("etag") || undefined,
      cacheControl: resp.headers.get("cache-control") || undefined,
    };
  }

The SAS token can be scoped to a container's read operations, but it looks like the way to use them is to set an expiration date far into the future. Is there any alternative SAS token generation feature equivalent to AWS Roles?
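
For reference, generating such a long-lived, read-only SAS with the SDK looks roughly like this (account-key based; the names and expiry date are illustrative):

import {
  BlobSASPermissions,
  StorageSharedKeyCredential,
  generateBlobSASQueryParameters,
} from "@azure/storage-blob";

// Read-only SAS scoped to a single blob, signed with the account key.
function makeReadSasUrl(
  accountName: string,
  accountKey: string,
  containerName: string,
  blobName: string
): string {
  const sas = generateBlobSASQueryParameters(
    {
      containerName,
      blobName,
      permissions: BlobSASPermissions.parse("r"),
      expiresOn: new Date("2034-01-01"), // "far into the future", as noted above
    },
    new StorageSharedKeyCredential(accountName, accountKey)
  );
  return `https://${accountName}.blob.core.windows.net/${containerName}/${blobName}?${sas.toString()}`;
}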

bdon (Member) commented Feb 9, 2024

Also, ideally we can deploy the code as-is using az functionapp deployment source config-zip -g RESOURCE_GROUP -n FUNCTIONAPP_NAME --src ZIPFILE.zip - it seems like it expects a certain zip file structure for NodeJS apps, were you able to deploy using the CLI or are you using some IDE integration?

tengl (Author) commented Feb 9, 2024

  • It looks like since SAS tokens are the only efficient way to authenticate requests to a storage account / container, we might as well use raw Fetch.

Yes, we use SAS tokens and it's working great so far.

The FetchSource from the pmtiles library is, as of v3.0.0, specific to browsers,

I didn't realize that :-/. We used the FetchSource and it's working fine.

so we probably want our own implementation that just does something like this:

Yes, that might be good enough. The only thing (I think) specific to Azure Functions is the entry point; everything else should be generic.

I think the only parameter to the function should be the URL, which includes the SAS token, because that's how you copy it from the Azure Portal. No need to split it up into multiple parameters. It also makes it easier to support other hosting scenarios; only the entry point needs to be customized, as long as the pmtiles file is hosted via HTTP.

The SAS token can be scoped to a container's read operations, but it looks like the way to use them is to set an expiration date far into the future. Is there any alternative SAS token generation feature equivalent to AWS Roles?

Not that I am aware of, apart from Managed Identities.

tengl (Author) commented Feb 9, 2024

Also, ideally we can deploy the code as-is using az functionapp deployment source config-zip -g RESOURCE_GROUP -n FUNCTIONAPP_NAME --src ZIPFILE.zip - it seems like it expects a certain zip file structure for NodeJS apps, were you able to deploy using the CLI or are you using some IDE integration?

I used the VS Code integration. In my experience that is the best way if you want to make changes to the code. But I'm no expert in Azure Functions, I just do the minimum to get the job done :-)

If we want to deploy directly to Azure, the CLI is probably the best way. Maybe just create bash and ps1 scripts for that?

bdon (Member) commented Feb 9, 2024

I didn't realize that :-/. We used the FetchSource and it's working fine.

It should work fine, but it has some logic that is irrelevant for non-browser usage, and it won't have the efficient ETag handling that we just added to AWS / Cloudflare.

Another option for Azure is to use Azure Container Apps and not use Functions at all, but I don't use Azure enough to get a sense of which is better supported. Azure Functions seems like a mess with multiple runtimes, versions and upload methods, but Containers might have slow start times.

const p_header = await p.getHeader();

if (
(ifNonMatchEtag && ifNonMatchEtag === p_header.etag) ||
bdon (Member) commented Feb 12, 2024:

@tengl we are working through the ETag logic in protomaps/go-pmtiles#137 and I think the behavior here may not be intended. While it's OK for all the individual tiles to have the same ETag as the whole archive as implemented here, we can't short-circuit the 304 response, because the underlying archive may have changed on storage; in other words every request to the Azure Function must make at least one request to the storage blob to properly reflect the present state of the ETag.

Given that this optimization isn't possible any longer I suggest that we converge the PMTiles decoding implementations on generating content-based etags for each unique tile, like the Lambda function does now: https://github.com/protomaps/PMTiles/blob/main/serverless/aws/src/index.ts#L238

This would also provide the benefit that updating an archive does not invalidate all etags, any tiles that remain the same across archives can be cached by intermediate CDNs.

Would making this change to unique ETags still work for your deployment?

tengl (Author) commented Feb 12, 2024:

Well, the Azure Function would always have to make a request to the storage in order to detect changes, right? Isn't that the point of ETags? The only thing they save is the data transfer. I don't understand how you can short-circuit the 304 response and still detect changes in the archive.

So the process is something like this?

  1. Client request tile without ETag.
  2. Request tile data from archive.
  3. Store the ETag for the archive.
  4. Calculate and store ETag for the tile.
  5. Return tile ETag to client.
  6. New client request for same tile with ETag.
  7. Request tile data from archive with ETag for archive.
  8. Calculate ETag from archive data, same as client request. Return 304.
  9. UPDATE OF ARCHIVE.
  10. Client request same tile with ETag.
  11. Request tile data from archive, with ETag for archive.
  12. ETag mismatch, new data in response.
  13. Overwrite ETag for archive.
  14. Calculate new ETag for tile.
  15. Check if client ETag is same as new calculated ETag.
  16. If yes, return unchanged, if no, return data and new ETag.

That saves the data transfer between the Azure Function and the client, but not from the storage blob to the Azure Function.

I don't think we need to worry about my deployment; I use my own repo, so I can change the code however I want, and it makes it easier to deploy using my current tooling.

bdon (Member) commented:

My understanding of the implementation on line 67 is that it short-circuits based on the archive ETag matching the If-None-Match request header, but the call to p.getHeader() could be returning a cached, not fresh, value. So in steps 10-12 above, the Azure Function will short-circuit and not detect the mismatch at step 12. Am I interpreting that right or missing something? (It's been a while since I've looked at this part of the code.)

tengl (Author) commented:

No, you're right. We don't want that.

But how would unique ETags work? Like the process I described above?

bdon (Member) commented:

For an Azure Function written in JavaScript, it would call one of the Node standard library hashing functions, such as md5, on the tile contents, then set that as a strong ETag header. So this would add some computational overhead to every request. Same as how the Lambda works here.
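
Concretely, something along these lines (a sketch; helper names are illustrative):

import { createHash } from "node:crypto";

// Strong, content-based ETag computed from the tile bytes.
function tileEtag(tileData: ArrayBuffer): string {
  return '"' + createHash("md5").update(Buffer.from(tileData)).digest("hex") + '"';
}

// Answer conditional requests against the per-tile ETag.
function isNotModified(ifNoneMatch: string | undefined, etag: string): boolean {
  return ifNoneMatch !== undefined && ifNoneMatch === etag; // respond 304 when true
}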

tengl (Author) commented Feb 13, 2024:

Yes, but the overhead would also be a data transfer from the storage to the function. Every request needs to calculate the ETag with the tile data, right? Unless we don't check the ETag of the archive.

An optimized version just for HTTP (Fetch) sources could do an HTTP HEAD request at regular intervals to check whether the ETag for the archive has changed. If it has, invalidate the cache and load new data; otherwise short-circuit and return 304. The tiles could still have individual ETags, and they could still be the same after the archive has changed. That would save a data transfer from the function to the client if the ETag is still the same. But I don't know how that would be implemented and still fit other sources.
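
A rough sketch of that idea for an HTTP source (the interval, names, and invalidation hook are all illustrative, and as noted it would not fit non-HTTP sources):

let archiveEtag: string | undefined;

// Periodically HEAD the archive URL; if its ETag changed, let the caller
// invalidate cached headers/directories before serving new requests.
async function refreshArchiveEtag(url: string, onChange: () => void): Promise<void> {
  const resp = await fetch(url, { method: "HEAD" });
  const etag = resp.headers.get("etag") || undefined;
  if (archiveEtag !== undefined && etag !== archiveEtag) {
    onChange();
  }
  archiveEtag = etag;
}

// e.g. setInterval(() => refreshArchiveEtag(ARCHIVE_URL, invalidateCaches), 60_000);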

bdon (Member) commented:

I think these are two different perspectives on the purpose of the Azure Function:

  1. It is a "view" into the storage, not a cache. So every request into the function generates at least one data transfer to the underlying storage. There is caching going on, but it is the minimal necessary parts (headers, directories) to accomplish efficient behavior. It is paired with a standard HTTP cache in front of it like Azure CDN or CloudFront.
  2. It itself acts like a cache and is allowed to return stale data. In this case you can optimize away some requests by using the header ETag, at the cost of possibly returning a version that is older than the underlying storage.

tengl (Author) commented:

  1. Yes, I agree that it is a view and that it should cache as little as possible. Leaving all caching to the CDN is fine.
  2. Sure, depending on how long the ETag for the archive is valid, but I suspect the CDN will return stale data more often (depending on Cache-Control values, I guess). Keeping the function simple is probably better than optimizing for the cases where the CDN doesn't hit the cache.

Anyway, the function is pretty fast, normally about 200-300 ms round-trip time but it can take up to 700 ms. From my very rudimentary tests, I think that at least half of that time is between the CDN and the function, and the rest is between the function and storage.

My subjective impression is that the map feels pretty responsive even when navigating in areas that the CDN hasn't cached. So the conclusion is:

  1. Keep the function as simple as possible, translate the incoming request and fetch the data from the storage.
  2. Recalculate the ETag from the tile content.
  3. Avoid caching.
  4. I suggest the configuration of the function only needs a URL to a file (with SAS token), which means that most of the code can be used in other hosting environments that have HTTP-hosted archives.

bdon (Member) commented Mar 18, 2024

FYI, I have been mostly working on the Google Cloud integration which uses a Docker container on Docker Registry.

https://docs.protomaps.com/deploy/google-cloud

If this deployment method works scale-to-zero on Azure Containers, that is preferable over a custom Function App because it's simply less code. But I haven't succeeded in deploying this (or getting a scale-to-zero config). Anyone with Azure insight want to try it out?
