Proxy-Wasm

Table of Contents

What is Proxy-Wasm?

Proxy-Wasm is a set of binary specifications and conventions used to extend L4/L7 proxies with WebAssembly. These extensions are called "filters". Originally developed for WebAssembly support in Envoy, the specification ensures filters are "proxy-agnostic", which allows for a degree of portability of these extensions between multiple proxy runtimes (e.g. Envoy & Nginx).

Because it sits between the low-level proxy runtime and the filter code, the Proxy-Wasm specification is made of two complementary parts: a host ABI and an SDK library. The host ABI exposes low-level parts of the proxy to the SDK library, which is in turn used to write the filters. There is one Proxy-Wasm SDK library for each supported language. All filters are built atop the SDK library's abstractions, which are themselves powered by the ABI and the underlying runtime.


Schema of Proxy-Wasm SDK structure

Fig.1 - Proxy-Wasm SDK Structure


Presently, the Proxy-Wasm SDK libraries are implemented in the following languages:

  1. AssemblyScript
  2. C++
  3. Go (TinyGo)
  4. Rust

The Proxy-Wasm host ABI is implemented by Envoy and a handful of other proxies, including ngx_wasm_module.

Other useful Proxy-Wasm resources:

  • SDK specifications - the complete Proxy-Wasm (SDK & ABI) specification is hosted and maintained in this GitHub repository.
  • Also see Examples below for comprehensive and complete filters examples.

Back to TOC

Proxy-Wasm in ngx_wasm_module

Because of its existing adoption in proxies and its tailored "reverse-gateway" design, Proxy-Wasm is the prominent way of extending Nginx with WebAssembly in ngx_wasm_module. WasmX itself aims at introducing other innovations in Nginx, but a large part of ngx_wasm_module is dedicated to its Proxy-Wasm implementation.

Back to TOC

Filter Entrypoints

Each SDK library provides a framework to implement a Proxy-Wasm filter in a given language. Filters are written by implementing "entrypoints" or "callbacks" that are hooked into the proxy lifecycle (e.g. on_request_headers, on_response_body). These extension points are part of an interface defined by the Proxy-Wasm specifications and exposed as a framework by the SDK libraries.

The Proxy-Wasm specifications allow for entrypoints within several contexts:

  • A Root context, representing the filter within the proxy process. This context is unique for each filter configured in ngx_wasm_module. It exposes the configuration entrypoints on_vm_start, on_configure, as well as on_tick.
  • An HTTP context, representing the filter within a request/response lifecycle. This context is unique for each request traversing the filter, and exposes entrypoints such as on_request_headers, on_response_body, and on_http_call_response.
  • A Stream context, representing the filter within a connection lifecycle in one of the L4 protocols. Presently, no support for the stream context is provided in ngx_wasm_module.

As an example, let's see what a filter and its contexts would look like if it were written with the Rust SDK Library (version 0.2.1):

// lib.rs
use proxy_wasm::{traits::*, types::*};

struct MyRootContext {}
struct MyHttpContext {}

proxy_wasm::main! {{
    // the wasm instance initial entrypoint
    // ...
    proxy_wasm::set_log_level(LogLevel::Info);
    // create and set the root context for this filter
    proxy_wasm::set_root_context(|_| -> Box<dyn RootContext> {
        Box::new(MyRootContext {})
    });
}}

// implement root entrypoints
impl Context for MyRootContext {}
impl RootContext for MyRootContext {
    fn on_configure(&mut self, config_size: usize) -> bool {
        if let Some(config_bytes) = self.get_plugin_configuration() {
            // handle configuration
            // ...
        }

        true // return value - continue
    }

    fn get_type(&self) -> Option<ContextType> {
        Some(ContextType::HttpContext) // return value - causes the SDK to call create_http_context
    }

    fn create_http_context(&self, context_id: u32) -> Option<Box<dyn HttpContext>> {
        Some(Box::new(MyHttpContext {})) // return value - create the HTTP context for this request
    }
}

// implement http entrypoints
impl Context for MyHttpContext {}
impl HttpContext for MyHttpContext {
    fn on_http_request_headers(&mut self, nheaders: usize, eof: bool) -> Action {
        // do something when request headers are received
        // ...

        Action::Continue // return value - continue to next filter
    }

    fn on_http_response_headers(&mut self, nheaders: usize, eof: bool) -> Action {
        // do something when response headers are about to be sent
        // ...

        Action::Continue // return value - continue to next filter
    }
}

Fig.2 - Proxy-Wasm Rust filter example


In the above example, the filter implements two contexts: the Root and HTTP contexts. Both are represented by Rust structures that the filter instantiates itself, through the proxy-wasm-rust-sdk framework.

These structures can contain members and variables that may need to be stored through a filter's lifecycle. For more detailed filter examples, see Examples.

For an overview of all supported filter entrypoints and their corresponding Nginx phases, see Filters execution in Nginx.

Back to TOC

Filter Chains

The Proxy-Wasm specifications are designed so that several filters can be chained together and execute as a "filter chain" within a Stream or HTTP context.

In a similar fashion to Envoy proxy extensions, each entrypoint's return value determines the next processing step in the chain, such as "continue to next filter", or "pause" (i.e. "yield"). While Envoy extensions support more nuanced return values, Proxy-Wasm specifications up to 0.2.1 only support Pause and Continue.

As an example, see the above Fig.2 example in which Action::Continue is an enum value in the Rust SDK, and used as a return value to indicate that the filter chain should continue processing.

Paused filter chains will continue to the next step on a "resume" event (e.g. on an HTTP call response). For an example of external HTTP calls, browse the Examples or the ngx_wasm_module test filters in t/lib.
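
The Pause/Continue semantics described above can be illustrated with a small, self-contained model. This is only a sketch and not how ngx_wasm_module implements its chain: `FilterChain`, `Filter`, `run` and `resume` are hypothetical names, filters are modeled as plain closures, and only the `Action` enum mirrors the Rust SDK.

```rust
// Simplified model of a filter chain; not the ngx_wasm_module implementation.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Action {
    Continue,
    Pause,
}

/// A filter is modeled as a closure returning the next processing step.
pub type Filter = Box<dyn FnMut() -> Action>;

pub struct FilterChain {
    filters: Vec<Filter>,
    next: usize, // index of the next filter to run
}

impl FilterChain {
    pub fn new(filters: Vec<Filter>) -> Self {
        FilterChain { filters, next: 0 }
    }

    /// Run filters in order until one pauses or the chain is exhausted.
    /// Returns `Action::Pause` if the chain yielded, `Action::Continue` otherwise.
    pub fn run(&mut self) -> Action {
        while self.next < self.filters.len() {
            let filter = &mut self.filters[self.next];
            let action = filter();
            self.next += 1;
            if action == Action::Pause {
                return Action::Pause;
            }
        }
        Action::Continue
    }

    /// Called on a "resume" event: continue with the step after the one that paused.
    pub fn resume(&mut self) -> Action {
        self.run()
    }
}
```

In this model, a chain of three filters whose second filter returns `Action::Pause` stops after two filters, and a later call to `resume` runs the third one.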

The Proxy-Wasm filter chain is ultimately embedded in a larger "execution chain" concept, which is described in the Execution Chain section of "Essential WasmX Concepts".

Back to TOC

Host ABI Implementation

From the point of view of Proxy-Wasm, ngx_wasm_module is what we call a host implementation of the specification. In other words, ngx_wasm_module implements the Proxy-Wasm Host ABI: a set of functions exposing low-level features of the proxy (i.e. Nginx) to the Proxy-Wasm SDK libraries.

For example, two so-called "host functions" that a proxy must implement as part of the Host ABI are:

i32 (proxy_result_t) proxy_log(i32 (proxy_log_level_t) log_lvl,
                               i32 (const char*) msg,
                               i32 (size_t) msg_len);

i32 (proxy_result_t) proxy_get_buffer(i32 (proxy_buffer_type_t) buf_type,
                                      i32 (offset_t) offset,
                                      i32 (size_t) max_size,
                                      i32 (const char**) return_buffer_data,
                                      i32 (size_t*) return_buffer_size,
                                      i32 (uint32_t*) return_flags);

When the Proxy-Wasm filter is running in ngx_wasm_module and invokes a logging facility, all SDK libraries forward the call to proxy_log, a host C function implemented by ngx_wasm_module and available for import to all Proxy-Wasm filters. In proxy_log, ngx_wasm_module invokes the Nginx logging facilities and the user message appears in Nginx's error.log file.
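
To make the forwarding step more concrete, here is a minimal sketch of how an SDK-side wrapper could turn a safe call into the pointer/length form expected by `proxy_log`. It is an illustration only: in a real filter `proxy_log` is imported from the host, whereas here it is mocked in Rust so the example runs on its own. `HOST_LOG` and the wrapper named `log` are hypothetical, and the sketch assumes that a return value of 0 means success.

```rust
use std::sync::Mutex;

// Stand-in for the host's log sink (Nginx's error.log in ngx_wasm_module).
static HOST_LOG: Mutex<Vec<(i32, String)>> = Mutex::new(Vec::new());

/// Mock of the `proxy_log` host function. A real filter imports this symbol
/// from the host instead of defining it.
/// Returns 0 on success (assumption of this sketch).
unsafe fn proxy_log(level: i32, msg: *const u8, msg_len: usize) -> i32 {
    // The host reads `msg_len` bytes starting at `msg` from the filter's memory.
    let bytes = unsafe { std::slice::from_raw_parts(msg, msg_len) };
    let text = String::from_utf8_lossy(bytes).into_owned();
    HOST_LOG.lock().unwrap().push((level, text));
    0
}

/// SDK-side wrapper: hides the pointer/length ABI behind a safe signature.
pub fn log(level: i32, message: &str) -> Result<(), i32> {
    // Safety: `message` is a valid, initialized buffer of `message.len()` bytes.
    match unsafe { proxy_log(level, message.as_ptr(), message.len()) } {
        0 => Ok(()),
        status => Err(status),
    }
}
```

A filter would then only ever call `log(level, "some message")`, leaving the ABI details to the SDK library.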

The purpose of proxy_get_buffer is for Proxy-Wasm filters to retrieve one of the available buffers (e.g. request or response payloads). It is invoked by SDK libraries when a filter calls one of get_http_request_body, get_http_response_body, get_http_call_response_body, etc. In its implementation of proxy_get_buffer, ngx_wasm_module manipulates the Nginx request structures (e.g. ngx_chain_t) to produce a representation of the requested buffer for the filter.

Both of the above examples are low-level ABI functions powering the abstractions offered by the Proxy-Wasm SDK libraries. Many other features are powered this way; below is a complete list elaborating the state of support for the Host ABI.

Back to TOC

Filters Execution in Nginx

In ngx_wasm_module, the Proxy-Wasm HTTP context is implemented as an Nginx HTTP module: ngx_http_wasm_module.

Because Nginx modules themselves are extensions (i.e. of Nginx), it is valuable to understand how Proxy-Wasm filters are executed in relation to the underlying Nginx phases for multiple reasons (compatibility with other Nginx modules, yielding I/O capabilities, etc...).

The diagram depicted below shows the flow of filter chains as they go through the Nginx phases and in relation to their parent processes. Keep in mind that other Nginx modules may execute before and after each phase.


Schema of Proxy-Wasm steps in Nginx phases

Fig.3 - Proxy-Wasm steps in Nginx phases


Back to TOC

Host Properties

Proxy-Wasm filters can access host, connection, or request context variables via so-called "properties". For example, request.path, response.code, or connection.mtls. The [prefix].[name] notation of a property is referred to as its "path". See Supported Properties.

Proxy-Wasm SDK libraries expose the get_property and set_property APIs, which expect a property path as an array of strings (e.g. [prefix, name]), but note that in our documentation we refer to paths in their [prefix].[name] format.

All properties' values are strings.

In ngx_wasm_module, Host properties are prefixed with wasmx and can be used to store or retrieve any sort of additional context variable. Host properties are scoped to a filter's current context, be it Root or HTTP.

For example, setting wasmx.foo in on_configure means on_tick will be able to read the value of wasmx.foo. However, wasmx.foo in on_request_headers will be considered unset, since the entrypoint is executed within the HTTP context.

For example:

impl RootContext for MyRootContext {
    fn on_configure(&mut self, config_size: usize) -> bool {
        // set wasmx.hello = "world"
        self.set_property(vec!["wasmx", "hello"], Some("world".as_bytes()));

        true
    }

    fn on_tick(&mut self) {
        // property values are returned as raw bytes
        let value = self.get_property(vec!["wasmx", "hello"]);
        // wasmx.hello == "world"
        assert_eq!(value.as_deref(), Some("world".as_bytes()));
    }
}

impl HttpContext for MyHttpContext {
    fn on_http_request_headers(&mut self, nheaders: usize, eof: bool) -> Action {
        let value = self.get_property(vec!["wasmx", "hello"]);
        // not found in HTTP context
        assert!(value.is_none());

        // set wasmx.hello = "http"
        self.set_property(vec!["wasmx", "hello"], Some("http".as_bytes()));

        Action::Continue
    }

    fn on_log(&mut self) {
        let value = self.get_property(vec!["wasmx", "hello"]);
        // wasmx.hello == "http"
        assert_eq!(value.as_deref(), Some("http".as_bytes()));
    }
}

Fig.4 - Proxy-Wasm Host properties example


Back to TOC

Nginx Properties

In ngx_wasm_module, it is also possible to retrieve and manipulate Nginx Variables via Proxy-Wasm properties by using the ngx prefix. The path notation to access an Nginx variable is [ngx, VAR] (or ngx.VAR in dotted notation). Note that very few Nginx variables can actually be written to. Calling set_property on an immutable Nginx variable will result in a trap.

For example:

impl HttpContext for MyHttpContext {
    fn on_log(&mut self) {
        // get the $hostname Nginx variable (values are returned as raw bytes)
        let value = self.get_property(vec!["ngx", "hostname"]);
        assert_eq!(value.as_deref(), Some("host.com".as_bytes()));

        // get the $pid Nginx variable
        let value = self.get_property(vec!["ngx", "pid"]);
        assert_eq!(value.as_deref(), Some("1234".as_bytes()));
    }
}

Fig.5 - Proxy-Wasm Nginx properties example


Back to TOC

Supported Specifications

This section describes the current state of support for the Proxy-Wasm specifications and the different SDK libraries.

Back to TOC

Tested SDKs

Presently, ngx_wasm_module is tested with the following SDK versions:

More SDKs and more SDK versions are to be added to ngx_wasm_module's CI environment.

Back to TOC

Supported Entrypoints

Proxy-Wasm filters are written atop an SDK ABI which is itself a versioned component of the Proxy-Wasm specifications.

Presently, ngx_wasm_module supports the following SDK ABI versions:

  • 0.1.0 ✔️
  • 0.2.0 ✔️
  • 0.2.1 ✔️

All filters compiled with Proxy-Wasm SDK libraries implementing these ABI versions will be compatible with ngx_wasm_module.

Most extension points (i.e. "callbacks" or "handlers") that can be implemented in filters are available in all ABI versions with rare exceptions as the specifications are still evolving.

The following table lists all such Proxy-Wasm filters extension points (as of SDK ABI 0.2.1) and their present status in ngx_wasm_module:

| Name | Supported | Comment |
|---|---|---|
| **Root contexts** | | |
| proxy_wasm::main! | ✔️ | Allocate the root context. |
| on_vm_start | ✔️ | VM configuration handler. |
| on_configure | ✔️ | Filter configuration handler. |
| on_tick | ✔️ | Background tick handler. |
| **Stream (L4) contexts** | | |
| on_new_connection | | NYI. |
| on_downstream_data | | NYI. |
| on_upstream_data | | NYI. |
| on_upstream_close | | NYI. |
| on_downstream_close | | NYI. |
| on_log | | NYI. |
| on_done | | NYI. |
| **HTTP contexts** | | |
| on_http_request_headers | ✔️ | Client request headers handler. |
| on_http_request_body | ✔️ | Client request body handler. |
| on_http_request_trailers | | NYI. Client request trailers handler. |
| on_http_request_metadata | | NYI. Client HTTP/2 METADATA frame handler. |
| on_http_response_headers | ✔️ | Response headers handler. |
| on_http_response_body | ✔️ | Response body handler. |
| on_http_response_trailers | | NYI. Response trailers handler. |
| on_http_response_metadata | | NYI. Upstream HTTP/2 METADATA frame handler. |
| on_http_call_response | ✔️ | Dispatch HTTP call response handler. |
| on_grpc_call_initial_metadata | | Dispatch gRPC call response, initial metadata handler. |
| on_grpc_call_message | | NYI. |
| on_grpc_call_trailing_metadata | | NYI. |
| on_grpc_call_close | | NYI. |
| on_log | ✔️ | HTTP context log handler. |
| on_done | ✔️ | HTTP context done handler. |
| **Shared memory queues** | | |
| on_queue_ready | | NYI. |

"NYI" stands for "Not Yet Implemented".

Back to TOC

Supported Host ABI

Proxy-Wasm filters are written atop a Host ABI which is itself a versioned component of the Proxy-Wasm specifications.

Presently, ngx_wasm_module supports the following SDK ABI versions:

  • 0.1.0 ✔️
  • 0.2.0 ✔️
  • 0.2.1 ✔️

All filters compiled with Proxy-Wasm SDK libraries compatible with these ABI versions will be compatible with ngx_wasm_module.

Most extension points (i.e. "callbacks" or "handlers") that can be implemented in filters are available in all ABI versions, with rare exceptions as the specifications are still evolving.

The following table lists the Proxy-Wasm host functions (as of SDK ABI 0.2.1) and their present status in ngx_wasm_module:

| Name | Supported | Comment |
|---|---|---|
| **Integration** | | |
| proxy_log | ✔️ | |
| proxy_get_log_level | | |
| proxy_get_current_time_nanoseconds | ✔️ | |
| **Context** | | |
| proxy_set_effective_context | ✔️ | |
| proxy_done | | |
| **Timers** | | |
| proxy_set_tick_period_milliseconds | ✔️ | |
| **Buffers** | | |
| proxy_get_buffer_bytes | ✔️ | |
| proxy_set_buffer_bytes | ✔️ | |
| **Maps** | | |
| proxy_get_header_map_pairs | ✔️ | |
| proxy_get_header_map_value | ✔️ | |
| proxy_set_header_map_pairs | ✔️ | |
| proxy_add_header_map_pairs | ✔️ | |
| proxy_replace_header_map_pairs | ✔️ | |
| proxy_remove_header_map_pairs | ✔️ | |
| **Properties** | | |
| proxy_get_property | ✔️ | |
| proxy_set_property | ✔️ | |
| **Stream** | | |
| proxy_resume_downstream | | |
| proxy_resume_upstream | | |
| proxy_continue_stream | ✔️ | |
| proxy_close_stream | | |
| **HTTP** | | |
| proxy_continue_request | ✔️ | |
| proxy_continue_response | | Yielding not supported in response phases. |
| proxy_send_local_response | ✔️ | |
| **HTTP dispatch** | | |
| proxy_http_call | ✔️ | |
| **gRPC dispatch** | | |
| proxy_grpc_call | | |
| proxy_grpc_stream | | |
| proxy_grpc_send | | |
| proxy_grpc_cancel | | |
| proxy_grpc_close | | |
| proxy_get_status | | Host function for proxy-wasm-rust-sdk get_grpc_status. |
| **Shared key/value stores** | | |
| proxy_get_shared_data | ✔️ | |
| proxy_set_shared_data | ✔️ | |
| **Shared queues** | | |
| proxy_register_shared_queue | ✔️ | |
| proxy_dequeue_shared_queue | ✔️ | |
| proxy_enqueue_shared_queue | ✔️ | No automatic eviction mechanism if the queue is full. |
| proxy_resolve_shared_queue | | |
| **Stats/metrics** | | |
| proxy_define_metric | | |
| proxy_get_metric | | |
| proxy_record_metric | | |
| proxy_increment_metric | | |
| **Custom extension points** | | |
| proxy_call_foreign_function | | |

Back to TOC

Supported Properties

Proxy-Wasm filters can access host, connection, or request context variables via so-called "properties". For example, request.path, response.code, or connection.mtls. The [prefix].[name] notation of a property is referred to as its "path".

Proxy-Wasm SDK libraries expose the get_property and set_property APIs, which expect a property path as an array of strings (e.g. [prefix, name]), but note that in our documentation we refer to paths in their [prefix].[name] format.

All properties' values are strings. Properties whose value carries a boolean meaning return the strings "true" or "false".
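
The relationship between the dotted notation used in this documentation, the array form expected by get_property/set_property, and the "true"/"false" convention can be sketched with two small helpers. Both `path_segments` and `property_as_bool` are hypothetical names that are not part of any SDK. The sketch assumes that every dot separates two path segments, which may not hold for names that themselves contain dots.

```rust
/// Split a dotted property path (e.g. "request.path") into the segments
/// passed to `get_property`/`set_property` (e.g. ["request", "path"]).
/// Assumption: every '.' is a segment separator.
pub fn path_segments(dotted: &str) -> Vec<&str> {
    dotted.split('.').collect()
}

/// Interpret a raw property value as a boolean, following the
/// "true"/"false" string convention. Returns `None` when the property
/// is missing or holds any other value.
pub fn property_as_bool(value: Option<&[u8]>) -> Option<bool> {
    match value? {
        b"true" => Some(true),
        b"false" => Some(false),
        _ => None,
    }
}
```

With the Rust SDK, such helpers could be combined as `property_as_bool(self.get_property(path_segments("connection.mtls")).as_deref())`.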

This section focuses on Envoy Attributes, which are not formally documented as part of the Proxy-Wasm specifications but are supported by all existing Proxy-Wasm host implementations through the properties ABI. To ensure portability of existing Proxy-Wasm Envoy filters, ngx_wasm_module aims to support as many Envoy Attributes as possible.

Besides Envoy Attributes, ngx_wasm_module also supports two other kinds of properties, identified by the following prefixes: wasmx (see Host Properties) and ngx (see Nginx Properties).

The following table lists all Envoy Attributes (as of Envoy 1.23) and their implementation state in ngx_wasm_module:

| Property Path | Read Supported | Write Supported | Comment |
|---|---|---|---|
| **Request properties** | | | |
| request.path | ✔️ | | Maps to ngx.request_uri. |
| request.url_path | ✔️ | | Maps to ngx.uri. |
| request.host | ✔️ | | Maps to ngx.hostname. |
| request.scheme | ✔️ | | Maps to ngx.scheme. |
| request.method | ✔️ | | Maps to ngx.request_method. |
| request.useragent | ✔️ | | Maps to ngx.http_user_agent. |
| request.protocol | ✔️ | | Maps to ngx.server_protocol. |
| request.query | ✔️ | ✔️ | Maps to ngx.args. |
| request.id | ✔️ | | Returns x-request-id header value. |
| request.referer | ✔️ | | Returns referer header value. |
| request.time | ✔️ | | Timestamp of the first byte received, with milliseconds as the decimal part. |
| request.duration | ✔️ | | Maps to ngx.request_time. |
| request.size | ✔️ | | Maps to ngx.content_length. |
| request.total_size | ✔️ | | Maps to ngx.request_length. |
| request.headers.* | ✔️ | | Returns the value of any request header, e.g. request.headers.date. |
| **Response properties** | | | |
| response.code | ✔️ | | Maps to ngx.status. |
| response.size | ✔️ | | Maps to ngx.body_bytes_sent. |
| response.total_size | ✔️ | | Maps to ngx.bytes_sent. |
| response.grpc_status | | | NYI. |
| response.trailers | | | NYI. |
| response.code_details | | | Not supported. |
| response.flags | | | Not supported. |
| response.headers.* | ✔️ | | Returns the value of any response header, e.g. response.headers.date. |
| **Upstream properties** | | | |
| upstream.address | ✔️ | | Returns Nginx upstream address if any. This value is retrieved from Nginx's r->upstream member, mostly set through ngx_http_proxy_module. |
| upstream.port | ✔️ | | Returns Nginx upstream port if any. This value is retrieved from Nginx's r->upstream member, mostly set through ngx_http_proxy_module. |
| upstream.tls_version | | | NYI. |
| upstream.subject_local_certificate | | | NYI. |
| upstream.subject_peer_certificate | | | NYI. |
| upstream.dns_san_local_certificate | | | NYI. |
| upstream.dns_san_peer_certificate | | | NYI. |
| upstream.uri_san_local_certificate | | | NYI. |
| upstream.uri_san_peer_certificate | | | NYI. |
| upstream.sha256_peer_certificate_digest | | | NYI. |
| upstream.local_address | | | NYI. |
| upstream.transport_failure_reason | | | NYI. |
| **Connection properties** | | | |
| destination.address | ✔️ | | Maps to ngx.proxy_protocol_addr. |
| destination.port | ✔️ | | Maps to ngx.proxy_protocol_port. |
| connection.requested_server_name | ✔️ | | Maps to ngx.ssl_server_name. |
| connection.tls_version | ✔️ | | Maps to ngx.ssl_protocol. |
| connection.subject_local_certificate | ✔️ | | Maps to ngx.ssl_client_s_dn. |
| connection.id | ✔️ | | Returns downstream connection number, i.e. r->connection->number. |
| connection.mtls | ✔️ | | Returns downstream connection mTLS status as a "true"/"false" string. |
| connection.subject_peer_certificate | | | NYI. |
| connection.dns_san_local_certificate | | | NYI. |
| connection.dns_san_peer_certificate | | | NYI. |
| connection.uri_san_local_certificate | | | NYI. |
| connection.uri_san_peer_certificate | | | NYI. |
| connection.sha256_peer_certificate_digest | | | NYI. |
| connection.termination_details | | | Not supported. |
| source.address | ✔️ | | Maps to ngx.remote_addr. |
| source.port | ✔️ | | Maps to ngx.remote_port. |
| **Proxy-Wasm properties** | | | |
| plugin_name | ✔️ | | Returns current filter name. |
| plugin_root_id | ✔️ | | Returns filter's root context id. |
| plugin_vm_id | | | NYI. |
| node | | | Not supported. |
| cluster_name | | | Not supported. |
| cluster_metadata | | | Not supported. |
| listener_direction | | | Not supported. |
| listener_metadata | | | Not supported. |
| route_name | | | Not supported. |
| route_metadata | | | Not supported. |
| upstream_host_metadata | | | Not supported. |

"NYI" stands for "Not Yet Implemented".

"Not supported" means the attribute will likely not be supported in ngx_wasm_module, most likely due to a Host incompatibility.

Back to TOC

Response Body Buffering

Buffering of response body chunks is supported within ngx_wasm_module so filters don't have to implement buffering themselves. This allows the on_response_body step to be invoked with the full response body available for read via get_http_response_body.

When response buffering is enabled, response chunks are copied to the buffers defined by the wasm_response_body_buffers directive. Execution of the Proxy-Wasm filter chain is suspended until buffering completes, at which point on_response_body is invoked again.

To enable this behavior from a filter based on Proxy-Wasm ABI v0.2.1, the filter must return Action::Pause from on_response_body. Once enabled, ngx_wasm_module accumulates subsequent body chunks until either eof is reached or the buffers are full. When either condition is met, on_response_body is invoked again and the body buffer contains the buffered chunks.

In other words, once body buffering is enabled, the next on_response_body invocation will contain the buffered body and may be invoked again if eof was not reached but the buffers are full.

A typical response buffering flow could be:

  1. 1st on_response_body call: ignore 1st chunk, requesting buffering.
    1. Check for eof=false.
    2. Ensure buffering was not already requested.
    3. Return Action::Pause, requesting buffering.
  2. 2nd on_response_body call: buffering ended, but how?
    1. If eof=true, the full response body is in the buffers.
    2. If eof=false, the buffers are full, but more chunks are expected (users should treat the buffers as if they were a single, non-buffered chunk).
  3. nth on_response_body call: next chunks, if any.

Returning Action::Pause when buffering has already taken place will be ignored (i.e. treated as Action::Continue) and an error log will be printed.
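
The flow above can be sketched as a small state machine. This is a self-contained model rather than a real filter: `BufferingFilter` and its `processed` field are hypothetical, the `Action` enum is redefined locally to mirror the SDK's, and the host is simulated by calling `on_response_body` by hand.

```rust
// Simplified model of the buffering flow; the host side is not simulated.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Action {
    Continue,
    Pause,
}

#[derive(Default)]
pub struct BufferingFilter {
    /// Set once the filter has asked the host to buffer the body.
    buffering_requested: bool,
    /// Sizes of the bodies this filter actually processed (illustration only).
    pub processed: Vec<usize>,
}

impl BufferingFilter {
    /// Mirrors the shape of the response body entrypoint.
    pub fn on_response_body(&mut self, body_size: usize, eof: bool) -> Action {
        // 1st call: more chunks are expected and buffering was not requested yet.
        if !eof && !self.buffering_requested {
            self.buffering_requested = true;
            return Action::Pause; // request buffering of subsequent chunks
        }

        // Later calls: either the full body (eof=true) or full buffers
        // (eof=false). Both are handled like a single chunk, and Pause is
        // never returned a second time.
        self.processed.push(body_size);
        Action::Continue
    }
}
```

Tracking `buffering_requested` in the HTTP context structure is what prevents the filter from returning Action::Pause after buffering has already taken place.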

Notes

Keep in mind there are fundamental issues with buffering bodies at scale due to the nature of the workload, hard buffer limits defined by wasm_response_body_buffers, and Wasm memory limits themselves (loading and manipulating the body in filters). This feature should be used with extreme caution in production environments.

Back to TOC

Examples

Back to TOC

Current Limitations

Besides the state of support for the Proxy-Wasm SDK in ngx_wasm_module, other factors are at play when porting the SDK to a new Host proxy runtime.

Proxy-Wasm's design was primarily influenced by Envoy concepts and features, but because Envoy and Nginx differ in their underlying implementations, there remain a few limitations on some supported features (non-exhaustive list):

  1. The eof flag will always be false in the following steps, even if the request/response has a body:
    • on_http_request_headers
    • on_http_response_headers

    This is due to internal Nginx constraints while reading request/response payloads.

  2. Pausing a filter (i.e. Action::Pause) can only be done in the following steps:
    • on_http_request_headers
    • on_http_request_body
    • on_http_response_body (to enable body buffering)
    • on_http_call_response

  3. The "queue" shared memory implementation does not implement an automatic eviction mechanism when the allocated memory slab is full:
    • proxy_enqueue_shared_queue

Future ngx_wasm_module and WasmX work will be aimed at lifting these limitations when possible and increasing overall surface support for the Proxy-Wasm SDK.

Back to TOC