
Document how to write and run tests #373

Draft
wants to merge 8 commits into base: main

Conversation

@caspervonb (Contributor) commented Jan 14, 2021

Based on my recent work and our discussions in yesterday's meeting, I started drafting the basis of a simple specification for a test runner and guidelines for authoring tests.

@@ -0,0 +1,127 @@
# Testing

Member

I wonder if the sources for tests should be in a separate directory; that way a test source can be in any format, including multi-file or even a nested hierarchy of files. Specifically, I see two advantages to this:

  1. The test binary directory (the one the test runner processes) only contains files with a fixed set of known suffixes.
  2. The test source directory structure can be language-specific and take whatever form it likes. Perhaps some source languages like to have each program in its own directory, or each program requires several files?

Contributor Author

Presumably we'll gather all the tests in a webassembly/wasi-testsuite repository, à la webassembly/testsuite, that can be pulled in as a submodule or subtree containing all the test fixtures you'll need to bring your own runner to an existing test setup.

Having a single self-contained source file here makes it really easy to reference exactly what is failing; an implementation's test runner can even parse that programmatically from a trace and output it.

That said, some of the webassembly proposal repositories do have a test/meta directory which contains the files used to generate tests.

Generating parametric tests this way is definitely an option, but I'd still like the generated sources to be available in the main test directory next to the binary for reference, without having to "ship" the entire meta directory.

Contributor Author

The test binary directory (the one the test runner processes) only contains files with a fixed set of known suffixes.

I don't think we can promise this in general.

As proposals emerge the tests may have to be expanded upon with new meta files; additional permissions for sockets/networking come to mind, for example.

This initial set is just enough to support the filesystem and command proposals.

Contributor

One additional comment to this: though wasi-nn is rather exotic compared to, say, wasi-filesystem, I would like to consider what it would take to write WASI tests. With the current spec, the ML model and the input tensors would likely be shipped as separate files since they are rather large. Does that fit into this model?

Additionally, the output tensor(s) will likely not be exact matches but instead need some type of fuzzy assertion. Not sure how we will handle that...

Contributor Author

One additional comment to this: though wasi-nn is rather exotic compared to, say, wasi-filesystem, I would like to consider what it would take to write WASI tests. With the current spec, the ML model and the input tensors would likely be shipped as separate files since they are rather large. Does that fit into this model?

It's open-ended and additive, so it can be extended to encompass whatever we need, but I'm not sure what wasi-nn needs, not that familiar with the proposal but seems it just loads models from memory?

If so then a data section seems more appropriate.

Additionally, the output tensor(s) will likely not be exact matches but instead need some type of fuzzy assertion. Not sure how we will handle that...

Depends on the previous question but probably just define some fuzzy asserters in the source language.
Internal logic should be handled internally IMO.

Really a case of "you tell me", though; I'm not familiar enough with that proposal at this time to make any recommendations.
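For illustration, a fuzzy asserter of the kind suggested here can live entirely in the test's source language. A minimal sketch in Rust, where the element type, tolerance, and helper name are assumptions rather than anything specified in this draft:

```rust
// Hypothetical fuzzy assertion helper for approximate tensor comparisons.
// The f32 element type and the tolerance are illustrative choices.
fn assert_approx_eq(actual: &[f32], expected: &[f32], tolerance: f32) {
    assert_eq!(actual.len(), expected.len(), "tensor length mismatch");
    for (i, (a, e)) in actual.iter().zip(expected).enumerate() {
        assert!(
            (a - e).abs() <= tolerance,
            "element {} differs: {} vs {} (tolerance {})",
            i, a, e, tolerance
        );
    }
}

fn main() {
    // Usage sketch: compare an inference result against embedded reference data.
    let expected = [0.1_f32, 0.7, 0.2];
    let actual = [0.1001_f32, 0.6998, 0.2001];
    assert_approx_eq(&actual, &expected, 1e-3);
}
```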

@abrown (Contributor) commented Jan 21, 2021

Depends on the previous question but probably just define some fuzzy asserters in the source language.
Internal logic should be handled internally IMO.

Sounds good.

I'm not sure what wasi-nn needs, not that familiar with the proposal but seems it just loads models from memory?

That's right but, practically speaking, those models/weights/tensors are large and are easier to manage as separate files than static data sections in a compiled Wasm file. It's not an exact corollary, but in an example over in Wasmtime I load this data from files before using them in wasi-nn. If we could bundle additional files with the tests and were allowed to use these files in <basename>.arg (or even statically in the test, I guess), that would seem easiest. What do you think?

Member

Another reason to embed any data the test needs is that it avoids the dependency on the filesystem API, making the test more precise and less fragile. It should be fairly straightforward to embed data in wasm; I think there are a few different ways you can do this if you are using an llvm-based toolchain.

Contributor Author

If we could bundle additional files with the tests and were allowed to use these files in .arg (or even statically in the test, I guess), that would seem easiest. What do you think?

You could, but this also introduces a lot of extra, unrelated dependencies in your tests that can be avoided.

Different toolchains have different mechanisms for it, but it can be quite trivial to embed the data. In Rust, for example, one could use the include_bytes! macro:

`let model_1 = include_bytes!("model_1.tensor");`

Other toolchains have their ways too.

## Writing tests

Any source language may be used to write tests as long as the compiler supports
the wasm32-unknown-unknown target and is well-known and freely available.
Member

wasm32-unknown-unknown is an llvm-ism so we probably don't want to mention that here. Perhaps:

"Any source language may be used to write tests as long as the compiler supports
WebAssembly output targeting the WASI API"

The part about "well-known and freely available" will probably want to be tightened up too.

Contributor Author

wasm32-unknown-unknown is an llvm-ism so we probably don't want to mention that here. Perhaps:

Targeting wasm32-wasi is probably fine actually, just don't pull in libc.

I just want to make sure we point out that using wasm32-wasi's main is generally going to make for a bad test.

For example, trying to test proc_exit in main isn't great, as it'll do all the bootstrapping that libc needs but that isn't relevant in this context.

The part about "well-known and freely available" will probably want to be tightened up too.

Yeah I'll need to think about what that means.

I'm thinking:

  • We can't allow proprietary compilers.
  • We can't allow non-portable compilers.
  • We can't allow toy languages that won't be or aren't maintained.

I'll double back to this later.
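To make the point about libc's main concrete, here is a sketch of a "bare" test targeting wasm32-wasi that does not pull in libc's startup code, assuming the wasi_snapshot_preview1 module name for the import; the structure is illustrative, not part of this draft:

```rust
// A bare wasm32-wasi test: no libc, so no prestat/args/environ bootstrapping
// runs before the function under test. Build as a no_std binary.
#![no_std]
#![no_main]

#[link(wasm_import_module = "wasi_snapshot_preview1")]
extern "C" {
    // Declared directly instead of going through wasi-libc or the wasi crate.
    fn proc_exit(code: u32);
}

#[no_mangle]
pub extern "C" fn _start() {
    // The only WASI call this test exercises is proc_exit itself.
    unsafe { proc_exit(0) };
}

#[panic_handler]
fn panic(_: &core::panic::PanicInfo) -> ! {
    // Trap the instance on failure; no I/O required.
    core::arch::wasm32::unreachable()
}
```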

Member

I do tend to think we do want to allow tests to pull in libc. It does call proc_exit, and some of the stdio routines needlessly pull in fdstat_get and perhaps other things; however, bare WASI is really inconvenient to code to, so if we have to do everything in bare WASI, it seems like we'd end up with people writing fewer tests. It seems better to have more tests, even if some of them do call more functions than they strictly need to.

Contributor Author

I do tend to think we do want to allow tests to pull in libc. It does call proc_exit, and some of the stdio routines needlessly pull in fdstat_get and perhaps other things; however, bare WASI is really inconvenient to code to, so if we have to do everything in bare WASI, it seems like we'd end up with people writing fewer tests. It seems better to have more tests, even if some of them do call more functions than they strictly need to.

From experience it was a bit tedious to deal with, as _main also calls into the prestat, args, and env functions as well (at least at the time).

Simple to write tests against, but the early days of WASI in Deno basically couldn't be tested with my libc tests until everything was bootstrapped.

I think we'll just have to revisit this once the split is done and it's a bit clearer how that's going to work; but presumably we'd want to generate headers and such based on the local witx files and not be dependent on the current state of, say, the wasi-libc implementation and headers or the wasi crate (which would also be a circular dependency).

As for wasi-libc itself, we can test that in wasi-sdk with the same harness; it's already fairly similar. We'd just have to collect those tests along with the tests from this repository into something like webassembly/wasi-testsuite to let implementers make use of them as well.

docs/Testing.md Outdated
- \<basename\>.stdin
- \<basename\>.stdout
- \<basename\>.stderr
- \<basename\>.status
Member

We should probably say something about where the tests live for a given proposal, and how a test runner can/should find the tests for a given proposal or for all proposals.

Contributor Author

We should probably say something about where the tests live for a given proposal

Having read through make-snapshot.sh, I'm thinking a proposal repository would put its tests in test/<module_name>/<test_name>, relative to the root directory of the proposal repository.

The make-snapshot script merges proposals into $snapshot_dir/proposals/${repo}, so this allows us to use simple path filtering to match just about any selection of tests.

Run all the tests in a standalone repository:

git clone https://github.com/WebAssembly/wasi-clocks
wasi-test-runner.sh wasmtime wasi-clocks/test

Run a specific proposal's tests in a snapshot:

git clone https://github.com/WebAssembly/wasi
wasi-test-runner.sh wasmtime wasi/snapshot/ephemeral/proposals/<proposal>

Run a specific module's tests in the standard:

git clone https://github.com/WebAssembly/wasi
wasi-test-runner.sh wasmtime wasi/standard/<module>

Simple but effective.

Contributor Author

Just having the full test path be test/<module>.<function>[-<variant>].<ext> would also work if we desire a flatter structure.
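Either layout would let a runner discover tests by simply walking a subtree; a sketch of that discovery step (the directory passed in is whatever selection the caller wants to filter to, and the default path is illustrative):

```rust
// Collect every <basename>.wasm test binary under a directory tree.
use std::path::{Path, PathBuf};

fn collect_tests(dir: &Path, out: &mut Vec<PathBuf>) -> std::io::Result<()> {
    for entry in std::fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            collect_tests(&path, out)?;
        } else if path.extension().map_or(false, |ext| ext == "wasm") {
            out.push(path);
        }
    }
    Ok(())
}

fn main() -> std::io::Result<()> {
    let root = std::env::args().nth(1).unwrap_or_else(|| "test".to_string());
    let mut tests = Vec::new();
    collect_tests(Path::new(&root), &mut tests)?;
    for test in tests {
        println!("{}", test.display());
    }
    Ok(())
}
```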

docs/Testing.md Outdated

In addition to the source and binary files, a test can have any of the following
auxiliary files and directories:
- \<basename\>.arg
Contributor Author

TODO: add an example for each file type.

Base automatically changed from master to main January 19, 2021 23:08
@caspervonb (Contributor Author) commented Jan 21, 2021

One open-ended question is how we handle printing diagnostics to the testing environment.

Assertions which just call unreachable aren't that great, but it would be preferable not to depend on output_stream in every single test to print the failure before aborting.

Spectests rely on getting a spectest module which just has a bunch of prints in it; maybe we could do something similar.

It doesn't matter that much in, say, Deno's current test suite, where I'm expecting us to support 100%, but when we're testing specific modules, as we'll end up doing here, it'd be good if we cut this inter-module dependency.

@pchickey (Contributor)

The wasmtime repo's WASI test suite uses Rust as the source language, which fails by panicking. It will print messages to stdout and eventually hit the unreachable opcode. Our test harness captures guest stdout and stderr, and prints any of them that are non-empty when the instance traps.

This doesn't do you much good if your implementation doesn't have basic stuff like writing to stdio working, but in my experience that is usually the first thing you get working.

@caspervonb (Contributor Author)

The wasmtime repo's WASI test suite uses Rust as the source language, which fails by panicking. It will print messages to stdout and eventually hit the unreachable opcode. Our test harness captures guest stdout and stderr, and prints any of them that are non-empty when the instance traps.

I'd assume that to be the case for any current tests, but with the new module split I'm not sure if that makes sense to do here.

@sbc100 (Member) commented Jan 22, 2021

The wasmtime repo's WASI test suite uses Rust as the source language, which fails by panicking. It will print messages to stdout and eventually hit the unreachable opcode. Our test harness captures guest stdout and stderr, and prints any of them that are non-empty when the instance traps.

I'd assume that to be the case for any current tests, but with the new module split I'm not sure if that makes sense to do here.

I think it makes sense to assume stdout and the ability to set an exit code as a minimum bar for running any test. If we don't do that, we would be forced to create some other bespoke I/O system just for the test system (wouldn't we?). I'm not sure it's worth the extra complexity to do that.

@caspervonb (Contributor Author) commented Jan 26, 2021

I think it makes sense to assume stdout and the ability to set an exit code as a minimum bar for running any test. If we don't do that, we would be forced to create some other bespoke I/O system just for the test system (wouldn't we?). I'm not sure it's worth the extra complexity to do that.

It does make them more like integration tests, which is fine; I just wanted to point it out, as I'm unclear on where this modularization that started around last year is going (especially weak linkage).

Deno and my test repos do this and it has worked out fine.

Technically you don't actually even have to implement stdio (but it does have to be stubbed); failed assertions will always terminate with an unreachable instruction at the end regardless.
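As an illustration of that last point, an assertion only needs the trap itself, not any I/O; a hypothetical helper (the name is made up, and this only compiles for wasm32 targets):

```rust
// Trap-only assertion: a failure executes the `unreachable` instruction,
// so the embedder sees a trap without the test touching stdio.
#[inline(always)]
pub fn check(cond: bool) {
    if !cond {
        core::arch::wasm32::unreachable();
    }
}
```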

@linclark added this to the WASI Modularization milestone Jan 26, 2021
@sunfishcode (Member) left a comment

Thanks for writing this up! This looks like what we talked about in the meeting, and a good start.

A test case takes the form of a binary \<basename\>.wasm file; next to that
there will always be a \<basename\>.\<ext\> source file from which the
test was originally compiled, which can be used as a reference in the event
of an error.
Member

I like the simplicity of this, but I expect it'll be too limiting as we add more tests in more languages. It'll be convenient to build Rust source files with cargo, and other language source files with their respective build systems, and that'll often require additional support files, so it won't always be as simple as <basename>.<ext>.

What would you think of a convention where we have a sources directory, and in it we have source code and build scripts that one can run to re-generate the wasm files?

Contributor Author

What would you think of a convention where we have a sources directory, and in it we have source code and build scripts that one can run to re-generate the wasm files?

Some of the WebAssembly proposals have a meta directory that contains wast generators; we might want to keep with that convention?

I would still like to see the "final" source files next to the test binaries as part of the build, for reference, as more often than not assertion errors don't make much sense without looking at the source.

It's a slight duplication, but copying the sources back out to the parent directory next to the binary as part of the build means the implementor running the tests doesn't have to care how a particular proposal structured its build system.

In fact, this way the meta directory could just be omitted from an implementation's test-data folder entirely if they have no need for it.

docs/Testing.md Outdated

- Prepare inputs
- Given an `<input>.wasm` file; take the `<basename>` of said file.
- If `<basename>.<arg>` exists; take the program arguments from said file.
Member

It's a minor detail, but I'd find this more clear with ".args" instead of ".arg".

Also, we should document the whitespacing convention in the arguments file. Can we say that arguments are separated by whitespace, and '"' quotes may be used to enclose an argument containing whitespace?
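If that convention is adopted, the parsing is straightforward; a sketch of the proposed rule (whitespace-separated arguments, with '"' quotes enclosing arguments that contain whitespace), where the function name and example values are illustrative:

```rust
// Parse a hypothetical ".args" file: split on whitespace, but keep quoted
// spans together. Escapes inside quotes are not handled in this sketch.
fn parse_args(contents: &str) -> Vec<String> {
    let mut args = Vec::new();
    let mut current = String::new();
    let mut in_quotes = false;
    for c in contents.chars() {
        match c {
            '"' => in_quotes = !in_quotes,
            c if c.is_whitespace() && !in_quotes => {
                if !current.is_empty() {
                    args.push(std::mem::take(&mut current));
                }
            }
            c => current.push(c),
        }
    }
    if !current.is_empty() {
        args.push(current);
    }
    args
}

fn main() {
    assert_eq!(
        parse_args(r#"--dir "/tmp/test dir" --verbose"#),
        vec!["--dir", "/tmp/test dir", "--verbose"]
    );
}
```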

docs/Testing.md Outdated
```bash
# Usage: $1 <runtime> <path_to_binary.wasm>
# $1 wasmtime proc_exit-success.wasm
# $1 wasmer proc_exit-failure.wasm
Member

I know these are only examples, but in the interests of not wanting to exclude engines, we shouldn't mention either of these examples here. Would "Usage: $1 <runtime_cmd> <path_to_binary>.wasm" be clear enough, without the examples?
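For reference, the runner behaviour described in this draft is small enough to prototype in a few dozen lines. A non-normative sketch in Rust (the draft leaves the runner's implementation language open), assuming the <basename>.args/.stdin/.stdout/.status conventions discussed above; stderr comparison and argument quoting are omitted for brevity:

```rust
// Sketch of a test runner: invoke `<runtime_cmd> <test.wasm> [args...]`,
// feed stdin from <basename>.stdin if present, then compare stdout and the
// exit status against <basename>.stdout / <basename>.status if they exist.
use std::io::Write;
use std::path::Path;
use std::process::{Command, Stdio};

fn run_test(runtime_cmd: &str, wasm_path: &Path) -> bool {
    let basename = wasm_path.with_extension("");
    let aux = |ext: &str| std::fs::read_to_string(basename.with_extension(ext)).ok();

    let mut cmd = Command::new(runtime_cmd);
    cmd.arg(wasm_path);
    if let Some(args) = aux("args") {
        cmd.args(args.split_whitespace()); // see the quoting discussion above
    }
    cmd.stdin(Stdio::piped()).stdout(Stdio::piped());

    let mut child = cmd.spawn().expect("failed to spawn runtime");
    if let Some(stdin) = aux("stdin") {
        child.stdin.as_mut().unwrap().write_all(stdin.as_bytes()).unwrap();
    }
    let output = child.wait_with_output().expect("runtime did not finish");

    // Compare observed behaviour against whichever expectation files exist.
    let stdout_ok = match aux("stdout") {
        Some(expected) => String::from_utf8_lossy(&output.stdout) == expected,
        None => true,
    };
    let status_ok = match aux("status") {
        Some(expected) => {
            let want: i32 = expected.trim().parse().expect("invalid .status file");
            output.status.code() == Some(want)
        }
        None => output.status.success(),
    };
    stdout_ok && status_ok
}

fn main() {
    let usage = "usage: runner <runtime_cmd> <path_to_binary>.wasm";
    let mut argv = std::env::args().skip(1);
    let runtime = argv.next().expect(usage);
    let wasm = argv.next().expect(usage);
    std::process::exit(if run_test(&runtime, Path::new(&wasm)) { 0 } else { 1 });
}
```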
