Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add expo-sqlite, the ability to test it, and a promisifying wrapper #5184

Merged
merged 18 commits into from Jan 11, 2022

Conversation

gnprice
Copy link
Member

@gnprice gnprice commented Dec 30, 2021

This accomplishes a first stage of #4841, sound storage with SQLite. In this PR, we gain the ability to use SQLite from our JS code.

The core of this is that we use expo-sqlite, and we add a small wrapper to give a nice Promise-based interface atop its callbacks-based API.

The first complication is that expo-sqlite does not run in Jest: expo/expo#15125 . If we're going to write code that uses SQL to manage our users' data, we'll certainly want to have the ability to test that code. We fix that by pulling in sqlite3 to get SQLite in Node (where Jest runs), and then writing a small amount of JS code to provide a version of expo-sqlite atop that.

The next complication or two is that expo-sqlite (really its dependency @expo/websql, aka node-websql) does not handle transactions well (chat discussion):

  • If you start a transaction which starts making some changes, then throws an exception before it makes the other half of the needed changes… the transaction gets committed, with the half-done changes you've made so far. Yikes!
  • If you start a transaction which executes a SQL statement, the statement finishes, and then your resulting callback throws an exception… the transaction, and the whole database object, get stuck in a wedged state where nothing else you do will ever run.
    • This is better than committing a half-done transaction (because at least it won't permanently corrupt your actual data), but still not good. The right behavior would be to roll back the transaction.
  • If you start a transaction and then do some asynchronous work, the transaction will go ahead and commit with the changes you've made so far, without waiting for the other statements you were about to try to add to it.
    • This is less severe than the first issue because code that hits it is likely to hit it 100% of the time, so that it's caught in development.
    • But it's a blocker for us being able to have, say, a transaction that reads some data from our upcoming AsyncStorage replacement, decompresses it to get meaningful app-level data, does something with that, compresses the results, and writes those back -- which is exactly what a typical migration will look like for us in that system as we continue making changes to the data structures we keep in Redux. That's because our compression and decompression is done asynchronously, with a round-trip to native code.
    • Transactions auto-close when queuing microtasks between executeSql() calls nolanlawson/node-websql#46 describes a narrowed-down version of this issue.

Happily we're able to work around those issues with a small amount of additional code in our promisifying wrapper. Then we also write tests, not only for the desired behavior in the wrapper but to exercise the underlying broken behavior. Those get somewhat more complicated, particularly because the bad behavior involves unhandled exceptions and Jest doesn't take those quietly.

Together that seems like plenty for one PR. The actual AsyncStorage replacement we'll save for the next PR after this one.

The workaround isn't awesome: in particular it busy-waits, so that if you do something async that could take a long time without needing the CPU, like a network request, the workaround will burn CPU (and hence power) while waiting for it. Moreover, while some of the issues are bugs that could be fixed in the underlying layers, others -- in particular the incompatibility with async work in a transaction -- are inherent in the API. So as a later followup we may want to cut out the JS code of expo-sqlite and its dependencies entirely, rather than working around it on top. But this version works correctly, which I think makes it good enough to build on and move forward for now.

(An earlier version of this, together with some of the changes to come after it, was a draft PR #5157.)

@gnprice gnprice added a-data-sync Zulip's event system, event queues, staleness/liveness P1 high-priority labels Dec 30, 2021
Copy link
Contributor

@chrisbobbe chrisbobbe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! See some comments below, including one (edit: #5184 (comment)) to help me get a better handle on the bundles of things we're fixing with these libraries. I'll also force-push with the ios/Podfile.lock change mentioned in one of the comments.

@@ -38,6 +38,7 @@
"expo-apple-authentication": "^3.2.1",
"expo-application": "^3.2.0",
"expo-screen-orientation": "^3.3.0",
"expo-sqlite": "^9.1.0",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deps: Add expo-sqlite

We should also run pod install at this commit, so the change is propagated to ios/Podfile.lock. I'll force-push with this done in case you don't have your Mac handy. 🙂

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! 🙂 I do have the Mac handy (on my desk at home, it's here right next to my main desktop) but had forgotten to test there.

@@ -67,6 +67,8 @@ jest.mock('react-native', () => {
return ReactNative;
});

jest.mock('immediate', () => cb => Promise.resolve().then(cb));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expo-sqlite tests: Get a working `immediate` in Jest

[…]

In Jest tests, that doesn't work; the symptom is that the test hangs
and then times out at 5 seconds, not having completed.  Looking at the
`immediate` implementation, it relies on `setTimeout` -- so that'd do it.

[…]

I don't yet understand this; is the issue that we can't expect setTimeout to be available in the Jest environment? We can, can't we?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's present but mocked out:
https://jestjs.io/docs/timer-mocks
(see also our async-test.js), and as a result when something running under a test creates a timer (as with setTimeout), the timer callback may not be run unless the test calls jest.runAllTimers() or one of its friends. Plus, as you can see in async-test.js, and more so in the "our promisified sqlite > transaction with internal asynchrony …" test added in this series, even doing that can get hairy because the jest.runAllTimers() isn't effective before the timer has actually been created.

As that same "transaction with internal asynchrony" test illustrates, this applies even if the setTimeout duration is zero, as it is in the immediate implementation (which just doesn't pass a second argument to setTimeout.)


After looking harder at that immediate implementation, I think the setTimeout is actually on a bit of a side path of the control flow -- the thing that's actually at issue here is process.nextTick. (Which is a Node thing that's a lot like setTimeout(…, 0) but runs before those do.)

But it looks like Jest's fake timers (the "modern" version, which we're using these days) override that too. If I put this in a test:

  const p = new Promise(r => process.nextTick(() => { console.log('B'); r(); }));
  console.log('A');
  //   await p; // this will hang
  jest.advanceTimersByTime(1);
  console.log('C');
  await p; // this is fine
  console.log('D');

it prints A, B, C, D in that order. And if I uncomment the await p that's before the advanceTimersByTime, then it hangs there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also the comments I just left on https://gist.github.com/apieceofbart/e6dea8d884d29cf88cdb54ef14ddbcc4 , which I found when trying to understand how Jest interacts with process.nextTick and had some helpful examples.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, this is making more sense.

There's some talk about having "modern" fake timers give you the option to not have nextTick mocked: jestjs/jest#10221 (comment)

But even with a not-mocked nextTick, it sounds like we'd still want this change, because setTimeout is involved—

[…] (The
implementation also uses `setTimeout`, in what seems to be a
convoluted substitute for try/catch.  So that'd be an additional
reason it wouldn't work, if it got that far.)

—and Jest's fake timers will always mock setTimeout.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah.

FWIW now that I understand process.nextTick better, I'm fairly confident that the right thing is for Jest to not mock it by default. (Previously I didn't feel I understood the issue well enough to have a firm opinion.)

Details in chat here: https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/Node.20process.2EnextTick/near/1307747
(mainly so that if it ever comes up in the future, we have a good shot at finding the discussion again.)

}

close() {
this._closed = true;
this._db.close();
}

/**
* Suppress some otherwise-unhandled Promise rejections.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  /**
   * Suppress some otherwise-unhandled Promise rejections.
   * 
   * […]
   */

"Some" is kind of vague; is that OK?

In _exec, we've wrapped one—of two—calls to callback with a try/catch. callback itself isn't an async function, so I guess it's appropriate that we don't await the callback call inside the try.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think being vague here is basically OK, because only a few tests will use it.

Probably a try/catch around that other callback call site would make sense. I'll add that. That might also give me a clean less-vague description to give this allowUnhandled method.


import invariant from 'invariant';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: invariant goes above the blank line, right, with the other third-party imports?

@@ -115,6 +115,7 @@
"react-test-renderer": "17.0.1",
"redux-mock-store": "^1.5.1",
"rollup": "^2.26.5",
"sqlite3": "^5.0.2",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I click the "Display the rich diff" button (a sheet-of-paper icon) on the yarn.lock changes in this commit in the GitHub UI, I see some security warnings about tar:

image

Here's yarn why tar:

yarn why v1.22.10
[1/4] 🤔  Why do we have the module "tar"...?
[2/4] 🚚  Initialising dependency graph...
[3/4] 🔍  Finding dependency...
[4/4] 🚡  Calculating file sizes...
=> Found "tar@2.2.2"
info Reasons this module exists
   - "sqlite3#node-gyp" depends on it
   - Hoisted from "sqlite3#node-gyp#tar"
info Disk size without dependencies: "1.49MB"
info Disk size with unique dependencies: "1.67MB"
info Disk size with transitive dependencies: "2.14MB"
info Number of shared dependencies: 13
=> Found "node-pre-gyp#tar@4.4.19"
info This module exists because "sqlite3#node-pre-gyp" depends on it.
info Disk size without dependencies: "232KB"
info Disk size with unique dependencies: "508KB"
info Disk size with transitive dependencies: "612KB"
info Number of shared dependencies: 8
✨  Done in 0.65s.

Looks like sqlite3 brings in node-gyp which brings in tar at an offending version.

I can't yet tell how much we might or might not need to worry about these. I think TryGhost/node-sqlite3#1493 is the issue where the warnings are being tracked in sqlite3.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Yeah, seems like sqlite3 is not being particularly maintained.

Perhaps I'll switch to the vscode folks' fork of it, as discussed here: TryGhost/node-sqlite3#1493 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, looking into it, I don't think the vscode fork is really any more maintained -- they've pretty clearly signaled it's for their own internal use, and the way they've treated it is all consistent with that: microsoft/vscode-node-sqlite3#14 (comment)

Instead, I'll just bump the node-gyp version in a Yarn resolution: TryGhost/node-sqlite3#1493 (comment)

@@ -88,6 +94,23 @@ export class SQLDatabase {
}
}

// An absurd little workaround for expo-sqlite, or really
// the @expo/websql library under it, being too eager to check a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@expo/websql

FYI expo/node-websql#1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, thanks for reporting that. I did locate (I forget how) that that seemed to be the repo that package was published from; it'd be good for it to update its "repository" metadata on NPM to match.

Comment on lines 81 to 82
// transaction callback itself throws an exception, and to get the whole
// database object stuck if a statement callback throws an exception.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// […] to get the whole
// database object stuck if a statement callback throws an exception.

Wow, glad you caught this! Do you think there might be lingering cases to be handled where the whole DB object can get stuck? I wonder if we could get good logging to Sentry when this happens.

Does this happen when an error is thrown from within an error callback, not a success callback, and should we handle that as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think there might be lingering cases to be handled where the whole DB object can get stuck? I wonder if we could get good logging to Sentry when this happens.

Yeah, good question. I'm not prepared to rule out that that could happen, but I think it would at least be a distinct issue -- I think this issue is completely resolved by what's in the wrapper now.

Specifically the issue is that in node_modules/@expo/websql/lib/websql/WebSQLDatabase.js, the _running property on the WebSQLDatabase instance gets set to true, blocking any new transactions, and won't get set back to false until _onTransactionComplete is called. Looking at its sibling WebSQLTransaction.js, it won't call that until the transaction's own _running property gets set back to false. And at the call sites of batchTask.sqlCallback and batchTask.sqlErrorCallback, if those throw an exception then that propagates right past where we were going to call the local onDone, which was the place that was supposed to set ._running back to false.

In our wrapper, the callbacks passed to executeSql are always trivial and shouldn't throw; they just resolve or reject a promise. So their caller in WebSQLTransaction.js should always get through them and make it to onDone, so things don't get stuck. (If app code hits an error, that happens in the promise's callback (or further down the line), after we resolve or reject it and after we've returned to the WebSQLTransaction code to let it complete its work.) That leaves the danger it'll go ahead and commit when it shouldn't.

And that in turn is resolved by the hold loop keeping something in the queue, and putting something error-causing there when the transaction should fail.

Does this happen when an error is thrown from within an error callback, not a success callback, and should we handle that as well?

Yeah, as described above, it does and we do. "A statement callback" was meant to cover both kinds of callbacks from executeSql, but could perhaps be more explicit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a link in that code comment back to this subthread.

const hold = () => tx.executeSql('SELECT 1', [], () => (done ? undefined : hold()));
const hold = () =>
tx.executeSql('SELECT 1', [], () =>
error ? tx.executeSql('SELECT error') : done ? undefined : hold(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. It looks like tx.executeSql('SELECT error') is meant to cause an error beneath the application-level code, and that's how we cause the transaction to roll back (rather than, I dunno, some bit of API on the transaction object).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly. There is no such API on the transaction object, sadly… unless you count this, and other variations of "attempt an invalid statement".

That aspect goes back to the old proposed Web standard that the API is based on:
https://www.w3.org/TR/webdatabase/#asynchronous-database-api

f: () => void | Promise<void>,
): Promise<void> {
let done = false;
const hold = () => tx.executeSql('SELECT 1', [], () => (done ? undefined : hold()));
Copy link
Contributor

@chrisbobbe chrisbobbe Jan 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it'd be pretty pathological for a future/different @expo/websql implementation to have executeSql call the callback synchronously. When that occurred to me, I was more convinced that we can't have an infinite recursion problem here. (Right?)

I wonder about a strain on performance caused by these rapid-fire 'SELECT 1' statements at all the different layers. There will be more two-way traffic on the React Native bridge; I'm not sure what that might mean for memory. Worth at least manually testing how it feels when we're writing something huge, like users on CZO, in a future PR where we start actually using this new storage setup.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it'd be pretty pathological for a future/different @expo/websql implementation to have executeSql call the callback synchronously. When that occurred to me, I was more convinced that we can't have an infinite recursion problem here. (Right?)

Yeah, that would be quite pathological. Generally in a callback-based API, it's a principle of good API behavior that if the callback's invocation is usually asynchronous, or ever asynchronous, or the API looks like it might be asynchronous, then it's only ever invoked asynchronously and never synchronously, i.e. it's never invoked before the API function has returned to its caller. When a callback is sometimes unexpectedly called synchronously, it's tricky for users of the API to avoid having subtle bugs.

There's a bit of discussion of this on a page of Node's docs that I ran across while trying to understand process.nextTick:
https://nodejs.org/en/docs/guides/event-loop-timers-and-nexttick/#why-would-that-be-allowed

More concretely, the Web SQL Database spec which this library generally aims to follow specifically requires that the statement just gets queued up, not synchronously executed:
https://www.w3.org/TR/webdatabase/#executing-sql-statements
and so a fortiori the statement callbacks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder about a strain on performance caused by these rapid-fire 'SELECT 1' statements at all the different layers. There will be more two-way traffic on the React Native bridge; I'm not sure what that might mean for memory. Worth at least manually testing how it feels when we're writing something huge, like users on CZO?

Indeed.

Note that as one mitigating factor, the statements are run in order -- so when we queue one of these SELECT 1 statements after some other statement, its callback won't get run (and possibly put another one on the queue) until that other statement completes. I believe that means that in a transaction with N actual app-level statements, this workaround will only add N+2 of these queue-keepalive SELECT 1 statements, no matter how big or slow the app-level statements are.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when we queue one of these SELECT 1 statements after some other statement, its callback won't get run (and possibly put another one on the queue) until that other statement completes

Mm, interesting, I didn't think of that. That's good, I guess!

I believe that means that in a transaction with N actual app-level statements, this workaround will only add N+2 of these queue-keepalive SELECT 1 statements, no matter how big or slow the app-level statements are.

I think I'm counting N+1. Where do you get the 2 in N+2?

Also, if one of these SELECT 1 statements is alone in the queue and other app-level statements aren't being enqueued, can't we accidentally break such a bound by running SELECT 1 over and over until another app-level statement gets enqueued and gets its turn? I think that could happen if there's some async non-statement work we're waiting for, like compression/decompression steps you mention in the issue description, for migrations. You've mentioned here:

We'll need to be mindful not to put real slow non-database steps in the middle of a transaction, which would be a similarly bad user experience.

I'm not sure what if any noticeable strain on performance these SELECT 1 statements will have. Possibly none or not much.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm counting N+1. Where do you get the 2 in N+2?

Hmm, not sure. When I think through it again now I only see N+1 🙂 -- one before each app-level statement, and one at the end.

Also, that is a valid upper bound but here's a tighter one: if the app-level code awaits a tx.executeSql only M times, then we'll have at most M+1 of these SELECT 1 statements. In particular, if you have code like

  db.transaction(tx => {
    tx.executeSql();
    tx.executeSql();
    tx.executeSql();
    tx.executeSql();
  });

where you just issue a bunch of statements but don't wait for any of their results, then we'll only do SELECT 1 once.

Also, if one of these SELECT 1 statements is alone in the queue and other app-level statements aren't being enqueued, can't we accidentally break such a bound by running SELECT 1 over and over until another app-level statement gets enqueued and gets its turn? I think that could happen if there's some async non-statement work we're waiting for, like compression/decompression steps you mention in the issue description, for migrations.

Ah, yeah, my comment above was stated too broadly. That's right -- this bound is really about what happens if the app-level code is doing a bunch of SQL statements but not doing anything else that waits for I/O (or for a timer); in other words, if the only things it ever awaits are either a tx.executeSql(…) or a promise that's already resolved.

I'm not sure what if any noticeable strain on performance these SELECT 1 statements will have. Possibly none or not much.

I think a key thing in our usage will be that outside of migrations, all our transactions will be entirely synchronous: they're all inside the replacement AsyncStorage, and those never have a transaction callback wait for anything before enqueueing all the statements it wants to do. So they only ever have a single SELECT 1 added here.

That means the performance impact should be basically limited to migrations. It's pretty OK if those run somewhat slower, because they're infrequent. A migration would have to take over a second for its performance to start really being a problem. Compressing or decompressing a few megabytes should take a very small fraction of that, so it'd have to be a very large slowdown to be a problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, sounds good! Thanks!

@@ -136,4 +140,143 @@ describe('expo-sqlite', () => {
// Instead, we get:
expect(data).toEqual({ double: 8 }); // bad: missing the second INSERT
});

describe('transactions failing in app code', () => {
Copy link
Contributor

@chrisbobbe chrisbobbe Jan 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expo-sqlite tests: Show transactions' (non-)handling of exceptions

Is there a connection between the "(non-)handling of exceptions" and the "broken asynchrony handling" (from an earlier commit) that you'd like to highlight? For example, is one a consequence of the other in some interesting way?

Or can I think of these as separate messes that we're cleaning up, as I try to follow everything we're doing in this PR? 🙂

I'll take a closer look once I have your answer. 🙂

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they can basically be thought of as separate.

In particular, the Web standards approached these issues in basically the opposite order:

  • The Web SQL Database spec requires that exceptions be handled, in a basically reasonable way: if the transaction callback, or a statement callback, raises, then the transaction gets rolled back.

    • So the fact that node-websql doesn't do this is simply a bug.
    • (Reading closely, in the case of a statement error callback the text isn't totally clear about what happens if it raises. But I believe it does end up requiring that that's a rollback -- "the error callback did not return false" includes the case where the error callback didn't return anything and instead threw.)
  • But the eager auto-committing that's incompatible with asynchrony in the app code is faithful to the spec.

    And the IndexedDB spec, which is the actual Web standard that came after it, makes the same choice and explicitly explains it:

    Transactions are expected to be short lived. This is encouraged by the automatic committing functionality described below.

    Authors can still cause transactions to run for a long time; however, this usage pattern is not advised as it can lead to a poor user experience.

So in a perfect implementation of the Web SQL draft standard, or of the related actual Web standard, the second of our issues is fixed, but the one we fix first remains unfixed.


Re that warning in the IndexedDB spec:

I think what they have in mind for "Authors can still cause transactions to run for a long time" is basically to do something just like our workaround, and to do so in a situation where the thing the workaround waits for is something that potentially takes a truly long time, like a network request. Then the user's device would sit there with its CPU spinning in a loop of SELECT 1 filler requests (just spelled differently because IndexedDB isn't SQL), while the request takes 60 seconds timing out.

We'll need to be mindful not to put real slow non-database steps in the middle of a transaction, which would be a similarly bad user experience. But I have more confidence in us avoiding that than the browser developers have (reasonably!) in the authors of, say, the ad-targeting scripts that are all over any news site and much of the rest of the Web.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks! This is very helpful.

@gnprice
Copy link
Member Author

gnprice commented Jan 10, 2022

Thanks @chrisbobbe for the careful review! Replied to threads above, and pushed a revision that I believe addresses all the discussions. Please take a look!

Copy link
Contributor

@chrisbobbe chrisbobbe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks! I agree that the previous discussions here on the PR are resolved.

I've just gone and caught up on the discussion on CZO. I have one specific comment from that (see below), and I replied there with a more general question: https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/sqlite.20.23M4841/near/1307676

}

/**
* Convenience method for a single read-only query.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should say that the query is run in its own transaction.

I think we want to design our wrapper to never expose a way to execute SQL statements outside of a transaction, right? That would solve a potential problem Anders points out on CZO:

[…] you could get incorrect interleaving if the external event triggers a non-transactional query during an asynchronous action inside a transaction?

(It solves it because it prevents us from ever triggering non-transactional queries anywhere.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, good thought.

We'll be adding a bit more code in this area shortly.  "Storage"
makes a more apt name for this than "boot".
This isn't quite the latest version; for now we'll use v9.x in order
to stick with "unimodules".  Issue zulip#5133 is open for the migration
that'll let us advance to v10.x.

Include a libdef, generated with flowgen with small manual fixups
as described in comments.  The libdef is actually from a newer
version because that's what I tried first; I tried rerunning flowgen
on the older version, and the differences in the new one are pure
compatible improvements (cleaning up some type definitions; adding
comments; replacing `any[]` as the type for `args` on `executeSql`.)
Studied the implementation (at 10.0.3) and confirmed that this array
doesn't get mutated.
We'll use this to get SQLite in the Jest environment, on Node.
In particular this leads to using a reasonably recent `tar` package,
fixing vulnerabilities in the old one it was using.

Upstream has already bumped this to node-gyp 7.x in their master
branch, but haven't posted a release to NPM:
  TryGhost/node-sqlite3#1493

Empirically node-gyp 8.x, the latest, works fine.  That's also
reported by someone on that issue thread:
  TryGhost/node-sqlite3#1493 (comment)
May as well go for that, then.  (There was no 8.x yet when the
version specified in sqlite3 was bumped to 7.x.)

Some other people on that thread report using a fork made by the
VS Code developers, which posted some releases in November.  But
that fork seems pretty clearly intended for VS Code's own internal
use, with no promises for broader consumption:
  microsoft/vscode-node-sqlite3#14 (comment)
so that doesn't seem like an improvement over upstream.
This is much, much nicer for actually writing the types than a
libdef would be.  In particular iterating on changes is much
faster and easier, because Flow does its usual incremental thing
rather than restarting from scratch on each edit.

More details in some new docs text in howto/libdefs.
The `@expo/websql` package (aka `node-websql`) that provides most of
the JS layer of expo-sqlite uses in turn the `immediate` package to
get an implementation of basically `setImmediate`.

In Jest tests, that doesn't work; the symptom is that the test hangs
and then times out at 5 seconds, not having completed.  Looking at the
`immediate` implementation, it relies on `process.nextTick` (when
that's present, i.e. on Node).  With Jest's "modern" fake timers --
which we now use, and have for a while -- that's mocked out, so that
explains why the "immediate" callbacks never get run.  (The
implementation also uses `setTimeout`, in what seems to be a
convoluted substitute for try/catch.  So that'd be an additional
reason it wouldn't work, if it got that far.)

We live in a JS world that has Promises.  Given those, setImmediate
has a trivial implementation.  Replace the whole module with that.

Quite likely we want a similar substitution in the actual app, but
leave that for another time for now, because in the app this module
seems to at least be working at all.
Mostly this amounts to a fourth platform-specific implementation for
expo-sqlite: upstream has implementations for Android, iOS, and web,
and this is one for Node.

The one "mock" aspect is that this keeps the underlying databases in
memory, rather than on disk.  That's helpful for isolation.  There's
also an extra function provided, to delete a database.

Include a small smoke-test to demonstrate that this does indeed run.
These just exercise simple happy cases of using transactions.

If you do something more complicated -- like the transaction's code
hits an unexpected error midway through, or you need to do some
asynchronous work with the result of one query before making the
next -- then expo-sqlite turns out to not behave so well.

Those tests get more complicated to write (partly because of the
bad behavior itself, and the need to protect the Jest environment
from it), so we'll add them separately.  We'll also work around
them in the promisified wrapper we'll be adding over expo-sqlite.
…Jest

As we add more, and more-complex, tests here, these will help keep
them from getting too noisily verbose with just the boring mechanics
of wiring up expo-sqlite's callbacks-based API to the flow of Jest.
This test shows that if using expo-sqlite you try to make a
transaction that involves some asynchronous work between SQL queries,
the transaction gets committed in a half-done state.

This is a limitation that stems from Web SQL Database, the API (an
old proposed standard, implemented in Chromium-based browsers but no
others) that expo-sqlite mostly provides: that API doesn't provide
any way for the implementation to know when you're done with any
async work, so effectively it assumes you don't try to do any.

Moreover there's an implementation bug, which isn't due to the API:
when you do try to do this, the subsequent statements you try to add
to the transaction get silently ignored, with no error.

We'll work around this issue in the promisifying wrapper we'll add
atop expo-sqlite.  Meanwhile, this test serves to demonstrate what
the issue is in the first place.
The base behavior is the same as the one-line implementation we'd
had, but now there are some knobs for tests to turn.

This will let us test code -- for example expo-sqlite, and its
dependency `@expo/websql` / node-websql -- that uses `immediate`
and allows exceptions to propagate to it unhandled.
We'll work around this behavior with our promisified wrapper.
This simple version has some important caveats -- limitations of
the underlying expo-sqlite library, which become more salient with
the higher expectations set by a Promise-based API.  We'll add
workarounds in upcoming commits to resolve those issues.
@gnprice
Copy link
Member Author

gnprice commented Jan 11, 2022

OK, and merged! Thanks again @chrisbobbe for the thorough reviews.

@gnprice gnprice deleted the pr-sqlite branch January 11, 2022 01:44
@chrisbobbe
Copy link
Contributor

Yay, glad to have this in! 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a-data-sync Zulip's event system, event queues, staleness/liveness P1 high-priority
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants