Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposed implementation of type generation #1106

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

julienmalard
Copy link
Contributor

Seems to pass...

@haadcode
Copy link
Member

Thanks @julienmalard!

Is there a way to achieve this by having the types generation in a separate repo? Or do they need to be in the repo they’re generated from (ie. this repo)?

@julienmalard
Copy link
Contributor Author

They would need to be in the same repo as tsc needs access to the source code and JSDoc comments. However, it doesn't change any of the source code files (these can continue to be javascript) and simply outputs .d.ts declaration files to the dist folder at compilation time.

@julienmalard
Copy link
Contributor Author

P.S. Alternatively, these could be configured to generate types into a separate subfolder in the dist folder.

@haydenyoung
Copy link
Member

@julienmalard is this something that could be handed off to DefiniteTyped?

@julienmalard
Copy link
Contributor Author

@haydenyoung Probably not, since these are already included as JSDocs in the main repo. From what I understand, moving to DefinitelyTyped would require duplicating the repo, with potential syncronisation difficulties later on.

This PR just has TS compile the JSDoc comments already present in the repo to TS declaration files that anyone using orbit-db in a TypeScript project can use, but it doesn't change anything regarding the compilation process or actual JS files themselves.

@@ -1,9 +1,3 @@
/**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for removing this jsdoc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TypeScript gives an error when it is there, saying that modules cannot import from outside the module. I'm not too sure how to fix it...any ideas?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because it shouldn't be there but a few lines below, just before the Heads function, if you move it the error goes away

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oooh interesting. Thanks :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just fixed that, and also updated the PR to output type declarations to dist/types to have a cleaner dist directory.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice ty. Read my comment below for the problems with generating types in the current state and my proposed solution

@nicocossiom
Copy link

nicocossiom commented Oct 24, 2023

After sometime trying to figure how to bring TS support here are my thoughts on this matter:

Problems with tsc+jsdoc

The declaration code generated through tsc with jsdoc is incomplete to say the least. The problems are two-fold:

Module design pattern

  • Module design pattern used in Orbitdb does not play nicely with tsc as all it sees are functions that return objects that represent this modules. The object's members are not typed usually and the return typings for jsdoc are invalid Typescript types/interfaces, i.e.
/**
 * Creates an instance of OrbitDB.
 * @function createOrbitDB
 * @param {Object} params One or more parameters for configuring OrbitDB.
 * @param {IPFS} params.ipfs An IPFS instance.
 * @param {string} [params.id] The id of the identity to use for this OrbitDB instance.
 * @param {module:Identity|Object} [params.identity] An identity instance or an object containing an Identity Provider instance and any additional params required to create the identity using the specified provider.
 * @param {Function} [params.identity.provider] An initialized identity provider.
 * @param {module:Identities} [params.identities] An Identities system instance.
 * @param {string} [params.directory] A location for storing OrbitDB data.
 * @return {module:OrbitDB~OrbitDB} An instance of OrbitDB.
 * @throws "IPFS instance is required argument" if no IPFS instance is provided.
 * @instance
 */
 const OrbitDB = async ({ ipfs, id, identity, identities, directory } = {}) => {
 // rest of the code 
 ...
 // the actual module (object)
 return {
    id,
    open,
    stop,
    ipfs,
    directory,
    keystore,
    identity,
    peerId
  } 

As can be seen the return is typed as module:OrbitDB~OrbitDB, but tsc doesn't understand this since it is not valid TS. Also even if one would think TS might be smart enough to infer types, none of the members of the object are correctly typed, and even if they where I don't think this would work as tsc would pickup the return type from the function description.

This is the declaration generated from this code:

declare function OrbitDB({ ipfs, id, identity, identities, directory }?: {
    ipfs: IPFS;
    id?: string;
    identity?: any;
}): any;

As can be seen it's pretty much useless.

JSDoc module: typings and undefined types/interfaces

  • JSDoc documentation clashes with the solution, as it is to correctly type module members using JSDoc. For the previous example we could define a type
/**
 * @typedef {Object} OpenFnParams
 * @property {string} OpenFnParams.directory - The directory to use for this database.
 * ....etc
 */

/** 
 * @typedef {( params: OpenFnParams) => import("./database.js").Database} openFn
*/

/**
 * @typedef {Object} OrbitDB
 * @property {string} OrbitDB.id - The id for this OrbitDB instance.
 * @property {import("ipfs-core").IPFS} OrbitDB.ipfs - The IPFS instance used by this OrbitDB instance.
 * @property {string} OrbitDB.directory - The directory used by this OrbitDB instance.
 * @property {import("./key-store.js").KeyStore} OrbitDB.keystore - The keystore used by this OrbitDB instance.
 * @property {import("./identities/identity.js").Identity} OrbitDB.identity - The identity used by this OrbitDB instance.
 * @property {import("@libp2p/interface-peer-id").PeerId} OrbitDB.peerId - The peerId used by this OrbitDB instance.
 * @property {{ [key: string]: import("./database.js").Database }} OrbitDB.databases - The databases used by this OrbitDB instance.
 * @property {()=>void} OrbitDB.stop - Stops OrbitDB, closing the underlying keystore and manifest store.
 * @property {openFn} OrbitDB.open - Opens a database.
 */

The generated code looks something like this:

export type OpenFnParams = {
    directory: string;
    // etc
};
export type openFn = (params: OpenFnParams) => any;
export type OrbitDB = {
    id: string;
    ipfs: import("ipfs-core").IPFS;
    directory: string;
    keystore: any; // since they were not typed well the inferred type from import is any
    identity: any; // same here, we need to also typedef these two in their source files
    peerId: import("@libp2p/interface-peer-id").PeerId;
    databases: {
        [key: string]: any;
    };
    stop: () => void;
    open: openFn;
};
// etc

The problem is that this would mess up the JSDoc documentation site, which is quite nice, but the whole module:Orbit~Whatever is JSDoc syntax which doesn't hold any type meaning but rather is just a way to structure the documentation (or that's my understanding).

The solution: hand-made declarations

What I did to generate typings is to directly define the interfaces and types one for one after generating the declaration files. This was quite painstakingly slow but once done it works nicely. See my fork or orbit which adds a types folder which is referenced from package.json to get the types from there.

An example of use in TS:

const ipfs = await createIPFS()
const orbitdb = await createOrbitDB({ ipfs: ipfs })
const docstore =
        await this.orbitdb.open(
          "test",
          {
            Database: Documents<{ user: string, age: number }, "user"> ({ indexBy: "user" })
          })

The type of docstore would be

const docstore: Documents<{
    user: string;
    age: number;
}, "user">

So now all operations on the document are correctly typed hence passing data that doesn't fit the described document would fail on compile time, i.e:

await docstore.put({ user: "alice", age: 24 }) // works as expected
await docstore.put({ username: "alice", age: 24 }) 
/* Fails with Argument of type '{ username: string; age: number; }' 
is not assignable to parameter of type '{ user: string; age: number; }'.
*/

Why this should be the implemented solution going forward

This works perfectly and IMO should be the way to give TS support to orbitdb until some rewrite in TS is done at some point. I understand this last point might be controversial since the rewrite would be quite vast and maybe the main contributors prefer JS. I am not going to get into a debate over this, as long as someone is willing to do as I did here no rewrite is necessary.

I plan to do a PR so my fork with types is merged into main and orbitdb can be used in conjunction with TS. Only left writing some documentation and that's it.
What I would say should be a non-negotiable is to include typings along with the @orbit/core package and not as a separate package. Since most of the changes that are going to happen from here on the near future won't be many I think keeping parity between types and JS source is very feasible.

Any advice, recommendations, documentation, PR's or issues are very welcome on my fork here. Also, I hope this post explains well the current problems and why my proposed solution makes it work for everyone.

@julienmalard
Copy link
Contributor Author

Thanks @nicocossiom for this analysis! I don't know JSDoc very well, but think that your solution seems better (though it would still entail keeping the types folder up-to-date with the source code).
Regarding typing data input into databases, should this be done at the orbit-db level, or perhaps better with a separate package (e.g., https://github.com/reseau-constellation/bohr-db) that also enforces the types at runtime? If we just check types at compile time, this could I think give a false sense of security since values that don't repect the typings could have been added to the repository before or else be received from a non-conforming peer.

@nicocossiom
Copy link

nicocossiom commented Oct 24, 2023

@julienmalard Thanks for your rapid response. Regarding the issue of keeping up typings with the source code:
It is what it is, either

  • Codebase moves to TS (switch could be made by parts but since the source of orbit is not that big I think doing it as a whole would make more sense).
  • Codebase keeps being JS while declaring types in JSDoc, which would be very annoying but a more integrated and feasible solution.

This last one would mean that the API documentation site (api.orbitdb.org) would suffer some changes as explained in my previous comment (the whole module:Orbitdb... stuff).

To be frank I don't really understand the module thing, the functions that are typed as modules are just functions that when called return objects with methods and other members that represent these so called modules. Maybe it's because I don't understand what modules are in JSDoc terms or in JS in general (I am 22 just graduated CS and have not been really coding for that long).

I think that the short-term answer is the second one for sure. If @haadcode or @haydenyoung and the other maintainers agree I wouldn't mind transforming the manually typed .d.ts files I have and transform them into JSDoc comments inside the source code that can generate the same .d.ts files with tsc, tedious but very doable and also easier to maintain in the long term (not to mention a somewhat better development experience since IDE's do pickup on the typings provided by JSDoc).

In relation to your input about data type safety:
The compiler can only check if the data you put into the database matches the shape you described, but it cannot guarantee that the data already in the database or received from other peers has the same shape. You can even "trick" the compiler to force it to feed it the data with double assertions (data as unknown as T, T being whatever you want the compiler to think the database holds).
You can even create a database of Event<number>, close it and reopen the same database but giving the open function the typing Event<string> and TS will not complain.
This is obvious since there is no runtime checks for this and is expected behaviour. I do agree that this type of assertions can be tricky since they are neither real nor checked, and can give that "false sense of security" that you talked about.

I have to say your proposal of runtime validation with bohr-db seems awesome, I haven't had any time to check it out (with code I mean) but if it does work as advertised I think this takes orbitdb to the next level. I see you have used Proxy for this which seems like a good solution, though I wonder about other peers (that do not use these Proxies) getting access to a database and manipulating it. Is this a possible scenario?.
I guess this is an access controller problem, not a data validation one. Maybe an access controller can be implemented that ensures only those using a Proxy can write to a db (idk how though).

Also, the other stores you have developed seem quite interesting as well, maybe they should be integrated into the @orbitdb/core package.

@julienmalard
Copy link
Contributor Author

julienmalard commented Oct 25, 2023

Hello @nicocossiom ! I think you're right about the typing options (i.e., sticking to manual types) for as long as the orbit-db code is written in JS. I majored in agriculture and don't understand JSDoc modules completely either ! But perhaps @haadcode or @haydenyoung could confirm. If they agree to your proposition to merge your .d.ts files, I'll close this PR as superceded.

Regarding datatypes, I agree with your examples. Thanks for your enthusiasm for bohr-db! I used a Proxy so as to only modify the Database methods of interest for runtype type checking; there isn't any way for the access controller to know if you are using a Proxy or not (and even less to know if another user did). But the Proxy does check data at both write and read time, so any invalid entries added by someone else (or even by the current user bypassing the proxy, as in the tests) will be rejected and excluded from the returned data.

Regarding the kuiper stores, if the orbit-db maintainers like them, I'd be more than happy to hand them over to the main repo. I'd also be grateful for feedback on the ordered-keyvalue store; it works for simple examples but there may be edge cases to consider more carefully.

@haydenyoung
Copy link
Member

@nicocossiom I'm happy going forward with your proposed solution. Even though it may mean maintaining, manually, type declarations using a types dir. While I'd prefer not to be managing docs manually, as you say, the changes moving forward should be minimal, I don't envisage large rewrites of functionality. And if we have to store type declarations in a dir within the orbitdb project, so be it. The community has asked for typedefs so we will ensure OrbitDB meets the expectations of the community.

To be frank I don't really understand the module thing, the functions that are typed as modules are just functions that when called return objects with methods and other members that represent these so called modules.

I think your summary is accurate. My understanding is there's nothing particularly special about modules besides some var scoping it's just a method for packaging code in discrete chunks. I would say an unclear path in multiple documentation projects has probably led to a lot of the complexities of ts/js doc generation.

Regarding the kuiper stores, if the orbit-db maintainers like them, I'd be more than happy to hand them over to the main repo. I'd also be grateful for feedback on the ordered-keyvalue store; it works for simple examples but there may be edge cases to consider more carefully.

It is worth investigation although I do like the modular nature of store packages and that they can be sourced from various locations. This keeps the main codebase simple and clean while providing a framework for pluggable storage mechanisms from various 3rd parties. Anyway, perhaps a discussion in a separate issue or on the Matrix channel.

@haadcode I'm fine moving ahead with @nicocossiom proposal if you are happy with it.

Thanks @julienmalard and @nicocossiom for all your input.

@julienmalard
Copy link
Contributor Author

@haydenyoung Thanks; I'm happy to keep maintaining the kuiper stores as individual packages as well.

@MichaelJCole
Copy link

MichaelJCole commented Nov 19, 2023

@nicocossiom thanks for this, it's a huge contribution, and thanks to @julienmalard for getting it started.

Orbitdb should get a "TS" badge on npmjs.com once it's published.

Some examples:

The types should only need to change when the API changes, and it's easy for type-nerds to submit drive-by PRs to fix the types as needed. Mostly the typing helps with tooling like vs-code. Anyways, thank you!

@haadcode
Copy link
Member

haadcode commented Feb 1, 2024

Hi everyone,

We've discussed the PR with @haydenyoung as well as what our goals are here. The main hurdle we want to avoid is to add workload in developing and maintaining orbitdb (the JS implementation, this one here). We don't want to be in a situation where making changes to the JS implementation requires manual work and a dependency exists in order to keep TS related things updated. We also want to avoid complexity as much as possible.

The ideal solution here is that the types are in a separate repo, module and in a separate npm package. If this is not directly possible, the alternative would be to create a separate TS implementation.

A TS implementation would allow for this repo to be the canonical implementation and not entwined with TS. This would allow both implementations to move forward as seen best by the maintainers of each repo. If you're interested creating and maintaining a TS implementation, we'd be happy to add it to this org, publish it under the @orbitdb scope in npm and generally make it an official implementation of OrbitDB.

Let us know if you're interested and want to do that, or if you have an alternative solution that satisfies the aforementioned requirements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants