
Prototype for jj api #3601

Draft · wants to merge 4 commits into main
Conversation

@matts1 (Collaborator) commented Apr 30, 2024

This is a prototype of jj api. It doesn't have anything like tests or documentation, but it works and I'd like to get approval on the general code structure before continuing.

Checklist

If applicable:

  • I have updated CHANGELOG.md
  • I have updated the documentation (README.md, docs/, demos/)
  • I have updated the config schema (cli/src/config-schema.json)
  • I have added tests to cover my changes

This crate contains only the minimal set of code for an external library to start using it.

To try this out, run:

```shell
cargo run api grpc --port 8888
```

Then run:

```shell
cargo run api_client
```

```proto
message ListWorkspacesRequest {
  jj_api.objects.RepoOptions repo = 1;
  google.protobuf.FieldMask field_mask = 2;
```

@matts1 (Collaborator, Author) commented:

I think that field masks would be really handy because sometimes you only want change IDs, while other times you want full changes. Unfortunately, although the type exists, there's currently no way to actually do anything based on these field masks in Rust.
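For reference, this is roughly the kind of manual handling a field mask would need today — a sketch only, assuming a prost_types::FieldMask and a hypothetical Workspace response type (not the real generated messages):

```rust
use prost_types::FieldMask;

// Hypothetical response type, not the actual generated message.
#[derive(Default, Clone)]
struct Workspace {
    change_id: String,
    description: String,
}

// Return only the fields named in the mask; an empty mask means "everything".
fn apply_field_mask(full: &Workspace, mask: &FieldMask) -> Workspace {
    if mask.paths.is_empty() {
        return full.clone();
    }
    let mut out = Workspace::default();
    if mask.paths.iter().any(|p| p == "change_id") {
        out.change_id = full.change_id.clone();
    }
    if mask.paths.iter().any(|p| p == "description") {
        out.description = full.description.clone();
    }
    out
}
```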

@@ -0,0 +1,25 @@

```rust
use std::path::{Path, PathBuf};
```

@matts1 (Collaborator, Author) commented:

Serialization and deserialization are a pain here. I wrote a bunch of wrappers because it gets to be a massive pain: the proto fields are represented as plain strings, but in reality the type we usually want is Option<String>, Option<Path>, Option<ChangeID>, etc.

I'd like to see if we can think of a better way to do it, but I'm not hopeful.

Maybe we need to have every proto message have its own Rust type, and implement:

```rust
impl From<Change> for ChangeProto { ... }

impl TryFrom<ChangeProto> for Change {
    type Error = tonic::Status;

    // ...
}
```
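To make that concrete, here is a rough sketch of what such a conversion could look like, assuming an empty proto string maps to None (the type and field names are illustrative, not the actual jj_api types):

```rust
use std::path::PathBuf;

// Illustrative hand-written type.
struct Change {
    change_id: Option<String>,
    repo_path: Option<PathBuf>,
}

// Stand-in for the prost-generated message, where every field is a String.
struct ChangeProto {
    change_id: String,
    repo_path: String,
}

impl TryFrom<ChangeProto> for Change {
    type Error = tonic::Status;

    fn try_from(proto: ChangeProto) -> Result<Self, Self::Error> {
        // A real conversion would also validate the fields and return
        // Status::invalid_argument on bad input.
        Ok(Change {
            // Treat an empty proto string as "not set".
            change_id: (!proto.change_id.is_empty()).then(|| proto.change_id),
            repo_path: (!proto.repo_path.is_empty()).then(|| PathBuf::from(proto.repo_path)),
        })
    }
}
```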

```rust
}

#[tonic::async_trait]
impl JjService for GrpcServicer {
```

@matts1 (Collaborator, Author) commented:

This is mostly just to get around the fact that the gRPC service trait requires async functions, while I intend for the jj CLI to call the sync functions directly.

```rust
}

#[tokio::main(flavor = "current_thread")]
pub async fn start_api(
```

@matts1 (Collaborator, Author) commented:

I don't really like the fact that the gRPC server is started in the jj_lib crate rather than the jj_cli crate. However, I anticipate that:

  1. The jj daemon will likely need to start this server
  2. My proposal for generalized hook support (FR: Generalized hook support #3577) will require jj_lib to ensure that an API server is running.

Happy to move this to jj_cli if we don't think that's the case.
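For context, a minimal sketch of what this entry point might boil down to with tonic — the port parameter, the generated JjServiceServer wrapper, and the GrpcServicer construction are assumptions here, not necessarily what the actual code does:

```rust
use tonic::transport::Server;

#[tokio::main(flavor = "current_thread")]
pub async fn start_api(port: u16) -> Result<(), Box<dyn std::error::Error>> {
    let addr: std::net::SocketAddr = format!("127.0.0.1:{port}").parse()?;
    Server::builder()
        // JjServiceServer would be the tonic-generated wrapper around JjService.
        .add_service(JjServiceServer::new(GrpcServicer::default()))
        .serve(addr)
        .await?;
    Ok(())
}
```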

```rust
    opts: &Option<jj_api::objects::RepoOptions>,
) -> Result<WorkspaceLoader, Status> {
    opts.as_ref()
        .map(|opts| from_proto::path(&opts.repo_path))
```

@matts1 (Collaborator, Author) commented:

This is a lot of work just to get one field, which is why I'm considering a mechanism where servicer.rs takes and returns type-safe structs, and grpc_servicer.rs only does the conversion. Something like:

```rust
// servicer.rs
fn list_workspaces(
    &self,
    request: TypeSafeListWorkspacesRequest,
) -> Result<TypeSafeListWorkspacesResponse, Status> {
    // ...
}

// grpc_servicer.rs
async fn list_workspaces(
    &self,
    request: Request<ListWorkspacesRequest>,
) -> Result<Response<ListWorkspacesResponse>, Status> {
    self.servicer
        .list_workspaces(request.get_ref().try_into()?)
        .to_proto()
        .map(Response::new)
}
```

```rust
    })
}

fn repo(
```

@matts1 (Collaborator, Author) commented:

I suspect that we may be able to take advantage of caching to improve the performance of functions like this. That seems like work for later, though.
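A rough sketch of the kind of caching that could mean — memoizing the loaded repo inside the servicer so repeated calls don't redo the load (the types here are stand-ins, not the actual jj_lib API):

```rust
use std::sync::OnceLock;

// Stand-in for whatever the expensive-to-build repo handle actually is.
struct LoadedRepo {}

struct Servicer {
    cached_repo: OnceLock<LoadedRepo>,
}

impl Servicer {
    fn repo(&self) -> &LoadedRepo {
        // The expensive load runs at most once per Servicer instance;
        // later calls return the cached value.
        self.cached_repo.get_or_init(|| LoadedRepo {})
    }
}
```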

@gulbanana (Collaborator) commented:

some interesting stuff here. you've noted that the idea is that eventually the jj CLI would be reimplemented on top of this API... do you have any thoughts on how that would interact with the server component and its lifetime management? is the "servicer" supposed to be a lower level piece, used by both server and CLI?

@gulbanana (Collaborator) commented:

more specifically, do you think it would be feasible for the lower-level apis to be hosted in a long-running server that isn't request-response like gRPC? jj web, for example, might want to use a websocket connection with server->client push.

@matts1 (Collaborator, Author) commented May 1, 2024

The intention was that all communication would be done via the Servicer class, which just takes FooRequest protos and returns FooResponse protos.

This would mean that jj-cli would completely bypass the gRPC layer, and simply write:

```rust
let servicer = jj_lib::api::Servicer::new(Some(command.workspace_loader()?));
let response = servicer.list_workspaces(ListWorkspacesRequest {});
// Print based on the response.
```

It would also mean that we could implement:

  • jj api grpc (grpc sockets)
  • jj api grpc --web (grpc over web)
  • jj api web (websocket connection)
  • jj api json (stdin / stdout json communication)
  • jj api json --pipe=/path/to/pipe

My goal was to ensure that we didn't lock ourselves into gRPC here, so I made Servicer that intermediary layer, with the intention that we can easily add other layers on top of it (e.g. your jj web suggestion).
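As a rough illustration of that layering, a hypothetical jj api json front-end could be little more than a loop around the same Servicer. This sketch assumes the request/response types can be serialized with serde (which the prost-generated types don't do out of the box) and only handles one request type; a real transport would dispatch on a method name:

```rust
use std::io::{self, BufRead, Write};

// Hypothetical: one JSON request per line on stdin, one JSON response per line on stdout.
fn serve_json_over_stdio(servicer: &Servicer) -> io::Result<()> {
    let stdin = io::stdin();
    let mut stdout = io::stdout();
    for line in stdin.lock().lines() {
        let request: ListWorkspacesRequest = serde_json::from_str(&line?)?;
        let response = servicer.list_workspaces(request);
        writeln!(stdout, "{}", serde_json::to_string(&response)?)?;
    }
    Ok(())
}
```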

@gulbanana (Collaborator) commented:

Would this mean that each CLI command becomes a method on Servicer, implementing the semantics and options of the former command? That’s a very high level interface. It would imply that alternate clients also each need their own set of Servicer methods with the appropriate semantics. For example, jj web would want to be able to control when snapshots occur, rather than having them implicitly take place at the start of commands as the CLI does.

@matts1 (Collaborator, Author) commented May 1, 2024

No, it would mean that each CLI command can be constructed out of some combination of servicer methods. For example, jj commit might involve:

```rust
fn commit_command(args: Args, servicer: Servicer) {
  let commit = servicer.get_commit(GetCommitRequest {
      commit: CommitRef { revset: "@" },
  });
  let current_commit = CommitRef { change_id: commit.change_id };

  let description = args.description.unwrap_or_else(|| editor(commit.description));
  servicer.update_commit(CommitUpdateRequest {
    commit: current_commit,
    desc: description,
  });
  let new_commit = servicer.create_commit(CreateCommitRequest {
    parents: vec![current_commit],
  });
  servicer.edit_commit(EditCommitRequest { commit: CommitRef { id: new_commit.id } })
}
```

@gulbanana (Collaborator) commented May 1, 2024

That seems workable, although figuring out exactly where to draw the API line will be a long process. I'm a bit concerned that it means each change to jj will have duplicated work - updating the internal data model, and updating the way it's exposed in the API.

Servicer could also become rather vast, since it'd be a flattened Facade (in the design patterns sense) for the entire codebase.

@matts1 (Collaborator, Author) commented May 1, 2024

> figuring out exactly where to draw the API line will be a long and complex process

+1 to that. It seems like a massive pain in the ass that will inevitably involve some very long discussions, but I think it's one that needs to be done.

> I'm a bit concerned that it means each change to jj will have duplicated work.

I do agree here, but I think that's inherent in any mechanism through which we expose jj internals to the outside world. The objective of jj-lib structures is to be performant, while the objective of any structures you expose is to be:

  • Easily serialized and deserialized
  • Able to support field masking (e.g. I want to be able to get just a change ID, or a whole change, at will)
  • Stable
    • If we want to change the type of something internally to optimize performance, that shouldn't result in a change to the API
    • If we want to change the type of something externally, proto supports reserved fields, so you can go through the deprecation process:
      1. Update the proto to add the new field, fill in both fields, and mark the old field as deprecated
      2. Give users some time to migrate
      3. Delete the old field and mark it as reserved. When users depending on the old proto deserialize a message, they will just find the field empty, so they can hopefully fail gracefully.

> Servicer could also become rather vast

Correct. I think it's the right approach, though. I don't think there's anything inherently wrong with it being vast.

@noahmayr (Collaborator) commented May 1, 2024

Should we create a discussion/issue separate from #3219 to collect the needs of both the cli and gg (and potentially other tooling being developed right now like vim/vscode extensions) so we know more clearly what APIs will need to be exposed?

Afaict #3219 is more concerned with an API command existing, not with which APIs the servicer facade will provide over all its potential channels (directly calling it in the library, gRPC, etc.)

@joyously commented May 1, 2024

> I'm a bit concerned that it means each change to jj will have duplicated work - updating the internal data model, and updating the way it's exposed in the API.

This sounds wrong to me. None of the internal data model should be exposed; otherwise it's not internal. Only some of the data would be transferred back and forth, as in REST or GraphQL.

@martinvonz (Owner) commented May 1, 2024

> Should we create a discussion/issue separate from #3219 to collect the needs of both the cli and gg (and potentially other tooling being developed right now like vim/vscode extensions) so we know more clearly what APIs will need to be exposed?

Sounds like a good idea to me. I assume it's not all of the current Rust API, but it's unclear to me which parts of it would be available in the RPC API. I suppose everything used by commands will need to be available, but probably not everything in cli_util.rs, because we should hide more of that in the library crate. OTOH, maybe those APIs should still be available outside of the crate even if there are higher-level APIs available?

@matts1, if you agree with creating a separate issue (or design doc, perhaps) for documenting the API requirements, what do you think about picking an arbitrary command and showing what the rewritten version of that would look like? For example, what would jj parallelize look like after rewriting to use the RPC API?

@gulbanana (Collaborator) commented:

I think that would be a great way to learn what shape the design would have to take.

@PhilipMetzger (Collaborator) commented:

I am happy to see that you're so eager to see it implemented, but I would need at least two design docs before even starting this huge project, as defining the jj-lib API in terms of jj api is already a huge change, and then there's jj api itself. See my concerns in the issue here: #3034 (comment). We also have nothing to gain from the promised stability, as we're not nearing a 1.0 yet.

> It would also mean that we could implement:
>
>   • jj api grpc (grpc sockets)
>   • jj api grpc --web (grpc over web)
>   • jj api web (websocket connection)
>   • jj api json (stdin / stdout json communication)
>   • jj api json --pipe=/path/to/pipe

Could we please not go in that direction? Imo jj api should only start a server, not define how it behaves, as this is once again duplicating work which could be covered by the templating language (#3262).

@matts1 (Collaborator, Author) commented May 2, 2024

> I would need at least two design docs before even starting this huge project.

I agree. I implemented this to see if it was viable, try things out, see what did and didn't work, and get some feedback.

> As defining the jj-lib API in terms of jj api is already a huge change, and then there's jj api itself. See my concerns in the issue here: #3034 (comment). We also have nothing to gain from the promised stability, as we're not nearing a 1.0 yet.

You're right that there's not a large benefit to this right now. I think the main reason to do it now is simply that it's going to get harder as we add more things to jj.

> Imo jj api should only start a server, not define how it behaves.

It feels to me like we might have a different understanding of "start a server". I agree that all jj api should do is to start a server. However, the way I see it, different clients might want to communicate with the server in different ways. For example:

  • One client might expect to communicate over gRPC
  • Another client might be working from javascript, so would prefer gRPC-web
  • Another client might not have proto/grpc support, and so would prefer to communicate over HTTP with REST/json
  • Another client might not want to have to deal with the network, because that introduces a new class of errors, so maybe they want to communicate over a linux pipe instead

How would you deal with all these situations? You say that all it should do is "bring up a server", but what kind of server? If it's gRPC, that would certainly meet my needs, but why specifically gRPC? Why not an HTTP server which communicates over JSON, for example? Whether requests are represented as JSON, serialized proto, textproto, or anything else is irrelevant to the actual API.

If I were to write let response = grpcClient.UpdateCommit(UpdateCommitRequest{change_id: "abc"}), why is that any different, fundamentally, from sending that same serialized proto over a named pipe? Or sending a JSON-encoded variant of that proto over a named pipe? And why should we disallow it?

@matts1 (Collaborator, Author) commented May 14, 2024

I've started work on the design doc. For now, it just has a bunch of design questions we need answered before starting on the project, and my personal thoughts on how they stack up to each other.

Any feedback would be appreciated. I expect there are things I've missed in the comparison, and maybe there are more options I haven't considered.

@PhilipMetzger (Collaborator) commented:

> I've started work on the design doc. For now, it just has a bunch of design questions we need answered before starting on the project, and my personal thoughts on how they stack up to each other.
>
> Any feedback would be appreciated.

I added a bunch of questions and suggestions, to provide some feedback.
