Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Factorise bytecode and native toplevels #10124

Merged
20 commits merged into from Jan 26, 2021
Merged

Factorise bytecode and native toplevels #10124

20 commits merged into from Jan 26, 2021

Conversation

AltGr
Copy link
Member

@AltGr AltGr commented Jan 5, 2021

The native toplevel has evolved from an old fork of the bytecode one, and there remains a lot of duplicated code, often with fixes only on one side.

This is an attempt at resolving this by unifying the parts that can as much as possible. It follows from #10061 and #10078.

This works but is still in a pretty rough state (needs rebasing, the Toploop API is broken and needs to be restored, module names need to be adjusted to match their contents, some more cleanup needed...)

@Drup
Copy link
Contributor

Drup commented Jan 6, 2021

In principle, I'm strongly in favor, the toplevel code in general is in dire need of improvements. I can review it.

Since the diffs are going to be extremely hard to read, I encourage you to add/keep/merge the interfaces everywhere, to make the refactoring as "obviously correct" as much as possible.

@AltGr
Copy link
Member Author

AltGr commented Jan 6, 2021

@Drup thanks; this is why I try for now to do a pure reorganisation, without improvements (at least for the bytecode toplevel). Reproducing the current Toploop interface is a requirement before I unmark this as draft.

since the content didn't really match the name anymore; we need the name
`Toploop` to provide a backwards-compatible interface anyway.
@AltGr
Copy link
Member Author

AltGr commented Jan 7, 2021

Update: the module names now make more sense, and the Toploop API is preserved.

(except for the deprecated `directive_table`, which should be referred to
through accessors)
Always forget to re-run `make alldepend` and not just `make depend` :/
@AltGr AltGr marked this pull request as ready for review January 8, 2021 15:06
@ghost
Copy link

ghost commented Jan 11, 2021

Looking at byte/toploop.ml and native/toploop.ml, there seem to still be duplicated code that could be factored; for instance the use_xxx functions or run_script. WDYT?

@ghost
Copy link

ghost commented Jan 11, 2021

Another thought: I'm assuming that Topeval and Topcommon are not meant to be used by end users. We should add a disclaimer in the mli files to that intent. Or even not install their cmi files, although that might cause issues with downstream compilation/tooling.

@AltGr
Copy link
Member Author

AltGr commented Jan 11, 2021

Looking at byte/toploop.ml and native/toploop.ml, there seem to still be duplicated code that could be factored; for instance the use_xxx functions or run_script. WDYT?

Yes, there is still room for factorisation in that part (I'm assuming you mean {byte,native}/topeval.ml) ; even the larger and more involved execute_phrase should be factorised, but that will involve a more involved rewrite: my idea was to submit this superficial reorganisation first, and then get to what requires deeper code changes, to make reviewing easier.

Some functions like use_output are still copied verbatim, indeed; I'll check if a few more can be shared with minimal code changes. Some of them may be sandwiched between specific definitions, though, so in that case it will require splitting the module with back-and-forth between shared and specific module parts (ugh), or defining them as functors in TopCommon if their dependencies are reasonably contained (like I did for the printers). I'll have another look.

Another thought: I'm assuming that Topeval and Topcommon are not meant to be used by end users. We should add a disclaimer in the mli files to that intent. Or even not install their cmi files, although that might cause issues with downstream compilation/tooling.

Indeed! I'll add such disclaimers.

@ghost
Copy link

ghost commented Jan 11, 2021

@Drup I'm also looking at this issue as we have been working on this with @AltGr, but your review is definitely welcome.

In case that's helpful, I'm using manual patdiff commands to review code moved between files. For instance, to review code moved from toplevel/byte/topdirs.ml to toplevel/byte/topmain.ml I do:
patdiff <(git show `git merge-base HEAD origin/trunk`:toplevel/byte/topdirs.ml) toplevel/byte/topmain.ml
and I ignore the big red and green blocks.

@ghost
Copy link

ghost commented Jan 11, 2021

Yes, there is still room for factorisation in that part (I'm assuming you mean {byte,native}/topeval.ml) ; even the larger and more involved execute_phrase should be factorised, but that will involve a more involved rewrite: my idea was to submit this superficial reorganisation first, and then get to what requires deeper code changes, to make reviewing easier.

That seems reasonable.

Some functions like use_output are still copied verbatim, indeed; I'll check if a few more can be shared with minimal code changes. Some of them may be sandwiched between specific definitions, though, so in that case it will require splitting the module with back-and-forth between shared and specific module parts (ugh), or defining them as functors in TopCommon if their dependencies are reasonably contained (like I did for the printers). I'll have another look.

Not excited about splitting the module, it's likely to make the logic more difficult to follow. The functor approach looks more promising to me.

@AltGr
Copy link
Member Author

AltGr commented Jan 13, 2021

Addressed the two comments above.
AppVeyor error transient:

Error downloading remote file: One or more errors occurred.
Inner Exception: Remote server returned 429: too many requests

@ghost
Copy link

ghost commented Jan 13, 2021

One side effect of this PR: we are now reserving two more global names: Topcommon and Topeval. A bit annoying but nothing new here. I checked the dune-universe and found no clashes.

Looks all good to me otherwise, marking it approved. @Drup do you want to review this PR before I merge it?

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@AltGr
Copy link
Member Author

AltGr commented Jan 13, 2021

The parts your mentionned were indeed straightforward to split, I had overlooked them;
the eval_xxx functions fall in the case I mentionned above though, they are quite trivial but still duplicated. They could be handled in the same way as the printer functions are handled, and maybe together (functor in TopCommon, with TopEval applying it and includeing the result)

@Drup
Copy link
Contributor

Drup commented Jan 13, 2021

Looks all good to me otherwise, marking it approved. @Drup do you want to review this PR before I merge it?

I do, give me the end of the week.

@AltGr
Copy link
Member Author

AltGr commented Jan 15, 2021

There are a few fixes needed in the Makefile, to properly install the artifacts for ocamltoplevel.cmxa. A question here: at the moment, the native toplevel is not compiled by default, and installed optionally if compiled. Does it sound reasonable to switch it to compiled by default ? (Otherwise, we'll probably have to ship it as a separate library in opam, and that sounds like a lot of trouble, partially defeating the purpose).

The META files bundled with findlib also need an update: see ocaml/ocamlfind#10

With these precautions, I managed to compile and run topfind and utop, load libraries etc.

@gasche
Copy link
Member

gasche commented Jan 15, 2021

The question how whether the native compiler is compiled/installed by default is a maintenance question rather than a purely technical question. I think it should be discussed in a separate issue or on caml-devel (it has been in the past), and it is best to keep it separate from the current PR.

(The two questions are related, as in: if the factorization work goes well, it makes the native toplevel much easier to maintain, and the argument against not-enabling it weaker. But still I think we want to proceed in two distinct steps here.)

@AltGr
Copy link
Member Author

AltGr commented Jan 18, 2021

@gasche that makes perfect sense, I'll amend this PR with a fix for installation that keeps it optional.

@AltGr
Copy link
Member Author

AltGr commented Jan 18, 2021

So with these latest changes and the patch to OCamlfind, utop compiles and runs with the two (mode byte) removed (and using explicitely src/top/utop.exe)

Copy link
Contributor

@Drup Drup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, it was the end of this week in the end ... :)

I just looked at the code, I think it's a great start. As @AltGr said, this is "just" a code reorganization (moving functions around, without modifying them), to lay the groundwork for more changes. I think it's pretty safe and we can merge it directly, there should be no changes in behaviour whatsoever.

The next thing to factorize is definitely the collection of eval functions ...

module MakePrinter
(O: Genprintval.OBJ)
(Printer: Genprintval.S with type t = O.t)
: PRINTER with type Printer.t = O.t
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the O parameter here ? It's filled by the Obj module in both cases ...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I just forwarded the argument to Genprintval.Make, but that could be simplified here.

@ghost ghost merged commit c2bbc6f into ocaml:trunk Jan 26, 2021
@AltGr
Copy link
Member Author

AltGr commented Jan 26, 2021

Ah one thing I just noticed is that the native toplevel used to advertise that it was native in the welcome text, which is no longer the case. That would be trivial to add back if needed ?

@gasche
Copy link
Member

gasche commented Jan 26, 2021

I think that would be nice, it's useful to be sure that we are in the expected environment. (For example I rely regularly on the fact that it shows the OCaml version string.)

@AltGr
Copy link
Member Author

AltGr commented Jan 27, 2021

Ok @gasche addressed this in #10175 (as well as @Drup 's remark); thanks

@Octachron Octachron mentioned this pull request Feb 1, 2021
ghost pushed a commit to AltGr/ocaml that referenced this pull request Feb 1, 2021
It was dropped in ocaml#10124.

It seemed cleaner and safer to add the string to the specific module
used rather than trivially rely on `Sys.backend_type`. The string is
left empty for the bytecode toplevel, to avoid a visible change from the
current version.
ghost pushed a commit that referenced this pull request Feb 1, 2021
- Toplevel factorisation: simplify MakePrinter functor
- Re-add a discrimination string to the native toplevel header

  It seemed cleaner and safer to add the string to the specific module
  used rather than trivially rely on `Sys.backend_type`. The string is
  left empty for the bytecode toplevel, to avoid a visible change from the
  current version.
smuenzel pushed a commit to smuenzel/ocaml that referenced this pull request Mar 30, 2021
- Toplevel factorisation: simplify MakePrinter functor
- Re-add a discrimination string to the native toplevel header

  It seemed cleaner and safer to add the string to the specific module
  used rather than trivially rely on `Sys.backend_type`. The string is
  left empty for the bytecode toplevel, to avoid a visible change from the
  current version.
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants