Experiment with scala native #1302

Closed
djspiewak opened this issue Oct 12, 2020 · 33 comments · Fixed by #3057 or #3138

Comments

@djspiewak
Member

It would be really interesting to see what we can do on this front, now that Scala Native is on 2.12. We would need Cats to publish for it, though. This is a rather large and sprawling project, tbh, and I'm not sure if there's even a definitive payoff at the end, but it will be cool to try anyway!

@djspiewak
Member Author

To note, the only async-anything support within scala-native comes via libuv and https://github.com/scala-native/scala-native-loop. We could technically do something like this, but it's an open question as to whether or not it would be worthwhile.

@benhutchison
Member

🙌 If you come up with any experiments, I'd love to test them out

@djspiewak
Member Author

This is definitely one of those things that will be a little interest-driven. 😃 Also if anyone wants to dive in and start playing with getting it working, the Cats Effect side of it would actually be very straightforward (for the most part, you just need to implement an ExecutionContext that works), and I would love to help out anyone interested in trying their hand at it.
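As a rough illustration of "an ExecutionContext that works" on a platform without threads, here is a minimal sketch using only the Scala standard library. The names `QueueExecutionContext` and `runUntilEmpty` are hypothetical and not part of Cats Effect or Scala Native; the sketch assumes the caller drives the queue from the single available thread.

```scala
import scala.collection.mutable
import scala.concurrent.ExecutionContext
import scala.util.control.NonFatal

// Hypothetical single-threaded ExecutionContext: `execute` only enqueues,
// and tasks run when the owner calls `runUntilEmpty()` on the same thread.
final class QueueExecutionContext extends ExecutionContext {
  private val tasks = mutable.Queue.empty[Runnable]

  def execute(runnable: Runnable): Unit = tasks.enqueue(runnable)

  def reportFailure(cause: Throwable): Unit = cause.printStackTrace()

  // Drain the queue, including tasks that are enqueued while draining.
  def runUntilEmpty(): Unit =
    while (tasks.nonEmpty) {
      val task = tasks.dequeue()
      try task.run()
      catch { case NonFatal(e) => reportFailure(e) }
    }
}

object QueueExecutionContextDemo {
  // Returns the order in which the tasks actually ran.
  def run(): List[String] = {
    val ec = new QueueExecutionContext
    val log = mutable.ListBuffer.empty[String]
    ec.execute(() => {
      log += "first"
      // A task may submit more work; it runs after everything already queued.
      ec.execute(() => { log += "third"; () })
    })
    ec.execute(() => { log += "second"; () })
    ec.runUntilEmpty()
    log.toList
  }
}
```

Tasks run strictly in submission order, so nothing here is preemptive: a task that never returns starves everything queued behind it.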

@benhutchison
Member

benhutchison commented Feb 12, 2021

So I did some investigating and concluded that integrating with Scala Native & libuv may be currently possible, but it's bleeding edge and exceeds my personal bravery level.

I think it might be prudent to wait for stabilization of:

@JD557

JD557 commented Feb 14, 2021

Would it make sense to just publish the kernel for Scala Native for now?

While it might make sense to wait before porting cats IO, having the typeclasses available would allow:

  • Library authors to still cross-publish code with CE3 to scala-native
  • Application authors to plug in a simpler IO monad and use only a subset of operations (e.g. it should be simple to write an IO monad that only implements Sync and the required typeclasses).
  • ZIO supposedly (I think there are some caveats) already supports scala-native 0.4.0, so maybe one could use that to run CE3 libraries in scala-native while the Cats IO is not ready

WDYT?
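To make the "simpler IO monad" idea concrete, here is a minimal sketch of a synchronous-only effect type. `SimpleIO` is a hypothetical name; it is not the Cats Effect `IO` and does not implement the real `Sync` typeclass (that would require the kernel module as a dependency). It is also not stack-safe, which a real implementation would need to address.

```scala
import scala.util.control.NonFatal

// Hypothetical minimal effect type: a suspended, synchronous computation.
// Nothing runs until `unsafeRun()` is called.
final class SimpleIO[A](val unsafeRun: () => A) {
  def map[B](f: A => B): SimpleIO[B] =
    new SimpleIO[B](() => f(unsafeRun()))

  def flatMap[B](f: A => SimpleIO[B]): SimpleIO[B] =
    new SimpleIO[B](() => f(unsafeRun()).unsafeRun())

  // Surface non-fatal exceptions as values instead of throwing.
  def attempt: SimpleIO[Either[Throwable, A]] =
    new SimpleIO[Either[Throwable, A]](() =>
      try Right(unsafeRun())
      catch { case NonFatal(e) => Left(e) }
    )
}

object SimpleIO {
  def pure[A](a: A): SimpleIO[A] = new SimpleIO[A](() => a)
  // Suspend a side effect; the by-name argument is evaluated on every run.
  def delay[A](thunk: => A): SimpleIO[A] = new SimpleIO[A](() => thunk)
  def raiseError[A](e: Throwable): SimpleIO[A] = new SimpleIO[A](() => throw e)
}

object SimpleIODemo {
  val program: SimpleIO[Int] = for {
    a <- SimpleIO.delay(20)
    b <- SimpleIO.delay(a + 22)
  } yield b

  val failing: SimpleIO[Either[Throwable, Int]] =
    SimpleIO.raiseError[Int](new RuntimeException("boom")).attempt
}
```

With the kernel published for Native, a type like this could be given a `Sync` instance and plugged into libraries written against the typeclasses, without any async machinery.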

@benhutchison
Member

benhutchison commented Feb 14, 2021 via email

@ScalaWilliam

Would it make sense to just publish the kernel for Scala Native for now?

While it might make sense to wait before porting cats IO, having the typeclasses available would allow:

  • Library authors to still cross-publish code with CE3 to scala-native
  • Application authors to plug in a simpler IO monad and use only a subset of operations (e.g. it should be simple to write an IO monad that only implements Sync and the required typeclasses).
  • ZIO supposedly (I think there are some caveats) already supports scala-native 0.4.0, so maybe one could use that to run CE3 libraries in scala-native while the Cats IO is not ready

WDYT?

Would be very welcome 👍

@vasilmkd
Member

I like the proposal.

@djspiewak
Member Author

I like the proposal as well. Could also conceivably do std as well as kernel. My main concern is that Scala Native currently doesn't support Scala 3, so the build matrix gets to be… very annoying. But, regardless, PRs welcome.

@armanbilge
Member

I got that cross-build fever 😅 #2168

@fancellu

Scala native is now meant to work on Windows

scala-native/scala-native#2370

@arashi01

I like the proposal as well. Could also conceivably do std as well as kernel. My main concern is that Scala Native currently doesn't support Scala 3, so the build matrix gets to be… very annoying. But, regardless, PRs welcome.

That has now been remedied also.

@armanbilge
Member

armanbilge commented Jan 30, 2022

@arashi01 Yes, this has been on my mind :) we still need the Typelevel ecosystem to be built for Native/3 which is currently blocked at typelevel/scalacheck#868. The problem is supporting Native requires jumping to Scala 3.1, see discussion about that in typelevel/cats#4016. But, hopefully once scala/scala3#14156 arrives in 3.1.2 this will no longer be a problem.

A longer-term problem for Cats Effect is that our test suite uses specs2, which is not currently crossed for Native on any Scala version. So either somebody will have to take that on, or we'll have to port all our tests to e.g. munit (of which the hardest part is convincing @djspiewak of this at all :).

Update: specs2 now runs on Native :)

@matthughes

@armanbilge 3.1.2 fixes the referenced problem. It seems like scalacheck has been updated as well. Can you summarize the current ecosystem blockers?

@armanbilge
Member

armanbilge commented Jun 7, 2022

Ah yes, update time :)

  1. Projects are now bumping straight to Scala 3.1 and dropping 3.0, so 3.1.2 is no longer special.
  2. We still need the following dependencies, but they should be straightforward to cross.

So there are no longer any major blockers :)


However, I don't think that Cats Effect on Native 0.4 will be very interesting or useful. Since Scala Native 0.4 does not support multi-threading, we can't even implement an async Scheduler for example to support IO.sleeps 😕

Furthermore, to build interesting applications, we will need asynchronous I/O (TCP, TLS, UDP, file systems). So we need to think ahead to how fs2-io will cross-build. The JS cross-build of fs2-io shows that anything is possible after that (http4s ember, skunk, etc.).

As suggested in #1302 (comment), one option could be to implement a libuv-based runtime via scala-native-loop. I'm 👎 on this happening in Cats Effect for various reasons.

  • If Cats Effect Native bakes in a runtime, then all downstreams (read: fs2-io, http4s) will be forced to build around that runtime.

  • libuv only makes sense for a single compute thread. So when Native 0.5 arrives with multithreading, any libuv integrations won't be able to take advantage of that.

  • Node.js is libuv based, but Deno (a modern answer to Node.js) chose to use Rust Tokio instead (a modern answer to libuv). The Cats Effect JVM runtime itself is inspired by Tokio.

    • Supports both single-threaded (Native 0.4) and multi-threaded (Native 0.5) compute pools. So integration efforts would not go to waste.
    • Far more developer-friendly options for TLS than anything I've seen for libuv. So porting Ember and Skunk would actually be viable.

    However, this would be blocked by:

So my current proposal is that Cats Effect Native 0.4 ships with a "dummy" runtime. By this I mean IO.blocking delegates to IO.delay, and IO.sleep just continually re-schedules a fiber until sufficient time has passed. Yes, this is an utter joke of a runtime 🙃 but it should be sufficient for testing.
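The "dummy" runtime described above can be sketched with the standard library alone. This is only an illustration under stated assumptions: `DummyRuntime`, `submit`, `blocking`, `sleep`, and `run` are hypothetical names, not the Cats Effect API, and fibers are reduced to plain callbacks on a single-threaded queue.

```scala
import scala.collection.mutable
import scala.concurrent.duration._

// Hypothetical single-threaded runtime: every piece of work is a queued callback.
final class DummyRuntime {
  private val queue = mutable.Queue.empty[() => Unit]

  def submit(task: () => Unit): Unit = queue.enqueue(task)

  // "Blocking" work is queued like any other task: there is no second
  // thread to shift it to, so it simply runs on the one compute thread.
  def blocking(task: () => Unit): Unit = submit(task)

  // Sleep by re-scheduling: each check either runs the continuation (deadline
  // passed) or puts itself back on the queue. This is a cooperative busy-wait.
  def sleep(delay: FiniteDuration)(continuation: () => Unit): Unit = {
    val deadline = System.nanoTime() + delay.toNanos
    def check(): Unit =
      if (System.nanoTime() - deadline >= 0) continuation()
      else submit(() => check())
    submit(() => check())
  }

  // Run queued work until nothing is left.
  def run(): Unit =
    while (queue.nonEmpty) queue.dequeue()()
}

object DummyRuntimeDemo {
  def run(): List[String] = {
    val rt = new DummyRuntime
    val log = mutable.ListBuffer.empty[String]
    rt.sleep(50.millis)(() => { log += "woke"; () })
    rt.blocking(() => { log += "blocking work"; () })
    rt.run()
    log.toList
  }
}
```

Because the sleeping callback yields back to the queue on every check, other queued work still makes progress while it waits, but the thread spins at full CPU until the deadline passes.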

The point of this would be to allow downstreams to also start exploratory cross-building for Native. That way we can figure out what problems lie ahead and be more prepared for when Native 0.5 ships with multi-threading support and things get more interesting.

Meanwhile, nothing stops a Highly-Motivated Developer™ from building a libuv runtime for Cats Effect in an external project. As pointed out in #1302 (comment), all that's needed is an ExecutionContext and Scheduler, which is already available in scala-native-loop.

  • Furthermore, if fs2-io cross-publishes its Socket and Files interfaces then said Highly-Motivated Developer™ could implement those as well.
  • In fact, if fs2-io cross-publishes its Files interface, then http4s can cross-publish its core, server, and client modules. Then a Highly Motivated Developer™ could implement async cURL client and NGINX Unit server backends.
  • All of the above also applies to a Tokio-based runtime, as an external project.

So basically the goal is to unblock downstream explorative work. It's better to do this outside of Cats Effect to take advantage of:

@JD557

JD557 commented Jun 7, 2022

So my current proposal is that Cats Effect Native 0.4 ships with a "dummy" runtime.

Just to bump my previous point that (if it doesn't make the build matrix too weird), it might also make sense to just not publish any native runtime for now (publishing just std and kernel). That would allow for libraries to start being ported and some applications could still be written, but with custom runtimes.

Maybe there could be a dummy runtime like the one proposed, but without being "the official cats effect runtime" (as in, living in a separate project clearly marked as experimental).

Mostly because I'm not sure what's preferable:

  1. Have a dummy runtime like the one proposed (where some operations will clearly have huge performance problems and maybe unexpected behavior)
  2. A dummy runtime that does not implement all typeclasses (where a lot of applications would not compile, but you wouldn't get any surprises).

@keynmol

keynmol commented Jun 7, 2022

FWIW, I agree with @JD557 and hope that CE publishes all the modules that can be published without having to add any dummy implementations.

CE's typeclasses sit at the top of the hierarchy for a lot of libraries that don't exercise runtime in the library code, so it would be best to have all the auxiliary modules published and ready to go, so that when (not if, there's a lot of Highly Motivated Individuals around) the runtime is ready, it can sweep through the ecosystem, either as a separate project or as one built into CE itself.

The upgrades train took a long time to get even to this current point, so if something can be done to shorten it next time round - I'd support that and will be happy to help on either SN or CE side.

@armanbilge
Member

armanbilge commented Jun 7, 2022

Just to bump my previous point that (if it doesn't make the build matrix too weird), it might also make sense to just not publish any native runtime for now (publishing just std and kernel). That would allow for libraries to start being ported and some applications could still be written, but with custom runtimes.

Thanks for raising this! It's important we cross-publish Cats Effect core, because fs2 depends on it, particularly SyncIO. Also the IO runloop implements highly non-trivial logic that would be a burden to implement elsewhere.

It sucks to delegate IO.blocking to IO.delay but until Scala Native supports multi-threading this is literally the only viable implementation. Scala.js does the same thing. Without it, cross-building downstreams will be very awkward.

You are right though, that the ExecutionContext and Scheduler of this dummy runtime I propose should be restricted to the testkit. Without them, it becomes much more difficult to test these things, starting from Cats Effect itself.

so it would be best to have all the auxiliary modules published and ready to go, so that when (not if, there's a lot of Highly Motivated Individuals around) the runtime is ready, it can sweep through the ecosystem

I also desire this "sweeping" phenomenon :) and the best way to achieve that IMHO is to start proactively cross-building and testing downstream libraries, and we cannot do this without IO and a (dummy) runtime.

@mpilquist
Member

Amazing status update. :)

Note fs2-core relies on SyncIO (for the compiler of Pure, Id, Fallible streams).

@armanbilge
Member

armanbilge commented Jun 7, 2022

so that when (not if, there's a lot of Highly Motivated Individuals around) the runtime is ready

Just to clarify, I'm skeptical that we will ever have an official Cats Effect runtime for Native 0.4.

  • Native 0.4 is single-threaded, so we cannot natively implement this runtime.
  • We can integrate with libuv or Tokio, but this requires a coordinated effort between fs2-io, http4s, ip4s, etc. etc.
  • I don't think this should be libuv, due to its poor support for multi-threaded compute and TLS.
  • Tokio has better long-term viability, but requires more upfront work.

The story changes dramatically when Native 0.5 arrives with multi-threaded support.

  • the Work Stealing Threadpool could become a viable runtime
  • even better, we could start sharing sources across JVM/Native in fs2-io and friends (I even added support to sbt-crossproject specifically for this). This is a big win for maintenance burden, which is why I think we should start investigating how well the existing sources cross-compile as soon as possible, even if it's against a dummy runtime.

For the record, I am very supportive of a libuv runtime, as an external project :)

Also for the record, I'm not the BDFL or even a maintainer of many projects in the stack, so these are all my personal opinions, I certainly don't have any final say on the matter. My bias is towards minimizing build-complexity and maintenance burden over the long-term, but I admittedly have limited experience with this.

that the ExecutionContext and Scheduler of this dummy runtime I propose should be restricted to the testkit.

Actually, I might take this back. There are still interesting things you could do with the dummy runtime. For example, it would be sufficient to cross the fs2 hexdump4s application. It would essentially have the performance of a sync, blocking I/O app. But at least you get to use fs2! :)

@vasilmkd
Member

vasilmkd commented Jun 7, 2022

Out of curiosity, does anyone know what would be a blocker to implementing java.lang.Thread on top of pthreads?

@armanbilge
Member

That's exactly what's been done in WojciechMazur/scala-native#23 :) there's a lot of good stuff in that branch, worth checking out!

The real blocker for multi-threading is Garbage Collection. IIUC the Native 0.4 runtime will crash if managed memory is accessed/modified by an external thread, but I haven't tried this at all.

@armanbilge
Member

I had a very informative chat with @lolgab on the Scala Native Discord (thank you so much 🙏 ) who was supportive of my proposal here.

If it's swappable to another one implemented outside of the core somehow, having a busy-waiting runtime (if I understood correctly what you did) is probably the best thing to do.

https://discord.com/channels/632150470000902164/635668881951686686/985223897706266745

We agree that ideally an event loop runtime would eventually live in Scala Native core itself.

@armanbilge
Member

The upgrades train took a long time to get even to this current point, so if something can be done to shorten it next time round - I'd support that and will be happy to help on either SN or CE side.

@keynmol thank you so much for volunteering! To support the Cats Effect effort, please port the following Java library APIs to Scala Native. Scala.js has them, so there's no reason Scala Native can't have them either.

  • java.lang.ThreadLocal (including withInitial, which Scala.js is missing)
  • java.util.concurrent.ConcurrentMap
  • java.util.concurrent.ConcurrentHashMap
  • java.util.concurrent.ConcurrentSkipListSet
  • java.util.concurrent.ThreadLocalRandom
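For reference, the kind of calls a port of these classes would need to support looks roughly like the following. This is only an illustrative sketch of standard JDK usage from Scala; the object `JavaLibUsage` and its fields are hypothetical, and it does not show how Cats Effect itself uses these classes.

```scala
import java.util.concurrent.{ConcurrentHashMap, ConcurrentMap, ConcurrentSkipListSet, ThreadLocalRandom}
import java.util.function.Supplier

object JavaLibUsage {
  // ThreadLocal.withInitial takes a Supplier that lazily creates the per-thread value.
  private val buffer: ThreadLocal[StringBuilder] =
    ThreadLocal.withInitial[StringBuilder](new Supplier[StringBuilder] {
      def get(): StringBuilder = new StringBuilder
    })

  private val cache: ConcurrentMap[String, String] =
    new ConcurrentHashMap[String, String]()

  private val seen = new ConcurrentSkipListSet[String]()

  def demo(): (String, String, String, Boolean) = {
    buffer.get().append("thread-local")
    cache.putIfAbsent("key", "first")
    cache.putIfAbsent("key", "second") // no effect: "key" is already mapped
    seen.add("b")
    seen.add("a")
    val n = ThreadLocalRandom.current().nextInt(10) // uniformly drawn from 0 until 10
    (buffer.get().toString, cache.get("key"), seen.first(), n >= 0 && n < 10)
  }
}
```

On a single-threaded target these classes do not need real synchronization, but they still need to exist with the same signatures so that shared sources compile unchanged.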

@armanbilge
Member

armanbilge commented Jun 18, 2022

I put up a PR against my own repo. Locally, the entire test suite is passing on Native :)

I experimented with two runtimes:

First I tried a sort of event-loopy ExecutionContext. It places scheduled tasks on a priority queue which it interleaves with a task queue, and uses Thread.sleep() when there is no work to be done.
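A minimal sketch of that event-loop shape, using only the standard library, might look like the following. It is not the implementation from the PR: `EventLoopContext`, `schedule`, and `loop` are hypothetical names, and details such as cancellation are left out.

```scala
import scala.collection.mutable
import scala.concurrent.ExecutionContext
import scala.concurrent.duration._
import scala.util.control.NonFatal

// Hypothetical single-threaded event loop: a FIFO task queue interleaved
// with a priority queue of timers, sleeping when there is nothing to run.
final class EventLoopContext extends ExecutionContext {
  private final case class Timer(fireAtNanos: Long, seq: Long, task: Runnable)

  // PriorityQueue is a max-heap, so the ordering is reversed to make the
  // earliest timer the head; `seq` breaks ties in scheduling order.
  private val timers = mutable.PriorityQueue.empty[Timer](
    Ordering.by[Timer, (Long, Long)](t => (t.fireAtNanos, t.seq)).reverse
  )
  private val tasks = mutable.Queue.empty[Runnable]
  private var nextSeq = 0L

  def execute(runnable: Runnable): Unit = tasks.enqueue(runnable)

  def reportFailure(cause: Throwable): Unit = cause.printStackTrace()

  def schedule(delay: FiniteDuration)(task: Runnable): Unit = {
    timers.enqueue(Timer(System.nanoTime() + delay.toNanos, nextSeq, task))
    nextSeq += 1
  }

  def loop(): Unit =
    while (tasks.nonEmpty || timers.nonEmpty) {
      // Move every expired timer onto the task queue.
      val now = System.nanoTime()
      while (timers.nonEmpty && timers.head.fireAtNanos <= now)
        tasks.enqueue(timers.dequeue().task)

      if (tasks.nonEmpty) {
        val task = tasks.dequeue()
        try task.run()
        catch { case NonFatal(e) => reportFailure(e) }
      } else if (timers.nonEmpty) {
        // No runnable work: sleep until the next timer is due.
        val waitNanos = timers.head.fireAtNanos - System.nanoTime()
        if (waitNanos > 0)
          Thread.sleep(waitNanos / 1000000L, (waitNanos % 1000000L).toInt)
      }
    }
}

object EventLoopDemo {
  def run(): List[String] = {
    val ec = new EventLoopContext
    val log = mutable.ListBuffer.empty[String]
    def record(s: String): Runnable = () => { log += s; () }
    ec.schedule(40.millis)(record("late timer"))
    ec.schedule(10.millis)(record("early timer"))
    ec.execute(record("task"))
    ec.loop()
    log.toList
  }
}
```

Unlike a busy-wait scheduler, this loop gives the CPU back while waiting, but it has to own the thread's main loop, which is where it conflicts with code that expects the global ExecutionContext.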

Unfortunately, this runtime could not be used for tests, because specs2 has a hard assumption that the Scala Native global ExecutionContext is being used, since it relies on the ability to call scalanative.runtime.loop().

So I also implemented the original "dummy" runtime I proposed, which implements a busy-wait Scheduler that works cooperatively with the global ExecutionContext. This worked for tests, and since it has the maximum compatibility across the Scala Native ecosystem I made this the default for the Native IOApp.

I'll try to put up a snapshot later. PRs against my branch are welcome btw.

@armanbilge
Member

armanbilge commented Jun 18, 2022

@armanbilge
Member

PR is up:

@armanbilge
Member

Based on my PR, a very small demo of async I/O: https://github.com/armanbilge/epollcat

@armanbilge
Member

Cross-linking to the FS2 ticket. I did some preliminary investigations to see what Scala Native is missing to support FS2 and wrote a summary.

@armanbilge
Member

armanbilge commented Aug 21, 2022

After extensive bootlegging, I've published a fully async, non-blocking http4s-curl client.

https://github.com/http4s/http4s-curl

@armanbilge armanbilge linked a pull request Sep 13, 2022 that will close this issue
@armanbilge
Member

Probably the final update on this issue in #3138 (comment).

tl;dr we have a viable ecosystem on Cats Effect Native, including http4s ember and skunk, and plenty more to follow. So I am ready to begin publishing this stuff :)

@armanbilge
Member
