Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CRAN Release? #80

Open
eitsupi opened this issue Mar 18, 2023 · 91 comments
Open

CRAN Release? #80

eitsupi opened this issue Mar 18, 2023 · 91 comments
Labels
installation About installation or setup

Comments

@eitsupi
Copy link
Collaborator

eitsupi commented Mar 18, 2023

The Rust version of Debian testing will soon be 1.65, which may allow us to build rpolars tuned to build in the R-universe.
(Note that the macOS builder's Rust version may be older than Debian's. PRQL/prqlc-r#94)
https://tracker.debian.org/pkg/rustc
https://qa.debian.org/excuses.php?experimental=1&package=rustc

@sorhawell Could you potentially do a CRAN release?
Probably need to modify build scripts (like #29) and documentation (Of course I would like to contribute to that.).

@sorhawell
Copy link
Collaborator

Cool :)

For mac this issue will trigger a warning or error with CRAN check. I likely have to bundle some shared macOS tool or include as a requirement in DESCRIPTION.

Is it allowed on CRAN, not to release on all OS, e.g. skip mac in first round?

Regarding #29
Using CARGO_HOME is very useful to run CHECK in development without recompiling for 30 min. There is something about an environment variable called something like IS_CRAN right? I could disable CARGO_HOME for cran builds then.
I failed to rediscover the exact envvar name in docs.

@eitsupi
Copy link
Collaborator Author

eitsupi commented Mar 19, 2023

For mac this issue will trigger a warning or error with CRAN check. I likely have to bundle some shared macOS tool or include as a requirement in DESCRIPTION.

I'm sorry I don't understand the whole issue, but are you saying that due to the issue, rpolars can's update polars?
(Currently, the command check is succeeding on the main branch, right?)

There is something about an environment variable called something like IS_CRAN right?

Yes, this must be NOT_CRAN. (See also extendr/rextendr#233)
Note that this is an environment variable name initiated independently by the devtools package, not defined by CRAN.

@sorhawell
Copy link
Collaborator

sorhawell commented Mar 19, 2023

r-lib/rcmdchek does not support to filter errors. Many projects get the same warning on large binary size, which CRAN is not enforcing anyways.

Best advice I could find in issue track is to ignore any warnings in R CMD Check to avoid builds flagged as failed.

I found the practice sub-optimal, as I would not notice my own newly introduced warnings.
Currently I run this filter in workflow to catch all warnings and errors from R CMD Check, which were not already known.

Currently I ignore the binary size warning and the mac dependency warning/error.I don't know how to fix it yet. However, I have never heard of any mac installation except cran Check, where it was actually a problem.

@sorhawell
Copy link
Collaborator

@eitsupi Oh sorry I did not reply well to your question.

I'm` sorry I don't understand the whole issue, but are you saying that due to the issue, rpolars can's update polars?

We can update polars just fine. The Mac DLL warning just started at a certain commit when rust-polars introduced some new system calls. I have not solved the warning yet. I suspect that any Mac does support these system calls.

I could immediately try to submit a build to the cran windows test build system and see what we get back. I wonder if there are any limitations on build time.

@eitsupi
Copy link
Collaborator Author

eitsupi commented Mar 19, 2023

Thanks for the reply.
I don't have macOS and don't have enough knowledge to help solve that problem, but I think there is a possibility that someone else can contribute.

@sorhawell
Copy link
Collaborator

@eitsupi
I believe win + mac cran check warnings/fails have disappeared since around R4.3.0 I guess we should be able to make a submission to CRAN parallel to R-universe now :)

@eitsupi eitsupi added this to the 1st CRAN Release milestone Apr 27, 2023
@eitsupi
Copy link
Collaborator Author

eitsupi commented Apr 27, 2023

Grad to hear that!

Debian's Rust version is still 1.63, so we will not be able to do a CRAN release any time soon.

I hope to have the Debian Rust version up by summer and we can get the tasks for the CRAN release done by then.
I created a milestone https://github.com/pola-rs/r-polars/milestone/1.

@sorhawell
Copy link
Collaborator

Actually I think we can get away with using Required Rust version >=1.62

@sorhawell
Copy link
Collaborator

sorhawell commented Apr 27, 2023

ohh now I remember that namespace thing that required >= 1.64

@grantmcdermott
Copy link
Collaborator

Out of interest, I just peeked at the Debian rustc logs and 1.64 was accepted into the unstable branch earlier this week. See News here: https://packages.qa.debian.org/r/rustc.html

Last time it only took a week for 1.63 to migrate to the testing branch (which is what CRAN uses) after acceptance on unstable. So hopefully not too long now...

@eitsupi
Copy link
Collaborator Author

eitsupi commented Jun 23, 2023

Last time it only took a week for 1.63 to migrate to the testing branch (which is what CRAN uses) after acceptance on unstable. So hopefully not too long now...

Debian Rust has been completely stuck for the last six months, and it makes little sense to base the schedule on anything earlier than that.....

@eitsupi
Copy link
Collaborator Author

eitsupi commented Jun 27, 2023

Migration to Rust 1.65 seems to be going well as far as the progress status is seen.
At the earliest, we expect Debian testing to be updated to Rust 1.65 in 5 days.

@sorhawell Any thoughts on submitting to CRAN?

One major issue is that the main branch assumes nightly toolchains and cannot be submitted to CRAN, so I think we need to consider whether to continue to enable simd in the main branch.
-> Resolved 🎉

@sorhawell

This comment was marked as resolved.

@eitsupi

This comment was marked as resolved.

@eitsupi

This comment was marked as off-topic.

@eitsupi

This comment was marked as resolved.

@eitsupi
Copy link
Collaborator Author

eitsupi commented Jul 3, 2023

FYI, I sent the latest version to R-hub builder and gets the following notes.

* checking CRAN incoming feasibility ... [15s] NOTE
Maintainer: 'Soren Welling <sorhawell@gmail.com>'

New submission

Version contains large components (0.6.1.9000)

Possibly misspelled words in DESCRIPTION:
  Polars (2:8)
  polars (9:37)

Found the following (possibly) invalid URLs:
  URL: https://pola-rs.github.io/polars/py-polars/html/reference/expressions (moved to https://pola-rs.github.io/polars/py-polars/html/reference/expressions/)
    From: man/Expr_class.Rd
    Status: 200
    Message: OK

The Title field starts with the package name.
The Title field should be in title case. Current version is:
'Polars ported to R'
In title case that is:
'Polars Ported to R'

The Description field should not start with the package name,
  'This package' or similar.

@eitsupi
Copy link
Collaborator Author

eitsupi commented Jul 3, 2023

Debian testing's rustc is now 1.66!

@sorhawell I am going to work on the Makevars modifications and document updates, is it possible for you to make a submission to CRAN within the next few days?
https://github.com/pola-rs/r-polars/milestone/1

@sorhawell
Copy link
Collaborator

Debian testing's rustc is now 1.66!

@sorhawell I am going to work on the Makevars modifications and document updates, is it possible for you to make a submission to CRAN within the next few days? https://github.com/pola-rs/r-polars/milestone/1

Does Testing mean CRAN supports it now? Or does Testing mean in a few days 1.66 will be stable on Debian?

I can do a submission tomorrow e.g. ? :)

@eitsupi
Copy link
Collaborator Author

eitsupi commented Jul 3, 2023

Does Testing mean CRAN supports it now? Or does Testing mean in a few days 1.66 will be stable on Debian?

Debian "testing" is one of Debian's release channels for rolling releases.
The Debian used on CRAN is Debian testing.
https://cran.r-project.org/web/checks/check_flavors.html

I can do a submission tomorrow e.g. ? :)

Wonderful!
I will send some PRs to resolve the issues added to the milestone, I hope so you can consider submitting them to CRAN once they are resolved.

@eitsupi

This comment was marked as outdated.

@eitsupi
Copy link
Collaborator Author

eitsupi commented Jul 4, 2023

I am assuming that once #304 is merged we can send the package to CRAN....

It might be better to resolve #230 before releasing, but this seemed like a pretty big change and one that would be difficult for me to fix right away.

@sorhawell
Copy link
Collaborator

@eitsupi I was kidnapped into a meeting+follow 6 hours ago. What should I start on now towards cran release?

@sorhawell
Copy link
Collaborator

sorhawell commented Aug 23, 2023

It seems from prsqlr that a tarball size of 12MB is unacceptably above a CRAN limit 10MB. We are allegedly at ~30MB? We cannot slim down to 10MB I think.

@etiennebacher Was there a line saying "in total isolation" in new rules? I think I saw "no downloading of binaries."

@sorhawell
Copy link
Collaborator

such as polars-sql. Do you think those have been completely removed?

it seems rust-polars features ipc, parquet, csv and json trigger polars-sql dependency. I don't know if we can cherry-pick sub features within rust-polars. I tried naively to replace e.g. "parquet" with the sub features ["polars-io", "polars-core/parquet", "polars-lazy/parquet", "polars-io/parquet", "polars-sql/parquet"], but the compiler said I could not do that, I think.

@mrwunderbar666
Copy link

I suggested using zenodo, after reading the CRAN submission policies in detail. Here are the relevant sections:

If the sources are too large, it is acceptable to download them as part of installation, but do ensure that the download is of a fixed version rather than the latest. Only as a last resort and with the agreement of the CRAN team should a package download pre-compiled software.

Specifically for rust it says:

Packages wishing to use Rust code should either

include the code in the package, or
download a specific version from a secure and reliable site and check that the download is the expected code by some sort of checksum. The expected checksum needs to be embedded in the source package.

@etiennebacher
Copy link
Collaborator

@etiennebacher Was there a line saying "in total isolation" in new rules? I think I saw "no downloading of binaries."

You're right @sorhawell, I misread the guidelines for Rust. However, as @mrwunderbar666 said above, the CRAN policies say

Packages wishing to use Rust code should either include the code in the package, or download a specific version from a secure and reliable site

but AFAICT nowhere in CRAN policies do they mention Zenodo. So I think before anything we should ask them directly if this would be acceptable to them (and if they agree with this submitting strategy overall).

@eitsupi
Copy link
Collaborator Author

eitsupi commented Aug 24, 2023

I believe this is where Zenodo's name came up. (It is truly surprising that they equate Crates.io with GitHub.)

https://stat.ethz.ch/pipermail/r-package-devel/2023q3/009332.html

Yutani,

I'm not quite sure your reading fully matches the intent of the policy. Cargo.lock is not sufficient, it is expected that the package will provide all the sources, it is not expected to use cargo to resolve them from random (possibly inaccessible) places. So the package author is expected to either include the sources in the package or (if prohibitive due to extreme size) have a release tar ball available at a fixed, secure, reliable location (I was recommending Zenodo.org for that reason - GitHub is neither fixed nor reliable by definition).

Based on that, I'm not sure I fully understand the scope of your proposal for improvement. Carlo.lock is certainly the first step that the package author should take in creating the distribution tar ball so you can fix the versions, but it is not sufficient as the next step involves collecting the related sources. We don't want R users to be involved in that can of worms (especially since the lock file itself provides no guarantees of accessibility of the components and we don't want to have to manually inspect it), the package should be ready to be used which is why it has to do that step first. Does that explain the intent better? (In general, the downloading at install time is actually a problem, because it's not uncommon to use R in environments that have no Internet access, but the download is a concession for extreme cases where the tar balls may be too big to make it part of the package, but it's yet another can of worms...).

Cheers,
Simon

@sorhawell
Copy link
Collaborator

sorhawell commented Sep 1, 2023

@mrwunderbar666 I'm sorry to say when bump r-polars dependency to rust-polars to 0.32.1 the minimal required version of rustc is now 1.70 for without SIMD and rust nightly-2023-07-27 for with. CRAN only supports 1.65 or 1.66 or something like that.

I think we have hit another hard wall. rust-polars have made no promise of only using the about 2 years older rustc versions released via debian as CRAN uses.

Thank you for looking into alternatives, but I think this is show stopper for now.

@eitsupi
Copy link
Collaborator Author

eitsupi commented Sep 2, 2023

rust-polars have made no promise of only using the about 2 years older rustc versions released via debian as CRAN uses.

To Debian's honor: Rust 1.66 was released in December 2022, which is still less than a year old.
https://github.com/rust-lang/rust/blob/master/RELEASES.md#version-1660-2022-12-15

@mrwunderbar666
Copy link

mrwunderbar666 commented Sep 6, 2023

Thanks for looking into this, I haven't considered the version of rustc :/

I'll be looking around for other ways of getting polars to CRAN, but I guess for now this should not be a priority for package development. As a Linux user, I could install the package from r-universe without problems, so it would suffice for now

EDIT: should not be a priority

@eitsupi
Copy link
Collaborator Author

eitsupi commented Sep 6, 2023

@mrwunderbar666

I guess for now this should be a priority for package development.

I would like to agree with that, but I think we did our best (see #308 for example) and as a result it looks impossible.

CRAN says #80 (comment)

This was seen using 1500% CPU on the Fedora check system during
installation. As in

  • checking whether package ‘polars’ can be installed ... [6192s/525s] ERROR

which has also exceeded the allowed CPU time limit on that system.

Checking in R-universe, the polars package does indeed have a longer build time than arrow and duckdb.
In other words, it is very likely that it cannot be built in the time allowed by CRAN.

@eitsupi
Copy link
Collaborator Author

eitsupi commented Sep 6, 2023

For example, the arrow package turns off multiple flags in the build on CRAN (I believe), so it may be possible to reduce build time by reducing the rust-polars features in the same way here.
So someone may want to work on that.

Edit: I think the arrow package is installed almost completely by default now.

@mrwunderbar666
Copy link

@eitsupi Sorry, mistyped! I meant it should not be a priority

@eitsupi
Copy link
Collaborator Author

eitsupi commented Sep 16, 2023

In other words, all existing packages on CRAN that use Rust (gifski, string2path, hellorust, rshift, tok) vendor dependent crates. But that is not possible here (> 10MB), and it is unclear how to work around it.

CRAN has approved 12MB of prqlr source code.
Likewise in polars, source package size may no longer be an issue.

Since Debian may soon reach Rust 1.70, it may be worthwhile to try again to submit to CRAN in a few months.

My concern is that the build time is too long...
Checking on R-universe, duckdb was built in 30 minutes while polars took over 50 minutes.
https://github.com/r-universe/hannes/actions/runs/6105495163
https://github.com/r-universe/pola-rs/actions/runs/6137342310

@eitsupi
Copy link
Collaborator Author

eitsupi commented Oct 7, 2023

It seems that the initial arrow package on CRAN did not include libarrow.
https://github.com/cran/arrow/tree/7d575c8eacb1757dcb38cc4115bf0ece39690b79

In fact, we can confirm that nothing works when this version is installed.

> remotes::install_version("arrow", version = "0.14.1")
Downloading package from url: https://packagemanager.posit.co/cran/__linux__/jammy/latest/src/contrib/Archive/arrow/arrow_0.14.1.tar.gz
These packages have more recent versions available.
It is recommended to update all of them.
Which would you like to update?

1: All
2: CRAN packages only
3: None
4: withr (2.5.0 -> 2.5.1) [CRAN]


Enter one or more numbers, or an empty line to skip updates: 1
withr      (2.5.0 -> 2.5.1) [CRAN]
bit        (NA    -> 4.0.5) [CRAN]
tidyselect (NA    -> 1.2.0) [CRAN]
bit64      (NA    -> 4.0.5) [CRAN]
assertthat (NA    -> 0.2.1) [CRAN]
Installing 5 packages: withr, bit, tidyselect, bit64, assertthat
Installing packages into/usr/local/lib/R/site-library’
(aslibis unspecified)
trying URL 'https://packagemanager.posit.co/cran/__linux__/jammy/latest/src/contrib/withr_2.5.1.tar.gz'
Content type 'binary/octet-stream' length 228097 bytes (222 KB)
==================================================
downloaded 222 KB

trying URL 'https://packagemanager.posit.co/cran/__linux__/jammy/latest/src/contrib/bit_4.0.5.tar.gz'
Content type 'binary/octet-stream' length 1130781 bytes (1.1 MB)
==================================================
downloaded 1.1 MB

trying URL 'https://packagemanager.posit.co/cran/__linux__/jammy/latest/src/contrib/tidyselect_1.2.0.tar.gz'
Content type 'binary/octet-stream' length 219931 bytes (214 KB)
==================================================
downloaded 214 KB

trying URL 'https://packagemanager.posit.co/cran/__linux__/jammy/latest/src/contrib/bit64_4.0.5.tar.gz'
Content type 'binary/octet-stream' length 475413 bytes (464 KB)
==================================================
downloaded 464 KB

trying URL 'https://packagemanager.posit.co/cran/__linux__/jammy/latest/src/contrib/assertthat_0.2.1.tar.gz'
Content type 'binary/octet-stream' length 52457 bytes (51 KB)
==================================================
downloaded 51 KB

* installing *binary* packagewithr...
* DONE (withr)
* installing *binary* packagebit...
* DONE (bit)
* installing *binary* packageassertthat...
* DONE (assertthat)
* installing *binary* packagetidyselect...
* DONE (tidyselect)
* installing *binary* packagebit64...
* DONE (bit64)

The downloaded source packages are in/tmp/RtmpcEbl7u/downloaded_packagesInstalling package into/usr/local/lib/R/site-library’
(aslibis unspecified)
* installing *binary* packagearrow...
* DONE (arrow)
> library(arrow)
Attaching package:arrowThe following object is masked frompackage:utils:

    timestamp

The following objects are masked frompackage:base:

    array, table

> set.seed(24)
> tab <- arrow::table(x = 1:10, y = rnorm(10))
Error in Table__from_dots(dots, schema) :
  Cannot call Table__from_dots(). Please use arrow::install_arrow() to install required runtime libraries.

Similarly, it may be worthwhile here to reduce functionality from the default feature to the bare minimum to reduce build time on CRAN......

@eitsupi
Copy link
Collaborator Author

eitsupi commented Nov 3, 2023

I'm still considering the possibility of a CRAN release, but given the frequency of breaking changes with frequent breaking changes upstream, maybe it's better to avoid CRAN releases for a while?

@etiennebacher
Copy link
Collaborator

I don't see why this is a problem as long as we advertise well that there are still a lot of breaking changes upstream. The way I see it, releasing on CRAN is just a way to make it easier for people to install the package. We should emphasize that it shouldn't be used in other packages for now, but I don't see the number of breaking changes as a blocker.

If you still prefer to wait for upstream polars to be more stable, they plan to release 1.0 by the end of the year: pola-rs/polars#6616 (comment)

@eitsupi
Copy link
Collaborator Author

eitsupi commented Nov 3, 2023

I don't see why this is a problem as long as we advertise well that there are still a lot of breaking changes upstream.

My intention was that it would take a lot of effort to fix with reverse dependency checking.

@eitsupi
Copy link
Collaborator Author

eitsupi commented Jan 13, 2024

I think almost all the issues have been resolved, so if Rust on CRAN catches up to MSRV of polars, I would like to submit to CRAN again.

However, Rust on CRAN has been stagnant for more than half a year due to the outdated Fedora server becoming a bottleneck, so we don't know when that will happen.
https://stat.ethz.ch/pipermail/r-package-devel/2023q4/010147.html

@etiennebacher
Copy link
Collaborator

Supposing that the Rust version of CRAN matches the MSRV of polars, would we actually be ready to submit? If I remember correctly one of the main issues was also the package size, do we have a clear strategy for this?

@eitsupi
Copy link
Collaborator Author

eitsupi commented Jan 13, 2024

I think the package size issue has now been resolved.
The size of prqlr 0.7.0 is currently 15MB, making it the third largest package on CRAN, but it was released quickly.

@etiennebacher
Copy link
Collaborator

Wasn't polars much larger than that? This comment mentioned 34MB

@eitsupi
Copy link
Collaborator Author

eitsupi commented Jan 13, 2024

Wasn't polars much larger than that? This comment mentioned 34MB

The point here is that packages larger than 5MB are usually not allowed. I don't think there is any difference between 15MB and 34MB in the sense that it exceeds 5MB. (Maybe there is, but we won't know until we ask CRAN)

@david-cortes
Copy link

Wasn't polars much larger than that? This comment mentioned 34MB

The point here is that packages larger than 5MB are usually not allowed. I don't think there is any difference between 15MB and 34MB in the sense that it exceeds 5MB. (Maybe there is, but we won't know until we ask CRAN)

Not sure if this is applicable to rust packages, but if e.g. header files are required from other libraries and they make up substantial space, one potential workaround could be to create multiple packages containing only those headers under inst, and then add those packages as dependencies under LinkingTo in DESCRIPTION.

@eitsupi
Copy link
Collaborator Author

eitsupi commented Mar 10, 2024

Thank you for your comment. Cargo supports vendoring very well and the CRAN team is aware of the package size issue, so it's probably OK for Rust to just include all the source code in the package.

@eitsupi
Copy link
Collaborator Author

eitsupi commented Mar 18, 2024

Given the low update frequency of Rust on CRAN (they are currently using Fedora 36 from EOL for almost a year now, and they are at Rust 1.69 with no plans to be updated in the future) and the high update frequency of MSRV on (Rust) Polars, I don't know if we should submit polars to CRAN even if they temporarily meet the MSRV requirements.

One option would be to start CRAN releases after the infrastructure on CRAN has improved and there is a guarantee that Rust will be regularly updated.
(See https://github.com/RConsortium/r-repositories-wg)

@etiennebacher
Copy link
Collaborator

(See RConsortium/r-repositories-wg)

I don't see any discussion on Rust on this repo, am I missing something or do you mean that we should initiate the discussion there?

@eitsupi
Copy link
Collaborator Author

eitsupi commented Mar 19, 2024

@etiennebacher This is a problem with the current CRAN infrastructure, not with Rust support. In other words, the current CRAN may be stuck with the Rust version 1.69 forever because it does not know if Fedora 36 will be used until a week, a year, or 10 years from now.
I suspect this situation will continue until the current infrastructure, which volunteers have spent a great deal of effort to maintain, is completely renewed.

If updated to container-based IaC, at least it will not happen that Linux distributions that have reached EOL will remain in use for years to come.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
installation About installation or setup
Projects
None yet
Development

No branches or pull requests

7 participants