Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scope of future plan #457

Open
klar-C opened this issue Dec 24, 2020 · 5 comments
Open

Scope of future plan #457

klar-C opened this issue Dec 24, 2020 · 5 comments

Comments

@klar-C
Copy link

klar-C commented Dec 24, 2020

Hi,

Is it possible to have a plan being valid in a specific scope of a R session only, i.e. to have multiple plans in the same R session?

I know it is possible to specify the workers for the plan, but is it possible to have multiple plans? future({}) does not seem to have an option for specifying a plan either.

cl <- parallel::makePSOCKcluster(1L,rscript_libs='E:/rpackages')
parallel::clusterEvalQ(cl, { .libPaths('E:/rpackages') })
plan(cluster, workers = cl)

The reason I'm asking is because I would like to have separate plans/clusters for different shiny sessions that run in the same R process. The workers of the clusters need to do some setup and I'd like that to be specific to the user session only.

The idea I would have had was to rerun the plan() command before each future call, but that could mess up the promises later in the event loop?

Edit: Attached to the same question ("linking a cluster to a session") I wanted to ask whether it is possible to keep the state of a cluster after a future. I tried to assign objects to the global environment but in the following future run they all seem gone. This makes sense in the paradigm of futures being indifferent to who calls them, but as said my goal is to link a specific cluster to a session (and e.g. keep some base data in the cluster which I don't want to re-query each time).
I potentially could re-implement it with https://cran.r-project.org/web/packages/future/vignettes/future-6-future-api-backend-specification.html but I was hoping there is some trick/setting that I'm just not aware of.

It looks like constantly using the persistent=TRUE flag works in f <- future({...},persistent=TRUE). Found it by looking at the source code. Please let me know if this is not the right way to do it / think about it.

Best regards

@klar-C
Copy link
Author

klar-C commented Dec 26, 2020

Here's an example I think works but let me know if there's something I'm missing. I tested it, and it seems to work fine in terms of keeping the clusters separate.

sessionclusterclass <- R6Class(
  "sessionclusterclass", 
  public = list(
    localcl=NULL,
    sessiontoken=NULL,
    initialize=function() {
      session <- getDefaultReactiveDomain()
      self$localcl <- parallel::makePSOCKcluster(1)
      currentlib <- .libPaths()[1]
      currentloaded <- (.packages())
      self$sessiontoken <- {if (is.null(session$token)) {uuid::UUIDgenerate()} else {session$token}}
      eval(parse_expr(glue("parallel::clusterEvalQ(self$localcl,.libPaths('{currentlib}'))")))
      plan(cluster,workers=self$localcl)
      r <- future({
        assign('anchorsessiontoken',sessiontoken,envir=.GlobalEnv)
        library(magrittr)
        currentloaded %>% purrr::walk(~library(.x,character.only=TRUE))
      },persistent=TRUE,globals=list(sessiontoken=self$sessiontoken,currentloaded=currentloaded))
      invisible(value(r))
      return(invisible(TRUE))
    },
    closedown=function() {
      parallel::stopCluster(self$localcl)
    },
    future=function(exp,...) {
      plan(cluster,workers=self$localcl,.cleanup = FALSE,.init=FALSE)
      exp <- substitute(exp)
      f_anonym <- pryr::make_function(list(),body=exp) %>% stripfunction() %>% capture.output() %>% paste(collapse='\n')
      cat('starting future...\n')
      r <- future({
        if (!sessiontoken==anchorsessiontoken) stop('sessiontokens do not line up within the cluster')
        f <- eval(rlang::parse_expr(f_anonym))
        r <- f()
        r
      },globals=list(f_anonym=f_anonym,sessiontoken=self$sessiontoken,...),persistent=TRUE)
      # test for the sessiontoken that gets returned needs to happen outside     
      return(r)
    }
  )
)


ui <- div(
  DTOutput('trend_tbl'),
  actionButton('ok','ok'),
  actionButton('run_trends','run_trends')
)



server <- function(input, output, session) {  
  scc <- sessionclusterclass$new()  
  observeEvent(input$ok,{
    print('hello')
  })  
  dt_trend <- eventReactive(input$run_trends,{
      dat_func <- function() {        
        start_time <- Sys.time()
        dt <- data.table(x = rnorm(100), y = rnorm(100))
        trendy_tbl <- head(dt, 10)
        ggplo1 <- ggplot(dt) + geom_point(aes(x=x,y=y))
        Sys.sleep(10)
        list(trendy_tbl)
      }
      scc$future({
        dat_func()
      },dat_func=dat_func)
  })  
  output$trend_tbl <- renderDT({dt_trend() %...>% '[['(1)})  
}

stripfunction <- function(f,env=.GlobalEnv) {
  f <- pryr::make_function(formals(f),body(f),env=env)
  return(f)
}

shinyApp(ui,server)

@HenrikBengtsson
Copy link
Owner

@klar-C, could you please edit your comments to use Markdown fenced code blocks, cf. https://guides.github.com/features/mastering-markdown/? That would make it easier to read the code for me and others.

@klar-C
Copy link
Author

klar-C commented Dec 26, 2020

@HenrikBengtsson Yep, done - apologies.

@HenrikBengtsson
Copy link
Owner

I know it is possible to specify the workers for the plan, but is it possible to have multiple plans? future({}) does not seem to have an option for specifying a plan either.

Correct and correct. The most likely solution to this is what I refer to as "resource" specifications, e.g.

f1 <- future(..., resources = "abc")
f2 <- future(..., resources = "def")

where only pre-registered future backends that can provide the resource "abc" will be able to take on future f1 and likewise for f2. The user would set up a set of future backends, e.g.

abc <- tweak(cluster, workers = 3L, provides = c("abc", "localhost", mem=4*1024^3))
def <- tweak(cluster, workers = 2L, provides = c("def", "localhost", mem=2*1024^3))
setOfPlans(list(abc, def))

If the above is set and the code requests resource "ghi" but none of the known plans supports it, then an error is thrown, e.g.

f3 <- future(..., resources = "ghi")
## Error: Cannot resolve future because none of the registered future backends provides resource 'ghi'

This is obviously not implemented but that is how I anticipate expanding the Future API so that one in code can specify various types of resources, including generic resources (aka "gres" in the HPC scheduler world) such as "abc" and "def".

Although it's not obvious from the commits, basically every new release of the package contain updates toward supporting the above. There's a large amount of cleanups/constraining of the Future API that need to take place for this to happen, e.g. dropping support for local = FALSE and persistent = TRUE. In order to drop those, especially the latter, alternatives solutions for them need to be provided.

It looks like constantly using the persistent=TRUE flag works in f <- future({...},persistent=TRUE). Found it by looking at the source code. Please let me know if this is not the right way to do it / think about it.

Please see above and Issue #433; support for argument persistent is deprecated and will become defunct at some point. Per #433, it probably won't get defunct until easy use of "sticky globals" is implemented and possibly also not before "resources" are implemented but I cannot promise anything.

Other than that, it looks like you've understood the inner machinery of the current 'cluster' implementation. Skimming through I think you've got the gist. If you choose to use persistent=TRUE, and the other internal arguments .cleanup and .init for now, please know that I cannot promise it will be around. Also, please don't submit such code to CRAN because that will be a blocker for me when I need to remove those in some future release.

So, unfortunately, I don't think there's an obvious solution to your needs until resources and sets of backends are supported, which is still quite a long way out. You're not the first one we these needs; I've labeled other similar feature requests on this as 'feature/resources'.

@klar-C
Copy link
Author

klar-C commented Dec 26, 2020

Ok, got it - thanks a lot for the detailed response.

The solution you laid out would be exactly what I was looking for in the first place.

And def. won't submit anything to CRAN on this.

Regards

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants