
Parallelization #45

Open
2 of 3 tasks
wildart opened this issue Mar 18, 2020 · 11 comments

Comments

@wildart
Owner

wildart commented Mar 18, 2020

Consider parallelization of the algorithms in multiple modes:

  • Single process (core)
  • Multi-threading
  • Multi-process (multi-core)
@tpdsantos

Despite the major changes I tried to pull in #43, parallelizing the entire population is not that hard to implement using the DistributedArrays package. I already have a prototype that works well with several processes on the same computer, and I almost have a way to easily incorporate this in a cluster. If I have time I can add this prototype in another PR.
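For reference, a minimal sketch of what a DistributedArrays-based prototype like this could look like (the `fitness` function is a placeholder, not code from the PR):

```julia
using Distributed
addprocs(4)  # spawn local worker processes

@everywhere using DistributedArrays
@everywhere fitness(x) = sum(abs2, x)  # placeholder per-individual objective

population = [rand(10) for _ in 1:100]
dpop = distribute(population)          # split the population across workers

# each worker evaluates its local chunk; map over a DArray returns a DArray
scores = map(fitness, dpop)
best = minimum(Array(scores))          # gather results back to the master
```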

@wildart
Owner Author

wildart commented Mar 18, 2020

I do not think that DistributedArrays is the answer. It covers only the multi-core scenario. Even the basic Julia parallel computing routines are enough for the scatter-gather computations required by evolutionary algorithms.

My goal is to make some sort of universal parallelization pipeline that can be configured for a specific computational topology and then used to run any evolutionary algorithm.
In order to do that, all parallelizable parts of the evolutionary algorithms need to be self-contained, side-effect-free functions, similar to #26 or the GA part of #43.

The computational pipeline should have a simple interface and comprehensible syntax,

input |> ga(fitness = objFunc, mutation = inversion) |> Distributed(ncores = 10) 

or maybe even some simple DSL,

@local ga(input, mutationRate = 0.2, tolIter = 20) do
    population |> roulette |> inversion |> offspring
end

@tpdsantos

I do not think that DistributedArrays is an answer. It covers only multi-core scenario. Even basic julia parallel computing routines are enough for scatter-gather computations that are required for evolutionary algorithms.

I understand what you're saying, but the Distributed package also only deals with multiple processes. You could use something like Base.Threads, but that wouldn't be easy at all, since the ga function would need major changes.

@wildart
Owner Author

wildart commented Mar 18, 2020

but that wouldn't be easy at all, since the ga function would need major changes.

Which you are already doing in #43 😉

Anyway, I think the right approach would be to start compartmentalizing code of evolutionary functions.

@wildart
Owner Author

wildart commented Apr 23, 2020

#49 should provide an easier way of implementing parallelized versions of existing algorithms by introducing a series of new states with appropriate parallel update_state! implementations.

@jtravs

jtravs commented Nov 24, 2020

An easy approach to parallelisation which would be useful is to ask the user (caller) to calculate the fitness for many individuals at once. The user can then simply parallelise that call themselves (using whatever means is appropriate for their machine), and minimal changes are required in this package. I.e. what I am suggesting is simply that the user provides a function which internally iterates over the population. In a simple case I could already use this directly and parallelise through e.g. a call to pmap.
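A sketch of that suggestion, assuming the package accepted a population-level objective (the `fitness_many` name and the per-individual `expensive_fitness` function are hypothetical, introduced only for illustration):

```julia
using Distributed
addprocs(4)  # spawn local worker processes

@everywhere expensive_fitness(x) = sum(abs2, x)  # placeholder objective

# population-level objective: the caller decides how to parallelise.
# Here pmap farms individuals out to the worker processes.
fitness_many(pop) = pmap(expensive_fitness, pop)

population = [rand(5) for _ in 1:32]
scores = fitness_many(population)  # one score per individual
```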

@wildart
Owner Author

wildart commented Dec 1, 2020

calculate the fitness for many individuals at once.

That might work. Currently, fitness evaluation is done by a value call with the objective and the individual as parameters. If a broadcast (in-place) version of it can be introduced to perform bulk evaluation, it can be overloaded for a specific individual type to introduce a concurrent broadcast version.
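As a rough sketch of that broadcast idea (the function names and signatures here are illustrative, not the package's actual API):

```julia
# serial bulk evaluation: fill F in place with the objective
# applied to each individual in the population
function bulk_value!(objfun, F::AbstractVector, xs::AbstractVector)
    for i in eachindex(xs)
        F[i] = objfun(xs[i])
    end
    return F
end

# a concurrent override of the same bulk evaluation, one task per chunk
function bulk_value_threaded!(objfun, F::AbstractVector, xs::AbstractVector)
    Threads.@threads for i in eachindex(xs)
        F[i] = objfun(xs[i])
    end
    return F
end
```

Because both versions share the same signature, the algorithm's update loop can call the bulk method without knowing whether evaluation is serial or concurrent.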

@gasagna

gasagna commented Jul 7, 2021

Hi, nice package!

Have there been any updates on the parallelisation?

@wildart
Owner Author

wildart commented Jul 16, 2021

I added a simple override for multi-threaded fitness evaluation: https://wildart.github.io/Evolutionary.jl/dev/tutorial/#Parallelization.
Look up the dev part of the documentation for information on creating additional overrides for parallel fitness evaluation: https://wildart.github.io/Evolutionary.jl/dev/dev/#Parallelization.

  • Note: This parallelization implementation works better if the fitness function requires considerable computational resources.

@nguyentmanh

Hi!

I'm trying to implement parallelization following the first link, but when I install the package and check Evolutionary.Options(), parallelization is not listed. Actually, "rng" and "callback" are also missing. Do you know what the reason might be? Thanks!

@mfogelson

Just want to follow up and say that allowing value functions to use Distributed would be a great addition.

I think the threading helps, but it doesn't allow you to evaluate multiple samples from the population simultaneously across processes.
