
Parallelization #45

Open
2 of 3 tasks
wildart opened this issue Mar 18, 2020 · 11 comments

Comments

@wildart
Owner

wildart commented Mar 18, 2020

Consider parallelization of the algorithms in multiple modes:

  • Single process (core)
  • Multi-threading
  • Multi-process (multi-core)
@tpdsantos

Despite the major changes I tried to pull in #43, parallelizing the entire population is not that hard to implement using the DistributedArrays package. I already have a prototype that works well with several processes on the same computer, and I almost have a way to easily incorporate this in a cluster. If I have time I can add this prototype in another PR.
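For reference, a minimal sketch of what a DistributedArrays-based prototype like this could look like (the `fitness` function is a placeholder, not code from the PR):

```julia
using Distributed
addprocs(4)  # spawn local worker processes

@everywhere using DistributedArrays
@everywhere fitness(x) = sum(abs2, x)  # placeholder per-individual objective

population = [rand(10) for _ in 1:100]
dpop = distribute(population)          # split the population across workers

# each worker evaluates its local chunk; map over a DArray returns a DArray
scores = map(fitness, dpop)
best = minimum(Array(scores))          # gather results back to the master
```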

@wildart
Owner Author

wildart commented Mar 18, 2020

I do not think that DistributedArrays is the answer. It covers only the multi-core scenario. Even the basic Julia parallel computing routines are enough for the scatter-gather computations required by evolutionary algorithms.

My goal is to make some sort of universal parallelization pipeline that can be configured for a specific computational topology and then used to run any evolutionary algorithm.
In order to do that, all parallelizable parts of the evolutionary algorithms need to be self-contained, side-effect-free functions, similar to #26 or the GA part of #43.

The computational pipeline should have a simple interface and comprehensible syntax,

input |> ga(fitness = objFunc, mutation = inversion) |> Distributed(ncores = 10) 

or maybe even some simple DSL,

@local ga(input, mutationRate = 0.2, tolIter = 20) do
    population |> roulette |> inversion |> offspring
end

@tpdsantos

I do not think that DistributedArrays is an answer. It covers only multi-core scenario. Even basic julia parallel computing routines are enough for scatter-gather computations that are required for evolutionary algorithms.

I understand what you're saying, but the Distributed package also only deals with multiple processes. You could use something like Base.Threads, but that wouldn't be easy at all, since the ga function would need major changes.

@wildart
Owner Author

wildart commented Mar 18, 2020

but that wouldn't be easy at all, since the ga function would need major changes.

Which you are already doing in #43 😉

Anyway, I think the right approach would be to start compartmentalizing code of evolutionary functions.

@wildart
Owner Author

wildart commented Apr 23, 2020

#49 should provide an easier way of implementing parallelized versions of existing algorithms by introducing a series of new states with appropriate parallel update_state! implementations.

@jtravs

jtravs commented Nov 24, 2020

An easy approach to parallelisation which would be useful is to ask the user (caller) to calculate the fitness for many individuals at once. The user can then simply parallelise that call themselves (using whatever means is appropriate for their machine), and minimal changes are required in this package. I.e. what I am suggesting is simply that the user provides a function which internally iterates over the population. In a simple case I could already use this directly and parallelise through e.g. a call to pmap.
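A sketch of that suggestion, assuming the package accepted a population-level objective (the `fitness_many` name and the per-individual `expensive_fitness` function are hypothetical, introduced only for illustration):

```julia
using Distributed
addprocs(4)  # spawn local worker processes

@everywhere expensive_fitness(x) = sum(abs2, x)  # placeholder objective

# population-level objective: the caller decides how to parallelise.
# Here pmap farms individuals out to the worker processes.
fitness_many(pop) = pmap(expensive_fitness, pop)

population = [rand(5) for _ in 1:32]
scores = fitness_many(population)  # one score per individual
```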

@wildart
Owner Author

wildart commented Dec 1, 2020

calculate the fitness for many individuals at once.

That might work. Currently, fitness evaluation is done by a value call with the objective and the individual as parameters. If a broadcast (in-place) version of it can be introduced to perform bulk evaluation, it can be overloaded for a specific individual type to introduce a concurrent broadcast version.
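As a rough sketch of that broadcast idea (the function names and signatures here are illustrative, not the package's actual API):

```julia
# serial bulk evaluation: fill F in place with the objective
# applied to each individual in the population
function bulk_value!(objfun, F::AbstractVector, xs::AbstractVector)
    for i in eachindex(xs)
        F[i] = objfun(xs[i])
    end
    return F
end

# a concurrent override of the same bulk evaluation, one task per chunk
function bulk_value_threaded!(objfun, F::AbstractVector, xs::AbstractVector)
    Threads.@threads for i in eachindex(xs)
        F[i] = objfun(xs[i])
    end
    return F
end
```

Because both versions share the same signature, the algorithm's update loop can call the bulk method without knowing whether evaluation is serial or concurrent.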

@gasagna

gasagna commented Jul 7, 2021

Hi, nice package!

Have there been any updates on the parallelisation?

@wildart
Owner Author

wildart commented Jul 16, 2021

I added a simple override for multi-threaded fitness evaluation: https://wildart.github.io/Evolutionary.jl/dev/tutorial/#Parallelization.
Look up the dev part of the documentation for information on creating additional overrides for parallel fitness evaluation: https://wildart.github.io/Evolutionary.jl/dev/dev/#Parallelization.

  • Note: This parallelization implementation works better if the fitness function requires considerable computational resources.

@nguyentmanh

Hi!

I'm trying to implement parallelization following the first link, but when I install the package and check Evolutionary.Options(), parallelization is not listed. Actually, "rng" and "callback" are also missing. Do you know what the reason might be? Thanks!

@mfogelson

Just want to follow up and say that allowing value functions to use Distributed would be a great addition.

I think the threading helps, but it doesn't allow you to evaluate multiple samples from the population simultaneously across processes.
