Solution worse than initial guess #54

Open
compleathorseplayer opened this issue Jun 15, 2020 · 16 comments

compleathorseplayer commented Jun 15, 2020

I have been comparing the performance of CMAES() with BlackBoxOptim on a 50-dimensional problem. BlackBoxOptim has been giving qualitatively better results, and even when I initialise CMAES() with the solution from BBO, the candidate solution CMAES produces (after 1500 iterations) is an order of magnitude worse (in terms of the objective) than the initial value.

Am I missing something?

I am just using the basic res=Evolutionary.optimize(f, b, CMAES())

Thanks for any help!

PS this is just a test example of lasso linear regression

wildart self-assigned this Jun 15, 2020
wildart (Owner) commented Jun 15, 2020

It's because the default population size is 1. Play around with the μ and λ parameters for better results.

I'll fix these defaults in the next release.
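
Roughly speaking, λ is the number of candidate solutions sampled per generation and μ is the number of best candidates kept for recombination. A minimal sketch of setting them (same keyword syntax as in a later comment):

    using Evolutionary
    result = Evolutionary.optimize(x -> sum(x.^2), ones(3), CMAES(μ = 5, λ = 50))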

compleathorseplayer (Author) commented Jun 15, 2020

Thanks. I am used to DE/PSO, where the default is to keep the previous best in the population, and usually the current best as well. I don't have good intuition to help me choose μ and λ, but I'll see if I can figure out from the (very brief!) documentation how to set the population size.
Thanks for incorporating these algorithms - I just want to get to know more about the latest optimisation methods by playing around.

I looked through the documentation and I didn't see any examples of how to set the population size, except on the GA method, which I tried to copy.

cma_out=Evolutionary.optimize(f, binit,CMAES(populationSize=100))

This did not work - I just tried copying the introductory example:

result = Evolutionary.optimize(
    x -> sum(x.^2), ones(3),
    GA(populationSize = 100, selection = susinv,
       crossover = discrete, mutation = domainrange(ones(3))))

Could you please give me a hint?

Is there somewhere else with documentation or examples that I missed?

PS Tried

cma_out=Evolutionary.optimize(f, binit,CMAES(lambda=100))

Didn't work either (it had never heard of lambda).

Is there some documentation on this somewhere, or are we supposed to guess?

wildart (Owner) commented Jun 16, 2020

So, the CMAES parameters are actually μ and λ; see all parameters here: https://wildart.github.io/Evolutionary.jl/dev/cmaes/.

julia> result = Evolutionary.optimize(x -> sum(x.^2), ones(3), CMAES())

 * Status: success

 * Candidate solution
    Minimizer:  [2.769642578217888e-6, 1.217369418910909e-6, -5.842439600062792e-6]
    Minimum:    4.328700879355879e-11
    Iterations: 841

 * Found with
    Algorithm: CMAES{Float64}

julia> result = Evolutionary.optimize(x -> sum(x.^2), ones(3), CMAES(μ=5, λ=50))

 * Status: success

 * Candidate solution
    Minimizer:  [2.960367799978342e-7, -1.5133439532656052e-8, 3.154587969807887e-7]
    Minimum:    1.873810486961411e-13
    Iterations: 55

 * Found with
    Algorithm: CMAES{Float64}

compleathorseplayer (Author) commented

Hi. I managed to get it working with the improved parameters. I think there must be something wrong with the implementation, because the objective function keeps getting bigger and bigger. [Or perhaps it is maximizing?]
I checked the objective function, which minimises quite easily using BlackBoxOptim.
I even started from a solution obtained by BBO, but the solutions from CMAES() kept getting worse and worse the longer I ran it. (This should not happen for any optimisation algorithm.)
I am just optimising a penalised least-squares here (as a toy problem).

wildart (Owner) commented Jun 16, 2020

You need to be sure that the objective function can be minimized. Plus, for CMAES the population needs to be a vector (#55). Moreover, you might want to play around with the successive_f_tol option to get a more stable result. Here is a ridge regression implementation:

julia> using Evolutionary, MultivariateStats

julia> m, n, n2 = 9, 6, 3
(9, 6, 3)

julia> X, A, E = randn(m, n), randn(n, n2), randn(m, n2) * 0.1
([0.759505 -0.498365 … -0.140265 -1.58317; -0.44183 -0.405412 … -2.07436 -0.775688; … ; -1.14085 -0.727431 … 0.00781635 -0.679711; -0.0292818 0.171201 … 1.26535 -1.31306], [0.542783 0.482472 0.371404; -0.470562 -0.831348 0.39559; … ; 0.420441 1.19128 -0.0968535; 0.761603 -0.728169 -1.03778], [-0.171325 -0.00380767 -0.0559398; 0.041633 -0.0956199 -0.0616854; … ; 0.19494 -0.105901 0.134468; 0.0228577 -0.0315222 -0.0501453])

julia> Y0 = X * A + E
9×3 Array{Float64,2}:
 -0.0432945   1.52319    4.45818  
 -1.33188    -2.00191    0.0683152
  1.86799    -1.21064   -0.623752 
 -0.684079   -1.34664    2.50405  
  0.0504756   1.52595    0.90838  
  1.35427    -0.180189   0.767996 
 -2.46034    -1.78414   -1.55623  
 -0.99868     0.605168  -1.05425  
 -1.27783     2.58438   -0.51622  

julia> r = 0.1
0.1

julia> W = ridge(X, Y0, r; bias=false)
6×3 Array{Float64,2}:
  0.41124     0.499272   0.321012  
 -0.526903   -0.769362   0.304696  
 -0.62234     0.257677  -1.97596   
  0.0164645   0.142042   0.429671  
  0.425532    1.07026    0.00437913
  0.808652   -0.681024  -1.02297   

julia> ridge_obj(w) = let a = reshape(w,n,n2); sum(abs2, Y0 .- X*a) + r*sum(abs2, a) end
ridge_obj (generic function with 1 method)

julia> result = Evolutionary.optimize(ridge_obj, ones(n*n2), CMAES(μ=10, λ=50), Evolutionary.Options(successive_f_tol=15))

 * Status: success

 * Candidate solution
    Minimizer:  [0.4112402105950299, -0.5269029265047389, -0.6223401986374688,  ...]
    Minimum:    1.0358078520398115
    Iterations: 550

 * Found with
    Algorithm: CMAES{Float64}


julia> abs.(W .- reshape(Evolutionary.minimizer(result), n, n2))
6×3 Array{Float64,2}:
 3.24025e-10  3.41177e-12  9.94897e-10
 9.57585e-10  6.57563e-10  1.39324e-9 
 4.8907e-10   9.63813e-10  4.11856e-10
 1.61619e-10  5.99307e-10  1.59142e-9 
 4.99667e-12  6.50458e-10  1.2045e-9  
 2.07028e-10  2.20672e-10  1.39011e-9 

compleathorseplayer (Author) commented Jun 19, 2020

Hi. Thanks, I will try examples like that myself and compare with BlackBoxOptim.

Is there any place where examples like the above are shown? It is very difficult to guess how to use parameters.

I tried to use the GA version and also had difficulties:

using Evolutionary
res = Evolutionary.optimize(f_bin,
    bv0,
    GA(populationSize=50, \epsilon=5, crossoverRate=.3, mutationRate=.1),
    Evolutionary.Options(iterations=20))

And it complained 'no method matching ...', though it worked with GA(populationSize=50).
Similarly, I am not getting any progress from the initial guess on any of the examples I have tried with GA() or CMAES(). As I said, I will try your examples above.

Thanks for your help!
PS I just found out that the '\epsilon-tab' argument is not working for me. I left it out and it worked, but I need it. Any ideas of how to include it without the Greek symbol?

compleathorseplayer (Author) commented Jun 19, 2020

Hi. Has anyone apart from you gotten this to work?

I have used GAs and evolutionary optimisation in the past (and even teach courses on them), so I have some experience, and I have to say honestly that I haven't gotten anything to work here at all.
Like I said above, I will try to repeat your examples, but so far, with quite a bit of playing around, I haven't gotten any solutions better than the initial guesses, and in most cases they were worse.
[Sorry to give this kind of feedback, but if you have taken the time and effort to write this (which is not unappreciated!) and make it available, you obviously must care about its being useful.]
I would love to be able to explore these methods in Julia and to share them with my students, but in my opinion, this implementation is not working.

wildart (Owner) commented Jun 19, 2020

I understand your frustration with the Unicode symbols. I'm going to provide English-based parameter names in future versions.

Any ideas of how to include it without the Greek symbol?

You may copy a Unicode symbol from any web source, e.g. from the package docs https://wildart.github.io/Evolutionary.jl/dev/cmaes/

GA(populationSize=50,\epsilon=5,crossoverRate=.3,mutationRate=.1),

Unfortunately, you cannot use \epsilon as a parameter name instead of ɛ. And with the default call GA(populationSize=50), the performance will be truly abysmal, because the mutation and selection operations are identity by default (no mutation). You need to provide the mutation operations, e.g.

    result = Evolutionary.optimize(
        objfun,
        initial_individual,
        GA(
            populationSize = 4P,                  # P: problem-dependent scale factor (placeholder)
            mutationRate = 0.15,
            ɛ = 0.1,                              # elitism: share of best individuals carried over
            selection = susinv,
            crossover = intermediate(0.25),
            mutation = domainrange(fill(0.5,N)),  # N: dimension of the individual (placeholder)
        ));

Is there any place where examples like the above are shown? It is very difficult to guess how to use parameters.

Look in the test folder. There are many tests that show how to use the package functions.

Hi. Has anyone apart from you gotten this to work?

I am not aware of any efforts, but I periodically receive various issues, which I'm happy to address, that indicate that this library is used. I appreciate your feedback; it has been quite valuable.

compleathorseplayer (Author) commented Jun 19, 2020

Hi Art, my main suggestion would be to make the defaults more reasonable. Given how bad my results have been, I also wonder whether the default tolerance has the right order of magnitude; all my experience with it so far seems to indicate that the algorithm is not even trying to improve (at the default tolerance).

With regard to the use of Greek symbols, I regard their usage in Julia as a bit precious (and this coming from a PhD mathematician and MIT alum!). Why not allow the anglicised versions (in this case 'epsilon') as well? [just a thought]

I really like the simplicity of your interface, and would love to have this as a tool, so thank you for your efforts. I still haven't gotten it usefully working, but look forward to its development.
My main reason for trying this out in the first place was to see whether the hype about CMA-ES is justified. I still haven't answered that. BlackBoxOptim (based on DE and a bit of PSO?) seems to work pretty well. Is there an example you might include that would show off CMA-ES's best features? [everyone can do Rosenbrock and Ackley]
That would be really cool, both for myself and my students.
Thanks!

compleathorseplayer (Author) commented Jun 20, 2020

Hi Art,

First, I verified that I could run your regression example using CMAES, and that worked fine.

Next, I tried your suggestion above for the GA.

There were a few issues, but I eventually got it running without error. The 'epsilon' issue had to do with it expecting '\varepsilon(tab)' instead of '\epsilon(tab)'.

Then there did seem to be a bug, in the 'mutation = ...' line.
I didn't know what N was, so I guessed the dimensionality of my problem (33).
It fell over because somewhere there was an instance of Bool(.99), which it didn't like.
I deleted that line and the algorithm ran OK, and even said 'success', but the solution was just the initial guess, which I happen to know isn't even near the correct solution.

Here is what I am trying to do. Suppose I have two regression functions which have generated data, but I don't know which data point belongs to which model (say each point has about a 50% chance of belonging to one or the other model), nor do I know the regression parameters. What I want the GA to do is divide the points into two groups, where for each group a least-squares fit is made and measured by its sum of squares, the total SS being the objective to be minimised.
BlackBoxOptim can solve this problem, but I have to round the candidates to binary, which is obviously wasteful and takes a long time. Unfortunately the GA in Evolutionary does not work at all for this (no progress at all!), despite your hints and suggestions (for which I am nevertheless grateful!).

It seems to be behaving as if it is not distinguishing between objective values. I even multiplied the SSE in my objective function by huge numbers to make sure changes weren't being ignored due to tolerance, but not once did I see any progress at all.

Thanks for any hints!
I will try to post a toy version of my problem on here.

wildart (Owner) commented Jun 21, 2020

I added an updated CMAES with new default parameters and weighted recombination. It is now based on https://arxiv.org/abs/1604.00772; the old version was truly a toy implementation with many limitations.
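
For reference, a minimal sketch of what "weighted recombination" means in that tutorial (illustrative Julia only, not the package's internal code): the new distribution mean is a log-weighted average of the μ best of λ sampled candidates.

    μ, λ = 5, 10
    w = log(μ + 1/2) .- log.(1:μ)    # log-decreasing raw weights
    w ./= sum(w)                     # normalize so that sum(w) == 1
    xs = [randn(3) for _ in 1:λ]     # λ sampled candidates (stand-ins for CMA-ES samples)
    best = sortperm([sum(abs2, x) for x in xs])[1:μ]  # indices of the μ best by objective
    m = sum(w[i] * xs[best[i]] for i in 1:μ)          # recombined (new) mean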

With GA you need to be sure that the mutation operations correspond to the individual's type. E.g. in the OneMax problem the individual is a vector of bits, so only bit-wise mutations are applicable. Moreover, the objective function is a bit count, so you need to take its negative for minimization.

Sometimes, when the function minimum is 0, some selection operations do not work correctly, e.g. roulette wheel selection, because the selection process is proportionate to a fitness value that goes to 0. So you either take a multiplicative inverse of the fitness function, or pick a more appropriate selection algorithm.
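
For instance, a minimal OneMax sketch using the multiplicative-inverse trick together with the package's bit-wise operators (operator and parameter names as used elsewhere in this thread; exact defaults may differ between versions):

    using Evolutionary

    # Maximize the number of ones by minimizing a strictly positive inverse,
    # so that proportionate selection (susinv) stays well behaved.
    onemax(b) = 1.0 / (1 + count(b))

    result = Evolutionary.optimize(
        onemax,
        falses(30),                    # start from the all-zeros bit string
        GA(populationSize = 50,
           mutationRate = 0.1,
           selection = susinv,
           crossover = twopoint,       # bit-wise crossover
           mutation = flip))           # bit-wise mutation

    count(Evolutionary.minimizer(result))  # ideally 30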

compleathorseplayer (Author) commented Jun 21, 2020

Hi Art, Thanks for getting back to me. I will re-download and try the CMA-ES.

In terms of the GA, I have worked out my toy problem. I think you will see what seems to be problematic if you try to solve it.

x = randn(100); y = [(1 .+ 2*x[1:50]); (3 .+ 0.5*x[51:100])]; # generate two regressions

# here is the objective function - it takes a bit string (n=100) as an argument

function f(b)
    inds0 = 1:100;  inds1 = inds0[b];  inds2 = inds0[.!b]
    m1 = length(inds1);  m2 = length(inds2);  sse = 0.0

    if m1 > 0    # Regression 1: coefficients and SSE for group 1
        beta1 = [ones(m1) x[inds1]] \ y[inds1]
        sse += sum((y[inds1] - [ones(m1) x[inds1]]*beta1).^2)
    end

    if m2 > 0    # Regression 2: coefficients and SSE for group 2
        beta2 = [ones(m2) x[inds2]] \ y[inds2]
        sse += sum((y[inds2] - [ones(m2) x[inds2]]*beta2).^2)
    end
    return sse   # total SSE is the objective value
end

# Starting point (alternating 1's and 0's)
binvec0 = trues(100); binvec0[2:2:100] .= false;

#My attempt based on your input:

result = Evolutionary.optimize(
        f,
        binvec0,
        GA(
            populationSize = 40,
            mutationRate = 0.15,
            ε = 0.1,
            selection = susinv,
            crossover = intermediate(0.25),
            #mutation = domainrange(fill(0.5,100))
        ))

bopt=Evolutionary.minimizer(result);

Note I have commented out the 'mutation' line because it throws an error.
When I run without it, the code runs happily but gives a nonsensical solution.

Thanks for any help!

Cheers

wildart (Owner) commented Jun 22, 2020

With a bit vector, you need a binary crossover (https://wildart.github.io/Evolutionary.jl/dev/crossover/) and mutation (https://wildart.github.io/Evolutionary.jl/dev/mutation/#Genetic-Algorithm-1):

julia> result = Evolutionary.optimize(
               f,
               binvec0,
               GA(
                   populationSize = 40,
                   mutationRate = 0.15,
                   epsilon = 0.1,
                   selection = susinv,
                   crossover = twopoint,
                   mutation = flip
               ))

 * Status: success

 * Candidate solution
    Minimizer:  [true, true, true,  ...]
    Minimum:    2.311115933264683e-29
    Iterations: 240

 * Found with
    Algorithm: GA


julia> bopt=Evolutionary.minimizer(result);

julia> count(bopt[1:50])
50

compleathorseplayer (Author) commented

Thanks so much for taking a look! Yes, that is the solution!

Of course I know the necessity of binary crossover, but I did not understand the syntax of your previous example well enough to know that that was not happening.
Is there any reason why the algorithm couldn't just assume bitwise operations by default then, whenever presented with binary data? [just a thought]

Short of this, if the last three keyword arguments (describing the operations) could be clearly documented somewhere, that would make a huge difference to potential users.

Thank you!

wildart (Owner) commented Jun 22, 2020

Is there any reason why the algorithm couldn't just assume bitwise operations by default then, whenever presented with binary data?

The API is designed in a way that the individual's setup is separate from the algorithm parameters, so the user has to draw the connection between the two.
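
Roughly, the pairing looks like this (both configurations appear earlier in this thread):

    using Evolutionary

    # BitVector individuals: bit-wise operators
    ga_bits = GA(populationSize = 40, selection = susinv,
                 crossover = twopoint, mutation = flip)

    # Real-valued individuals of dimension N: real-valued operators
    N = 33
    ga_reals = GA(populationSize = 40, selection = susinv,
                  crossover = intermediate(0.25),
                  mutation = domainrange(fill(0.5, N)))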

I plan to improve the documentation, but it'll take some time.

compleathorseplayer (Author) commented Jun 22, 2020

That's great. Regarding your comment about the API, might I suggest: since (as it seems) the algorithm falls over for bit-string arguments when the operations are not set to binary ones, then perhaps, inside a try/catch (with a warning), the binary operations could be attempted whenever the argument is binary and an error occurs? (just a thought; a rough sketch follows)
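
Something along these lines, say (purely hypothetical helpers dispatching on the individual's type, not part of the package):

    using Evolutionary

    # Hypothetical defaults keyed on the individual's type
    default_crossover(::BitVector) = twopoint
    default_crossover(::AbstractVector{<:Real}) = intermediate(0.25)
    default_mutation(::BitVector) = flip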

I should add that since your hint above, I have tried out the GA on quite a few examples, and have had very good results!
