
Improve and optimize "pastel distinct" #67

Open
3 of 6 tasks
sharkdp opened this issue Aug 25, 2019 · 5 comments
Labels
enhancement New feature or request

Comments

@sharkdp
Owner

sharkdp commented Aug 25, 2019

  • After the optimization is done, re-arrange the colors such that the minimal difference between a color and any of its predecessors is maximized (https://en.wikipedia.org/wiki/Farthest-first_traversal)
  • add --max-luminance (and similar) options in order to generate colors for a white/black background
  • add --num-iterations-mean, --num-iterations-max, etc to fine-tune the simulated annealing parameters
  • support something like pastel distinct 8 red blue yellow (fix three colors, generate five new)
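The re-arrangement mentioned in the first task could be sketched as a greedy farthest-first traversal. A minimal Python sketch, using plain Euclidean distance as a stand-in for pastel's actual perceptual color-distance function:

```python
import math

def farthest_first(colors, dist):
    """Reorder `colors` so that each color maximizes its minimal
    distance to all previously emitted colors (greedy farthest-first)."""
    remaining = list(colors)
    # Start from an arbitrary element; the traversal is greedy afterwards.
    ordered = [remaining.pop(0)]
    while remaining:
        # Pick the color whose nearest already-chosen neighbor is farthest away.
        best = max(remaining, key=lambda c: min(dist(c, o) for o in ordered))
        remaining.remove(best)
        ordered.append(best)
    return ordered

# Toy example with RGB tuples and Euclidean distance:
def euclid(a, b):
    return math.dist(a, b)

palette = [(255, 0, 0), (250, 0, 0), (0, 0, 255), (0, 255, 0)]
print(farthest_first(palette, euclid))
# → [(255, 0, 0), (0, 0, 255), (0, 255, 0), (250, 0, 0)]
```

Note how the two near-identical reds end up at opposite ends of the ordering, which is exactly the property wanted for a "distinct" palette.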

optimize:

  • do not re-compute all pairwise distances after each mutation (order N^2), but only the ones that changed (order N)
  • Is there something we can easily parallelize? The mutual distance-computation?
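The O(N) update in the first bullet could look roughly like this: keep a symmetric distance matrix and, when the annealing step mutates a single color, refresh only that color's row and column. A hedged sketch (the class and method names are illustrative, not pastel's actual code):

```python
import math

class DistanceCache:
    """Keep pairwise distances so that mutating a single color only
    recomputes the N-1 pairs involving it (order N instead of N^2)."""

    def __init__(self, colors, dist):
        self.colors = list(colors)
        self.dist = dist
        n = len(self.colors)
        self.d = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                self.d[i][j] = self.d[j][i] = dist(self.colors[i], self.colors[j])

    def update(self, i, new_color):
        # Only row/column i changes when color i mutates.
        self.colors[i] = new_color
        for j in range(len(self.colors)):
            if j != i:
                self.d[i][j] = self.d[j][i] = self.dist(new_color, self.colors[j])

    def min_distance(self):
        n = len(self.colors)
        return min(self.d[i][j] for i in range(n) for j in range(i + 1, n))
```

The `min_distance` scan is still O(N^2), but it is a cheap matrix lookup rather than N^2 distance computations; it could be reduced further with a heap if profiling shows it matters.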
@sharkdp
Owner Author

sharkdp commented Sep 1, 2019

@d-dorazio has implemented the major part of this here: #83

@sharkdp
Owner Author

sharkdp commented Sep 1, 2019

I have added a hidden --print-minimal-distance option for debugging/inspection.

To gain some more insights, I wrote a small Python script which runs pastel distinct N (N = 2…13) many times to collect some statistics about the minimal distance.
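A script along those lines might look like the following sketch. It assumes the hidden --print-minimal-distance flag makes pastel print the achieved minimal distance as a float on stdout; the exact output shape and the run count are assumptions here, not the script actually used:

```python
import subprocess

def summarize(n, distances):
    """Format one line of statistics in the style of the tables below."""
    d_max, d_min = max(distances), min(distances)
    d_mean = sum(distances) / len(distances)
    return (f"[{n:03}] max: {d_max:7.3f}, "
            f"min: {d_min:7.3f} ({100 * d_min / d_max:5.1f}%), "
            f"mean: {d_mean:7.3f} ({100 * d_mean / d_max:5.1f}%)")

def run_once(n):
    # Assumption: the last stdout line is the minimal distance as a float.
    out = subprocess.run(
        ["pastel", "distinct", "--print-minimal-distance", str(n)],
        capture_output=True, text=True, check=True,
    ).stdout
    return float(out.strip().splitlines()[-1])

if __name__ == "__main__":
    for n in range(2, 14):
        print(summarize(n, [run_once(n) for _ in range(100)]))
```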

Here is the current version:

[002] max: 258.693, min: 258.675 (100.0%), mean: 258.693 (100.0%)
[003] max: 171.247, min: 171.199 (100.0%), mean: 171.244 (100.0%)
[004] max: 134.193, min: 134.139 (100.0%), mean: 134.180 (100.0%)
[005] max: 111.769, min: 107.472 ( 96.2%), mean: 109.117 ( 97.6%)
[006] max: 103.226, min: 103.180 (100.0%), mean: 103.218 (100.0%)
[007] max:  93.686, min:  90.554 ( 96.7%), mean:  92.303 ( 98.5%)
[008] max:  87.813, min:  85.984 ( 97.9%), mean:  87.488 ( 99.6%)
[009] max:  83.054, min:  80.084 ( 96.4%), mean:  82.133 ( 98.9%)
[010] max:  79.765, min:  79.651 ( 99.9%), mean:  79.721 ( 99.9%)
[011] max:  74.069, min:  71.182 ( 96.1%), mean:  73.915 ( 99.8%)
[012] max:  70.588, min:  67.531 ( 95.7%), mean:  69.774 ( 98.8%)
[013] max:  66.987, min:  64.273 ( 95.9%), mean:  66.091 ( 98.7%)

max shows the best achieved score (i.e. the maximum of all maximized minimal distances). min shows the worst achieved score. mean is the average across all runs. Interestingly, there are some low-N cases like N=5 or N=7 which seem particularly hard (low mean and min values). As far as I can tell, the algorithm seems to get stuck in a local minimum (it looks like this in the --verbose output).

Example (N=5):
This solution is close to the optimum (rearranged):
#ff0800, #0400ff, #230025, #ffff00, #00ffff (D_min = 111.77)
But this solution comes up often (rearranged):
#ff0000, #0000ff, #000030, #ffffe2, #00ff00 (D_min = 107.53)

Unfortunately, when I increase the number of iterations (and decrease the cooling rate), the "mean" values improve (which is expected), but similar "min" values still appear:

[002] max: 258.693, min: 258.675 (100.0%), mean: 258.693 (100.0%)
[003] max: 171.247, min: 171.247 (100.0%), mean: 171.247 (100.0%)
[004] max: 134.193, min: 134.153 (100.0%), mean: 134.181 (100.0%)
[005] max: 111.769, min: 107.473 ( 96.2%), mean: 110.262 ( 98.7%)
[006] max: 103.226, min: 103.119 ( 99.9%), mean: 103.215 (100.0%)
[007] max:  93.686, min:  90.576 ( 96.7%), mean:  92.873 ( 99.1%)
[008] max:  87.813, min:  85.743 ( 97.6%), mean:  87.421 ( 99.6%)
[009] max:  83.054, min:  80.857 ( 97.4%), mean:  82.322 ( 99.1%)
[010] max:  79.765, min:  77.684 ( 97.4%), mean:  79.696 ( 99.9%)
[011] max:  74.069, min:  71.165 ( 96.1%), mean:  73.931 ( 99.8%)
[012] max:  70.588, min:  67.757 ( 96.0%), mean:  70.145 ( 99.4%)
[013] max:  67.004, min:  63.047 ( 94.1%), mean:  66.523 ( 99.3%)

This means that users can still have "unlucky" runs.

The best thing I can come up with right now is to perform multiple runs with random initial starting conditions (starting colors). Then, we would simply pick the best run afterwards. This is also something that could easily be parallelized.
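The restart-and-pick-best idea could be sketched like this, with a process pool doing the independent runs in parallel. The `anneal` function here is a hypothetical stand-in for one simulated-annealing run, not pastel's real optimizer:

```python
import random
from multiprocessing import Pool

def anneal(seed):
    """Hypothetical stand-in for one annealing run from random starting
    colors; returns (minimal_distance, palette). Here it just draws a
    deterministic pseudo-random score from the seed."""
    rng = random.Random(seed)
    score = rng.uniform(100.0, 112.0)
    return score, f"palette-{seed}"

def best_of(n_runs=8):
    # Independent restarts are embarrassingly parallel: each worker
    # anneals from its own random starting colors, and we keep the
    # run that achieved the largest minimal distance.
    with Pool() as pool:
        results = pool.map(anneal, range(n_runs))
    return max(results, key=lambda r: r[0])
```

Since each run is seeded independently, no coordination between workers is needed until the final `max`.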

@d-dorazio FYI

@danieledapo
Contributor

I think that running the simulated annealing multiple times is fine and probably the best way to not get stuck in a local minimum. Also, this is an embarrassingly parallel thing to do which is quite nice.

I'm not too sure whether it makes sense to run the process on a completely different set of starting colors rather than re-running it on the same starting ones. That's because I think it is less likely to get stuck in a local minimum on the same set of inputs twice than on different ones, but I'm not too sure; it's just a gut feeling.

@sharkdp
Owner Author

sharkdp commented Sep 2, 2019

That's because I think it is less likely to get stuck in a local minimum on the same set of inputs twice than on different ones

I would think that the exact opposite would be the case. Certainly, if the initial temperature is high enough, it shouldn't really matter.

@danieledapo
Contributor

danieledapo commented Sep 17, 2019

  • support something like pastel distinct 8 red blue yellow (fix three colors, generate five new)

I guess this is done given that #88 was merged?
