Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added ability for user to set a reference_job as in the config in ord… #281

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

JamesGood626
Copy link

@JamesGood626 JamesGood626 commented Mar 21, 2019

#179

Test assertion which exercises the new feature is located in /test/benchee/formatters/console_test.exs:71

In the event that a reference_job is unable to match one of the scenarios in the list provided as an argument to comparison_report/4 in /lib/benchee/formatters/console/run_time.ex, I decided that the default scenario will be used rather than warning the user. Can change that so the output will provide a warning if requested.

Updated the defstruct, dialyzer types, and documentation in /lib/benchee/configuration.ex to include reference_job as a config option.

…er to facilitate choosing which function in the benchmark to compare all other functions too.
@devonestes
Copy link
Collaborator

Great start!

Can you add your .elixir_ls folder to your local repo excludes so all those files aren't included in this PR? You can find information about that here: https://help.github.com/en/articles/ignoring-files#explicit-repository-excludes

Copy link
Member

@PragTob PragTob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi there, thank you for your contribution 💚

As @devonestes said would be cool if you could ignore the .elixir_ls stuff.

Timing for this is (sadly) a bit unfortunate as #280 completely switches up where and how relative statistics are computed. It should make the implementation easier in the end but it will most likely incur merge conflicts here + I'm probably going to merge it today (unless @devonestes has objections).

I'm also not too sure about output here, say we have 3 jobs and we put the middle one as the reference job what should the output look like?

Middle - 100 ips
Fastest - 200 ips 0,5x slower
Slowest - 50 ips 2x slower

?

I think the nicest output would probably be:

Fastest - 200 ips 2x faster
Middle - 100 ips
Slowest - 50 ips 2x slower

As I said the other PR is likely to introduce merge conflicts (and the need to handle the reference job already at the statistics level). Happy to help you resolve merge conflicts when that is through although it might also be an option to start fresh (which I know would suck, sorry :( )

@@ -243,6 +264,22 @@ defmodule Benchee.Formatters.Console.RunTime do
]
end

defp find_reference_job(scenarios, reference_job) do
scenarios
|> Enum.reduce(%{found_job: nil, other_scenarios: []}, fn scenario, acc ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be mistaken but you could probably use Enum.split_with/2 here instead :)

)
end)

assert output =~ ~r/Comparison:.*\s.*First.*\s.*Second/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be cool to also check that multipliers and differences are also displayed correctly + maybe a test in benchee_test for the feature as a whole :)

@PragTob
Copy link
Member

PragTob commented Mar 21, 2019

Oh yeah regarding when the reference job can't be found: I'd warn the user but go on with the default so that it works and I don't have to run the benchmark again but know that my settings aren't right/complete. Which I'd also test :)

reference_job: reference_job
})
when is_binary(reference_job) do
%{found_job: found_job, other_scenarios: other_scenarios} =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Often in Elixir tuples are used in cases like this where a function returns more than one discrete "thing".

So, lines 242-250 could also be written something like this:

{scenario, other_scenarios} =
  case find_reference_job(scenarios, reference_job) do
    {nil, other_scenarios} -> {scenario, other_scenarios}
    found -> found
  end

defp comparison_report([scenario | other_scenarios], units, label_width, %{
reference_job: reference_job
})
when is_binary(reference_job) do
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call putting the guard clause here!

find_reference_job([scenario | other_scenarios], reference_job)

scenario =
if found_job != nil do
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually negations should be avoid as clauses in if statements if possible. Many people find them a little easier to grasp that way.

So this could also be written as:

if is_nil(found_job) do
  scenario
else
  found_job
end

@devonestes
Copy link
Collaborator

@PragTob James and I worked a bit on this together yesterday, and I had to go before we finished, so I'll handle the merge conflicts on this if there are any.

@JamesGood626
Copy link
Author

Thanks for the feedback. And apologies for my delayed response. I'll have to take a crack it this weekend.

@JamesGood626
Copy link
Author

Okay, so excuse my newbness, but I've recently synced my fork to this repo, and pulled down all of the new changes. I had git stashed all of the changes made in this feature branch I made on my local, before switching over to master and fetching the upstream changes.

What's the best way I should proceed with implementing this in order to avoid major merge conflicts.
Should I just create an entirely new branch and merge that, or just merge the changes from master into this current reference_job_feature branch and then commit again?

@PragTob
Copy link
Member

PragTob commented Apr 15, 2019

hey @JamesGood626 !

No worries, we're happy to help :) Thanks for working on this 👏

First things first:
I think it'd be a good idea to put .elixir_ls in your local .gitignore (as devon mentioned) and delete the files as that makes the overview easier.

Secondly:

To get the master changes back in here sounds about right what you said so basically:

git checkout master
git pull
git checkout feature-branch
git merge master # or git rebase master which I prefer but it requires some more familiarity with git, merge is fine
# resolve merge conflicts at best with a merge tool (https://gist.github.com/karenyyng/f19ff75c60f18b4b8149 I use meld which is brilliant)
git stash pop # Only if I understoof you correctly that you had your most recent changes in git stash
# continue working

However, it pains me to say this but I'd actually recommend starting over - keeping the code around to put in. Why would I say that? In the meantime the complete calculation of those "relative" statistics moved into an entirely new module: https://github.com/bencheeorg/benchee/blob/master/lib/benchee/relative_statistics.ex - so a lot of your code would have to move their either way. It might be faster to just start over putting it in that place instead of resolving merge conflicts where afterwards you'd have to change everything either way.

Hope that was helpful 🤞

edit: don't hesitate to ask any other questions :) For more real time help also feel free to hop onto the elixir slack and in the #benchee channel

@JamesGood626
Copy link
Author

Yeah your comments helped. And I'll hop in the slack if I run into anything while making these changes.

@JamesGood626
Copy link
Author

Ahh and thank you.

@PragTob
Copy link
Member

PragTob commented May 20, 2019

👋

Hey, not meaning to nag but what's the status here? If you don't get to it don't worry, then I'd close it and when I get to it use it as the basis to implement it. Or if you want to happy to help 😃

@JamesGood626
Copy link
Author

I made an attempt. I've been meaning to mess around with the output being logged from tests in one of my own projects, because I have been unable to view any IO.puts that take place within a test suite ever since I've pulled down this repo with the new changes.

I never brought this up due to this being a pretty silly thing to get stuck on -_-

@JamesGood626
Copy link
Author

I took notes when I was implementing the changes from this PR the first time around, and I feel pretty confident that I can implement it again, in addition to your requested changes, once I get past this small blocker.

@PragTob
Copy link
Member

PragTob commented May 21, 2019

@JamesGood626 nothing is silly :) Feel free to PR then I can have your changes and have a look at them to help figure out what's wrong! It might be as simple as that we're capturing IO output in all of the integration tests so might be easy to get rid off. Always feel free to ping me 😊

@JamesGood626
Copy link
Author

It sure does seem that way. And the code that I have currently is what is in of the master branch. I have yet to make any changes. Will look into the logging tomorrow/today it's 2am already.

@JamesGood626
Copy link
Author

JamesGood626 commented May 24, 2019

When you say capturing IO output in all of the integration tests, are you referring to this -> https://elixirschool.com/blog/til-capture-log-in-exunit-tests/

I don't see any signs of that configuration being set anywhere, but that's the behavior that I'm experiencing, where any IO.puts that I include inside of a function being tested aren't displayed in the console. However, an IO.puts written within the test file will print to the console.

@devonestes
Copy link
Collaborator

@JamesGood626 would you prefer to get through this yourself (with our help of course!), or to have me apply this to master and resolve the conflicts?

I‘m happy to apply this to the current master and resolve the conflicts, but I don’t want to tob you of a learning opportunity if you want it.

About the capturing, there is that global configuration, but there are also functions that can be used for individual tests. Take a look for „capture_io“ and „capture_log“, and imports of ExUnit.CaptureIo and ExUnit.CaptureLog, and check out the docs for those modules - they‘re pretty good!

@JamesGood626
Copy link
Author

That's the one thing I suppose I didn't make a note of in the steps we had taken when implementing this feature.... I forgot that you did remove the capture_io function. I forgot about that entirely, and didn't suspect that it was due to that function.

I would like to finish implementing this feature, but I know that my timeliness hasn't been the greatest. I can reach out sooner if something blocks my progress. Do you have a date by which you'd like for this feature to be complete?

@devonestes
Copy link
Collaborator

@JamesGood626 It’s open source, dude, so don’t worry about it. There aren’t any strict timelines.

If you get to it eventually, great! If someone else gets to it first, then that’s great, too! But don’t worry about taking a while for this. Life happens, and that takes priority.

@PragTob
Copy link
Member

PragTob commented May 29, 2019

Just to reinforce the date is "whenever you get to it or someone else picks this up" 👌

Don't worry this is all free time coding, no stress 🎉

edit: look at myself, seems like I didn't check my github bnotifications in a week 🤷‍♀️

@JamesGood626
Copy link
Author

Worked on this today. I've got the output for the reference job into this format:

Fastest - 200 ips 2x faster
Middle - 100 ips
Slowest - 50 ips 2x slower

Just need to write the tests to check that it's in the correct order.

After that is done. Then I'll need to handle the cases which warrant issuing a warning to the user.

@JamesGood626
Copy link
Author

Thanks for both your help and patience with this! Even though, as y'all have said, the timeline isn't being sweated.

@JamesGood626
Copy link
Author

All right, I've just started working on the warning bit. Using split_with instead of reduce, and as I was messing around w/ it in iex, I've noted something odd.

list = [1,2,3,4,5,7,8,9]
Enum.split_with(list, fn num -> rem(num, 3) === 0 end)
output: {[3, 9], [1, 2, 4, 5, 7, 8]}

But remove the three and leave the nine from the list and the first element in the tuple
comes out as \t.

list = [1,2,4,5,7,8,9]
Enum.split_with(list, fn num -> rem(num, 3) === 0 end)
{'\t', [1, 2, 4, 5, 7, 8]}

And after playing around with it some more...

Enum.split_with(list, fn num -> div(num, 7) < 1 end)
{[1, 2, 4, 5], '\a\b\t'}
Enum.split_with(list, fn num -> div(num, 7) === 1 end)
{'\a\b\t', [1, 2, 4, 5]}
Enum.split_with(list, fn num -> div(num, 7) > 1 end)  
{[], [1, 2, 4, 5, 7, 8, 9]}

'\a' represents 7, '\b' 8, and '\t' 9 (find it strange that it skips) from \b to \t.

?\a will output 7 in iex

But I haven't seen a way to convert the entire char string to a list again.

@JamesGood626
Copy link
Author

Aside from my random observation noted above... I've also messed up. I did indeed to a git pull origin master, however, it seems as though for whatever reason.... the new relative_statistics.ex change didn't make it into my local copy.

I thought I was finished implementing this, but as I was reading through older comments, I read @PragTob say that most of the changes should be made in the relative_statistics.ex file. This is the moment where I knew I messed up (keeping it pg here, because I'd love to curse right now).

I just did a git clone of the entire project to see everything that's changed, and now I'm just too frustrated to spend another few hours today looking through what needs to be modified.

Pissed that I keep making small mistakes...

@PragTob
Copy link
Member

PragTob commented Jun 10, 2019

Hi @JamesGood626 👋

The observation is that the 'erlangstrings' or well charlists are represented just as a list of numbers - so when elixir has a list of them it doesn't know if it should print a list of numbers or a charstring so it guess 🤷‍♀️

So sorry about all the troubles. If you're too frustrated and just wanna abandon this then I totally understand :) Thanks for all the time spent on it nonetheless. Sorry to have put you in such a spot. If you wanna continue, still happy to help :)

Tobi

@JamesGood626
Copy link
Author

I did mention that my frustration was just for the day. I don't want to wave the white flag just yet. I've just taken a look at the code again.

It doesn't seem that any of the changes required to implement this will actually go in relative_statistics.ex, and instead the changes will still be required in /formatters/console/run_time.ex.

Am I mistaken in my judgement? Because you did mention in an earlier comment that you believe all the changes would be required in relative_statistics.ex.

I still believe I can implement this, just want to make sure that I'm not making changes in an area that would be deemed undesirable. However, if I come back empty handed next time, then I suppose it'll be time to pair.

@PragTob
Copy link
Member

PragTob commented Jun 11, 2019

@JamesGood626 I still believe most of the changes have to/should go there.

Most specifically this code:

  # right now we take the first scenario as we sorted them and it is the fastest,
  # whenever we implement #179 though this becomesd more involved
  defp split_reference_scenario(scenarios) do
    [reference | others] = scenarios
    {reference, others}
  end

There the reference scenario is extracted and all other scenarios are compared against this one to determine their relative_more/relative_less and absolute_difference.

If a reference_job is set, we should take that scenario as the comparison basis. My hope is that everything else will just magically work after it is correctly selected at this point :)

Hope that makes sensed, if not let me know 👌

@JamesGood626
Copy link
Author

I think it'd be a good time to pair now. If that's time you don't mind spending. Otherwise I'd say just implement the feature, and thank you for attempting to help me implement this.

@PragTob
Copy link
Member

PragTob commented Jun 19, 2019

@JamesGood626 hey, are you in Berlin or remote? I'm heading off to Euruko but some time enxt week should work out. 👋

@JamesGood626
Copy link
Author

@PragTob I'm in New York. Just let me know what time will be most convenient for you.

@PragTob PragTob changed the base branch from master to main July 5, 2020 09:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants