Misleading benchmark in README #123

Open
jalil-salame opened this issue Mar 4, 2024 · 3 comments

@jalil-salame

This benchmark image is misleading:

README benchmark image

The patterns used are not the same:

  • fzf-native uses main.c$
  • fzy-native uses main.c
  • fzy-lua uses main.c

Notice the extra $ at the end of the fzf-native pattern. It causes fzf-native to find only 39 matches, compared to 44215 for the other two. The README does not make it clear that the fzf-native search is more specific, so the benchmark results are misleading.
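
To illustrate why the trailing $ shrinks the result set so much, here is a minimal sketch in plain Lua. The two matchers are simplified stand-ins written for this example, not the actual fzf or fzy code, and the file list is made up. The sketch assumes that a plain query is matched as an in-order subsequence and that a trailing $ turns the query into an exact suffix match.

```lua
-- Simplified stand-ins for illustration; NOT the real fzf or fzy matchers.

-- Fuzzy (subsequence) match: every character of `needle` must occur in
-- `haystack` in the same order, but not necessarily next to each other.
local function fuzzy_match(needle, haystack)
  local pos = 1
  for i = 1, #needle do
    -- plain find (4th argument true) so that "." is not a Lua pattern
    local found = haystack:find(needle:sub(i, i), pos, true)
    if not found then
      return false
    end
    pos = found + 1
  end
  return true
end

-- Suffix match: `haystack` must end with `suffix` exactly.
local function suffix_match(suffix, haystack)
  return #suffix <= #haystack
    and haystack:sub(#haystack - #suffix + 1) == suffix
end

-- Hypothetical input lines.
local files = {
  "src/main.c",
  "src/main.cpp",
  "cmake/install.cmake",
  "docs/readme.md",
}

-- Count how many lines a matcher accepts for a given pattern.
local function count(matcher, pattern)
  local n = 0
  for _, f in ipairs(files) do
    if matcher(pattern, f) then
      n = n + 1
    end
  end
  return n
end

print("fuzzy  'main.c': " .. count(fuzzy_match, "main.c"))
print("suffix 'main.c': " .. count(suffix_match, "main.c"))
```

On this toy list the fuzzy matcher accepts three of the four lines (including cmake/install.cmake), while the suffix matcher accepts only src/main.c. The same effect, at a much larger scale, would explain 39 matches versus 44215.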

Proposed solution

Add a note like:

Note

Here fzf-native is much faster because we can use a narrower match thanks to fzf-native's syntax:

README benchmark image

Other solutions that were considered

  • Re-run the benchmark using main.c for all the queries
  • Re-run the benchmark using main.c$ for all queries (I couldn't find out if fzy-native supports regex matches though)
  • Delete the benchmark from the repo (there is already one benchmark, no need for more c:)

@Conni2461
Member

I couldn't find out if fzy-native supports regex matches though

There is no way to do this with fzy, and that's kind of the point: if you need to find specific files, main.c doesn't get you anywhere with fzy, but with fzf you can just append a $ and end up with 39 results in milliseconds. That's why I want to keep that benchmark! But I will redo it.

I will be fair and show main.c for all queries, but I will keep main.c$ with an additional note. I will also add the benchmark code, because I have to redo it from scratch: I no longer have the input file with the 180k lines, so I will redo it with a 600k-line input file, or something like that.

I always knew that this test was misleading, but yeah, it is what it is.

@jalil-salame
Author

I couldn't find out if fzy-native supports regex matches though

There is no way to do this with fzy, and that's kind of the point: if you need to find specific files, main.c doesn't get you anywhere with fzy, but with fzf you can just append a $ and end up with 39 results in milliseconds. That's why I want to keep that benchmark! But I will redo it.

Then just add the note c: The $ is easy to miss in the wall of text, and it is hard to find out if fzy supports the syntax.

A better wording of the note would be:

Note

The following test uses fzf's syntax to get a narrower match. This makes us 11x faster than fzy 🎉 by not doing as much work (we return 39 matches instead of 44215).

This way you can skip redoing the benchmarks.

On the other hand, being able to reproduce the benchmarks would be nice c: And if you do redo them, then I suggest comparing:

  • fzf-native main.c$ (narrower match, only supported by fzf-native 🎉)
  • fzf-native main.c (broader match)
  • fzy-native main.c
  • fzy-lua main.c

This keeps the benchmarks clear c:

@Conni2461
Member

Okay, I need to write this down better for the documentation, but this is the script:

local path = require "plenary.path"
local bench = require "plenary.benchmark"
local fzf = require "fzf_lib"
local fzy_lua = require "telescope.algos.fzy"
local fzy_native = require "fzy-lua-native" -- romgrk/fzy-lua-native

local files = vim.split(path:new("files"):read(), "\n")

print("Benchmark Setup: 5 warmup and 25 runs each")
bench("fuzzy sorting", {
  warmup = 5,
  runs = 25,
  fun = {
    {
      "fzf-native pattern: 'f' results: '414451'",
      function()
        local out = {}

        local slab = fzf.allocate_slab()
        local p = fzf.parse_pattern("f", 0)

        for _, f in ipairs(files) do
          local score = fzf.get_score(f, p, slab)
          if score > 0 then
            table.insert(out, { line = f, score = score })
          end
        end

        fzf.free_pattern(p)
        fzf.free_slab(slab)
        return #out
      end,
    },
    {
      "fzy-native pattern: 'f' results: '414451'",
      function()
        local out = {}

        local needle = "f"
        for _, f in ipairs(files) do
          if fzy_native.has_match(needle, f) then
            local _, score = fzy_native.positions(needle, f)
            table.insert(out, { line = f, score = score })
          end
        end

        return #out
      end,
    },
    {
      "fzy-lua pattern: 'f' results: '414451'",
      function()
        local out = {}

        local needle = "f"
        for _, f in ipairs(files) do
          if fzy_lua.has_match(needle, f) then
            local _, score = fzy_lua.positions(needle, f)
            table.insert(out, { line = f, score = score })
          end
        end

        return #out
      end,
    },
  },
})

bench("fuzzy sorting", {
  warmup = 5,
  runs = 25,
  fun = {
    {
      "fzf-native pattern: 'main.c' results: '61539'",
      function()
        local out = {}

        local slab = fzf.allocate_slab()
        local p = fzf.parse_pattern("main.c", 0)

        for _, f in ipairs(files) do
          local score = fzf.get_score(f, p, slab)
          if score > 0 then
            table.insert(out, { line = f, score = score })
          end
        end

        fzf.free_pattern(p)
        fzf.free_slab(slab)
        return #out
      end,
    },
    {
      "fzf-native pattern: 'main.c$' results: '66'",
      function()
        local out = {}

        local slab = fzf.allocate_slab()
        local p = fzf.parse_pattern("main.c$", 0)

        for _, f in ipairs(files) do
          local score = fzf.get_score(f, p, slab)
          if score > 0 then
            table.insert(out, { line = f, score = score })
          end
        end

        fzf.free_pattern(p)
        fzf.free_slab(slab)
        return #out
      end,
    },
    {
      "fzy-native pattern: 'main.c' results: '61539'",
      function()
        local out = {}

        local needle = "main.c"
        for _, f in ipairs(files) do
          if fzy_native.has_match(needle, f) then
            local _, score = fzy_native.positions(needle, f)
            table.insert(out, { line = f, score = score })
          end
        end
        return #out
      end,
    },
    {
      "fzy-lua pattern: 'main.c' results: '61539'",
      function()
        local out = {}

        local needle = "main.c"
        for _, f in ipairs(files) do
          if fzy_lua.has_match(needle, f) then
            local _, score = fzy_lua.positions(needle, f)
            table.insert(out, { line = f, score = score })
          end
        end
        return #out
      end,
    },
  },
})

And these are the results:

Benchmark Setup: 5 warmup and 25 runs each
Benchmark Group: 'fuzzy sorting' -----------------------
Benchmark #1: 'fzf-native pattern: 'f' results: '414451''
  Time(mean ± σ):    118.8 ms ±  34.7 ms
  Range(min … max):   88.5 ms … 204.4 ms  25 runs
Benchmark #2: 'fzy-native pattern: 'f' results: '414451''
  Time(mean ± σ):    363.0 ms ±  31.5 ms
  Range(min … max):  280.0 ms … 407.7 ms  25 runs
Benchmark #3: 'fzy-lua pattern: 'f' results: '414451''
  Time(mean ± σ):    4.7 s ±  54.7 ms
  Range(min … max):  4.7 s … 4.9 s  25 runs
Summary
  'fzf-native pattern: 'f' results: '414451'' ran
  3.1 ± 0.9 times faster than 'fzy-native pattern: 'f' results: '414451''
  39.9 ± 11.7 times faster than 'fzy-lua pattern: 'f' results: '414451''

Benchmark Group: 'fuzzy sorting' -----------------------
Benchmark #1: 'fzf-native pattern: 'main.c' results: '61539''
  Time(mean ± σ):    102.5 ms ±  17.6 ms
  Range(min … max):   91.6 ms … 163.1 ms  25 runs
Benchmark #2: 'fzf-native pattern: 'main.c$' results: '66''
  Time(mean ± σ):      7.7 ms ± 272.1 μs
  Range(min … max):    7.4 ms …   8.1 ms  25 runs
Benchmark #3: 'fzy-native pattern: 'main.c' results: '61539''
  Time(mean ± σ):    108.1 ms ±  27.8 ms
  Range(min … max):   90.7 ms … 162.3 ms  25 runs
Benchmark #4: 'fzy-lua pattern: 'main.c' results: '61539''
  Time(mean ± σ):    1.7 s ±  27.5 ms
  Range(min … max):  1.7 s … 1.8 s  25 runs
Summary
  'fzf-native pattern: 'main.c$' results: '66'' ran
  13.4 ± 2.3 times faster than 'fzf-native pattern: 'main.c' results: '61539''
  14.1 ± 3.7 times faster than 'fzy-native pattern: 'main.c' results: '61539''
  227.7 ± 8.8 times faster than 'fzy-lua pattern: 'main.c' results: '61539''
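
As a rough cross-check of the second summary, the ratios can be recomputed from the printed means. This is only a sketch: the table and the `speedup` helper are hypothetical names for this example, and the inputs are the rounded values shown above, so the results differ slightly from the printed summary, which was presumably computed from unrounded timings.

```lua
-- Mean times in milliseconds, copied from the benchmark output above.
local mean_ms = {
  ["fzf-native main.c$"] = 7.7,
  ["fzf-native main.c"] = 102.5,
  ["fzy-native main.c"] = 108.1,
  ["fzy-lua main.c"] = 1700, -- printed as 1.7 s, so heavily rounded
}

-- How many times longer `other` takes than `baseline` (ratio of means).
local function speedup(baseline, other)
  return mean_ms[other] / mean_ms[baseline]
end

-- Narrow query versus every broad query, as in the printed summary.
for _, name in ipairs({ "fzf-native main.c", "fzy-native main.c", "fzy-lua main.c" }) do
  print(string.format("%-18s takes %.1fx as long as 'fzf-native main.c$'",
    name, speedup("fzf-native main.c$", name)))
end

-- Like-for-like comparison: the same pattern for both native matchers.
print(string.format("fzy-native vs fzf-native on 'main.c': %.2fx",
  speedup("fzf-native main.c", "fzy-native main.c")))
```

With the same main.c pattern, the two native implementations are within about 5% of each other, which is well inside the reported standard deviations (17.6 ms and 27.8 ms). The large factors in the summary come from the narrower main.c$ query, not from a faster matcher on equal work.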
