Improve `channel_unordered` performance: Take items from input channel outside worker threads #123

tkf · 2020-01-05T04:49:56Z

No description provided.

codecov-io · 2020-01-05T04:55:55Z

Codecov Report

Merging #123 into master will decrease coverage by 5.69%.
The diff coverage is 100%.

@@            Coverage Diff            @@
##           master     #123     +/-   ##
=========================================
- Coverage   93.67%   87.97%   -5.7%     
=========================================
  Files          19       19             
  Lines        1264     1206     -58     
=========================================
- Hits         1184     1061    -123     
- Misses         80      145     +65

Impacted Files	Coverage Δ
src/unordered.jl	`84.78% <100%> (-11.38%)`	⬇️
src/interop/dataframes.jl	`0% <0%> (-100%)`	⬇️
src/basics.jl	`43.75% <0%> (-37.5%)`	⬇️
src/simd.jl	`78.12% <0%> (-18.85%)`	⬇️
src/comprehensions.jl	`64.28% <0%> (-18.07%)`	⬇️
src/core.jl	`75.48% <0%> (-16.5%)`	⬇️
src/interop/blockarrays.jl	`87.5% <0%> (-12.5%)`	⬇️
src/lister.jl	`80.48% <0%> (-7.32%)`	⬇️
src/progress.jl	`87.77% <0%> (-5.71%)`	⬇️
... and 6 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 30cdf2a...2fbc18a.

tkf · 2020-01-05T07:00:02Z

No improvement with 5c399c2:

                                           ID time ratio memory ratio
  ––––––––––––––––––––––––––––––––––––––––––– –––––––––– ––––––––––––
     ["unordered", "unordered", "basesize=1"]  0.98 (5%)  1.02 (1%) ❌
  ["unordered", "unordered", "basesize=1024"]  0.99 (5%)  0.91 (1%) ✅
    ["unordered", "unordered", "basesize=32"]  1.02 (5%)  1.04 (1%) ❌

https://travis-ci.com/tkf/Transducers.jl/jobs/272454798#L388

github-actions · 2020-01-19T03:46:26Z

Multi-thread benchmark result

Judge result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

Time of benchmarks:
- Target: 19 Jan 2020 - 03:44
- Baseline: 19 Jan 2020 - 03:46
Package commits:
- Target: 4594da
- Baseline: 30cdf2
Julia commits:
- Target: 2d5741
- Baseline: 2d5741
Julia command flags:
- Target: None
- Baseline: None
Environment variables:
- Target: JULIA_NUM_THREADS => 2
- Baseline: JULIA_NUM_THREADS => 2

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

ID	time ratio	memory ratio
`["parallel_histogram", "comm", "basesize=16384"]`	0.83 (5%) ✅	0.96 (1%) ✅
`["parallel_histogram", "comm", "basesize=4096"]`	0.48 (5%) ✅	1.03 (1%) ❌
`["parallel_histogram", "comm", "basesize=8192"]`	0.69 (5%) ✅	1.19 (1%) ❌
`["unordered", "unordered", "basesize=1"]`	1.07 (5%) ❌	1.02 (1%) ❌
`["unordered", "unordered", "basesize=1024"]`	0.82 (5%) ✅	0.85 (1%) ✅
`["words", "nthreads=1"]`	0.88 (5%) ✅	0.99 (1%)
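As a rough cross-check of the table above, the time ratios can be recomputed from the raw times listed in the "Target result" and "Baseline result" tables of this report. The sketch below divides target time by baseline time and classifies the result. The classification rule is an assumption made for illustration (a ratio is flagged only when it differs from 1.0 by more than the tolerance in parentheses); the function name `judge` and the dictionary `times_ms` are hypothetical and not part of the benchmark tooling.

```python
# Sketch: recompute time ratios (target / baseline) from the raw times in this report.
# Assumption: a result is flagged only when the ratio differs from 1.0 by more than
# the tolerance shown in parentheses (5% for time). This rule is inferred from the
# report text, not copied from the benchmark tool.


def judge(target_ms, baseline_ms, tolerance=0.05):
    """Return (ratio, verdict) for one benchmark's target and baseline times."""
    ratio = target_ms / baseline_ms
    if ratio - tolerance > 1.0:
        verdict = "regression"
    elif ratio + tolerance < 1.0:
        verdict = "improvement"
    else:
        verdict = "invariant"
    return ratio, verdict


# (target, baseline) times in milliseconds, copied from the two result tables.
times_ms = {
    "parallel_histogram/comm/basesize=4096": (11.465, 23.771),
    "parallel_histogram/assoc/basesize=8192": (5.687, 5.906),
    "unordered/unordered/basesize=1": (573.563, 537.271),
    "unordered/unordered/basesize=1024": (300.023, 366.726),
}

for name, (target, baseline) in times_ms.items():
    ratio, verdict = judge(target, baseline)
    print(f"{name}: {ratio:.2f} ({verdict})")
```

Under that assumed rule, `comm/basesize=4096` comes out at about 0.48 (improvement) and `unordered/basesize=1` at about 1.07 (regression), matching the table. `assoc/basesize=8192` comes out at about 0.96, inside the 5% tolerance, which is consistent with it not being listed among the significant results.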

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

["parallel_histogram", "assoc"]
["parallel_histogram", "comm"]
["parallel_histogram"]
["unordered"]
["unordered", "unordered"]
["words"]

Julia versioninfo

Target

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      18203 s          0 s       1226 s      16570 s          0 s
       #2  2294 MHz      19258 s          0 s       1294 s      16079 s          0 s
       
  Memory: 6.782737731933594 GB (3684.1953125 MB free)
  Uptime: 377.0 sec
  Load Avg:  1.67236328125  1.0966796875  0.517578125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

Baseline

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      24865 s          0 s       1578 s      21558 s          0 s
       #2  2294 MHz      29238 s          0 s       1578 s      17902 s          0 s
       
  Memory: 6.782737731933594 GB (3603.3671875 MB free)
  Uptime: 499.0 sec
  Load Avg:  1.59912109375  1.2265625  0.64013671875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

Target result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

Time of benchmark: 19 Jan 2020 - 3:44
Package commit: 4594da
Julia commit: 2d5741
Julia command flags: None
Environment variables: JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

ID	time	GC time	memory	allocations
`["parallel_histogram", "assoc", "basesize=16384"]`	5.322 ms (5%)		732.25 KiB (1%)	110
`["parallel_histogram", "assoc", "basesize=4096"]`	6.347 ms (5%)		1.80 MiB (1%)	540
`["parallel_histogram", "assoc", "basesize=8192"]`	5.687 ms (5%)		1.43 MiB (1%)	261
`["parallel_histogram", "comm", "basesize=16384"]`	11.523 ms (5%)		1.17 MiB (1%)	183
`["parallel_histogram", "comm", "basesize=4096"]`	11.465 ms (5%)		1.10 MiB (1%)	259
`["parallel_histogram", "comm", "basesize=8192"]`	11.690 ms (5%)		1.47 MiB (1%)	208
`["parallel_histogram", "seq"]`	9.623 ms (5%)		364.63 KiB (1%)	25
`["unordered", "collect"]`	459.454 ms (5%)		513.00 KiB (1%)	23
`["unordered", "unordered", "basesize=1"]`	573.563 ms (5%)		30.75 MiB (1%)	507257
`["unordered", "unordered", "basesize=1024"]`	300.023 ms (5%)		851.44 KiB (1%)	5546
`["unordered", "unordered", "basesize=32"]`	274.863 ms (5%)		1.56 MiB (1%)	22416
`["words", "nthreads=1"]`	39.965 ms (5%)	6.947 ms	64.44 MiB (1%)	2085156
`["words", "nthreads=2"]`	24.115 ms (5%)		65.16 MiB (1%)	2085319
`["words", "nthreads=4"]`	24.295 ms (5%)		65.80 MiB (1%)	2085625

Baseline result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

Time of benchmark: 19 Jan 2020 - 3:46
Package commit: 30cdf2
Julia commit: 2d5741
Julia command flags: None
Environment variables: JULIA_NUM_THREADS => 2

Results

ID	time	GC time	memory	allocations
`["parallel_histogram", "assoc", "basesize=16384"]`	5.284 ms (5%)		732.25 KiB (1%)	110
`["parallel_histogram", "assoc", "basesize=4096"]`	6.344 ms (5%)		1.80 MiB (1%)	539
`["parallel_histogram", "assoc", "basesize=8192"]`	5.906 ms (5%)		1.43 MiB (1%)	261
`["parallel_histogram", "comm", "basesize=16384"]`	13.917 ms (5%)		1.22 MiB (1%)	331
`["parallel_histogram", "comm", "basesize=4096"]`	23.771 ms (5%)		1.07 MiB (1%)	5131
`["parallel_histogram", "comm", "basesize=8192"]`	16.847 ms (5%)		1.23 MiB (1%)	887
`["parallel_histogram", "seq"]`	9.247 ms (5%)		364.63 KiB (1%)	25
`["unordered", "collect"]`	462.960 ms (5%)		513.00 KiB (1%)	23
`["unordered", "unordered", "basesize=1"]`	537.271 ms (5%)		30.26 MiB (1%)	475643
`["unordered", "unordered", "basesize=1024"]`	366.726 ms (5%)		998.72 KiB (1%)	17033
`["unordered", "unordered", "basesize=32"]`	273.819 ms (5%)		1.57 MiB (1%)	23090
`["words", "nthreads=1"]`	45.193 ms (5%)	7.502 ms	64.87 MiB (1%)	2099520
`["words", "nthreads=2"]`	23.404 ms (5%)		65.59 MiB (1%)	2099681
`["words", "nthreads=4"]`	24.044 ms (5%)		66.23 MiB (1%)	2099990

github-actions · 2020-01-19T03:51:24Z

Benchmark result

Judge result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

Time of benchmarks:
- Target: 19 Jan 2020 - 03:47
- Baseline: 19 Jan 2020 - 03:51
Package commits:
- Target: 4594da
- Baseline: 30cdf2
Julia commits:
- Target: 2d5741
- Baseline: 2d5741
Julia command flags:
- Target: None
- Baseline: None
Environment variables:
- Target: None
- Baseline: None

Results

ID	time ratio	memory ratio
`["gemm", "fusedmul", "blas", "16"]`	1.08 (5%) ❌	1.00 (1%)
`["gemm", "fusedmul", "blas", "2"]`	1.09 (5%) ❌	1.00 (1%)
`["gemm", "mul", "man", "false", "8"]`	1.37 (5%) ❌	1.00 (1%)
`["gemm", "mul", "man", "ivdep", "8"]`	1.31 (5%) ❌	1.00 (1%)
`["gemm", "mul", "xf", "false", "32"]`	0.92 (5%) ✅	1.00 (1%)
`["gemm", "mul", "xf", "false", "8"]`	1.06 (5%) ❌	1.00 (1%)
`["gemm", "mul", "xf", "ivdep", "8"]`	1.33 (5%) ❌	1.00 (1%)
`["missing_dot", "equiv"]`	0.90 (5%) ✅	1.00 (1%)
`["missing_dot", "rf_nota"]`	0.90 (5%) ✅	1.00 (1%)

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

["cat"]
["collect"]
["dot"]
["filter_map_map!"]
["filter_map_reduce"]
["gemm", "fusedmul", "blas"]
["gemm", "fusedmul", "xf"]
["gemm", "mul", "linalg"]
["gemm", "mul", "man", "false"]
["gemm", "mul", "man", "ivdep"]
["gemm", "mul", "man", "true"]
["gemm", "mul", "xf", "false"]
["gemm", "mul", "xf", "ivdep"]
["gemm", "mul", "xf", "true"]
["missing_argmax"]
["missing_dot"]
["partition_by"]

Julia versioninfo

Target

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      19101 s          0 s       1986 s      36371 s          0 s
       #2  2294 MHz      32848 s          0 s       1352 s      24039 s          0 s
       
  Memory: 6.782737731933594 GB (3492.5625 MB free)
  Uptime: 594.0 sec
  Load Avg:  1.197265625  1.04052734375  0.58251953125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

Baseline

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      38678 s          0 s       2160 s      37267 s          0 s
       #2  2294 MHz      35937 s          0 s       1980 s      41016 s          0 s
       
  Memory: 6.782737731933594 GB (3562.1953125 MB free)
  Uptime: 801.0 sec
  Load Avg:  1.15673828125  1.11376953125  0.71630859375
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

Target result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

Time of benchmark: 19 Jan 2020 - 3:47
Package commit: 4594da
Julia commit: 2d5741
Julia command flags: None
Environment variables: None

Results

ID	time	memory	allocations
`["cat", "base"]`	205.600 μs (5%)
`["cat", "xf"]`	1.460 μs (5%)
`["collect", "filter-missing"]`	81.600 μs (5%)	33.05 KiB (1%)	20
`["collect", "identity-float"]`	63.200 μs (5%)	256.91 KiB (1%)	20
`["collect", "identity-union"]`	293.500 μs (5%)	285.28 KiB (1%)	6673
`["dot", "blas"]`	2.278 μs (5%)
`["dot", "man"]`	2.244 μs (5%)
`["dot", "rf"]`	2.656 μs (5%)
`["dot", "xf"]`	2.667 μs (5%)
`["filter_map_map!", "man"]`	66.300 μs (5%)
`["filter_map_map!", "xf"]`	69.400 μs (5%)	144 bytes (1%)	8
`["filter_map_reduce", "man"]`	194.900 μs (5%)
`["filter_map_reduce", "xf"]`	194.900 μs (5%)
`["gemm", "fusedmul", "blas", "16"]`	3.340 ms (5%)
`["gemm", "fusedmul", "blas", "2"]`	2.543 ms (5%)
`["gemm", "fusedmul", "blas", "32"]`	4.506 ms (5%)
`["gemm", "fusedmul", "blas", "8"]`	2.819 ms (5%)
`["gemm", "fusedmul", "xf", "16"]`	4.935 ms (5%)	160 bytes (1%)	6
`["gemm", "fusedmul", "xf", "2"]`	613.100 μs (5%)	160 bytes (1%)	6
`["gemm", "fusedmul", "xf", "32"]`	9.963 ms (5%)	160 bytes (1%)	6
`["gemm", "fusedmul", "xf", "8"]`	2.462 ms (5%)	160 bytes (1%)	6
`["gemm", "mul", "linalg", "256"]`	658.400 μs (5%)
`["gemm", "mul", "linalg", "32"]`	3.712 μs (5%)
`["gemm", "mul", "linalg", "8"]`	289.362 ns (5%)
`["gemm", "mul", "man", "false", "256"]`	4.389 ms (5%)
`["gemm", "mul", "man", "false", "32"]`	7.100 μs (5%)
`["gemm", "mul", "man", "false", "8"]`	411.000 ns (5%)
`["gemm", "mul", "man", "ivdep", "256"]`	4.348 ms (5%)
`["gemm", "mul", "man", "ivdep", "32"]`	6.240 μs (5%)
`["gemm", "mul", "man", "ivdep", "8"]`	392.574 ns (5%)
`["gemm", "mul", "man", "true", "256"]`	4.332 ms (5%)
`["gemm", "mul", "man", "true", "32"]`	7.375 μs (5%)
`["gemm", "mul", "man", "true", "8"]`	381.373 ns (5%)
`["gemm", "mul", "xf", "false", "256"]`	4.389 ms (5%)	48 bytes (1%)	2
`["gemm", "mul", "xf", "false", "32"]`	6.900 μs (5%)	48 bytes (1%)	2
`["gemm", "mul", "xf", "false", "8"]`	423.618 ns (5%)	48 bytes (1%)	2
`["gemm", "mul", "xf", "ivdep", "256"]`	4.439 ms (5%)	48 bytes (1%)	2
`["gemm", "mul", "xf", "ivdep", "32"]`	5.683 μs (5%)	48 bytes (1%)	2
`["gemm", "mul", "xf", "ivdep", "8"]`	398.049 ns (5%)	48 bytes (1%)	2
`["gemm", "mul", "xf", "true", "256"]`	4.308 ms (5%)	48 bytes (1%)	2
`["gemm", "mul", "xf", "true", "32"]`	6.880 μs (5%)	48 bytes (1%)	2
`["gemm", "mul", "xf", "true", "8"]`	401.005 ns (5%)	48 bytes (1%)	2
`["missing_argmax", "man"]`	889.362 ns (5%)	32 bytes (1%)	1
`["missing_argmax", "rf"]`	2.200 μs (5%)	32 bytes (1%)	1
`["missing_argmax", "xf"]`	2.211 μs (5%)	32 bytes (1%)	1
`["missing_dot", "equiv"]`	1.210 μs (5%)	16 bytes (1%)	1
`["missing_dot", "man"]`	1.040 μs (5%)	16 bytes (1%)	1
`["missing_dot", "naive"]`	4.043 μs (5%)	16 bytes (1%)	1
`["missing_dot", "rf"]`	850.000 ns (5%)	16 bytes (1%)	1
`["missing_dot", "rf_nota"]`	1.230 μs (5%)	16 bytes (1%)	1
`["missing_dot", "xf"]`	185.800 μs (5%)	74.11 KiB (1%)	3866
`["missing_dot", "xf_nota"]`	183.800 μs (5%)	73.94 KiB (1%)	3862
`["partition_by", "man"]`	1.626 ms (5%)	352 bytes (1%)	4
`["partition_by", "xf"]`	1.562 ms (5%)	576 bytes (1%)	7

Baseline result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

Time of benchmark: 19 Jan 2020 - 3:51
Package commit: 30cdf2
Julia commit: 2d5741
Julia command flags: None
Environment variables: None

Results

ID	time	memory	allocations
`["cat", "base"]`	206.900 μs (5%)
`["cat", "xf"]`	1.460 μs (5%)
`["collect", "filter-missing"]`	80.000 μs (5%)	33.05 KiB (1%)	20
`["collect", "identity-float"]`	61.200 μs (5%)	256.91 KiB (1%)	20
`["collect", "identity-union"]`	291.301 μs (5%)	285.42 KiB (1%)	6678
`["dot", "blas"]`	2.267 μs (5%)
`["dot", "man"]`	2.256 μs (5%)
`["dot", "rf"]`	2.656 μs (5%)
`["dot", "xf"]`	2.667 μs (5%)
`["filter_map_map!", "man"]`	66.700 μs (5%)
`["filter_map_map!", "xf"]`	68.700 μs (5%)	144 bytes (1%)	8
`["filter_map_reduce", "man"]`	194.900 μs (5%)
`["filter_map_reduce", "xf"]`	194.900 μs (5%)
`["gemm", "fusedmul", "blas", "16"]`	3.094 ms (5%)
`["gemm", "fusedmul", "blas", "2"]`	2.331 ms (5%)
`["gemm", "fusedmul", "blas", "32"]`	4.480 ms (5%)
`["gemm", "fusedmul", "blas", "8"]`	2.812 ms (5%)
`["gemm", "fusedmul", "xf", "16"]`	4.838 ms (5%)	160 bytes (1%)	6
`["gemm", "fusedmul", "xf", "2"]`	602.100 μs (5%)	160 bytes (1%)	6
`["gemm", "fusedmul", "xf", "32"]`	9.749 ms (5%)	160 bytes (1%)	6
`["gemm", "fusedmul", "xf", "8"]`	2.417 ms (5%)	160 bytes (1%)	6
`["gemm", "mul", "linalg", "256"]`	658.601 μs (5%)
`["gemm", "mul", "linalg", "32"]`	3.800 μs (5%)
`["gemm", "mul", "linalg", "8"]`	300.000 ns (5%)
`["gemm", "mul", "man", "false", "256"]`	4.304 ms (5%)
`["gemm", "mul", "man", "false", "32"]`	7.000 μs (5%)
`["gemm", "mul", "man", "false", "8"]`	300.000 ns (5%)
`["gemm", "mul", "man", "ivdep", "256"]`	4.277 ms (5%)
`["gemm", "mul", "man", "ivdep", "32"]`	6.300 μs (5%)
`["gemm", "mul", "man", "ivdep", "8"]`	300.000 ns (5%)
`["gemm", "mul", "man", "true", "256"]`	4.310 ms (5%)
`["gemm", "mul", "man", "true", "32"]`	7.100 μs (5%)
`["gemm", "mul", "man", "true", "8"]`	400.000 ns (5%)
`["gemm", "mul", "xf", "false", "256"]`	4.298 ms (5%)	48 bytes (1%)	2
`["gemm", "mul", "xf", "false", "32"]`	7.500 μs (5%)	48 bytes (1%)	2
`["gemm", "mul", "xf", "false", "8"]`	400.000 ns (5%)	48 bytes (1%)	2
`["gemm", "mul", "xf", "ivdep", "256"]`	4.263 ms (5%)	48 bytes (1%)	2
`["gemm", "mul", "xf", "ivdep", "32"]`	5.800 μs (5%)	48 bytes (1%)	2
`["gemm", "mul", "xf", "ivdep", "8"]`	300.000 ns (5%)	48 bytes (1%)	2
`["gemm", "mul", "xf", "true", "256"]`	4.297 ms (5%)	48 bytes (1%)	2
`["gemm", "mul", "xf", "true", "32"]`	6.700 μs (5%)	48 bytes (1%)	2
`["gemm", "mul", "xf", "true", "8"]`	400.000 ns (5%)	48 bytes (1%)	2
`["missing_argmax", "man"]`	900.000 ns (5%)	32 bytes (1%)	1
`["missing_argmax", "rf"]`	2.178 μs (5%)	32 bytes (1%)	1
`["missing_argmax", "xf"]`	2.178 μs (5%)	32 bytes (1%)	1
`["missing_dot", "equiv"]`	1.350 μs (5%)	16 bytes (1%)	1
`["missing_dot", "man"]`	1.030 μs (5%)	16 bytes (1%)	1
`["missing_dot", "naive"]`	4.043 μs (5%)	16 bytes (1%)	1
`["missing_dot", "rf"]`	857.692 ns (5%)	16 bytes (1%)	1
`["missing_dot", "rf_nota"]`	1.360 μs (5%)	16 bytes (1%)	1
`["missing_dot", "xf"]`	183.200 μs (5%)	74.08 KiB (1%)	3864
`["missing_dot", "xf_nota"]`	188.100 μs (5%)	73.92 KiB (1%)	3859
`["partition_by", "man"]`	1.629 ms (5%)	352 bytes (1%)	4
`["partition_by", "xf"]`	1.566 ms (5%)	576 bytes (1%)	7

tkf · 2020-01-27T23:46:10Z

#123 (comment)

ID	time ratio	memory ratio
`["parallel_histogram", "comm", "basesize=16384"]`	0.83 (5%) ✅	0.96 (1%) ✅
`["parallel_histogram", "comm", "basesize=4096"]`	0.48 (5%) ✅	1.03 (1%) ❌
`["parallel_histogram", "comm", "basesize=8192"]`	0.69 (5%) ✅	1.19 (1%) ❌

Using commit: Support Table 1.0 (#123) JuliaFolds/BangBang.jl@976e825

tkf mentioned this pull request Jan 5, 2020

Add unordered transduce for channels #112

Merged

tkf force-pushed the unordered-performance branch 2 times, most recently from 127f945 to 5c399c2 Compare January 5, 2020 05:52

tkf force-pushed the unordered-performance branch from 5c399c2 to dcdf3d4 Compare January 19, 2020 03:38

Take items from input channel outside worker threads

2fbc18a

tkf force-pushed the unordered-performance branch from dcdf3d4 to 2fbc18a Compare January 19, 2020 03:39

tkf mentioned this pull request Jan 27, 2020

add Threads.foreach for convenient multithreaded Channel consumption JuliaLang/julia#34543

Merged

tkf added a commit that referenced this pull request Feb 13, 2020

Update: BangBang

a8099d9

Using commit: Support Table 1.0 (#123) JuliaFolds/BangBang.jl@976e825
