Fivefold faster tempo detection and spectrograms too #5899

Open · wants to merge 18 commits into base: master
Conversation

@Paul-Licameli (Collaborator) commented Jan 28, 2024

Resolves: (direct link to the issue)

You read that right, fivefold.

At least ... on one computer, with the averages of a few before-and-after trials, with 6 cores and hyperthreading.

Results will vary with hardware, but the approach promises to scale as processing power gets cheaper.

Among all the recent improvements to tempo detection performance, the "obvious" thing hadn't been done --
which is, to use multiple cores to exploit the "embarrassing" parallelism inherent in the algorithm.

This PR adds some template classes and functions in a new lib-concurrency library -- the hard part -- to make
transformations of such data-parallel computations comparatively easy to do.

Tempo detection is the first application: unit tests pass when runLocally = true in MirTestUtils.cpp, and timeMeasurement.txt then contains much smaller numbers.

The computation of spectrograms for display is a second, even easier application, in which one simply
supplies an iterator range and a (copyable) lambda, and the speedup follows. Try using the largest spectrogram window
size (32768) and compare before-and-after zooming in and out, scrolling, and playing with the pinned head. (But sorry,
no parallelism yet if you use the Reassignment algorithm.)

What are other possible easy applications? I think there are some in Sequence.cpp such as converting sample format, and
finding min/max/rms. Others?
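To make the "iterator range and a (copyable) lambda" idea concrete, here is a minimal sketch of the pattern, not the PR's actual lib-concurrency API (the diff shows names like `Parallel::ForEach` and `OneDimensionalReduce`, whose real signatures may differ). `ParallelForEach` is a hypothetical simplification that assumes random-access iterators and splits the range into contiguous chunks, one thread per chunk:

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical sketch: run fn on every element of [first, last), splitting
// the range into nThreads contiguous chunks whose sizes differ by at most one.
// fn must be copyable, since each worker thread gets its own copy.
template <typename It, typename Fn>
void ParallelForEach(It first, It last, Fn fn,
                     unsigned nThreads = std::thread::hardware_concurrency())
{
   nThreads = std::max(1u, nThreads);
   const auto total = static_cast<std::size_t>(std::distance(first, last));
   std::vector<std::thread> workers;
   std::size_t begin = 0;
   for (unsigned i = 0; i < nThreads; ++i) {
      // Spread the remainder over the first (total % nThreads) chunks
      const std::size_t end = begin + total / nThreads + (i < total % nThreads);
      workers.emplace_back([=] {
         std::for_each(first + begin, first + end, fn);
      });
      begin = end;
   }
   for (auto &t : workers)
      t.join();
}
```

Usage is then one line at the call site, e.g. summing into a `std::atomic` accumulator captured by reference in the lambda; the real PR code additionally handles progress reporting and abandonment, which this sketch omits.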

  • I signed CLA
  • The title of the pull request describes an issue it addresses
  • If changes are extensive, then there is a sequence of easily reviewable commits
  • Each commit's message describes its purpose and effects
  • There are no behavior changes unnecessary for the stated purpose of the PR

Recommended:

  • Each commit compiles and runs on my machine without known undesirable changes of behavior

@Paul-Licameli added the labels "spectral tools" (spectrogram, spectral brush, etc), "performance" (Improve speed of operations), and "tempo detection" on Jan 28, 2024
@Paul-Licameli self-assigned this on Jan 28, 2024
@Paul-Licameli added this to the Audacity 3.5 milestone on Jan 28, 2024
@crsib (Contributor) left a comment:

Judging by the code style, you are pulling in some of your older work.

I have only looked at lib-concurrency. I think there are a few changes needed, but they are rather minor.

]]#

set( SOURCES
MessageBuffer.h
@crsib (Contributor):

I don't expect this class to be used anywhere except the current Effects implementation. It makes very strict assumptions about the reader and writer, and it caused a fair number of problems due to its double buffering, which made 3.2 such a torture.

@Paul-Licameli (Collaborator, Author):

It was first written and used in scrubbing to communicate to the thread that fetches audio in response to mouse movements.

That file is just moved without change and does not need review.


set( SOURCES
MessageBuffer.h
spinlock.h
@crsib (Contributor):

This class calls std::this_thread::yield on every second loop iteration and doesn't use a pause instruction. That makes some sense, but it isn't exactly correct.

@Paul-Licameli (Collaborator, Author):

That file, too, is just moved unchanged, and it is not used in the new files.
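For readers unfamiliar with the spin-then-yield pattern crsib alludes to, here is a generic sketch, not the PR's spinlock.h (which, per the reply, was moved unchanged): spin a bounded number of times with a CPU "pause" hint, which eases pressure on a hyperthreaded sibling core and avoids pipeline flushes, and only then fall back to yielding the time slice. The spin count of 64 is an arbitrary illustrative choice:

```cpp
#include <atomic>
#include <thread>
#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>  // _mm_pause, x86 only; other targets just skip the hint
#endif

class SpinLock {
   std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:
   void lock() {
      int spins = 0;
      while (flag.test_and_set(std::memory_order_acquire)) {
         if (++spins < 64) {
#if defined(__x86_64__) || defined(_M_X64)
            _mm_pause();  // busy-wait hint to the CPU
#endif
         } else {
            std::this_thread::yield();  // give up the time slice
            spins = 0;
         }
      }
   }
   void unlock() { flag.clear(std::memory_order_release); }
};
```

The trade-off is the usual one: pure spinning wastes cycles when the holder is descheduled, while yielding too eagerly adds scheduler latency on short critical sections.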

@@ -6,6 +6,7 @@ set( LIBRARIES
lib-string-utils
lib-strings
lib-utility
lib-concurrency
@crsib (Contributor):
This library has surely been needed for a long time.

*/
increment_type operator()(increment_type inc = {}) {
if (abandoned.load(std::memory_order_relaxed))
throw detail::AbortException{};
@crsib (Contributor):

I see absolutely no justification for using exceptions here. They are meant to signal exceptional situations. In the worst case, they can be used to get back deep into the call stack, but that always indicates serious problems with program design.

@Paul-Licameli (Collaborator, Author):

The alternative is to complicate how tasks are written, requiring them to test a flag and exit.

It is simpler instead to let them call a possibly non-returning function, which, perhaps in a future C++20 version, would instead become a coroutine suspension point. Then there could be the possibility of non-resumption, or of migration of a task to another thread, in a more sophisticated task scheduler.

@crsib (Contributor):

The alternative is to complicate how tasks are written

I can have a task where I need to handle complications explicitly. I do have such use cases now, but not using this class, obviously.

I don't see how handling the result complicates anything.
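The "test and exit" alternative crsib prefers can be sketched as follows. This is an illustration only, not the PR's code: `RunChunks`, `TaskStatus`, and the chunked loop are hypothetical stand-ins for however the real task body is structured. The task polls the abandonment flag and returns a status, and the caller handles cancellation explicitly instead of catching an exception:

```cpp
#include <atomic>
#include <cstddef>

enum class TaskStatus { Done, Cancelled };

// Hypothetical task driver: run `work` on chunkCount chunks, checking a
// shared abandonment flag between chunks and returning early if it is set.
template <typename Work>
TaskStatus RunChunks(std::size_t chunkCount,
                     const std::atomic<bool> &abandoned, Work work)
{
   for (std::size_t i = 0; i < chunkCount; ++i) {
      if (abandoned.load(std::memory_order_relaxed))
         return TaskStatus::Cancelled;  // test and exit -- no exception thrown
      work(i);
   }
   return TaskStatus::Done;
}
```

The cost Paul describes is visible here: every loop that can be abandoned must thread the flag and the early return through its control flow, whereas the exception (or a future coroutine suspension point) keeps the task body oblivious to cancellation.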

if (abandoned.load(std::memory_order_relaxed))
throw detail::AbortException{};
const auto old_total = total.fetch_add(inc, std::memory_order_acq_rel);
return old_total + inc;
@crsib (Contributor):

Reporting progress is very complex in MT environments. I'm not sure that this value will be meaningful by the time the progress is reported.

// TODO: This function might be modified instead to allocate tasks to threads
// from a pool, and the Task class might define other policies for
// subdivision of its range
n_threads = std::max(1u, n_threads);
@crsib (Contributor):

I like that the assertion is not the only check, but do we need the n_threads > 0 check at all?

auto newDistance = distance + step;
fraction += remainder;
if (fraction >= n_threads)
fraction -= n_threads, ++newDistance;
@crsib (Contributor):

I am very concerned about this code. It must be rewritten without the comma.
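For illustration, the quoted carry logic can be written without the comma operator simply by bracing the two effects. Here it is wrapped in a small hypothetical helper (not the PR's code; `ChunkSizes` and its signature are invented for the example) that splits `total` items into `n_threads` chunk sizes differing by at most one, which is what the fraction/remainder arithmetic accomplishes:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: compute n_threads chunk sizes summing to total,
// each equal to total / n_threads or one more. Precondition: n_threads >= 1.
std::vector<std::size_t> ChunkSizes(std::size_t total, std::size_t n_threads)
{
   const std::size_t step = total / n_threads;
   const std::size_t remainder = total % n_threads;
   std::vector<std::size_t> sizes;
   std::size_t fraction = 0;
   for (std::size_t i = 0; i < n_threads; ++i) {
      std::size_t size = step;
      fraction += remainder;
      if (fraction >= n_threads) {  // carry: braces instead of the comma
         fraction -= n_threads;
         ++size;
      }
      sizes.push_back(size);
   }
   return sizes;
}
```

For example, 10 items over 3 threads yields chunk sizes 3, 3, 4, spreading the remainder evenly across the range rather than dumping it all on the last thread.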

// 0
size_t index = 0;
for (auto &task : tasks) {
std::thread{ ref(channels[index].first), std::ref(task), index + 1 }
@crsib (Contributor):

Contrary to the previous place, I see very little reason to split this into multiple lines instead of using ++index right in place.

@Paul-Licameli (Collaborator, Author):

Wrong -- index is used twice in the arguments of a constructor, and the order of evaluation is not specified. I must increment only in a separate statement.

size_t index = 0;
for (auto &task : tasks) {
std::thread{ ref(channels[index].first), std::ref(task), index + 1 }
.detach();
@crsib (Contributor):

I can't say I like this, but I understand that this is safe.

@Paul-Licameli (Collaborator, Author):

Synchronizing on the get of a future of a packaged_task that runs on the thread should be equivalent to joining the thread.
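A minimal sketch of the synchronization Paul is describing (an illustration, not the PR's code): the packaged_task's promise is satisfied when the task body finishes, so `future::get()` happens-after all of the detached thread's work on the task, substituting for `join()` as far as the task's results are concerned:

```cpp
#include <future>
#include <thread>

// Run a task on a detached thread and synchronize via its future.
// get() blocks until the task body has completed on the detached thread.
int RunDetached()
{
   std::packaged_task<int()> task([] { return 42; });
   std::future<int> result = task.get_future();
   std::thread{ std::move(task) }.detach();
   return result.get();
}
```

One caveat, which may be the source of crsib's unease: after `get()` returns, the detached thread itself may still execute a few instructions (e.g. destroying its copy of the task state) before it actually exits, so this is equivalent to joining only with respect to the task's observable results.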

{
assert(n_threads >= 1);
// TODO: This function might be modified instead to allocate tasks to threads
// from a pool, and the Task class might define other policies for
@crsib (Contributor):

Not having thread pools limits the usability of this function quite significantly.

@Paul-Licameli (Collaborator, Author):

I didn't want to distract myself yet with writing a thread pool too. I wanted to get to a demonstration of some real performance wins and also ease of use of the new utility in a few places.

Do we already have a usable thread pool in a third-party library?

@crsib (Contributor) commented Jan 29, 2024:

At least ... on one computer, with the averages of a few before-and-after trials, with 6 cores and hyperthreading.

You have 12 threads and just a 5x speedup. The good question is why, but I probably need to examine the rest of the PR.

};
using namespace Parallel;
ForEach task{ first, last, calcOne };
OneDimensionalReduce(task);
@crsib (Contributor):

If I understand correctly, this function may put the CPU at 100% load during playback. There must be more control over the thread priorities and thread count here.

@Paul-Licameli (Collaborator, Author):

Judging by the code style, you are pulling in some of your older work.

I have only looked at lib-concurrency. I think there are a few changes needed, but they are rather minor.

Two old files were simply moved into lib-concurrency. The rest of the work is very recent. But it abstracts out for reuse the pattern that was applied in this experiment a year ago. #4141

@Paul-Licameli
Copy link
Collaborator Author

At least ... on one computer, with the averages of a few before-and-after trials, with 6 cores and hyperthreading.

You have 12 threads and just a 5x speedup. The good question is why, but I probably need to examine the rest of the PR.

Speedup won't be ideal. There is always some serial, or less than fully parallel, component -- such as the final merging step, in which not all threads are doing work and there is inter-thread communication.

You pointed out some cache ping-pong from false sharing.

There is thread creation and destruction overhead because I didn't bother yet to invent or reuse a thread pool.

There might be contention for a shared resource like the main memory bus or I/O.

The estimated speedup was from @saintmatthieu 's regression test for tempo detection. In fact, that test first pulls entire files into big memory buffers; it used to go through that memory sequentially, but now it can be fetched in a random-access way.

The real-world tempo detection first creates a WaveClip object, and then this parallelized version, through the MirAudioReader subclass, will fetch sample blocks out of it, requiring database access.
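A back-of-the-envelope way to frame the "12 threads, only 5x" observation is Amdahl's law: if a fraction p of the runtime parallelizes perfectly over n threads, the speedup is 1 / ((1 - p) + p / n). The function and the 87% figure below are illustrative arithmetic, not a measurement from this PR:

```cpp
// Amdahl's law: expected speedup when fraction p of the work runs
// perfectly in parallel on n threads and the rest stays serial.
double AmdahlSpeedup(double p, unsigned n)
{
   return 1.0 / ((1.0 - p) + p / n);
}
```

Plugging in n = 12, a 5x speedup corresponds to roughly p ≈ 0.87, i.e. about 87% of the runtime being parallel -- plausible once the serial merge, thread creation overhead, and memory contention above are counted, and the 6 extra hyperthreads rarely behave like full cores anyway.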

@Paul-Licameli Paul-Licameli removed this from the Audacity 3.5 milestone Feb 21, 2024
Labels: performance (Improve speed of operations), spectral tools (spectrogram, spectral brush, etc), tempo detection
Project status: Ready for Review