WithWaitGroup hangs on error. Help explain how to use WithWaitGroup. #130

Open
kalensk opened this issue Apr 14, 2023 · 3 comments

kalensk commented Apr 14, 2023

Can you help explain how to properly use WithWaitGroup() especially when an error occurs? I seem to be having some misunderstanding.

The general pattern that works without a progress bar is:

  • create a wg sync.WaitGroup and call wg.Add(1) on each loop iteration
  • on each iteration, start a go func with defer wg.Done() that calls a DoWork function (with a progress bar, DoWork internally calls bar.Increment())
  • call wg.Wait() and collect any errors
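
The three steps above can be sketched with the standard library alone. This is a minimal illustration, not code from the project in question: doWork and runAll are hypothetical names, and the "fails when the name starts with bad" rule exists only to produce an error.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// doWork is a hypothetical stand-in for the DoWork function in the issue.
// It fails for any name that starts with "bad".
func doWork(name string) error {
	if strings.HasPrefix(name, "bad") {
		return fmt.Errorf("work failed for %s", name)
	}
	return nil
}

// runAll starts one goroutine per name, waits for all of them,
// and returns every error that was reported.
func runAll(names []string) []error {
	var wg sync.WaitGroup
	// Buffered to len(names) so that no goroutine ever blocks on send.
	errChan := make(chan error, len(names))

	for _, name := range names {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			if err := doWork(name); err != nil {
				errChan <- err
			}
		}(name)
	}

	wg.Wait()      // every goroutine has called wg.Done()
	close(errChan) // safe: no sender is left

	var errs []error
	for err := range errChan {
		errs = append(errs, err)
	}
	return errs
}

func main() {
	errs := runAll([]string{"db1", "bad_db", "db2"})
	fmt.Println(len(errs), "error(s)") // prints "1 error(s)"
}
```

Nothing in this version needs to be told that a goroutine failed: wg.Done() runs on both the success path and the error path, so wg.Wait() always returns.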

To use the mpb library I used mpb.New(mpb.WithWaitGroup(&wg)) and called progressBar.Wait() instead of wg.Wait(). See the code below. However, if DoWork() returns an error, for example before bar.Increment() is called, progressBar.Wait() never returns. Adding a bar.Abort() when an error occurs seems to work, but is a pattern that seems incorrect and one I'd like to avoid.

Note: I have also tried the below code without a WaitGroup and just progressBar := mpb.New() and it still hangs on progressBar.Wait() unless I add a bar.Abort() if an error is returned from DoWork, which does not seem correct.
I am not sure I understand the point of WithWaitGroup if the below code works the same without needing it. Can you explain?

Thank you for any clarification and help in understanding!

	
var wg sync.WaitGroup
progressBar := mpb.New(mpb.WithWaitGroup(&wg))
errChan := make(chan error, len(databases))

for _, database := range databases {
	wg.Add(1)

	bar := progressBar.New(int64(database.NumScriptsToDoWork),
		mpb.NopStyle(),
		mpb.PrependDecorators(decor.Name(database.Datname, decor.WCSyncSpaceR)),
		mpb.AppendDecorators(decor.NewPercentage()),
	)

	go func(database DatabaseInfo, bar *mpb.Bar) {
		defer wg.Done()

		err := DoWork(database, bar) // calls bar.Increment()
		if err != nil {
			errChan <- err
			return
		}

	}(database, bar)
}

progressBar.Wait()
close(errChan)

// ...
kalensk changed the title from "WithWaitGroup hangs on error. Can you further explain WithWaitGroup?" to "WithWaitGroup hangs on error. Help explain how to use WithWaitGroup." on Apr 14, 2023
kalensk added a commit to kalensk/gpupgrade that referenced this issue Apr 14, 2023
Show percentage progress for data migration script generation. And show
count of scripts applied for data migration script apply.

It's odd to have to call bar.Abort() on error. See
vbauerster/mpb#130

vbauerster (Owner) commented

> I am not sure I understand the point of WithWaitGroup if the below code works the same without needing it. Can you explain?

Applying mpb.WithWaitGroup(&wg) means: wait for the supplied wait group first, then wait for all bars to complete or abort. If you don't use this option, which is totally OK, you'll end up with:

wg.Wait() // wait for range databases loop
progress.Wait() // wait for bars to complete or abort

The point of WithWaitGroup is to have a single point of Wait call and wait/sync between different goroutines (the range databases loop and the bars rendering loop run in different goroutines). In other words, if the range databases loop has completed, it doesn't mean that the bars rendering loop has completed as well, and vice versa. Forgetting to wait for either may have surprising side effects: for example, if you forget to wait for progress and your program ends right after wg.Wait(), bars may end up in an incomplete state, such as showing 98%.
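
To make the ordering concrete, here is a toy model built only on the standard library. It is not mpb's implementation: toyProgress, newToyProgress, wait, and runToy are hypothetical names, and the "render loop" merely sums increments. The sketch only shows why one wait call that covers both the workers and the rendering goroutine guarantees a final, complete state.

```go
package main

import (
	"fmt"
	"sync"
)

// toyProgress is a hypothetical model of a progress container whose
// rendering runs in its own goroutine. It is NOT mpb's implementation.
type toyProgress struct {
	userWG   *sync.WaitGroup // plays the role of the group passed to WithWaitGroup
	updates  chan int        // increments sent by workers
	rendered int             // written only by the render goroutine
	done     chan struct{}   // closed once the render loop has drained updates
}

func newToyProgress(userWG *sync.WaitGroup, buffer int) *toyProgress {
	p := &toyProgress{
		userWG:  userWG,
		updates: make(chan int, buffer),
		done:    make(chan struct{}),
	}
	go func() { // the "rendering" goroutine
		defer close(p.done)
		for n := range p.updates {
			p.rendered += n
		}
	}()
	return p
}

// wait is the single synchronization point: workers first, then rendering.
func (p *toyProgress) wait() int {
	p.userWG.Wait()  // all workers have finished sending
	close(p.updates) // let the render loop drain and exit
	<-p.done         // rendering has caught up
	return p.rendered
}

// runToy starts the given number of workers, each reporting `steps` increments.
func runToy(workers, steps int) int {
	var wg sync.WaitGroup
	p := newToyProgress(&wg, workers*steps)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for s := 0; s < steps; s++ {
				p.updates <- 1
			}
		}()
	}
	return p.wait()
}

func main() {
	fmt.Println(runToy(4, 25)) // prints 100
}
```

In this model, reading the rendered count right after wg.Wait() alone would race with the render goroutine, which is the toy equivalent of a bar left at 98%.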

> Adding a bar.Abort() when an error occurs seems to work, but is a pattern that seems incorrect and one I'd like to avoid.
> I have also tried the below code without a WaitGroup and just progressBar := mpb.New() and it still hangs on progressBar.Wait() unless I add a bar.Abort() if an error is returned from DoWork, which does not seem correct.

If a bar doesn't complete or abort, progress.Wait() will never return. You already answered how to fix it: just call bar.Abort() in the error case. Why does it seem incorrect to you?
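
The rule "a bar must either complete or abort" can be illustrated with a small standard-library model. Again, this is an assumption-laden sketch rather than mpb's code: toyBar, work, run, and waitWithTimeout are hypothetical, and a timeout stands in for "Wait never returns" so that the example terminates.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// toyBar is a hypothetical stand-in for a progress bar: it is finished once
// it has been incremented `total` times or aborted. NOT mpb's code.
type toyBar struct {
	mu      sync.Mutex
	current int
	total   int
	once    sync.Once
	done    chan struct{}
}

func newToyBar(total int) *toyBar {
	return &toyBar{total: total, done: make(chan struct{})}
}

// finish marks the bar as finished; it is safe to call more than once.
func (b *toyBar) finish() { b.once.Do(func() { close(b.done) }) }

func (b *toyBar) increment() {
	b.mu.Lock()
	b.current++
	complete := b.current >= b.total
	b.mu.Unlock()
	if complete {
		b.finish()
	}
}

func (b *toyBar) abort() { b.finish() }

// waitWithTimeout reports whether the bar finished before the timeout.
func waitWithTimeout(b *toyBar, d time.Duration) bool {
	select {
	case <-b.done:
		return true
	case <-time.After(d):
		return false
	}
}

// work fails after the first of three steps, so the bar never reaches total.
func work(b *toyBar) error {
	b.increment()
	return errors.New("step 2 failed")
}

// run executes work and optionally aborts the bar when work fails.
func run(abortOnError bool) bool {
	b := newToyBar(3)
	if err := work(b); err != nil && abortOnError {
		b.abort()
	}
	return waitWithTimeout(b, 50*time.Millisecond)
}

func main() {
	fmt.Println("finished without abort:", run(false)) // false
	fmt.Println("finished with abort:", run(true))     // true
}
```

The worker is the only party that knows the remaining increments will never arrive, which is why the abort call has to come from the error path.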

vbauerster (Owner) commented

The following is not related to your question, just a small error-handling review. Do you really need to handle all possible errors? Usually it's enough to handle the first error and ignore the rest:

var wg sync.WaitGroup
progressBar := mpb.New(mpb.WithWaitGroup(&wg))
errChan := make(chan error, 1) // we will handle first error only

for _, database := range databases {
	wg.Add(1)

	bar := progressBar.New(int64(database.NumScriptsToDoWork),
		mpb.NopStyle(),
		mpb.PrependDecorators(decor.Name(database.Datname, decor.WCSyncSpaceR)),
		mpb.AppendDecorators(decor.NewPercentage()),
	)

	go func(database DatabaseInfo, bar *mpb.Bar) {
		defer wg.Done()

		err := DoWork(database, bar) // calls bar.Increment()
		if err != nil {
			select {
			case errChan <- err:
				// we're the first goroutine to fail here
			default:
				// don't care, as an error was already sent
			}
			bar.Abort(...)
		}
	}(database, bar)
}

progressBar.Wait()
close(errChan)
// do something with the error, if any
if err := <-errChan; err != nil {
	...
}

kalensk commented Apr 21, 2023

Thank you so much for the response! I really appreciate it as it helps clarify some concepts for me.

> the point of WithWaitGroup is to have a single point of Wait call and wait/sync between different goroutines (the range databases loop and the bars rendering loop run in different goroutines).

Interesting. So, WithWaitGroup basically avoids the following:

wg.Wait() // wait for range databases loop
progress.Wait() // wait for bars to complete or abort

I tend to prefer code that is very explicit and clear. Thus, I would rather have a wait group (or whatever is needed) for both the database work being done and the progress bars than potentially hide any logic behind something. To me, having one for each makes things clear and explicit, which helps mitigate bugs and logic errors.

> Why does it seem incorrect to you?

Because in my "normal" workflow, as shown in my original example, there is no need for "aborting". That is,

  • create a wg sync.WaitGroup and for each loop call wg.Add(1)
  • for each loop create a go func with defer wg.Done() that calls a DoWork function
  • call wg.Wait() and collect any errors

It uses the general concept of returning errors from DoWork() via channels, since we are using goroutines. There is no concept of "aborting" or of needing to call an "abort" function. That is what I mean when I say it seems incorrect or might indicate a "code smell".

> The following is not related to your question, just a small error-handling review. Do you really need to handle all possible errors? Usually it's enough to handle the first error and ignore the rest:

If I am following correctly, I think it mainly comes down to whether one wants to return early at the first error or collect and return all errors. That is, by returning on the first error one ignores any additional errors.
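
As a rough illustration of that trade-off, the sketch below uses one function for both strategies, and the only thing that changes is the capacity of the error channel. The names gather and sampleTasks are hypothetical, and the example uses only the standard library (no progress bars). It assumes a capacity of at least 1, since with an unbuffered channel and no receiver every error would take the default branch and be dropped.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// gather runs every task concurrently and returns the errors that fit into
// a channel of the given capacity (assumed >= 1):
//   - capacity 1          keeps one error and drops the rest
//   - capacity len(tasks) keeps every error
func gather(tasks []func() error, capacity int) []error {
	var wg sync.WaitGroup
	errChan := make(chan error, capacity)

	for _, task := range tasks {
		wg.Add(1)
		go func(task func() error) {
			defer wg.Done()
			if err := task(); err != nil {
				select {
				case errChan <- err: // there was room: the error is kept
				default: // buffer full: this error is dropped
				}
			}
		}(task)
	}

	wg.Wait()
	close(errChan)

	var errs []error
	for err := range errChan {
		errs = append(errs, err)
	}
	return errs
}

// sampleTasks returns one succeeding and two failing tasks.
func sampleTasks() []func() error {
	return []func() error{
		func() error { return nil },
		func() error { return errors.New("task A failed") },
		func() error { return errors.New("task B failed") },
	}
}

func main() {
	tasks := sampleTasks()
	fmt.Println(len(gather(tasks, 1)))          // prints 1
	fmt.Println(len(gather(tasks, len(tasks)))) // prints 2
}
```

With capacity 1, which of the two failures is kept depends on goroutine scheduling. That is usually acceptable when the caller only needs to know that something failed.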

kalensk added further commits with the same message that referenced this issue: to kalensk/gpupgrade on Apr 26, 2023 and Jun 10, 2023, and to greenplum-db/gpupgrade on Apr 26, 2023.