UART plugin does not send correct DMX data when granularity check happens during high load #1798

Open
markus983 opened this issue Nov 21, 2022 · 5 comments · May be fixed by #1800

@markus983

Hi,

I run OLA on a RPI 4 within a Docker container and use UART to send out DMX data. Most of the time it works just fine; however, from time to time the DMX protocol appears to break completely. After some testing and investigation, I found that the granularity is set to BAD whenever this happens. Apparently this can happen quite easily when the RPI has a lot of other work to do during the granularity check. My guess is that the OLA server doesn't get the CPU in time. The code can be found in plugins/uartdmx/UartDmxThread.cpp.
As a result, the DMX break signal is skipped, which causes the DMX receiver to ignore the data that follows, since it cannot interpret it correctly. The only way I've found to resolve this problem is to restart the entire OLA server.
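
For readers unfamiliar with this kind of check, a granularity test can be sketched roughly as follows. This is a minimal illustration, not the code from UartDmxThread.cpp: the 3 ms threshold and the names `MeasureSleep` and `Classify` are assumptions chosen for the example.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

enum Granularity { GOOD, BAD };

// Hypothetical threshold: if a short test sleep takes longer than this,
// short sleeps are treated as untrustworthy.
constexpr std::chrono::microseconds kBadThreshold{3000};

// Sleeps for `requested` and returns how long the sleep really took.
std::chrono::microseconds MeasureSleep(std::chrono::microseconds requested) {
  assert(requested.count() > 0);
  const auto start = std::chrono::steady_clock::now();
  std::this_thread::sleep_for(requested);
  const auto end = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::microseconds>(end - start);
}

// Pure decision step, kept separate so it can be tested without sleeping.
Granularity Classify(std::chrono::microseconds measured) {
  return measured > kBadThreshold ? BAD : GOOD;
}
```

Under heavy load, a single call such as `Classify(MeasureSleep(std::chrono::microseconds(1000)))` can return BAD simply because the thread was descheduled at that moment, which would match the behaviour described above.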

Recreating this issue should be fairly easy: force all your CPUs to 100% (or otherwise stress test your hardware) and restart your OLA server with the UART DMX plugin enabled. You may have to restart the server a few times until it happens, since this issue is not deterministic. It occurred with versions 0.10.3 and 0.10.8; I haven't tested the others, but I don't expect different results.

I think it would be cleaner to fix this issue by fixing the granularity check, or by changing the impact the granularity has on the data sent. However, I'm not sure how to fix this since I have no experience in coding so close to the hardware. But for the time being, could you perhaps provide a config parameter that allows forcing the granularity to GOOD?

I know that this undermines its entire purpose, but I also know that the hardware can handle the timings, and it isn't an option for me to depend on luck for this plugin to send out DMX data correctly.

Thanks for your help!

@peternewman
Member

Hi @markus983,

> I run OLA on a RPI 4 within a Docker container

To try and address the simple issues, have you tried running OLAd natively or using nice to give it a higher priority?

> and use UART to send out DMX data. Most of the time it works just fine; however, from time to time the DMX protocol appears to break completely. After some testing and investigation, I found that the granularity is set to BAD whenever this happens. Apparently this can happen quite easily when the RPI has a lot of other work to do during the granularity check. My guess is that the OLA server doesn't get the CPU in time. The code can be found in plugins/uartdmx/UartDmxThread.cpp. As a result, the DMX break signal is skipped, which causes the DMX receiver to ignore the data that follows, since it cannot interpret it correctly.

From a quick look at the code, it does indeed appear that the break and MAB get skipped if we're not in good granularity, but the frame data does still get output, which seems a slightly odd choice (this is also true with the FTDI plugin):

    if (!m_widget->SetBreak(true))
      goto framesleep;
    if (m_granularity == GOOD)
      usleep(m_breakt);
    if (!m_widget->SetBreak(false))
      goto framesleep;
    if (m_granularity == GOOD)
      usleep(DMX_MAB);
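
To make the effect of the snippet above concrete, here is a small self-contained model of it. `FakeWidget`, `SendBreakAndMab`, and the 110/16 microsecond values used in the note below are hypothetical stand-ins, not OLA classes or constants; the sketch only records which delays would be requested instead of touching hardware.

```cpp
#include <string>
#include <vector>

enum Granularity { GOOD, BAD };

// Hypothetical stand-in for the UART widget: logs line operations.
struct FakeWidget {
  std::vector<std::string> log;
  bool SetBreak(bool on) {
    log.push_back(on ? "break on" : "break off");
    return true;
  }
  void Sleep(unsigned int usec) {
    log.push_back("sleep " + std::to_string(usec));
  }
};

// Mirrors the structure of the quoted snippet: the line is still toggled
// when granularity is BAD, but neither delay is applied.
bool SendBreakAndMab(FakeWidget *widget, Granularity granularity,
                     unsigned int break_us, unsigned int mab_us) {
  if (!widget->SetBreak(true))
    return false;
  if (granularity == GOOD)
    widget->Sleep(break_us);
  if (!widget->SetBreak(false))
    return false;
  if (granularity == GOOD)
    widget->Sleep(mab_us);
  return true;
}
```

Calling `SendBreakAndMab(&widget, GOOD, 110, 16)` logs break on, sleep 110, break off, sleep 16. With BAD the log is just break on, break off, so the break lasts only as long as the two calls take, which is likely too short for a receiver to treat as a break.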

> The only way I've found to resolve this problem is to restart the entire OLA server.

FWIW I suspect reloading the plugins would probably resolve it too.

> But for the time being, could you perhaps provide a config parameter that allows forcing the granularity to GOOD?

That feels like a huge hack and will just generate a pile of new issues!

However, a change was made to the FTDI plugin some time ago that allows granularity to be recovered if timing improves. Would you like to try applying the same changes to the UART plugin and see if that fixes your issues?
01835aa
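
The idea of recovering granularity can be sketched roughly as below. This is not the code from the commit referenced above: the class name `GranularityTracker`, the threshold passed to it, and the rule of re-testing on every sample are assumptions made for illustration. Measurements are passed in rather than taken with a real clock so that the logic is easy to follow.

```cpp
enum Granularity { GOOD, BAD };

// Hypothetical helper that tracks timing granularity across frames.
class GranularityTracker {
 public:
  explicit GranularityTracker(unsigned int bad_threshold_us)
      : m_threshold_us(bad_threshold_us), m_granularity(GOOD) {}

  // Feed in how long a short test sleep really took. Unlike a one-off
  // check at startup, every sample can move the state in either direction.
  void Update(unsigned int measured_sleep_us) {
    m_granularity = (measured_sleep_us > m_threshold_us) ? BAD : GOOD;
  }

  // Break and MAB delays are only trustworthy while this returns true.
  bool CanTimeBreak() const { return m_granularity == GOOD; }

  Granularity granularity() const { return m_granularity; }

 private:
  unsigned int m_threshold_us;
  Granularity m_granularity;
};
```

A sender loop built this way would call `Update()` with a fresh measurement while in BAD mode, so one slow sample costs a few frames at most instead of disabling break timing until the server is restarted.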

@markus983
Author

Hi @peternewman,

thanks for your help!

> From a quick look at the code, it does indeed appear that the break and MAB get skipped if we're not in good granularity, but the frame data does still get output, which seems a slightly odd choice (this is also true with the FTDI plugin)

You're right about that, I was surprised as well since I expected the actual data to be the most important/sensitive part regarding the granularity.

> However, a change was made to the FTDI plugin some time ago that allows granularity to be recovered if timing improves. Would you like to try applying the same changes to the UART plugin and see if that fixes your issues?
> 01835aa

Thanks for that info. I've changed the plugin code accordingly and this actually fixes the problem. Even high-load scenarios are no issue with these changes. I opened a pull request (#1800) in the hope that others can benefit from these changes as well.

@peternewman
Member

> thanks for your help!

No worries, thanks for testing it out and opening a PR!

> > From a quick look at the code, it does indeed appear that the break and MAB get skipped if we're not in good granularity, but the frame data does still get output, which seems a slightly odd choice (this is also true with the FTDI plugin)
>
> You're right about that, I was surprised as well since I expected the actual data to be the most important/sensitive part regarding the granularity.

Thinking about it a bit more, I guess the theory is as follows:
Break
MAB
First frame output (1,2,3...512)
Timing granularity goes bad
No Break
No MAB
Second frame output continues (1,2,3...512)

So what the fixture sees is:
Break
MAB
1,2,3...512,1,2,3...512

A properly behaved fixture should see that as one very big frame and just pick up the values towards the start of the frame that it needs. Without double-checking I'm not sure if that would count as a loss of DMX or not, but I guess hopefully it would stop it putting out garbage anyway. Or at least I assume that's the idea.
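
The reasoning above can be checked with a toy receiver model. Everything here is an assumption for illustration: `LineEvent`, `ToyReceiver`, and the rule that slots beyond 512 are ignored describe one plausible fixture, not a specific product, and the start code is left out to match the simplified 1,2,3...512 notation.

```cpp
#include <cstddef>
#include <vector>

// One thing seen on the wire: either a break or a data slot value.
struct LineEvent {
  bool is_break;
  int value;
};

// Toy fixture: a break starts a new frame; only the first 512 slots of
// a frame are stored, anything after that is counted but ignored.
class ToyReceiver {
 public:
  void Feed(const LineEvent &event) {
    if (event.is_break) {
      m_frames++;
      m_slots.clear();
      m_ignored = 0;
    } else if (m_frames > 0) {
      if (m_slots.size() < 512)
        m_slots.push_back(event.value);
      else
        m_ignored++;
    }
  }
  int frames() const { return m_frames; }
  std::size_t ignored() const { return m_ignored; }
  const std::vector<int> &slots() const { return m_slots; }

 private:
  int m_frames = 0;
  std::size_t m_ignored = 0;
  std::vector<int> m_slots;
};

// Appends slots 1..512 to the stream, optionally preceded by a break.
void AppendFrame(std::vector<LineEvent> *stream, bool with_break) {
  if (with_break)
    stream->push_back({true, 0});
  for (int slot = 1; slot <= 512; ++slot)
    stream->push_back({false, slot});
}
```

Feeding this receiver one frame with a break followed by one frame without gives a single frame: the first 512 slots are stored and the next 512 are ignored. In this model, a receiver that has never seen a break stores nothing at all, which would fit a sender that starts up with BAD granularity and never recovers.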

> Thanks for that info. I've changed the plugin code accordingly and this actually fixes the problem. Even high-load scenarios are no issue with these changes. I opened a pull request (#1800) in the hope that others can benefit from these changes as well.

Excellent, thanks. So do you see the failed granularity message during high load and then it recovers afterwards?

@markus983
Author

> Excellent, thanks. So do you see the failed granularity message during high load and then it recovers afterwards?

Yes, with these changes the plugin was always able to recover even during high load. Actually, the recovery happened almost immediately after the initial check.

peternewman added this to the 0.10.9 milestone on Dec 5, 2022

@peternewman
Member

> Yes, with these changes the plugin was always able to recover even during high load. Actually, the recovery happened almost immediately after the initial check.

That's great news. I guess it probably just drops one or two frames in reality and that's it!

peternewman changed the milestone from 0.10.9 to 0.10.10 on Feb 26, 2023