Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move away from variable-length array #44937

Open
srimanob opened this issue May 8, 2024 · 8 comments
Open

Move away from variable-length array #44937

srimanob opened this issue May 8, 2024 · 8 comments

Comments

@srimanob
Copy link
Contributor

srimanob commented May 8, 2024

This is a follow up of #44306 where we see the crash which cen be solved by moving away from variable-length array.

          > I've only checked the L1Trigger/TrackFinding* packages by recompiling them with the `-Werror=vla` flag, but there seem to be no more instances of this particular problem there.

Just for fun, here is a table of all variable-length arrays in L1Trigger in CMSSW_14_1_0_pre3. I leave it to the experts of other subpackages to fix them, but hopefully this is a useful starting point.

File name Line number Name of offending array
L1Trigger/DTTriggerPhase2/src/MuonPathAssociator.cc 1004 useFit
L1Trigger/DTTriggerPhase2/src/MuonPathAssociator.cc 122 useFitSL1
L1Trigger/DTTriggerPhase2/src/MuonPathAssociator.cc 125 useFitSL3
L1Trigger/L1TCaloLayer1/src/UCTRegion.cc 132 activeTower
L1Trigger/L1TGlobal/plugins/GenToInputProducer.cc 264 idxMu
L1Trigger/L1TGlobal/plugins/GenToInputProducer.cc 265 muPtSorted
L1Trigger/L1TGlobal/plugins/GenToInputProducer.cc 333 idxEg
L1Trigger/L1TGlobal/plugins/GenToInputProducer.cc 334 egPtSorted
L1Trigger/L1TGlobal/plugins/GenToInputProducer.cc 364 idxTau
L1Trigger/L1TGlobal/plugins/GenToInputProducer.cc 365 tauPtSorted
L1Trigger/L1TGlobal/src/CorrCondition.cc 369 InvDeltaRSqLUT
L1Trigger/L1TGlobal/src/CorrCondition.cc 370 temp_InvDeltaRSq
L1Trigger/L1THGCal/src/backend/HGCalClusteringImpl.cc 253 isSeed
L1Trigger/L1THGCal/src/backend/HGCalClusteringImpl.cc 372 toRemove
L1Trigger/L1THGCal/src/backend/HGCalClusteringImpl.cc 44 isSeed
L1Trigger/L1TTrackMatch/plugins/L1TrackJetEmulatorProducer.cc 150 epbins_default
L1Trigger/L1TTrackMatch/plugins/L1TrackJetEmulatorProducer.cc 196 epbins
L1Trigger/L1TTrackMatch/plugins/L1TrackJetProducer.cc 140 epbins_default
L1Trigger/L1TTrackMatch/plugins/L1TrackJetProducer.cc 179 epbins
L1Trigger/Phase2L1ParticleFlow/interface/common/bitonic_hybrid_sort_ref.h 290 work
L1Trigger/Phase2L1ParticleFlow/interface/common/bitonic_hybrid_sort_ref.h 304 halfsorted
L1Trigger/Phase2L1ParticleFlow/interface/common/bitonic_hybrid_sort_ref.h 304 work
L1Trigger/Phase2L1ParticleFlow/interface/common/bitonic_hybrid_sort_ref.h 333 tomerge
L1Trigger/Phase2L1ParticleFlow/interface/common/bitonic_sort_ref.h 113 OutTmp
L1Trigger/Phase2L1ParticleFlow/interface/common/bitonic_sort_ref.h 128 outTmp2
L1Trigger/Phase2L1ParticleFlow/interface/common/bitonic_sort_ref.h 67 out2
L1Trigger/Phase2L1ParticleFlow/interface/common/bitonic_sort_ref.h 70 out3
L1Trigger/Phase2L1ParticleFlow/src/L1TCorrelatorLayer1PatternFileWriter.cc 325 ret
L1Trigger/TrackFindingTracklet/src/PurgeDuplicate.cc 194 dupMap
L1Trigger/TrackFindingTracklet/src/PurgeDuplicate.cc 202 noMerge

Originally posted by @aehart in #44306 (comment)

@cmsbuild
Copy link
Contributor

cmsbuild commented May 8, 2024

cms-bot internal usage

@cmsbuild
Copy link
Contributor

cmsbuild commented May 8, 2024

A new Issue was created by @srimanob.

@sextonkennedy, @rappoccio, @makortel, @antoniovilela, @Dr15Jones, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@makortel
Copy link
Contributor

makortel commented May 8, 2024

I'd suspect the reason for the failure reported in #44306 was the array in stack became large, rather than the array being variable-length itself.

As mentioned in #44306 (comment), VLA is a non-standard extension. From the past I recall e.g. tracking code uses

template <typename T>
class DynArray {

for performance reasons (but I can't tell on the top of my head how big the impact of moving to e.g. std::vector would be there).

Is the scope of this issue to remove VLA in L1T code, or everywhere in CMSSW?

@srimanob
Copy link
Contributor Author

srimanob commented May 9, 2024

Hi @makortel
Thanks for the comment. My initial thought is how risk we are on this VLA, as this is just a pop-up issues in L1T module. It seems we will never know if we are at boundary of failure until it fails. Do we have cons for migration?

@VinInn
Copy link
Contributor

VinInn commented May 9, 2024

Do we have cons for migration?
YES. performance.

Just avoid those very large arrays in first place.

To be safe one can use the DynArray and then at run time initialize either with a pointer to a VLA or with one to the heap depending to to size of the allocation required...

@makortel
Copy link
Contributor

makortel commented May 9, 2024

It seems we will never know if we are at boundary of failure until it fails.

This is correct for stack overflows in general. While the problem of large arrays in stack is strictly speaking orthogonal to the question of whether we should allow VLA to be used or not, the dynamic nature makes potentially large VLAs somewhat more difficult to catch in code review, PR tests, and IB tests compared to compile-time-defined arrays.

@dan131riley
Copy link

There's an additional issue that large VLAs that cause problems with multi-threaded processes may not show up in single-threaded PR tests. I haven't checked if we impose the same stack size limit in both cases.

@makortel
Copy link
Contributor

I haven't checked if we impose the same stack size limit in both cases.

All TBB threads should use the same stack size

// if needed, re-initialise TBB
if (threadsInfo.nThreads_ != edm::s_defaultNumberOfThreads or
threadsInfo.stackSize_ != edm::s_defaultSizeOfStackForThreadsInKB) {
threadsInfo.nThreads_ = edm::setNThreads(threadsInfo.nThreads_, threadsInfo.stackSize_, tsiPtr);
}

oPtr = std::make_unique<ThreadsController>(static_cast<int>(iNThreads), iStackSize);

return std::make_unique<oneapi::tbb::global_control>(oneapi::tbb::global_control::thread_stack_size, iStackSize);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants