Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle event meta data changes in streamer files #44892

Merged
merged 4 commits into from
May 29, 2024

Conversation

Dr15Jones
Copy link
Contributor

PR description:

Moved storage of meta data needed for events into the first EventMessage sent before the subsequent events which need that meta data.
This fixed a problem seen online where different HLT nodes were generating different event meta data for the same luminosity block but online the event data for one node is used.

PR validation:

All unit tests associated with streamer files pass.

@Dr15Jones
Copy link
Contributor Author

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented May 2, 2024

cms-bot internal usage

@cmsbuild
Copy link
Contributor

cmsbuild commented May 2, 2024

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-44892/40158

@cmsbuild
Copy link
Contributor

cmsbuild commented May 2, 2024

A new Pull Request was created by @Dr15Jones for master.

It involves the following packages:

  • CalibCalorimetry/EcalLaserSorting (alca)
  • DQMServices/StreamerIO (dqm)
  • DataFormats/Streamer (core)
  • EventFilter/Utilities (daq)
  • IOPool/Streamer (core)

@saumyaphor4252, @perrotta, @nothingface0, @smuzaffar, @tjavaid, @syuvivida, @antoniovagnerini, @makortel, @smorovic, @consuegs, @rvenditti, @Dr15Jones, @emeschi can you please review it and eventually sign? Thanks.
@wddgit, @rovere, @ReyerBand, @argiro, @wang0jin, @yuanchao, @makortel, @Martin-Grunewald, @rsreds, @mmusich, @missirol, @tocheng, @rchatter, @barvic, @thomreis this is something you requested to watch as well.
@rappoccio, @antoniovilela, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@cmsbuild
Copy link
Contributor

cmsbuild commented May 2, 2024

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-79e450/39218/summary.html
COMMIT: e3d9efe
CMSSW: CMSSW_14_1_X_2024-05-02-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/44892/39218/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3331548
  • DQMHistoTests: Total failures: 3
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3331525
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 202 log files, 165 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

@makortel
Copy link
Contributor

makortel commented May 3, 2024

Part of addressing #44643

The event meta-data (e.g. BranchLists) is now added as an
EventMsg before the following EventMsg for which it applies.
This added EventMsg contains no data products. When seen,
the added EventMsg is handled as an artificial file boundary in
order to cause the needed stream synchronization.
This change is needed to handle how HLT merges parts of streamer
files which might have different meta-data.
- event meta data is cached at begin job or when new file is opened
- refactored StreamerOutputModuleCommon to be two classes so buffer
  could be handled separately for the caching
- added many unit tests to show GlobalEvFOutputModule properly
  writes the expected events and the data products are correct
@smorovic
Copy link
Contributor

+1

@makortel
Copy link
Contributor

+core

@cmsbuild
Copy link
Contributor

REMINDER @rappoccio, @sextonkennedy, @antoniovilela: This PR was tested with #44991, please check if they should be merged together

@antoniovilela
Copy link
Contributor

ping @cms-sw/dqm-l2

@tjavaid
Copy link

tjavaid commented May 29, 2024

+1

@rappoccio
Copy link
Contributor

+1

@mmusich
Copy link
Contributor

mmusich commented May 29, 2024

hold

is this still meant to be on hold?

@antoniovilela
Copy link
Contributor

unhold

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged.

@cmsbuild cmsbuild merged commit ff1f93a into cms-sw:master May 29, 2024
12 checks passed
@smuzaffar
Copy link
Contributor

smuzaffar commented May 30, 2024

@Dr15Jones , this is now available in today's 11h IB. There is a unit test failing with [a], could this PR be the cause of it? Note that IB is a patch build, may be we need a full build to avoid it?

[a] https://cmssdt.cern.ch/SDT/cgi-bin/logreader/el8_amd64_gcc12/CMSSW_14_1_X_2024-05-30-1100/unitTestLogs/RecoPPS/Local#/

----- Begin Fatal Exception 30-May-2024 13:09:44 CEST-----------------------
An exception of category 'FatalRootError' occurred while
   [0] Constructing the EventProcessor
   [1] Constructing input source of type NewEventStreamFileReader
   Additional Info:
      [a] Fatal Root Error: @SUB=TBufferFile::ReadClassBuffer
Could not find the StreamerInfo for version 11 of the class edm::SendJobHeader, object skipped at offset 33

----- End Fatal Exception -------------------------------------------------

@makortel
Copy link
Contributor

There is a unit test failing with [a], could this PR be the cause of it?

It is.

Note that IB is a patch build, may be we need a full build to avoid it?

Full build won't help. Based on the logs the test seems to be reading streamer files (brittle), so either the streamer files need to be redone (through streamer -> root -> streamer cycle), or the files could be converted to root format if the streamer aspect is not super important for that test (or the test would be disabled). I'd suggest to follow up with the test authors in a new issue.

@makortel
Copy link
Contributor

I'd suggest to follow up with the test authors in a new issue.

I opened an issue in #45101

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet