Ticket #7489 (closed: wontfix)

Opened 7 years ago

Last modified 5 years ago

Add mutex for adding events in LoadEventNexus

Reported by: Peter Peterson Owned by: Russell Taylor
Priority: major Milestone: Release 3.2
Component: Framework Keywords:
Cc: martyn.gigg@… Blocked By: #7524
Blocking: Tester: Peter Peterson

Description

To prevent re-introducing a periodic failure, introduce a vector of mutexes to check before accessing individual event vectors in LoadEventNexus. This should be done just inside the check for the time-of-flight being in the desired range (lines 286-335).

Change History

comment:1 Changed 7 years ago by Russell Taylor

  • Owner changed from Peter Peterson to Russell Taylor
  • Status changed from new to inprogress

comment:2 Changed 7 years ago by Russell Taylor

Re #7489. Add back mutex per Pete's instructions.

Hoping that this will fix sporadic failures in system (LoadLotsOfFiles & BASISAutoReduction) & unit (ModeratorTzeroLinear) tests that load BASIS data.

Changeset: aac45c4c7125c362a370d58f70cd9dbabd9e73b2

comment:3 Changed 7 years ago by Russell Taylor

  • Blocked By 7524 added

comment:4 Changed 7 years ago by Russell Taylor

Revert "Re #7489. Add back mutex per Pete's instructions."

Take the mutex out again pending fixing the real problem in #7524. This reverts commit aac45c4c7125c362a370d58f70cd9dbabd9e73b2.

Changeset: f58716404c835b698b10cbf00986c2aab021b43c

comment:5 Changed 7 years ago by Russell Taylor

  • Priority changed from blocker to major
  • type changed from enhancement to defect
  • Milestone changed from Release 2.6 to Backlog

I'm going to reduce the priority on this and push it back because, while I believe there is a theoretical risk, I think the actual risk of us encountering a problem is very small (might even be zero) and not having the global mutex gives a big boost in performance.

The problem would be if more that one detector ID pointed to the same event list/workspace index, if there were events for more than one of these detectors and if they were processed on different threads. In that case you could have an error if events destined for the same event were processed at the same time, and a crash if one caused a resize of the vector. SNS doesn't do hardware grouping so it shouldn't happen for us. I'm not sure if ISIS files even pass through the code in a way that they could hit this.

The correct solution would be to check the pixelID_to_wi_vector for duplicates and then lock the mutex if any are found (a mutex per event list seems like overkill given how rare this will be). However, everyone would have to pay the price for checking this when I'm not even sure it will ever trigger.

comment:6 Changed 7 years ago by Russell Taylor

  • Status changed from inprogress to new

comment:7 Changed 7 years ago by Russell Taylor

  • Status changed from new to verify
  • Resolution set to wontfix
  • Milestone changed from Backlog to Release 3.2

I'm going to close this one. I think the risk of it biting us is very low - and if it does having an old ticket lying around probably wouldn't be much help anyway.

comment:8 Changed 7 years ago by Peter Peterson

  • Status changed from verify to verifying
  • Tester set to Peter Peterson

comment:9 Changed 7 years ago by Peter Peterson

  • Status changed from verifying to closed

As the creator of the ticket I agree with Russell's assessment that it isn't worth doing.

comment:10 Changed 5 years ago by Stuart Campbell

This ticket has been transferred to github issue 8334

Note: See TracTickets for help on using tickets.