Ticket #5533 (closed: duplicate)
IntegratePeaksMD seg faults sporadically
Reported by: | Dennis Mikkelson | Owned by: | Vickie Lynch |
---|---|---|---|
Priority: | major | Milestone: | Release 3.2 |
Component: | Framework | Keywords: | |
Cc: | petersonpf@… | Blocked By: | |
Blocking: | | Tester: | Michael Reuter |
Description (last modified by Vickie Lynch)
When used repeatedly in a script, IntegratePeaksMD seg faults, typically after processing between 2 and 6 TOPAZ runs. This is related to OpenMP: if the line PRAGMA_OMP( parallel for schedule(dynamic, 10) ) is commented out, IntegratePeaksMD runs reliably.
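For illustration only, here is a minimal sketch of the kind of loop that can fail sporadically under a `parallel for` and one way to make it safe. It is not the IntegratePeaksMD code, and the ticket does not identify where the actual race is. The names `integrateOne` and `integrateAll` are hypothetical. The sketch assumes the loop body writes to a shared container, which has to be serialized, and accumulates a sum, which can use a reduction.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-peak work; not Mantid code.
double integrateOne(std::size_t peakIndex) {
  return static_cast<double>(peakIndex) * 0.5;
}

// Sums the per-peak results and records which peaks were processed.
// `processed` is shared by all threads, so the push_back must be serialized;
// without the critical section it would be a data race on the vector.
double integrateAll(std::size_t nPeaks, std::vector<std::size_t> &processed) {
  assert(processed.empty()); // sketch assumes the caller passes an empty vector
  double total = 0.0;
  const long n = static_cast<long>(nPeaks);
#pragma omp parallel for schedule(dynamic, 10) reduction(+ : total)
  for (long i = 0; i < n; ++i) {
    const double value = integrateOne(static_cast<std::size_t>(i));
    total += value; // safe: each thread adds to a private copy, combined at the end
#pragma omp critical
    { processed.push_back(static_cast<std::size_t>(i)); }
  }
  return total;
}
```

If the compiler is run without OpenMP support the pragmas are ignored and the loop runs serially, which mirrors the effect of commenting the pragma out. With OpenMP enabled, the order of entries in `processed` is not deterministic, only its contents.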
Change History
comment:2 Changed 8 years ago by Russell Taylor
Dennis, you say the use of OpenMP provides a small benefit, yet the IntegratePeaksMD performance tests have shown a marked slowdown: 0.9 -> 2.7 s for one, 1.1 -> 4.0 s for the other. How many cores were you running on? (Presumably the fewer the cores, the smaller the change.)
I'll have a look to see if I can spot where the race condition is.
comment:3 Changed 8 years ago by Dennis Mikkelson
Re 5533: Comment out OMP Pragma
If the OMP pragma is included in IntegratePeaksMD, the algorithm seg faults sporadically when processing multiple TOPAZ runs in a script, on Scientific Linux 6.2. Typically, it seg faults after 2 to 6 runs are processed, though occasionally it will process all 8 requested in the script without crashing. Since the lower level codes already use OpenMP, parallelizing at this level is only marginally useful, giving about a 5-10% speedup. Perhaps it should just be removed permanently, but for now it is commented out to avoid the seg faults. Refs #5533
Changeset: a56c8967e4d65bd6c0502742ddc429c0c4a32b6e
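As a hedged sketch of the trade-off described in this commit message, the outer loop can be made switchable instead of being commented out, using the standard OpenMP `if` and `num_threads` clauses. This is not the Mantid code. `innerSum`, `processRuns`, `parallelOuter` and `outerThreads` are hypothetical names, and the inner kernel merely stands in for lower-level code that is already parallel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical lower-level kernel that already uses OpenMP internally.
double innerSum(const std::vector<double> &values) {
  double sum = 0.0;
  const long n = static_cast<long>(values.size());
#pragma omp parallel for reduction(+ : sum)
  for (long i = 0; i < n; ++i)
    sum += values[static_cast<std::size_t>(i)];
  return sum;
}

// Outer loop over runs. The `if` clause turns the outer parallel region on or
// off at run time, and `num_threads` caps how many threads it may use.
std::vector<double> processRuns(const std::vector<std::vector<double>> &runs,
                                bool parallelOuter, int outerThreads) {
  assert(outerThreads >= 1); // num_threads requires a positive value
  std::vector<double> totals(runs.size(), 0.0);
  const long n = static_cast<long>(runs.size());
#pragma omp parallel for if (parallelOuter) num_threads(outerThreads)
  for (long r = 0; r < n; ++r) {
    const std::size_t idx = static_cast<std::size_t>(r);
    totals[idx] = innerSum(runs[idx]); // each iteration writes a distinct element
  }
  return totals;
}
```

Calling `processRuns(runs, false, 1)` keeps the outer loop serial and leaves all parallelism to the inner kernel, which is the behaviour the commit gets by commenting the pragma out.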
comment:4 Changed 8 years ago by Nick Draper
- Milestone changed from Release 2.2 to Release 2.3
Moved at the end of release 2.2
comment:5 Changed 8 years ago by Nick Draper
- Milestone changed from Release 2.3 to Release 2.4
Moved to milestone 2.4
comment:6 Changed 8 years ago by Dennis Mikkelson
- Owner changed from Dennis Mikkelson to Anyone
- Status changed from new to assigned
- Milestone changed from Release 2.4 to Release 2.5
comment:10 Changed 7 years ago by Nick Draper
- Milestone changed from Release 2.6 to Backlog
Moved to backlog at the code freeze for R2.6
comment:11 Changed 7 years ago by Nick Draper
- Status changed from new to assigned
Bulk move to assigned at the introduction of the triage step
comment:12 Changed 7 years ago by Vickie Lynch
- Owner changed from Anyone to Vickie Lynch
- Description modified
comment:13 Changed 7 years ago by Vickie Lynch
comment:15 Changed 7 years ago by Vickie Lynch
comment:16 Changed 7 years ago by Vickie Lynch
comment:17 Changed 7 years ago by Russell Taylor
I took the last commit out of the develop branch because it was still causing the Mac build to get stuck. Moreover, it caused crashes in performance and system tests, so it's clearly not thread-safe.
comment:18 Changed 6 years ago by Vickie Lynch
Refs #5533 limit number of threads
Changeset: 3d536d35a9bcab324aac9b32d9debda9da90725b
comment:19 Changed 6 years ago by Vickie Lynch
Revert "Refs #5533 limit number of threads"
This reverts commit 3d536d35a9bcab324aac9b32d9debda9da90725b.
Changeset: 908e0beee54161bffe1d6582cba7a1b6e8396e7e
comment:20 Changed 6 years ago by Vickie Lynch
Refs #5533 try threads with fitting moved inside function
Changeset: 787a9cba192bf1e8c386ebfdd73ade3c8102ff04
comment:21 Changed 6 years ago by Vickie Lynch
Refs #5533 fix merge conflict
Changeset: 1f776b496bf10088d41133e58a45ec14d9cf349c
comment:22 Changed 6 years ago by Vickie Lynch
Revert "Refs #5533 try threads with fitting moved inside function"
This reverts commit 787a9cba192bf1e8c386ebfdd73ade3c8102ff04.
Changeset: da7ae980f13a5454373c2e4b5edb088b7cf6f79e
comment:23 Changed 6 years ago by Vickie Lynch
- Status changed from inprogress to verify
- Resolution set to duplicate
There is nothing to test in this ticket. All the changes have been reverted, primarily because they hang on the Mac build. In ticket 7651, a parallel SCD reduction workflow will be written to replace the current parallel Python scripts, and in ticket 9228, IntegratePeaksMD2 will be refactored, which may fix the problem with the Mac builds.
comment:25 Changed 6 years ago by Michael Reuter
- Status changed from verify to verifying
- Tester set to Michael Reuter
comment:26 Changed 6 years ago by Michael Reuter
- Status changed from verifying to closed
Verified changes have been backed out.
comment:27 Changed 5 years ago by Stuart Campbell
This ticket has been transferred to GitHub issue 6379