Ticket #3003 (closed: fixed)

Opened 9 years ago

Last modified 5 years ago

Fix curious error from ParComponentFactory when run in parallel.

Reported by: Janik Zikovsky
Owned by: Janik Zikovsky
Priority: blocker
Milestone: Iteration 29
Component: Mantid
Keywords:
Cc:
Blocked By:
Blocking:
Tester: Martyn Gigg

Description

As of revision 11377, GetDetectorOffsetsTestPerformance fails, but only when run in parallel. The error occurs here:

Thread [6] (Suspended: Breakpoint hit.)	
	21 Mantid::Geometry::Component::Component() Component.cpp:20 0x00007ffff5449f50	
	20 Mantid::Geometry::CompAssembly::CompAssembly() CompAssembly.cpp:39 0x00007ffff5443184	
	19 Mantid::Geometry::Instrument::Instrument() Instrument.cpp:43 0x00007ffff546187c	
	18 Mantid::Geometry::ParComponentFactory::createInstrument() ParComponentFactory.cpp:138 0x00007ffff54810eb	
	17 Mantid::API::MatrixWorkspace::getInstrument() MatrixWorkspace.cpp:530 0x00007ffff5a010c3	
	16 Mantid::API::MatrixWorkspace::getDetector() MatrixWorkspace.cpp:458 0x00007ffff5a0054b	
	15 Mantid::API::IFunctionMW::setMatrixWorkspace() IFunctionMW.cpp:278 0x00007ffff59e99c8	
	14 Mantid::API::CompositeFunctionMW::setMatrixWorkspace() CompositeFunctionMW.cpp:147 0x00007ffff59b8486	
	13 Mantid::API::IFunctionMW::setWorkspace() IFunctionMW.cpp:174 0x00007ffff59e945f	
	12 Mantid::API::CompositeFunctionMW::setWorkspace() CompositeFunctionMW.cpp:153 0x00007ffff59b853e	
	11 Mantid::CurveFitting::GenericFit::exec() GenericFit.cpp:103 0x00007ffff7210ce9	
	10 Mantid::API::Algorithm::execute() Algorithm.cpp:280 0x00007ffff5966073	
	9 Mantid::CurveFitting::Fit::exec() Fit.cpp:128 0x00007ffff71dc504	
	8 Mantid::API::Algorithm::execute() Algorithm.cpp:280 0x00007ffff5966073	
	7 Mantid::API::Algorithm::executeAsSubAlg() Algorithm.cpp:375 0x00007ffff596700a	
	6 Mantid::Algorithms::GetDetectorOffsets::fitSpectra() GetDetectorOffsets.cpp:148 0x00007ffff789c712	
	5 Mantid::Algorithms::GetDetectorOffsets::exec() GetDetectorOffsets.cpp:94 0x00007ffff789d578	
	4 gomp_thread_start() team.c:115 0x00007fffef786fd2	
	3 start_thread() pthread_create.c:304 0x00007fffef353971	
	2 clone() clone.S:112 0x00007fffef0af92d	
	1 <symbol is not available> 0x0000000000000000	

The failure seems to happen because, for some odd reason, a ParameterMap is not passed as a parameter!

ParComponentFactory has NO TEST! Add a test and test for this issue.

Change History

comment:1 Changed 9 years ago by Janik Zikovsky

  • Status changed from new to accepted

comment:2 Changed 9 years ago by Janik Zikovsky

(In [11379]) Refs #3003: Added ParComponentFactory basic test. Added a few tests in the hope of tracking down my parallel bug; no luck.

comment:3 Changed 9 years ago by Janik Zikovsky

(In [11380]) Refs #3003: Segfault on ubuntu-10.04 and RHEL6. The rest pass!?

comment:4 Changed 9 years ago by Janik Zikovsky

(In [11381]) Refs #3003: Made the ParameterMap NOT mutable in MatrixWorkspace. Mutable is dangerous!!! Let's see if this fixes the segfault issue.

comment:5 Changed 9 years ago by Janik Zikovsky

  • Status changed from accepted to verify
  • Resolution set to fixed

Looks like that fixed GetDetectorOffsets()! What a weird bug.

mutable is evil! Don't use a copy-on-write pointer when you don't have to!
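
For anyone reading this later, here is a minimal sketch of the general hazard, not the actual Mantid code. The names ParamMap, LazyHolder, EagerHolder and sumFromThreads are hypothetical stand-ins invented for illustration, and the exact mechanism inside MatrixWorkspace may well have been different. The idea is that a const getter which writes to a mutable member looks read-only to callers but is not safe to call from several threads at once, whereas a getter that only reads a member built up front is.

```cpp
#include <cassert>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical stand-in for a parameter map; NOT the Mantid class.
struct ParamMap {
  int value = 42;
};

// Risky pattern: a const getter that writes to a mutable member.
// Two threads calling get() at the same time would race on m_map,
// so this class is only safe when used from a single thread.
class LazyHolder {
public:
  const ParamMap &get() const {
    if (!m_map)                             // unsynchronised read
      m_map = std::make_shared<ParamMap>(); // unsynchronised write
    return *m_map;
  }

private:
  mutable std::shared_ptr<ParamMap> m_map;
};

// Safer pattern: the member is created in the constructor and the
// const getter only reads it, so concurrent calls do not modify state.
class EagerHolder {
public:
  EagerHolder() : m_map(std::make_shared<ParamMap>()) {}
  const ParamMap &get() const { return *m_map; }

private:
  std::shared_ptr<ParamMap> m_map; // no 'mutable' needed
};

// Reads the holder from nThreads threads and sums what each one saw.
// Each thread writes to its own slot in 'results', so there is no sharing.
int sumFromThreads(const EagerHolder &holder, int nThreads) {
  assert(nThreads > 0);
  std::vector<int> results(nThreads, 0);
  std::vector<std::thread> threads;
  for (int i = 0; i < nThreads; ++i)
    threads.emplace_back(
        [&holder, &results, i] { results[i] = holder.get().value; });
  for (auto &t : threads)
    t.join();
  int total = 0;
  for (int r : results)
    total += r;
  return total;
}
```

Calling sumFromThreads(holder, 8) on an EagerHolder only reads shared state from the worker threads. Doing the same with LazyHolder would be a data race on the first call, which is the kind of failure that only shows up when the code runs in parallel.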

comment:6 Changed 9 years ago by Janik Zikovsky

(In [11403]) Refs #3003: Put back a parameter of GetDetectorOffsets to help compatibility with old scripts.

comment:7 Changed 9 years ago by Martyn Gigg

  • Status changed from verify to verifying
  • Tester set to Martyn Gigg

comment:8 Changed 9 years ago by Martyn Gigg

  • Status changed from verifying to closed

I've run all of the algorithms tests, including the performance tests, several times using -j8 and observed no crash.

comment:9 Changed 5 years ago by Stuart Campbell

This ticket has been transferred to github issue 3850
