Ticket #2883 (closed: fixed)

Opened 9 years ago

Last modified 5 years ago

Seg fault for tight-looped quick algorithm when executing another

Reported by: Martyn Gigg
Owned by: Martyn Gigg
Priority: major
Milestone: Release 2.3
Component: Mantid
Keywords:
Cc:
Blocked By:
Blocking:
Tester: Nick Draper

Description

Running

while True:
   CreateWorkspace("test",1,1,1)

from the script interpreter and then running a separate algorithm from the GUI causes a seg fault. However, the same loop with a short sleep on each iteration does not:

import time
while True:
   CreateWorkspace("test",1,1,1)
   time.sleep(0.1)

Change History

comment:1 Changed 9 years ago by Martyn Gigg

(In [11166]) Refs #2883. Help with possible causes of script crashes by ensuring the algorithm object itself executes asynchronously from Python rather than calling back to C++. Still need to wire up algorithm monitor.
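
As a rough illustration of the idea only (this is not the Mantid change itself), the pattern of starting the work on a worker thread and having the caller wait on it can be sketched in plain Python. The names `run_algorithm` and `execute_async` are hypothetical stand-ins, not Mantid functions.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_algorithm(name):
    """Hypothetical stand-in for a quick algorithm's execute call."""
    time.sleep(0.01)  # pretend to do a small amount of work
    return f"{name} finished"


def execute_async(pool, name, poll_interval=0.005):
    """Start the work on a worker thread, then poll until it completes."""
    future = pool.submit(run_algorithm, name)
    while not future.done():
        # The calling thread sleeps between polls instead of blocking
        # inside the worker's code path.
        time.sleep(poll_interval)
    return future.result()


with ThreadPoolExecutor(max_workers=2) as pool:
    result = execute_async(pool, "CreateWorkspace")
```

Here the caller only ever touches the future, so the work itself runs entirely on the pool thread.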

comment:2 Changed 9 years ago by Nick Draper

  • Milestone changed from Iteration 29 to Iteration 30

"New" tickets moved at the code freeze of iteration 29

comment:3 Changed 9 years ago by Martyn Gigg

  • Owner set to Martyn Gigg
  • Status changed from new to accepted

comment:4 Changed 9 years ago by Martyn Gigg

  • Status changed from accepted to verify
  • Resolution set to fixed

comment:5 Changed 9 years ago by Janik Zikovsky

  • Status changed from verify to verifying
  • Tester set to Janik Zikovsky

comment:6 Changed 9 years ago by Janik Zikovsky

  • Status changed from verifying to reopened
  • Resolution fixed deleted

Failed with a segfault on Ubuntu 11.04, debug build. Stack trace:

Thread [13] (Suspended: Signal 'SIGSEGV' received. Description: Segmentation fault.)	
	19 memmove() memmove.c:64 0x00007fffed27acb7	
	18 std::__copy_move<false, true, std::random_access_iterator_tag>::__copy_m<int>() stl_algobase.h:386 0x0000000000789eec	
	17 std::__copy_move_a<false, int const*, int*>() stl_algobase.h:404 0x00007ffff78b03b3	
	16 std::__copy_move_a2<false, __gnu_cxx::__normal_iterator<int const*, std::vector<int> >, __gnu_cxx::__normal_iterator<int*, std::vector<int> > >() stl_algobase.h:442 0x00007ffff78af1d9	
	15 std::copy<__gnu_cxx::__normal_iterator<int const*, std::vector<int> >, __gnu_cxx::__normal_iterator<int*, std::vector<int> > >() stl_algobase.h:474 0x00007ffff78ad112	
	14 std::vector<int, std::allocator<int> >::operator=() vector.tcc:176 0x00007ffff78ab012	
	13 Mantid::DataHandling::LoadRawHelper::runLoadInstrument() LoadRawHelper.cpp:606 0x00007fffd50226ed	
	12 Mantid::DataHandling::LoadRaw3::exec() LoadRaw3.cpp:144 0x00007fffd50148f2	
	11 Mantid::API::Algorithm::execute() Algorithm.cpp:300 0x00007ffff6d15b4f	
	10 Mantid::DataHandling::Load::exec() Load.cpp:274 0x00007fffd4f07f16	
	9 Mantid::API::Algorithm::execute() Algorithm.cpp:300 0x00007ffff6d15b4f	
	8 Mantid::API::Algorithm::executeAsyncImpl() Algorithm.cpp:1214 0x00007ffff6d1b1e2	
	7 Mantid::API::AlgorithmProxy::executeAsyncImpl() AlgorithmProxy.cpp:215 0x00007ffff6d487bb	
	6 Poco::ActiveRunnable<bool, Poco::Void, Mantid::API::AlgorithmProxy>::run() ActiveRunnable.h:85 0x00007ffff6d4aa40	
	5 Poco::PooledThread::run() ThreadPool.cpp:203 0x00007fffef054185	
	4 Poco::ThreadImpl::runnableEntry() Thread_POSIX.cpp:345 0x00007fffef050cbd	
	3 start_thread() pthread_create.c:304 0x00007fffecfded8c	
	2 clone() clone.S:112 0x00007fffed2dc04d	
	1 <symbol is not available> 0x0000000000000000	

comment:7 Changed 9 years ago by Martyn Gigg

  • Milestone changed from Iteration 30 to Iteration 31

This is a rare enough scenario that I don't think it's worth looking into for this release. This under-the-hood stuff will change slightly with the new Python stuff anyway.

comment:8 Changed 9 years ago by Nick Draper

  • Milestone changed from Iteration 32 to Iteration 33

Moved to iteration 33 at iteration 32 code freeze

comment:9 Changed 8 years ago by Nick Draper

  • Milestone changed from Release 2.1 to Release 2.2

Moved at end of release 2.1

comment:10 Changed 8 years ago by Nick Draper

  • Milestone changed from Release 2.2 to Release 2.3

Moved at the end of release 2.2

comment:11 Changed 8 years ago by Martyn Gigg

  • Status changed from reopened to accepted

comment:12 Changed 8 years ago by Martyn Gigg

  • Status changed from accepted to verify
  • Resolution set to fixed

I think this has been fixed, most probably by all of the locking work that has been done on workspaces.

This loop didn't crash it, for example:

for i in range(10000):
   test = CreateWorkspace([1],[1],[1])
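
For readers unfamiliar with why locking would help here, the following is a minimal sketch in plain Python, assuming a shared store of named workspaces that one thread replaces in a tight loop while another thread copies data out of it. `WorkspaceRegistry` and its methods are hypothetical names for illustration and do not reflect Mantid's actual classes.

```python
import threading


class WorkspaceRegistry:
    """Hypothetical stand-in for a shared store of named workspaces."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = {}

    def add_or_replace(self, name, data):
        # Writers take the lock so a replacement is never seen half-done.
        with self._lock:
            self._items[name] = list(data)

    def retrieve(self, name):
        # The copy is made while the lock is held, so the data cannot be
        # replaced underneath the reader part-way through the copy.
        with self._lock:
            return list(self._items[name])


registry = WorkspaceRegistry()
registry.add_or_replace("test", [1])


def tight_loop(count):
    """Repeatedly replace the same workspace, like the script loop."""
    for _ in range(count):
        registry.add_or_replace("test", [1])


def reader(count, out):
    """Repeatedly copy the workspace data, like a second algorithm."""
    for _ in range(count):
        out.append(registry.retrieve("test"))


results = []
threads = [
    threading.Thread(target=tight_loop, args=(10000,)),
    threading.Thread(target=reader, args=(10000, results)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Both the replacement and the copy go through the same lock, which is the property that matters when the underlying data is a C++ container being assigned from another thread.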

comment:13 Changed 8 years ago by Nick Draper

  • Status changed from verify to verifying
  • Tester changed from Janik Zikovsky to Nick Draper

comment:14 Changed 8 years ago by Nick Draper

  • Status changed from verifying to closed

Running two things simultaneously passed.

However, I did manage to crash Mantid by running two tight loops and another algorithm. A new ticket, #6045, has been opened for this.

comment:15 Changed 5 years ago by Stuart Campbell

This ticket has been transferred to GitHub issue 3730.
