Ticket #3652 (closed: fixed)

Opened 9 years ago

Last modified 5 years ago

MergeMDEWTest is sometimes hanging on Mac builds

Reported by: Russell Taylor
Owned by: Janik Zikovsky
Priority: major
Milestone: Iteration 30
Component: Mantid
Keywords:
Cc:
Blocked By:
Blocking:
Tester: Russell Taylor

Description

...and causing them to time out.

The oldest occurrence I could find was on August 29, though on August 26 there were a couple of similar timeouts caused by DiskMRU.

Feel free to enlist me and my mac if you can't spot the problem.

Change History

comment:1 Changed 9 years ago by Russell Taylor

Some light reading:

(gdb) bt
#0  0x0000000101ee17b4 in boost::multi_index::const_mem_fun<Mantid::Kernel::FreeBlock, unsigned long long, &(Mantid::Kernel::FreeBlock::getSize() const)>::operator() (this=0x107910af0, x=@0x1067d43c0) at mem_fun.hpp:63
#1  0x0000000101ee6864 in boost::multi_index::detail::ordered_index_lower_bound<boost::multi_index::detail::ordered_index_node<boost::multi_index::detail::index_node_base<Mantid::Kernel::FreeBlock, std::allocator<Mantid::Kernel::FreeBlock> > >, boost::multi_index::const_mem_fun<Mantid::Kernel::FreeBlock, unsigned long long, &(Mantid::Kernel::FreeBlock::getSize() const)>, unsigned long long, std::less<unsigned long long> > (top=0x1067d43c0, y=0x1067d43c0, key=@0x107910af0, x=@0x7fff5fbfa1e8, comp=@0x107910af1) at ord_index_ops.hpp:86
#2  0x0000000101ee2d36 in boost::multi_index::detail::ordered_index<boost::multi_index::const_mem_fun<Mantid::Kernel::FreeBlock, unsigned long long, &(Mantid::Kernel::FreeBlock::getSize() const)>, std::less<unsigned long long>, boost::multi_index::detail::nth_layer<2, Mantid::Kernel::FreeBlock, boost::multi_index::indexed_by<boost::multi_index::ordered_non_unique<boost::multi_index::const_mem_fun<Mantid::Kernel::FreeBlock, unsigned long long, &(Mantid::Kernel::FreeBlock::getFilePosition() const)>, boost::mpl::na, boost::mpl::na>, boost::multi_index::ordered_non_unique<boost::multi_index::const_mem_fun<Mantid::Kernel::FreeBlock, unsigned long long, &(Mantid::Kernel::FreeBlock::getSize() const)>, boost::mpl::na, boost::mpl::na>, boost::mpl::na, boost::mpl::na, boost::mpl::na, boost::mpl::na, boost::mpl::na, boost::mpl::na, boost::mpl::na, boost::mpl::na, boost::mpl::na, boost::mpl::na, boost::mpl::na, boost::mpl::na, boost::mpl::na, boost::mpl::na, boost::mpl::na, boost::mpl::na, boost::mpl::na, boost::mpl::na>, std::allocator<Mantid::Kernel::FreeBlock> >, boost::mpl::vector0<boost::mpl::na>, boost::multi_index::detail::ordered_non_unique_tag>::lower_bound<unsigned long long> (this=0x107910af0, x=@0x7fff5fbfa1e8) at ordered_index.hpp:462
#3  0x0000000101ed61cd in Mantid::Kernel::DiskMRU::allocate (this=0x107910928, newSize=36) at /Users/tr9/Mantid/Code/Mantid/Framework/Kernel/src/DiskMRU.cpp:406
#4  0x0000000101ed636b in Mantid::Kernel::DiskMRU::relocate (this=0x107910928, oldPos=0, oldSize=12, newSize=36) at /Users/tr9/Mantid/Code/Mantid/Framework/Kernel/src/DiskMRU.cpp:452
#5  0x000000010083b2c3 in Mantid::MDEvents::MDBox<Mantid::MDEvents::MDLeanEvent<3ul>, 3ul>::save (this=0x1067f4010) at MDBox.cpp:271
#6  0x0000000100a1c995 in Mantid::MDEvents::SaveMDEW::doSave<Mantid::MDEvents::MDLeanEvent<3ul>, 3ul> (this=0x1079118f0, ws={px = 0x1067d2680, pn = {pi_ = 0x107936660}}) at /Users/tr9/Mantid/Code/Mantid/Framework/MDEvents/src/SaveMDEW.cpp:240
#7  0x00000001009ea75e in Mantid::MDEvents::SaveMDEW::exec (this=0x1079118f0) at /Users/tr9/Mantid/Code/Mantid/Framework/MDEvents/src/SaveMDEW.cpp:386
#8  0x0000000102c85051 in Mantid::API::Algorithm::execute (this=0x1079118f0) at /Users/tr9/Mantid/Code/Mantid/Framework/API/src/Algorithm.cpp:300
#9  0x0000000102c86bc3 in Mantid::API::Algorithm::executeAsSubAlg (this=0x1079118f0) at /Users/tr9/Mantid/Code/Mantid/Framework/API/src/Algorithm.cpp:395
#10 0x00000001009a9b07 in Mantid::MDEvents::MergeMDEW::finalizeOutput<Mantid::MDEvents::MDLeanEvent<3ul>, 3ul> (this=0x7fff5fbfd930, outWS={px = 0x1067d2680, pn = {pi_ = 0x107936660}}) at /Users/tr9/Mantid/Code/Mantid/Framework/MDEvents/src/MergeMDEW.cpp:646
#11 0x0000000100984961 in Mantid::MDEvents::MergeMDEW::doExecByCloning<Mantid::MDEvents::MDLeanEvent<3ul>, 3ul> (this=0x7fff5fbfd930, ws={px = 0x10676be60, pn = {pi_ = 0x1079bdc60}}) at /Users/tr9/Mantid/Code/Mantid/Framework/MDEvents/src/MergeMDEW.cpp:623
#12 0x0000000100978983 in Mantid::MDEvents::MergeMDEW::exec (this=0x7fff5fbfd930) at /Users/tr9/Mantid/Code/Mantid/Framework/MDEvents/src/MergeMDEW.cpp:673
#13 0x0000000102c85051 in Mantid::API::Algorithm::execute (this=0x7fff5fbfd930) at /Users/tr9/Mantid/Code/Mantid/Framework/API/src/Algorithm.cpp:300
#14 0x00000001000596c4 in MergeMDEWTest::do_test_exec (this=0x1002eca50, OutputFilename={_M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x107a31c38 "MergeMDEWTest_OutputWS.nxs"}, static npos = 18446744073709551615}) at MergeMDEWTest.h:64
#15 0x0000000100058680 in MergeMDEWTest::test_exec_fileBacked (this=0x1002eca50) at MergeMDEWTest.h:38
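
For anyone reading the trace: frames #2 and #3 show DiskMRU::allocate calling lower_bound on the index of free blocks ordered by getSize(), which looks like a search for the smallest free block that can hold newSize bytes. The sketch below illustrates that kind of lookup only. It is not the DiskMRU code, and the FreeSpaceMap class, its methods, and the splitting of an oversized block are all assumptions made for the example.

```python
import bisect


class FreeSpaceMap:
    """Toy free-space map (hypothetical, not Mantid's DiskMRU)."""

    def __init__(self):
        # Sorted list of (size, file_position), i.e. ordered by size first.
        self._by_size = []

    def add(self, position, size):
        """Record a free block of `size` bytes starting at `position`."""
        bisect.insort(self._by_size, (size, position))

    def allocate(self, new_size):
        """Return the position of the smallest block that fits, or None."""
        # Equivalent of lower_bound on the size index: the first entry
        # whose size is >= new_size (positions are never negative).
        i = bisect.bisect_left(self._by_size, (new_size, -1))
        if i == len(self._by_size):
            return None  # nothing big enough; caller would append to the file
        size, position = self._by_size.pop(i)
        if size > new_size:
            # Assumption: the unused tail goes back into the free list.
            self.add(position + new_size, size - new_size)
        return position


free = FreeSpaceMap()
free.add(0, 12)     # too small for a 36-byte request
free.add(100, 50)
free.add(200, 36)   # exact fit
print(free.allocate(36))  # prints 200
```

The trace itself only shows where the process was sitting when it was interrupted, not why the lookup never returned.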

comment:2 Changed 9 years ago by Russell Taylor

source /opt/intel/bin/iccvars.sh intel64

Then add this to DYLD_LIBRARY_PATH: /Users/tr9/Mantid/Code/Third_Party/lib/mac64

comment:3 Changed 9 years ago by Janik Zikovsky

export DYLD_LIBRARY_PATH=$DYLD_LIBRARY_PATH:/Users/tr9/Mantid/Code/Third_Party/lib/mac64

comment:4 Changed 9 years ago by Janik Zikovsky

  • Status changed from new to accepted

comment:5 Changed 9 years ago by Janik Zikovsky

In [14369]:

Refs #3652: Test relating to boost::multi_index and hanging on mac

comment:6 Changed 9 years ago by Janik Zikovsky

In [14370]:

Refs #3652: Test relating to boost::multi_index and hanging on mac

comment:7 Changed 9 years ago by Janik Zikovsky

Ran the same test (MergeMDEWTest) on an Ubuntu 11.04 machine 96000 times overnight and did not see a single hang.

Last edited 9 years ago by Janik Zikovsky
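
A minimal sketch of how a repeat run like this could be scripted so that a hang is counted instead of blocking the whole loop. The run_repeatedly helper and its parameters are made up for illustration, and the command in the usage note is a placeholder rather than the actual test invocation used here.

```python
import subprocess
import sys


def run_repeatedly(cmd, runs, timeout_s):
    """Run `cmd` `runs` times and return (hangs, failures).

    A run that exceeds `timeout_s` seconds is killed and counted as a hang;
    a run that exits with a non-zero status is counted as a failure.
    """
    hangs = 0
    failures = 0
    for _ in range(runs):
        try:
            result = subprocess.run(
                cmd,
                timeout=timeout_s,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising this.
            hangs += 1
            continue
        if result.returncode != 0:
            failures += 1
    return hangs, failures


# Stand-in command so the sketch runs anywhere; swap in the real test binary.
print(run_repeatedly([sys.executable, "-c", "pass"], runs=3, timeout_s=30))
```

With the real test executable in place of the stand-in command, a non-zero first number in the result would mean at least one run hit the timeout.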

comment:8 Changed 9 years ago by Russell Taylor

comment:9 Changed 9 years ago by Janik Zikovsky

In [14386]:

Refs #3557, Refs #3652: This should fix the Mac hanging bug.

comment:23 Changed 9 years ago by Janik Zikovsky

In [14389]:

Refs #3652: Better fix to DiskMRU

comment:24 Changed 9 years ago by Janik Zikovsky

  • Status changed from accepted to verify
  • Resolution set to fixed

I ran the same test 400+ times on the Mac mini with no hang, so this issue seems to be resolved fully.

comment:25 Changed 9 years ago by Russell Taylor

  • Status changed from verify to verifying
  • Tester set to Russell Taylor

comment:26 Changed 9 years ago by Russell Taylor

  • Status changed from verifying to closed

No hangs have been seen on the buildserver since this ticket was marked as fixed.

comment:27 Changed 5 years ago by Stuart Campbell

This ticket has been transferred to GitHub issue 4499.
