Ticket #5102 (closed: fixed)

Opened 8 years ago

Last modified 5 years ago

MDBox: better tracking of which boxes to split - Dead code, merging mistakes

Reported by: Janik Zikovsky Owned by: Alex Buts
Priority: blocker Milestone: Release 2.6
Component: Framework Keywords: Maintenance
Cc: Blocked By:
Blocking: Tester: Owen Arnold

Description (last modified by Janik Zikovsky) (diff)

Currently, you need occasional calls to "splitAllIfNeeded()" to check which MDBoxes have grown too large.

This gets slow if called too often (for very large workspaces) since it has to traverse all the boxes. E.g. in a 1.4 Gevent MDWorkspace it took 1.5 seconds or so on TOPAZ (32 cores).

The BoxController could track the MDBoxes as they get too big. Then the call to split could only look at these instead of having to parse everything.

Refs #5021.

Another possibility is to have the MDBox itself do the splitting as soon as it becomes too big. I tried this but it was tricky to do in a multithreaded environment because the MDBox needs to basically delete itself.

Change History

comment:1 Changed 8 years ago by Janik Zikovsky

  • Status changed from new to accepted

comment:2 Changed 8 years ago by Janik Zikovsky

Refs #5102: WIP of tracking too-big MDBoxes

Thinking of reorganizing IMDBox into a NON-templated class, and adding a MDBoxBase

Changeset: 47a8d240384f171ad1f3e312934c5c1e9280913d

comment:3 Changed 8 years ago by Janik Zikovsky

comment:4 Changed 8 years ago by Janik Zikovsky

Refs #5102: renamed IMDBox MDBoxBase. Sorry for the big commit...

Changeset: 7ff6dc04c6dbd2ad1d3fcf362d60994a5ebb14d9

comment:5 Changed 8 years ago by Janik Zikovsky

Refs #5102: WIP of tracking too-big MDBoxes

Thinking of reorganizing IMDBox into a NON-templated class, and adding a MDBoxBase

Changeset: 47a8d240384f171ad1f3e312934c5c1e9280913d

comment:6 Changed 8 years ago by Janik Zikovsky

comment:7 Changed 8 years ago by Janik Zikovsky

Refs #5102: renamed IMDBox MDBoxBase. Sorry for the big commit...

Changeset: 7ff6dc04c6dbd2ad1d3fcf362d60994a5ebb14d9

comment:8 Changed 8 years ago by Janik Zikovsky

Refs #5102 commenting out idea that will be left for later

Changeset: 1211627886fca758d26b18e7fd0da276ea359a88

comment:9 Changed 8 years ago by Janik Zikovsky

  • Milestone changed from Release 2.1 to Release 2.2

Will work on this for next release (maybe!)

comment:10 Changed 8 years ago by Janik Zikovsky

  • Status changed from accepted to assigned

comment:11 Changed 8 years ago by Janik Zikovsky

Refs #5102 commenting out idea that will be left for later

Changeset: 1211627886fca758d26b18e7fd0da276ea359a88

comment:12 Changed 8 years ago by Janik Zikovsky

  • Owner changed from Janik Zikovsky to Owen Arnold
  • Description modified (diff)

comment:13 Changed 8 years ago by Owen Arnold

  • Status changed from assigned to accepted

comment:14 Changed 8 years ago by Owen Arnold

refs #5102. Essential benchmark performance test.

Changeset: 41533d1ff61e36cd2d19b27c7bb0705bd4a17fe7

comment:15 Changed 8 years ago by Owen Arnold

refs #5102. Fix to avoid type narrowing.

Changeset: e95f57ff9f1d86989a68ea3caf9d7873bc8be67b

comment:16 Changed 8 years ago by Owen Arnold

I am a little concerned about the impact of the changes to addEvent on MDBox. I have therefore created some temporary Performance tests, which add 1E7 events to an MDBox. One version queries to box controller to determine if the box should be marked for spltting, one does not.

Direct comparison shows that the version that queries the box controller is very close to the speed of the original (single threaded). Infact, there is only a loss of 3% in the speed by including the query and then adding 1E7 events. This is encouraging, because in the new way of doing things, adding the events becomes more expensive, while splitting the boxes should become significantly faster.

comment:17 Changed 8 years ago by Owen Arnold

<?xml version="1.0" encoding="UTF-8" ?>

  • <testsuite name="MDEventsTest" tests="2" errors="0" failures="0" package="MDEventsTest" time="0.889">
  • <testcase classname="MDEventsTest.MDBoxTestPerformance" name="test_adding_events_without_splits_performance" line="1090" time="0.437" totalTime="0.437" CPUFraction="1">
  • <measurement> <name>CPUFraction</name> <value>1</value> </measurement> </testcase>
  • <testcase classname="MDEventsTest.MDBoxTestPerformance" name="test_adding_events_with_splits_performance" line="1101" time="0.452" totalTime="0.452" CPUFraction="1">
  • <measurement> <name>CPUFraction</name> <value>1</value> </measurement> </testcase> </testsuite>

Only 3% loss in the speed!

comment:18 Changed 8 years ago by Owen Arnold

Performance tests for new spliting method vs old splitting methodology show that it's running it's more than 2* faster, even when the events have been widely distributed.

comment:19 Changed 8 years ago by Owen Arnold

Just to back-up previous claim, here are the test results:

<?xml version="1.0" encoding="UTF-8" ?>

  • <testsuite name="MDEventsTest" tests="2" errors="0" failures="0" package="MDEventsTest" time="27.611">
  • <testcase classname="MDEventsTest.MDEventWorkspaceTestPerformance" name="test_splitting_performance_single_threaded" line="641" time="5.545" totalTime="11.546" CPUFraction="1">
  • <measurement> <name>CPUFraction</name> <value>1</value> </measurement> </testcase>
  • <testcase classname="MDEventsTest.MDEventWorkspaceTestPerformance" name="test_splitting_tracked_boxes_performance_single_threaded" line="646" time="2.545" totalTime="16.065" CPUFraction="1">
  • <measurement> <name>CPUFraction</name> <value>1</value> </measurement> </testcase> </testsuite>

comment:20 Changed 8 years ago by Owen Arnold

As illustrated below, spitting is achieved much much quicker via the newer tracking method than was possible before.

<?xml version="1.0" encoding="UTF-8" ?>

  • <testsuite name="MDEventsTest" tests="4" errors="0" failures="0" package="MDEventsTest" time="20.22">
  • <testcase classname="MDEventsTest.MDEventWorkspaceTestPerformance" name="test_splitting_performance_single_threaded_on_wide_distribution" line="727" time="5.31" totalTime="5.7" CPUFraction="1">
  • <measurement> <name>CPUFraction</name> <value>1</value> </measurement> </testcase>
  • <testcase classname="MDEventsTest.MDEventWorkspaceTestPerformance" name="test_splitting_tracked_boxes_performance_single_threaded_on_wide_distribution" line="732" time="0.3" totalTime="4.36" CPUFraction="1">
  • <measurement> <name>CPUFraction</name> <value>1</value> </measurement> </testcase>
  • <testcase classname="MDEventsTest.MDEventWorkspaceTestPerformance" name="test_splitting_performance_multi_threaded_on_wide_distribution" line="737" time="5.15" totalTime="5.721" CPUFraction="1">
  • <measurement> <name>CPUFraction</name> <value>1</value> </measurement> </testcase>
  • <testcase classname="MDEventsTest.MDEventWorkspaceTestPerformance" name="test_splitting_tracked_boxes_performance_multi_threaded_on_wide_distribution" line="745" time="0.306" totalTime="4.439" CPUFraction="1">
  • <measurement> <name>CPUFraction</name> <value>1</value> </measurement> </testcase> </testsuite>

comment:21 Changed 8 years ago by Owen Arnold

refs #5102. Working splitting with tests inc perf tests

Changeset: f0a09731a41944a49ce3434d2f61f4c72bdc75d1

comment:22 Changed 8 years ago by Owen Arnold

refs #5102. Wired up and proven with perf tests.

Changeset: d8ef8862e76d3740cbd2b8a159f617b68da599ad

comment:23 Changed 8 years ago by Owen Arnold

refs #5102. More Performance Tests.

Changeset: a1cc439a7d23b3c03a604ac3c46eda00c47f0ed8

comment:24 Changed 8 years ago by Owen Arnold

refs #5102, consistent results s and m threaded

Changeset: 91373b75e94e891a6263677f0b511bba2f7e9c3d

comment:25 Changed 8 years ago by Owen Arnold

refs #5102, work on copy construction with tracking.

Changeset: b3cf4edb652ac79a6f975fc78747704eae13d8a4

comment:26 Changed 8 years ago by Owen Arnold

refs #5102. Made more robust and added performance tests

Changeset: a3f929dabcb8b20177b1a5554fc75bb7c2846a1f

comment:27 Changed 8 years ago by Owen Arnold

refs #5102. Essential benchmark performance test.

Changeset: 41533d1ff61e36cd2d19b27c7bb0705bd4a17fe7

comment:28 Changed 8 years ago by Owen Arnold

refs #5102. Fix to avoid type narrowing.

Changeset: e95f57ff9f1d86989a68ea3caf9d7873bc8be67b

comment:29 Changed 8 years ago by Owen Arnold

refs #5102. Essential benchmark performance test.

Changeset: 41533d1ff61e36cd2d19b27c7bb0705bd4a17fe7

comment:30 Changed 8 years ago by Owen Arnold

refs #5102. Fix to avoid type narrowing.

Changeset: e95f57ff9f1d86989a68ea3caf9d7873bc8be67b

comment:31 Changed 8 years ago by Owen Arnold

refs #5102. splitAllIfNeeded now uses splitTrackedBoxes

Changeset: 26500144872ac8dc00947e84a8accceec0f29199

comment:32 Changed 8 years ago by Owen Arnold

refs #5102. Fix gcc error.

Changeset: f17321b934cf3d02746c676bf64f7cff96d5002c

comment:33 Changed 8 years ago by Owen Arnold

refs #5102. Fix gcc error.

Changeset: 2cf5b09c3e5bc47b3bcce7755800058fa9f7db6f

comment:34 Changed 8 years ago by Owen Arnold

refs #5102. Fix gcc error.

Changeset: c45e532cde0858c94abf912421e03d969337713e

comment:35 Changed 8 years ago by Owen Arnold

refs #5102. Fix gcc error.

Changeset: 843a8b1dd293c5dd34ad57c2061b16ed93f5e0d0

comment:36 Changed 8 years ago by Owen Arnold

refs #5102. Fix gcc error.

Changeset: faf3c348e2ec2a12e9daf844863322ab7cc96e73

comment:37 Changed 8 years ago by Owen Arnold

refs #5102. Fix gcc error.

Changeset: 0354f8919a0d2ab0debcfc391718c3e930c99cc6

comment:38 Changed 8 years ago by Owen Arnold

refs #5102. Fix gcc error.

Changeset: 677254ad81d1fb6bce561093e549b56d22261ce4

comment:39 Changed 8 years ago by Owen Arnold

refs #5102. Fix gcc error.

Changeset: 9c5b69d16b7fee7a49ab35971cb5c7be611d3bd2

comment:40 Changed 8 years ago by Owen Arnold

refs #5102. Fix gcc error.

Changeset: 4b7bd0c957f5e92cd554ce5d639621d71cf589c6

comment:41 Changed 8 years ago by Owen Arnold

refs #5102. keep splitting methods separate while tests are fixed..

Changeset: 47a585a17c5abd6fe9f568e1c812b4d091a3588d

comment:42 Changed 8 years ago by Owen Arnold

refs #5102. Null check required for testing purposes

Changeset: 8d273ff7137a6fe4c07166d9eb2415a350fa897d

comment:43 Changed 8 years ago by Owen Arnold

refs #5102. Fix gcc error

Changeset: 39fc039246dd53e986ee24b09403ea414b11f06c

comment:44 Changed 8 years ago by Owen Arnold

refs #5102 swap in splitAllIfNeeded

Changeset: 35113e2f38285b802f481ea1cad49ed138f6f370

comment:45 Changed 8 years ago by Owen Arnold

refs #5102. Check boxes exist before removing

Changeset: b5770d08cef05b1aff5a6c7aaf128198c8bb2ab4

comment:46 Changed 8 years ago by Owen Arnold

refs #5102. Fixes thread blocking issue on gcc.

Changeset: 994b255c3393bac4bbb0d277aa80663332089ae7

comment:47 Changed 8 years ago by Owen Arnold

refs #5102. Fix GCC warnings.

Changeset: f3300251366d27965c0a1f6249bbc74cfaaf1ec0

comment:48 Changed 8 years ago by Nick Draper

  • Milestone changed from Release 2.2 to Release 2.3

Moved at the end of release 2.2

comment:49 Changed 8 years ago by Nick Draper

  • Milestone changed from Release 2.3 to Release 2.4

moved to Release 2.4

comment:50 Changed 8 years ago by Nick Draper

  • Milestone changed from Release 2.4 to Release 2.5

Moved at the code freeze for release 2.4

comment:51 Changed 7 years ago by Nick Draper

  • Milestone changed from Release 2.5 to Release 2.6

Moved to r2.6 at the end of r2.5

comment:52 Changed 7 years ago by Owen Arnold

  • Status changed from accepted to verify
  • Resolution set to wontfix

These changes have ended up in master anyway as someone ended up merging these commits into their own branch and eventually master without my knowledge! If you want proof that this has been done, then run git branch --contains {commit_id} where commit_id is one of the SHA1 commits from above. There's no point continuing with this ticket given that it's purpose has been corrupted like this.

comment:53 Changed 7 years ago by Nick Draper

  • Status changed from verify to verifying
  • Tester set to Nick Draper

comment:54 Changed 7 years ago by Nick Draper

  • Status changed from verifying to reopened
  • Resolution wontfix deleted

Alex, It appears some of the changes here have been cherry picked from this branch and committed to master with other code. However they appear to not be used and are therefore dead code in our codebase. Dead code is a bad thing it adds a risk of side effects and bugs on other changes and makes the code harder to understand. Equally if the code is brought into use it could easily slip in without sufficient testing.

We need to talk about what to do to fix this.

The class BoxCtrlChangesList has been derived from BoxController. BoxController was the old system, whereby the controller was queried using ( BoxController::shouldSplitBoxes() ) to determine whether the boxes were ready to split. The changes that Janik and I worked on seem to have been "refactored" into BoxCtrlChangesList.

As far as I can tell, BoxCtrlChangesList is never instantiated. MDEventWorkspace still creates a  new BoxController, see line 41 of MDEventWorkspace.cpp for example. 

The author of this code has removed the use of the BoxCtrlChangesList by commenting it out of the code base. Every time you build mantid, you still build this code, however, the author has commented out the header includes in various places, such as MDEventWorkspace.h line 8.

comment:55 Changed 7 years ago by Nick Draper

  • Status changed from reopened to assigned
  • Priority changed from major to blocker
  • Keywords Maintenance added
  • Owner changed from Owen Arnold to Alex Buts

comment:56 Changed 7 years ago by Nick Draper

  • Summary changed from MDBox: better tracking of which boxes to split to MDBox: better tracking of which boxes to split - Dead code, merging mistakes

comment:57 Changed 7 years ago by Alex Buts

  • Status changed from assigned to accepted

comment:58 Changed 7 years ago by Alex Buts

refs #5102 This should remove the box-list related code

Changeset: dd9ccd2c5eda3479369e41a1a76cc59b37a87749

comment:59 Changed 7 years ago by Alex Buts

refs #5102 This should remove the box-list related code

Changeset: dd9ccd2c5eda3479369e41a1a76cc59b37a87749

comment:60 Changed 7 years ago by Alex Buts

  • Status changed from accepted to verify
  • Resolution set to fixed

I have removed two classes responsible for the box tracking namely BoxCntrlChangesList and MDBoxToChange and all method responsible for adding and tracing boxes namely addAndTrace and addAndBuild on MDBoxes and its interfaces. The code was not used so no changes to the unit tests or Mantid application was introduced.

The previous code was left in the branch related to ticket #7100.

The tester should think if anything else is related to box tracking or any traces of the classes/methods responsible for tracking is present in the codebase. I do not see any.

Then it should be easy to merge with master.

comment:61 Changed 7 years ago by Owen Arnold

  • Status changed from verify to verifying
  • Tester changed from Nick Draper to Owen Arnold

comment:62 Changed 7 years ago by Alex Buts

refs #5102 This should remove the box-list related code

Changeset: dd9ccd2c5eda3479369e41a1a76cc59b37a87749

comment:63 Changed 7 years ago by Owen Arnold

  • Status changed from verifying to closed

comment:64 Changed 7 years ago by Nick Draper

  • Component changed from Mantid to Framework

comment:65 Changed 5 years ago by Stuart Campbell

This ticket has been transferred to github issue 5948

Note: See TracTickets for help on using tickets.