Ticket #5102 (closed: fixed)
MDBox: better tracking of which boxes to split - Dead code, merging mistakes
Reported by: | Janik Zikovsky | Owned by: | Alex Buts |
---|---|---|---|
Priority: | blocker | Milestone: | Release 2.6 |
Component: | Framework | Keywords: | Maintenance |
Cc: | Blocked By: | ||
Blocking: | Tester: | Owen Arnold |
Description (last modified by Janik Zikovsky) (diff)
Currently, you need occasional calls to "splitAllIfNeeded()" to check which MDBoxes have grown too large.
This gets slow if called too often (for very large workspaces) since it has to traverse all the boxes. E.g. in a 1.4 Gevent MDWorkspace it took 1.5 seconds or so on TOPAZ (32 cores).
The BoxController could track the MDBoxes as they get too big. Then the call to split could only look at these instead of having to parse everything.
Refs #5021.
Another possibility is to have the MDBox itself do the splitting as soon as it becomes too big. I tried this but it was tricky to do in a multithreaded environment because the MDBox needs to basically delete itself.
Change History
comment:2 Changed 8 years ago by Janik Zikovsky
Refs #5102: WIP of tracking too-big MDBoxes
Thinking of reorganizing IMDBox into a NON-templated class, and adding a MDBoxBase
Changeset: 47a8d240384f171ad1f3e312934c5c1e9280913d
comment:3 Changed 8 years ago by Janik Zikovsky
Refs #5102: fix test
Changeset: bcf100f6a17ab8d56d5649ada918b80d832ad489
comment:4 Changed 8 years ago by Janik Zikovsky
Refs #5102: renamed IMDBox MDBoxBase. Sorry for the big commit...
Changeset: 7ff6dc04c6dbd2ad1d3fcf362d60994a5ebb14d9
comment:5 Changed 8 years ago by Janik Zikovsky
Refs #5102: WIP of tracking too-big MDBoxes
Thinking of reorganizing IMDBox into a NON-templated class, and adding a MDBoxBase
Changeset: 47a8d240384f171ad1f3e312934c5c1e9280913d
comment:6 Changed 8 years ago by Janik Zikovsky
Refs #5102: fix test
Changeset: bcf100f6a17ab8d56d5649ada918b80d832ad489
comment:7 Changed 8 years ago by Janik Zikovsky
Refs #5102: renamed IMDBox MDBoxBase. Sorry for the big commit...
Changeset: 7ff6dc04c6dbd2ad1d3fcf362d60994a5ebb14d9
comment:8 Changed 8 years ago by Janik Zikovsky
Refs #5102 commenting out idea that will be left for later
Changeset: 1211627886fca758d26b18e7fd0da276ea359a88
comment:9 Changed 8 years ago by Janik Zikovsky
- Milestone changed from Release 2.1 to Release 2.2
Will work on this for next release (maybe!)
comment:11 Changed 8 years ago by Janik Zikovsky
Refs #5102 commenting out idea that will be left for later
Changeset: 1211627886fca758d26b18e7fd0da276ea359a88
comment:12 Changed 8 years ago by Janik Zikovsky
- Owner changed from Janik Zikovsky to Owen Arnold
- Description modified (diff)
comment:14 Changed 8 years ago by Owen Arnold
refs #5102. Essential benchmark performance test.
Changeset: 41533d1ff61e36cd2d19b27c7bb0705bd4a17fe7
comment:15 Changed 8 years ago by Owen Arnold
refs #5102. Fix to avoid type narrowing.
Changeset: e95f57ff9f1d86989a68ea3caf9d7873bc8be67b
comment:16 Changed 8 years ago by Owen Arnold
I am a little concerned about the impact of the changes to addEvent on MDBox. I have therefore created some temporary Performance tests, which add 1E7 events to an MDBox. One version queries to box controller to determine if the box should be marked for spltting, one does not.
Direct comparison shows that the version that queries the box controller is very close to the speed of the original (single threaded). Infact, there is only a loss of 3% in the speed by including the query and then adding 1E7 events. This is encouraging, because in the new way of doing things, adding the events becomes more expensive, while splitting the boxes should become significantly faster.
comment:17 Changed 8 years ago by Owen Arnold
<?xml version="1.0" encoding="UTF-8" ?>
- <testsuite name="MDEventsTest" tests="2" errors="0" failures="0" package="MDEventsTest" time="0.889">
- <testcase classname="MDEventsTest.MDBoxTestPerformance" name="test_adding_events_without_splits_performance" line="1090" time="0.437" totalTime="0.437" CPUFraction="1">
- <measurement> <name>CPUFraction</name> <value>1</value> </measurement> </testcase>
- <testcase classname="MDEventsTest.MDBoxTestPerformance" name="test_adding_events_with_splits_performance" line="1101" time="0.452" totalTime="0.452" CPUFraction="1">
- <measurement> <name>CPUFraction</name> <value>1</value> </measurement> </testcase> </testsuite>
Only 3% loss in the speed!
comment:18 Changed 8 years ago by Owen Arnold
Performance tests for new spliting method vs old splitting methodology show that it's running it's more than 2* faster, even when the events have been widely distributed.
comment:19 Changed 8 years ago by Owen Arnold
Just to back-up previous claim, here are the test results:
<?xml version="1.0" encoding="UTF-8" ?>
- <testsuite name="MDEventsTest" tests="2" errors="0" failures="0" package="MDEventsTest" time="27.611">
- <testcase classname="MDEventsTest.MDEventWorkspaceTestPerformance" name="test_splitting_performance_single_threaded" line="641" time="5.545" totalTime="11.546" CPUFraction="1">
- <measurement> <name>CPUFraction</name> <value>1</value> </measurement> </testcase>
- <testcase classname="MDEventsTest.MDEventWorkspaceTestPerformance" name="test_splitting_tracked_boxes_performance_single_threaded" line="646" time="2.545" totalTime="16.065" CPUFraction="1">
- <measurement> <name>CPUFraction</name> <value>1</value> </measurement> </testcase> </testsuite>
comment:20 Changed 8 years ago by Owen Arnold
As illustrated below, spitting is achieved much much quicker via the newer tracking method than was possible before.
<?xml version="1.0" encoding="UTF-8" ?>
- <testsuite name="MDEventsTest" tests="4" errors="0" failures="0" package="MDEventsTest" time="20.22">
- <testcase classname="MDEventsTest.MDEventWorkspaceTestPerformance" name="test_splitting_performance_single_threaded_on_wide_distribution" line="727" time="5.31" totalTime="5.7" CPUFraction="1">
- <measurement> <name>CPUFraction</name> <value>1</value> </measurement> </testcase>
- <testcase classname="MDEventsTest.MDEventWorkspaceTestPerformance" name="test_splitting_tracked_boxes_performance_single_threaded_on_wide_distribution" line="732" time="0.3" totalTime="4.36" CPUFraction="1">
- <measurement> <name>CPUFraction</name> <value>1</value> </measurement> </testcase>
- <testcase classname="MDEventsTest.MDEventWorkspaceTestPerformance" name="test_splitting_performance_multi_threaded_on_wide_distribution" line="737" time="5.15" totalTime="5.721" CPUFraction="1">
- <measurement> <name>CPUFraction</name> <value>1</value> </measurement> </testcase>
- <testcase classname="MDEventsTest.MDEventWorkspaceTestPerformance" name="test_splitting_tracked_boxes_performance_multi_threaded_on_wide_distribution" line="745" time="0.306" totalTime="4.439" CPUFraction="1">
- <measurement> <name>CPUFraction</name> <value>1</value> </measurement> </testcase> </testsuite>
comment:21 Changed 8 years ago by Owen Arnold
refs #5102. Working splitting with tests inc perf tests
Changeset: f0a09731a41944a49ce3434d2f61f4c72bdc75d1
comment:22 Changed 8 years ago by Owen Arnold
refs #5102. Wired up and proven with perf tests.
Changeset: d8ef8862e76d3740cbd2b8a159f617b68da599ad
comment:23 Changed 8 years ago by Owen Arnold
refs #5102. More Performance Tests.
Changeset: a1cc439a7d23b3c03a604ac3c46eda00c47f0ed8
comment:24 Changed 8 years ago by Owen Arnold
refs #5102, consistent results s and m threaded
Changeset: 91373b75e94e891a6263677f0b511bba2f7e9c3d
comment:25 Changed 8 years ago by Owen Arnold
refs #5102, work on copy construction with tracking.
Changeset: b3cf4edb652ac79a6f975fc78747704eae13d8a4
comment:26 Changed 8 years ago by Owen Arnold
refs #5102. Made more robust and added performance tests
Changeset: a3f929dabcb8b20177b1a5554fc75bb7c2846a1f
comment:27 Changed 8 years ago by Owen Arnold
refs #5102. Essential benchmark performance test.
Changeset: 41533d1ff61e36cd2d19b27c7bb0705bd4a17fe7
comment:28 Changed 8 years ago by Owen Arnold
refs #5102. Fix to avoid type narrowing.
Changeset: e95f57ff9f1d86989a68ea3caf9d7873bc8be67b
comment:29 Changed 8 years ago by Owen Arnold
refs #5102. Essential benchmark performance test.
Changeset: 41533d1ff61e36cd2d19b27c7bb0705bd4a17fe7
comment:30 Changed 8 years ago by Owen Arnold
refs #5102. Fix to avoid type narrowing.
Changeset: e95f57ff9f1d86989a68ea3caf9d7873bc8be67b
comment:31 Changed 8 years ago by Owen Arnold
refs #5102. splitAllIfNeeded now uses splitTrackedBoxes
Changeset: 26500144872ac8dc00947e84a8accceec0f29199
comment:32 Changed 8 years ago by Owen Arnold
refs #5102. Fix gcc error.
Changeset: f17321b934cf3d02746c676bf64f7cff96d5002c
comment:33 Changed 8 years ago by Owen Arnold
refs #5102. Fix gcc error.
Changeset: 2cf5b09c3e5bc47b3bcce7755800058fa9f7db6f
comment:34 Changed 8 years ago by Owen Arnold
refs #5102. Fix gcc error.
Changeset: c45e532cde0858c94abf912421e03d969337713e
comment:35 Changed 8 years ago by Owen Arnold
refs #5102. Fix gcc error.
Changeset: 843a8b1dd293c5dd34ad57c2061b16ed93f5e0d0
comment:36 Changed 8 years ago by Owen Arnold
refs #5102. Fix gcc error.
Changeset: faf3c348e2ec2a12e9daf844863322ab7cc96e73
comment:37 Changed 8 years ago by Owen Arnold
refs #5102. Fix gcc error.
Changeset: 0354f8919a0d2ab0debcfc391718c3e930c99cc6
comment:38 Changed 8 years ago by Owen Arnold
refs #5102. Fix gcc error.
Changeset: 677254ad81d1fb6bce561093e549b56d22261ce4
comment:39 Changed 8 years ago by Owen Arnold
refs #5102. Fix gcc error.
Changeset: 9c5b69d16b7fee7a49ab35971cb5c7be611d3bd2
comment:40 Changed 8 years ago by Owen Arnold
refs #5102. Fix gcc error.
Changeset: 4b7bd0c957f5e92cd554ce5d639621d71cf589c6
comment:41 Changed 8 years ago by Owen Arnold
refs #5102. keep splitting methods separate while tests are fixed..
Changeset: 47a585a17c5abd6fe9f568e1c812b4d091a3588d
comment:42 Changed 8 years ago by Owen Arnold
refs #5102. Null check required for testing purposes
Changeset: 8d273ff7137a6fe4c07166d9eb2415a350fa897d
comment:43 Changed 8 years ago by Owen Arnold
refs #5102. Fix gcc error
Changeset: 39fc039246dd53e986ee24b09403ea414b11f06c
comment:44 Changed 8 years ago by Owen Arnold
refs #5102 swap in splitAllIfNeeded
Changeset: 35113e2f38285b802f481ea1cad49ed138f6f370
comment:45 Changed 8 years ago by Owen Arnold
refs #5102. Check boxes exist before removing
Changeset: b5770d08cef05b1aff5a6c7aaf128198c8bb2ab4
comment:46 Changed 8 years ago by Owen Arnold
refs #5102. Fixes thread blocking issue on gcc.
Changeset: 994b255c3393bac4bbb0d277aa80663332089ae7
comment:47 Changed 8 years ago by Owen Arnold
refs #5102. Fix GCC warnings.
Changeset: f3300251366d27965c0a1f6249bbc74cfaaf1ec0
comment:48 Changed 8 years ago by Nick Draper
- Milestone changed from Release 2.2 to Release 2.3
Moved at the end of release 2.2
comment:49 Changed 8 years ago by Nick Draper
- Milestone changed from Release 2.3 to Release 2.4
moved to Release 2.4
comment:50 Changed 8 years ago by Nick Draper
- Milestone changed from Release 2.4 to Release 2.5
Moved at the code freeze for release 2.4
comment:51 Changed 7 years ago by Nick Draper
- Milestone changed from Release 2.5 to Release 2.6
comment:52 Changed 7 years ago by Owen Arnold
- Status changed from accepted to verify
- Resolution set to wontfix
These changes have ended up in master anyway as someone ended up merging these commits into their own branch and eventually master without my knowledge! If you want proof that this has been done, then run git branch --contains {commit_id} where commit_id is one of the SHA1 commits from above. There's no point continuing with this ticket given that it's purpose has been corrupted like this.
comment:53 Changed 7 years ago by Nick Draper
- Status changed from verify to verifying
- Tester set to Nick Draper
comment:54 Changed 7 years ago by Nick Draper
- Status changed from verifying to reopened
- Resolution wontfix deleted
Alex, It appears some of the changes here have been cherry picked from this branch and committed to master with other code. However they appear to not be used and are therefore dead code in our codebase. Dead code is a bad thing it adds a risk of side effects and bugs on other changes and makes the code harder to understand. Equally if the code is brought into use it could easily slip in without sufficient testing.
We need to talk about what to do to fix this.
The class BoxCtrlChangesList has been derived from BoxController. BoxController was the old system, whereby the controller was queried using ( BoxController::shouldSplitBoxes() ) to determine whether the boxes were ready to split. The changes that Janik and I worked on seem to have been "refactored" into BoxCtrlChangesList. As far as I can tell, BoxCtrlChangesList is never instantiated. MDEventWorkspace still creates a new BoxController, see line 41 of MDEventWorkspace.cpp for example. The author of this code has removed the use of the BoxCtrlChangesList by commenting it out of the code base. Every time you build mantid, you still build this code, however, the author has commented out the header includes in various places, such as MDEventWorkspace.h line 8.
comment:55 Changed 7 years ago by Nick Draper
- Status changed from reopened to assigned
- Priority changed from major to blocker
- Keywords Maintenance added
- Owner changed from Owen Arnold to Alex Buts
comment:56 Changed 7 years ago by Nick Draper
- Summary changed from MDBox: better tracking of which boxes to split to MDBox: better tracking of which boxes to split - Dead code, merging mistakes
comment:58 Changed 7 years ago by Alex Buts
refs #5102 This should remove the box-list related code
Changeset: dd9ccd2c5eda3479369e41a1a76cc59b37a87749
comment:59 Changed 7 years ago by Alex Buts
refs #5102 This should remove the box-list related code
Changeset: dd9ccd2c5eda3479369e41a1a76cc59b37a87749
comment:60 Changed 7 years ago by Alex Buts
- Status changed from accepted to verify
- Resolution set to fixed
I have removed two classes responsible for the box tracking namely BoxCntrlChangesList and MDBoxToChange and all method responsible for adding and tracing boxes namely addAndTrace and addAndBuild on MDBoxes and its interfaces. The code was not used so no changes to the unit tests or Mantid application was introduced.
The previous code was left in the branch related to ticket #7100.
The tester should think if anything else is related to box tracking or any traces of the classes/methods responsible for tracking is present in the codebase. I do not see any.
Then it should be easy to merge with master.
comment:61 Changed 7 years ago by Owen Arnold
- Status changed from verify to verifying
- Tester changed from Nick Draper to Owen Arnold
comment:62 Changed 7 years ago by Alex Buts
refs #5102 This should remove the box-list related code
Changeset: dd9ccd2c5eda3479369e41a1a76cc59b37a87749
comment:65 Changed 5 years ago by Stuart Campbell
This ticket has been transferred to github issue 5948