Ticket #10385 (closed: fixed)
Profile LoadNexus Slow for group workspaces
Reported by: | Owen Arnold | Owned by: | Owen Arnold |
---|---|---|---|
Priority: | major | Milestone: | Release 3.3 |
Component: | Reflectometry | Keywords: | |
Cc: | Blocked By: | #9848 | |
Blocking: | #10500 | Tester: | Pete Peterson |
Description
Saving of large group workspaces is much faster now. Load times are still poor. Profiling should reveal why.
Attachments
Change History
comment:2 Changed 6 years ago by Owen Arnold
Might be interesting to use the OSX instrument tools given that I know the load times are really bad on that platform. INTER00027729 is an offending run.
- Load INTER00027729.raw (fast)
- Save it via SaveNexus (relatively fast)
- Load it LoadNexus (really slow)
comment:3 Changed 6 years ago by Owen Arnold
- Status changed from assigned to inprogress
- Summary changed from Profile LoadISISNexus2 Slow for group workspaces to Profile LoadNexus Slow for group workspaces
comment:4 Changed 6 years ago by Owen Arnold
Attached profiler results for Load of INTER00027731.raw, saved via SaveNexus and then loaded.
comment:5 Changed 6 years ago by Owen Arnold
As a course-grain timing comparator, LoadNexus on INTER 27731 currently takes 5.94 to load all 51 periods on a release build on my macbook pro
comment:6 Changed 6 years ago by Owen Arnold
refs #10385. Extract spectrum info to object.
This is the first step towards reusing the same spectrum information rather than fetching it renewed each time. One caveat with this is that you will HAVE to be sure that all workspaces in the group are equivalent in terms of instrument detector mappings etc. What i intend to do next is:
1) For workspaces that look to relate to multiperiod group workspaces, default to reuse the spectrum-detectector mappings, once obtained on a lazy basis. 2) Put a boolean property with value in place (which defaults to true) called "PredictiveInstrumentLoading". If false, we would go back to old behaviour.
Changeset: 1978e1484b60c48538484ca46450103e829bbb7c
comment:7 Changed 6 years ago by Owen Arnold
refs #10385. Naming issue root problem.
There is a naming issue here, which is actually the root cause of the problem! The query that was being performed retrieved the number of workspace entries, NOT the number of periods. Going forward. If the number of periods exactly matches the number of workspace entries, we have a multiperiod group workspace, and can happily apply the optimisations we want.
Next step will be to actually extract the nperiods. To do that we need to try to fetch it out of the logs for The first entry.
Changeset: fdf6281af77bfb069315c3fae6937bad26e09bab
comment:8 Changed 6 years ago by Owen Arnold
refs #10385. Query default property.
Changeset: d0ba5d5f93dfb7b1c20474eece8952f88099963b
comment:9 Changed 6 years ago by Owen Arnold
refs #10385. Multiperiod query.
Changeset: 8a0c59958fdc0b0f05ecfb616f273b34c6764aa9
comment:10 Changed 6 years ago by Owen Arnold
refs #10385. Reuse the SpecInfo on multiperiod groups.
Changeset: 61959e5705be8423f54eb8a77034d8e739880dfb
comment:11 Changed 6 years ago by Owen Arnold
As a course-grain timing comparator, LoadNexus on INTER 27731 currently NOW takes 5.10 to load all 51 periods on a release build on my macbook pro. I'll need to tackle the other big areas highlighed by the profiler in order to achieve better results, I suspect I've moved the bottleneck now. I'm going to re-profile to see what effect this sort of strategy has had.
Changed 6 years ago by Owen Arnold
- Attachment Screen Shot 2014-10-23 at 16.54.01.png added
updated profiler results
comment:12 Changed 6 years ago by Owen Arnold
Updated profiler results, it now looks like the major problem has shifted to the number of calls to NXDataSet::open (see cycle estimation results). We now make +1 calls to loadEntry, so in my profiler example, we have 468 calls to the offending function, that's 9 calls per workspace entry. My next step will be to tackle this and reduce the number or redundant calls. Noting the success above, the multiperiod-only caching technique seems to be a good route to do this.
comment:13 Changed 6 years ago by Owen Arnold
My next step is going to be to try the following:
- For multiperiod runs, copy process the first load as per normal, then, clone the workspace each time, set dataY, dataE and load the logs only.
- Check the above as a proof of concept
- Refactor the above
- Add a property which would default to the old behaviour. Set this property to be false by default.
comment:14 Changed 6 years ago by Owen Arnold
refs #10385. Try to use clone and write technique.
Changeset: 9a677316ce4294e96ec8a2c793c9d9090dd13be8
comment:15 Changed 6 years ago by Owen Arnold
refs #10385. Test file is twice as fast now.
There are lots of TODOs still to resolve. We also need to add tests for this sort of multiperiod loading.
Changeset: ccc1c235df1cb2d613326730fd9c0421f0751754
comment:16 Changed 6 years ago by Owen Arnold
refs #10385. Add failing test.
We are not currently handling log extraction properly, which is why the test is failing.
File size is just over the threshold, but shouldn't be a problem. I've tried to trim it down as much as possible.
Changeset: 757b0046c62a550e10f005d04fa2284fa4f18286
comment:17 Changed 6 years ago by Owen Arnold
refs #10385. Remove log values
This fixes the regression test. We clear the logs off the temporary workspace, that way we don't have to do a similar thing in the loop, and we avoid copying them over each time. This is not a fully formed solution, but it's a start.
Changeset: ab1da07fcd610910084b5c5e6ee641efddba9322
comment:18 Changed 6 years ago by Owen Arnold
refs #10385. Reporting added and extra property.
Also better robustness around workspace type.
Changeset: 8e03062228667837665c57b4ef45c5f95f82ef35
comment:19 Changed 6 years ago by Owen Arnold
refs #10385. Extract spectrum info to object.
This is the first step towards reusing the same spectrum information rather than fetching it renewed each time. One caveat with this is that you will HAVE to be sure that all workspaces in the group are equivalent in terms of instrument detector mappings etc. What i intend to do next is:
1) For workspaces that look to relate to multiperiod group workspaces, default to reuse the spectrum-detectector mappings, once obtained on a lazy basis. 2) Put a boolean property with value in place (which defaults to true) called "PredictiveInstrumentLoading". If false, we would go back to old behaviour.
Changeset: 1978e1484b60c48538484ca46450103e829bbb7c
comment:20 Changed 6 years ago by Owen Arnold
refs #10385. Naming issue root problem.
There is a naming issue here, which is actually the root cause of the problem! The query that was being performed retrieved the number of workspace entries, NOT the number of periods. Going forward. If the number of periods exactly matches the number of workspace entries, we have a multiperiod group workspace, and can happily apply the optimisations we want.
Next step will be to actually extract the nperiods. To do that we need to try to fetch it out of the logs for The first entry.
Changeset: fdf6281af77bfb069315c3fae6937bad26e09bab
comment:21 Changed 6 years ago by Owen Arnold
refs #10385. Query default property.
Changeset: d0ba5d5f93dfb7b1c20474eece8952f88099963b
comment:22 Changed 6 years ago by Owen Arnold
refs #10385. Multiperiod query.
Changeset: 8a0c59958fdc0b0f05ecfb616f273b34c6764aa9
comment:23 Changed 6 years ago by Owen Arnold
refs #10385. Reuse the SpecInfo on multiperiod groups.
Changeset: 61959e5705be8423f54eb8a77034d8e739880dfb
comment:24 Changed 6 years ago by Owen Arnold
refs #10385. Try to use clone and write technique.
Changeset: 9a677316ce4294e96ec8a2c793c9d9090dd13be8
comment:25 Changed 6 years ago by Owen Arnold
refs #10385. Test file is twice as fast now.
There are lots of TODOs still to resolve. We also need to add tests for this sort of multiperiod loading.
Changeset: ccc1c235df1cb2d613326730fd9c0421f0751754
comment:26 Changed 6 years ago by Owen Arnold
refs #10385. Add failing test.
We are not currently handling log extraction properly, which is why the test is failing.
File size is just over the threshold, but shouldn't be a problem. I've tried to trim it down as much as possible.
Changeset: 757b0046c62a550e10f005d04fa2284fa4f18286
comment:27 Changed 6 years ago by Owen Arnold
refs #10385. Remove log values
This fixes the regression test. We clear the logs off the temporary workspace, that way we don't have to do a similar thing in the loop, and we avoid copying them over each time. This is not a fully formed solution, but it's a start.
Changeset: ab1da07fcd610910084b5c5e6ee641efddba9322
comment:28 Changed 6 years ago by Owen Arnold
refs #10385. Reporting added and extra property.
Also better robustness around workspace type.
Changeset: 8e03062228667837665c57b4ef45c5f95f82ef35
comment:29 Changed 6 years ago by Owen Arnold
refs #10385 Refactor new functionality to method.
Should read better now, although the entire algorithm is still a mess.
Changeset: aba603d5e0bef0c6a3c3babfef7edc0f46e955d8
comment:30 Changed 6 years ago by Owen Arnold
refs #10385. Remove instrument info caching
This does result in a speed up, but the caching is being bypassed at the moment since our strategy is to clone and overwrite. I've removed the caching code as part of the clean up, but I've kept the SpecInfo type around as it makes the usage clearer, and does not result in a performance penalty.
Changeset: d88bbe4abf5cdb342d58b79897e75486c14d9ce7
comment:31 Changed 6 years ago by Owen Arnold
refs #10385. Minor changes.
Changeset: 5b18d28967654bbbfa5f76fd32b1a2245977b7b1
comment:32 Changed 6 years ago by Owen Arnold
refs #10385. Put error identifer code in place.
Changeset: fb70f730ed01cb46d4a4e064fb989b95ce7e6aed
comment:33 Changed 6 years ago by Owen Arnold
refs #10385 Extend tests.
Extend tests. Do not process spectrum lists.
Changeset: ce10e438130d7a0d0aa3684f6ed5c90b9d9b5d0f
comment:34 Changed 6 years ago by Owen Arnold
- Status changed from inprogress to verify
- Resolution set to fixed
This is being verified as pull request #56.
comment:35 Changed 6 years ago by Owen Arnold
refs #10385. Fix warnings.
Changeset: e084e3b8da4a930d627b8561edfde75d0a401df6
comment:36 Changed 6 years ago by Owen Arnold
Tester:
1) The most important thing to verify is that the load/save unit tests are passing and that the system tests loadlotsoffiles have passed with these changes.
2) Secondly, I removed a nxs data open which will affect workspaces with fractional area. So try, running http://docs.mantidproject.org/nightly/algorithms/SofQW3-v1.html, saving the results using SaveNexus, and reloading them using LoadProcessedNexus. The loaded workspaces should be have the fractional area information.
3) The attached python script would run slowly prior to these changes. It should now report faster load times.
comment:37 Changed 6 years ago by Owen Arnold
refs #10385. Fix cpp check issue.
Changeset: 679c839829669c9252c7d0ea73caef984f968287
comment:39 Changed 6 years ago by Pete Peterson
- Status changed from verify to verifying
- Tester set to Pete Peterson
comment:40 Changed 6 years ago by Pete Peterson
- Status changed from verifying to closed
Merge remote-tracking branch 'origin/feature/10385_loadnexus_speed'
Full changeset: a8fe5a33c2cfb8321d62a8c15b09e2170cae1d52
comment:41 Changed 5 years ago by Stuart Campbell
This ticket has been transferred to github issue 11227