Ticket #8412 (closed: fixed)
Workspaces references held by AlgorithmProxy if algorithm fails
Reported by: | Martyn Gigg | Owned by: | Martyn Gigg |
---|---|---|---|
Priority: | major | Milestone: | Release 3.1 |
Component: | Framework | Keywords: | Maintenance |
Cc: | Blocked By: | ||
Blocking: | Tester: | Arturs Bekasovs |
Description (last modified by Martyn Gigg) (diff)
When an algorithm fails the AlgorithmProxy still remains in the AlgorithmManager. The proxy has got all of the properties set & so is still holding on to workspace shared_ptrs. When you delete all of the workspaces in Mantid (and do a release free memory on linux) you don't see the memory drop. If you execute FrameworkManager.clearAlgorithms() then the memory drops back to what you would expected.
To reproduce:
- Use a big file such as MAP17589.raw. Load the file into Mantid
- Take a look at the memory usage of the MantidPlot process, it should be around 2Gb
- Run CropWorkspace, replacing the input workspace and set XMin=25000. The algorithm should fail.
- Delete the MAP17589 workspace
- Click "File->Release Free Memory"
- Check the memory usage of the MantidPlot process and it will still be around 2Gb
- Open the script window & type and execute FrameworkManager.clearAlgorithms(), the memory should now fall back to (near) what it started with.
Change History
comment:2 Changed 7 years ago by Martyn Gigg
Note: This can also be produced by running steps 1->2 of the above instructions, start the CropWorkspace dialog and then click cancel, then continue with steps 4->7
comment:3 Changed 7 years ago by Martyn Gigg
- Status changed from new to inprogress
Drop workspace refs in AlgorithmProxy when algorithm finishes
The shared_ptrs held in WorkspaceProperty were set and still held by the AlgorithmProxy object until it got knocked off the AlgorithmManager stack. This gave the impression of a memory leak if an algorithm failed when in fact the references were just being held too long. Refs #8412
Changeset: 9341f190f3c60835a531ed05d5db42deba5a5284
comment:4 Changed 7 years ago by Martyn Gigg
Remove an algorithm from the AlgorithmManager if cancel is clicked.
This drops the workspaces references that the proxy would have acquired. Refs #8412
Changeset: e7b7fdb9012f978f9e77696be04998f54bf7846e
comment:5 Changed 7 years ago by Martyn Gigg
Only drop workspace references for child algorithms.
If dropped from child algorithms they will be removed completely as nothing else has a reference to them. Refs #8412
Changeset: cc2a27af9ea5ebd7f4f719d487dca6f39aad0027
comment:6 Changed 7 years ago by Martyn Gigg
- Status changed from inprogress to verify
- Resolution set to fixed
Branch: bugfix/8412_wksp_refs_alg_failure
Tester: This was a tricky problem to understand. My advice would be to follow the steps in the ticket description before merging the branch to make sure you understand what the error is. Once understood, merge the branch and check that both sets of instructions, the description & comment 1, do not produce seemingly leaked memory.
comment:7 Changed 7 years ago by Arturs Bekasovs
- Status changed from verify to verifying
- Tester set to Arturs Bekasovs
comment:8 Changed 7 years ago by Arturs Bekasovs
Can reproduce it before merging the branch. Behavior I am getting is exactly what I'd expect given the problem described. The only additional thing I had to do is to release all the free memory once again after step 7.
comment:9 Changed 7 years ago by Arturs Bekasovs
With a branch merged, the problem seems to be solved. Both cancel button and failure cases were tested in different variations and with multiple workspaces. Memory seems to be freed correctly.
comment:10 Changed 7 years ago by Arturs Bekasovs
- Status changed from verifying to closed
Merge remote-tracking branch 'origin/bugfix/8412_wksp_refs_alg_failure'
Full changeset: e2041ae9716c9018cbb1100e1bbb945dbd9c8e61
comment:11 Changed 7 years ago by Martyn Gigg
Merge remote-tracking branch 'origin/bugfix/8412_wksp_refs_alg_failure'
Full changeset: ce2db3789d7a134780c6eb9c01b66cbfb9f25809
comment:12 Changed 5 years ago by Stuart Campbell
This ticket has been transferred to github issue 9257