Ticket #8412 (closed: fixed)

Opened 7 years ago

Last modified 5 years ago

Workspaces references held by AlgorithmProxy if algorithm fails

Reported by: Martyn Gigg Owned by: Martyn Gigg
Priority: major Milestone: Release 3.1
Component: Framework Keywords: Maintenance
Cc: Blocked By:
Blocking: Tester: Arturs Bekasovs

Description (last modified by Martyn Gigg) (diff)

When an algorithm fails the AlgorithmProxy still remains in the AlgorithmManager. The proxy has got all of the properties set & so is still holding on to workspace shared_ptrs. When you delete all of the workspaces in Mantid (and do a release free memory on linux) you don't see the memory drop. If you execute FrameworkManager.clearAlgorithms() then the memory drops back to what you would expected.

To reproduce:

  1. Use a big file such as MAP17589.raw. Load the file into Mantid
  2. Take a look at the memory usage of the MantidPlot process, it should be around 2Gb
  3. Run CropWorkspace, replacing the input workspace and set XMin=25000. The algorithm should fail.
  4. Delete the MAP17589 workspace
  5. Click "File->Release Free Memory"
  6. Check the memory usage of the MantidPlot process and it will still be around 2Gb
  7. Open the script window & type and execute FrameworkManager.clearAlgorithms(), the memory should now fall back to (near) what it started with.

Change History

comment:1 Changed 7 years ago by Martyn Gigg

  • Description modified (diff)

comment:2 Changed 7 years ago by Martyn Gigg

Note: This can also be produced by running steps 1->2 of the above instructions, start the CropWorkspace dialog and then click cancel, then continue with steps 4->7

Last edited 7 years ago by Martyn Gigg (previous) (diff)

comment:3 Changed 7 years ago by Martyn Gigg

  • Status changed from new to inprogress

Drop workspace refs in AlgorithmProxy when algorithm finishes

The shared_ptrs held in WorkspaceProperty were set and still held by the AlgorithmProxy object until it got knocked off the AlgorithmManager stack. This gave the impression of a memory leak if an algorithm failed when in fact the references were just being held too long. Refs #8412

Changeset: 9341f190f3c60835a531ed05d5db42deba5a5284

comment:4 Changed 7 years ago by Martyn Gigg

Remove an algorithm from the AlgorithmManager if cancel is clicked.

This drops the workspaces references that the proxy would have acquired. Refs #8412

Changeset: e7b7fdb9012f978f9e77696be04998f54bf7846e

comment:5 Changed 7 years ago by Martyn Gigg

Only drop workspace references for child algorithms.

If dropped from child algorithms they will be removed completely as nothing else has a reference to them. Refs #8412

Changeset: cc2a27af9ea5ebd7f4f719d487dca6f39aad0027

comment:6 Changed 7 years ago by Martyn Gigg

  • Status changed from inprogress to verify
  • Resolution set to fixed

Branch: bugfix/8412_wksp_refs_alg_failure

Tester: This was a tricky problem to understand. My advice would be to follow the steps in the ticket description before merging the branch to make sure you understand what the error is. Once understood, merge the branch and check that both sets of instructions, the description & comment 1, do not produce seemingly leaked memory.

comment:7 Changed 7 years ago by Arturs Bekasovs

  • Status changed from verify to verifying
  • Tester set to Arturs Bekasovs

comment:8 Changed 7 years ago by Arturs Bekasovs

Can reproduce it before merging the branch. Behavior I am getting is exactly what I'd expect given the problem described. The only additional thing I had to do is to release all the free memory once again after step 7.

comment:9 Changed 7 years ago by Arturs Bekasovs

With a branch merged, the problem seems to be solved. Both cancel button and failure cases were tested in different variations and with multiple workspaces. Memory seems to be freed correctly.

Last edited 7 years ago by Arturs Bekasovs (previous) (diff)

comment:10 Changed 7 years ago by Arturs Bekasovs

  • Status changed from verifying to closed

Merge remote-tracking branch 'origin/bugfix/8412_wksp_refs_alg_failure'

Full changeset: e2041ae9716c9018cbb1100e1bbb945dbd9c8e61

comment:11 Changed 7 years ago by Martyn Gigg

Merge remote-tracking branch 'origin/bugfix/8412_wksp_refs_alg_failure'

Full changeset: ce2db3789d7a134780c6eb9c01b66cbfb9f25809

comment:12 Changed 5 years ago by Stuart Campbell

This ticket has been transferred to github issue 9257

Note: See TracTickets for help on using tickets.