Ticket #7798 (closed: wontfix)
Linux: TCMalloc does not release free memory when requested
Reported by: | Martyn Gigg | Owned by: | Martyn Gigg |
---|---|---|---|
Priority: | critical | Milestone: | Release 3.3 |
Component: | Framework | Keywords: | |
Cc: | | Blocked By: | |
Blocking: | | Tester: | Anders Markvardsen |
Description
There are calls to TCMalloc's ReleaseFreeMemory both before and after algorithm execution and also when calling FrameworkManager::clear.
It would appear that these calls do not always do what is expected. For example, the system test performance reports seem to suggest a lot of memory loss in many tests:
https://builds.sns.gov/view/All/job/ornl_test_rhel6_develop/System_tests_performance/?
whereas the Windows servers do not show this:
https://builds.sns.gov/view/All/job/ornl_test_windows7_develop/System_tests_performance/?
Investigate what is happening here. The script below (data in systemtests/Data) shows memory still being held after the clear should have freed it:
import mantid
from mantid.kernel import MemoryStats
from mantid.simpleapi import *

mantid.api.FrameworkManager.clear()

## TEST ##
memory_before = MemoryStats().residentMem()/1024
wish_ws = Load(Filename='WISH00016748.raw',OutputWorkspace='wish_ws')
mantid.api.FrameworkManager.clear()
heldMemory = MemoryStats().residentMem()/1024 - memory_before
print "Memory held:",heldMemory
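For anyone wanting to cross-check the figure reported by MemoryStats, the same before/after measurement can be sketched without Mantid. This is only an illustration and assumes a Linux /proc filesystem; the helper names (parse_statm_resident_kib, resident_kib, held_after) are hypothetical and not part of the Mantid API.

```python
import os


def parse_statm_resident_kib(statm_text, page_size):
    """Convert the contents of /proc/<pid>/statm to resident memory in KiB.

    The second whitespace-separated field of statm is the resident set
    size, counted in pages.
    """
    fields = statm_text.split()
    return int(fields[1]) * page_size // 1024


def resident_kib():
    """Resident memory of the current process in KiB (Linux only)."""
    with open("/proc/self/statm") as handle:
        return parse_statm_resident_kib(handle.read(),
                                        os.sysconf("SC_PAGE_SIZE"))


def held_after(action, release):
    """Run action(), drop its result, call release(), and return the
    growth in resident memory (KiB) relative to the starting point."""
    before = resident_kib()
    result = action()
    del result  # drop our only reference before asking for a release
    release()
    return resident_kib() - before
```

In the scenario above, action would wrap the Load call and release would wrap FrameworkManager.clear; a large positive return value would correspond to the "Memory held" number printed by the original script.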
Change History
comment:6 Changed 6 years ago by Martyn Gigg
- Status changed from assigned to inprogress
Fix link order of libraries so that tcmalloc is first.
This allows tcmalloc to replace malloc and report the correct memory usage when running through Python, with the exception of systems running gcc 4.4. On these systems, if stdc++ is not linked/loaded first then a segfault occurs when throwing an exception across a dll boundary. Refs #7798
Changeset: 3746a914ff5ab333718ada501c5c0b7fb740b05e
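A quick way to see whether tcmalloc has been pulled into a running Python process at all is to look at the process's memory map. The sketch below assumes Linux and the standard /proc/<pid>/maps layout; the function names are hypothetical. Note that a mapped libtcmalloc only shows the library was loaded; it does not by itself prove that tcmalloc replaced malloc, which is what the link order change is about.

```python
import os


def mapped_libraries(maps_text):
    """Return the basenames of file-backed mappings in /proc/<pid>/maps text.

    Each line has five fixed fields (address, perms, offset, dev, inode)
    followed by an optional pathname. Anonymous mappings have no pathname
    and pseudo-entries such as [heap] do not start with '/'.
    """
    names = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) == 6 and fields[5].startswith("/"):
            names.add(os.path.basename(fields[5].strip()))
    return names


def tcmalloc_is_mapped(maps_text):
    """True if any mapped file name contains 'tcmalloc'."""
    return any("tcmalloc" in name for name in mapped_libraries(maps_text))


def current_process_has_tcmalloc():
    """Check the running process (Linux only)."""
    with open("/proc/self/maps") as handle:
        return tcmalloc_is_mapped(handle.read())
```

Calling current_process_has_tcmalloc() after importing mantid would be one way to confirm the library is present in the interpreter before trusting the memory numbers.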
comment:7 Changed 6 years ago by Martyn Gigg
Simplify the process of loading the Python plugins on Linux.
Rearranging the library link order seems to have made obsolete the process of having to force certain libraries to load first. Refs #7798
Changeset: 3b72565c6b22dd965f9e68466efe88cef8573321
comment:8 Changed 6 years ago by Martyn Gigg
Fix segfault on RHEL6.
Each Python module was separately linked to each core library, which is unnecessary. Each only needs to link to the boost python libraries. Most systems handled this multiple linking without a problem but RHEL6 would segfault when accessing the NeXus C api. Refs #7798
Changeset: 362641f19a26aa2082c69df6a6caf67d8d1ebb77
comment:9 Changed 6 years ago by Martyn Gigg
Don't be as aggressive trying to force memory release.
It can be a time-consuming operation and now that managed workspaces have gone we're not so reliant on being able to programmatically judge the amount of memory available. Refs #7798
Changeset: 97802eedeb8836ad3c71a42d72c2a8a1127164e3
comment:10 Changed 6 years ago by Martyn Gigg
Put back MPI libraries in the Python layers.
Refs #7798
Changeset: 8b5796b916554920d88e6aed1f5680d8a6412f0c
comment:11 Changed 6 years ago by Martyn Gigg
comment:12 Changed 6 years ago by Martyn Gigg
Link Python libraries to the _kernel module and nothing else.
Refs #7798
Changeset: f45fd5995802b83b150eb8efb0b0d5e24cdb8ad8
comment:13 Changed 6 years ago by Martyn Gigg
Fix gcc test in Kernel link line.
It needs to catch all of 4.4 so restricts to > 4.5 Refs #7798
Changeset: 691e8a938f1cd973c55c803ba03b2217646a53ac
comment:14 Changed 6 years ago by Martyn Gigg
Also place NeXus first in linker list in Kernel for gcc 4.4
If it is not after stdc++ then we get a crash in Python when accessing the C api. Refs #7798
Changeset: 20230faf14bfc3c4cdabe2c16ad733492caa20f4
comment:15 Changed 6 years ago by Martyn Gigg
Also place NeXus first in linker list in Kernel for gcc 4.4
If it is not after stdc++ then we get a crash in Python when accessing the C api. Refs #7798
Changeset: 20230faf14bfc3c4cdabe2c16ad733492caa20f4
comment:16 Changed 6 years ago by Martyn Gigg
- Status changed from inprogress to verify
- Resolution set to wontfix
After discussions with the TSC it has been decided to go down a different path with tcmalloc, as described in #10271. As a result I am abandoning the changes here to avoid confusion with what will become the actual work.
The branch has been deleted so there is nothing to merge.
comment:17 Changed 6 years ago by Anders Markvardsen
- Status changed from verify to verifying
- Tester set to Anders Markvardsen
comment:18 Changed 6 years ago by Anders Markvardsen
- Status changed from verifying to closed
Superseded by #10271
comment:19 Changed 5 years ago by Stuart Campbell
This ticket has been transferred to github issue 8643