Ticket #3557 (closed: fixed)

Opened 9 years ago

Last modified 5 years ago

MergeMDEW: Algorithm to merge many MDEventWorkspaces

Reported by: Janik Zikovsky
Owned by: Janik Zikovsky
Priority: major
Milestone: Iteration 30
Component: Mantid
Keywords:
Cc:
Blocked By: #3571, #3605, #3661
Blocking:
Tester: Owen Arnold

Description

Need a method for merging a large number of MDEventWorkspaces together WITHOUT EXPLODING MEMORY! This is mostly for inelastic people wanting to add together N runs taken at N small rotations.

I think the way to go is:

  • Generate an MDEventWorkspace NXS file for each run with a fixed box structure.
    • This would be a MaxDepth=1 structure but with finer boxes, maybe 50x50x50.
    • This can be done immediately after acquiring each run, so that less processing has to be done at once.
  • Supply a list of all the filenames to merge.
    • What are the GUI possibilities for this? It would be nice to select a bunch of files.
  • Go through each BOX to sum up a small part of the events from EACH file, and dump these into a large (file-backed) MDEventWorkspace that CAN split however is best.
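
The box-by-box idea above can be sketched in plain Python. This is only an illustration under stated assumptions, not the MergeMDEW implementation: each run is modelled as an in-memory dict from box index to event list (standing in for a per-run NXS file with a fixed box structure), and the names box_index, bin_run, merge_runs and sink are hypothetical. The point is that only the events of one box are held at a time before being handed to the output.

```python
from collections import defaultdict


def box_index(event, extent=(-5.0, 5.0), bins=50):
    """Map an (x, y, z, signal) event to its box in a fixed bins^3 grid."""
    lo, hi = extent
    width = (hi - lo) / bins
    idx = []
    for coord in event[:3]:
        i = int((coord - lo) / width)
        # Clamp so events on or past the edges land in the outermost box.
        idx.append(min(max(i, 0), bins - 1))
    return tuple(idx)


def bin_run(events, bins=50):
    """Stand-in for one per-run file: group the run's events by box."""
    boxes = defaultdict(list)
    for ev in events:
        boxes[box_index(ev, bins=bins)].append(ev)
    return dict(boxes)


def merge_runs(runs, sink):
    """Visit each box once, gather its events from every run, pass to sink.

    `sink` is a hypothetical callback standing in for the file-backed
    output workspace. Returns the total number of events written.
    """
    all_boxes = set()
    for run in runs:
        all_boxes.update(run)

    total = 0
    for box in sorted(all_boxes):
        chunk = []  # only one box's worth of events is held at a time
        for run in runs:
            chunk.extend(run.get(box, []))
        sink(box, chunk)
        total += len(chunk)
    return total
```

In a real merge the inner loop would read one box's events from each file on demand rather than from dicts already in memory, and the output would be free to split its own boxes as needed.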

Attachments

MergeMD.py (1.0 KB) - added by Owen Arnold 9 years ago.

Change History

comment:1 Changed 9 years ago by Janik Zikovsky

  • Status changed from new to accepted

comment:2 Changed 9 years ago by Janik Zikovsky

  • Blocked By 3571 added

comment:3 Changed 9 years ago by Janik Zikovsky

In [14010]:

Refs #3571: MultipleFileProperty for selecting multiple files. Refs #3557: MergeMDEW stub.

comment:3 Changed 9 years ago by Janik Zikovsky

In [14012]:

Refs #3557: Fix test build.

comment:4 Changed 9 years ago by Janik Zikovsky

In [14015]:

Refs #3557: MergeMDEW works in memory (for now)

comment:5 Changed 9 years ago by Janik Zikovsky

In [14018]:

Refs #3557: MergeMDEW to a file-backed MDEW.

comment:6 Changed 9 years ago by Janik Zikovsky

In [14071]:

Refs #3557: splitAllIfNeeded() method for MDBoxes will save data to disk as it goes along. Seems too slow in practice. Used in MergeMDEW.

comment:7 Changed 9 years ago by Janik Zikovsky

In [14076]:

Refs #3557: MergeMDEW refactored to use tasks and separated into methods for clarity. MacOS build fix.

comment:8 Changed 9 years ago by Janik Zikovsky

In [14102]:

Refs #3557: MergeMDEW update.

comment:9 Changed 9 years ago by Janik Zikovsky

In [14147]:

Refs #3557: MergeMDEW benchmarked at 12 minutes for 500 million events.

comment:10 Changed 9 years ago by Janik Zikovsky

  • Blocked By 3605 added

comment:11 Changed 9 years ago by Janik Zikovsky

In [14207]:

Refs #3557: CloneMDEW has parameter to specify the file.

comment:12 Changed 9 years ago by Janik Zikovsky

In [14219]:

Refs #3557: This version of the algorithm ran through a billion events in 25 minutes but was heavy in memory for a few boxes with too many events.

comment:13 Changed 9 years ago by Janik Zikovsky

In [14220]:

Refs #3557: win build fix.

comment:14 Changed 9 years ago by Janik Zikovsky

In [14226]:

Refs #3557: With a more finely split input, MergeMDEW takes 25 minutes for 1 billion events but does not blow up memory, remaining < 4 GB. This is in debug mode.

comment:15 Changed 9 years ago by Janik Zikovsky

In [14227]:

Refs #3620: Progress reporting can give fractions of a percent; also can show estimated time to algorithm completion. Refs #3557: Better progress reporting in MergeMDEW.

comment:16 Changed 9 years ago by Stuart Campbell

In [14273]:

Cleanup temp files after test has run. refs #3557

comment:17 Changed 9 years ago by Janik Zikovsky

In [14276]:

Refs #3557: Close file handles to hopefully fix windows test

comment:18 Changed 9 years ago by Janik Zikovsky

In [14277]:

Refs #3557: Comment out file removal since it still fails

comment:19 Changed 9 years ago by Stuart Campbell

Last edited 9 years ago by Stuart Campbell

comment:20 Changed 9 years ago by Janik Zikovsky

  • Blocked By 3661 added

comment:21 Changed 9 years ago by Janik Zikovsky

In [14384]:

Refs #3557: MergeMDEW slight rework

comment:22 Changed 9 years ago by Janik Zikovsky

In [14386]:

Refs #3557, Refs #3652: This should fix the Mac hanging bug.

comment:23 Changed 9 years ago by Janik Zikovsky

In [14401]:

Refs #3557: Clean up files after test

comment:24 Changed 9 years ago by Janik Zikovsky

In [14402]:

Refs #3557: Clean up files after test

comment:25 Changed 9 years ago by Janik Zikovsky

In [14403]:

Refs #3557: reverting

comment:26 Changed 9 years ago by Janik Zikovsky

  • Status changed from accepted to verify
  • Resolution set to fixed

comment:27 Changed 9 years ago by Owen Arnold

  • Status changed from verify to verifying
  • Tester set to Owen Arnold

comment:28 Changed 9 years ago by Owen Arnold

  • Status changed from verifying to closed

Fully working as far as I can tell. Generated 20 rotated workspaces (in one axis). All have common box structure. After merging, number of events in merged workspace is exactly the same as the sum across all input workspaces. Visualisation of output workspace shows Bragg peaks tracing arc in one dimension, consistent with rotation.

comment:29 Changed 9 years ago by Owen Arnold

Test generator:

    import time

    LoadEventNexus(Filename='C:/mantid/Test/AutoTestData/CNCS_7860_event.nxs',
                   OutputWorkspace='CNCS_7860_event_NXS',
                   CompressTolerance='0.10000000000000001')

    start = time.time()
    for omega in range(0, 20):
        print("Starting omega %03d degrees" % omega)
        # Fresh MD workspace with the same fixed box structure for every run
        CreateMDWorkspace(Dimensions='3', Extents='-5,5,-5,5,-5,5',
                          Names='Q_sample_x,Q_sample_y,Qsample_z', Units='A,A,A',
                          SplitInto='5', SplitThreshold='2000',
                          MaxRecursionDepth='3', MinRecursionDepth='3',
                          OutputWorkspace='CNCS_7860_event_MD')

        # Record the rotation angles as sample logs
        AddSampleLog("CNCS_7860_event_NXS", "omega", "%s" % omega, "Number Series")
        AddSampleLog("CNCS_7860_event_NXS", "chi", "%s" % 0, "Number Series")
        AddSampleLog("CNCS_7860_event_NXS", "phi", "%s" % 0, "Number Series")

        # Convert events to MD events
        ConvertToDiffractionMDWorkspace(InputWorkspace='CNCS_7860_event_NXS',
                                        OutputWorkspace='CNCS_7860_event_MD',
                                        OutputDimensions='Q(sampleframe)',
                                        LorentzCorrection='1')

        SaveMD("CNCS_7860_event_MD",
               "C:/Users/spu92482/Desktop/CNCS/CNCS_7860_event_rotated_%03d.nxs" % omega)
        print("%s seconds since start." % (time.time() - start))

Changed 9 years ago by Owen Arnold

comment:30 Changed 9 years ago by Owen Arnold

Better way of distributing script than previous.

comment:31 Changed 5 years ago by Stuart Campbell

This ticket has been transferred to GitHub issue 4404.
