Ticket #2452 (closed: fixed)

Opened 10 years ago

Last modified 5 years ago

LoadEventNexus: Optimize processor usage w/ thread pool

Reported by: Janik Zikovsky
Owned by: Janik Zikovsky
Priority: major
Milestone: Iteration 28
Component: Mantid
Keywords:
Cc:
Blocked By: #2366
Blocking:
Tester: Vickie Lynch

Description

... since banks are often of different sizes, CPU may be mis-allocated. In one test (SNAP_4112), only 4 of the 8 cores on my system were fully used.

Separating disk IO and processing might help?
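A minimal sketch of what "separating disk IO and processing" could look like, written in Python for illustration only (the actual loader is not shown in this ticket). One reader thread does all the disk access and hands loaded banks to worker threads through a bounded queue. The names load_bank, process_bank and run are hypothetical stand-ins, and in CPython the GIL limits real CPU parallelism, so this shows the structure rather than the speed-up.

```python
import queue
import threading

def load_bank(bank_id):
    """Hypothetical stand-in for the disk read of one bank."""
    # Banks have different sizes: bank k holds k * 1000 fake events.
    return list(range(bank_id * 1000))

def process_bank(events):
    """Hypothetical stand-in for the CPU-side processing of one bank."""
    return sum(events)

def run(bank_ids, n_workers=4):
    # Bounded buffer between the IO stage and the processing stage.
    loaded = queue.Queue(maxsize=2 * n_workers)
    results = {}
    results_lock = threading.Lock()

    def reader():
        # A single thread does all disk IO, one bank at a time.
        for bank_id in bank_ids:
            loaded.put((bank_id, load_bank(bank_id)))
        for _ in range(n_workers):
            loaded.put(None)  # one stop marker per worker

    def worker():
        while True:
            item = loaded.get()
            if item is None:
                break
            bank_id, events = item
            total = process_bank(events)
            with results_lock:
                results[bank_id] = total

    threads = [threading.Thread(target=reader)]
    threads += [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

totals = run(range(1, 9))
```

Because workers take whichever bank is loaded next, a worker that gets a small bank simply comes back for more instead of sitting idle.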

Change History

comment:1 Changed 10 years ago by Janik Zikovsky

  • Status changed from new to accepted

comment:2 Changed 10 years ago by Janik Zikovsky

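# Speed test on a bunch of files.
# Run inside Mantid's Python environment, where LoadEventNexus and DeleteWorkspace are available.
import time

names = ["SNAP_4104", "TOPAZ_1715", "SNAP_4112", "PG3_1370", "TOPAZ_1825", "SEQ_4533"]

for name in names:
    start = time.time()
    # Time the whole load of one event file
    LoadEventNexus(Filename="/home/8oz/data/%s_event.nxs" % name, OutputWorkspace=name, SingleBankPixelsOnly="0", Precount="1")
    print("%s secs for %s" % (time.time() - start, name))
    # Free the workspace before loading the next file
    DeleteWorkspace(name)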

Results:

 19.4002780914 secs for SNAP_4104
 17.7047979832 secs for TOPAZ_1715
 129.67632103 secs for SNAP_4112
 15.9923739433 secs for PG3_1370
 205.588195086 secs for TOPAZ_1825
 21.1869170666 secs for SEQ_4533

comment:3 Changed 10 years ago by Janik Zikovsky

First try, simply replacing OpenMP loop with a thread pool:

20.0163600445 secs for SNAP_4104
17.9157001972 secs for TOPAZ_1715
13.6152219772 secs for PG3_1370
16.4156320095 secs for SEQ_4533
206.595791817 secs for TOPAZ_1825
106.535076857 secs for SNAP_4112

Roughly 15-23% faster already for SNAP_4112, PG3_1370 and SEQ_4533; SNAP_4104 and both TOPAZ files are essentially unchanged.
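For readers unfamiliar with the idea, here is a rough Python illustration of handing per-bank work to a thread pool. It is not the code from this ticket: load_bank, load_all and the bank sizes are made up, and Python's concurrent.futures stands in for whatever pool the loader actually uses. The point is only that each idle thread picks up the next bank as soon as it is free, so empty or small banks do not tie up a thread.

```python
from concurrent.futures import ThreadPoolExecutor

def load_bank(bank_name, n_events):
    """Hypothetical stand-in for loading one bank; returns (name, event count)."""
    events = [i % 7 for i in range(n_events)]
    return bank_name, len(events)

# Hypothetical bank sizes, deliberately uneven; the empty banks finish at once.
banks = {"bank1": 50000, "bank2": 0, "bank3": 200000, "bank4": 0, "bank5": 1000}

def load_all(banks, n_threads=4):
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # One task per bank; the pool assigns tasks to threads as they free up.
        futures = [pool.submit(load_bank, name, n) for name, n in banks.items()]
        return dict(f.result() for f in futures)

counts = load_all(banks)
```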

comment:4 Changed 10 years ago by Janik Zikovsky

Mutex-aware thread pool:

19.7484800816 secs for SNAP_4104
17.8455410004 secs for TOPAZ_1715
13.8989241123 secs for PG3_1370
18.4017541409 secs for SEQ_4533
205.25503397 secs for TOPAZ_1825
103.715818882 secs for SNAP_4112
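The ticket does not spell out what "mutex-aware" means here. One plausible reading, sketched below in Python under that assumption, is a pool in which a task may name a mutex it needs (for example a shared "disk" resource), and a worker skips over tasks whose mutex is busy and runs something else instead of blocking. Task, MutexAwarePool and the "disk" key are all hypothetical names for this illustration.

```python
import threading
import time

class Task:
    def __init__(self, name, func, mutex_key=None):
        self.name = name
        self.func = func
        self.mutex_key = mutex_key  # tasks sharing a key never run together

class MutexAwarePool:
    def __init__(self, n_threads):
        self.n_threads = n_threads
        self.pending = []
        self.busy_keys = set()
        self.cond = threading.Condition()

    def submit(self, task):
        self.pending.append(task)

    def _next_task(self):
        # Caller holds self.cond. Return the first task whose key is free.
        for i, task in enumerate(self.pending):
            if task.mutex_key is None or task.mutex_key not in self.busy_keys:
                return self.pending.pop(i)
        return None

    def _worker(self):
        while True:
            with self.cond:
                task = self._next_task()
                while task is None:
                    if not self.pending:
                        return  # nothing left to do
                    self.cond.wait()  # every remaining task needs a busy key
                    task = self._next_task()
                if task.mutex_key is not None:
                    self.busy_keys.add(task.mutex_key)
            try:
                task.func()
            finally:
                with self.cond:
                    self.busy_keys.discard(task.mutex_key)
                    self.cond.notify_all()

    def join_all(self):
        threads = [threading.Thread(target=self._worker) for _ in range(self.n_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

# Demo: four "disk" tasks that must not overlap, four free-running tasks.
state = {"active": 0, "max_active": 0, "done": []}
state_lock = threading.Lock()

def io_step(name):
    with state_lock:
        state["active"] += 1
        state["max_active"] = max(state["max_active"], state["active"])
    time.sleep(0.01)  # pretend to read from disk
    with state_lock:
        state["active"] -= 1
        state["done"].append(name)

def cpu_step(name):
    with state_lock:
        state["done"].append(name)

pool = MutexAwarePool(n_threads=4)
for i in range(4):
    pool.submit(Task("read%d" % i, lambda i=i: io_step("read%d" % i), mutex_key="disk"))
    pool.submit(Task("proc%d" % i, lambda i=i: cpu_step("proc%d" % i)))
pool.join_all()
```

In the demo, state["max_active"] records how many "disk" tasks ever ran at once; the scheduler keeps it at 1 while the other tasks proceed on the remaining threads.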

comment:5 Changed 10 years ago by Janik Zikovsky

Improved the scheduling by using cost estimates.

19.6926910877 secs for SNAP_4104
17.6305940151 secs for TOPAZ_1715
13.9349930286 secs for PG3_1370
17.384773016 secs for SEQ_4533
201.705959082 secs for TOPAZ_1825
103.561197996 secs for SNAP_4112
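How the cost estimates are used is not described in this comment. A common approach, assumed here purely for illustration, is to start the most expensive tasks first so that a large bank is not left to run alone at the end. The small simulation below compares the finish time of file-order scheduling with largest-first scheduling; the bank costs and the makespan helper are made up.

```python
import heapq

def makespan(costs, n_threads):
    """Simulate a pool where each task goes to the thread that frees up first."""
    finish_times = [0.0] * n_threads
    heapq.heapify(finish_times)
    for cost in costs:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + cost)
    return max(finish_times)

# Hypothetical per-bank cost estimates: one large bank comes last in file order.
bank_costs = [1, 1, 1, 1, 1, 1, 1, 1, 8]

in_order = makespan(bank_costs, n_threads=4)
largest_first = makespan(sorted(bank_costs, reverse=True), n_threads=4)
```

With these made-up costs, file order finishes at 10 time units and largest-first at 8, because the large bank overlaps with all the small ones instead of starting after them.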

comment:6 Changed 10 years ago by Janik Zikovsky

  • Status changed from accepted to verify
  • Resolution set to fixed

(In [9730]) Fixes #2452: ThreadPool implementation of LoadEventNexus; speed up can be 20% or so especially if there are empty banks, or no speed-up for other files.
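To put numbers on "20% or so", the timings posted in comment:2 (before) and comment:5 (after) can be compared directly. The helper percent_faster is just for this illustration; the data are copied from the comments above.

```python
# Timings in seconds, copied from comment:2 (before) and comment:5 (after).
before = {
    "SNAP_4104": 19.4002780914,
    "TOPAZ_1715": 17.7047979832,
    "SNAP_4112": 129.67632103,
    "PG3_1370": 15.9923739433,
    "TOPAZ_1825": 205.588195086,
    "SEQ_4533": 21.1869170666,
}
after = {
    "SNAP_4104": 19.6926910877,
    "TOPAZ_1715": 17.6305940151,
    "SNAP_4112": 103.561197996,
    "PG3_1370": 13.9349930286,
    "TOPAZ_1825": 201.705959082,
    "SEQ_4533": 17.384773016,
}

def percent_faster(t_before, t_after):
    """Reduction in run time as a percentage of the original run time."""
    return 100.0 * (t_before - t_after) / t_before

speedup = {name: percent_faster(before[name], after[name]) for name in before}
for name in sorted(speedup, key=speedup.get, reverse=True):
    print("%-11s %6.1f %% faster" % (name, speedup[name]))
```

On these figures SNAP_4112 gains about 20%, SEQ_4533 and PG3_1370 somewhat less, and SNAP_4104 and the TOPAZ files change by only a percent or two either way.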

comment:7 Changed 9 years ago by Vickie Lynch

  • Status changed from verify to verifying
  • Tester set to Vickie Lynch

comment:8 Changed 9 years ago by Vickie Lynch

  • Status changed from verifying to closed

Could not find event file for SEQ_4533, but other timings were better than 2 months ago using above script:

4.09357094765 secs for SNAP_4104
4.24875497818 secs for TOPAZ_1715
92.8791759014 secs for SNAP_4112
12.7033641338 secs for PG3_1370
187.958839893 secs for TOPAZ_1825

comment:9 Changed 5 years ago by Stuart Campbell

This ticket has been transferred to GitHub issue 3299.
