Ticket #7263 (closed: fixed)

Opened 7 years ago

Last modified 5 years ago

Load Algorithms, improve selection speed, particularly for Nexus files

Reported by: Nick Draper Owned by: Martyn Gigg
Priority: critical Milestone: Release 2.6
Component: Framework Keywords:
Cc: Blocked By: #7352
Blocking: #7065 Tester: Owen Arnold

Description (last modified by Peter Peterson)

If we pass an open file pointer to the FileCheck methods, the nexus loaders (and others) can use that pointer to perform their tests rather than each opening the file themselves. This should be much faster.

Source email from James Lord:

If I have a script/algorithm to process a batch of runs and I write it in a generic 
way using Load(), it takes about twice as long as using the correct specific load 
routine such as LoadMuonNexus(). I suspect Load() is first opening the file to 
inspect the format then closing it and passing the filename onto the format-specific 
version, which opens it again and ignores any cached data.

I've checked with files from the local disk and those from a network drive 
(//hifi/data) and while the local files load faster there is still the same 
improvement to be had by switching to LoadMuonNexus().

In this case I am (intentionally) loading all spectra, and the logs, from muon 
Nexus files of about 750kbytes each (on disk).

Is there any improvement to be made such as caching the file or being able to 
"hint" to Load() that the next file in a sequence is almost certainly going to be 
the same file format as the last one it just loaded?

James Lord.
(Mantid 2.5.3 at present, Windows 7)

Attachments

load.py (223 bytes) - added by Martyn Gigg 7 years ago.

Change History

comment:1 Changed 7 years ago by Peter Peterson

  • Description modified

Just formatting the quote so it is a bit easier to read.

comment:2 Changed 7 years ago by Peter Peterson

An additional bit of customization for the nexus readers:

  1. We have several "nexus" readers
  2. Nexus files have a hierarchical structure
  3. For nexus-based loaders we could pass around a bit of the hierarchy (as some sort of object?) for each one to run its deeper check against, rather than passing the file handle and having each loader rebuild part of the hierarchy itself
  4. Most nexus readers should be looking at the value of /entry/definition (and its attributes) as their first check
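
Points 3 and 4 can be sketched as follows. This is a minimal illustration in Python, not Mantid code: `HierarchySnapshot`, `looks_like_muon` and the sample value `muonTD` are hypothetical names chosen for the example, and the hierarchy is a plain dictionary rather than a real NeXus file.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class HierarchySnapshot:
    """Hypothetical object: the part of the NeXus tree that was read once."""

    # Maps a path such as "/entry/definition" to its string value.
    values: dict = field(default_factory=dict)

    def definition(self, entry="/entry"):
        """Return the value stored at <entry>/definition, or None if absent."""
        return self.values.get(f"{entry}/definition")


def looks_like_muon(snapshot):
    """Hypothetical first check for a muon loader: inspect /entry/definition."""
    definition = snapshot.definition()
    return definition is not None and "muon" in definition.lower()


# The snapshot is built once and handed to every loader check.
snapshot = HierarchySnapshot({"/entry/definition": "muonTD"})
print(looks_like_muon(snapshot))
```

Each loader check then reads from the shared snapshot instead of reopening the file and rebuilding the hierarchy itself.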

comment:3 Changed 7 years ago by Martyn Gigg

  • Blocked By 7352 added

comment:10 Changed 7 years ago by Martyn Gigg

Avoid duplicate Algorithm registration in API tests. Refs #7263

Changeset: 1ef3b9eac35deff16ec10b9663dd727eda2d2028

comment:11 Changed 7 years ago by Martyn Gigg

FileLoaderRegistry now subscribes algorithms to factory

and also defines two types of loader, NonHDF and HDF. Refs #7263

Changeset: 1571071e92df153fffebbf6e385cee40332ba18a

comment:12 Changed 7 years ago by Martyn Gigg

Add HDFDescriptor class that will describe HDF files.

The checks for the headers at the start of HDF4 & HDF5 files have been put in a static isHDF member to encapsulate the details about checking for a HDF file and make it reusable. It will not retain an open file handle to the file so does not inherit from the standard FileDescriptor. Refs #7263

Changeset: 8220d8c039e66db09b6eb912fb067638eaab4eb0
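
A header check of this kind can be sketched in Python as below. It is an illustration, not the C++ `isHDF` member itself: the function name `is_hdf` is hypothetical, and the sketch only looks at offset 0 (HDF5 also permits its signature at later offsets when a user block is present).

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # 8-byte HDF5 format signature
HDF4_MAGIC = b"\x0e\x03\x13\x01"       # 4-byte HDF4 magic number


def is_hdf(filename):
    """Return True if the file starts with an HDF4 or HDF5 header.

    The file is opened, a few bytes are read, and it is closed again,
    so no handle is retained after the call.
    """
    with open(filename, "rb") as handle:
        header = handle.read(len(HDF5_SIGNATURE))
    return header.startswith(HDF5_SIGNATURE) or header.startswith(HDF4_MAGIC)
```

Because the check is a free function that needs only a filename, it can be reused by any code that wants to separate HDF from non-HDF files before a descriptor object is built.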

comment:13 Changed 7 years ago by Martyn Gigg

Add pathExists & classTypeExists checks to HDFDescriptor. Refs #7263

Changeset: f28e98634808556c3a9f8349b8c8af86a36d00d8

comment:14 Changed 7 years ago by Martyn Gigg

Implement loader searching in FileLoaderRegistry.

The FileLoaderRegistry can now be asked to search its list of loaders for the best that matches a given file. It passes along a *Descriptor object to minimize the number of times the file is opened.

Refs #7263

Changeset: b3a0dd67688dd1cfbc77f3f48e338195e012a689
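
The search described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the FileLoaderRegistry implementation: `Descriptor`, `LoaderRegistry`, `choose_loader`, the 64-byte header and the integer confidence scale (0 meaning "cannot load") are all hypothetical.

```python
class Descriptor:
    """Hypothetical stand-in for a *Descriptor: the file is read once here."""

    def __init__(self, filename, header_size=64):
        self.filename = filename
        with open(filename, "rb") as handle:
            self.header = handle.read(header_size)


class LoaderRegistry:
    """Hypothetical registry mapping loader names to confidence functions."""

    def __init__(self):
        self._loaders = {}

    def subscribe(self, name, confidence):
        if name in self._loaders:
            raise ValueError(f"loader {name!r} is already registered")
        self._loaders[name] = confidence

    def choose_loader(self, filename):
        """Return the name of the loader reporting the highest confidence."""
        descriptor = Descriptor(filename)  # the only place the file is opened
        best_name, best_score = None, 0
        for name, confidence in self._loaders.items():
            score = confidence(descriptor)
            if score > best_score:  # ties keep the earlier registration
                best_name, best_score = name, score
        if best_name is None:
            raise LookupError(f"no registered loader accepts {filename!r}")
        return best_name
```

Every confidence function receives the same descriptor, so the number of file opens stays at one regardless of how many loaders are registered.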

comment:15 Changed 7 years ago by Martyn Gigg

Add ability to restrict path search in HDF file by type.

Refs #7263

Changeset: 50a5d7fb76c0ef5de4032c8c131319a2e47db53c

comment:16 Changed 7 years ago by Martyn Gigg

Add root attribute query methods to HDFDescriptor. Refs #7263

Changeset: 49fcfb3df704c1bff0f50f7fe064bb7644cf65b9

comment:17 Changed 7 years ago by Martyn Gigg

First lot of updates for new loader check style. Refs #7263

Changeset: f6b6a6ce3bf2184e13c6da4dc9590fc697345e50

comment:18 Changed 7 years ago by Martyn Gigg

Refs #7263 Formatting changes only

Changeset: e97ed18738e429bb81d79607c8356146380b7543

comment:19 Changed 7 years ago by Martyn Gigg

LoadILL changes. Refs #7263

Changeset: 79e76cb75788bcad9601485df20500795fdef3ea

comment:20 Changed 7 years ago by Martyn Gigg

Convert more loaders to new structure. Refs #7263

Changeset: ff828f6f5452c5b66e338597c9483779d9fb2676

comment:21 Changed 7 years ago by Martyn Gigg

More NeXus loaders converted. Refs #7263

Changeset: e61a0109313f6d000870dab7d139d4b44134791e

comment:22 Changed 7 years ago by Martyn Gigg

Add basic code for FileLoaderRegistry. Refs #7263

Currently based on string names.

Changeset: 886d0b00f831d546c4a148ced4039a0d711ccf5c

comment:23 Changed 7 years ago by Martyn Gigg

Add DECLARE_FILELOADER_ALGORITHM macro. Refs #7263

Changeset: b846d131744cfed0a7c35ea29b8eeb82323247ba

comment:24 Changed 7 years ago by Martyn Gigg

Add a simple FileDescriptor class that wraps a file stream

The stream is open for as long as the object is alive. Refs #7263

FileDescriptor uses streams rather than c file handles. Refs #7263

Changeset: 8681dffc29bc6a0c75b9fb931466d0857d76c23b
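
The lifetime rule ("open for as long as the object is alive") can be illustrated with a small Python analogue. It is a sketch only: the class below is not the C++ FileDescriptor, and the method names `reset_stream_to_start` and `close` are chosen for the example.

```python
class FileDescriptor:
    """Illustrative wrapper: the stream opens in __init__ and closes on exit."""

    def __init__(self, filename):
        self.filename = filename
        self._stream = open(filename, "rb")

    @property
    def stream(self):
        """The shared open stream that each loader check reads from."""
        return self._stream

    def reset_stream_to_start(self):
        """Rewind so the next check sees the file from its first byte."""
        self._stream.seek(0)

    def close(self):
        self._stream.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not swallow exceptions
```

A caller would create one descriptor, pass it to each check in turn, and rewind between checks so that one loader's reads do not affect the next.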

comment:25 Changed 7 years ago by Martyn Gigg

Allow move_class.py to move untracked files. Refs #7263

Changeset: 12dfaa006ada5f6def6b4fe1b73f658741bf5185

comment:26 Changed 7 years ago by Martyn Gigg

Fix AlgorithmFactory::unsubscribe method to update the version map.

It was previously not updating the version map when the algorithm was removed and then subscribing the same thing again would throw an error. Refs #7263

Changeset: f6992fc0057c1d0c27020a953126df5282ef6e7b

comment:27 Changed 7 years ago by Martyn Gigg

Have AlgorithmFactory::subscribe return the class name subscribed.

Refs #7263

Changeset: 7cff7af939060c3d81fd2b47b1db92c29e5a69fc

comment:28 Changed 7 years ago by Martyn Gigg

Avoid duplicate Algorithm registration in API tests. Refs #7263

Changeset: 2ba04b3ffffbd3ac9d4aef91165f4a325f314a4e

comment:29 Changed 7 years ago by Martyn Gigg

FileLoaderRegistry now subscribes algorithms to factory

and also defines two types of loader, NonHDF and HDF. Refs #7263

Changeset: 79e6051d0c199175aca57ff8790994d1e49ceee0

comment:30 Changed 7 years ago by Martyn Gigg

Add HDFDescriptor class that will describe HDF files.

The checks for the headers at the start of HDF4 & HDF5 files have been put in a static isHDF member to encapsulate the details about checking for a HDF file and make it reusable. It will not retain an open file handle to the file so does not inherit from the standard FileDescriptor. Refs #7263

Changeset: 6568320654211aa04f89504190f6450efa5d4db1

comment:31 Changed 7 years ago by Martyn Gigg

Add pathExists & classTypeExists checks to HDFDescriptor. Refs #7263

Changeset: 429bce6457a2d585ed1968271810e6e8f1a1469c

comment:32 Changed 7 years ago by Martyn Gigg

Implement loader searching in FileLoaderRegistry.

The FileLoaderRegistry can now be asked to search its list of loaders for the best that matches a given file. It passes along a *Descriptor object to minimize the number of times the file is opened.

Refs #7263

Changeset: 9b885511a5870d50b22e1eea10176edd20fe40f2

comment:33 Changed 7 years ago by Martyn Gigg

Add ability to restrict path search in HDF file by type.

Refs #7263

Changeset: eb79a4e433be3e3873635e2ed1e479068cb6b86a

comment:34 Changed 7 years ago by Martyn Gigg

Add root attribute query methods to HDFDescriptor. Refs #7263

Changeset: 4293cc33b80d553a09f04a4c5e5d0bf077cf8fef

comment:35 Changed 7 years ago by Martyn Gigg

First lot of updates for new loader check style. Refs #7263

Changeset: b4ba3236f57a206cd1b31c515a098ffdd555e47d

comment:36 Changed 7 years ago by Martyn Gigg

Refs #7263 Formatting changes only

Changeset: d6aabf8a2dc2a120d639feadce6d8089910cf8a6

comment:37 Changed 7 years ago by Martyn Gigg

LoadILL changes. Refs #7263

Changeset: bc0d6b3e8b8d052e48c6ed3acdd7b0a5604ffd83

comment:38 Changed 7 years ago by Martyn Gigg

Convert more loaders to new structure. Refs #7263

Changeset: 4722d5bf1a76540bbc18180ee621d535cf427bd7

comment:39 Changed 7 years ago by Martyn Gigg

More NeXus loaders converted. Refs #7263

Changeset: 29d535d4c94d86696fd7770a89115dd2e1289b90

comment:40 Changed 7 years ago by Martyn Gigg

Cache first entry name & type in HDFDescriptor. Refs #7263

Changeset: e8d298baff3ff52270e8a02e3b75fc8f6af5be47

comment:41 Changed 7 years ago by Martyn Gigg

Final NeXus loader converted. Refs #7263

Changeset: 04e325763073e37ed282d0501d29e298157d6621

comment:42 Changed 7 years ago by Martyn Gigg

Add static isAscii checks to FileDescriptor class. Refs #7263

Changeset: c0b04b5be6d9c40d395f8b7d396dab6f3ffa6bd7
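
One way such a check could work is sketched below in Python. The approach (sample the first bytes and reject anything outside printable ASCII plus common whitespace) and the 256-byte sample size are assumptions for illustration, not a description of the actual isAscii implementation.

```python
ASCII_WHITESPACE = b"\t\n\r\f\v"


def is_ascii(filename, nbytes=256):
    """Return True if the first nbytes contain only printable ASCII/whitespace.

    An empty file is reported as ASCII because no disqualifying byte is found.
    """
    with open(filename, "rb") as handle:
        chunk = handle.read(nbytes)
    return all(
        0x20 <= byte < 0x7F or byte in ASCII_WHITESPACE
        for byte in chunk
    )
```

A text-format loader can use a check like this as a cheap first filter before attempting to parse any columns or headers.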

comment:43 Changed 7 years ago by Martyn Gigg

Start on general file loader conversion. Refs #7263

Changeset: 3f102e2d868d4f17f2f409f275bfe578539a0945

comment:44 Changed 7 years ago by Martyn Gigg

More loaders converted. Refs #7263

Changeset: 71706740104269b15e2ac7021086b1e961fda55f

comment:45 Changed 7 years ago by Martyn Gigg

More loaders migrated. Refs #7263

Changeset: 7a1bd4e3e72dd409345dcd8c1dcbe863573ba59e

comment:46 Changed 7 years ago by Karl Palmen

  • Blocking 7065 added

comment:47 Changed 7 years ago by Martyn Gigg

Add basic code for FileLoaderRegistry. Refs #7263

Currently based on string names.

Changeset: c5ab6b955dfc544c1a43f00d526ebbb7829d1211

comment:48 Changed 7 years ago by Martyn Gigg

Add DECLARE_FILELOADER_ALGORITHM macro. Refs #7263

Changeset: 9dc140841fdd8d29ca1d6ba2bdd949c7c41ee1af

comment:49 Changed 7 years ago by Martyn Gigg

Add a simple FileDescriptor class that wraps a file stream

The stream is open for as long as the object is alive. Refs #7263

FileDescriptor uses streams rather than c file handles. Refs #7263

Changeset: 02509f87a8b33fd7973951c54abb5e7555685ec9

comment:50 Changed 7 years ago by Martyn Gigg

Allow move_class.py to move untracked files. Refs #7263

Changeset: df6f2db124b0e3f6712adc42ea3f517ee6d02c2a

comment:51 Changed 7 years ago by Martyn Gigg

Fix AlgorithmFactory::unsubscribe method to update the version map.

It was previously not updating the version map when the algorithm was removed and then subscribing the same thing again would throw an error. Refs #7263

Changeset: a23d639411aedc0dd1e726d369fcacbba28b73c4

comment:52 Changed 7 years ago by Martyn Gigg

Have AlgorithmFactory::subscribe return the class name subscribed.

Refs #7263

Changeset: 5fb5a0ba8fe71656ee4a4468bdea23acd9e85348

comment:53 Changed 7 years ago by Martyn Gigg

Avoid duplicate Algorithm registration in API tests. Refs #7263

Changeset: 688d09f690685015f61289beb0a14f4bfc766480

comment:54 Changed 7 years ago by Martyn Gigg

FileLoaderRegistry now subscribes algorithms to factory

and also defines two types of loader, NonHDF and HDF. Refs #7263

Changeset: e77a0a5738a1914befe30741d366495a4953f4c3

comment:55 Changed 7 years ago by Martyn Gigg

Add HDFDescriptor class that will describe HDF files.

The checks for the headers at the start of HDF4 & HDF5 files have been put in a static isHDF member to encapsulate the details about checking for a HDF file and make it reusable. It will not retain an open file handle to the file so does not inherit from the standard FileDescriptor. Refs #7263

Changeset: b33429b36beee34fcaf3d191ceb735522ce3e5ec

comment:56 Changed 7 years ago by Martyn Gigg

Add pathExists & classTypeExists checks to HDFDescriptor. Refs #7263

Changeset: 90537dd6136b16a531bd2e454e85bf32d086b5d8

comment:57 Changed 7 years ago by Martyn Gigg

Implement loader searching in FileLoaderRegistry.

The FileLoaderRegistry can now be asked to search its list of loaders for the best that matches a given file. It passes along a *Descriptor object to minimize the number of times the file is opened.

Refs #7263

Changeset: ee1d2ba84a8748c70ae32afb7f980b9d849a29c9

comment:58 Changed 7 years ago by Martyn Gigg

Add ability to restrict path search in HDF file by type.

Refs #7263

Changeset: 0908502cb75249ec3a7431db2e11e4f3e2a44ff5

comment:59 Changed 7 years ago by Martyn Gigg

Add root attribute query methods to HDFDescriptor. Refs #7263

Changeset: 4743494db4debe9cf778e4a1610e67c89d76b89f

comment:60 Changed 7 years ago by Martyn Gigg

Cache first entry name & type in HDFDescriptor. Refs #7263

Changeset: 3b88b845022c5653d8cfa6ce34cf4fa229d43d3c

comment:61 Changed 7 years ago by Martyn Gigg

Refs #7263 Formatting changes only

Changeset: 574a70fe79a4720f96e73a0ef406e4679a57041b

comment:62 Changed 7 years ago by Martyn Gigg

First lot of updates for new loader check style. Refs #7263

Changeset: 11988b9a591149fde1dd15e28d0e4b0968a5f8a8

comment:63 Changed 7 years ago by Martyn Gigg

Add static isAscii checks to FileDescriptor class. Refs #7263

Changeset: d94c7d27c1d50d3549081a8c39608445dfa15beb

comment:64 Changed 7 years ago by Martyn Gigg

General file loader conversion. Refs #7263

Changeset: 5fa890fbdf81568763df6c5d16d16341ba9a6ed7

comment:65 Changed 7 years ago by Martyn Gigg

Cache isAscii flag in FileDescriptor. Refs #7263

Changeset: 9a2601fcdf1a94350f05be91554ff82d9faa11c3

comment:66 Changed 7 years ago by Martyn Gigg

Port final DataHandling loaders to new structure. Refs #7263

Changeset: b3236b5c18c398bf322c23de241ba9e5384c5cdf

comment:67 Changed 7 years ago by Martyn Gigg

Move FileRegistry to singleton Refs #7263

It appears that we can't start the FrameworkManager during static initialization or the destruction of the FrameworkManager instance gets confused.

Changeset: db8d95c8cdaf648b2439c416724bb59ac2617d65

comment:68 Changed 7 years ago by Martyn Gigg

Move the remaining loaders to the new scheme. Refs #7263

Changeset: 2a330f57d781248e4b8ec0502726d1821c5004c5

comment:69 Changed 7 years ago by Martyn Gigg

Remove file access from some NeXus checks. Refs #7263

Changeset: 146d72e7c593ab85fed6009cb6493a76633a0a40

comment:70 Changed 7 years ago by Martyn Gigg

Fix bug with running correct version of loader. Refs #7263

Changeset: e1dcfe7d4145ea6c9a1d7d51435afffd3cc27c3d

comment:71 Changed 7 years ago by Martyn Gigg

Flip path/type map in HDFDescriptor. Refs #7263

Looking up paths is far more common so make that faster.

Changeset: 4c4155e8e58c12d3dcac4430bc95d233fd442025
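
The trade-off can be shown with plain dictionaries. This Python sketch is illustrative only: the sample paths and class names are made up, and `flip`, `path_exists` and `class_type_exists` are hypothetical helpers named after the checks mentioned in this ticket.

```python
def flip(type_to_paths):
    """Turn {class type: [paths]} into {path: class type}."""
    return {
        path: class_type
        for class_type, paths in type_to_paths.items()
        for path in paths
    }


def path_exists(path_to_type, path):
    """Common case: a single hash lookup keyed on the path."""
    return path in path_to_type


def class_type_exists(path_to_type, class_type):
    """Rarer case: a linear scan over the stored class types."""
    return class_type in path_to_type.values()


type_to_paths = {
    "NXentry": ["/entry"],
    "NXdata": ["/entry/data", "/entry/monitor"],
}
path_to_type = flip(type_to_paths)
print(path_exists(path_to_type, "/entry/data"))
```

Keying on the path makes the frequent lookup cheap at the cost of a scan for the less frequent type query.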

comment:72 Changed 7 years ago by Martyn Gigg

FileDescriptor::resetStreamToStart needs to reset ios flags too.

Refs #7263

Changeset: c9f87457614765bdf53272037f567d0eae486c13

comment:73 Changed 7 years ago by Martyn Gigg

Add base-class checks to FILELOADER* check macros. Refs #7263

Also added runtime checks to the subscribe method. This allows it to static cast at runtime when checking rather than a more expensive dynamic cast.

Changeset: 34785e5194cf9bc29e5435b4973f541ba76ed4df

comment:74 Changed 7 years ago by Martyn Gigg

Fix LoadMD::confidence path checks. Refs #7263

Changeset: 961b02983a7752038c2532068499f794a8cbc60c

comment:75 Changed 7 years ago by Martyn Gigg

Fix EventPreNexus & RKH file checks. Refs #7263

Changeset: 7b1bfbe989a08c9dadc96056e2e8e9cfdd48fcee

comment:76 Changed 7 years ago by Martyn Gigg

Add access to open NeXus file in HDFDescriptor. Refs #7263

Changeset: 19119d59871acd545067afbca480bce35ee22941

comment:77 Changed 7 years ago by Martyn Gigg

Allow stored NeXus handle to be used in ::confidence checks. Refs #7263

Changeset: d60fc3449766ac8f8a29363ca8099e65199ab7c2

comment:78 Changed 7 years ago by Martyn Gigg

Ignore new LoaderVersion property in Python Load. Refs #7263

Changeset: 62314ef77d2c164f7d87892da7a333e40628a12d

comment:109 Changed 7 years ago by Martyn Gigg

Allow stored NeXus handle to be used in ::confidence checks. Refs #7263

Changeset: 7aad82fbecd50e5a2b784f68e954849571f06468

comment:110 Changed 7 years ago by Martyn Gigg

Ignore new LoaderVersion property in Python Load. Refs #7263

Changeset: 775e5249f4dead9814565de7a26edd4afff5b6d2

comment:111 Changed 7 years ago by Martyn Gigg

Merge branch 'feature/7263_improve_load_selection_performance'

into develop Refs #7263

Conflicts:

Code/Mantid/Framework/DataHandling/inc/MantidDataHandling/LoadILL.h Code/Mantid/Framework/DataHandling/src/LoadEventNexus.cpp Code/Mantid/Framework/DataHandling/src/LoadILL.cpp Code/Mantid/Framework/DataHandling/src/LoadSassena.cpp

Changeset: 49a10d7e33c34eaa40e0cde4ceaebd0b23909473

comment:112 Changed 7 years ago by Martyn Gigg

Remove valgrind header left accidentally. Refs #7263

Changeset: 82d931398d792e4d9a3e58ebdead81393687b85e

comment:113 Changed 7 years ago by Martyn Gigg

Fix ifstream initialization in FileDescriptor. Refs #7263

Changeset: 5ff698b5c973e64809b9a1b886215abbacc92a18

comment:114 Changed 7 years ago by Martyn Gigg

Remove compile-time checks on FileLoader classes.

They don't seem to work everywhere. My guess is that the type isn't quite complete by the time we use it, so some compilers don't like it. Refs #7263

Changeset: 4d64a81a01043122ca1f1ada303ec6a1570d5b4c

comment:115 Changed 7 years ago by Martyn Gigg

Avoid double-algorithm registration in tests. Refs #7263

Changeset: 6e678e5ca625f1f0b25695d4f1d334ce0fa1f920

comment:121 Changed 7 years ago by Martyn Gigg

Have AlgorithmFactory::subscribe return the class name subscribed.

Refs #7263

Changeset: 483acce7607ca2aeb6bae584523fd487a80754d5

comment:122 Changed 7 years ago by Martyn Gigg

Avoid duplicate Algorithm registration in API tests. Refs #7263

Changeset: 885088de599e1d0f00bcd7cca06ef888a84fa802

comment:123 Changed 7 years ago by Martyn Gigg

FileLoaderRegistry now subscribes algorithms to factory

and also defines two types of loader, NonHDF and HDF. Refs #7263

Changeset: 52b5f1a5a5a1884401f76143e09290e785c49142

comment:124 Changed 7 years ago by Martyn Gigg

Add HDFDescriptor class that will describe HDF files.

The checks for the headers at the start of HDF4 & HDF5 files have been put in a static isHDF member to encapsulate the details about checking for a HDF file and make it reusable. It will not retain an open file handle to the file so does not inherit from the standard FileDescriptor. Refs #7263

Changeset: da30646f22af90d25e18a0be3b2bb56cef973896

comment:125 Changed 7 years ago by Martyn Gigg

Add pathExists & classTypeExists checks to HDFDescriptor. Refs #7263

Changeset: 367b6232759944ca13e334dc859acf813fd687bc

comment:126 Changed 7 years ago by Martyn Gigg

Implement loader searching in FileLoaderRegistry.

The FileLoaderRegistry can now be asked to search its list of loaders for the best that matches a given file. It passes along a *Descriptor object to minimize the number of times the file is opened.

Refs #7263

Changeset: 132675199cf21badd7d4952a78fe6b42c48c3bfb

comment:127 Changed 7 years ago by Martyn Gigg

Add ability to restrict path search in HDF file by type.

Refs #7263

Changeset: 9a591118e449763fba5aefd5c9a84edac6d53e58

comment:128 Changed 7 years ago by Martyn Gigg

Add root attribute query methods to HDFDescriptor. Refs #7263

Changeset: ec3e85aaf4496543aca35c186d190c2eba46e111

comment:129 Changed 7 years ago by Martyn Gigg

Cache first entry name & type in HDFDescriptor. Refs #7263

Changeset: 5d723229c6be4088067f51a6c250eeadad0721da

comment:130 Changed 7 years ago by Martyn Gigg

Refs #7263 Formatting changes only

Changeset: ff1f7ad4c9d894da2e96b50180809cca41273fb6

comment:131 Changed 7 years ago by Martyn Gigg

First lot of updates for new loader check style. Refs #7263

Changeset: 5eec9296ab81d729b10ed22c87be140367bc9941

comment:132 Changed 7 years ago by Martyn Gigg

Add static isAscii checks to FileDescriptor class. Refs #7263

Changeset: bbc2514e5a1fe7b238dd26fe5fb8baf70183a877

comment:133 Changed 7 years ago by Martyn Gigg

General file loader conversion. Refs #7263

Changeset: 94047da498146c0db71a9a671ace0dacdb106b28

comment:134 Changed 7 years ago by Martyn Gigg

Cache isAscii flag in FileDescriptor. Refs #7263

Changeset: 248c7047a90f0972a4abc02d308e04a8d353d85f

comment:135 Changed 7 years ago by Martyn Gigg

Port final DataHandling loaders to new structure. Refs #7263

Changeset: 076350e97bd4ff2c49f7dd2ab31b3b10620b2371

comment:136 Changed 7 years ago by Martyn Gigg

Move FileRegistry to singleton Refs #7263

It appears that we can't start the FrameworkManager during static initialization or the destruction of the FrameworkManager instance gets confused.

Changeset: f4dbdb03111d909f57b1094dbbdffdc2f4fd564a

comment:137 Changed 7 years ago by Martyn Gigg

Move the remaining loaders to the new scheme. Refs #7263

Changeset: b008a1348f9862f22722d7d42bb3edb6c61c60be

comment:138 Changed 7 years ago by Martyn Gigg

Remove file access from some NeXus checks. Refs #7263

Changeset: 9d03dfb0ad782579c16eca779ca9bf2ecae8e9bf

comment:139 Changed 7 years ago by Martyn Gigg

Fix bug with running correct version of loader. Refs #7263

Changeset: 5fd18f632d4bef0eda7dbd97e8732389f38fcabf

comment:140 Changed 7 years ago by Martyn Gigg

Flip path/type map in HDFDescriptor. Refs #7263

Looking up paths is far more common so make that faster.

Changeset: 14a7ab1ce7ae4b05b4f56e7596bd289591d6e679

comment:141 Changed 7 years ago by Martyn Gigg

FileDescriptor::resetStreamToStart needs to reset ios flags too.

Also fix initialization that is not a pointer. Refs #7263

Changeset: 4ff024a68acbb8083a6a2a69da44fce94e9ea4d8

comment:142 Changed 7 years ago by Martyn Gigg

Add base-class checks to FILELOADER* check macros. Refs #7263

Also added runtime checks to the subscribe method. This allows it to static cast at runtime when checking rather than a more expensive dynamic cast.

Changeset: e3e18db9a3750df686e41badd27fb7ec0dfcc789

comment:143 Changed 7 years ago by Martyn Gigg

Fix LoadMD::confidence path checks. Refs #7263

Changeset: 7d124b9dee7bed2b6a75677b542e56d9566fe9e7

comment:144 Changed 7 years ago by Martyn Gigg

Fix EventPreNexus & RKH file checks. Refs #7263

Changeset: 0e0bd35179723d2b608093b619d3eb55ea9d4a04

comment:145 Changed 7 years ago by Martyn Gigg

Add access to open NeXus file in HDFDescriptor. Refs #7263

Changeset: 5a1a53c2c09e4b07f288cf35a32850492dd1e785

comment:146 Changed 7 years ago by Martyn Gigg

Allow stored NeXus handle to be used in ::confidence checks. Refs #7263

Changeset: 96f42591d31ec8fddf76896e657631104b1e6622

comment:147 Changed 7 years ago by Martyn Gigg

Ignore new LoaderVersion property in Python Load. Refs #7263

Changeset: aa26bb649bb8173d4b88af5c03ffa2356afc73ae

comment:148 Changed 7 years ago by Martyn Gigg

Remove compile-time checks on FileLoader classes.

They don't seem to work everywhere. My guess is that the type isn't quite complete by the time we use it, so some compilers don't like it. Refs #7263

Changeset: 84aa5968b2caacb2ac575ce7f3ff24cea2939b4b

comment:149 Changed 7 years ago by Martyn Gigg

Fix LoadSNSspec test.

A \r left in the file after opening in binary mode confused the check. Refs #7263

Changeset: 6292160c0d040a146cd65649579a37b0f05c585a
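
The pitfall is easy to reproduce in Python, where binary mode also performs no newline translation. The sketch below is a generic illustration, not the LoadSNSspec fix; `first_line` is a hypothetical helper.

```python
def first_line(filename):
    """Read the first line in binary mode and strip any line ending.

    In binary mode a Windows-style line arrives as b"...\r\n", so stripping
    only "\n" would leave a trailing "\r" that can break string comparisons.
    """
    with open(filename, "rb") as handle:
        raw = handle.readline()
    return raw.rstrip(b"\r\n").decode("ascii", errors="replace")
```

Comparing the stripped line against an expected header then works the same for files written with either line-ending convention.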

comment:150 Changed 7 years ago by Martyn Gigg

Add canLoad method to FileLoaderRegistry.

This better encapsulates the file checking so that other places don't have to worry about the descriptor stuff. Refs #7263

Changeset: a5c9c3096f25cdbbeedd23f65c320ea6f5083b76

comment:151 Changed 7 years ago by Martyn Gigg

Remove IDataFileChecker & LoadAlgorithmFactory. Refs #7263

Changeset: 39bfe42ca9f987cdcd0036799c070346c6a6a26e

comment:146 Changed 7 years ago by Martyn Gigg

Merge branch 'feature/7263_improve_load_selection_performance' into develop

Refs #7263 Conflicts:

Code/Mantid/Framework/DataHandling/inc/MantidDataHandling/LoadILL.h Code/Mantid/Framework/DataHandling/src/LoadEventNexus.cpp Code/Mantid/Framework/DataHandling/src/LoadILL.cpp Code/Mantid/Framework/DataHandling/src/LoadSassena.cpp

Changeset: 134bd768072544227c66568dd50764973035c92b

comment:147 Changed 7 years ago by Martyn Gigg

Temp move LoadILLSANS to new loader hierarchy on develop. Refs #7263

Changeset: 014f1fec3f379936607f4e72578e29b08e9c7c36

comment:148 Changed 7 years ago by Martyn Gigg

Fixes for OSX 10.8 & Intel compiler.

It doesn't generate a symbol for the destructor of the *FileLoader interfaces unless explicitly defined. All other platforms, including OS X 10.6, don't show this problem. Refs #7263

Changeset: 04e169774094e6307a6f24e20cc31f0c979242f8

comment:149 Changed 7 years ago by Martyn Gigg

Merge branch 'feature/7263_improve_load_selection_performance' into develop

Refs #7263 Conflicts:

Code/Mantid/Framework/API/inc/MantidAPI/IFileLoader.h Code/Mantid/Framework/API/inc/MantidAPI/IHDFFileLoader.h Code/Mantid/Framework/DataHandling/inc/MantidDataHandling/LoadILL.h Code/Mantid/Framework/DataHandling/src/LoadILL.cpp Code/Mantid/Framework/DataHandling/src/LoadSassena.cpp

Changeset: 10264aba28915ee6805abeb17bfa3f6aef967161

comment:150 Changed 7 years ago by Martyn Gigg

Fix istream construction in FileDescriptorTest.

The standard only allows C strings, but the other platforms seem to accept std::strings as well. Refs #7263

Changeset: 367fe425b49a021bf1d6f71be23bde4429b8db48

comment:151 Changed 7 years ago by Martyn Gigg

  • Status changed from new to accepted

comment:152 Changed 7 years ago by Martyn Gigg

  • Status changed from accepted to verify
  • Resolution set to fixed

Branch: feature/7263_improve_load_selection_performance

Tester: There are a lot of changes here. Things got rebased as well, so the comments are a little bit of a mess. Best to see here for the changes.

The whole idea of file checking has now changed and is described here:

http://www.mantidproject.org/Hooking_In_With_Load_Algorithm

so that it is available for developers. The file being checked now only gets opened once and a wrapper object is passed to each of the loaders that gives access to meta-data & the open file stream.

There is also a special object for HDF files that caches other aspects of the file to avoid each loader having to open & parse the file.

The tests (both unit & system) should be passing; see https://builds.sns.gov/job/ornl_test_rhel6_develop/ for the most up-to-date test results.

comment:153 Changed 7 years ago by Owen Arnold

  • Status changed from verify to verifying
  • Tester set to Owen Arnold

Changed 7 years ago by Martyn Gigg

  • Attachment load.py added

comment:154 Changed 7 years ago by Martyn Gigg

Attached script has a loop for loading a file. Flip it between Load & LoadMuonNexus to see the difference.
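
A timing loop of this kind might look like the sketch below. It is not a reconstruction of the attached load.py, whose contents are not shown in this ticket; `time_loader`, the repeat count and the file name in the comments are assumptions.

```python
import time


def time_loader(load, filename, repeats=10):
    """Call load(filename) `repeats` times and return the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        load(filename)
    return time.perf_counter() - start


# Possible use inside a Mantid Python session (file name is a placeholder):
#   from mantid.simpleapi import Load, LoadMuonNexus
#   generic = time_loader(lambda f: Load(Filename=f, OutputWorkspace="ws"), "run.nxs")
#   direct = time_loader(lambda f: LoadMuonNexus(Filename=f, OutputWorkspace="ws"), "run.nxs")
```

Passing the loader in as a callable makes it a one-line change to flip between the generic and the format-specific algorithm.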

comment:155 Changed 7 years ago by Owen Arnold

Before the changes, the script loads in 3.93 seconds; after the changes, it loads in 2.33 seconds. Both were run on a Debug build on the same Mac. By way of reference, I've also run the script using the LoadMuonNexus algorithm directly, and find that it takes 2.02 seconds. So the cost of finding the right loader is now quite small in comparison to what it was previously.
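
The arithmetic behind that conclusion can be made explicit. The short Python calculation below uses only the three timings quoted above and treats the difference from the direct LoadMuonNexus run as the loader-selection overhead, which is an interpretation rather than a separately measured quantity.

```python
before, after, direct = 3.93, 2.33, 2.02  # seconds, from the timings above

overhead_before = before - direct  # selection cost before the changes
overhead_after = after - direct    # selection cost after the changes
reduction = 1 - overhead_after / overhead_before

print(f"{overhead_before:.2f} s -> {overhead_after:.2f} s ({reduction:.0%} less)")
# prints "1.91 s -> 0.31 s (84% less)"
```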

comment:156 Changed 7 years ago by Owen Arnold

  • Status changed from verifying to closed

comment:157 Changed 7 years ago by Anders Markvardsen

Merge remote-tracking branch 'origin/feature/7263_improve_load_selection_performance'

Refs #7263 Conflicts:

Code/Mantid/Framework/DataHandling/src/LoadEventNexus.cpp Code/Mantid/Framework/DataHandling/src/LoadSassena.cpp

Changeset: 28afe6d1afe3b8329978e5d8b0413f3938e08629

comment:158 Changed 7 years ago by Nick Draper

  • Component changed from Mantid to Framework

comment:159 Changed 5 years ago by Stuart Campbell

This ticket has been transferred to github issue 8109
