Ticket #1778 (closed: wontfix)
PythonAPI: create workspace method to get multi-dim Numpy array with all spectra
Reported by: | Mathieu Doucet | Owned by: | Nick Draper |
---|---|---|---|
Priority: | major | Milestone: | Iteration 25 |
Component: | Python | Keywords: | |
Cc: | doucetm@… | Blocked By: | |
Blocking: | | Tester: | Mathieu Doucet |
Description
Although all data manipulation for analysis should be done by algorithms, UI widgets also need to be able to access the data. Right now, the only option is the readX/Y/E methods, which return a single spectrum. In cases like SANS, where each pixel of the 2D detector is a spectrum, one has to loop over N*N pixels to get the whole detector array. This can be very slow for a widget coded in Python. We should add three methods along the lines of read_all_spectraX/Y/E, each returning a 2D NumPy array of size N_x * N_y, where N_x is the number of X bins (TOF) and N_y is the number of spectra.
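A minimal sketch of what such a method might look like from the caller's side. Everything here is an assumption for illustration: `StubWorkspace` stands in for a real workspace, `read_all_spectra_y` is a hypothetical helper rather than an existing PythonAPI call, and the row-per-spectrum layout is just one possible reading of "N_x * N_y".

```python
import numpy as np


class StubWorkspace:
    """Hypothetical stand-in for a workspace: one separate array per spectrum."""

    def __init__(self, spectra):
        self._spectra = [np.asarray(s, dtype=float) for s in spectra]

    def getNumberHistograms(self):
        # Assumed accessor for the number of spectra (N_y).
        return len(self._spectra)

    def readY(self, index):
        # Mirrors the single-spectrum access described in the ticket.
        return self._spectra[index]


def read_all_spectra_y(ws):
    """Hypothetical helper: Y values as an (N_y, N_x) array, one row per spectrum."""
    return np.vstack([ws.readY(i) for i in range(ws.getNumberHistograms())])


ws = StubWorkspace([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
all_y = read_all_spectra_y(ws)
print(all_y.shape)  # (2, 3)
```

The X and E variants would presumably follow the same pattern with readX and readE.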
Change History
comment:3 Changed 10 years ago by Nick Draper
- Owner set to Nick Draper
- Status changed from new to verify
- Resolution set to wontfix
We could do this, and add such a method, but it would be for ease of use rather than performance.
NumPy arrays require contiguous memory, and the memory for each spectrum's array is separate (on purpose, as we absolutely did not want to need contiguous memory). As such, we could only create a multidimensional NumPy array by copying the data into a new NumPy array. Overall, I think this would be slower and would of course increase running memory requirements.
This would therefore be a method we could expose but would, in most circumstances, prefer people not to use, in which case I'm not sure about creating the method in the first place.
If you disagree feel free to reopen the ticket.
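The copy described in this comment can be illustrated with plain NumPy. This is only a sketch: the independently allocated arrays below stand in for the per-spectrum storage and say nothing about how the workspace actually lays out its memory.

```python
import numpy as np

# Three spectra allocated independently, mimicking separate per-spectrum storage.
spectra = [np.arange(4, dtype=float) + 10 * i for i in range(3)]

# Building a 2D array allocates a new contiguous block and copies every value into it.
stacked = np.stack(spectra)

print(stacked.flags["C_CONTIGUOUS"])  # True
print(any(np.shares_memory(stacked, s) for s in spectra))  # False

# Writing to the copy leaves the original spectrum untouched.
stacked[0, 0] = -1.0
print(spectra[0][0])  # 0.0
```

The 2D array is a second copy of the data, so memory use grows by the full size of the stacked array while it is alive.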
comment:5 Changed 10 years ago by Mathieu Doucet
- Status changed from verify to verifying
- Tester set to Mathieu Doucet
That's a good argument, although the end result ends up being the same if we have to fill a new NumPy array. For instance, if we need to pass an array to matplotlib for plotting, our only solution currently is to create a new NumPy array, loop over all spectra, call readX/Y/E and fill each array bin. In Python, that loop is very slow. So although it really is just a convenience method, it would speed up the UI response by quite a bit and have the same memory impact (because we have to create the array no matter what). We'd just have to make sure that it's only used properly. I'll think about it and may re-open the ticket.
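For reference, the workaround described above looks roughly like the following. This is a sketch with stand-in data: `read_y` is a hypothetical substitute for the workspace's readY, and the sizes are arbitrary.

```python
import numpy as np

# Stand-in data: 3 spectra of 4 bins each, held as separate arrays.
spectra = [np.full(4, float(i)) for i in range(3)]


def read_y(index):
    """Hypothetical stand-in for the workspace's readY(index)."""
    return spectra[index]


# Current workaround: allocate the target array, then fill it one bin at a time.
image = np.empty((len(spectra), 4))
for i in range(len(spectra)):
    y = read_y(i)
    for j in range(y.size):
        image[i, j] = y[j]

# The array is the same size whoever fills it: 3 spectra * 4 bins * 8 bytes.
print(image.nbytes)  # 96
```

The nested loop runs in the Python interpreter once per bin, which is where the time goes. The resulting array would be the same size if the filling happened inside the API instead.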
comment:7 Changed 10 years ago by Russell Taylor
The risks of injudicious use of such a method were apparent when we first discussed this, which is why I didn't like the idea of putting it directly on MatrixWorkspace but was more comfortable with having it in the PythonAPI.
In fact, what Mathieu needs to do here is take a workspace with only 1 bin and get it into an array. This is already possible on the C++ side using the Transpose algorithm (a rather heavyweight way of doing it), but a method to retrieve a single 'vertical' bin of a workspace would also work and be less open to abuse.
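A rough sketch of the 'vertical' bin idea, under stated assumptions: `read_y` is a hypothetical stand-in for readY, `read_bin_y` is an illustrative name rather than an existing method, and the 2 x 2 detector is made up for the example.

```python
import numpy as np

# Stand-in data: 4 spectra (detector pixels), each with a single bin.
spectra = [np.array([float(i)]) for i in range(4)]


def read_y(index):
    """Hypothetical stand-in for the workspace's readY(index)."""
    return spectra[index]


def read_bin_y(n_spectra, bin_index):
    """Hypothetical helper: the Y value of one bin from every spectrum, as a 1D array."""
    values = (read_y(i)[bin_index] for i in range(n_spectra))
    return np.fromiter(values, dtype=float, count=n_spectra)


column = read_bin_y(len(spectra), 0)
print(column)  # [0. 1. 2. 3.]

# For a square detector, the column can be reshaped into an image for plotting.
detector = column.reshape(2, 2)
```

Because only one value per spectrum is copied, the result has N_y elements rather than N_x * N_y, which limits the memory cost of misuse.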