Ticket #9022 (closed: fixed)

Opened 7 years ago

Last modified 5 years ago

Run CheckWorkspaceMatch serially when investigating comparison logs

Reported by: Alex Buts Owned by: Alex Buts
Priority: major Milestone: Release 3.2
Component: Framework Keywords:
Cc: Blocked By:
Blocking: Tester: Martyn Gigg

Description

Changes to checkWorkspaceMatch allowing to run it in parallel certainly accelerate things and fine when things are correct but are absolutely unhelpful if one wants to look why and were things went wrong. logs are produced in parallel and it is impossible to understand what they are reporting.

This is why it is worth to run comparison serially (or build serial logs which is much more difficult task) if one is interested in looking at comparison logs (on debug level)

Change History

comment:1 Changed 7 years ago by Alex Buts

  • Status changed from new to inprogress

refs #9022 Should solve the problem

by prohibiting parallel execution for CheckWorkspacesMatch in one selects "debug" and "trace" log levels

Changeset: ac40ab632f7a3f12da8161f265b9bf5c78e234f6

comment:2 Changed 7 years ago by Alex Buts

refs #9022 Incorrect comparison fixed

Changeset: 21e067b8216ba6ec172894c7ace2441e1413e765

comment:3 Changed 7 years ago by Alex Buts

refs #9022 Histogram data comparison also serialized when requested

Changeset: 6e07846b15f4166beb6f495e94f6587b5bab6dc1

comment:4 Changed 7 years ago by Alex Buts

refs #9022 Unit test for log levels

Changeset: 451a7a29e119f8a3797845b21deab690683b8193

comment:5 Changed 7 years ago by Alex Buts

refs #9022 Should solve the problem

by prohibiting parallel execution for CheckWorkspacesMatch in one selects "debug" and "trace" log levels

Changeset: 2964bea150fccae89214bc9dbdf999b68898aca0

comment:6 Changed 7 years ago by Alex Buts

refs #9022 Incorrect comparison fixed

Changeset: 1e3a3f6b00f7cccaced054178dbb692ce3414155

comment:7 Changed 7 years ago by Alex Buts

refs #9022 Histogram data comparison also serialized when requested

Changeset: 7996832f28bd258a27c6ceabbeb0bb8fa81a8ac8

comment:8 Changed 7 years ago by Alex Buts

refs #9022 Unit test for log levels

Changeset: f68a043808334cb3ae33ef415a32e78393276ca0

comment:9 Changed 7 years ago by Alex Buts

refs #9022 fixing unix build

by removing references to POCO methods which do not exist on Unix

Changeset: 4746ee7705f867548a0effe339359810c697b0c4

comment:10 Changed 7 years ago by Alex Buts

To test:

find 2 histogram workspaces which are different on data level* and run CheckWorkspaceMatch on multiprocessor machine before and after applying this patch.

Choose Debug level on results window to see the reason for the data mismatch.

Before the patch CheckWorkspaceMatch generates some gibberish composed by output from different threads. After the patch you will be able to see single (first) mismatched spectra with clear view for what the differences are.

*(easy way to do get such workspaces is to take ISISDirectInelasti system test e.g. MARIReductionFromWorkspace -- it is very fast, set self.sample_mass = 11 (row 176)) and run the test -- it will generate MARIReductionFromFile-mismatch.nxs which is slightly different from the sample MARIReduction.nxs)

Last edited 7 years ago by Alex Buts (previous) (diff)

comment:11 Changed 7 years ago by Alex Buts

  • Status changed from inprogress to verify
  • Resolution set to fixed

this ticket has been created to help with another ticket where it has been initially resided and then rebased to master. As the result track is in mess. Look in Git for code changes -- they are small.

Should be easy to understand and test the changes. Testing guidance -- in the message above.

comment:12 Changed 7 years ago by Alex Buts

refs #9022 Reverting CheckWorkspacesMatch to division under fabs

Somebody did a good job making this algorithm look nice but useless, so I spent long time looking in amazement why workspaces which look identical do not match. A coordinate (and sometimes signal and error! can be negative!) This fixes this problem.

Changeset: 7012b8a76eb0dc350247bf6b0749d290cbd7f49f

comment:13 Changed 7 years ago by Alex Buts

refs #9022 No, this actually fixes negative values problem

Changeset: cbb4cd6f4f768ab22bc0b51b626474f6b2d13cea

comment:14 Changed 7 years ago by Alex Buts

refs #9022 Proper error estimation

Evaluating problem I am having comparing two similar inelastic workspaces I've modified comparison algorithm to work with relative and absolute error. This is actually what you want comparing two workspaces and some kind of this approach was already in place for values close to 0

Changeset: 02176858c3c201cfe94e99e15e6f3150022acef3

comment:15 Changed 7 years ago by Alex Buts

refs #9022 after some thinking, I've decided that standard definition

of Relerr with more careful treatment of values near 0 is preferable implementation.

Changeset: 0d47d5b2525525e208d68e17f316f8464342d610

comment:16 Changed 7 years ago by Alex Buts

refs #9022 A simple comment

Changeset: 0fdfa950b9874ba1350becebcdc190e9b69d367a

comment:17 Changed 7 years ago by Alex Buts

Trying to compare two complex workspaces I tried different ways of comparing the data and calculating the error. Final result which suits the purpose is the standard relative error with careful treatment of small values. As the change is minor and related to CheckWorkspaceMatch algorithm it well fits within this ticket.

comment:18 Changed 7 years ago by Nick Draper

  • Status changed from verify to verifying
  • Tester set to Nick Draper

The branch

comment:19 Changed 7 years ago by Nick Draper

This branch has not Fully been merged to develop, please run a checkbuild and send for testing again.

This wasted 15 minutes of my life, as I foolishly started testing the current state before running git test start.

comment:20 Changed 7 years ago by Nick Draper

  • Status changed from verifying to reopened
  • Resolution fixed deleted

comment:21 Changed 7 years ago by Alex Buts

  • Status changed from reopened to verify
  • Resolution set to fixed

comment:22 Changed 7 years ago by Martyn Gigg

  • Status changed from verify to verifying
  • Tester changed from Nick Draper to Martyn Gigg

comment:23 Changed 7 years ago by Martyn Gigg

  • Status changed from verifying to reopened
  • Resolution fixed deleted

The changes generally look fine but the new function, TOL_ERR, is named incorrectly and is not in the right place. Can it be moved to the first thing inside Algorithms namespace and inside an anonymous namespace. Also please remove the inline keyword and use the proper camelCase standard for the names. I think the function name could be improved slightly while we're here too:

  • TOL_ERR -> relErrInLimit
  • ERROR_VAL -> errVal
namespace Algorithms
{
  namespace
  {
    /**
     * Function which calculates relative error between two values and analyses if this error is within the limits
     * requested
     * @param x1 -- first value to check difference
     * @param x2 -- second value to check difference
     * @param errVal -- the value of the error, to check against. Should  be positive != 0
     * @returns true if error or false if the value is within the limits requested
     */
     bool relErrInLimit(const double &x1, const double &x2, const double &errVal)
     {
 
       double num= std::fabs(x1-x2);
       // how to treat x1<0 and x2 > 0 ?  probably this way
       double den=0.5*(std::fabs(x1)+std::fabs(x2));
       if (den<errVal) 
         return (num>errVal);
 
       return (num/den>errVal);
     }
   }

comment:24 Changed 7 years ago by Alex Buts

  • Status changed from reopened to inprogress

refs #9022 Trivial changes to the function arguments

Changeset: 35ae58734a5b92458347e872361c49d4cc89f8d4

comment:25 Changed 7 years ago by Alex Buts

The TOL_ERR capital came from "notional" templating of this function into the code where it was initially cut-out. I do not have problem with naming it low case, but making it global does not make sense. It is not good enough to provide homogeneous accuracy along any data range and its meaning is well restricted by the algorithm it is in. It can be properly understood only within this algorithm scope.

I tried to make it generic, but doing this appears to need much more efforts then I was able to assign within the scope of another ticket this one is extracted from.

If we wand generic and proper tollErr function, it should be separate ticket for it. Meanwhile this one has been renamed to relErr.

comment:26 Changed 7 years ago by Alex Buts

refs #9022 conflicts with master after long inactivity period

Merge branch 'master' into feature/9022_serialWorkspaceComparison

Conflicts:

Code/Mantid/Framework/Algorithms/src/CheckWorkspacesMatch.cpp

Changeset: 891c24d9a1c975329309419621c5bd0ad4710788

comment:27 Changed 7 years ago by Alex Buts

refs #9022 Missing pre-commit

why it get missing in the first place?

Changeset: 096696236d85f3ab1f0435b87db68c964a3038e5

comment:28 Changed 7 years ago by Alex Buts

  • Status changed from inprogress to verify
  • Resolution set to fixed

comment:29 Changed 7 years ago by Martyn Gigg

  • Status changed from verify to verifying

comment:30 Changed 7 years ago by Martyn Gigg

  • Status changed from verifying to reopened
  • Resolution fixed deleted

I didn't mean to make the function global. What I meant was just move it to near the top of the CheckWorkspacesMatch.cpp file and put it in an anonymous namespace (this will actually make it far from global). The only reason I think it should be moved is that at the moment it's place in the file makes it look like it is a member function.

My idea was for the top of the file to look like this:

namespace Mantid
{
namespace Algorithms
{
  namespace
  { //anonymous
    /**
     * Function which calculates relative error between two values and analyses if this error is within the limits
    * requested
    * @param x1       -- first value to check difference
    * @param x2       -- second value to check difference
    * @param errorVal -- the value of the error, to check against. Should  be positive != 0

    * @returns true if error or false if the value is within the limits requested
    */
   bool relErr(const double &x1,const double &x2,const double &errorVal)
   {
     double num= std::fabs(x1-x2);
     // how to treat x1<0 and x2 > 0 ?  probably this way
     double den=0.5*(std::fabs(x1)+std::fabs(x2));
     if (den<errorVal) 
        return (num>errorVal);
     return (num/den>errorVal);
   }
  }
... rest of the file

comment:31 Changed 7 years ago by Alex Buts

  • Status changed from reopened to inprogress

refs #9022 moved error calculating function into anonymous namespace.

nof if I am familiar with this style or finding it clearer then usual file scope function.

Changeset: a4a2820677b34cde570d1aa28e6c428275345749

comment:32 Changed 7 years ago by Alex Buts

refs #9022 better way of expressing the same concept

Changeset: 757b060971d0981e0934c4867f0f5648b8213274

comment:33 Changed 7 years ago by Alex Buts

  • Status changed from inprogress to verify
  • Resolution set to fixed

comment:34 Changed 7 years ago by Martyn Gigg

  • Status changed from verify to verifying

comment:35 Changed 7 years ago by Martyn Gigg

  • Status changed from verifying to closed

Merge remote-tracking branch 'origin/feature/9022_serialWorkspaceComparison'

Full changeset: b0d8ba3930908612f42ddbb8a4ee7a433c4e7036

comment:36 Changed 5 years ago by Stuart Campbell

This ticket has been transferred to github issue 9865

Note: See TracTickets for help on using tickets.