Ticket #2183 (closed: wontfix)

Opened 10 years ago

Last modified 5 years ago

EventWorkspace: PadPixels(): consider using placement new with an arena

Reported by: Janik Zikovsky Owned by: Janik Zikovsky
Priority: major Milestone: Iteration 27
Component: Mantid Keywords:
Cc: Blocked By:
Blocking: Tester: Nick Draper

Description

This means pre-allocating the necessary memory, making the creation of millions of EventLists faster.

Change History

comment:1 Changed 10 years ago by Janik Zikovsky

  • Status changed from new to accepted

comment:2 Changed 10 years ago by Janik Zikovsky

  • Status changed from accepted to assigned
  • Summary changed from EventWorkspace: PadPixels(): consider using placement new with an arena to EventWorkspace: use a vector of EventList instead of EventList *

(changed description):

I believe this will make allocating large number of EventLists faster because std::vector already creates its own arena-style allocators. The only thing that becomes impossible (or at least, unwise) is returning a pointer to EventList (getEventListPtr) but that is only used once or twice and does not need to.

Since this requires changing lots of files, this will wait till after the CMAKE move.

comment:3 Changed 10 years ago by Janik Zikovsky

Further note: This means that going from the data_map to the plain data[] vector will require copying all events in each list. This affects things calling getEventListAtPixelID() only = basically just LoadEventPreNexus.

Something to speed it up when PadPixels is true can be used and will make things faster. Then only the case of PadPixels=false and using LoadEventPreNexus will remain, where event copying will be necessary, unless some other trick is used.

comment:4 Changed 10 years ago by Janik Zikovsky

  • Status changed from assigned to accepted
  • Summary changed from EventWorkspace: use a vector of EventList instead of EventList * to EventWorkspace: PadPixels(): consider using placement new with an arena

Further speed testing, with one million spectra:

Creating a std::vector<EventList> then calling reserve( 1M ) and push_back( EventList() ) 1M times: 1.57 seconds. The copy constructor is called each time on push_back!

Creating a std::vector<EventList> then calling resize( 1M ): 0.98 seconds

Allocating 1M new EventList and push_back to std::vector<EventList *>: 0.54 seconds

(this allocation was normal, aka no arena). Clearly the pointer approach wins. Let's see if using an arena can help.

comment:5 Changed 10 years ago by Janik Zikovsky

  • Status changed from accepted to verify
  • Resolution set to wontfix

More speed testing using this code :

    int numpix = 1000000;
    Timer time;
    std::vector<EventList> l_nop;
    l_nop.reserve(numpix);
    for (int i=0; i<numpix; i++)
    {
      l_nop.push_back( EventList() );
      //l_nop[i].addDetectorID(5);
    }

    std::cout << "std::vector<EventList>; reserve() then push_back : " << time.elapsed() << " sec.\n";

    std::vector<EventList> l_nop2;
    l_nop2.resize(numpix);
    for (int i=0; i<numpix; i++)
    {
      //l_nop[i].addDetectorID(5);
    }

    std::cout << "std::vector<EventList>; resize() : " << time.elapsed() << " sec.\n";

    std::vector<EventList *> l_p;
    l_p.reserve(numpix);
    for (int i=0; i<numpix; i++)
    {
      EventList * ptr = new EventList();
      l_p.push_back( ptr );
      //ptr->addDetectorID(5);
      // l_p[i] = new EventList();
    }

    std::cout << "Pointer allocation in a std::vector<EventList *>: " << time.elapsed() << " sec.\n";

    EventList * l_arr;
    l_arr = new EventList[numpix];
    for (int i=0; i<numpix; i++)
    {
      l_arr[i];
    }

    std::cout << "Dynamic array allocation of an EventList[] : " << time.elapsed() << " sec.\n";

    char * arr;
    std::vector<EventList *> l_p2;
    l_p2.resize(numpix);
    arr = new char[numpix*sizeof(EventList)];
    char * placement = arr;
    for (int i=0; i<numpix; i++)
    {
      l_p2[i] = new (placement) EventList();
      arr += sizeof(EventList);
    }

    std::cout << "Single char* array with placement new: " << time.elapsed() << " sec.\n";

Results:

std::vector<EventList>; reserve() then push_back : 1.51327 sec.
std::vector<EventList>; resize() : 0.929531 sec.
Pointer allocation in a std::vector<EventList *>: 0.515269 sec.
Dynamic array allocation of an EventList[] : 0.397139 sec.
Single char* array with placement new: 0.340699 sec.

Therefore it looks like at least 70% of the time is spent in the constructor, not in trying to allocate the objects. The placement new operator is indeed the fastest option, but I don't believe that a 0.15 second speed-up is worth the extra complexity.

comment:6 Changed 10 years ago by Nick Draper

  • Status changed from verify to verifying
  • Tester set to Nick Draper

comment:7 Changed 10 years ago by Nick Draper

  • Status changed from verifying to closed

comment:8 Changed 5 years ago by Stuart Campbell

This ticket has been transferred to github issue 3030

Note: See TracTickets for help on using tickets.