2008/01/02 CSEP meeting

Attendees: John, Jeremy, Masha


  1. CSEP Software Development:
    • There were two csep-cert crashes over the holidays; both were associated with the Python random number generator:
      1. Using one seed for all random numbers of the evaluation test resulted in allocating 6.5 GB of random numbers per test. The system crashed when two instances of Dispatcher each tried to allocate 6.5 GB of random numbers.
        • John rebooted the server
        • Masha changed the code to use a separate seed value for each simulation within the test
      2. File I/O used by the new random number generator doubled the processing time for each forecast group. As a result, 4 instances of Dispatcher were running on Friday, December 27. Masha's guess: the system ran out of memory and crashed. We had never tried to run 4 Dispatchers at the same time before.
        • Masha rebooted the server and modified the code to write all random seed files associated with a particular test to a sub-directory, to avoid problems with ls when viewing the directory content in the future
        • Masha scheduled the processing of forecast groups (except for one-day models) sequentially. The processing of these groups overlaps with the processing of one-day models.
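
The per-simulation seeding change noted above might look roughly like the following sketch. It is an illustration only, not the actual CSEP code: the function names `simulation_seeds` and `run_simulation`, the master seed, and the draw counts are all hypothetical. The idea, presumably, is that each simulation builds its own generator from its own seed and draws numbers as needed, so no single large block of random numbers has to be allocated up front.

```python
import random


def simulation_seeds(master_seed, num_simulations):
    """Derive one reproducible seed per simulation from a master seed.

    Hypothetical helper: the real code may pick or store seeds differently.
    """
    seeder = random.Random(master_seed)
    return [seeder.randrange(2**32) for _ in range(num_simulations)]


def run_simulation(seed, draws_needed):
    """Run one simulation with its own generator.

    Numbers are drawn one at a time, so memory use stays constant
    regardless of how many draws the simulation needs.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws_needed):
        total += rng.random()
    return total / draws_needed  # placeholder statistic: mean of the draws


# Example values are assumptions chosen for illustration.
seeds = simulation_seeds(master_seed=12345, num_simulations=5)
means = [run_simulation(seed, draws_needed=1000) for seed in seeds]
```

Because every simulation is tied to a recorded seed, any single simulation can be re-run on its own to reproduce its result.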
  2. SCEC/CSEP operational:
    • Install and build CSEP Version 8.1 on csep-op after processing of one-day models completes on Thursday, January 3, 2008
      1. John will rsync existing operational data to the new storage and send an email when it's done
      2. Masha will install, build, and run acceptance tests for CSEP Version 8.1
      3. Masha will remove result data generated with CSEP Version 1.0, since all test dates will be re-processed with the new version; 2 copies of the data will be available until we reprocess all of the test dates
  3. Jeremy asked about tasks scheduled for the next release, CSEP V8.4, that don't involve new models:
    • Masha mentioned that there are a number of known problems that need to be addressed:
      1. Trac ticket #60: Lowering the minimum magnitude for the one-day models' input catalog causes a Matlab recursion limit error in STEP
      2. Trac ticket #61: Importing over 1 million events into Matlab code had not finished after 20 hours of running
      3. Trac ticket #62: File I/O used by the random number generator doubles processing time
      4. Trac ticket #36: Re-implement evaluation tests without random numbers
        • Jeremy suggested not waiting for the write-up from David Rhoades and starting work on at least the N-test; Jeremy will help
    • As Danijel suggested in one of the previous meetings:
      1. We should work on a web interface for the CSEP results.
      2. Consider implementing a storage API: a "database"-like interface to query existing operational data
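
For Trac ticket #36, one possible way to evaluate an N-test without random numbers is sketched below. It assumes the total number of events in a forecast follows a Poisson distribution, so the probabilities can be computed in closed form instead of by simulation. That assumption, the function names `poisson_cdf` and `n_test`, and the example values are illustrative; they are not taken from the CSEP code or from the pending write-up.

```python
import math


def poisson_cdf(k, rate):
    """Return P(X <= k) for X ~ Poisson(rate), with rate > 0."""
    if k < 0:
        return 0.0
    log_rate = math.log(rate)
    # Sum the probability mass terms in log space to avoid overflow
    # in rate**i and i! for large counts.
    total = sum(
        math.exp(i * log_rate - rate - math.lgamma(i + 1))
        for i in range(k + 1)
    )
    return min(total, 1.0)


def n_test(observed_count, forecast_rate):
    """Compare an observed event count with a forecast total rate.

    Returns two probabilities under the Poisson assumption:
      prob_at_least: P(X >= observed_count)
      prob_at_most:  P(X <= observed_count)
    A very small prob_at_least suggests the forecast under-predicts;
    a very small prob_at_most suggests it over-predicts.
    """
    prob_at_least = 1.0 - poisson_cdf(observed_count - 1, forecast_rate)
    prob_at_most = poisson_cdf(observed_count, forecast_rate)
    return prob_at_least, prob_at_most


# Hypothetical example: 3 observed events against a forecast of 3.0.
prob_at_least, prob_at_most = n_test(observed_count=3, forecast_rate=3.0)
```

A closed-form evaluation like this needs no seed files or stored random numbers, which would also sidestep the file I/O cost recorded in ticket #62 for this particular test.
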
For more information about the W. M. Keck Foundation CSEP Testing Center at SCEC, please contact info@cseptesting.org