2012.06.25 Range Maps

Range Modeling Discussion

June 25, 2012

Participants

John Donoghue, Edwin Skidmore, Sangeeta Kuchimanchi, Jim Regetz, Martha Narro, Nirav Merchant

Agenda

Notes

  • JD: Species with >= 5 points go into the Maxent models (all 3 of them).
    • Mistake in the document: the projection is not UTM; it is Lambert equal area (LEA).
    • John will correct that.

Output Products

  • JD: Background raster layer is LEA, so that is what the initial rasters are in.
    • The WGS format isn’t output for species with 3-4 points since it takes about 5 min. to re-project from LEA to WGS.
    • Could re-project the points in WGS instead of re-projecting the entire raster.
    • Would be faster.
    • JR: Probably gives more accurate re-projection too.
    • JD: NA, 0, and 1 are the values in the raster.
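
The idea of re-projecting a handful of occurrence points rather than a whole raster can be sketched as follows. This is only an illustration under stated assumptions: it uses the spherical form of the Lambert azimuthal equal-area projection, and the projection center and Earth radius below are hypothetical placeholders, not the parameters of the actual LEA background raster. A production run would more likely rely on a projection library with the raster's real parameters.

```python
import math

# Assumptions (not from the meeting notes): a spherical Earth and a
# hypothetical projection center. Substitute the real LEA parameters.
EARTH_RADIUS_M = 6371000.0
CENTER_LON_DEG = -100.0  # hypothetical
CENTER_LAT_DEG = 45.0    # hypothetical


def lonlat_to_lea(lon_deg, lat_deg):
    """Forward spherical Lambert azimuthal equal-area: degrees -> metres."""
    lam = math.radians(lon_deg - CENTER_LON_DEG)
    phi = math.radians(lat_deg)
    phi1 = math.radians(CENTER_LAT_DEG)
    denom = (1.0 + math.sin(phi1) * math.sin(phi)
             + math.cos(phi1) * math.cos(phi) * math.cos(lam))
    k = math.sqrt(2.0 / denom)  # undefined only at the antipode of the center
    x = EARTH_RADIUS_M * k * math.cos(phi) * math.sin(lam)
    y = EARTH_RADIUS_M * k * (math.cos(phi1) * math.sin(phi)
                              - math.sin(phi1) * math.cos(phi) * math.cos(lam))
    return x, y


def lea_to_lonlat(x, y):
    """Inverse spherical Lambert azimuthal equal-area: metres -> degrees."""
    phi1 = math.radians(CENTER_LAT_DEG)
    rho = math.hypot(x, y)
    if rho == 0.0:
        return CENTER_LON_DEG, CENTER_LAT_DEG
    c = 2.0 * math.asin(rho / (2.0 * EARTH_RADIUS_M))
    phi = math.asin(math.cos(c) * math.sin(phi1)
                    + y * math.sin(c) * math.cos(phi1) / rho)
    lam = math.atan2(x * math.sin(c),
                     rho * math.cos(phi1) * math.cos(c)
                     - y * math.sin(phi1) * math.sin(c))
    return CENTER_LON_DEG + math.degrees(lam), math.degrees(phi)


# Converting 3-4 points is a few floating-point operations per point,
# versus resampling every cell of a raster.
occurrence_points_lonlat = [(-110.9, 32.2), (-111.5, 33.0), (-109.8, 31.9)]
points_lea = [lonlat_to_lea(lon, lat) for lon, lat in occurrence_points_lonlat]
points_back = [lea_to_lonlat(x, y) for x, y in points_lea]
```

Because each point is transformed analytically, there is no resampling step, which is consistent with JR's remark that this approach is probably more accurate as well as faster.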

Discussion of whether to create the WGS product for species with 3-4 points.

  • JD: Have 12,400 species with 3-4 occurrence points.
    • ES: That’s 4 days on a single processor.
    • ES: For 16 cores, can divide by 16 (roughly).
    • MN: Do you or don’t you need the WGS product? It’s there for species with 1, 2, and 5 or more occurrence points, but not for species with 3-4. What’s the WGS product used for?
    • JD: It’s the format for public consumption.
    • MN: Seems like you should compute the WGS product for the species with 3-4 points.
      • Otherwise that product is there for some species but not for others.
      • Don’t worry about the compute time.
  • The 5-point species runs did not vary a lot as the number of points increased.
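
The compute-time estimate above is simple arithmetic, sketched here for reference. The inputs (4 days on one processor, 16 cores, 12,400 species) come from the notes; the assumption of perfect linear scaling is an idealization, which is why ES qualified it with "roughly".

```python
# Figures taken from the discussion above.
SINGLE_CORE_DAYS = 4.0
CORES = 16
N_SPECIES = 12400


def ideal_wall_days(single_core_days, cores):
    """Wall-clock days assuming the work divides evenly across cores."""
    return single_core_days / cores


wall_hours = ideal_wall_days(SINGLE_CORE_DAYS, CORES) * 24.0
print(wall_hours)  # 6.0 hours under ideal scaling

# Average single-core time per species implied by the 4-day estimate.
seconds_per_species = SINGLE_CORE_DAYS * 86400.0 / N_SPECIES
```

Real runs will take somewhat longer than the ideal figure because of job startup, I/O, and uneven per-species run times.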

R optimization issues

  • ES: Launching R took 5-10 sec
    • Launches for each species
    • It will be good to structure the scripts to minimize the number of times R launches.
    • Also, we’ll need to update R and recompile it to use the Intel math library (MKL).
      • Should improve performance substantially.
      • Matrix operations will be faster.
      • Up to 50x faster.
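
One way to reduce the number of R launches is to hand each R process a batch of species instead of a single one. The sketch below only builds the command lines; the script name `run_species_batch.R`, the batch size, and the convention of passing species names as command-line arguments are all hypothetical. The species count used in the overhead estimate is the 12,400 figure from earlier in the notes, used purely for illustration.

```python
def batches(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_commands(species, batch_size, script="run_species_batch.R"):
    """One Rscript command per batch (script name is a placeholder)."""
    return [["Rscript", script] + batch for batch in batches(species, batch_size)]


def launch_overhead_hours(n_launches, seconds_per_launch):
    """Total time spent just starting R."""
    return n_launches * seconds_per_launch / 3600.0


# One launch per species: 12,400 launches at 5-10 s each is
# roughly 17 to 34 hours of pure startup time.
per_species_low = launch_overhead_hours(12400, 5)
per_species_high = launch_overhead_hours(12400, 10)

# Batches of 100 species: 124 launches, about 10 to 21 minutes of startup.
batched_low = launch_overhead_hours(124, 5)
batched_high = launch_overhead_hours(124, 10)
```

Larger batches cut startup overhead further but make each job longer, so the batch size would need to be balanced against the job time limit.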

File structure

  • JD: Species file structure: John didn’t include it in the document he sent us.
  • JD: Tmp files
    • This time we want to retain all the tmp directories
    • John can’t test creating the tmp file.

Overview of steps John will need to do.

  • ES: To use the parametric launcher (Python launcher):
    • John will need to write a paramlist script.
    • Run it. When it hits the time limit for job execution at TACC:
      • Check the last species that completed.
      • Edit the paramlist.
      • Submit the job again.
  • ES: Have you used SGE (Sun Grid Engine) and qsub?
    • JD: No
    • ES: OK, you’ll have to learn some things.
  • ES: Also, be aware that HPC has login nodes and compute nodes.
    • Don’t compute on the login nodes or you’ll get nasty messages from managers at TACC.
  • ES: Do you have a TACC account?
    • JD: I think so, but haven’t used it.
  • JD: Where does output go?
    • ES: Depends on your scripts, where do they put it?
    • But there are so many output files that last time he wrote a script to tar them (at TACC?).
    • Then move the files back to Data Store.
  • ES: Maxent requires some graphics packages, so it needs to run on Longhorn (has graphics). Ranger doesn’t have them.
  • ES: Suggests John get everything ready to run and do trial runs on 16 species.
    • Meet again this week, Thursday after 1:30, to do a side-by-side run with Edwin.
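
The run, check, edit, and resubmit cycle for the paramlist described above could be scripted roughly as follows. This is a sketch under assumptions: it treats the paramlist as a plain text file with one command per line, the command template and file name are hypothetical, and it filters on a set of completed species rather than only the last one, since species may finish out of order when several cores are in use.

```python
from pathlib import Path


def build_paramlist(species, completed, command_template):
    """Return one command line per species that has not finished yet."""
    done = set(completed)
    return [command_template.format(species=name)
            for name in species if name not in done]


def write_paramlist(path, lines):
    """Write the commands to disk, one per line."""
    Path(path).write_text("".join(line + "\n" for line in lines))


# Hypothetical species names and command template.
all_species = ["Acer_rubrum", "Quercus_alba", "Pinus_taeda"]
template = "Rscript run_species.R {species}"

# First submission: nothing has completed yet.
first_pass = build_paramlist(all_species, [], template)

# After the job hits the time limit: regenerate with the finished species removed.
second_pass = build_paramlist(all_species, ["Acer_rubrum"], template)

# write_paramlist("paramlist", second_pass)  # then submit the job again
```

How the set of completed species is determined (for example, by checking which output files exist) depends on where the scripts write their results.
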

Next Steps

  • Edwin - document the steps to be done, including example scripts.
  • John - work on scripts for the next run.
  • John - update the number of species in the documentation.