2014 09 18

Agenda:
- open TODOs
  - TODO SJ: doc is missing 10,000-foot overview (SJ?)
  - TODO SJ: contact Helmut about SuperMUC allocation
  - DONE AM: reduce testing frequency further after release
  - WIP AM: follow up with Anjani on EC2
  - WIP AM: RP slide decks
  - TODO AM: test SuperMUC, check out GSI contexts
  - DONE AM: suspend testing on FG
  - TODO MS: verify state model wrt. data staging
  - TODO AM: better name for STATE_X
- open agenda items
  - WIP AM: AGENDA: cleanup
  - WIP AM: AGENDA: configuration files
  - TODO AM: AGENDA: test suite granularity
  - TODO AM: AGENDA: performance PTY / SHELL / SAGA
  - TODO AM: AGENDA: discuss how to ensure test coverage
  - TODO AM: AGENDA: discuss #307, async call semantics
  - TODO AM: AGENDA: student project: plotting
- MS.8
  - What are goals for the next couple of weeks?
  - (check on open tickets)
- eval tutorial with Indiana: online, Scott and Abhinav
  - TODO IU: provide application code
- configuration files
  - AM: we could re-use what we had in OWMS (code exists in utils):
    - RP resource configs remain as is
    - user configs can be used to override those default settings, e.g.:

          # $HOME/radical/pilot.cfg
          {
              "resources" : {
                  # add a custom host
                  "boskop" : {
                      "defaults"           : "localhost",
                      "pilot_agent"        : "radical-pilot-agent-multicore_testing.py",
                      "lrms"               : "TORQUE",
                      "task_launch_method" : "SSH",
                      "mpi_launch_method"  : "MPIRUN",
                      "global_virtenv"     : "$HOME/ve/"
                  },
                  # change some user specific variable in existing RP config entries
                  "*.futuregrid.org" : {
                      "username" : "merzky"
                  },
                  "sierra.futuregrid.org" : {
                      "default_queue" : "batch"
                  },
              }
          }
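
A minimal sketch of how such user overrides could be layered over the default resource configs. This is not the OWMS/utils code mentioned above: `merge_configs`, the `DEFAULTS` entries and their values are hypothetical, and the semantics assumed here are that wildcard keys are applied first, exact keys win over wildcards, and a `"defaults"` key names an existing entry to copy from.

```python
import copy
import fnmatch

# Hypothetical default resource configs, standing in for what RP ships.
DEFAULTS = {
    "localhost":             {"lrms": "FORK", "task_launch_method": "LOCAL"},
    "sierra.futuregrid.org": {"lrms": "PBS",  "default_queue": "short"},
    "india.futuregrid.org":  {"lrms": "PBS",  "default_queue": "batch"},
}

# User overrides, mirroring the pilot.cfg example from the notes.
USER = {
    "boskop": {"defaults": "localhost",
               "lrms": "TORQUE",
               "task_launch_method": "SSH"},
    "*.futuregrid.org":      {"username": "merzky"},
    "sierra.futuregrid.org": {"default_queue": "batch"},
}


def merge_configs(defaults, user):
    """Return a new dict with the user settings layered over the defaults."""
    merged = copy.deepcopy(defaults)

    # Pass 1: wildcard keys update every matching existing entry.
    for pattern, overrides in user.items():
        if "*" in pattern:
            for name in merged:
                if fnmatch.fnmatch(name, pattern):
                    merged[name].update(overrides)

    # Pass 2: exact keys override wildcards and may add new hosts.
    for name, overrides in user.items():
        if "*" in name:
            continue
        overrides = dict(overrides)
        base = overrides.pop("defaults", None)
        if name not in merged:
            # A new host starts from the entry named by "defaults", if any.
            merged[name] = copy.deepcopy(merged.get(base, {})) if base else {}
        merged[name].update(overrides)

    return merged


merged = merge_configs(DEFAULTS, USER)
```

With these assumptions, `boskop` inherits the `localhost` settings and then overrides `lrms` and `task_launch_method`, while both FutureGrid hosts pick up the `username`.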
- cleanup modes
  - 1: cleanup database entries: session.close (cleanup=TRUE)
  - 2: terminate pilots: session.close (terminate=TRUE)
  - 3: clean pilot sandbox: pilot_description.cleanup = TRUE
  - 4: clean unit sandbox: unit_description.cleanup = TRUE
  - 1, 2 are enacted by RP/Application on clean application shutdown
  - 3, 4 are enacted by agent on clean pilot shutdown
  - 1, 2 can be performed after application finishes, via radicalpilot-cleanup
  - 3, 4 cannot be performed after application finishes (yet)
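
The four cleanup modes can be summarized as a small lookup table. The sketch below only restates the list above as data; `CleanupMode`, its field names and the helper functions are hypothetical and not part of the RP API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CleanupMode:
    number: int       # mode number as used in the notes
    action: str       # what gets cleaned up
    trigger: str      # how the mode is requested (as written in the notes)
    enacted_by: str   # who performs it on clean shutdown
    post_hoc: bool    # can it run after the application has finished?


CLEANUP_MODES = [
    CleanupMode(1, "cleanup database entries",
                "session.close (cleanup=TRUE)", "RP/Application", True),
    CleanupMode(2, "terminate pilots",
                "session.close (terminate=TRUE)", "RP/Application", True),
    CleanupMode(3, "clean pilot sandbox",
                "pilot_description.cleanup = TRUE", "agent", False),
    CleanupMode(4, "clean unit sandbox",
                "unit_description.cleanup = TRUE", "agent", False),
]


def post_hoc_modes(modes):
    """Mode numbers that can still be performed via radicalpilot-cleanup."""
    return [m.number for m in modes if m.post_hoc]


def modes_enacted_by(modes, actor):
    """Mode numbers enacted by the given actor on clean shutdown."""
    return [m.number for m in modes if m.enacted_by == actor]
```

For example, `post_hoc_modes(CLEANUP_MODES)` yields modes 1 and 2, and `modes_enacted_by(CLEANUP_MODES, "agent")` yields modes 3 and 4.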
- STATE_X:
  - AM: should be SCHEDULING: the CU has reached the scheduler but has not yet been assigned to a pilot (e.g., if none is free to run the CU).
```
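# scheduler (pseudocode): assign each waiting task to a free pilot
for task in wait_q:
    # the CU has reached the scheduler but has no pilot yet
    task.state = SCHEDULING

    # wait until a pilot is free to run the task
    while True:
        pilot = find_free_pilot(task)
        if pilot:
            task.pilot = pilot
            break

    # hand the task over to the selected pilot
    task.state = PENDING_EXECUTION
    submit_task_to_pilot(task)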
```
  - but: SCHEDULING already used within the agent:
```
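# agent (pseudocode): place each task assigned to this pilot on free cores
for task in mongodb.find({'pid': my_pid,
                          'state': PENDING_EXECUTION}):
    task.state = SCHEDULING

    # wait until enough cores are free to run the task
    while True:
        cores = find_free_cores(task)
        if cores:
            task.cores = cores
            break

    # start the task on the selected cores
    task.state = EXECUTING
    submit_task_to_cores(task)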
```
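
Taken together, the two loops imply a state sequence in which a unit is scheduled twice: once onto a pilot and once onto cores. The sketch below makes that sequence explicit. It keeps the placeholder name `STATE_X` for the pilot-level stage, since the naming is still open; the `State` enum, `TRANSITIONS` table and `advance` helper are hypothetical and only cover the states named in these notes.

```python
from enum import Enum


class State(Enum):
    STATE_X = "STATE_X"                      # scheduler: waiting for a free pilot
    PENDING_EXECUTION = "PENDING_EXECUTION"  # assigned to a pilot, not yet picked up
    SCHEDULING = "SCHEDULING"                # agent: waiting for free cores
    EXECUTING = "EXECUTING"                  # running on cores


# Allowed forward transitions, as implied by the scheduler and agent loops.
TRANSITIONS = {
    State.STATE_X:           {State.PENDING_EXECUTION},
    State.PENDING_EXECUTION: {State.SCHEDULING},
    State.SCHEDULING:        {State.EXECUTING},
    State.EXECUTING:         set(),
}


def advance(current, target):
    """Return target if the transition is allowed, else raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target


def run_lifecycle():
    """Walk a unit through all four states and return the visited names."""
    state = State.STATE_X
    visited = [state.name]
    for nxt in (State.PENDING_EXECUTION, State.SCHEDULING, State.EXECUTING):
        state = advance(state, nxt)
        visited.append(state.name)
    return visited
```

If `STATE_X` were renamed to SCHEDULING as proposed, the enum would need two distinct members for the pilot-level and core-level stages, which is the conflict noted above.
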
Notes:
