2014 06 05
Andre Merzky edited this page Jun 11, 2014 · 3 revisions
-
Who: Mark, Matteo, Shantenu, Ole, AndreM, Antons
-
Agenda:
- open TODOs
- DONE AM: re-check 4096 ceiling
- TODO AM: micro benchmarks for RP
- DONE MS: MPI agent...
- TODO AT: start on Cray agent, based on AT's scripts
- TODO OW: get quantitative EnMDTK requirements
- DONE AM: re-check with Matteo on exact problem
- INV. AM: support MS on MPI agent?
- TODO AM: address port forwarding ... (mom-nodes, head-nodes, ...)
- TODO OW: check with Sergeij on Cray/BlueGene issue
- DONE AT: put Archer scripts into repo
- TODO DINESH: check out port forwarding solution from Mark
-
MS-7 checkpoints:
- May 8:
- OW: OK simple MPI support for Stampede complete (prototype)
- AT, OW: ?? draft architecture for Cray agent
- May 15:
- MS: ?? implementation proposal for MPI support beyond Stampede
- AM: ?? MPI integration tests set up
- OW: ?? first prototype of non-MPI agent for Cray
- ALL: ?? agree on implementation plan for Cray agent
- May 22:
- AM: ?? MPI support integrated in Troy
- May 29:
- MS: ?? MPI support complete
- OW/AT: ?? prototype for Cray agent -- reconfirm implementation plan
- ALL: ?? documentation of performance bottleneck
- June 5:
- AM, OW: ?? integration tests for MPI stable for multiple use cases / machines
- ALL: ?? agreement on optimization targets
- May 8:
- status reports
- ticket prioritization / milestones
- (?) what role does scheduling play at the agent level?
- open TODOs
-
Notes:
-
AM: tickets, tests, integration, 4096 checks
-
TODO AM: triage, this is probably not #104
-
TODO OW,DINESH: HT-BAC as scalability testing for MPI
-
Mark:
- order of module load matters :( preexec...
- TODO MS: document in resource configs
- multicore and MPI support working
- switched by resource config file
- should be moved to CU level (#156)
- removed multiple exec workers (no partitions)
- multi threads for data remain...
- merge with Cray track: ignore for now...
- TODO MS: preexec does not work across nodes...
-
Ole:
- resource configuration revamped
- TODO OW: check
-
Antons:
- Archer down...
-
TODO AM:
- workdir
-
release:
- TODO MS: sort tickets into tutorial milestones
-
- will data be merged?
- will #156 be done?
- put data on agenda again