Performance Tests

Architecture and implementation need to be driven by performance tests so that the final product satisfies its performance requirements. Performance is measured in terms of scalability along two axes: the number of tasks and the number of agents.

We define three test categories:

1. Large number of cores owned by one agent, represented by one task queue (single HPC scenarios)
2. Large number of cores owned by a small number of agents (multiple HPC scenarios)
   - through one aggregate task queue (saga-pilot only scenario)
   - through one task queue per agent (saga-pilot + TROY)
3. Small number of cores owned by a large number of agents (OSG / Cloud scenarios)
   - through one aggregate task queue (saga-pilot only scenario)
   - through one task queue per agent (saga-pilot + TROY)
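
The categories above can be written down as a small test matrix, which makes it easy to check that every category/queueing combination is covered by a test. This is only a sketch: `ScalingCase`, `build_test_matrix`, and the string labels are hypothetical names chosen for illustration and are not part of saga-pilot or TROY.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingCase:
    """One cell of the performance test matrix (hypothetical structure)."""
    category: str         # e.g. "single HPC"
    cores_per_agent: str  # "large" or "small"
    agents: str           # "one", "few" or "many"
    queueing: str         # how tasks reach the agents


def build_test_matrix():
    """Enumerate the category/queueing combinations listed above."""
    # Category 1 has exactly one agent, hence exactly one task queue.
    cases = [ScalingCase("single HPC", "large", "one", "one task queue")]
    # Categories 2 and 3 are each tested with both queueing variants.
    multi_agent = [
        ("multiple HPC", "large", "few"),
        ("OSG / Cloud", "small", "many"),
    ]
    variants = [
        "one aggregate task queue (saga-pilot only)",
        "one task queue per agent (saga-pilot + TROY)",
    ]
    for category, cores, agents in multi_agent:
        for queueing in variants:
            cases.append(ScalingCase(category, cores, agents, queueing))
    return cases


if __name__ == "__main__":
    for case in build_test_matrix():
        print(case)
```

Under this reading the three categories expand into five distinct test configurations.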

Tests shall be defined as soon as the user-facing REST API has been defined. They shall then be run periodically during all stages of the implementation period to ensure performance QoS and to get an early handle on overall scalability, performance numbers, and limits.

Scenario 1

Scale up the number of cores owned by one agent:

The largest HPC cluster we have access to is STAMPEDE:
- normal queue: 256 nodes / 4k cores
- large queue: 1024 nodes / 10k cores (by request)
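
One possible way to scale up the core count is a doubling sweep up to the queue limit. The sketch below assumes a starting size of 16 cores (the per-node count implied by 256 nodes / 4k cores) and a cap of 4096 cores for the normal queue; both values, and the helper name `core_sweep`, are illustrative assumptions rather than a fixed test plan.

```python
def core_sweep(start_cores, max_cores):
    """Return core counts start, 2*start, 4*start, ... not exceeding max_cores."""
    if start_cores <= 0 or max_cores < start_cores:
        raise ValueError("need 0 < start_cores <= max_cores")
    sizes = []
    cores = start_cores
    while cores <= max_cores:
        sizes.append(cores)
        cores *= 2
    return sizes


if __name__ == "__main__":
    # Assumed values: 16 cores per node, 4096-core cap for the normal queue.
    print(core_sweep(16, 4096))
```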

Performance metrics are:

Overhead: time the agent spends doing things other than job execution
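
The overhead metric can be made concrete in more than one way. The sketch below uses one possible interpretation: overhead is the agent's lifetime minus the time during which at least one job was executing. The function name `agent_overhead` and the `(start, end)` timestamp format are assumptions made for illustration; a per-core accounting (idle core-seconds) would be an equally valid alternative.

```python
def agent_overhead(agent_start, agent_end, job_intervals):
    """Agent lifetime minus the time covered by at least one running job.

    job_intervals is an iterable of (start, end) timestamps in seconds.
    Overlapping intervals are merged so concurrent jobs are not counted twice.
    """
    busy = 0.0
    cur_start = cur_end = None
    for start, end in sorted(job_intervals):
        # Clip each job to the agent's lifetime.
        start = max(start, agent_start)
        end = min(end, agent_end)
        if end <= start:
            continue
        if cur_end is None or start > cur_end:
            # A gap: close the previous merged interval and open a new one.
            if cur_end is not None:
                busy += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlap: extend the current merged interval.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        busy += cur_end - cur_start
    return (agent_end - agent_start) - busy


if __name__ == "__main__":
    # Two overlapping jobs (10-60 merged) and one separate job (70-90).
    print(agent_overhead(0, 100, [(10, 40), (30, 60), (70, 90)]))
```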
