Actions: microsoft/DeepSpeed

nv-torch-latest-v100

5,059 workflow runs

nv-torch-latest-v100
nv-torch-latest-v100 #12790: Merge group checks requested
January 4, 2025 05:58 1h 26m 24s
Fix: forbid repeated deepspeed.initialize on training objects
nv-torch-latest-v100 #12789: Pull request #6874 synchronize by loadams
January 4, 2025 04:39 Action required traincheck-team:fix-6848-forbid-repeated-init
nv-torch-latest-v100
nv-torch-latest-v100 #12788: Scheduled
January 4, 2025 00:20 1h 36m 14s master
Use ds-specific module id to avoid conflicts
nv-torch-latest-v100 #12787: Pull request #6847 synchronize by loadams
January 3, 2025 22:04 25m 30s olruwase/pr_6772
Add the missing view operations from sequence parallel(async).
nv-torch-latest-v100 #12786: Pull request #6750 synchronize by loadams
January 3, 2025 19:32 2h 50m 36s inkcherry:ds_overlap_fix
Cleanup ops/transformer/inference tests
nv-torch-latest-v100 #12785: Pull request #6925 opened by loadams
January 3, 2025 17:10 1h 25m 32s loadams/cleanup-transformer-inference-ops-tests
Add fp8_gemm fallback for non-triton systems
nv-torch-latest-v100 #12784: Pull request #6916 synchronize by loadams
January 3, 2025 16:54 1h 23m 21s oelayan7:fp8_gemm_no_triton
[BUG FIX]:fix get torch.version.cuda error when cuda is None in rocm
nv-torch-latest-v100 #12783: Pull request #6909 synchronize by loadams
January 3, 2025 16:28 1h 23m 14s hj-wei:dev_hjwei
Fix checkpointable_layers Logic
nv-torch-latest-v100 #12782: Pull request #6881 synchronize by loadams
January 3, 2025 16:28 1h 23m 58s Quentin-Anthony:qanthony/fix-act-recomp
nv-torch-latest-v100
nv-torch-latest-v100 #12781: Merge group checks requested
January 3, 2025 15:38 1h 33m 26s
Support pure meta model lm_head tp
nv-torch-latest-v100 #12780: Pull request #6812 synchronize by delock
January 3, 2025 02:56 Action required Yejing-Lai:lyj/lm_head_replace
nv-torch-latest-v100
nv-torch-latest-v100 #12779: Scheduled
January 3, 2025 00:20 6h 0m 21s master
Cleanup ops/transformer/inference tests
nv-torch-latest-v100 #12777: Pull request #6830 synchronize by loadams
January 2, 2025 18:47 1h 15m 50s loadams/transformers-inference
Autotp training
nv-torch-latest-v100 #12775: Pull request #6922 synchronize by inkcherry
January 2, 2025 03:54 6h 0m 25s inkcherry:autotp_training
nv-torch-latest-v100
nv-torch-latest-v100 #12774: Scheduled
January 2, 2025 00:20 1h 35m 10s master
nv-torch-latest-v100
nv-torch-latest-v100 #12773: Scheduled
January 1, 2025 00:23 1h 39m 25s master
Add fp8_gemm fallback for non-triton systems
nv-torch-latest-v100 #12772: Pull request #6916 synchronize by oelayan7
December 31, 2024 12:01 1h 20m 42s oelayan7:fp8_gemm_no_triton
[inf] Add config var to enable keeping module on host
nv-torch-latest-v100 #12771: Pull request #6846 synchronize by oelayan7
December 31, 2024 07:32 1h 38m 58s oelayan7:keep_module_on_host
nv-torch-latest-v100
nv-torch-latest-v100 #12770: Scheduled
December 31, 2024 00:20 2h 20m 13s master
Use ds-specific module id to avoid conflicts
nv-torch-latest-v100 #12769: Pull request #6847 synchronize by loadams
December 30, 2024 21:04 1h 18m 5s olruwase/pr_6772
[BUG FIX]:fix get torch.version.cuda error when cuda is None in rocm
nv-torch-latest-v100 #12768: Pull request #6909 synchronize by loadams
December 30, 2024 21:02 4h 25m 44s hj-wei:dev_hjwei
Stage3: Use new torch grad accumulation hooks API
nv-torch-latest-v100 #12767: Pull request #6773 synchronize by loadams
December 30, 2024 18:54 2h 24m 36s deepcharm:stage3-use-new-grad-acc-api
Fix checkpointable_layers Logic
nv-torch-latest-v100 #12766: Pull request #6881 synchronize by loadams
December 30, 2024 18:53 1h 43m 38s Quentin-Anthony:qanthony/fix-act-recomp