Actions: microsoft/DeepSpeed

nv-torch-latest-v100

5,022 workflow runs

Inference ops unit test failures/fixes
nv-torch-latest-v100 #12688: Pull request #6879 synchronize by loadams
December 18, 2024 16:53 3m 40s loadams/inference-ops-test-repro
Stage3: Use new torch grad accumulation hooks API
nv-torch-latest-v100 #12687: Pull request #6773 synchronize by loadams
December 18, 2024 16:51 1h 4m 48s deepcharm:stage3-use-new-grad-acc-api
Zero2: avoid graph breaks in torch.compile by using param_idx
nv-torch-latest-v100 #12686: Pull request #6803 synchronize by loadams
December 18, 2024 16:51 1h 4m 59s nelyahu:zero2_param_idx
Update code owners
nv-torch-latest-v100 #12685: Pull request #6890 synchronize by loadams
December 18, 2024 16:30 1h 32m 36s olruwase/code_owners
Use ds-specific module id to avoid conflicts
nv-torch-latest-v100 #12683: Pull request #6847 synchronize by tjruwase
December 18, 2024 13:59 1h 19m 19s olruwase/pr_6772
Update code owners
nv-torch-latest-v100 #12682: Pull request #6890 opened by tjruwase
December 18, 2024 12:04 1h 37m 34s olruwase/code_owners
Fix error caused by all_reduce call in domino
nv-torch-latest-v100 #12681: Pull request #6880 synchronize by tjruwase
December 18, 2024 11:51 1h 35m 21s hongwei/fix_domino_allreduce
Stage3: Use new torch grad accumulation hooks API
nv-torch-latest-v100 #12680: Pull request #6773 synchronize by deepcharm
December 18, 2024 09:44 1h 32m 4s deepcharm:stage3-use-new-grad-acc-api
Add arctic model support by adding w2 to all_reduce
nv-torch-latest-v100 #12678: Pull request #6856 synchronize by loadams
December 18, 2024 01:31 4h 1m 23s pi314ever:arctic-enabling-upstream
nv-torch-latest-v100
nv-torch-latest-v100 #12676: Scheduled
December 18, 2024 00:21 7h 29m 13s master
Fix no-torch workflow and update real_accelerator
nv-torch-latest-v100 #12675: Pull request #6885 opened by loadams
December 17, 2024 22:25 6h 7m 2s loadams/fix-real-accelerator-no-torch
Adds ignore_index to sequence parallel cross entropy
nv-torch-latest-v100 #12674: Pull request #6882 synchronize by tjruwase
December 17, 2024 22:00 6h 0m 43s ronald-d-rogers:add-ignore-index-sp-loss
Zero2: avoid graph breaks in torch.compile by using param_idx
nv-torch-latest-v100 #12673: Pull request #6803 synchronize by loadams
December 17, 2024 20:22 6h 11m 49s nelyahu:zero2_param_idx
Add arctic model support by adding w2 to all_reduce
nv-torch-latest-v100 #12672: Pull request #6856 synchronize by loadams
December 17, 2024 19:58 5h 10m 31s pi314ever:arctic-enabling-upstream
Cleanup ops/transformer/inference tests
nv-torch-latest-v100 #12671: Pull request #6830 synchronize by loadams
December 17, 2024 19:55 6h 5m 22s loadams/transformers-inference
Inference ops unit test failures/fixes
nv-torch-latest-v100 #12670: Pull request #6879 synchronize by loadams
December 17, 2024 19:54 29m 1s loadams/inference-ops-test-repro
Update transformers ops unit tests to use required_torch_version
nv-torch-latest-v100 #12669: Pull request #6884 synchronize by loadams
December 17, 2024 18:22 1h 31m 13s loadams/fix-transformers-inference
Inference ops unit test failures/fixes
nv-torch-latest-v100 #12666: Pull request #6879 synchronize by loadams
December 17, 2024 18:00 28m 49s loadams/inference-ops-test-repro
[inf] Add config var to enable keeping module on host
nv-torch-latest-v100 #12664: Pull request #6846 synchronize by oelayan7
December 17, 2024 07:46 6h 0m 26s oelayan7:keep_module_on_host
[inf] Add config var to enable keeping module on host
nv-torch-latest-v100 #12663: Pull request #6846 synchronize by oelayan7
December 17, 2024 07:39 Action required oelayan7:keep_module_on_host
Fix error caused by all_reduce call in domino
nv-torch-latest-v100 #12662: Pull request #6880 synchronize by hwchen2017
December 17, 2024 01:46 2h 8m 10s hongwei/fix_domino_allreduce
Add arctic model support by adding w2 to all_reduce
nv-torch-latest-v100 #12661: Pull request #6856 synchronize by tjruwase
December 17, 2024 01:35 1h 52m 27s pi314ever:arctic-enabling-upstream