Actions: microsoft/DeepSpeed

nv-accelerate-v100

4,942 workflow runs

nv-accelerate-v100
nv-accelerate-v100 #12646: Merge group checks requested
January 4, 2025 05:58 7m 34s
Fix: forbid repeated deepspeed.initialize on training objects
nv-accelerate-v100 #12645: Pull request #6874 synchronize by loadams
January 4, 2025 04:39 Action required traincheck-team:fix-6848-forbid-repeated-init
nv-accelerate-v100
nv-accelerate-v100 #12644: Scheduled
January 4, 2025 00:06 7m 32s master
Use ds-specific module id to avoid conflicts
nv-accelerate-v100 #12643: Pull request #6847 synchronize by loadams
January 3, 2025 22:04 11m 37s olruwase/pr_6772
Add the missing view operations from sequence parallel(async).
nv-accelerate-v100 #12642: Pull request #6750 synchronize by loadams
January 3, 2025 19:32 12m 44s inkcherry:ds_overlap_fix
Add fp8_gemm fallback for non-triton systems
nv-accelerate-v100 #12640: Pull request #6916 synchronize by loadams
January 3, 2025 16:54 1h 41m 40s oelayan7:fp8_gemm_no_triton
[BUG FIX]:fix get torch.version.cuda error when cuda is None in rocm
nv-accelerate-v100 #12639: Pull request #6909 synchronize by loadams
January 3, 2025 16:28 54m 53s hj-wei:dev_hjwei
Fix checkpointable_layers Logic
nv-accelerate-v100 #12638: Pull request #6881 synchronize by loadams
January 3, 2025 16:28 11m 18s Quentin-Anthony:qanthony/fix-act-recomp
nv-accelerate-v100
nv-accelerate-v100 #12637: Merge group checks requested
January 3, 2025 15:38 8m 8s
Support pure meta model lm_head tp
nv-accelerate-v100 #12636: Pull request #6812 synchronize by delock
January 3, 2025 02:56 Action required Yejing-Lai:lyj/lm_head_replace
nv-accelerate-v100
nv-accelerate-v100 #12635: Scheduled
January 3, 2025 00:07 11m 1s master
Cleanup ops/transformer/inference tests
nv-accelerate-v100 #12633: Pull request #6830 synchronize by loadams
January 2, 2025 18:47 7m 19s loadams/transformers-inference
Autotp training
nv-accelerate-v100 #12631: Pull request #6922 synchronize by inkcherry
January 2, 2025 03:54 3m 56s inkcherry:autotp_training
nv-accelerate-v100
nv-accelerate-v100 #12630: Scheduled
January 2, 2025 00:07 3m 51s master
nv-accelerate-v100
nv-accelerate-v100 #12629: Scheduled
January 1, 2025 00:08 3m 54s master
Add fp8_gemm fallback for non-triton systems
nv-accelerate-v100 #12628: Pull request #6916 synchronize by oelayan7
December 31, 2024 12:01 11m 23s oelayan7:fp8_gemm_no_triton
[inf] Add config var to enable keeping module on host
nv-accelerate-v100 #12627: Pull request #6846 synchronize by oelayan7
December 31, 2024 07:32 3m 55s oelayan7:keep_module_on_host
nv-accelerate-v100
nv-accelerate-v100 #12626: Scheduled
December 31, 2024 00:07 12m 31s master
Use ds-specific module id to avoid conflicts
nv-accelerate-v100 #12625: Pull request #6847 synchronize by loadams
December 30, 2024 21:04 11m 9s olruwase/pr_6772
[BUG FIX]:fix get torch.version.cuda error when cuda is None in rocm
nv-accelerate-v100 #12624: Pull request #6909 synchronize by loadams
December 30, 2024 21:02 12m 48s hj-wei:dev_hjwei
Stage3: Use new torch grad accumulation hooks API
nv-accelerate-v100 #12623: Pull request #6773 synchronize by loadams
December 30, 2024 18:54 21m 1s deepcharm:stage3-use-new-grad-acc-api
Fix checkpointable_layers Logic
nv-accelerate-v100 #12622: Pull request #6881 synchronize by loadams
December 30, 2024 18:53 33m 40s Quentin-Anthony:qanthony/fix-act-recomp