Add SDPA support for LayoutLMv3 model #35469

stancld · 2024-12-31T12:28:18Z

What does this PR do?

Part of #35467.
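For context, SDPA here refers to PyTorch's `torch.nn.functional.scaled_dot_product_attention`. The sketch below is not the code from this PR. It only illustrates, under assumed tensor shapes and with a random additive bias standing in for whatever mask or bias terms a model adds to its attention scores, that the fused call computes the same result as an eager implementation. The helper names `eager_attention` and `sdpa_attention` are hypothetical.

```python
import math

import torch
import torch.nn.functional as F


def eager_attention(q, k, v, bias):
    """Reference attention: softmax(QK^T / sqrt(d) + bias) V, materialising the score matrix."""
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    probs = torch.softmax(scores + bias, dim=-1)
    return probs @ v


def sdpa_attention(q, k, v, bias):
    """Same computation via PyTorch's fused kernel; a float attn_mask is added to the scores."""
    return F.scaled_dot_product_attention(q, k, v, attn_mask=bias)


torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
batch, heads, seq_len, head_dim = 2, 4, 8, 16
q, k, v = (torch.randn(batch, heads, seq_len, head_dim) for _ in range(3))
bias = torch.randn(batch, heads, seq_len, seq_len)

eager_out = eager_attention(q, k, v, bias)
sdpa_out = sdpa_attention(q, k, v, bias)

# The two paths should agree up to floating-point tolerance.
max_diff = (eager_out - sdpa_out).abs().max().item()
print(f"max abs difference: {max_diff:.2e}")
```

The fused path avoids keeping the full attention-probability matrix around in the way the eager path does, which is the usual source of the speed and memory gains; the exact savings depend on the backend PyTorch selects.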

Performance benchmark

Speed and memory consumption when training a LayoutLMv3-like model for token classification, with multilingual support, various auxiliary tasks, and masked language modelling.

GPU: 1x A100 80 GB
Batch size: 16, Accumulated gradient batches: 8

Impl.	Speed	Peak memory
Eager	~2.0 it/s	66.7 GiB
SDPA	~3.0 it/s	47.2 GiB

Overall, a ~50% speed-up and a ~29% reduction in peak memory are observed.
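As a quick sanity check, the relative changes can be recomputed from the figures in the table. This is a minimal sketch; the helper `relative_change` is a hypothetical name introduced here, and the inputs are the approximate values reported above.

```python
def relative_change(before: float, after: float) -> float:
    """Return the change from `before` to `after` as a fraction of `before`."""
    return (after - before) / before


# Approximate values taken from the benchmark table.
eager_speed, sdpa_speed = 2.0, 3.0    # iterations per second
eager_mem, sdpa_mem = 66.7, 47.2      # peak memory, as reported

speed_change = relative_change(eager_speed, sdpa_speed)
memory_change = relative_change(eager_mem, sdpa_mem)

print(f"speed:  {speed_change:+.1%}")   # +50.0%
print(f"memory: {memory_change:+.1%}")  # -29.2%
```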

Before submitting

Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

cc: @ArthurZucker

stancld changed the title ~~[WIP] Add SDPA support for LayoutLMv3 model~~ Add SDPA support for LayoutLMv3 model Dec 31, 2024

stancld force-pushed the ds/feat/layoutlmv3-flash-attn branch 4 times, most recently from 1ef5b7e to c5de661 Compare December 31, 2024 13:01

models.layoutlmv3: Add SDPA support for LayoutLMv3 model

923cdea

stancld force-pushed the ds/feat/layoutlmv3-flash-attn branch from c5de661 to 923cdea Compare January 2, 2025 10:03
