[iree-import-onnx] improve handling of large models (iree-org#19217)
This PR adds a few options:

1. `--large-model` allows disabling the ONNX model checker if a user knows ahead of time that the model is too large. It also skips loading the external weights into memory unless the parameters are being saved.
2. `--num-initializers-threshold` allows storing initializers to the `.irpa` file in batches with a specified number of entries. This can reduce the memory overhead of first gathering all of the initializers and then saving them to the `.irpa` file at once.
3. `--externalize-inputs-threshold` allows converting inputs to externalized weights. This is useful for the following workflow: exporting a HF PyTorch model with safetensors, saving a `.irpa` from the safetensors weights directly, and exporting to ONNX with `export_params=False` and `do_constant_folding=False` (which converts weights to inputs and avoids folding weights with things like transposes). When importing to MLIR, you can set `externalize-inputs-threshold=<num_original_inputs>` and it will convert the inputs from and beyond that threshold to `util.global` ops.
4. `--save-params`/`--no-save-params` factors saving parameters out of `import_initializer`, so one can avoid saving parameters with `--no-save-params`. This is useful for debugging compilation failures.

## TODO:

- Figure out what to do about loading the ONNX model and updating the opset version. It's possible to do opset version updating without weights in a somewhat hacky way, since models > 2GB fail on opset version updating.
- Add documentation

---------

Signed-off-by: zjgarvey <zjgarvey@gmail.com>
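As a rough illustration of options 2 and 3, here is a minimal, standard-library-only sketch of the two ideas: flushing initializers in fixed-size batches, and splitting a model's inputs at an index threshold. The names `batched`, `save_in_batches`, `split_inputs`, and the `flush` callback are hypothetical and are not taken from the importer's code; the actual implementation may differ.

```python
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], threshold: int) -> Iterator[List[T]]:
    """Yield lists of at most `threshold` items, one batch at a time."""
    if threshold < 1:
        raise ValueError("threshold must be a positive integer")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == threshold:
            yield batch
            batch = []  # start a fresh list so the old batch can be freed
    if batch:
        yield batch  # final, possibly smaller, batch


def save_in_batches(
    initializers: Iterable[T],
    threshold: int,
    flush: Callable[[List[T]], None],
) -> int:
    """Call the (hypothetical) `flush` callback once per batch.

    Returns the number of batches written. Only one batch is held at a
    time, instead of gathering every initializer before saving.
    """
    count = 0
    for batch in batched(initializers, threshold):
        flush(batch)
        count += 1
    return count


def split_inputs(names: Sequence[str], threshold: int) -> Tuple[List[str], List[str]]:
    """Split input names at `threshold`.

    Inputs before the threshold stay as regular inputs; inputs from and
    beyond it are the ones that would be treated as externalized weights.
    """
    return list(names[:threshold]), list(names[threshold:])
```

With a threshold of 2, five initializers would be flushed in three batches, and a model with two original inputs followed by two weight inputs would be split as `(["input_ids", "mask"], ["w0", "w1"])`.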
Showing 3 changed files with 363 additions and 82 deletions.