add resource utilization calculator #1315

Draft: wants to merge 17 commits into main
Conversation

@irenaby (Collaborator) commented Jan 6, 2025

Pull Request Description:

Add a resource utilization calculator.
Update the resource utilization data facade to use the new calculator.
Update the mixed precision search manager to use the new calculator, and remove the RU functions mapping.
Update the core runner to use the new calculator to compute the final resource utilization.
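
For orientation, here is a minimal self-contained sketch of the delegation pattern described above. Only the name ResourceUtilizationCalculator and its constructor arguments appear in this PR; the result fields, the compute_resource_utilization method name, and the facade function are stand-ins for illustration.

    from dataclasses import dataclass


    @dataclass
    class ResourceUtilization:
        """Stand-in result container (field names are illustrative)."""
        weights_memory: float = 0.0
        activation_memory: float = 0.0


    class ResourceUtilizationCalculator:
        """Stand-in: one object owns all utilization math, so the facade, the
        mixed precision search manager and the core runner can delegate to it
        instead of keeping per-metric function mappings."""

        def __init__(self, graph, fw_impl=None, fw_info=None):
            self.graph = graph
            self.fw_impl = fw_impl
            self.fw_info = fw_info

        def compute_resource_utilization(self) -> ResourceUtilization:
            # The real implementation walks the graph; this placeholder only
            # illustrates the call shape.
            return ResourceUtilization()


    def resource_utilization_data(graph) -> ResourceUtilization:
        """Stand-in facade: delegates to the calculator instead of computing inline."""
        return ResourceUtilizationCalculator(graph).compute_resource_utilization()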

Checklist before requesting a review:

  • I set the appropriate labels on the pull request.
  • I have added/updated the release note draft (if necessary).
  • I have updated the documentation to reflect my changes (if necessary).
  • All functions and files are well documented.
  • All functions and classes have type hints.
  • There is a license in all files.
  • The function and variable names are informative.
  • I have checked for code duplications.
  • I have added new unit tests (if necessary).

@irenaby requested review from ofirgo and elad-c on January 7, 2025
elif target_criterion == TargetInclusionCriterion.AnyQuantized:
    nodes = [n for n in self.graph if n.has_any_weight_attr_to_quantize()]
elif target_criterion == TargetInclusionCriterion.QNonConfigurable:
    # TODO this is wrong. Need to look at specific weights and not the whole node
Collaborator:

What is wrong? (Add details to the comment if you are not planning on taking care of it in this PR, so we understand what the problem is when we come to fix it.)

    continue
for n in cut_target_nodes:
    qc = act_qcs.get(n) if act_qcs else None
    util_per_cut_per_node[cut][n] = self.compute_node_activation_tensor_utilization(n, target_criterion,
Collaborator:

I didn't get into this, but @elad-c you should take a careful look here to verify that the target_criterion here enforces the behavior we agreed on for which memory elements to include in a cut's memory-estimation computation.
Maybe we should even raise an exception if the target_criterion for activation (i.e., max cut) is not aligned with the agreed current behavior (TargetInclusionCriterion.AnyQuantized, if I understand the flow of the code correctly).
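
A minimal sketch of the guard suggested here, assuming TargetInclusionCriterion is available in the module under review and that AnyQuantized is indeed the agreed criterion; raising (rather than warning) is also an assumption.

    def _validate_cut_inclusion_criterion(target_criterion) -> None:
        """Illustrative guard: activation (max-cut) utilization is expected to be
        computed with the AnyQuantized inclusion criterion."""
        if target_criterion != TargetInclusionCriterion.AnyQuantized:
            raise ValueError(f'Activation cut utilization currently assumes '
                             f'TargetInclusionCriterion.AnyQuantized, got {target_criterion}.')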

Collaborator @elad-c left a comment:

I didn't comment in all places, but in general: fix or explain the combination of an argument that is required but whose type hint says Optional.

""" Checks whether the specific weight has a configurable quantization. """
return self.is_weights_quantization_enabled(attr_name) and not self.is_all_weights_candidates_equal(attr_name)

def has_configurable_activation(self):
Collaborator:

Add a return type hint.
Also, it has no parameters, so maybe switch it to a property.

Collaborator Author:

Added a type hint.
I don't think the lack of arguments automatically qualifies it as a property; a property "pretends" to be a field, and this is more logic than a field.
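
For illustration, a self-contained stand-in (not the real node quantization config) showing the agreed outcome: a regular method with an explicit bool return type hint rather than a property, since it encapsulates logic rather than exposing state.

    from typing import List


    class _ActivationConfigSketch:
        """Stand-in holding a list of candidate activation bit-widths."""

        def __init__(self, activation_candidates: List[int]):
            self.activation_candidates = activation_candidates

        def has_configurable_activation(self) -> bool:
            # "Configurable" here means more than one distinct candidate exists;
            # the real predicate in MCT may differ.
            return len(set(self.activation_candidates)) > 1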

ru.activation_memory <= self.activation_memory and \
ru.total_memory <= self.total_memory and \
ru.bops <= self.bops
def __repr__(self):
Collaborator:

Doesn't the dataclass autogenerate the __repr__ method?

Collaborator Author:

Yes, but the autogenerated repr looks like code (as it should). Here it is used to print the summary, so I kept it identical to the previous format.
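
To illustrate the point: a dataclass's autogenerated __repr__ reads like a constructor call, while a custom one can keep the summary-style output. The class below is a simplified stand-in; the field names follow the attributes visible in the excerpt above.

    from dataclasses import dataclass


    @dataclass
    class _RUSketch:
        activation_memory: float
        total_memory: float
        bops: float

        def __repr__(self) -> str:
            # Autogenerated form: _RUSketch(activation_memory=..., total_memory=..., bops=...)
            # Custom form keeps a human-readable summary line instead:
            return (f'Activation memory: {self.activation_memory}, '
                    f'Total memory: {self.total_memory}, '
                    f'BOPS: {self.bops}')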

def compute_weights_utilization(self,
                                target_criterion: TargetInclusionCriterion,
                                bitwidth_mode: BitwidthMode,
                                w_qcs: Optional[Dict[BaseNode, NodeWeightsQuantizationConfig]] = None) \
Collaborator:

If it must be provided, then why is it Optional? Also, better to add validation for it.

Collaborator Author:

It is optional: it is only used for the custom bit-width mode. In custom mode it must be provided for all configurable weights; for non-configurable weights it can be retrieved from the node if not passed. I'll try to make this clearer.
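
A hedged sketch of the validation hinted at here. The BitwidthMode member name (QCustom) and the has_configurable_weights helper are hypothetical, used only to make the rule concrete.

    def _validate_w_qcs(nodes, bitwidth_mode, w_qcs) -> None:
        """Illustrative check: in the custom bit-width mode a weights quantization
        config must be supplied for every node with configurable weights; for
        non-configurable weights the node's own config can be used as a fallback."""
        if bitwidth_mode != BitwidthMode.QCustom:   # assumed enum member name
            return
        missing = [n for n in nodes
                   if n.has_configurable_weights()  # hypothetical helper
                   and (w_qcs is None or n not in w_qcs)]
        if missing:
            raise ValueError(f'w_qcs must cover all configurable nodes in the custom '
                             f'bit-width mode; missing for: {missing}')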

mixed_precision_enable=False,
running_gptq=False)

ru_calculator = ResourceUtilizationCalculator(transformed_graph, fw_impl, fw_info)
Collaborator:

Can we create a single RUCalculator, so we don't have to calculate everything twice?

Collaborator Author:

In another PR. I'll add a TODO.

Collaborator:

@irenaby please add this (and the other RU-related open tasks that we mentioned during the work on this one and postponed to the next PR) to the ClickUp task.
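
A sketch of the deferred follow-up: build the calculator once and share the instance so utilization is not computed twice. Only ResourceUtilizationCalculator and its constructor arguments come from this PR; the two helper functions are hypothetical.

    # Hypothetical wiring (not the actual core-runner code):
    ru_calculator = ResourceUtilizationCalculator(transformed_graph, fw_impl, fw_info)

    # Reuse the same instance instead of re-instantiating it per stage.
    mp_cfg = run_mixed_precision_search(transformed_graph, ru_calculator)    # hypothetical helper
    final_ru = compute_final_resource_utilization(mp_cfg, ru_calculator)     # hypothetical helper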

tests_pytest/core/__init__.py (outdated, resolved)
tests_pytest/core/common/__init__.py (outdated, resolved)
tests_pytest/core/common/mixed_precision/__init__.py (outdated, resolved)
Collaborator @ofirgo left a comment:

Completed reviewing the code itself (still need to go over the tests, but don't wait for me).
Looks very good! Most of the comments are about documentation and usability, nothing major.

NodeActivationQuantizationConfig


# TODO take into account Virtual nodes. Are candidates defined with respect to virtual or original nodes?
Collaborator:

Regarding the first question: the candidates of a virtual node are a combination of the candidates of the nodes it is composed of, from the original graph.

Regarding the second question: I think we can separate it like this. In any case, we can dive into this when re-implementing the BOPs metric, so keep the TODO and we'll address it in the future.

model_compression_toolkit/core/runner.py (outdated, resolved)
model_compression_toolkit/core/runner.py (outdated, resolved)
3 participants