Hello,
Thank you for your work!
In our project, we trained an AttentionXML model on 4 GPUs but are now trying to load it in an environment where only one GPU is available.
After modifying the code as described in issue #34, we get the following error:
RuntimeError: Error(s) in loading state_dict for ModuleDict:
Missing key(s) in state_dict: "Network.attention.attention.weight".
Unexpected key(s) in state_dict: "AttentionWeights.emb.0.weight", "AttentionWeights.emb.1.weight", "AttentionWeights.emb.2.weight".
This error occurs only when performing prediction at Level 1; no error occurs in the 4-GPU environment.
We have already tried concatenating "AttentionWeights.emb.0.weight", "AttentionWeights.emb.1.weight", and "AttentionWeights.emb.2.weight", but the resulting tensor does not have the required dimensions.
Do you have any idea how we can get this to work?
Best wishes, Katja
> We have already tried to concatenate "AttentionWeights.emb.0.weight", "AttentionWeights.emb.1.weight", "AttentionWeights.emb.2.weight" but they seem to have a different dimension than required.
Could you please tell me the dimensions of these weights? I think it should be fine to concatenate them.
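For reference, a minimal sketch of what that merge could look like. The key names ("AttentionWeights.emb.{0,1,2}.weight", "Network.attention.attention.weight") are taken from the error messages in this thread; the shard order, the dim=0 concatenation, and the checkpoint file name are assumptions, not the project's official conversion path.

```python
import torch

# Sketch: merge the per-GPU attention shards from the multi-GPU checkpoint
# into the single weight matrix the single-GPU model expects.
# "multi_gpu_checkpoint.pt" is a placeholder path, assumed to contain a plain state_dict.
state_dict = torch.load("multi_gpu_checkpoint.pt", map_location="cpu")

shard_keys = [
    "AttentionWeights.emb.0.weight",
    "AttentionWeights.emb.1.weight",
    "AttentionWeights.emb.2.weight",
]
shards = [state_dict.pop(k) for k in shard_keys]

# Stack the label rows back together along dim 0 (assumed shard order 0, 1, 2).
state_dict["Network.attention.attention.weight"] = torch.cat(shards, dim=0)

torch.save(state_dict, "single_gpu_checkpoint.pt")
```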
Thank you for your response!
The weights have the following dimensions: (67619, 1024), (67618, 1024), and (67618, 1024).
When I try to concatenate them, I get the following error:
RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ModuleDict:
size mismatch for Network.attention.attention.weight: copying a param with shape torch.Size([202855, 1024]) from checkpoint, the shape in current model is torch.Size([202856, 1024])
As far as I understand, the required dimension is 202856, but when I concatenate the tensors with the dimensions above I get 67619 + 67618 + 67618 = 202855.
How many labels do you have? 202856 or 202855?
I check the sum with assert sum(self.group) == labels_num in modules.py, so I'm confused about the mismatch.
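For what it's worth, the off-by-one can be confirmed directly from the shapes reported above; this is just a throwaway check, not project code, and the 202856 value is taken from the size-mismatch error:

```python
# Compare the shard row counts against the label count the single-GPU model expects.
shard_rows = [67619, 67618, 67618]   # rows of AttentionWeights.emb.{0,1,2}.weight
expected_labels = 202856             # from the size-mismatch error above
print(sum(shard_rows), expected_labels)  # 202855 vs 202856 -> one label row is missing from the shards
```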