
Questions about the acceptable sequence length of borzoi #72

Open
HelloWorldLTY opened this issue Oct 21, 2024 · 8 comments

Comments

@HelloWorldLTY

Hi, I notice that the Borzoi model in gReLU accepts 512 as the input sequence length, which differs from the default Borzoi setting (https://www.biorxiv.org/content/10.1101/2023.08.30.555582v1.full.pdf), which should be 524 kb.

Thanks a lot.

@avantikalal
Collaborator

Hi @HelloWorldLTY, that doesn't sound right; our Borzoi model should also take 524 kb input. Please see tutorial 1, where we use the model to make predictions on a 524 kb long sequence. Could you clarify where you found the number 512?

@HelloWorldLTY
Author

Hi, thanks for your quick reply.

  1. I can run the Borzoi model with a 512 bp input (screenshot of the successful run omitted), which seems strange to me. I use this approach:

model_params = {
    'model_type':'BorzoiPretrainedModel', # Type of model
    'n_tasks': 1, # Number of cell types to predict
    'crop_len':0, # No cropping of the model output
#     'n_transformers': 11, # Number of transformer layers; the published Enformer model has 11
}

train_params = {
    'task':'regression', # regression (predicting continuous values)
    'lr':1e-4, # learning rate
    'logger': 'csv', # Logs will be written to a CSV file
    'batch_size': 4,
    'num_workers': 8,
    'devices': 0, # GPU index
    'save_dir': experiment,
    'optimizer': 'adam',
    'max_epochs': 50,
    'checkpoint': True, # Save checkpoints
    'loss': 'MSE'
}

import grelu.lightning
model = grelu.lightning.LightningModel(model_params=model_params, train_params=train_params)
  2. If I set the input sequence length to 524288, there is an error (which I also raised a couple of weeks ago in the Genentech internal Slack):
File ~/.conda/envs/evo/lib/python3.11/site-packages/grelu/model/blocks.py:725, in UnetBlock.forward(self, x, y)
    723 x = self.conv(x)
    724 x = self.upsample(x)
--> 725 x = torch.add(x, self.channel_transform(y))
    726 x = self.sconv(x)
    727 return x

RuntimeError: The size of tensor a (5018) must match the size of tensor b (5019) at non-singleton dimension 2
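For context, this kind of mismatch is the classic U-Net off-by-one: when an intermediate feature length is odd, pooling floors it and the subsequent upsampling cannot restore the original length, so the skip connection no longer lines up. A minimal pure-Python sketch (illustrative only, not gReLU internals; the lengths 5019/5018 are taken from the traceback above, not derived from the real architecture):

```python
# Illustrative sketch: why a U-Net skip connection can end up one
# element short when an intermediate length is odd.

def pool(length: int, stride: int = 2) -> int:
    # Pooling with kernel == stride floors odd lengths
    return length // stride

def upsample(length: int, factor: int = 2) -> int:
    return length * factor

skip_len = 5019                       # hypothetical odd-length skip tensor
restored = upsample(pool(skip_len))   # 5018, one short of the skip tensor
print(skip_len, restored)             # the torch.add() then fails on dim 2
```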

@avantikalal
Collaborator

This is because 'BorzoiPretrainedModel' is not the same thing as the actual Borzoi model. To load the actual Borzoi model with the architecture and weights trained by Linder et al., please follow the instructions in tutorial 1.

What you are doing here with BorzoiPretrainedModel is creating a new model that has the same convolutional and transformer layers as Borzoi, but with a final (head) layer that you define yourself. As such, you can set whatever parameters you want, and since you have set crop_len=0 (no cropping of the model output), it will work with shorter inputs.
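The output-length bookkeeping behind this can be sketched as follows. The 32-bp bin size comes from the Borzoi paper; the crop value of 5120 bins (yielding the published 6,144-bin output) is an assumption for illustration, and crop_len here is counted in bins:

```python
# Hedged sketch of Borzoi-style output-length arithmetic, assuming the
# published 32-bp bin resolution; crop_len is measured in bins per side.
BIN_SIZE = 32

def output_bins(seq_len: int, crop_len: int = 0) -> int:
    # Number of prediction bins after cropping crop_len bins from each end
    return seq_len // BIN_SIZE - 2 * crop_len

print(output_bins(512))            # 16 bins from a 512-bp input, no crop
print(output_bins(524288))         # 16384 bins, uncropped
print(output_bins(524288, 5120))   # 6144 bins (assumed published crop)
```

With crop_len=0, any input length divisible by the bin size yields a valid (if short) output, which is why the 512-bp input ran without error.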

@AndreaMariani-AM

Correct me if I'm wrong, but specifying BorzoiPretrainedModel should instantiate a model with the trained weights. From the documentation:
This class creates a model identical to the published Enformer model and initialized with the trained weights, but where you can change the number of transformer layers and the output head.

If I then follow up with tune_on_dataset, I should be able to learn a linear layer while keeping the rest of the weights fixed, and make it work with shorter inputs.

Please correct me if my understanding isn't right.

Best,
Andrea

@HelloWorldLTY
Author

I am also a bit confused here.

"""
Borzoi model with published weights (ported from Keras).
"""

It seems that BorzoiPretrainedModel will give us an identical model. I think what @avantikalal wants to say is that this class changes the output head, and thus the output will be different.

I think you do not need to freeze the other weights unless you really want to. It would be better to fine-tune all weights together.

@AndreaMariani-AM

@HelloWorldLTY I agree with you. It's just that I'm fine-tuning with a small sample size, so I fear the model might overfit if I update all the weights. I'm testing both approaches, by the way, to see whether that's the case.

@avantikalal
Collaborator

@HelloWorldLTY @AndreaMariani-AM, the Borzoi model consists of convolutional / transformer layers followed by a linear 'head' layer that gives you the correct number of output tracks.

If you load the Borzoi model from the model zoo using load_model as shown in tutorial 1, you will get the complete Borzoi model with all layers initialized with their pre-trained weights. You can fine-tune this model on a dataset of your choice using tune_on_dataset.

If you use the BorzoiPretrainedModel class, you will get a model containing the convolutional and transformer layers initialized with their pre-trained weights, but the final linear head will not be initialized with its pretrained weights. It will be an entirely new head based on your input parameters and initialized randomly. This model will not give you the same results as the published Borzoi model. You can fine-tune this model using train_on_dataset as shown in tutorial 2.

I hope this clarifies things. We're aware that having two ways to do the same thing is unnecessary and confusing, and we will change it soon (#58). Therefore, I recommend using load_model and tune_on_dataset as suggested by @AndreaMariani-AM, since the PretrainedModel classes will be removed in the next version.

@AndreaMariani-AM

Perfect! Thanks for the explanation. It makes complete sense to me!

Best,
Andrea
