Hi,
I am working on a long-sequence problem with very large embedding vectors: the embedding size is 500k, and I want to predict multiple steps ahead (about 100 time steps). I followed a GPT-2-style architecture, but I ran into memory problems because my input embedding vectors are 500k-dimensional. The memory problem originally came from attention, which I could successfully replace with your attention, but I still run out of memory in the nn.Linear(emb_size, emb_size) layers of the GPT-2 architecture. Do you have an example or suggestion for using a GPT-2 architecture with your attention?
Thank you
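(For context on the nn.Linear bottleneck: a dense `nn.Linear(500_000, 500_000)` holds 2.5×10¹¹ weights, roughly 1 TB in fp32, so it cannot fit on a GPU regardless of the attention used. One common workaround, sketched below as an assumption rather than anything from this repo, is to replace the square projection with a low-rank factorization through a small bottleneck, cutting the parameter count from d·d to 2·d·r.)

```python
import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    """Hypothetical low-rank substitute for nn.Linear(dim, dim).

    Factorizes the d x d projection as (d x r) @ (r x d), so the
    parameter count drops from dim**2 to 2 * dim * rank.
    """

    def __init__(self, dim: int, rank: int):
        super().__init__()
        self.down = nn.Linear(dim, rank, bias=False)  # d -> r bottleneck
        self.up = nn.Linear(rank, dim, bias=False)    # r -> d back up

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.up(self.down(x))

# Toy dimensions for illustration; the real dim in the issue would be 500_000.
layer = LowRankLinear(dim=1024, rank=64)
x = torch.randn(2, 10, 1024)  # (batch, time, dim)
y = layer(x)
print(tuple(y.shape))  # (2, 10, 1024)
```

Whether rank r can be small enough to help without hurting accuracy depends on the data; an alternative is to first project the 500k input down to a modest model dimension once at the input layer and run the whole transformer at that width.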