Error in splitvec.from.bfile(bfile) : length(pvec) == length(bfile) is not TRUE #43

rainajia · 2022-10-11T13:47:31Z

Hi, I ran the pipeline chromosome by chromosome, using the same ref.file and test.file for each chromosome, then merged the outputs using "merge" in a loop.
However, when I ran "validate", it threw the error:
Error in splitvec.from.bfile(bfile) : length(pvec) == length(bfile) is not TRUE
Could you explain what might have caused the error?

tshmak · 2022-10-12T01:49:26Z

Can you give me your entire script?

rainajia · 2022-10-12T07:48:16Z

> Can you give me your entire script?

Hi, my original code is attached below. I have realised that merging the lassosum.pipeline outputs in a loop did not work, but calling merge(out1, out2, out3, ..., out22) in one go did. However, validate is taking a very long time to run on the merged "out". I have a large sample size of ~400k for my phenotype. Which validation method would be the most efficient for large sample sizes?

for (i in 1:22) {
  print(paste0("now processing chromosome ", i))
  bfile <- paste0("./Chr", i)                 # test genotypes for chromosome i
  rfile <- paste0("../Chr", i, "_Random25k")  # reference panel for chromosome i

  tmp <-
    lassosum.pipeline(
      cor = cor,
      chr = ss$CHR,
      pos = ss$POS,
      A1 = ss$A1,
      A2 = ss$A2,
      ref.bfile = rfile,
      test.bfile = bfile,
      max.ref.bfile.n = 25000,
      LDblocks = LDblocks,
      cluster = cl)

  # Merge pairwise as each chromosome finishes
  if (i == 1) {
    out <- tmp
  } else {
    out <- merge(out, tmp)
  }
}
target.res <- lassosum::validate(out, pheno = as.data.frame(pheno), covar = as.data.frame(cov))
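
A minimal sketch of merging all chromosomes in a single call, without typing out1 to out22 by hand. It assumes, as reported in this thread, that merge() on lassosum.pipeline outputs accepts any number of objects at once. The names merge_all_chromosomes and fit_one are hypothetical; fit_one stands in for a function that runs the per-chromosome lassosum.pipeline() call above and returns its result.

```r
# Collect one result per chromosome in a list, then merge them in one call.
# `fit_one` is a hypothetical function of the chromosome number that wraps
# the per-chromosome lassosum.pipeline() call and returns its output.
merge_all_chromosomes <- function(fit_one, chromosomes = 1:22) {
  results <- lapply(chromosomes, fit_one)
  stopifnot(length(results) >= 1)
  # Equivalent to merge(results[[1]], results[[2]], ..., results[[n]]);
  # S3 dispatch picks the merge method for the class of the first element.
  do.call(merge, results)
}

# Intended use (not run here, needs lassosum and the genotype files):
# out <- merge_all_chromosomes(function(i) {
#   lassosum.pipeline(cor = cor, chr = ss$CHR, pos = ss$POS,
#                     A1 = ss$A1, A2 = ss$A2,
#                     ref.bfile = paste0("../Chr", i, "_Random25k"),
#                     test.bfile = paste0("./Chr", i),
#                     LDblocks = LDblocks)
# })
```

Keeping the per-chromosome results in a list also makes it possible to inspect or re-run a single chromosome before merging.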

tshmak · 2022-10-12T08:41:39Z

So are you still getting this error: splitvec.from.bfile(bfile) : length(pvec) == length(bfile) is not TRUE? If so, at which stage?

rainajia · 2022-10-12T08:50:34Z

> So are you still getting this error: splitvec.from.bfile(bfile) : length(pvec) == length(bfile) is not TRUE? If so, at which stage?

I don't get this error anymore when I run validate(out) with "out <- merge(out1, out2, out3, ..., out22)". The error occurred previously when I ran validate(out) where out was built by merging each chromosome's lassosum.pipeline output in a for loop, as shown in the code above. Sorry about the confusion. My current question is which validation method to use for large samples. I have 400k samples with matched genotype and phenotype, and the previous run of validate(out, pheno, covar) has run for over 9 hours with 40 cores. Is this normal behaviour, or is there a better way to parallelise it?

tshmak · 2022-10-12T08:56:40Z

Yes, calculating PGS can take a long time with a large sample size. One way to speed up the calculation is to use multiprocessing (see here). Another is to ensure that covar and pheno are in exactly the same order as test.bfile. (You may also need to ensure there are no missing values, but I can't remember if that's the case.) If everything matches exactly, you will not see the message Calculating PGS..., and it should be very fast.
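
A minimal sketch of the reordering step, assuming pheno and covar are data frames with FID and IID columns and that the sample order is taken from the first two columns of the PLINK .fam file that accompanies test.bfile. align_to_fam is a hypothetical helper and the toy tables stand in for real data. Whether lassosum then skips the Calculating PGS... step depends on its own matching logic, which this sketch does not verify.

```r
# Toy stand-in for the first two columns (FID, IID) of a PLINK .fam file,
# in the order the genotype file stores the samples. With real data this
# could be read as:
#   fam <- read.table(paste0(bfile, ".fam"), stringsAsFactors = FALSE)[, 1:2]
#   names(fam) <- c("FID", "IID")
fam <- data.frame(FID = c("f1", "f2", "f3", "f4"),
                  IID = c("i1", "i2", "i3", "i4"),
                  stringsAsFactors = FALSE)

# Phenotype and covariate tables that arrive in a different row order.
pheno <- data.frame(FID = c("f3", "f1", "f4", "f2"),
                    IID = c("i3", "i1", "i4", "i2"),
                    pheno = c(0.3, 0.1, 0.4, 0.2),
                    stringsAsFactors = FALSE)
covar <- data.frame(FID = c("f2", "f4", "f1", "f3"),
                    IID = c("i2", "i4", "i1", "i3"),
                    age = c(52, 54, 51, 53),
                    stringsAsFactors = FALSE)

# Reorder a table so that its rows follow the .fam sample order.
align_to_fam <- function(tab, fam) {
  key_tab <- paste(tab$FID, tab$IID, sep = ":")
  key_fam <- paste(fam$FID, fam$IID, sep = ":")
  idx <- match(key_fam, key_tab)
  # Stop early if any genotyped sample is absent from the table.
  stopifnot(!anyNA(idx))
  out <- tab[idx, , drop = FALSE]
  rownames(out) <- NULL
  out
}

pheno_aligned <- align_to_fam(pheno, fam)
covar_aligned <- align_to_fam(covar, fam)

# Flag rows with no missing values in either table.
complete <- complete.cases(pheno_aligned) & complete.cases(covar_aligned)
```

The aligned tables can then be passed to validate in place of the originals. The stopifnot inside align_to_fam makes a sample mismatch fail loudly instead of silently producing NA rows.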

rainajia · 2022-10-12T09:01:50Z

> Yes, calculating PGS can take a long time with a large sample size. One way to speed up the calculation is to use multiprocessing (see here). Another is to ensure that covar and pheno are in exactly the same order as test.bfile. (You may also need to ensure there are no missing values, but I can't remember if that's the case.) If everything matches exactly, you will not see the message Calculating PGS..., and it should be very fast.

Thanks very much, Calculating PGS... was exactly what I have been seeing. I will double check on these points.
