Creating a custom database #17

Closed
MarcNIOZ opened this issue Jun 12, 2017 · 18 comments

@MarcNIOZ

Hello all,

I tried to generate a LOBSTAHS database, but I encountered a couple of errors.

When I tried to create a database with the default .csv file, I got the following:

generateLOBdbase(polarity=c("positive","negative"), gen.csv = FALSE, component.defs = "/export/data/mbesseling/Documents/Orbitrap/LOBSTAHS_componentCompTable.csv", AIH.defs=NULL, acyl.ranges=NULL, oxy.ranges=NULL)
Error in read.table(componentTableLoc, sep = ",", header = TRUE, row.names = 1) :
duplicate 'row.names' are not allowed

When I tried to create a database with some additional compounds I got the following:

generateLOBdbase(polarity=c("positive","negative"), gen.csv = FALSE, component.defs = "/export/data/mbesseling/Documents/Orbitrap/LOBSTAHS_componentCompTable.csv", AIH.defs=NULL, acyl.ranges=NULL, oxy.ranges=NULL)
Error in calcComponentMasses(componentTable.loc, use.default.componentTable) :
Different number of chemical building blocks in the component composition matrix and in the onboard list of exact masses. Check your composition matrix carefully. Aborting...

Perhaps somebody can help me with this.

Thanks in advance,

Marc Besseling
PhD student at the Netherlands Institute for Sea Research (NIOZ)

@jamesrco jamesrco self-assigned this Jun 12, 2017
@jamesrco
Member

jamesrco commented Jun 12, 2017

Hi @MarcNIOZ,

Let's start with the first (less complex) error you are receiving when attempting to recreate the default database. When I ran your code using a clean version of the component definitions file (the one I just downloaded from https://github.com/vanmooylipidomics/LOBSTAHS/blob/master/inst/doc/csv/LOBSTAHS_componentCompTable.csv), I did not receive the same error.

However, I just took a look at the default version of the LOBSTAHS_componentCompTable.csv file you are using (which you kindly sent along by e-mail), and the problem appears to be ~ 130 empty lines of csv data at the end of your file. Here is a sample of what it looks like when I open the file in a text reader:

UQ10:10,59,90,0,0,4,0,0,0,0,0,0,0,0,0,0,ubiquinone,ubiquinone,DB_unique_species
UQ11:11,64,98,0,0,4,0,0,0,0,0,0,0,0,0,0,ubiquinone,ubiquinone,DB_unique_species
UQ12:12,69,106,0,0,4,0,0,0,0,0,0,0,0,0,0,ubiquinone,ubiquinone,DB_unique_species
UQ13:13,74,114,0,0,4,0,0,0,0,0,0,0,0,0,0,ubiquinone,ubiquinone,DB_unique_species
PDMS6,12,36,0,0,6,0,0,0,0,0,0,0,0,0,6,PDMS,PDMS,DB_unique_species
PDMS7,14,42,0,0,7,0,0,0,0,0,0,0,0,0,7,PDMS,PDMS,DB_unique_species
PDMS8,16,48,0,0,8,0,0,0,0,0,0,0,0,0,8,PDMS,PDMS,DB_unique_species
PDMS9,18,54,0,0,9,0,0,0,0,0,0,0,0,0,9,PDMS,PDMS,DB_unique_species
PDMS10,20,60,0,0,10,0,0,0,0,0,0,0,0,0,10,PDMS,PDMS,DB_unique_species
PDMS11,22,66,0,0,11,0,0,0,0,0,0,0,0,0,11,PDMS,PDMS,DB_unique_species
PDMS12,24,72,0,0,12,0,0,0,0,0,0,0,0,0,12,PDMS,PDMS,DB_unique_species
PDMS13,26,78,0,0,13,0,0,0,0,0,0,0,0,0,13,PDMS,PDMS,DB_unique_species
PDMS14,28,84,0,0,14,0,0,0,0,0,0,0,0,0,14,PDMS,PDMS,DB_unique_species
PDMS15,30,90,0,0,15,0,0,0,0,0,0,0,0,0,15,PDMS,PDMS,DB_unique_species
PDMS16,32,96,0,0,16,0,0,0,0,0,0,0,0,0,16,PDMS,PDMS,DB_unique_species
PDMS17,34,102,0,0,17,0,0,0,0,0,0,0,0,0,17,PDMS,PDMS,DB_unique_species
PDMS18,36,108,0,0,18,0,0,0,0,0,0,0,0,0,18,PDMS,PDMS,DB_unique_species
PDMS19,38,114,0,0,19,0,0,0,0,0,0,0,0,0,19,PDMS,PDMS,DB_unique_species
PDMS20,40,120,0,0,20,0,0,0,0,0,0,0,0,0,20,PDMS,PDMS,DB_unique_species
PDMS21,42,126,0,0,21,0,0,0,0,0,0,0,0,0,21,PDMS,PDMS,DB_unique_species
PDMS22,44,132,0,0,22,0,0,0,0,0,0,0,0,0,22,PDMS,PDMS,DB_unique_species
PDMS23,46,138,0,0,23,0,0,0,0,0,0,0,0,0,23,PDMS,PDMS,DB_unique_species
PDMS24,48,144,0,0,24,0,0,0,0,0,0,0,0,0,24,PDMS,PDMS,DB_unique_species
PDMS25,50,150,0,0,25,0,0,0,0,0,0,0,0,0,25,PDMS,PDMS,DB_unique_species
PDMS26,52,156,0,0,26,0,0,0,0,0,0,0,0,0,26,PDMS,PDMS,DB_unique_species
PDMS27,54,162,0,0,27,0,0,0,0,0,0,0,0,0,27,PDMS,PDMS,DB_unique_species
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,

Those "empty" lines continue for a bit. Perhaps these were somehow retained after you added new entries to the file and then deleted them? This can happen if you don't actually delete the "empty" lines from the file. What were you using to edit your .csv files? Microsoft Excel is terrible about this. Take a look and see if you can fix the problem. Will take a look at the second issue here soon.
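For anyone hitting the same error, here is a minimal sketch of how the comma-only lines could be stripped before the table is read. It is not part of LOBSTAHS; the tiny stand-in file and its column names are assumptions made only so the example runs on its own, and the real component table is much wider.

```r
# Build a small stand-in file: a header, two real rows, and two comma-only rows.
# The header names are hypothetical placeholders, not the real LOBSTAHS header.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "species,C,H,Si,class",
  "PDMS6,12,36,6,PDMS",
  "PDMS7,14,42,7,PDMS",
  ",,,,",
  ",,,,"
), tmp)

# Reading the file as-is fails because both comma-only rows share the
# empty row name "". Capture the error message instead of stopping.
raw_attempt <- tryCatch(
  read.table(tmp, sep = ",", header = TRUE, row.names = 1),
  error = function(e) conditionMessage(e)
)

# Keep only lines that contain something other than commas and whitespace.
all_lines <- readLines(tmp)
kept_lines <- all_lines[!grepl("^[,[:space:]]*$", all_lines)]

# Parse the cleaned lines; the first column becomes the row names.
cleaned <- read.table(text = kept_lines, sep = ",", header = TRUE, row.names = 1)
```

The cleaned lines could be written back out with `writeLines(kept_lines, "some_new_file.csv")` and that file passed to `component.defs`. Note that `read.table()` only skips lines that are completely blank, which is why the comma-only lines get through.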

@jamesrco
Member

jamesrco commented Jun 12, 2017

I am also opening an issue #18 for a new feature request with the @vanmooylipidomics/lobstahs-devel-team to see if we can't write something to make LOBSTAHS smart enough to correct/ignore that sort of problem in the future.

@MarcNIOZ
Author

Hello Jamie,

Thanks for all the trouble. I used Excel for this, and it is possible that I tried it with the added compounds and then removed them for the default option. This probably kept some of the lines “open”. I will check whether this is also the case with the added compounds error.

@MarcNIOZ
Author

OK, I downloaded the original database file to try again. However, now I get the same error as with the added compounds file.

generateLOBdbase(polarity=c("positive","negative"), gen.csv = FALSE, component.defs = "LOBSTAHS_componentCompTable.csv", AIH.defs=NULL, acyl.ranges=NULL, oxy.ranges=NULL)
Error in calcComponentMasses(componentTable.loc, use.default.componentTable) :
Different number of chemical building blocks in the component composition matrix and in the onboard list of exact masses. Check your composition matrix carefully. Aborting...

@jamesrco
Member

Hi Marc:

I am unable to reproduce the calcComponentMasses building blocks error when I regenerate the default databases in this way, and I am able to generate the databases without issue. Tell me what you get when you try the below:

ncol(read.table("LOBSTAHS_componentCompTable.csv", sep = ",", header = TRUE, row.names = 1))

Result should be 18.

@MarcNIOZ
Author

Hmm, strange. I checked it and it also gives 18.

Is there a way to download the original .csv file? Or can you send it to me? Perhaps it has something to do with Excel and the conversion to .csv.

@jamesrco
Member

@MarcNIOZ
Author

MarcNIOZ commented Jun 12, 2017

I downloaded the Excel file including the instructions.

From the following page: https://github.com/vanmooylipidomics/LOBSTAHS/blob/master/inst/doc/xlsx/LOBSTAHS_componentCompTable.xlsx

@jamesrco
Member

What if you try using the .csv from the link above, do you still receive the error?

I also just tried again, this time by saving the second tab of the LOBSTAHS_componentCompTable.xlsx spreadsheet as a .csv file; it worked fine for me.

@MarcNIOZ
Author

I tried it (by right-clicking on raw and then save as). However, I couldn't download it, as it says "insufficient rights" (translated from Dutch).

@jamesrco
Member

What is "it" and "raw," and where did you attempt to download the file from? Please be specific -- hard to recreate a workflow with nondescriptive terms.

@jamesrco
Member

Sorry, you meant you clicked the "raw" tab on this page: https://github.com/vanmooylipidomics/LOBSTAHS/blob/master/inst/doc/csv/LOBSTAHS_componentCompTable.csv and then attempted to download? I am not sure what that error is about; it sounds like a problem specific to your browser or OS. What if you just copy the text, paste it into a text document, and then save it?

@MarcNIOZ
Author

Ah yes, sorry, it is getting late here. I was about to make a screenshot.

You were right, Chrome didn't let me download it. I was able to do it with Firefox but still got the same error.

@jamesrco
Member

After much discussion and trial and error (mostly via a long email chain between @jamesrco and @MarcNIOZ), we determined that the source of the second error was a critical commit that did not get passed from the LOBSTAHS GitHub repository to Bioconductor in time for the Bioconductor 3.5 release date. I've pushed this commit and a few others to both the devel and release-3.5 Bioconductor branches. Any user installing (or re-installing) the package after roughly noon tomorrow (June 28, 2017) using

source("https://bioconductor.org/biocLite.R")
biocLite("LOBSTAHS")

will get the correct code, which contains the necessary fixes. The revised code requires that the custom LOBSTAHS_componentCompTable (if used) contain 18 columns, including a column for silicon atoms ("Si"). Thanks to @MarcNIOZ for helping to figure this one out.
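As a rough sketch of how a user could check a custom table against this requirement before calling generateLOBdbase(): the helper name `check_component_table` and its arguments are hypothetical and not part of LOBSTAHS, and the column count is taken after the first column is used as row names, as in the `ncol(read.table(...))` test earlier in this thread. The demo file is a small stand-in with made-up column names, so the expected column count is lowered to match it.

```r
# Hypothetical helper (not part of LOBSTAHS): returns a character vector
# describing any problems found; an empty vector means the checks passed.
check_component_table <- function(path, expected_cols = 18L) {
  tab <- read.table(path, sep = ",", header = TRUE, row.names = 1)
  problems <- character(0)
  if (ncol(tab) != expected_cols) {
    problems <- c(problems,
                  sprintf("expected %d columns, found %d", expected_cols, ncol(tab)))
  }
  if (!"Si" %in% colnames(tab)) {
    problems <- c(problems, "no \"Si\" column")
  }
  problems
}

# Demo on two tiny stand-in files (hypothetical headers, 4 columns expected).
with_si <- tempfile(fileext = ".csv")
writeLines(c("species,C,H,Si,class", "PDMS6,12,36,6,PDMS"), with_si)

without_si <- tempfile(fileext = ".csv")
writeLines(c("species,C,H,class", "UQ10:10,59,90,ubiquinone"), without_si)

ok_result <- check_component_table(with_si, expected_cols = 4L)
bad_result <- check_component_table(without_si, expected_cols = 4L)
```

For a real table, the call would be `check_component_table("LOBSTAHS_componentCompTable.csv")` with the default of 18.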

Leaving this issue open for a day or so to make sure the changes propagate, then will close it.

@lee-t
Contributor

lee-t commented Jun 27, 2017

On a potentially unrelated note, this also cleared up my earlier issue with the vignette. I was using the Bioconductor version to run it before. I cloned the latest master and it seems to be working properly.

@jamesrco
Member

Also, added a check in the generateLOBdbase() code (commits eee8703 and c58f0af) which will catch future users who are supplying a table without an "Si" column and provide them with some specific feedback to correct the error.

@jamesrco
Member

@lee-t Remind me to walk you through how we get files from this Git repo to the correct place on the Bioconductor Subversion server. It is far from obvious, and very easy to screw up. I've messed it up a bunch of times, and in one of those instances it took me a day at the command line to fix everything. Supposedly they're migrating to pure Git soon, but the project seems to be on hold.

@jamesrco
Member

jamesrco commented Jul 3, 2017

Closing this one out. The Bioconductor release installation now loads LOBSTAHS v1.2.1, which contains the necessary fixes.
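To confirm which version is installed, something like the sketch below might help. The helper `has_release_fix` is hypothetical, and treating 1.2.1 as the minimum fixed version is an assumption based only on the comment above; builds from the devel branch or from GitHub use different version numbers and are not covered by this check.

```r
# Hypothetical helper: TRUE if `installed` is at least `required`.
# Assumes release version 1.2.1 is the first release build with the fixes.
has_release_fix <- function(installed, required = "1.2.1") {
  package_version(installed) >= package_version(required)
}

# Only query the installed version if LOBSTAHS is actually available.
if (requireNamespace("LOBSTAHS", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("LOBSTAHS"))
  message("LOBSTAHS ", installed, " includes the fixes: ",
          has_release_fix(installed))
}
```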
