Assays that measure changes in microbial abundance against a single genotype do not account for interactions between bacteria and host genetics.
We therefore created a tool that uses microbiome PheWAS to discover new interaction networks between host genetics and microbes.
Python 2.7
version from https://www.python.org/download/releases/2.7
matplotlib
: pip install matplotlib
pandas 0.23.4
: pip install pandas
numpy 1.15.4
: pip install numpy
seaborn 0.9.0
: pip install seaborn
networkx 2.2
: pip install networkx
metagenomeSeq
: for CSS normalization
R Script
source("http://bioconductor.org/biocLite.R")
biocLite("metagenomeSeq")
sklearn
: pip install scikit-learn
math
: part of the Python standard library
Plink 1.9
version for genotype analysis from https://www.cog-genomics.org/plink2
After downloading plink, link the binary into your PATH with the following command:
sudo ln -s /absolute/Path/of/plink /usr/local/bin/plink
R
version 3.4 or higher
db19_20k.gz
for Gene mode from https://drive.google.com/open?id=1hEUdViceUQIO-_-zSShxUqW6W4qashXu
--DIR
: Path to the directory containing your Plink-format data
--Input_prefix
: Prefix (ID) of the Plink files (.bed, .bim, .fam)
--OTU_ID
: File name of the OTU table
--Bacterial_class
: Choose the bacterial taxonomic level: Species(S), Genus(G), Family(F), Order(O), Class(C), or Phylum(P)
--Analysis
: Choose the analysis mode: Linear or NMF (non-negative matrix factorization). Logistic is not yet available.
see http://zzz.bwh.harvard.edu/plink/anal.shtml
--P_cut
: P-value cutoff for single-SNP association, based on the Wald test of the linear quantitative trait loci analysis.
see the .qassoc output described at http://zzz.bwh.harvard.edu/plink/anal.shtml
--P_count
: Set how many bacteria must pass the significance cutoff (--P_cut) for a SNP. This is used to find SNPs that control several bacteria.
--PHEWAS_image_mode
: Y or YES draws the PheWAS results as an image like fig.1 (default: None)
--NMF_K
: If the --Analysis option is NMF, set the number of NMF components K
--Gene_mode
: Y or YES groups SNPs within 20 kb of a gene region under that gene's name (default: None)
Requires db19_20k.gz from https://drive.google.com/open?id=1hEUdViceUQIO-_-zSShxUqW6W4qashXu
--Cov
: Covariate file name. Only the plink covariate file format is supported (requires --Cov_names).
--Cov_names
: Comma-separated covariate names such as age,sex,bmi, etc. (requires --Cov)
--Norm
: Operational taxonomic unit (OTU) table normalization. Choose TSS or CSS (default: TSS).
--Corr
: Method for the SNP-SNP beta correlation (default pearson; bray-curtis, etc.) *option not yet available
--Corr_cut
: Set the correlation coefficient cutoff *option not yet available
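To illustrate the --Norm option: total-sum scaling (TSS) divides each sample's counts by that sample's total, so abundances become proportions. The sketch below assumes an OTU table with taxa as rows and samples as columns. That orientation, the function name `tss_normalize`, and the toy counts are assumptions made for illustration, not hGMNet's own code. CSS is not shown because it is delegated to metagenomeSeq in R.

```python
import pandas as pd


def tss_normalize(counts):
    """Total-sum scaling: divide each sample's counts by that sample's total."""
    totals = counts.sum(axis=0)        # one total per sample (column)
    return counts.div(totals, axis=1)  # align totals with the columns


# Hypothetical toy table: rows are taxa, columns are samples.
counts = pd.DataFrame(
    {"sample_A": [10, 30, 60], "sample_B": [5, 5, 40]},
    index=["Bacteroides", "Prevotella", "Faecalibacterium"],
)
relative = tss_normalize(counts)
print(relative)
```

After scaling, every column sums to 1. A sample whose total count is zero would produce NaN values, so such samples are usually dropped beforehand.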
/downloaded/hGMNet/Path/hGMNet.sh \
--OTU_ID your_OTU.txt \
--Bacterial_class F \
--OTU_DIR /your/OTU/path \
--Input_prefix plink_file_id \
--DIR /your/plink/.bed.bim.fam/path \
--Analysis Linear \
--P_cut 5e-6 \
--P_count 10 \
--PHEWAS_image_mode Y
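The way --P_cut and --P_count combine in the command above can be sketched as a two-step filter: count, for each SNP, how many taxa it is significantly associated with, then keep the SNPs whose count reaches the threshold. The in-memory layout (a dict of per-taxon P-values), the function name, and the use of "at least" rather than "more than" are assumptions. The tool itself reads PLINK output and may differ in detail.

```python
from collections import Counter


def snps_hitting_many_taxa(pvalues_by_taxon, p_cut=5e-6, p_count=10):
    """Return {snp: n_taxa} for SNPs significant in at least p_count taxa.

    pvalues_by_taxon maps a taxon name to a {snp_id: p_value} dict
    (hypothetical layout, e.g. parsed from per-taxon association results).
    """
    hits = Counter()
    for snp_pvalues in pvalues_by_taxon.values():
        for snp, p_value in snp_pvalues.items():
            if p_value < p_cut:  # passes the single-SNP cutoff for this taxon
                hits[snp] += 1
    # Assumption: ">=" threshold; the tool may use a strict ">".
    return {snp: n for snp, n in hits.items() if n >= p_count}


# Toy P-values for three families and three SNPs (made-up numbers).
toy = {
    "Lachnospiraceae": {"rs1": 1e-7, "rs2": 0.2, "rs3": 3e-6},
    "Ruminococcaceae": {"rs1": 2e-6, "rs2": 1e-8, "rs3": 0.04},
    "Bacteroidaceae": {"rs1": 4e-6, "rs2": 0.5, "rs3": 1e-6},
}
selected = snps_hitting_many_taxa(toy, p_cut=5e-6, p_count=2)
print(sorted(selected.items()))  # [('rs1', 3), ('rs3', 2)]
```

Raising `p_count` narrows the result to SNPs linked to many taxa at once, while `p_count=1` keeps every SNP that passes the cutoff for any taxon.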
/downloaded/hGMNet/Path/hGMNet.sh \
--OTU_ID your_OTU.txt \
--Bacterial_class S \
--OTU_DIR /your/OTU/path \
--Input_prefix plink_file_id \
--DIR /your/plink/.bed.bim.fam/path \
--Analysis Linear \
--P_cut 5e-6 \
--P_count 4 \
--Cov /covariate/path/covariate.txt \
--Cov_names age,sex,bmi
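Before a run with covariates, it can help to confirm that every name passed to --Cov_names actually appears in the covariate file. The sketch below assumes the usual PLINK covariate layout: a header line starting with FID and IID, followed by one column per covariate. The helper name `missing_covariates` and the toy rows are hypothetical.

```python
def missing_covariates(cov_lines, cov_names):
    """Return the requested covariate names that are absent from the header.

    cov_lines: iterable of text lines from a PLINK-style covariate file.
    cov_names: comma-separated string, as it would be given to --Cov_names.
    """
    header = next(iter(cov_lines)).split()
    if header[:2] != ["FID", "IID"]:
        raise ValueError("expected the header to start with 'FID IID'")
    available = set(header[2:])
    requested = [name.strip() for name in cov_names.split(",") if name.strip()]
    return [name for name in requested if name not in available]


# Toy covariate file content (made-up individuals).
cov_lines = [
    "FID IID age sex bmi",
    "fam1 ind1 34 1 22.5",
    "fam2 ind2 51 2 27.1",
]
print(missing_covariates(cov_lines, "age,sex,bmi"))  # []
print(missing_covariates(cov_lines, "age,smoking"))  # ['smoking']
```

With a real file, pass the open file object instead of the list, for example `missing_covariates(open("/covariate/path/covariate.txt"), "age,sex,bmi")`.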
/downloaded/hGMNet/Path/hGMNet.sh \
--OTU_ID your_OTU.txt \
--Bacterial_class F \
--OTU_DIR /your/OTU/path \
--Input_prefix plink_file_id \
--DIR /your/plink/.bed.bim.fam/path \
--Analysis NMF \
--NMF_K 8 \
--P_cut 5e-6 \
--P_count 1
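For readers unfamiliar with NMF, the role of --NMF_K can be illustrated with scikit-learn, which is listed among the dependencies. The sketch factorizes a non-negative samples-by-taxa matrix into K components using the multiplicative-update solver. The random toy matrix, the choice of solver and initialization, and the samples-by-taxa orientation are assumptions. Which matrix hGMNet decomposes, and how it uses the components afterwards, is not specified here.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.RandomState(0)
# Toy non-negative abundance matrix: 20 samples x 12 taxa (random numbers).
abundance = rng.rand(20, 12)

k = 8  # plays the role of --NMF_K
model = NMF(
    n_components=k,
    init="nndsvda",         # deterministic SVD-based start without exact zeros
    solver="mu",            # multiplicative updates
    beta_loss="frobenius",  # squared-error special case of the beta-divergence
    max_iter=500,
    random_state=0,
)
sample_scores = model.fit_transform(abundance)  # samples x components
taxon_loadings = model.components_              # components x taxa

print(sample_scores.shape, taxon_loadings.shape)
```

Each row of `taxon_loadings` describes one component as a non-negative mix of taxa, and each column of `sample_scores` gives that component's weight in every sample. A smaller K yields coarser groupings of taxa.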
/downloaded/hGMNet/Path/hGMNet.sh \
--OTU_ID your_OTU.txt \
--Bacterial_class G \
--OTU_DIR /your/OTU/path \
--Input_prefix plink_file_id \
--DIR /your/plink/.bed.bim.fam/path \
--Analysis NMF \
--Norm CSS \
--NMF_K 8 \
--P_cut 5e-6 \
--P_count 1 \
--Cov /covariate/path/covariate.txt \
--Cov_names age,sex,bmi
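The --Bacterial_class letter used in these examples selects the taxonomic level at which abundances are analysed. One way to picture this is collapsing OTU counts to the chosen rank, as sketched below. The sketch assumes Greengenes-style lineage strings (`p__`, `c__`, ..., `s__` prefixes separated by semicolons). That format, the helper name, and the toy counts are assumptions, since the OTU file format expected by the tool is not described here.

```python
# Map the --Bacterial_class letters to rank prefixes (assumed lineage format).
LEVEL_PREFIX = {"P": "p__", "C": "c__", "O": "o__",
                "F": "f__", "G": "g__", "S": "s__"}


def collapse_to_level(otu_counts, level):
    """Sum per-sample OTU counts for each taxon at the requested level.

    otu_counts maps a lineage string to a list of per-sample counts.
    """
    prefix = LEVEL_PREFIX[level]
    collapsed = {}
    for lineage, counts in otu_counts.items():
        ranks = [rank.strip() for rank in lineage.split(";")]
        name = next((r for r in ranks
                     if r.startswith(prefix) and len(r) > len(prefix)), None)
        if name is None:
            continue  # lineage is not resolved at this level
        if name in collapsed:
            collapsed[name] = [a + b for a, b in zip(collapsed[name], counts)]
        else:
            collapsed[name] = list(counts)
    return collapsed


# Toy table: three OTUs, two samples (made-up counts).
toy_otus = {
    "k__Bacteria; p__Firmicutes; c__Clostridia; o__Clostridiales; "
    "f__Lachnospiraceae; g__Blautia": [4, 0],
    "k__Bacteria; p__Firmicutes; c__Clostridia; o__Clostridiales; "
    "f__Lachnospiraceae; g__Roseburia": [1, 7],
    "k__Bacteria; p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; "
    "f__Bacteroidaceae; g__Bacteroides": [9, 2],
}
family_counts = collapse_to_level(toy_otus, "F")
print(family_counts)
```

At level F the two Lachnospiraceae genera merge into a single row, while at level G they stay separate.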
git clone https://github.com/Sung-Bong-Kang/hGMNet.git
bash ./Setup.sh
./SetUP_example/Example_run.sh
[fig.1 microbiome PheWAS image mode result]
[fig.2 Bacteria and host genotype interaction network]
[1]
[2] Cronin, Robert M.; Field, Julie R.; Bradford, Yuki; Shaffer, Christian M.; Carroll, Robert J.; Mosley, Jonathan D.; Bastarache, Lisa; Edwards, Todd L.; Hebbring, Scott J. (2014). "Phenome-wide association studies demonstrating pleiotropy of genetic variants within FTO with and without adjustment for body mass index". Frontiers in Genetics. 5: 250. doi:10.3389/fgene.2014.00250. ISSN 1664-8021. PMC 4134007. PMID 25177340.
[3] Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ & Sham PC (2007) PLINK: a toolset for whole-genome association and population-based linkage analysis. American Journal of Human Genetics, 81.
[4] Fevotte, C., & Idier, J. (2011). Algorithms for nonnegative matrix factorization with the beta-divergence. Neural Computation, 23(9).