- Experimental data were generated by Arnaoutov et al. (2020)
- Arnaoutov A, Lee H, Plevock Haase K, Aksenova V et al. IRBIT Directs Differentiation of Intestinal Stem Cell Progeny to Maintain Tissue Homeostasis. iScience 2020 Mar 27;23(3):100954. PMID: 32179478
- GEO data set: GSE109862
- Processing:
- Sequencing reads were downloaded from SRA, at PRJNA432208
- fastq files were checked for adapter content using FastQC (no adapter sequence was found)
- Reads were aligned to the Drosophila melanogaster genome from Ensembl (Dmel_BDGP6.28) plus 92 ERCC sequences using STAR 2.7.1a, and then quantified with RSEM for abundance at the gene and transcript levels.
Install the package, import the library and load the data set
devtools::install_github('ttdtrang/data-rnaseq-DmelIrbit')
library(data.rnaseq.DmelIrbit)
data(dmelirbit.rnaseq.gene)
dim(dmelirbit.rnaseq.gene@assayData$exprs)
The package includes 2 data sets, one for transcript-level counts/TPM and another for gene-level counts/TPM. Counts are the non-integer expected_count estimates produced by RSEM.
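Because the RSEM expected counts are fractional, tools that require integer counts usually need them rounded first. The following is a minimal sketch of that step on a toy matrix; `round_expected_counts` is a hypothetical helper and is not part of this package.

```r
# Hypothetical helper (not part of data.rnaseq.DmelIrbit): round RSEM
# expected counts to whole numbers for tools that require integer counts.
round_expected_counts <- function(counts) {
  stopifnot(is.matrix(counts), is.numeric(counts))
  rounded <- round(counts)
  storage.mode(rounded) <- "integer"  # converts the type, keeps dimnames
  rounded
}

# Toy matrix standing in for gene-level expected counts (made-up values).
toy <- matrix(c(0, 1.4, 10.6, 3.2),
              nrow = 2,
              dimnames = list(c("geneA", "geneB"), c("sample1", "sample2")))
int_counts <- round_expected_counts(toy)
```

With the package loaded, the same helper could presumably be applied to `dmelirbit.rnaseq.gene@assayData$exprs`, assuming that element holds the expected counts rather than TPM.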
cd data-raw
- Download all necessary raw data files, which include:
1.2M data-raw/feature_attrs.rsem.transcripts.tsv
6.2M data-raw/matrix.gene.expected_count.RDS
6.6M data-raw/matrix.gene.tpm.RDS
13M data-raw/matrix.transcripts.expected_count.RDS
11M data-raw/matrix.transcripts.tpm.RDS
64K data-raw/PRJNA432208_metadata_cleaned.tsv
48K data-raw/starLog.final.tsv
- Set the environment variable DBDIR to point to the path containing these files
- Run the R notebook make-data-package.Rmd to assemble the parts into ExpressionSet objects.
You may need to change some code chunk settings from eval=FALSE to eval=TRUE so that all chunks are run. These chunks are disabled by default to avoid overwriting existing data files in the folder.
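Before running the notebook, it can help to confirm that DBDIR resolves to a folder holding the listed files. The sketch below assumes DBDIR points at the folder that directly contains them, and that the file list above is complete; `missing_raw_files` is a hypothetical helper, not something the notebook provides.

```r
# File names taken from the list of raw data files above.
required_files <- c(
  "feature_attrs.rsem.transcripts.tsv",
  "matrix.gene.expected_count.RDS",
  "matrix.gene.tpm.RDS",
  "matrix.transcripts.expected_count.RDS",
  "matrix.transcripts.tpm.RDS",
  "PRJNA432208_metadata_cleaned.tsv",
  "starLog.final.tsv"
)

# Hypothetical helper: return the required files that are missing from `dbdir`.
missing_raw_files <- function(dbdir, files = required_files) {
  files[!file.exists(file.path(dbdir, files))]
}

# Assumption: DBDIR is the folder that directly holds the files;
# fall back to "data-raw" when the variable is not set.
dbdir <- Sys.getenv("DBDIR", unset = "data-raw")
missing <- missing_raw_files(dbdir)
if (length(missing) > 0) {
  message("Missing from ", dbdir, ": ", paste(missing, collapse = ", "))
} else {
  message("All raw data files found in ", dbdir)
}
```

Within an R session, the variable can be set with `Sys.setenv(DBDIR = "/path/to/files")`. Once nothing is reported missing, the notebook can be rendered with `rmarkdown::render("make-data-package.Rmd")`, provided the rmarkdown package is installed.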