Question

Working with series_matrix files in Monocle 3?


4.9 years ago

Pratik ★ 1.1k

Hi,

I am running Ubuntu 20.04 LTS. I am currently on a slow MacBook Air, but to work faster I recently ordered this server: HP ProLiant DL360p G8, 8 x 2.5" bays, 2x Intel Xeon E5-2680 2.7 GHz 8-core, 16 GB DDR3 registered memory, HP P420i 512 MB RAID controller, 2.4 TB storage (4x 600 GB 10K SAS SED HDDs, new), 2x 750 W PSUs (renewed).

I'm just starting out with Monocle 3. Eventually I want to be able to use all the available tools efficiently, but I am starting with Monocle 3 because it offers pseudotime trajectory analysis.

I want to recreate a finding from the original Monocle 3 paper, "The single-cell transcriptional landscape of mammalian organogenesis", specifically the "Resolving cellular trajectories in myogenesis" figure.

I was going to begin with the FASTQ files, which, after a real struggle, I figured out how to download in bulk using awk.

However, I was told by a mentor that working with expression matrix files would make my life easier.

So my first question is: on the NCBI GEO accession page, is the "series_matrix.txt.gz" file the same thing as the expression matrix file?

The data-loading step in "Getting started" on the Monocle 3 page looks like this:

# Load the data
expression_matrix <- readRDS(url("http://staff.washington.edu/hpliner/data/cao_l2_expression.rds"))
cell_metadata <- readRDS(url("http://staff.washington.edu/hpliner/data/cao_l2_colData.rds"))
gene_annotation <- readRDS(url("http://staff.washington.edu/hpliner/data/cao_l2_rowData.rds"))

Would I first download this series matrix file?

https://ftp.ncbi.nlm.nih.gov/geo/series/GSE119nnn/GSE119945/matrix/GSE119945_series_matrix.txt.gz

Then would the cell_annotate file be cell_metadata?

https://ftp.ncbi.nlm.nih.gov/geo/series/GSE119nnn/GSE119945/suppl/GSE119945%5Fcell%5Fannotate%2Ecsv%2Egz

And lastly (this one is more obvious, I think), would gene_annotation be the gene_annotate file?

https://ftp.ncbi.nlm.nih.gov/geo/series/GSE119nnn/GSE119945/suppl/GSE119945%5Fgene%5Fannotate%2Ecsv%2Egz

So I would download these files with wget, then extract them with

gzip -d filename

and then feed their file paths into the following calls, in place of the URLs?

expression_matrix <- readRDS(url("http://staff.washington.edu/hpliner/data/cao_l2_expression.rds"))
cell_metadata <- readRDS(url("http://staff.washington.edu/hpliner/data/cao_l2_colData.rds"))
gene_annotation <- readRDS(url("http://staff.washington.edu/hpliner/data/cao_l2_rowData.rds"))

After that, I would continue with the rest of the getting-started steps on the Monocle 3 page.
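
For concreteness, here is a minimal sketch of the download-and-extract step I have in mind, assuming wget and gzip are installed. The URLs are the ones listed in this post; the `local_name` helper and the `DOWNLOAD` switch are hypothetical names introduced only for this illustration. By default the script only prints what it would fetch, so nothing is downloaded until `DOWNLOAD=1` is set.

```shell
#!/bin/sh
set -eu

# Base URL of the GEO series, copied from the links in this post.
base="https://ftp.ncbi.nlm.nih.gov/geo/series/GSE119nnn/GSE119945"

# Hypothetical helper: turn a URL into a readable local file name by
# taking the last path component and decoding %5F ("_") and %2E (".").
local_name() {
  basename "$1" | sed -e 's/%5F/_/g' -e 's/%2E/./g'
}

for url in \
  "$base/matrix/GSE119945_series_matrix.txt.gz" \
  "$base/suppl/GSE119945%5Fcell%5Fannotate%2Ecsv%2Egz" \
  "$base/suppl/GSE119945%5Fgene%5Fannotate%2Ecsv%2Egz"
do
  gz=$(local_name "$url")
  if [ "${DOWNLOAD:-0}" = "1" ]; then
    # Download under the decoded name, then decompress in place
    # (gzip -d replaces file.gz with file).
    wget -O "$gz" "$url"
    gzip -d "$gz"
  else
    # Dry run (assumed default for this sketch): only report the plan.
    echo "would fetch $gz and unpack to ${gz%.gz}"
  fi
done
```

Saved under a name of your choice (say `fetch_geo.sh`, also hypothetical), it would be run as `DOWNLOAD=1 sh fetch_geo.sh` from the directory where the files should end up. Whether the resulting files can then be passed to the loading calls as-is is exactly the part I am unsure about.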

Could someone please share how you get started analyzing data with Monocle 3 when you are not working with 10x Genomics data?

Very Respectfully, Pratik

