Downloading Raw Reads
8 weeks ago
Jamie • 0

Hello!

I am trying to do a computational biology project for my school’s science fair, and I want to download raw sequencing reads from the SRA database. However, these reads are a lot larger than I expected, and I’m worried about what will happen if my computer runs out of space. How many samples should I have in general for each independent variable? The study has 99 different samples. Should I just buy an external hard drive or something and download all of the files, or could I use only, perhaps, 50 of the 99 samples?

For context, I have a MacBook Pro with 1 TB of storage, and this is the data that I want to use for my project: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA879084

I’m new to computational biology, so any suggestions would be greatly appreciated!

reads RNA • 301 views

Another problem, even if you buy a big external SSD (not a hard disk), is that your computer likely does not have enough RAM to align the sequences to the genome. From memory, an aligner like STAR can use over 32 GB of RAM; HISAT2 is likely more efficient. A more resource-efficient route would be to look at kallisto or Salmon (both on GitHub). But getting the count table, as ATpoint says, is likely the best option.
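
If you do want to check whether the raw reads would fit before downloading anything, here is a minimal sketch of the arithmetic in Python. The 5 GB per sample and the 2x overhead factor are placeholder assumptions, not figures from this study; look up the real run sizes on the SRA pages and substitute them. The function names are made up for illustration.

```python
import shutil


def estimate_required_gb(n_samples, gb_per_sample, overhead_factor=2.0):
    """Rough disk need in GB: raw files times a factor for working space.

    overhead_factor is an assumed allowance for decompressed and
    intermediate files, not a measured value.
    """
    return n_samples * gb_per_sample * overhead_factor


def free_gb(path="."):
    """Free space on the volume that contains `path`, in GB (10^9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


if __name__ == "__main__":
    # ASSUMPTION: 5 GB per sample is a placeholder; replace it with the
    # actual per-run sizes listed for the project.
    needed = estimate_required_gb(n_samples=99, gb_per_sample=5.0)
    available = free_gb(".")
    print(f"Estimated need: {needed:.0f} GB, free: {available:.0f} GB")
    if needed > available:
        print("Not enough space here; use fewer samples or another disk.")
```

Pass the mount point of an external drive to `free_gb` to check that drive instead of the internal disk.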

8 weeks ago
ATpoint 86k

I assume you eventually want counts? Make your life easy and use the counts provided at https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSE213092 rather than processing the data yourself from scratch, which imo goes beyond a school project.
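
To give an idea of how little is needed once you have a count table, here is a minimal Python sketch that reads one and sums the counts per sample. It assumes a tab-separated layout with gene IDs in the first column and one column per sample; I have not checked the actual layout of the GEO file, so adjust the delimiter and columns to match it. The function names and the tiny example table are made up for illustration.

```python
import csv
import io


def read_counts(handle):
    """Parse a tab-separated count table into (sample_names, {gene: counts}).

    ASSUMPTION: first column is the gene ID, the header row names the samples.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]
    counts = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        # float() first so values written as "10.0" are also accepted
        counts[row[0]] = [int(float(value)) for value in row[1:]]
    return samples, counts


def library_sizes(samples, counts):
    """Total count per sample, a quick sanity check on the table."""
    totals = [0] * len(samples)
    for values in counts.values():
        for i, value in enumerate(values):
            totals[i] += value
    return dict(zip(samples, totals))


if __name__ == "__main__":
    # Made-up example table, not real data from the study.
    example = io.StringIO("gene\tS1\tS2\ngeneA\t10\t0\ngeneB\t5\t7\n")
    samples, counts = read_counts(example)
    print(library_sizes(samples, counts))
```

For a real file, pass `open(path, newline="")` (or `gzip.open(path, "rt")` for a gzipped file) instead of the `io.StringIO` example.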
