MiSeq Analysis Beginner: First steps after getting MiSeq paired end data
9.4 years ago
Sara S. • 0

Hi,

I'm very new to bioinformatics and have a couple of questions. I've gotten my fastq.gz files back from a MiSeq provider (2x250 paired-end reads) and have run them through FASTQC, but I'm not sure of the next step. Do I need to demultiplex the reads and then trim them? Or do I join R1 and R2 before trimming? And do I then align them to my reference genomes, or am I missing steps somewhere? I want to align the reads to 2 different reference genomes to see which genes come from which genome (I'm looking at recombination). Any suggestions about the order of steps would be greatly appreciated! Thanks!

reference-genome demultiplexing Assembly MiSeq • 5.6k views

If you're using a common, up-to-date aligner like bwa or bowtie, you won't need to join your FASTQ data. I'll just go ahead and say "don't join them".

Did your FASTQC report(s) show any pronounced quality dropoff in the later cycles? Were there adapters in the overrepresented sequences? If the answer is yes to either of those questions, you should perform a quality trim or an adapter trim accordingly. Once you've done that, your data should be ready for alignment.
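To put a number on the "quality dropoff in the later cycles" question, here is a minimal sketch that computes the mean Phred score at each read position. It assumes standard four-line FASTQ records and the Phred+33 encoding used by current Illumina output; the function names are made up for illustration and it is not a replacement for the FASTQC report.

```python
import gzip
from typing import Iterable, Iterator, List


def mean_quality_per_cycle(quality_lines: Iterable[str], offset: int = 33) -> List[float]:
    """Return the mean Phred score at each read position (cycle)."""
    totals: List[int] = []
    counts: List[int] = []
    for qual in quality_lines:
        for i, ch in enumerate(qual):
            if i == len(totals):
                # First time we see a read this long: extend the accumulators.
                totals.append(0)
                counts.append(0)
            totals[i] += ord(ch) - offset
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


def quality_lines_from_fastq(path: str) -> Iterator[str]:
    """Yield the quality string (4th line of each record) from a gzipped FASTQ."""
    with gzip.open(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 3:
                yield line.rstrip("\n")


# Tiny in-memory example: 'I' encodes Q40 and '5' encodes Q20 in Phred+33.
example = mean_quality_per_cycle(["II5", "I5"])
print(example)
```

For a real file you would pass `quality_lines_from_fastq("sample_R1.fastq.gz")` (a hypothetical path) to `mean_quality_per_cycle` and look for positions where the mean falls off toward the 3' end.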


Take a look at the BBMap suite. It has all the tools you are likely to need for your analysis (bbduk.sh=trimming, bbmap.sh=alignment, bbsplit.sh=splitting data). For example, you can use BBSplit to assign reads to your two genomes in a single step. There are threads for these tools over at http://seqanswers.com that will give you usage details.
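As a rough illustration of the BBSplit step, the sketch below only assembles a command line and prints it; nothing is executed. The `in1`, `in2`, `ref` and `basename` parameters are written from memory of the BBSplit usage text and should be treated as assumptions to check against the tool's own help output. All file names are hypothetical.

```python
import shlex
from typing import List, Sequence


def bbsplit_command(r1: str, r2: str, references: Sequence[str],
                    basename: str = "out_%.fq") -> List[str]:
    """Build (but do not run) a BBSplit command for one read pair.

    Assumption: '%' in basename is replaced by each reference name,
    giving one output file per genome.
    """
    return [
        "bbsplit.sh",
        f"in1={r1}",
        f"in2={r2}",
        "ref=" + ",".join(references),
        f"basename={basename}",
    ]


# Hypothetical file names for one sample and two reference genomes.
cmd = bbsplit_command(
    "sampleA_R1.fastq.gz",
    "sampleA_R2.fastq.gz",
    ["genomeA.fasta", "genomeB.fasta"],
)
print(shlex.join(cmd))
```

Building the command as a list keeps it easy to loop over samples and, once the parameters are verified, to hand it to `subprocess.run`.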


Thank you both very much! Some of my samples show a quality dropoff in later cycles and almost all of them show wobbling in GC content in the first few bases where adapters would be, so I think trimming is definitely in order.

I'll definitely take a look at the BBMap suite; it sounds like it should have most of what I need.

Does this mean I don't need to do anything with demultiplexing?

Thanks again for your input, it's very helpful.


If you have individual files (a pair for each sample, R1/R2) then your data was already demultiplexed by your sequence provider.
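A quick way to check this is to group the delivered file names by sample and confirm that every sample has both an R1 and an R2 file. The sketch below assumes names along the lines of the usual Illumina pattern (`<sample>_R1_001.fastq.gz` / `<sample>_R2_001.fastq.gz`); your provider's naming may differ, and the function names and example files are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Sample prefix, then _R1 or _R2, then an optional numeric chunk such as _001.
READ_TAG = re.compile(r"^(?P<sample>.+)_(?P<read>R[12])(?:_\d+)?\.fastq\.gz$")


def pair_fastq_files(filenames: Iterable[str]) -> Dict[str, Dict[str, str]]:
    """Group FASTQ file names by sample prefix, keyed by read (R1/R2)."""
    pairs: Dict[str, Dict[str, str]] = defaultdict(dict)
    for name in filenames:
        match = READ_TAG.match(name)
        if match:
            pairs[match.group("sample")][match.group("read")] = name
    return dict(pairs)


def incomplete_samples(pairs: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the samples that are missing either the R1 or the R2 file."""
    return sorted(s for s, reads in pairs.items() if set(reads) != {"R1", "R2"})


files = [
    "SampleA_S1_L001_R1_001.fastq.gz",
    "SampleA_S1_L001_R2_001.fastq.gz",
    "SampleB_S2_L001_R1_001.fastq.gz",
]
pairs = pair_fastq_files(files)
print(pairs)
print(incomplete_samples(pairs))
```

If every sample shows up with both reads and there is no single large "undetermined" file to split, the demultiplexing has most likely been done for you.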
