Hi biostars community,
First off, I am quite new to NGS data analysis, and I am trying to find answers to a few questions.
I have ChIP-seq data from a time-course and disease experiment, which I have trimmed and aligned with bowtie2. This gave me .sam files, which I am converting to .bam with SAMtools. I want to check the read coverage for QC (with deepTools multiBamSummary) prior to peak calling with MACS2 (and would like to visualise in IGV), but I have run into a few options and am not sure which is the right way to go:
Should I use SAMtools to sort and index the .bam files prior to peak calling (so that I can also visualise the sorted .bam and index files in IGV)? Is it necessary to sort .bam files prior to peak calling?
Or is it better to use deepTools' bamCoverage to generate bigWig files from the .bam files for visualisation in IGV, and then call peaks on the unsorted, unindexed .bam files, saving some processing time?
Thanks for the help and suggestions, Andrew
Edit: I think I answered my own question here. deepTools multiBamSummary requires indexed files to run, so I will sort and index the files!
Coordinate-sorting your alignment is always a good choice, as it is necessary for indexing and for most downstream tools. You can avoid the intermediate SAM file by piping the aligner's output directly into samtools sort.
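The command originally posted with this answer is not shown here, so the following is only a minimal sketch of the piping idea, assuming single-end reads. The function name `align_sort_index` and all file names are hypothetical placeholders; adjust the bowtie2 options to match your own run.

```shell
# Stop the pipeline from reporting success if bowtie2 fails.
set -o pipefail

# Align reads and write a coordinate-sorted, indexed BAM
# without keeping an intermediate SAM file on disk.
# Arguments (hypothetical placeholders):
#   $1  bowtie2 index prefix, e.g. genome_index
#   $2  trimmed FASTQ file,   e.g. sample.trimmed.fastq.gz
#   $3  output BAM name,      e.g. sample.sorted.bam
align_sort_index() {
    local index_prefix="$1" reads="$2" out_bam="$3"
    # bowtie2 writes SAM to stdout when -S is omitted;
    # samtools sort reads that stream from stdin ("-").
    bowtie2 -x "$index_prefix" -U "$reads" \
        | samtools sort -o "$out_bam" -
    # Writes the index next to the BAM as "$out_bam.bai".
    samtools index "$out_bam"
}
```

It would be called as, for example, `align_sort_index genome_index sample.trimmed.fastq.gz sample.sorted.bam`. For paired-end data, the `-U` option would be replaced by `-1` and `-2`.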
Thanks for the suggestion! I hadn't thought of that (would have saved me a heap of time...) - I am going to try a few different alignment and peak calling tools, so will keep this in mind for the future! Cheers!
You're welcome. There are plenty of shortcuts and little workarounds that you'll learn about once you get more experienced ;-) I strongly encourage you to play around with tools and commands if you have a bit of time here and there.
How do I index the sorted .bam files?
Please use the search function and google for this. The first hit would have redirected you to this post, instructing you to do
samtools index in.bam
;-)
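With several samples, as in a time course, the same command can be applied to every sorted file in a loop. This is only a sketch: the function name `index_sorted_bams` and the `*.sorted.bam` naming pattern are assumptions, not something stated in the thread.

```shell
# Index every coordinate-sorted BAM in a directory.
# Argument (hypothetical): $1 = directory to scan, defaults to the current one.
index_sorted_bams() {
    local dir="${1:-.}" bam
    for bam in "$dir"/*.sorted.bam; do
        # If nothing matches, the pattern stays literal; skip it.
        [ -e "$bam" ] || continue
        samtools index "$bam"   # writes "$bam.bai"
    done
}
```

Calling `index_sorted_bams .` leaves a .bai file next to each sorted BAM, which is what tools such as multiBamSummary and IGV look for.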