Hello dear one,
For my new project I need a sequence alignment tool to analyse RNA-seq data. I found that bowtie2 is used in most of the previous studies, and that bwa-mem does the same job.
So I decided to compare the two tools.
When I aligned paired-end reads (2 GB + 2 GB FASTQ files), bowtie2 took 5131.879537 seconds. On the other hand, bwa-mem took only 1041.879358 seconds (bwa mem is much faster). I compared the two output SAM files: position, CIGAR, MRNM and MPOS are the same. Only the flags differ, by +/- 2.
Can anyone help me choose between them? And what is the difference between them?
Thanks in advance!
Hey, the eternal question :)
How many reads do you have in your FASTQ files? In the end you can try to work out where your +/- 2 flag differences come from, but I'm not sure you will be able to ^^ The thing is, most of the time the choice between one tool and the other comes down to running speed (in your case you should choose bwa) or to familiarity with the tool. I have never seen alignment results that lead to different conclusions with one tool versus the other.
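If you do want to look into the +/- 2 flag differences, here is a minimal Python sketch for decoding which SAM FLAG bits differ between two values. The bit names follow the SAM specification; the function names and the example flags (99 and 97) are just illustrations, not values taken from your files. A difference of exactly 2 would be consistent with the 0x2 "proper pair" bit, but that is an assumption you would need to confirm on your own data.

```python
# SAM FLAG bits as defined in the SAM specification.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


def flag_difference(flag_a, flag_b):
    """Return the names of the bits set in exactly one of the two flags."""
    return decode_flag(flag_a ^ flag_b)


# Hypothetical example: the same read reported with flag 99 by one
# aligner and flag 97 by the other.
print(flag_difference(99, 97))  # ['proper_pair']
```

If that is what you see, the two aligners agree on where the reads map and only disagree on whether the pair counts as "properly paired", which depends on each tool's own insert size and orientation criteria.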
Best
RNA-seq data of a eukaryotic or prokaryotic organism? Please be as complete as possible when asking questions.
Thanks for your reply! It is RNA-seq data from a prokaryote (Escherichia coli).
Then both should be fine and the fastest is a reasonable choice. Take into account that for eukaryotic RNA-seq (mRNA splicing) you would need a different aligner such as HISAT2 or STAR.
Hi. On what parameters can we say which aligner is the best? Is it based on 1) alignment percentage (number of PE reads mapped against the reference) or 2) computation time? In my case BWA-MEM and Bowtie2 took 15:00 h and 15:30 h of computation time. Both have similar algorithms. Can anyone give me a solid answer?
I'm currently analysing a human exome dataset. I performed the alignment with different tools (BWA-MEM, Bowtie2, Cushaw3, Novoalign) and the variant calling with Freebayes and GATK-HC. How can I conclude which aligner is the best, and on what parameters?
You cannot necessarily compare e.g. % aligned reads directly, since lower does not always mean worse. It might just be that the default parameters for Tool B are stricter than those of Tool A, meaning it’s actually discarding the poorer quality reads. Or vice versa, it might throw away real variants!
You’d need to try and run the tools with comparable parameters for gaps and mismatches etc. for a comparison to be fully meaningful.
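One small thing you can control is how the percentage itself is counted. Below is a rough Python sketch that computes the mapped fraction from SAM text in the same way for every tool: it counts primary records only and optionally applies a MAPQ cutoff. The function name and the `min_mapq` parameter are made up for illustration. Keep in mind that MAPQ is not on the same scale in every aligner, so a shared cutoff is only an approximation.

```python
def mapping_rate(sam_lines, min_mapq=0):
    """Fraction of primary SAM records that are mapped with MAPQ >= min_mapq."""
    total = 0
    mapped = 0
    for line in sam_lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        mapq = int(fields[4])
        # Ignore secondary (0x100) and supplementary (0x800) records so
        # each read is counted once, whatever the aligner reports.
        if flag & 0x900:
            continue
        total += 1
        # 0x4 set means the read is unmapped.
        if not flag & 0x4 and mapq >= min_mapq:
            mapped += 1
    return mapped / total if total else 0.0
```

For a real file you would pass an open file handle of a SAM file (for example one produced with `samtools view -h`) as `sam_lines`.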
Thanks for the suggestion.
How do I perform this sort of analysis, and how do I check it on my BAM files? Could you point me to any material explaining how to do it? That would help me.
If you want to compare variant calling workflows, Brad Chapman's work with bcbio and comparing options might be useful to you. See: https://github.com/bcbio/bcbio_validation_workflows
In essence, what he did was to use the Genome in a Bottle NA12878 sample (http://jimb.stanford.edu/giab) and compare each workflow's output against a truth set. There are truth sets for hg38 and hg19; here is the hg38 one, for example: https://s3.amazonaws.com/cloudbiolinux/cache/platinum-genome/platinum-genome-NA12878-hg38-v2_0_1.vcf.gz
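To give an idea of what "compare against a truth set" means in practice, here is a deliberately naive Python sketch that counts true positives, false positives and false negatives by exact match on (chromosome, position, REF, ALT). The function names are hypothetical and this is not the bcbio method. A real comparison also has to normalise variant representation and restrict to confident regions, which dedicated tools such as hap.py or RTG vcfeval are designed for.

```python
def variant_keys(vcf_lines):
    """Collect (chrom, pos, ref, alt) keys from uncompressed VCF text lines."""
    keys = set()
    for line in vcf_lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alts = line.rstrip("\n").split("\t")[:5]
        # Multi-allelic records list several ALT alleles separated by commas.
        for alt in alts.split(","):
            keys.add((chrom, int(pos), ref, alt))
    return keys


def compare_to_truth(called, truth):
    """Exact-match comparison of a call set against a truth set."""
    tp = len(called & truth)   # called and in the truth set
    fp = len(called - truth)   # called but not in the truth set
    fn = len(truth - called)   # in the truth set but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall}
```

Running this once per aligner and caller combination gives you precision and recall figures that are comparable across workflows, which is a much stronger basis than % aligned reads or run time alone.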
Unless you are talking about bacterial RNA-seq, neither BWA nor Bowtie2 is the right tool for the job. Use either STAR (probably still the best, but memory hungry) or HISAT2.
It's prokaryotic, see: C: BWA MEM vs BOWTIE2, which is best?
How about RSEM? (for Homo sapiens)