8.8 years ago
zizigolu
Hi,
I used bowtie2 to map the GSE69414 data sets, but the alignment rate is 0% or at most 0.02%.
How is that possible? Am I doing something wrong?
[izadi@lbox161 bowtie2-2.2.5]$ bowtie2 -x Saccharomyces_cerevisiae -U SRR2046311-ribo_trimmed.fastq -S SRR2046311-ribo_trimmed.sam
46386847 reads; of these:
46386847 (100.00%) were unpaired; of these:
46375320 (99.98%) aligned 0 times
10406 (0.02%) aligned exactly 1 time
1121 (0.00%) aligned >1 times
0.02% overall alignment rate
[izadi@lbox161 bowtie2-2.2.5]$
[izadi@lbox161 bowtie2-2.2.5]$ bowtie2 -x Saccharomyces_cerevisiae -U SRR2046322-mRNA_trimmed.fastq -S SRR2046322-mRNA_trimmed.fastq
0 reads
0.00% overall alignment rate
[izadi@lbox161 bowtie2-2.2.5]$
What is the reason please?
Thank you
hi,
It seems you are aligning paired-end RNA-seq reads using bowtie. Maybe use a splice-aware aligner like TopHat or STAR. mRNA-seq reads can need gapped alignments (because of intervening introns), which bowtie would fail to recognize in most cases. Still, you should have some % of reads aligning. Not sure what exactly the reason for 0% is. Try a splice-aware aligner and see.
thank you, but I think the data sets are not paired-end
Yea, my bad. The bowtie report already says so.
How many reads are in your fastq file? Count the lines with:
cat SRR2046322-mRNA_trimmed.fastq | wc -l
thank you,
Is this Illumina sequencing? At 4 lines per read, 6694/4 = 1673.5 reads (allowing for an off-by-one error in wc). About 1.7K reads in a raw fastq seems suspicious.
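The lines-to-reads arithmetic above can be sketched as a small shell function. This is a minimal sketch, assuming an uncompressed FASTQ with exactly 4 lines per record; the function name `count_fastq_reads` is made up for illustration. It also reports when the line count is not a multiple of 4, which is the case for 6694 and may be worth a look in itself.

```shell
# Print the number of reads in an uncompressed 4-line-per-record FASTQ file.
# Returns non-zero if the line count is not a multiple of 4.
# (count_fastq_reads is a hypothetical helper name, not part of any tool.)
count_fastq_reads() {
    cfr_lines=$(( $(wc -l < "$1") ))   # arithmetic expansion strips any padding from wc
    echo $(( cfr_lines / 4 ))          # integer division: whole records only
    [ $(( cfr_lines % 4 )) -eq 0 ]     # exit status flags a partial record
}

# Usage (file name taken from the thread):
# count_fastq_reads SRR2046322-mRNA_trimmed.fastq || echo "line count is not a multiple of 4"
```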
This. Either sequencing coverage was extremely low (too many libraries per effort or something) or something is screwed up in preprocessing. It is hard to diagnose anything else with this few reads.
EDIT: Actually, something else is wrong. Bowtie reports 46386847 reads above. Why the discrepancy? A previous comment was correct in that you should be using a splice-aware aligner.
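Two details are visible in the commands as posted. The 46386847 figure comes from the SRR2046311 run, while the "0 reads" result comes from the SRR2046322 run. And in that second command, -U and -S both name SRR2046322-mRNA_trimmed.fastq. If bowtie2 opens the -S file for writing before it reads the input, the FASTQ would be overwritten, which would be consistent with "0 reads" and with a small line count afterwards. That is an inference, not something confirmed in this thread. A minimal guard, using a made-up helper name `check_bowtie2_paths`, could look like this:

```shell
# Refuse to proceed when the SAM output path is the same file as the FASTQ input.
# (check_bowtie2_paths is a hypothetical helper, not a bowtie2 feature.)
check_bowtie2_paths() {
    cbp_input="$1"
    cbp_output="$2"
    # Same string, or two names that resolve to the same existing file.
    if [ "$cbp_input" = "$cbp_output" ] ||
       { [ -e "$cbp_output" ] && [ "$cbp_input" -ef "$cbp_output" ]; }; then
        echo "refusing to run: -S would overwrite -U ($cbp_input)" >&2
        return 1
    fi
    return 0
}

# Usage (index and file names taken from the thread):
# check_bowtie2_paths SRR2046322-mRNA_trimmed.fastq SRR2046322-mRNA_trimmed.sam &&
#     bowtie2 -x Saccharomyces_cerevisiae -U SRR2046322-mRNA_trimmed.fastq -S SRR2046322-mRNA_trimmed.sam
```

If the input was overwritten, the trimmed FASTQ would have to be regenerated from the original download before re-running the alignment.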
actually I am mapping the reads to the yeast coding sequences. I mean, first I indexed the CDS fasta and then mapped the reads to it, but the alignment rate was zero. Might this be the reason?
my supervisor believes that, because yeast is a eukaryote, we can use bowtie on the CDS instead of the whole-genome fasta, and that this is like using tophat. I don't know if he is right or not, but the alignment rate is 0, although when I tried other data sets everything was normal. I mean for some data sets the alignment rate is 0, and for others it is the same as when I mapped to the whole-genome fasta
You want to map to the Saccharomyces genome using TopHat (which uses Bowtie under the hood anyway). Depending on how your 'CDS reference' is set up (exons? transcripts? UTRs?), you are almost certainly losing a lot of reads due to mismatches or a failure of global alignment. If you wanted to mess around with this, you can try mapping using --local or --end-to-end to see if reads are being discarded because of mismatches, but you still ought to process your RNAseq data following a more standard series of steps.
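To compare the two modes mentioned above, one option is to run bowtie2 once with --end-to-end and once with --local, save each summary (bowtie2 prints it to stderr), and pull out the overall alignment rate. The sketch below assumes bowtie2 is on the PATH; the helper name `alignment_rate` and the .log and .sam file names are made up for illustration, while the index and FASTQ names are the ones from the thread.

```shell
# Extract the overall alignment rate (as a bare number) from a saved bowtie2 summary.
# (alignment_rate is a hypothetical helper name.)
alignment_rate() {
    awk '/overall alignment rate/ { sub(/%/, "", $1); print $1 }' "$1"
}

# Usage (log and SAM names are hypothetical):
# bowtie2 --end-to-end -x Saccharomyces_cerevisiae -U SRR2046311-ribo_trimmed.fastq -S e2e.sam 2> e2e.log
# bowtie2 --local      -x Saccharomyces_cerevisiae -U SRR2046311-ribo_trimmed.fastq -S local.sam 2> local.log
# echo "end-to-end: $(alignment_rate e2e.log)%  local: $(alignment_rate local.log)%"
```

A clearly higher rate with --local would suggest that read ends (for example leftover adapter sequence) are what prevents end-to-end alignment; similar rates in both modes would point elsewhere, such as the reference or the input file.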
yes it is illumina