9.1 years ago
AT
I have paired-end 76 bp RNA-seq reads from FFPE samples, and I am running TopHat2 (version 2.1.0) with mostly default options:
tophat-2.1.0.Linux_x86_64/tophat -p 8 --library-type fr-secondstrand -o . hg19 test_R1.fastq.gz test_R2.fastq.gz
and I get a very low concordant pair alignment rate, as reported in the align_summary.txt file:
Left reads:
    Input : 41740697
    Mapped : 20342583 (48.7% of input)
        of these: 6651856 (32.7%) have multiple alignments (51919 have >20)
Right reads:
    Input : 41740697
    Mapped : 18876112 (45.2% of input)
        of these: 6285887 (33.3%) have multiple alignments (51879 have >20)
Unpaired reads:
    Input : 51310
    Mapped : 13565 (26.4% of input)
        of these: 2966 (21.9%) have multiple alignments (0 have >20)
47.0% overall read mapping rate.
Aligned pairs: 10618677
    of these: 4367774 (41.1%) have multiple alignments
    10544908 (99.3%) are discordant alignments
0.2% concordant pair alignment rate.
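For readers checking how these percentages relate to one another, here is a small arithmetic sketch in Python. The counts are copied from the summary above; the choice of denominators (all input reads for the overall rate, input pairs for the concordant rate) is an assumption inferred from the fact that it reproduces the reported figures, not something confirmed from the TopHat source.

```python
# Counts copied from the align_summary.txt shown above.
left_input, left_mapped = 41740697, 20342583
right_input, right_mapped = 41740697, 18876112
unpaired_input, unpaired_mapped = 51310, 13565
aligned_pairs, discordant_pairs = 10618677, 10544908


def pct(numerator, denominator):
    """Return a percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


# Assumed denominator: every input read (left + right + unpaired).
overall_rate = pct(
    left_mapped + right_mapped + unpaired_mapped,
    left_input + right_input + unpaired_input,
)

# Share of aligned pairs that were flagged as discordant.
discordant_share = pct(discordant_pairs, aligned_pairs)

# Pairs that aligned and were not flagged as discordant.
concordant_pairs = aligned_pairs - discordant_pairs

# Assumed denominator: the number of input pairs.
concordant_rate = pct(concordant_pairs, left_input)

print("overall read mapping rate:", overall_rate)
print("discordant share of aligned pairs:", discordant_share)
print("concordant pairs:", concordant_pairs)
print("concordant pair alignment rate:", concordant_rate)
```

Under these assumptions, only 73769 of the roughly 41.7 million input pairs aligned concordantly, so the problem is not just a low mapping rate: almost every pair that did align was flagged as discordant.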
These reads were not trimmed on the first pass, though FastQC did show a few reads with length as low as 0. I am now running a second set of trimmed FASTQ files from which I have removed all reads shorter than 20 bases, and am waiting for the results. Is there some other parameter that I should be setting to get better concordance? This seems very low. Any help or suggestions will be appreciated.
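As an illustration of the length filter described above, here is a minimal Python sketch that drops a pair when either mate is shorter than 20 bases, so that both output files keep the same reads in the same order (TopHat expects mates to appear in the same order in the two files). This is not the tool the poster used; the function names, the `min_len` parameter, and the output file names are hypothetical, and the sketch assumes gzip-compressed FASTQ with exactly four lines per record. Whether pair synchronization explains the low concordance here is not established by the thread.

```python
import gzip
from itertools import islice


def read_fastq(handle):
    """Yield FASTQ records as lists of four lines (assumes 4-line records)."""
    while True:
        record = list(islice(handle, 4))
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        yield record


def filter_pairs(r1_in, r2_in, r1_out, r2_out, min_len=20):
    """Keep a pair only if both mates have at least min_len bases.

    Returns (kept, dropped) pair counts. Hypothetical helper for illustration.
    """
    kept = dropped = 0
    with gzip.open(r1_in, "rt") as f1, gzip.open(r2_in, "rt") as f2, \
            gzip.open(r1_out, "wt") as o1, gzip.open(r2_out, "wt") as o2:
        # zip stops at the shorter input, so unequal files are not detected here.
        for rec1, rec2 in zip(read_fastq(f1), read_fastq(f2)):
            if len(rec1[1].strip()) >= min_len and len(rec2[1].strip()) >= min_len:
                o1.writelines(rec1)
                o2.writelines(rec2)
                kept += 1
            else:
                dropped += 1
    return kept, dropped


if __name__ == "__main__":
    # Input names taken from the command above; output names are made up.
    print(filter_pairs("test_R1.fastq.gz", "test_R2.fastq.gz",
                       "filtered_R1.fastq.gz", "filtered_R2.fastq.gz"))
```

Filtering each file independently would remove different reads from each one, leaving the mates out of step, which is why the sketch decides per pair rather than per read.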
Have a look at your command line: your second file is named test_R.fastq.gz. Shouldn't that be test_R2.fastq.gz?
That was a typo when putting in the question.
Yeah, would have been too easy...
I have been experiencing similar issues with my paired data. Did you ever figure out what the issue was for your data?