BWA paired reads have different names error
10.3 years ago
crysis405 ▴ 30
[bwa_sai2sam_pe_core] convert to sequence coordinate...
[infer_isize] (25, 50, 75) percentile: (144, 213, 328)
[infer_isize] low and high boundaries: 100 and 696 for estimating avg and std
[infer_isize] inferred external isize from 111344 pairs: 247.404 +/- 131.705
[infer_isize] skewness: 1.110; kurtosis: 0.633; ap_prior: 2.92e-05
[infer_isize] inferred maximum insert size: 1103 (6.50 sigma)
[bwa_sai2sam_pe_core] time elapses: 10.79 sec
[bwa_sai2sam_pe_core] changing coordinates of 6435 alignments.
[bwa_sai2sam_pe_core] align unmapped mate...
[bwa_paired_sw] 29497 out of 36462 Q17 singletons are mated.
[bwa_paired_sw] 1384 out of 8191 Q17 discordant pairs are fixed.
[bwa_sai2sam_pe_core] time elapses: 4.24 sec
[bwa_sai2sam_pe_core] refine gapped alignments... 1.88 sec
[bwa_sai2sam_pe_core] print alignments... [bwa_sai2sam_pe_core] paired reads have different names: "I@ILLUMINA:381:D1HHHACXX:1:2312:14474:27286", "ILLUMINA:381:D1HHHACXX:1:2312:14474:27286"

Using BWA version 0.7.9a-r786.

Can anyone shed some light on what might be causing this error? Stampy had no problem with the exact same files.

EDIT:

grep -n "ILLUMINA:381:D1HHHACXX:1:2312:14474:27286"  forward.fastq
9:@I@ILLUMINA:381:D1HHHACXX:1:2312:14474:27286/1

grep -n "ILLUMINA:381:D1HHHACXX:1:2312:14474:27286"  reverse.fastq
9:@ILLUMINA:381:D1HHHACXX:1:2312:14474:27286/2

grep -B 2 "ILLUMINA:381:D1HHHACXX:1:2312:14474:27286"  forward.fastq
+
CCCFFFFFHHHHHJJJJJJJJJJJJJIIJJJJJJIIJJJJJJJJJIJJJJJJJJJBDHHIJJJJJJHHHHHHFFFFFFEEEEEEDDDDDDDDDDCDEECC
@I@ILLUMINA:381:D1HHHACXX:1:2312:14474:27286/1

grep -B 2 "ILLUMINA:381:D1HHHACXX:1:2312:14474:27286"  reverse.fastq
+
BBCFFFF;FHHHHJHIJGHIJJJJJJGGJJIJ?F?BFGGGHGJJJJJIIIHGHDFF@DDD9@ABBDDBC@ACDDD>AB9@D?BCCDADEEEDDDCCDCC@
@ILLUMINA:381:D1HHHACXX:1:2312:14474:27286/2

Run these commands on your pair of fastq files and paste the output.

grep -n "ILLUMINA:381:D1HHHACXX:1:2312:14474:27286" forward.fastq
grep -n "ILLUMINA:381:D1HHHACXX:1:2312:14474:27286" reverse.fastq
grep -B 2 "ILLUMINA:381:D1HHHACXX:1:2312:14474:27286" forward.fastq
grep -B 2 "ILLUMINA:381:D1HHHACXX:1:2312:14474:27286" reverse.fastq

Added the output


There is no issue with the ordering of the read pairs in the two files: the read sits at line 9 in both. The problem is the read name itself, which carries an extra I@ in the forward file. I am sure you have figured it out by now. Correct the read name and run the aligner again. Before doing that, I would make sure that the extra I@ doesn't belong to the quality string of the previous read.
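One way to check whether the I@ leaked out of the previous read's quality string is to compare sequence and quality lengths record by record. This is only a sketch: the demo file and read names below are made up, and it assumes strict 4-line FASTQ records with no wrapped lines.

```shell
# Hypothetical demo data: record 1 has a quality string two characters shorter
# than its sequence, and record 2 carries a stray "I@" in its name.
demo=$(mktemp -d)
printf '%s\n' \
  '@read1/1' 'ACGTAC' '+' 'IIII' \
  '@I@read2/1' 'ACGTAC' '+' 'IIIIII' > "$demo/forward.fastq"

# In a 4-line record, line 2 is the sequence and line 4 the quality string.
# Report every record where the two lengths disagree.
bad_records=$(awk '
  NR % 4 == 1 { name = $0 }
  NR % 4 == 2 { seqlen = length($0) }
  NR % 4 == 0 && length($0) != seqlen {
    print name ": seq=" seqlen " qual=" length($0)
  }' "$demo/forward.fastq")

echo "$bad_records"
```

If the record just before the bad read name shows up here, the I@ probably belongs to its quality string; if nothing is reported, the extra characters are really part of the name.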

10.3 years ago

You have an extra I@ prefix in the read name for read 1:

I@ILLUMINA:381:D1HHHACXX:1:2312:14474:27286

That's pretty strange. Investigate why that happened.
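To see whether this is the only affected pair, the read names of both files can be compared record by record. A rough sketch follows; the demo files and the names helper are invented for illustration, and it assumes 4-line records and /1 and /2 suffixes as in the grep output above.

```shell
# Hypothetical demo pair: the second forward read has a stray "I@" in its name.
demo=$(mktemp -d)
printf '%s\n' '@r1/1' 'ACGT' '+' 'IIII' '@I@r2/1' 'ACGT' '+' 'IIII' > "$demo/forward.fastq"
printf '%s\n' '@r1/2' 'ACGT' '+' 'IIII' '@r2/2' 'ACGT' '+' 'IIII' > "$demo/reverse.fastq"

# Print header lines only (lines 1, 5, 9, ...), without the leading "@"
# and without the trailing /1 or /2.
names() { awk 'NR % 4 == 1 { sub(/^@/, ""); sub(/\/[12]$/, ""); print }' "$1"; }

names "$demo/forward.fastq" > "$demo/fwd.names"
names "$demo/reverse.fastq" > "$demo/rev.names"

# paste joins the two name lists line by line with a tab; keep rows that differ.
mismatches=$(paste "$demo/fwd.names" "$demo/rev.names" |
  awk -F '\t' '$1 != $2 { print "record " NR ": " $1 " vs " $2 }')

echo "$mismatches"
```

An empty result means every pair of names agrees; otherwise each line gives the record number (not the line number) of a mismatched pair.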


Yeah, I don't know why only 1 out of 103 files would suddenly have I@ incorporated. I was thinking of just deleting it and seeing if that works.
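For the deletion itself, something along these lines should do it. It is a sketch on made-up demo data, assuming 4-line FASTQ records; the edit is restricted to header lines because a quality string can legitimately start with @I@.

```shell
# Hypothetical demo file: record 2 has the stray "I@" in its name, and its
# quality string also happens to start with "@I@" (which must not be changed).
demo=$(mktemp -d)
printf '%s\n' '@r1/1' 'ACGT' '+' 'I@II' '@I@r2/1' 'ACGT' '+' '@I@I' > "$demo/forward.fastq"

# Rewrite only header lines (lines 1, 5, 9, ...); print every line unchanged otherwise.
awk 'NR % 4 == 1 { sub(/^@I@/, "@") } { print }' \
  "$demo/forward.fastq" > "$demo/forward.fixed.fastq"

fixed_header=$(sed -n '5p' "$demo/forward.fixed.fastq")
untouched_quality=$(sed -n '8p' "$demo/forward.fixed.fastq")
echo "$fixed_header"
echo "$untouched_quality"
```

Writing to a new file keeps the original around in case the I@ turns out to be a symptom of a broken record rather than just a bad name.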


Looks like the I@ was produced by Picard's SamToFastq when run with INTERLEAVE=TRUE.
