ATAC-seq: many reads have "N" as the first base
4.2 years ago
MatthewP ★ 1.4k

I have a FASTQ file from an ATAC-seq run in which many reads start with a single "N" base. What could be the reason?
Example reads:

@A00838:273:HCV7KDSXY:4:1101:1506:1000 1:N:0:AGGCAGAA+TATCCTCT
NATCCAGAAAAAAAAAAAATCATGACCAAGCTTACCGTCCCCACTTAAAT
+
#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFF
@A00838:273:HCV7KDSXY:4:1101:1687:1000 1:N:0:AGGCAGAA+TATCCTCT
NCATAGATCACATTAAGTACAAATATAAACAGTATTATTTCTTTACAATTGGATGTGTTGGAGACTTACTGATGT
+
#FFFFFF,FFFFFFFFF:FFFFFFFFFF:FFFFFFFFF:FFFF:FFF:FFFFF:FFFF,F,FFFFFFFFFFFFF:

Counts of such reads:

$ zcat ATAC_R1.fq.gz | grep -e "^N" | wc -l
15150

$ zcat ATAC_R2.fq.gz | grep -e "^N" | wc -l
28
atac-seq • 719 views

Are 15150 and 28 reads out of several million really "many"? To be honest, I would not worry about such a tiny percentage. Just proceed with the analysis. Depending on its alignment mode, the aligner will either soft-clip that one base or count it as a single mismatch.
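
To put a number on "tiny percentage", something like the sketch below could be used. It is only an illustration: `leading_n_stats` is a hypothetical helper name, and it assumes standard 4-line FASTQ records (no wrapped sequence lines). Unlike a plain `grep "^N"`, it looks only at the sequence line of each record, so a quality string that happens to begin with `N` is not counted.

```shell
# Hypothetical helper (not part of any tool): report how many reads in a
# gzipped FASTQ start with "N", plus the total read count and the percentage.
# Assumes 4 lines per record, so the sequence is line 2 of every group of 4.
leading_n_stats() {
    gzip -dc "$1" | awk '
        NR % 4 == 2 {                       # sequence line of each record
            total++
            if (substr($0, 1, 1) == "N") n++
        }
        END {
            pct = (total > 0) ? 100 * n / total : 0
            printf "%d of %d reads start with N (%.4f%%)\n", n + 0, total + 0, pct
        }'
}

# Usage (file name taken from the question):
#   leading_n_stats ATAC_R1.fq.gz
```

If the percentage comes out as a small fraction of a percent, as it would for 15150 reads out of several million, there is little reason to treat these reads specially.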
