Insert Size Metrics of ATAC-Seq data
2
1
Entering edit mode
5.5 years ago
Bioinfo_2006 ▴ 160

I have paired-end sequencing data from ATAC-Seq and the fastq files had nextera adaptors. So I trimmed them the following way:

trim_galore --paired --nextera ${names[${SLURM_ARRAY_TASK_ID}]}_R1.fastq ${names[${SLURM_ARRAY_TASK_ID}]}_R2.fastq

Then did Bowtie

bowtie2 -p 4 -q --very-sensitive  -x genome -1 ${names[${SLURM_ARRAY_TASK_ID}]}_R1_val_1.fq -2 ${names[${SLURM_ARRAY_TASK_ID}]}_R2_val_2.fq -S "${names[${SLURM_ARRAY_TASK_ID}]}"_aln_unsorted.sam

While doing ATAC-Seq insert size plotting, I see blue region in the histogram corresponding to flipped reads. Could someone point out where I went wrong? Please see below histogram, it looks the same for all samples before filtering for Mt reads, and I also tried filtering out MT reads and PCR duplicates, still looks the same.

Before removing mitochondrial reads

After filtering mito and PCR duplicates

ATAC-Seq Picard • 5.1k views
ADD COMMENT
0
Entering edit mode

Please see How to add images to a Biostars post to add your images properly. You need the direct link to the image, not the link to the webpage that has the image embedded (which is what you have used here)

ADD REPLY
2
Entering edit mode
5.5 years ago
igor 13k

Are your reads 75bp? That seems to be where the switch happens. If your fragments are shorter than your reads, Bowtie would consider those discordant. From the manual:

Bowtie 2’s default behavior is to consider overlapping and containing as being consistent with concordant alignment. By default, dovetailing is considered inconsistent with concordant alignment.

As a test, you can try trimming your initial reads to 35bp (much shorter but still mostly mappable) and see how that changes the result.

ADD COMMENT
0
Entering edit mode

Thanks so much for pointing this out, trimming to 35bp solved the problem.

ADD REPLY
0
Entering edit mode

My size distribution plot shows a pattern with a valley around 50bp (I am using PE50), so I am wondering whether I can learn what is going wrong with my library or analyses please? Could you explain what you mean by "flipped reads"? I guess you can somehow tell the sequencing direction by "FR" and "RF"? I did not have that kind of information, but I am wondering why I have that "gap" around 50bp... This is after removal of mito and dup. Thanks! enter image description here

ADD REPLY
0
Entering edit mode

Did you ever figure this out? I am seeing the exact same thing with my PE50 data.

ADD REPLY
1
Entering edit mode
5.5 years ago
ATpoint 86k

Did you remove mitochondrial reads? If not, do so. I bet money this blue part will disappear.

samtools idxstats your.bam \
  | cut -f1 \
  | grep -v 'chrM' \
  | xargs samtools view -o your_filtered.bam your.bam

Depending on your chromosome names, you might need to set 'chrM' to something else.

ADD COMMENT
0
Entering edit mode

Edited my post. Forgot to include that I tried that as well. I feel it has something to do with trimming adapters, and the alignment afterwards, that is why posted the code to check if am missing parameters.

ADD REPLY
0
Entering edit mode

Can you show samtools idxstats PBL_927-post_NR_aln_sorted_noM.bam?

ADD REPLY
0
Entering edit mode

enter image description here

ADD REPLY
0
Entering edit mode

Removing MT now and checking again.

ADD REPLY
0
Entering edit mode

grep -v 'MT' is what you need

ADD REPLY
0
Entering edit mode

It is strange. I tried

samtools idxstats PBL_927-post_NR_aln_sorted.bam | cut -f1 | grep -v 'MT' | xargs samtools view -b PBL_927-post_NR_aln_sorted.bam -o PBL_927-post_NR_aln_sorted_noM.bam

as well as

samtools idxstats PBL_927-post_NR_aln_sorted.bam | cut -f1 | grep -v 'MT' | xargs samtools view -b PBL_927-post_NR_aln_sorted.bam -o PBL_927-post_NR_aln_sorted_noM.bam

but unable to remove the MT sequences.

ADD REPLY
0
Entering edit mode

I am using this command in my work routinely and it works so it must be something on your side, maybe an old samtools version, I unfortunately cannot tell from here. Try updating samtools.

ADD REPLY
0
Entering edit mode

I know its the same thing I have been using for other data, so it is really frustrating. Am sure it is something small that am missing out. I will check this and update the post.

ADD REPLY
0
Entering edit mode

Insertsize metric plot still looks the same

enter image description here

ADD REPLY

Login before adding your answer.

Traffic: 2008 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6