How to generate single sam file from paired-end fastq
1
0
Entering edit mode
4.7 years ago
KT • 0

Hello,

I am analyzing ChIP Seq data from paired-end sequencing. I have two reads files R1/R2 for my samples. As I understand I can call peaks with MACS using the -f BAMPE. Here is code I am using:

(
for file in ${fileList}
do
prefix=$(echo ${file} | cut -d "." -f 1 | rev | cut -c 7- | rev)
bwa mem -M -t 16 ${BWA} ${INPATH}${prefix}R1_001_trim.fastq ${INPATH}${prefix}R2_001_trim.fastq > ${OUTPATH}${prefix}.sam
)

The issue where I am stuck right now is that I get individual .sam file for each of my reads. Can one please suggest a fix to generate a single .sam file.

Thank you

ChIP-Seq assembly alignment • 1.7k views
ADD COMMENT
0
Entering edit mode

. Can one please suggest a fix to generate a single .sam file.

you should use a workflow manager like snakemake or nextflow.

see also: BWA mem on multiple samples

see also Parallel for bwa mem - problem with -R argument for ID and SM

see also using GNU parallel for bwa mem and samtools

ADD REPLY
0
Entering edit mode

What do you mean by "individual" sam file? The command you are using is producing a single sam file that contains the paired-end alignment. Please be more precise. From the sam file that your command produces you should transform it to bam, e.g. samtools view -o out.bam in.sam and then mark duplicates, e.g. with MarkDuplicates from Picard. This file is then ready for peak calling.

ADD REPLY
0
Entering edit mode

What I mean is that it generates two .sam files *R1_00.sam and *R2_00.sam and in turn two .bam files. I query is that how can I output a single .sam files which will take two .fastq files as input. As I understand for later MACS steps I need this and then specific -f BAMPE that will automatically treat and know that its is PE sequencing.

ADD REPLY
0
Entering edit mode
4.7 years ago
ATpoint 85k

Paired-end bwa mem alignment: bwa mem index.fa read1.fq read2.fq > out.sam

There is no way this produces two files. If so then your loop is improperly invoking this. Be sure to loop through unique basenames of the file and not once for basename(File1) and once for basename(file2). Check it manually. if it doesn't work than omit all this bash scripting and wrote the commands by hand manually. The gold standard would indeed be a workflow manager.

ADD COMMENT

Login before adding your answer.

Traffic: 2521 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6