Hello,
I am analyzing ChIP-seq data from paired-end sequencing. I have two read files (R1/R2) for each of my samples. As I understand it, I can call peaks with MACS using -f BAMPE. Here is the code I am using:
(
for file in ${fileList}
do
    prefix=$(echo ${file} | cut -d "." -f 1 | rev | cut -c 7- | rev)
    bwa mem -M -t 16 ${BWA} ${INPATH}${prefix}R1_001_trim.fastq ${INPATH}${prefix}R2_001_trim.fastq > ${OUTPATH}${prefix}.sam
done
)
Where I am stuck right now is that I get an individual .sam file for each of my read files. Can someone please suggest a fix to generate a single .sam file?
Thank you
You should use a workflow manager like Snakemake or Nextflow.
See also: BWA mem on multiple samples
See also: Parallel for bwa mem - problem with -R argument for ID and SM
See also: using GNU parallel for bwa mem and samtools
What do you mean by an "individual" .sam file? The command you are using produces a single .sam file that contains the paired-end alignments. Please be more precise. You should convert the .sam file that your command produces to .bam, e.g.
samtools view -b -o out.bam in.sam
and then mark duplicates, e.g. with MarkDuplicates from Picard. This file is then ready for peak calling.
What I mean is that it generates two .sam files, *R1_00.sam and *R2_00.sam, and in turn two .bam files. My question is how I can output a single .sam file that takes the two .fastq files as input. As I understand it, I need this for the later MACS steps, where specifying -f BAMPE tells MACS to treat the data as paired-end.
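Judging from the names *R1_00.sam and *R2_00.sam, a likely cause is that ${fileList} contains both the R1 and the R2 file of each sample, and that cut -c 7- trims only six characters, so R1_00 / R2_00 stay in the prefix and the loop runs once per FASTQ file. Below is a minimal sketch that loops over the R1 files only and derives the mate from the file name. It assumes the files are named <sample>_R1_001_trim.fastq and <sample>_R2_001_trim.fastq. The function names sample_prefix and align_pairs and the DRY_RUN switch are made up for illustration, and the sort step is added because MarkDuplicates expects sorted input.

```shell
#!/usr/bin/env bash
# Sketch: one bwa mem call, and therefore one SAM file, per R1/R2 pair.
# Assumed naming: <sample>_R1_001_trim.fastq and <sample>_R2_001_trim.fastq.

# Strip the R1 suffix from a file name to get the prefix shared by both mates.
sample_prefix() {
    local name
    name=$(basename "$1")
    printf '%s\n' "${name%R1_001_trim.fastq}"
}

# Align every R1/R2 pair found in a directory.
# With DRY_RUN=1 (hypothetical switch) the commands are printed, not run.
align_pairs() {
    local indir=$1 outdir=$2 index=$3 r1 r2 prefix sample
    for r1 in "$indir"/*R1_001_trim.fastq; do
        [ -e "$r1" ] || continue            # glob matched nothing
        prefix=$(sample_prefix "$r1")
        r2="$indir/${prefix}R2_001_trim.fastq"
        sample=${prefix%_}                  # drop the trailing underscore
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "bwa mem -M -t 16 $index $r1 $r2 > $outdir/$sample.sam"
            echo "samtools sort -o $outdir/$sample.sorted.bam $outdir/$sample.sam"
        else
            # Both mates go into a single bwa mem call, giving a single SAM.
            bwa mem -M -t 16 "$index" "$r1" "$r2" > "$outdir/$sample.sam"
            # Coordinate-sort and convert to BAM in one step.
            samtools sort -o "$outdir/$sample.sorted.bam" "$outdir/$sample.sam"
        fi
    done
}
```

With the variables from the original post this could be called as align_pairs "${INPATH}" "${OUTPATH}" "${BWA}", or with DRY_RUN=1 in front to first check which pairs would be aligned. The sorted .bam can then go through duplicate marking and on to MACS with -f BAMPE.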