Hi, I would like to only unique mapped paired-end reads. Reads which are not anymore in pair I would like to remove. Are the below steps correctly to do it?
- aligned pair-end data with bowtie2.
- Removed duplicates with MarkDuplicates
- kept aligned reads only with mapping score above 40 with samtools
- fixed read information with
- filtered only proper paired with samtools .
usage: sh unique_reads_dedup_pbs.sh /path/to/FQdata
for r1 in $(find $1 -name "*.bam"); do
#cat <<eof qsub="" <<eof<="" p="">
!/bin/bash -l
PBS -N $r1
PBS -l walltime=43:00:00
PBS -j oe
PBS -l mem=190G
PBS -l ncpus=2
PBS -m bea
cd \$PBS_O_WORKDIR
conda activate picard picard -Xmx490g MarkDuplicates ASSUME_SORT_ORDER=coordinate REMOVE_DUPLICATES=true INPUT=$r1 OUTPUT=${r1}.dedup.bam METRICS_FILE=${r1}.dedup.metrics.txt conda deactivate
conda activate hisat2 samtools index ${r1}.dedup.bam conda deactivate
conda activate hisat2 samtools view -Sb -q 40 ${r1}.dedup.bam > ${r1}.dedup.q40.bam samtools index ${r1}.dedup.q40.bam conda deactivate
conda activate picard TMP_DIR=
pwd
/tmp mkdir \$TMP_DIR echo \$TMP_DIR picard -Xmx190g FixMateInformation \ I=${r1} \ O=${r1}.dedup.q40.FixMate.bam \ ASSUME_SORTED=false \ SORT_ORDER=coordinate conda deactivateconda activate hisat2 samtools view -@ 12 -hb -f 0x2 -F 2316 ${r1}.dedup.q40.FixMate.bam > ${r1}.dedup.q40.FixMate.clean.bam samtools index ${r1}.dedup.q40.FixMate.clean.bam conda deactivate
EOF
done
Thank you in advance
In the first or second
samtools view
command?Think second is better, you are already doing
-F 2316
. You could change this to2317
or2319
Thank you,
2317
looks good and now I can also remove-f 0x2