Entering edit mode
8.8 years ago
umn_bist
▴
390
The samples are tumor/match_normal PE RNA-seq samples from TCGA. I need to call/annotate somatic mutation
1. preprocessing QC: trim adapter using cutadapt (may use BBMap), remove low quality (<10) reads
2. align: TopHat2, add RG with bowtie2 (is this common practice?)
- I am interested in using STAR instead of TopHat2 but my advisor warned that the STAR/Tophat does not share similar format for removing multimapped/unmapped reads (Tophat is part of the workflow right now)
for f in *_nodupe.bam do fname=`basename $f .bam` #Remove Create Unique and Multi/Unmapped Bams samtools view -h -F 1024 $f | awk 'substr($0,1,1)=="@"||match($0, /NH:i:1\t/) {print $0}' | samtools view -hSb - > ${fname}_unique.bam samtools view -F 1024 -h $f | awk 'substr($0,1,1)=="@"||!/NH:i:1\t/ {print $0}' | samtools view -hSb - > ${fname}_multimapped.bam #Unique Reads Aligning to hg19 Repeatitive Sequence samtools view ${fname}_unique.bam | wc -l > ${fname}.count; bamToBed -i ${fname}_unique.bam -bed12 | intersectBed -a packages/tracks/hg19/hg19_rmsk_FLL1.bed -b stdin -split -sorted -c > ${fname}_MappedRE.bed; awk -F '\t' '{print $4"\t"$7}' ${fname}_MappedRE.bed > ${fname}.names; done
3. postprocessing QC: dedupe (mark duplicates)
4. variant calling/annotation: muTect2, snpSift (select against low MQ, AF, QUAL variants)
5. annotate: wAnnovar (may use snpEff)