My aim is to find copy number variants (CNVs). I understood that CNVnator alone is enough to call CNVs, but I recently read a post on GitHub in which CNVnator calls are merged into the lumpy pipeline. Post link:
WIN=100
SAMPLE="NA12878"
SAMPLE_BAM="NA12878_S1.bam"
cnvnator -root $SAMPLE.$WIN.root -genome GRCh37 -tree $SAMPLE_BAM
cnvnator -genome GRCh37 -root $SAMPLE.$WIN.root -his $WIN -d /shared/genomes/b37/full/chroms
cnvnator -root $SAMPLE.$WIN.root -stat $WIN
cnvnator -root $SAMPLE.$WIN.root -partition $WIN
cnvnator -root $SAMPLE.$WIN.root -call $WIN > $SAMPLE.$WIN.cnvcalls.txt
~/src/lumpy-sv/scripts/cnvanator_to_bedpes.py \
-c NA12878.$WIN.cnvcalls.txt \
-b 600 \
--del_o del.$WIN.bedpe \
--dup_o dup.$WIN.bedpe
#### DEPENDING ON WHETHER YOUR BAMS USE chr1 OR 1 YOU MAY NOT NEED THIS STEP
cat del.$WIN.bedpe | sed -e "s/chr//g" > del.$WIN.nochr.bedpe
cat dup.$WIN.bedpe | sed -e "s/chr//g" > dup.$WIN.nochr.bedpe
~/src/lumpy-sv/scripts/bedpe_sort.py \
-b del.$WIN.nochr.bedpe \
-g ~/scratch/cnvnator/genome.txt \
> del.$WIN.nochr.posSorted.bedpe
~/src/lumpy-sv/scripts/bedpe_sort.py \
-b dup.$WIN.nochr.bedpe \
-g ~/scratch/cnvnator/genome.txt \
> dup.$WIN.nochr.posSorted.bedpe
~/src/lumpy-sv/bin/lumpy \
-mw 4 \
-tt 1.0 \
-pe bam_file:$PEBAM,histo_file:$HISTO,mean:$MEAN,stdev:$STD,read_length:100,min_non_overlap:100,discordant_z:$Z,back_distance:20,weight:1,id:PE,min_mapping_threshold:10 \
-sr bam_file:$SRBAM,back_distance:20,min_mapping_threshold:10,weight:1,id:SR,min_clip:20 \
-bedpe bedpe_file:dup.$WIN.nochr.posSorted.bedpe,weight:3,id:DUP \
-bedpe bedpe_file:del.$WIN.nochr.posSorted.bedpe,weight:3,id:DEL
Does anyone know what format the genome file passed to bedpe_sort.py with -g should be in? I tried using the reference file in FASTA format and it returned an empty file.
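For reference, here is a minimal sketch of what I plan to try next. It rests on an assumption I have not confirmed from the script itself: that bedpe_sort.py wants a bedtools-style genome file, meaning one line per chromosome with the name and length separated by a tab, listed in the desired sort order. A file like that can be cut from a FASTA index (the .fai file that `samtools faidx` writes next to the reference). The file names and the toy index values below are hypothetical and only stand in for my real reference.

```shell
#!/bin/sh
# Toy stand-in for a real FASTA index (hypothetical values).
# .fai columns: name, length, byte offset, bases per line, bytes per line.
printf '1\t1000\t3\t60\t61\n2\t800\t1023\t60\t61\n' > toy.fa.fai

# Keep only the first two columns (name and length). This is the layout
# I am ASSUMING bedpe_sort.py expects for -g; it is not confirmed.
cut -f1,2 toy.fa.fai > genome.txt

# Show the result: one "name<TAB>length" line per chromosome.
cat genome.txt
```

If this is the right format, the chromosome names in genome.txt would also have to match the names left in the BEDPE files after the chr-stripping step (1 rather than chr1), otherwise I would expect the sort to drop every record again.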