bedtools intersect -u not giving unique records
2.7 years ago

These are the steps I'm following.

The first step extracts the sample from the gnomAD VCF using a BED file (here the BED file is my input BED file converted to Hg38):

tabix -h -R Hg19_to_Hg38_sorted.bed.gz gnomad.genomes.v{g_version}.hgdp_tgp.chr{chr}.vcf.bgz | perl {vcftools} -c {sample_name} > {sample_name}_out.vcf

Output ({sample_name}_out.vcf):
chr2    113982416   rs56177103  TATAAAATAAAATAAA    T   .   PASS    .   GT:AAD:DAD:DAF:ADF  0/1:25519,4077:25519,4077:0.13776:0.13776   
chr2    113982416   rs56177103  TATAAAATAAAATAAA    T   .   PASS    .   GT:AAD:DAD:DAF:ADF  0/1:25519,4077:25519,4077:0.13776:0.13776   
chr2    113982416   rs56177103  TATAAAATAAAATAAA    T   .   PASS    .   GT:AAD:DAD:DAF:ADF  0/1:25519,4077:25519,4077:0.13776:0.13776   

My output file contains repeated records, so to keep only the unique ones I ran intersectBed against the same input BED file. I still can't get unique records: it returns the same repeated results. Why is that? This is the command I used:

bedtools/intersectBed -u -a {sample_name}_out.vcf -b bed_filename > output.vcf
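
For context, -u makes intersectBed write each entry of the -a file once if it overlaps anything in -b; it does not collapse lines that are already repeated inside the -a file, so three identical VCF records come out as three records. One possible source of the repeats (an assumption, not confirmed in this thread) is overlapping intervals in the BED file, since a record can be returned once for every region it falls in. A minimal sketch of merging overlapping intervals before the tabix step, using only sort and awk; the file names and coordinates are made up for illustration:

```shell
# Toy BED file (hypothetical name and made-up coordinates);
# the first two chr2 intervals overlap.
printf 'chr2\t100\t200\n'  > toy_regions.bed
printf 'chr2\t150\t300\n' >> toy_regions.bed
printf 'chr2\t400\t500\n' >> toy_regions.bed
printf 'chr1\t10\t20\n'   >> toy_regions.bed

# Sort by chromosome and start, then merge intervals that overlap or touch.
LC_ALL=C sort -k1,1 -k2,2n toy_regions.bed | awk '
BEGIN { OFS = "\t" }
NR == 1 { c = $1; s = $2 + 0; e = $3 + 0; next }
$1 == c && $2 + 0 <= e { if ($3 + 0 > e) e = $3 + 0; next }  # extend current interval
{ print c, s, e; c = $1; s = $2 + 0; e = $3 + 0 }            # emit and start a new one
END { if (NR > 0) print c, s, e }
' > toy_merged.bed

cat toy_merged.bed
```

The merged file would then need to be compressed and indexed again before being passed to tabix -R.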
bedtools intersectbed vcftools vcf tabix • 923 views

I was also wondering whether piping the records through sort | uniq would give the same result?
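
A rough sketch of what the sort | uniq route could look like for a VCF. Running sort on the whole file would mix the header lines into the records, so the header is handled separately here. The file names and the last record are made up for illustration:

```shell
# Toy VCF (hypothetical file name): one record repeated three times,
# plus one made-up record at another position.
printf '##fileformat=VCFv4.2\n' > toy_out.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> toy_out.vcf
printf 'chr2\t113982416\trs56177103\tTATAAAATAAAATAAA\tT\t.\tPASS\t.\n' >> toy_out.vcf
printf 'chr2\t113982416\trs56177103\tTATAAAATAAAATAAA\tT\t.\tPASS\t.\n' >> toy_out.vcf
printf 'chr2\t113982416\trs56177103\tTATAAAATAAAATAAA\tT\t.\tPASS\t.\n' >> toy_out.vcf
printf 'chr2\t113982500\t.\tA\tG\t.\tPASS\t.\n' >> toy_out.vcf

# Keep the header lines as they are, then sort the records by chromosome
# and position and drop lines that are exact duplicates.
{
  grep '^#' toy_out.vcf
  grep -v '^#' toy_out.vcf | LC_ALL=C sort -k1,1 -k2,2n | uniq
} > toy_dedup.vcf

cat toy_dedup.vcf
```

Note that uniq only removes lines that are identical character for character, so two records at the same position with different genotype fields would both be kept.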

2.7 years ago

Another option is to pipe BED data to sort-bed:

$ ... | sort-bed --unique - > answer.bed

Ref.: https://bedops.readthedocs.io/en/latest/content/reference/file-management/sorting/sort-bed.html


But one of my files is a VCF file.
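
If converting the VCF to BED is not wanted, one alternative is to drop repeated records from the VCF directly with awk. This is only a sketch: it assumes that two records are duplicates when CHROM, POS, REF and ALT all match, and the file names and the second variant are made up for illustration:

```shell
# Toy VCF (hypothetical file name): one record repeated three times,
# plus a made-up second allele at the same position.
printf '##fileformat=VCFv4.2\n' > toy_in.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> toy_in.vcf
printf 'chr2\t113982416\trs56177103\tTATAAAATAAAATAAA\tT\t.\tPASS\t.\n' >> toy_in.vcf
printf 'chr2\t113982416\trs56177103\tTATAAAATAAAATAAA\tT\t.\tPASS\t.\n' >> toy_in.vcf
printf 'chr2\t113982416\trs56177103\tTATAAAATAAAATAAA\tT\t.\tPASS\t.\n' >> toy_in.vcf
printf 'chr2\t113982416\t.\tTATAAAATAAAATAAA\tTATAAA\t.\tPASS\t.\n' >> toy_in.vcf

# Print every header line; for records, print only the first occurrence
# of each CHROM/POS/REF/ALT combination. Input order is preserved.
awk -F'\t' '
/^#/ { print; next }
!seen[$1 FS $2 FS $4 FS $5]++
' toy_in.vcf > toy_unique.vcf

cat toy_unique.vcf
```

Because the key ignores the sample columns, only the first of several records with the same site and alleles survives, which is fine when the repeats are exact copies as in the output shown in the question.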
