Dear Biostars,
I have a particular type of BED file, created using ngs-bits, in hopes of using the ClinCNV tool. Unfortunately, I am unable to use their in-house annotation method because my HPC will not give me permission to use MySQL as root. Long story, but I just need to brainstorm other ways of annotating my BED file.
The BED file looks like this (chromosome, start, end, GC). Sequences have been binned into windows of roughly 2000 nucleotides, and the start and end coordinates restart for each chromosome.
chr1 0 2001 n/a
chr1 2001 4002 n/a
chr1 4002 6003 n/a
chr1 6003 8004 n/a
chr1 8004 10005 0.4000
chr1 10005 12006 0.5947
chr1 12006 14007 0.5882
chr2 0 1999 n/a
chr2 1999 3998 n/a
chr2 3998 5997 n/a
chr2 5997 7996 n/a
chr2 7996 9995 n/a
chr2 9995 11994 0.6720
chr2 11994 13993 0.3722
chr2 13993 15992 0.4132
Does anyone have any ideas on how to add the transcript names that correspond to the genomic ranges? Further, is there a way to highlight regions that fall into a CpG site? I'd be interested in removing these.
Any advice would be appreciated.
Hi @K.patel5, you don't need MySQL to run ngs-bits =) MySQL is optional. You can even use ngs-bits in a container via bioconda.
CpG sites are only 2 bp long (a single CG dinucleotide is already a CpG site) and your regions are about 2 kb, so it is not clear what you want to remove from where.
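As for the transcript names, a plain interval overlap needs no database at all. Below is a minimal Python sketch, assuming you can export transcript coordinates (for example from a GTF) as (chromosome, start, end, name) tuples. The function names and the transcripts `TX_A` and `TX_B` are made up for illustration; only the bin coordinates come from your post. For a whole genome, a dedicated interval tool such as bedtools would likely be faster than this linear scan.

```python
from collections import defaultdict


def parse_bed(lines):
    """Yield (chrom, start, end, gc) from whitespace-separated BED lines."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, start, end, gc = line.split()[:4]
        yield chrom, int(start), int(end), gc


def annotate(bins, transcripts):
    """Attach the names of overlapping transcripts to each bin.

    Intervals are treated as half-open [start, end), as in BED.
    """
    by_chrom = defaultdict(list)
    for chrom, t_start, t_end, name in transcripts:
        by_chrom[chrom].append((t_start, t_end, name))

    annotated = []
    for chrom, start, end, gc in bins:
        hits = sorted(
            {name for t_start, t_end, name in by_chrom[chrom]
             if t_start < end and start < t_end}
        )
        # "." marks a bin with no overlapping transcript
        annotated.append((chrom, start, end, gc, ",".join(hits) or "."))
    return annotated


# Bins copied from the question
bed_lines = [
    "chr1 8004 10005 0.4000",
    "chr1 10005 12006 0.5947",
    "chr1 12006 14007 0.5882",
]
# Hypothetical transcripts, coordinates invented for this example
transcripts = [
    ("chr1", 9500, 12500, "TX_A"),
    ("chr1", 12006, 13000, "TX_B"),
]

annotated = annotate(parse_bed(bed_lines), transcripts)
for row in annotated:
    print(*row, sep="\t")
```

Each output row is the original bin plus a fifth column with a comma-separated list of transcript names, so the file stays a valid BED with one extra annotation column.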
Thanks, I did end up using bioconda for this. I had intended to remove any areas with a GC ratio above 0.8. I have read in a few CNV diagnostic publications that removing such regions can reduce false reporting of CNVs.
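In case it helps anyone else, here is a minimal Python sketch of that filter, assuming the four-column layout shown in the question with the GC fraction in the last column. The function name `filter_high_gc` and the last example line (GC 0.8510) are made up for illustration, and keeping the `n/a` bins rather than dropping them is my own assumption.

```python
def filter_high_gc(lines, max_gc=0.8):
    """Split BED lines into (kept, removed) by GC fraction in column 4."""
    kept, removed = [], []
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        try:
            gc = float(fields[3])
        except ValueError:
            # "n/a" means GC is unknown; these bins are kept here (assumption)
            kept.append(line)
            continue
        (removed if gc > max_gc else kept).append(line)
    return kept, removed


bed_lines = [
    "chr2 7996 9995 n/a",
    "chr2 9995 11994 0.6720",
    "chr2 11994 13993 0.3722",
    "chr2 15992 17991 0.8510",  # hypothetical high-GC bin, not from my file
]

kept, removed = filter_high_gc(bed_lines)
print("kept:", len(kept), "removed:", len(removed))
```

Returning the removed bins as well makes it easy to check how much of the target is lost at a given threshold before settling on 0.8.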