Hi all,
I'm confused by an output I got from bedtools genomecov. I was trying to identify what percentage of my target genome is covered by transposable elements, so I submitted this command:
bedtools genomecov -i CS_unsplit_annotations/iwgsc_refseqv1.0_TransposableElements_2017Mar13.gff3 \
-g CS_chromosomes_no_mtcp.genome
-i = GFF file of transposable elements
-g = genome file listing chromosome names and their coordinates
However, the output doesn't make much sense to me. The excerpt below describes what portion of the genome is covered at each depth by the TEs. The fractions in the last column only sum to about 0.70, though. What happened to the other 30% of the genome?
genome 0 2006041373 14547261565 0.137898
genome 1 4130680743 14547261565 0.283949
genome 2 821174927 14547261565 0.0564488
genome 3 2445960418 14547261565 0.168139
genome 4 690610368 14547261565 0.0474736
genome 5 132938036 14547261565 0.00913835
genome 6 20984529 14547261565 0.00144251
genome 7 2672582 14547261565 0.000183717
genome 8 1041252 14547261565 7.15772e-05
genome 9 19609 14547261565 1.34795e-06
genome 10 128569 14547261565 8.83802e-06
genome 12 41863 14547261565 2.87772e-06
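To double-check my arithmetic, here is a minimal Python sketch that sums the genome-wide lines above. The numbers are copied straight from the output; the names `rows` and `summarize` are just ones I made up for this check, and it assumes the five whitespace-separated columns are name, depth, bases at that depth, genome size, and fraction.

```python
# Genome-wide lines copied verbatim from the genomecov output above.
# Assumed columns: name, depth, bases at that depth, genome size, fraction.
rows = """\
genome 0 2006041373 14547261565 0.137898
genome 1 4130680743 14547261565 0.283949
genome 2 821174927 14547261565 0.0564488
genome 3 2445960418 14547261565 0.168139
genome 4 690610368 14547261565 0.0474736
genome 5 132938036 14547261565 0.00913835
genome 6 20984529 14547261565 0.00144251
genome 7 2672582 14547261565 0.000183717
genome 8 1041252 14547261565 7.15772e-05
genome 9 19609 14547261565 1.34795e-06
genome 10 128569 14547261565 8.83802e-06
genome 12 41863 14547261565 2.87772e-06
"""


def summarize(text):
    """Sum the per-depth base counts and fractions of a genomecov histogram."""
    total_bases = 0
    total_fraction = 0.0
    genome_size = None
    for line in text.strip().splitlines():
        _name, _depth, bases, size, fraction = line.split()
        total_bases += int(bases)
        total_fraction += float(fraction)
        genome_size = int(size)  # identical on every line
    return total_bases, genome_size, total_fraction


total_bases, genome_size, total_fraction = summarize(rows)
print(f"bases accounted for:   {total_bases} of {genome_size}")
print(f"bases unaccounted for: {genome_size - total_bases}")
print(f"sum of fractions:      {total_fraction:.4f}")
```

If the histogram were complete, the base counts should add up to the genome size and the fractions to 1.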
Hoping someone can shed some light on this odd behaviour. Thank you!
EDIT
I'm getting even more confusing results with another feature file (coding sequences this time). The output states that about 90% of the length of each individual chromosome is not covered by any genes, which is roughly what I expect from my organism. However, for the genome as a whole, bedtools reports that only about 3% has no coverage. There are 21 chromosomes, so I won't show all of the zero-coverage lines, but here are some:
chr2D 0 592882880 651852609 0.909535
chr3D 0 562541593 615552423 0.913881
chr6A 0 568492811 618079260 0.919773
And then the whole genome zero coverage line:
genome 0 401132100 14547261565 0.0275744
Note that field 3, which the manual describes as the "number of bases on chromosome (or genome) with depth equal to column 2", is smaller for the whole genome than it is for any of the individual chromosomes. What is going on?
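The inconsistency can be spelled out with a short Python sketch, using only the zero-depth lines quoted above. The variable names are made up for this check. The assumption is that the genome-wide zero-depth count should be at least the sum of the per-chromosome zero-depth counts, since the genome is made up of those chromosomes.

```python
# Zero-depth lines copied from the output above:
# chromosome -> (bases at depth 0, chromosome length)
chrom_zero = {
    "chr2D": (592882880, 651852609),
    "chr3D": (562541593, 615552423),
    "chr6A": (568492811, 618079260),
}

# Genome-wide zero-depth line: bases at depth 0 and total genome size.
genome_zero, genome_size = 401132100, 14547261565

# Zero-depth bases on just the three chromosomes shown (out of 21).
shown_zero = sum(bases for bases, _length in chrom_zero.values())

print(f"zero-depth bases on the three chromosomes shown: {shown_zero}")
print(f"zero-depth bases reported genome-wide:           {genome_zero}")
print(f"genome-wide count is at least the partial sum:   {genome_zero >= shown_zero}")
```

Three chromosomes alone already account for more zero-depth bases than the genome line reports, so the genome line cannot simply be the sum of the per-chromosome lines.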
Some other whole genome lines for context:
genome 1 580647461 14547261565 0.0399146
genome 2 279006216 14547261565 0.0191793
genome 3 148353592 14547261565 0.010198
genome 4 87959970 14547261565 0.0060465
genome 5 53422266 14547261565 0.00367232