Hi, good morning Biostars. I got into trouble again by accepting work that I have never done before. It does not seem that difficult, so I will describe my approach here and you can tell me whether it is right or whether there is a better one. I need to get the coverage of a whole genome from an alignment file. I really don't know how to do that, possibly using bedtools. Please let me know if there is an alternative way to get coverage for the whole genome. Thank you very much, have a nice day.
Hi Agata, good morning, you are so helpful. In fact I just need the coverage for each nucleotide; I don't need the coverage for each gene for now. From the command you gave me, it seems I need a GFF file. Do you think we can skip that annotation file?
It is not an annotation file but a file of defined regions. I think a coverage file for a full genome would be something like 2 Gb of data?
Hi, thanks for your quick reply. I'm working with a small bacterium of about 5 Mb, so things are going to be tiny. If I remember correctly, the GFF file was produced using bedtools, right? Anyway, I think I should read the bedtools manual in more detail.
I think you can write your own BED file containing the start and stop coordinates of your reference. Which bacterium do you have?
We assembled the genome in house, so it's not published. Do you know what I need to generate these GFF files for bedtools? How do people generate this file?
I think you can count how many nucleotides you have and then just write a BED file containing:
1 (start) \t XXX (end)
I am not sure it is going to work, I have never done something like that, but it is worth a try :) I would try that first.
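For an assembly with several contigs, the idea above could be sketched in Python as follows: read the contig lengths from the reference FASTA and write one BED line per contig. Note that a BED line starts with the sequence name and uses a 0-based start, so a contig of length N becomes `name 0 N`. The function name `fasta_to_bed` and the contig names in the example are hypothetical, and this is only an illustration under those assumptions, not a tested recipe for any particular bedtools command.

```python
import io


def fasta_to_bed(handle):
    """Return one BED line (name, 0, length) per sequence in a FASTA stream."""
    lengths = {}  # dicts keep insertion order, so contig order is preserved
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the contig name
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    # BED is 0-based, half-open: a whole contig spans 0..length
    return [f"{contig}\t0\t{length}" for contig, length in lengths.items()]


# Tiny made-up FASTA standing in for the real assembly (hypothetical names)
example = io.StringIO(">contig_1 demo\nACGTAC\nGT\n>contig_2\nACG\n")
bed_lines = fasta_to_bed(example)
print("\n".join(bed_lines))
```

With a real assembly you would pass an open file handle instead of the `io.StringIO` example and write the returned lines to a `.bed` file.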
Where are you getting these GFF files from? You are so lucky to have them, maybe because you work with a model organism.
You need to write the BED/GFF file on your own :)
Here is an example of what a BED file should look like:
How Can I Make A Bed File?
Have a look at this one, coming from the QUAST GenMark tool. Maybe it is enough? Thanks https://drive.google.com/file/d/0B8-ZAuZe8jldTDd2RTd2REZxclU/view?usp=sharing
If that file does not work, cut columns 1, 4 and 5; that would convert it to BED format.
This file will not give you coverage for every nucleotide for the entire genome (just for the intervals defined in your file).
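The column cut suggested above can be sketched in Python. One detail worth knowing: GFF coordinates are 1-based and inclusive, while BED uses a 0-based start with an exclusive end, so a plain cut of columns 1, 4 and 5 leaves every interval one base short at its start. The sketch below subtracts 1 from the start to account for that. The function name `gff_to_bed` and the example records are made up for illustration and are not taken from the linked file.

```python
def gff_to_bed(gff_lines):
    """Convert GFF records to 3-column BED lines (seqid, start-1, end)."""
    bed = []
    for line in gff_lines:
        # Skip blank lines and comment/directive lines such as "##gff-version 3"
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete feature line
        seqid, start, end = fields[0], int(fields[3]), int(fields[4])
        # GFF is 1-based inclusive; BED is 0-based half-open
        bed.append(f"{seqid}\t{start - 1}\t{end}")
    return bed


# Made-up GFF records (hypothetical contig and gene names)
example_gff = [
    "##gff-version 3",
    "contig_1\tGeneMark\tgene\t10\t50\t.\t+\t.\tID=gene_1",
    "contig_1\tGeneMark\tgene\t100\t180\t.\t-\t.\tID=gene_2",
]
bed_lines = gff_to_bed(example_gff)
print("\n".join(bed_lines))
```

As noted in the thread, intervals produced this way only cover the annotated features, not every nucleotide of the genome.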
If you need the entire genome I wonder if you can get away with saying
which will give you coverage for every nucleotide in the scaffold.
Hi genomax2. Thanks for having a look at my file. I have 31 contigs in my genome, so doing that will do the trick for all the contigs, right? Thanks so much.
Hi Agata, do you think we are talking about GFF3 or basic GFF? If I remember correctly, there are some differences.