Dear all,
I know my question may seem basic, but I need your advice.
I have about 400 genes and I need to know their GC%.
I couldn't find any online program that accepts a list of genes and returns their corresponding GC percent.
Can anybody suggest a way to do this?
Thanks in advance for any help.
Nazanin
Does the %GC of a gene only include the exons, or also the UTRs and/or introns?
I have only the gene symbols of those 400 genes.
Does the desired %GC of a gene only include the exonic sequences, or also the sequences from UTRs and/or introns?
If you have any nucleotide sequence, you could just use geneboy and not worry about whether it's exonic, UTR, promoter or other: https://www.dnalc.org/resources/geneboy.html
Hi, you need to describe your data. Do you have the sequences of your genes?
Best
No, I only have their names (gene symbols).
OK, I think you need the sequences. You can get them from the gene symbols using https://genome.ucsc.edu/cgi-bin/hgTables?hgsid=585799715_RKJMr531TPC0GMrB2Q4uMigVdh3K . You need to make a custom track to get the regions in BED format. Then you can use samtools with the reference genome to get the GC content:
bedtools nuc -fi hg19.fa -bed SNP_Regions.bed
That command says bedtools, not samtools :p
Thanks for correcting me :)
Unfortunately, the gene symbols belong to a bacterium, not human.
Do you think I can use Ensembl Bacteria?
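For anyone curious what the per-interval calculation in the command above boils down to, here is a minimal Python sketch. It is not a reimplementation of bedtools and will not reproduce its exact output columns. The function names `gc_percent` and `gc_for_bed` are made up for illustration, the genome is assumed to be already loaded into a dict of chromosome name to sequence, and ambiguous bases such as N are left out of the denominator, which is a choice of this sketch.

```python
def gc_percent(seq):
    """Return GC% of a nucleotide string, counting only A, C, G and T."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Avoid dividing by zero for empty or all-N sequences.
    return 100.0 * gc / total if total else 0.0


def gc_for_bed(genome, bed_lines):
    """Compute GC% for each BED interval.

    genome    -- dict mapping chromosome name to its sequence (assumed in memory)
    bed_lines -- iterable of BED lines; coordinates are 0-based, half-open
    """
    results = []
    for line in bed_lines:
        # Skip blank lines and the header lines a custom track may carry.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        # Use the BED name column if present, otherwise the coordinates.
        name = fields[3] if len(fields) > 3 else f"{chrom}:{start}-{end}"
        results.append((name, gc_percent(genome[chrom][start:end])))
    return results
```

BED start coordinates are 0-based and the end is exclusive, so a Python slice `[start:end]` picks out exactly the interval.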
Which bacterium is this? You may be able to get the records you need from the protein sequences from the genome and then it could simply be a matter of extracting genes you need and using a tool like GeeCee from EMBOSS to calculate the GC%.
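If you would rather not install EMBOSS, the "extract the genes you need and compute GC%" step can be sketched in plain Python. This is only a rough sketch under some assumptions: the sequences are nucleotide coding sequences in a FASTA file whose headers carry the gene symbol as `[gene=...]` (as in NCBI's "CDS from genomic" downloads; other sources will need a different pattern), and the function names `read_fasta` and `gc_by_gene` are invented for this example.

```python
import re


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gc_by_gene(fasta_lines, wanted):
    """Return {gene symbol: GC%} for the records whose symbol is in `wanted`.

    Assumes each header contains the symbol as [gene=SYMBOL].
    """
    wanted = set(wanted)
    out = {}
    for header, seq in read_fasta(fasta_lines):
        match = re.search(r"\[gene=([^\]]+)\]", header)
        if match and match.group(1) in wanted:
            s = seq.upper()
            gc = s.count("G") + s.count("C")
            # GC% over the full sequence length.
            out[match.group(1)] = 100.0 * gc / len(s) if s else 0.0
    return out
```

With a real file you would pass an open file handle as `fasta_lines` and your list of 400 symbols as `wanted`. Symbols missing from the result are the ones that were not found in the headers, which is worth checking because gene names can differ between annotations.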
I looked at BioMart for Ensembl Bacteria and, if I'm right, there is no BioMart there to request what you want... Maybe at NCBI? How did you get your list? Is it for a specific species?