Computing 1GC, 2GC and 3GC on any CDS
7.4 years ago

Hello,

I was wondering how to compute 1GC, 2GC and 3GC (the GC content at codon positions 1, 2 and 3).

I want to compare the GC content of predicted CDSs. At the moment, I simply compute it with this formula:

GC_content = (G+C)/(A+T+G+C)

For 1GC, 2GC and 3GC I tried:

1GC = (G1+C1)/(A1+T1+G1+C1) 
2GC = (G2+C2)/(A2+T2+G2+C2) 
3GC = (G3+C3)/(A3+T3+G3+C3)

But I'm not confident about this way of computing it. I understand that the proper formula takes mutation rate into account. I haven't found any software (such as a small Python script) so far. It would not be difficult to write my own Python script if I had the right formula, but I would prefer an existing script since it will involve a lot of mathematics and probability.

In addition, I'm looking for an in-depth review of GC content in prokaryotes, to really understand how to draw conclusions from GC content. I have looked at several articles but have not found a review that synthesizes the subject.

Thanks in advance for any help.

sequence gene • 2.3k views
7.3 years ago
Chirag Parsania ★ 2.0k

Here is an R function to calculate GC, GC1, GC2 or GC3. It takes a FASTA file as input, plus the name of the seqinr function to apply (GC, GC1, GC2 or GC3), and returns a vector with the GC content of each sequence.

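library("seqinr")      # provides s2c() and GC(), GC1(), GC2(), GC3()
library("Biostrings")  # provides readDNAStringSet()

# choice is the name of the seqinr function to apply: "GC", "GC1", "GC2" or "GC3"
getGC <- function(fastaFile, choice = "GC") {
    forGC <- readDNAStringSet(fastaFile)
    # Convert each sequence to a vector of single characters, as seqinr expects
    seqASchar <- lapply(forGC, function(elem) {
        s2c(as.character(elem))
    })
    # Apply the chosen function to every sequence
    gc_cont2 <- sapply(seqASchar, choice)
    return(gc_cont2)
}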
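
Since the question asks about Python, here is a minimal sketch of the same per-position counting, following the formulas in the question. It assumes the CDS is in frame starting at its first base, drops any trailing partial codon, and ignores ambiguous bases such as N. The function name gc_by_codon_position is hypothetical, and the sketch does not model mutation rates in any way.

```python
from collections import Counter


def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3) as fractions for an in-frame CDS string.

    Assumption: the reading frame starts at the first base of `cds`.
    """
    seq = cds.upper()
    # Drop a trailing partial codon so all three positions cover the same codons.
    seq = seq[: len(seq) - len(seq) % 3]

    fractions = []
    for pos in range(3):
        # Every third base, starting at this codon position.
        counts = Counter(seq[pos::3])
        # Only unambiguous bases enter the denominator (N etc. are ignored).
        acgt = sum(counts[base] for base in "ACGT")
        gc = counts["G"] + counts["C"]
        fractions.append(gc / acgt if acgt else float("nan"))
    return tuple(fractions)


if __name__ == "__main__":
    gc1, gc2, gc3 = gc_by_codon_position("ATGGCCTAA")
    print(f"GC1={gc1:.3f} GC2={gc2:.3f} GC3={gc3:.3f}")
```

To run it over a FASTA file, you would call the function on each record's sequence string; how the file is parsed (Biopython or a hand-written reader) is left open here.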

The R function above may be slow for large-scale analysis, so I would not recommend it if you have a lot of data to analyze.

7.4 years ago
Cacau ▴ 520

CodonW should be able to help you.
