Getting conversant with R
2.7 years ago
makoye ▴ 20
## Use a real coding sequence:
rcds <- read.fasta("virus.fasta")
uco( rcds, index = "freq")
uco( rcds, index = "eff")
uco( rcds, index = "rscu")
uco( rcds, as.data.frame = TRUE)

Could anyone help me? I used these functions as directed in the documentation for uco in the seqinr package to compute relative synonymous codon usage (RSCU), and the results are shown below. The FASTA file contains multiple sequences. What is wrong?

> uco( rcds, index = "freq")
aaa aac aag aat aca acc acg act aga agc agg agt 
  0   0   0   0   0   0   0   0   0   0   0   0 
ata atc atg att caa cac cag cat cca ccc ccg cct 
  0   0   0   0   0   0   0   0   0   0   0   0 
cga cgc cgg cgt cta ctc ctg ctt gaa gac gag gat 
  0   0   0   0   0   0   0   0   0   0   0   0 
gca gcc gcg gct gga ggc ggg ggt gta gtc gtg gtt 
  0   0   0   0   0   0   0   0   0   0   0   0 
taa tac tag tat tca tcc tcg tct tga tgc tgg tgt 
  0   0   0   0   0   0   0   0   0   0   0   0 
tta ttc ttg ttt 
  0   0   0   0 
> uco( rcds, index = "eff")

aaa aac aag aat aca acc acg act aga agc agg agt 
  0   0   0   0   0   0   0   0   0   0   0   0 
ata atc atg att caa cac cag cat cca ccc ccg cct 
  0   0   0   0   0   0   0   0   0   0   0   0 
cga cgc cgg cgt cta ctc ctg ctt gaa gac gag gat 
  0   0   0   0   0   0   0   0   0   0   0   0 
gca gcc gcg gct gga ggc ggg ggt gta gtc gtg gtt 
  0   0   0   0   0   0   0   0   0   0   0   0 
taa tac tag tat tca tcc tcg tct tga tgc tgg tgt 
  0   0   0   0   0   0   0   0   0   0   0   0 
tta ttc ttg ttt 
  0   0   0   0 
> uco( rcds, index = "rscu")
aaa aac aag aat aca acc acg act aga agc agg agt 
 NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA 
ata atc atg att caa cac cag cat cca ccc ccg cct 
 NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA 
cga cgc cgg cgt cta ctc ctg ctt gaa gac gag gat 
 NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA 
gca gcc gcg gct gga ggc ggg ggt gta gtc gtg gtt 
 NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA 
taa tac tag tat tca tcc tcg tct tga tgc tgg tgt 
 NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA 
tta ttc ttg ttt 
 NA  NA  NA  NA 
> uco( rcds, as.data.frame = TRUE)
     AA codon eff freq RSCU
aaa Lys   aaa   0    0   NA
aac Asn   aac   0    0   NA
aag Lys   aag   0    0   NA
aat Asn   aat   0    0   NA
aca Thr   aca   0    0   NA
acc Thr   acc   0    0   NA
acg Thr   acg   0    0   NA
act Thr   act   0    0   NA
aga Arg   aga   0    0   NA
agc Ser   agc   0    0   NA
agg Arg   agg   0    0   NA
agt Ser   agt   0    0   NA
ata Ile   ata   0    0   NA
atc Ile   atc   0    0   NA
atg Met   atg   0    0   NA
att Ile   att   0    0   NA
caa Gln   caa   0    0   NA
cac His   cac   0    0   NA
cag Gln   cag   0    0   NA
cat His   cat   0    0   NA
cca Pro   cca   0    0   NA
ccc Pro   ccc   0    0   NA
ccg Pro   ccg   0    0   NA
cct Pro   cct   0    0   NA
cga Arg   cga   0    0   NA
cgc Arg   cgc   0    0   NA
cgg Arg   cgg   0    0   NA
cgt Arg   cgt   0    0   NA
cta Leu   cta   0    0   NA
ctc Leu   ctc   0    0   NA
ctg Leu   ctg   0    0   NA
ctt Leu   ctt   0    0   NA
gaa Glu   gaa   0    0   NA
gac Asp   gac   0    0   NA
gag Glu   gag   0    0   NA
gat Asp   gat   0    0   NA
gca Ala   gca   0    0   NA
gcc Ala   gcc   0    0   NA
gcg Ala   gcg   0    0   NA
gct Ala   gct   0    0   NA
gga Gly   gga   0    0   NA
ggc Gly   ggc   0    0   NA
ggg Gly   ggg   0    0   NA
ggt Gly   ggt   0    0   NA
gta Val   gta   0    0   NA
gtc Val   gtc   0    0   NA
gtg Val   gtg   0    0   NA
gtt Val   gtt   0    0   NA
taa Stp   taa   0    0   NA
tac Tyr   tac   0    0   NA
tag Stp   tag   0    0   NA
tat Tyr   tat   0    0   NA
tca Ser   tca   0    0   NA
tcc Ser   tcc   0    0   NA
tcg Ser   tcg   0    0   NA
tct Ser   tct   0    0   NA
tga Stp   tga   0    0   NA
tgc Cys   tgc   0    0   NA
tgg Trp   tgg   0    0   NA
tgt Cys   tgt   0    0   NA
tta Leu   tta   0    0   NA
ttc Phe   ttc   0    0   NA
ttg Leu   ttg   0    0   NA
ttt Phe   ttt   0    0   NA
> 

Tip: use code formatting when posting examples; it will make your question easier to read. This time I've reformatted it for you.

2.7 years ago
makoye ▴ 20
## I have finally resolved the problem using the following arrangement of the functions (note the [[1]], which selects the first sequence):

## Use a real coding sequence:
rcds <- read.fasta("virus.fasta")[[1]]
uco( rcds, index = "freq")
uco( rcds, index = "eff")
uco( rcds, index = "rscu")
uco( rcds, as.data.frame = TRUE, NA.rscu = NA)

The output data frame is shown below:

> uco( rcds, as.data.frame = TRUE, NA.rscu = NA)
     AA codon eff        freq      RSCU
aaa Lys   aaa  16 0.068669528 1.8823529
aac Asn   aac   6 0.025751073 0.9230769
aag Lys   aag   1 0.004291845 0.1176471
aat Asn   aat   7 0.030042918 1.0769231
aca Thr   aca   7 0.030042918 1.8666667
acc Thr   acc   3 0.012875536 0.8000000
acg Thr   acg   2 0.008583691 0.5333333
act Thr   act   3 0.012875536 0.8000000
aga Arg   aga   8 0.034334764 3.6923077
agc Ser   agc   0 0.000000000 0.0000000
agg Arg   agg   2 0.008583691 0.9230769
agt Ser   agt   6 0.025751073 2.5714286
ata Ile   ata   6 0.025751073 1.0588235
atc Ile   atc   3 0.012875536 0.5294118
atg Met   atg   4 0.017167382 1.0000000
att Ile   att   8 0.034334764 1.4117647
caa Gln   caa   4 0.017167382 1.3333333
cac His   cac   2 0.008583691 1.3333333
cag Gln   cag   2 0.008583691 0.6666667
cat His   cat   1 0.004291845 0.6666667
cca Pro   cca   4 0.017167382 2.0000000
ccc Pro   ccc   1 0.004291845 0.5000000
ccg Pro   ccg   1 0.004291845 0.5000000
cct Pro   cct   2 0.008583691 1.0000000
cga Arg   cga   2 0.008583691 0.9230769
cgc Arg   cgc   0 0.000000000 0.0000000
cgg Arg   cgg   0 0.000000000 0.0000000
cgt Arg   cgt   1 0.004291845 0.4615385
cta Leu   cta   1 0.004291845 0.2857143
ctc Leu   ctc   3 0.012875536 0.8571429
ctg Leu   ctg   3 0.012875536 0.8571429
ctt Leu   ctt   2 0.008583691 0.5714286
gaa Glu   gaa   2 0.008583691 0.5714286
gac Asp   gac   1 0.004291845 0.5000000
gag Glu   gag   5 0.021459227 1.4285714
gat Asp   gat   3 0.012875536 1.5000000
gca Ala   gca   5 0.021459227 2.5000000
gcc Ala   gcc   1 0.004291845 0.5000000
gcg Ala   gcg   0 0.000000000 0.0000000
gct Ala   gct   2 0.008583691 1.0000000
gga Gly   gga   4 0.017167382 2.0000000
ggc Gly   ggc   0 0.000000000 0.0000000
ggg Gly   ggg   1 0.004291845 0.5000000
ggt Gly   ggt   3 0.012875536 1.5000000
gta Val   gta   2 0.008583691 1.1428571
gtc Val   gtc   0 0.000000000 0.0000000
gtg Val   gtg   2 0.008583691 1.1428571
gtt Val   gtt   3 0.012875536 1.7142857
taa Stp   taa  11 0.047210300 1.2222222
tac Tyr   tac   4 0.017167382 0.5333333
tag Stp   tag   8 0.034334764 0.8888889
tat Tyr   tat  11 0.047210300 1.4666667
tca Ser   tca   2 0.008583691 0.8571429
tcc Ser   tcc   4 0.017167382 1.7142857
tcg Ser   tcg   1 0.004291845 0.4285714
tct Ser   tct   1 0.004291845 0.4285714
tga Stp   tga   8 0.034334764 0.8888889
tgc Cys   tgc   4 0.017167382 1.1428571
tgg Trp   tgg   6 0.025751073 1.0000000
tgt Cys   tgt   3 0.012875536 0.8571429
tta Leu   tta   4 0.017167382 1.1428571
ttc Phe   ttc   4 0.017167382 0.6153846
ttg Leu   ttg   8 0.034334764 2.2857143
ttt Phe   ttt   9 0.038626609 1.3846154
> 

Thank you again in advance!

2.7 years ago

It seems the uco function only works on one sequence at a time. When you apply it to an object containing multiple sequences, it doesn't know which one to use, so it runs on an empty string.

You can use names(rcds) to print the list of sequences in your fasta file.

Then, you can apply the uco function to each sequence separately, accessing them via rcds$<name of the sequence> or rcds[[1]]. A loop or apply will do the trick to calculate it for all the sequences.
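
The per-sequence approach described above can be sketched roughly as follows. This is only a minimal sketch, assuming the seqinr package is installed. The two toy sequences and the names seqs, seq1, seq2, eff_by_seq and rscu_by_seq are hypothetical stand-ins for the list that read.fasta("virus.fasta") would return.

```r
library(seqinr)  # assumed to be installed

# Hypothetical toy sequences standing in for the list returned by
# read.fasta("virus.fasta"); each element is a vector of single characters.
seqs <- list(
  seq1 = s2c("atgaaaaagtaa"),
  seq2 = s2c("atgttttttttctga")
)

# List the sequence names, as names(rcds) would for a real file.
print(names(seqs))

# Call uco() once per sequence rather than once on the whole list.
eff_by_seq  <- lapply(seqs, uco, index = "eff")
rscu_by_seq <- lapply(seqs, uco, index = "rscu")

# Codon counts for the first sequence, keeping only codons that occur.
print(eff_by_seq$seq1[eff_by_seq$seq1 > 0])
```

With a real file, seqs would be replaced by the object returned by read.fasta, and the results could then be looked up by sequence name, for example rscu_by_seq[["MW856068.1"]].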

Thank you for your comments; I really appreciate the support. However, I have tried both of the options you suggested, and the results remain the same in both cases. One of the attempts is shown below. Could the problem lie in the sequences themselves? You can have a look at the sequence below.

> rcds[[1]]
  [1] "c" "t" "c" "c" "a" "t" "g" "c" "c" "a" "c"
 [12] "c" "a" "c" "a" "a" "a" "c" "c" "a" "c" "a"
 [23] "a" "t" "a" "t" "t" "t" "c" "a" "a" "a" "a"
 [34] "t" "a" "a" "a" "g" "t" "a" "g" "t" "g" "t"
 [45] "t" "c" "t" "t" "t" "a" "g" "a" "t" "a" "t"
 [56] "g" "t" "g" "c" "t" "g" "t" "g" "t" "g" "g"
 [67] "c" "c" "a" "g" "t" "a" "t" "t" "t" "t" "t"
 [78] "t" "t" "a" "g" "c" "a" "a" "g" "a" "g" "c"
 [89] "c" "t" "g" "c" "a" "g" "a" "g" "a" "a" "a"
[100] "t" "t" "g" "g" "a" "g" "t" "a" "g" "a" "c"
[111] "a" "t" "a" "t" "t" "t" "t" "t" "t" "t" "t"
[122] "t" "g" "c" "a" "a" "a" "a" "t" "g" "g" "t"
[133] "t" "t" "a" "a" "g" "t" "t" "t" "t" "t" "c"
[144] "a" "a" "g" "a" "a" "t" "a" "c" "a" "g" "a"
[155] "t" "t" "g" "g" "a" "t" "a" "a" "a" "t" "t"
[166] "a" "g" "g" "t" "t" "g" "t" "t" "g" "a" "c"
[177] "t" "t" "a" "g" "t" "t" "a" "c" "a" "g" "g"
[188] "a" "g" "g" "t" "a" "t" "t" "a" "a" "a" "t"
[199] "a" "t" "t" "a" "t" "g" "t" "a" "g" "a" "c"
[210] "a" "t" "a" "a" "a" "a" "a" "t" "g" "a" "g"
[221] "a" "t" "c" "c" "t" "c" "c" "a" "a" "a" "a"
[232] "a" "a" "a" "t" "a" "a" "a" "c" "a" "a" "c"
[243] "a" "a" "a" "a" "a" "a" "a" "a" "t" "a" "a"
[254] "a" "c" "a" "a" "c" "a" "a" "a" "a" "a" "a"
[265] "a" "a" "a" "t" "a" "t" "g" "t" "t" "t" "a"
[276] "a" "t" "a" "t" "t" "a" "a" "a" "a" "t" "g"
[287] "a" "c" "a" "a" "t" "t" "t" "c" "t" "a" "c"
[298] "a" "t" "t" "g" "c" "t" "t" "a" "t" "t" "g"
[309] "c" "t" "c" "t" "t" "a" "t" "t" "a" "t" "a"
[320] "c" "t" "a" "c" "t" "t" "a" "t" "t" "a" "t"
[331] "t" "a" "t" "t" "t" "t" "a" "g" "t" "a" "g"
[342] "t" "g" "t" "t" "t" "t" "t" "a" "t" "a" "c"
[353] "t" "a" "t" "a" "a" "g" "a" "a" "a" "c" "a"
[364] "a" "c" "a" "a" "c" "c" "a" "c" "c" "g" "a"
[375] "a" "a" "a" "a" "g" "g" "t" "c" "t" "g" "t"
[386] "a" "a" "a" "g" "t" "a" "g" "a" "t" "a" "a"
[397] "a" "g" "a" "t" "t" "g" "t" "g" "g" "t" "a"
[408] "g" "t" "g" "g" "a" "g" "a" "g" "c" "a" "t"
[419] "t" "g" "t" "g" "t" "t" "c" "g" "t" "g" "g"
[430] "a" "t" "c" "a" "t" "g" "t" "a" "g" "c" "t"
[441] "c" "a" "t" "t" "g" "a" "g" "c" "t" "g" "c"
[452] "t" "t" "a" "g" "a" "t" "g" "c" "c" "g" "t"
[463] "a" "a" "a" "a" "a" "t" "g" "g" "a" "c" "a"
[474] "a" "a" "c" "g" "a" "a" "a" "t" "a" "t" "t"
[485] "a" "a" "g" "a" "t" "a" "g" "a" "t" "t" "c"
[496] "t" "a" "a" "g" "a" "t" "t" "t" "c" "c" "t"
[507] "c" "a" "t" "g" "c" "g" "a" "a" "t" "t" "c"
[518] "a" "c" "t" "c" "c" "c" "a" "a" "t" "t" "t"
[529] "t" "t" "a" "c" "c" "g" "t" "t" "t" "t" "a"
[540] "c" "g" "g" "a" "t" "a" "c" "t" "g" "c" "t"
[551] "g" "c" "t" "g" "a" "t" "g" "a" "g" "c" "a"
[562] "g" "c" "a" "a" "g" "a" "a" "t" "t" "t" "g"
[573] "g" "a" "a" "a" "a" "a" "c" "a" "c" "g" "g"
[584] "c" "a" "t" "c" "c" "t" "a" "t" "a" "a" "a"
[595] "a" "a" "t" "a" "a" "c" "t" "c" "c" "a" "t"
[606] "c" "t" "c" "c" "a" "a" "g" "t" "g" "a" "a"
[617] "t" "c" "c" "c" "a" "t" "a" "g" "c" "c" "c"
[628] "c" "c" "a" "a" "g" "a" "g" "g" "t" "g" "t"
[639] "g" "t" "g" "a" "a" "a" "a" "a" "t" "a" "t"
[650] "t" "g" "t" "t" "c" "a" "t" "g" "g" "g" "g"
[661] "a" "a" "c" "c" "g" "a" "t" "g" "a" "c" "t"
[672] "g" "t" "a" "c" "a" "g" "g" "t" "t" "g" "g"
[683] "g" "a" "a" "t" "a" "t" "g" "t" "t" "g" "g"
[694] "t" "g" "a" "t" "g" "a" "a"
attr(,"name")
[1] "MW856068.1"
attr(,"Annot")
[1] ">MW856068.1 African swine fever virus strain MAL/19/Karonga, complete genome"
attr(,"class")
[1] "SeqFastadna"
> uco( rcds, index = "freq")

aaa aac aag aat aca acc acg act aga agc agg agt 
  0   0   0   0   0   0   0   0   0   0   0   0 
ata atc atg att caa cac cag cat cca ccc ccg cct 
  0   0   0   0   0   0   0   0   0   0   0   0 
cga cgc cgg cgt cta ctc ctg ctt gaa gac gag gat 
  0   0   0   0   0   0   0   0   0   0   0   0 
gca gcc gcg gct gga ggc ggg ggt gta gtc gtg gtt 
  0   0   0   0   0   0   0   0   0   0   0   0 
taa tac tag tat tca tcc tcg tct tga tgc tgg tgt 
  0   0   0   0   0   0   0   0   0   0   0   0 
tta ttc ttg ttt 
  0   0   0   0 
> uco( rcds, index = "eff")

aaa aac aag aat aca acc acg act aga agc agg agt 
  0   0   0   0   0   0   0   0   0   0   0   0 
ata atc atg att caa cac cag cat cca ccc ccg cct 
  0   0   0   0   0   0   0   0   0   0   0   0 
cga cgc cgg cgt cta ctc ctg ctt gaa gac gag gat 
  0   0   0   0   0   0   0   0   0   0   0   0 
gca gcc gcg gct gga ggc ggg ggt gta gtc gtg gtt 
  0   0   0   0   0   0   0   0   0   0   0   0 
taa tac tag tat tca tcc tcg tct tga tgc tgg tgt 
  0   0   0   0   0   0   0   0   0   0   0   0 
tta ttc ttg ttt 
  0   0   0   0 
> uco( rcds, index = "rscu")
aaa aac aag aat aca acc acg act aga agc agg agt 
 NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA 
ata atc atg att caa cac cag cat cca ccc ccg cct 
 NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA 
cga cgc cgg cgt cta ctc ctg ctt gaa gac gag gat 
 NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA 
gca gcc gcg gct gga ggc ggg ggt gta gtc gtg gtt 
 NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA 
taa tac tag tat tca tcc tcg tct tga tgc tgg tgt 
 NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA 
tta ttc ttg ttt 
 NA  NA  NA  NA 
> uco( rcds, as.data.frame = TRUE)
     AA codon eff freq RSCU
aaa Lys   aaa   0    0   NA
aac Asn   aac   0    0   NA
aag Lys   aag   0    0   NA
aat Asn   aat   0    0   NA
aca Thr   aca   0    0   NA
acc Thr   acc   0    0   NA
acg Thr   acg   0    0   NA
act Thr   act   0    0   NA
aga Arg   aga   0    0   NA
agc Ser   agc   0    0   NA
agg Arg   agg   0    0   NA
agt Ser   agt   0    0   NA
ata Ile   ata   0    0   NA
atc Ile   atc   0    0   NA
atg Met   atg   0    0   NA
att Ile   att   0    0   NA
caa Gln   caa   0    0   NA
cac His   cac   0    0   NA
cag Gln   cag   0    0   NA
cat His   cat   0    0   NA
cca Pro   cca   0    0   NA
ccc Pro   ccc   0    0   NA
ccg Pro   ccg   0    0   NA
cct Pro   cct   0    0   NA
cga Arg   cga   0    0   NA
cgc Arg   cgc   0    0   NA
cgg Arg   cgg   0    0   NA
cgt Arg   cgt   0    0   NA
cta Leu   cta   0    0   NA
ctc Leu   ctc   0    0   NA
ctg Leu   ctg   0    0   NA
ctt Leu   ctt   0    0   NA
gaa Glu   gaa   0    0   NA
gac Asp   gac   0    0   NA
gag Glu   gag   0    0   NA
gat Asp   gat   0    0   NA
gca Ala   gca   0    0   NA
gcc Ala   gcc   0    0   NA
gcg Ala   gcg   0    0   NA
gct Ala   gct   0    0   NA
gga Gly   gga   0    0   NA
ggc Gly   ggc   0    0   NA
ggg Gly   ggg   0    0   NA
ggt Gly   ggt   0    0   NA
gta Val   gta   0    0   NA
gtc Val   gtc   0    0   NA
gtg Val   gtg   0    0   NA
gtt Val   gtt   0    0   NA
taa Stp   taa   0    0   NA
tac Tyr   tac   0    0   NA
tag Stp   tag   0    0   NA
tat Tyr   tat   0    0   NA
tca Ser   tca   0    0   NA
tcc Ser   tcc   0    0   NA
tcg Ser   tcg   0    0   NA
tct Ser   tct   0    0   NA
tga Stp   tga   0    0   NA
tgc Cys   tgc   0    0   NA
tgg Trp   tgg   0    0   NA
tgt Cys   tgt   0    0   NA
tta Leu   tta   0    0   NA
ttc Phe   ttc   0    0   NA
ttg Leu   ttg   0    0   NA
ttt Phe   ttt   0    0   NA
> 