Computing tri/di/mononucleotide context per region of the human genome
4.2 years ago

I am looking for help computing the nucleotide context of the human genome (hg38) per region along the chromosomes, preferably in a form like this:

GRanges object with 309581 ranges and 98 metadata columns:

       seqnames            ranges strand |       AAA       AAC       AAG       AAT       ACA       ACC       ACG       ACT
          <Rle>         <IRanges>  <Rle> | <numeric> <numeric> <numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
   [1]        1           1-10000      + |         0         0         0         0         0         0         0         0
   [2]        1       10001-20000      + |    0.0104     0.015    0.0152    0.0056    0.0138    0.0224    0.0044    0.0106
   [3]        1       20001-30000      + |    0.0235    0.0127    0.0186    0.0154    0.0193    0.0155    0.0054    0.0147
   [4]        1       30001-40000      + |    0.0309    0.0104    0.0224    0.0174    0.0188    0.0129    0.0027    0.0127
   [5]        1       40001-50000      + |    0.0613    0.0217    0.0242    0.0347      0.03    0.0101    0.0025    0.0181

But for all tri-, di-, and mononucleotide contexts. Any ideas?

genome nucleotide context • 830 views

How did you get the output above? What's wrong with this package?


This is from the fishHook package. Nothing is wrong with it; it just does not show how to generate the nucleotide context for hg38.

http://mskilab.com/fishHook/tutorial.html

I've also asked the authors.

4.2 years ago
JC 13k

Just search for k-mer analysis tools; there are multiple packages and programs, such as compseq in EMBOSS, or one I wrote in Perl some time ago,
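For anyone who wants to see what such a k-mer count per region amounts to, here is a minimal sketch in plain Python. It is not the fishHook method and not compseq; the function names `kmer_frequencies` and `context_per_bin` are made up for illustration. It assumes that frequencies are fractions of the valid A/C/G/T k-mers inside each bin (so an all-N bin gives zeros), that only the forward strand is counted, and that k-mers spanning a bin boundary are ignored. Loading a chromosome sequence from an hg38 FASTA is left out.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k):
    """Return {kmer: fraction} for every A/C/G/T k-mer of length k in seq.

    k-mers containing any other character (e.g. N) are skipped, so an
    all-N region yields 0.0 for every k-mer. (Assumption: fractions are
    relative to the number of valid k-mers, not to the region length.)
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[km] for km in kmers)
    return {km: (counts[km] / total if total else 0.0) for km in kmers}


def context_per_bin(seq, bin_size=10000, ks=(1, 2, 3)):
    """Yield (start, end, freqs) for consecutive bins of one sequence.

    Coordinates are 1-based and inclusive, like the ranges in the question.
    k-mers that straddle two bins are not counted in either bin.
    """
    for start in range(0, len(seq), bin_size):
        chunk = seq[start:start + bin_size]
        freqs = {}
        for k in ks:
            freqs.update(kmer_frequencies(chunk, k))
        yield start + 1, start + len(chunk), freqs


# Toy example: one bin of Ns followed by one bin of real sequence.
toy = "N" * 10 + "ACGTACGTAC"
for start, end, freqs in context_per_bin(toy, bin_size=10, ks=(1, 3)):
    print(start, end, freqs["A"], freqs["ACG"])
# prints:
# 1 10 0.0 0.0
# 11 20 0.3 0.25
```

With `ks=(1, 2, 3)` each bin gets 4 + 16 + 64 = 84 values, one per mono-, di-, and trinucleotide. For a whole genome this pure-Python loop will be slow; it is only meant to show the counting logic, and a dedicated k-mer tool is the more practical route.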
