How to transfer GRanges object into fasta
1
0
Entering edit mode
4.7 years ago
anran04100 • 0

I've got a GRanges object , but I don't know how to transfer it into fasta with tx_name. I tried to extract sequence using writeXStringSet(), but the result didn't have the tx_name. Can anybody tell me how to solve this?

seq <- getSeq(Hsapiens,gr)
writeXStringSet(seq,file="result")

but the result didn't have the tx_name, like follows > ATCGATCG

I want to have the tx_name as follows

ENST00000252835.5 ATCGATCG

R • 3.3k views
ADD COMMENT
0
Entering edit mode

Can you show some content of that GR?

ADD REPLY
0
Entering edit mode
GRanges object with 10 ranges and 3 metadata columns:
       seqnames            ranges strand |     tx_id            tx_name
          <Rle>         <IRanges>  <Rle> | <integer>        <character>
   [1]    chr22 15528159-15529139      + |    214568  ENST00000252835.5
   [2]    chr22 15528192-15529139      + |    214569  ENST00000643195.1
   [3]    chr22 15690078-15721631      + |    214573  ENST00000621704.4
   [4]    chr22 15690026-15721631      + |    214572 ENST00000343518.11
   [5]    chr22 15690246-15721522      + |    214574  ENST00000452800.1
   [6]    chr22 15690078-15721631      + |    214573  ENST00000621704.4
   [7]    chr22 15690026-15721631      + |    214572 ENST00000343518.11
   [8]    chr22 15690078-15721631      + |    214573  ENST00000621704.4
   [9]    chr22 15690246-15721522      + |    214574  ENST00000452800.1
  [10]    chr22 15690026-15721631      + |    214572 ENST00000343518.11

I wanna to get a fasta file as follows

>ENST00000252835.5
ATGTGTCCCTTGACCTTGCAGGTCACTGGCCTAATGAATGTCTCTGAGCCAAATTCCAGCTTTGCTTTTGTAAATGAATTTATACTCCAAGGTTTCTCTTGTGAGTGGACAATTCAGATCTTCCTCTTCTCACTCTTTACTACAACATATGCACTGACTATAACAGGGAATGGAGCCATTGCTTTTGTCCTGTGGTGTGACCGGCGACTTCACACTCCCATGTACATGTTCCTGGGAAATTTCTCCTTTTTAGAGATATGGTATGTCTCTTCTACAGTTCCCAAGATGTTGGTCAACTTCCTTTCAGAGAAAAAAAACATCTCCTTTGCTGGATGTTTTCTCCAGTTTTATTTCTTCTTCTCTTTGGGTACATCAGAATGCTTGCTTTTGACTGTGATGGCCTTTGATCAGTACCTTGCTATCTGCCGTCCCTTGCTCTATCCTAATATCATGACTGGGCATCTCTATGCCAAACTGGTCATACTGTGCTGGGTTTGTGGATTTCTGTGGTTCCTGATCCCCATTGTTCTCATCTCTCAGATGCCCTTCTGTGGCCCAAACATTATTGACCATGTTGTGTGTGACCCAGGGCCACGATTTGCATTGGATTGTGTTTCTGCCCCAAGAATCCAACTGTTTTGCTACACTCTAAGCTCATTAGTTATTTTTGGTAACTTCCTCTTTATTATTGGATCCTATACTCTTGTCCTGAAAGCTATGTTGGGTATGCCTTCAAGCACTGGGAGACATAAGGCCTTCTCTACCTGTGGGTCTCATTTGGCTGTGGTATCACTGTGCTATAGCTCTCTTATGGTCATGTATGTGAGCCCAGGACTCGGACATTCTACAGGGATGCAGAAAATTGAAACTTTGTTCTATGCTATGGTGACCCCACTCTTCAATCCCCTTATCTATAGCCTCCAGAATAAGGAGATAAAGGCAGCCCTGAGGAAAGTTCTGGGGAGTTCCAACATAATCTAA
>ENST00000643195.1
ATGAATGTCTCTGAGCCAAATTCCAGCTTTGCTTTTGTAAATGAATTTATACTCCAAGGTTTCTCTTGTGAGTGGACAATTCAGATCTTCCTCTTCTCACTCTTTACTACAACATATGCACTGACTATAACAGGGAATGGAGCCATTGCTTTTGTCCTGTGGTGTGACCGGCGACTTCACACTCCCATGTACATGTTCCTGGGAAATTTCTCCTTTTTAGAGATATGGTATGTCTCTTCTACAGTTCCCAAGATGTTGGTCAACTTCCTTTCAGAGAAAAAAAACATCTCCTTTGCTGGATGTTTTCTCCAGTTTTATTTCTTCTTCTCTTTGGGTACATCAGAATGCTTGCTTTTGACTGTGATGGCCTTTGATCAGTACCTTGCTATCTGCCGTCCCTTGCTCTATCCTAATATCATGACTGGGCATCTCTATGCCAAACTGGTCATACTGTGCTGGGTTTGTGGATTTCTGTGGTTCCTGATCCCCATTGTTCTCATCTCTCAGATGCCCTTCTGTGGCCCAAACATTATTGACCATGTTGTGTGTGACCCAGGGCCACGATTTGCATTGGATTGTGTTTCTGCCCCAAGAATCCAACTGTTTTGCTACACTCTAAGCTCATTAGTTATTTTTGGTAACTTCCTCTTTATTATTGGATCCTATACTCTTGTCCTGAAAGCTATGTTGGGTATGCCTTCAAGCACTGGGAGACATAAGGCCTTCTCTACCTGTGGGTCTCATTTGGCTGTGGTATCACTGTGCTATAGCTCTCTTATGGTCATGTATGTGAGCCCAGGACTCGGACATTCTACAGGGATGCAGAAAATTGAAACTTTGTTCTATGCTATGGTGACCCCACTCTTCAATCCCCTTATCTATAGCCTCCAGAATAAGGAGATAAAGGCAGCCCTGAGGAAAGTTCTGGGGAGTTCCAACATAATCTAA
ADD REPLY
0
Entering edit mode

Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.
code_formatting

Thank you!

ADD REPLY
0
Entering edit mode
4.7 years ago

You just need to make sure the names are append to the seq object. This can be done easily using: names(seq) <- gr$tx_name

library(GenomicFeatures)
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
library(BSgenome.Hsapiens.UCSC.hg38)

Hsapiens <- BSgenome.Hsapiens.UCSC.hg38

## random tx subset
tx <- transcripts(TxDb.Hsapiens.UCSC.hg38.knownGene)
gr <- tx[sample(seq(length(tx)),5)]

## extract sequence
seq <- getSeq(Hsapiens,gr)

## add names
names(seq) <- gr$tx_name

writeXStringSet(seq,file="result")
ADD COMMENT
0
Entering edit mode

Thanks for your generous help!

ADD REPLY

Login before adding your answer.

Traffic: 1770 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6