Question

How to find the sequence of a new gene found using RNA-seq and Cufflinks

0

6.8 years ago

mra8187 ▴ 20

Dear all, I have done an RNA-seq project and have some questions about Cufflinks and the related programs that work with it, such as Cuffmerge.

After running the Cufflinks package we get files such as cds.diff, gene_exp.diff, and others, which contain these columns:

test_id, gene_id, gene, locus, sample_1, sample_2, status, value_1, value_2, log2(fold_change), test_stat, p_value, q_value, significant

XLOC_000302, XLOC_000302, -, 1:9748739-9749918, D, Q, OK, 1.35346, 25.6511, 4.2443, 4.96161, 5.00E-05, 0.000162672, yes

My question is: how can I find the sequence of this differentially expressed gene?

The sequences of these genes are really important to me.

Thanks, all.

mohamadreza

rna-seq • 1.4k views

ADD COMMENT • link updated 6.7 years ago by Biostar 20 • written 6.8 years ago by mra8187 ▴ 20

1

See the section on "Extracting transcript sequences" here.

ADD REPLY • link 6.8 years ago by GenoMax 148k

1

In this command: gffread -w transcripts.fa -g /path/to/genome.fa transcripts.gtf

transcripts.fa: is this my raw RNA-seq data?

transcripts.gtf: is this a GTF file that I download from the internet, or the file that I get from Cufflinks?

And how do I get the output file?

Thanks for your answers.

ADD REPLY • link 6.8 years ago by mra8187 ▴ 20

1

The file name given to -w is the output file; it will contain the spliced exons for each transcript. transcripts.gtf is the file that contains the XLOC ID you are interested in. If you only want one XLOC ID, you could make a subset file.
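One way to make such a subset, as a minimal sketch. It assumes the GTF carries the XLOC ID in a gene_id "XLOC_..." attribute (as the merged GTF from Cuffmerge normally does); the file names and the two demo records are made up so the snippet runs on its own.

```shell
# Tiny made-up GTF standing in for the real Cufflinks/Cuffmerge output (hypothetical records).
printf '1\tCufflinks\texon\t9748739\t9749918\t.\t+\t.\tgene_id "XLOC_000302"; transcript_id "TCONS_00000501"; exon_number "1";\n' > merged_demo.gtf
printf '1\tCufflinks\texon\t100\t200\t.\t+\t.\tgene_id "XLOC_000001"; transcript_id "TCONS_00000001"; exon_number "1";\n' >> merged_demo.gtf

# Keep only the lines whose gene_id attribute is the XLOC ID of interest.
# -F treats the pattern as a fixed string; the closing quote avoids matching longer IDs.
grep -F 'gene_id "XLOC_000302"' merged_demo.gtf > XLOC_000302.gtf

# The subset can then be passed to gffread in place of transcripts.gtf, for example:
# gffread -w XLOC_000302.fa -g /path/to/genome.fa XLOC_000302.gtf
```

With a real file, replace merged_demo.gtf with your own GTF and skip the two printf lines.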

ADD REPLY • link 6.8 years ago by GenoMax 148k

1

transcripts.fa - output sequences in FASTA format.
transcripts.gtf - transcripts of interest from the analysis.
reference_sequence.fa - reference sequence in FASTA format. Index the genome sequence before you proceed. Example code:

$ samtools faidx reference_sequence.fa

Then, on Linux, try:

$ gffread -w transcripts.fa -g reference_sequence.fa transcripts.gtf
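If you want the sequences of all significant genes rather than one, a rough sketch of the filtering step could look like this. It assumes a tab-delimited gene_exp.diff with the column order shown in the question (gene_id in column 2, significant in column 14) and a GTF whose gene_id attributes are the same XLOC IDs; the file names and demo rows here are made up for illustration.

```shell
# Made-up miniature of a tab-delimited gene_exp.diff (header plus two rows).
printf 'test_id\tgene_id\tgene\tlocus\tsample_1\tsample_2\tstatus\tvalue_1\tvalue_2\tlog2(fold_change)\ttest_stat\tp_value\tq_value\tsignificant\n' > gene_exp_demo.diff
printf 'XLOC_000302\tXLOC_000302\t-\t1:9748739-9749918\tD\tQ\tOK\t1.35346\t25.6511\t4.2443\t4.96161\t5.00E-05\t0.000162672\tyes\n' >> gene_exp_demo.diff
printf 'XLOC_000001\tXLOC_000001\t-\t1:100-200\tD\tQ\tOK\t3.1\t3.2\t0.05\t0.1\t0.9\t0.95\tno\n' >> gene_exp_demo.diff

# Made-up GTF with one exon per gene (hypothetical records).
printf '1\tCufflinks\texon\t9748739\t9749918\t.\t+\t.\tgene_id "XLOC_000302"; transcript_id "TCONS_00000501";\n' > merged_demo.gtf
printf '1\tCufflinks\texon\t100\t200\t.\t+\t.\tgene_id "XLOC_000001"; transcript_id "TCONS_00000001";\n' >> merged_demo.gtf

# Turn every significant gene_id into a fixed-string pattern of the form: gene_id "XLOC_..."
awk -F'\t' 'NR > 1 && $14 == "yes" { print "gene_id \"" $2 "\"" }' gene_exp_demo.diff | sort -u > significant_patterns.txt

# Keep only the GTF lines that contain one of those patterns.
grep -F -f significant_patterns.txt merged_demo.gtf > significant.gtf

# The filtered GTF can then be used with gffread as above, for example:
# gffread -w significant.fa -g reference_sequence.fa significant.gtf
```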

ADD REPLY • link 6.8 years ago by cpad0112 21k