Merge three assembled files (iterations) into one.
9.8 years ago
arronar ▴ 290

Hi.

In an experiment with three iterations, a gene in a plant was silenced. There is only that one treatment and the wild type (WT).

Mutant_plant_1
Mutant_plant_2
Mutant_plant_3

WT_1
WT_2
WT_3

I took the RNA-seq data and first aligned the reads to the reference genome using TopHat2.

tophat2 -o tophat_output --microexon-search -G genes.gtf Sequence/Bowtie2Index/genome pls1_1.fastq pls1_2.fastq

Then I ran StringTie for the assembly.

stringtie -G genes.gtf -o stringtie_output tophat_output/accepted_hits.bam

I did that for all three iterations, for both the mutant and the WT plants. So now I have three assembly files for the mutant plant and three for the WT.

The next step in the pipeline is to merge those assemblies into one file using cuffmerge.

But before that, I think I have to merge the three files of each treatment (mutant plant and WT) into one, so that I can use them later with cuffmerge. This is where I want your opinion.

Let me give you an example.

Let's say that

in Mutant_plant_1, gene #1 and gene #4 were found
in Mutant_plant_2, gene #1 was found
in Mutant_plant_3, gene #3 was found

So I want to merge those three files into one, so that for the mutant plant in general I have:

gene #1, gene #3, gene #4

Then I will do the same with the WT plants and finally use those files with cuffmerge.

Thank you.

RNA-Seq genome cuffmerge • 2.7k views
9.8 years ago
merodev ▴ 150

I haven't used StringTie. What you might want to do is merge all the transcripts (from both wild type and mutant) into a single file with cuffmerge. This gives you a full list of transcripts. You can then use cuffdiff to get the differentially expressed genes, following the pipeline.
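As a rough sketch of that suggestion: cuffmerge takes a text file listing one assembly GTF per line, so all six assemblies can go into a single run. The per-sample paths (`<sample>/stringtie_output.gtf`), the genome FASTA name `genome.fa`, and the output directory `merged_asm` are assumptions for illustration, not taken from the question. The merge itself only runs if cuffmerge and the input files are actually present.

```shell
set -euo pipefail

# Sample names from the question; the per-sample GTF paths are hypothetical.
samples=(Mutant_plant_1 Mutant_plant_2 Mutant_plant_3 WT_1 WT_2 WT_3)

# cuffmerge reads a manifest: a text file with one GTF path per line.
: > assemblies.txt
for s in "${samples[@]}"; do
    echo "${s}/stringtie_output.gtf" >> assemblies.txt
done

# Run the merge only if cuffmerge is installed and the inputs exist.
if command -v cuffmerge >/dev/null 2>&1 && [ -f "${samples[0]}/stringtie_output.gtf" ]; then
    # -g: reference annotation, -s: genome FASTA (assumed name), -o: output directory
    cuffmerge -g genes.gtf -s genome.fa -o merged_asm assemblies.txt
else
    echo "cuffmerge or input GTFs not found; wrote assemblies.txt only" >&2
fi
```

If the run succeeds, the merged annotation should end up as `merged_asm/merged.gtf`, which is the file the later cuffdiff step would take.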


It has nothing to do with StringTie. Before merging the wild type with the mutant, I thought it might be good to merge all the mutant plant files into one and all the wild type files into one, and then run cuffmerge on those two files. Are you sure that cuffmerge takes six files as input?

7.8 years ago
Fluorine ▴ 100

This is an old question, but it might be useful for someone else with the same problem. For StringTie merge, you need to merge all the samples, regardless of whether you have treated/untreated or healthy/cancer samples. What it does is build a sort of reference GTF file containing all the expressed transcripts in your dataset, which you then use in the next step, for each individual sample, to call differential expression.
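A minimal sketch of those two steps with StringTie, assuming each sample has its own directory holding `stringtie_output.gtf` and `tophat_output/accepted_hits.bam` (that layout and the output names are made up for the example). `--merge` builds the combined GTF from all six samples, and `-e` then restricts each per-sample run to the merged transcripts; `-B` additionally writes Ballgown table files.

```shell
set -euo pipefail

# Sample names from the question; the directory layout is hypothetical.
samples=(Mutant_plant_1 Mutant_plant_2 Mutant_plant_3 WT_1 WT_2 WT_3)

# One GTF path per line, mutant and WT samples together.
printf '%s/stringtie_output.gtf\n' "${samples[@]}" > mergelist.txt

if command -v stringtie >/dev/null 2>&1 && [ -f "${samples[0]}/stringtie_output.gtf" ]; then
    # Step 1: build the unified transcript set across all samples.
    stringtie --merge -G genes.gtf -o stringtie_merged.gtf mergelist.txt

    # Step 2: re-estimate abundances per sample against the merged set.
    for s in "${samples[@]}"; do
        stringtie -e -B -G stringtie_merged.gtf \
            -o "${s}/reestimated.gtf" "${s}/tophat_output/accepted_hits.bam"
    done
else
    echo "stringtie or input GTFs not found; wrote mergelist.txt only" >&2
fi
```

The point of the design is that every sample is quantified against the same transcript set, so the mutant and WT numbers are directly comparable.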
