Changing Mito to Mt and vice versa is too confusing

0

9.4 years ago

zizigolu ★ 4.3k

Hi,

I am using the Saccharomyces Bowtie2Index, genome.fa, and GTF from Ensembl for tophat2, cufflinks, and cuffmerge, but I keep alternating between two errors: one run cannot find Mito in the FASTA, the next cannot find Mt. I am stuck in an endless cycle of changing Mito to Mt or vice versa and realigning for many hours. Could you please suggest a way out of this confusing problem?

Thank you

RNA-Seq • 1.5k views

ADD COMMENT • link updated 2.8 years ago by Ram 45k • written 9.4 years ago by zizigolu ★ 4.3k

1

9.4 years ago

Sean Davis 27k

You could simply change everything to either Mt or Mito before starting or, perhaps better, get all your information from one source (e.g., get both the genome and the GTF from Ensembl). I have found, though, that it is definitely best to do this before starting any analysis.
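One way to see which names disagree before running anything is to compare the sequence names in the FASTA headers with the first column of the GTF. The following is only a minimal sketch: the function names are made up for illustration, and the tiny in-memory records stand in for the real genome.fa and genes.gtf.

```python
import io

def fasta_names(handle):
    """Collect sequence names (first word after '>') from a FASTA stream."""
    names = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names

def gtf_names(handle):
    """Collect seqnames (column 1) from a GTF stream, skipping comments and blanks."""
    names = set()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names

def name_mismatches(fasta_handle, gtf_handle):
    """Return (names only in the GTF, names only in the FASTA), both sorted."""
    fa = fasta_names(fasta_handle)
    gtf = gtf_names(gtf_handle)
    return sorted(gtf - fa), sorted(fa - gtf)

# Hypothetical demo data: the FASTA says Mito, the GTF says Mt.
demo_fasta = io.StringIO(">I\nACGT\n>Mito\nACGT\n")
demo_gtf = io.StringIO(
    'I\tensembl\texon\t1\t4\t.\t+\t.\tgene_id "g1";\n'
    'Mt\tensembl\texon\t1\t4\t.\t+\t.\tgene_id "g2";\n'
)
only_in_gtf, only_in_fasta = name_mismatches(demo_fasta, demo_gtf)
print("only in GTF:", only_in_gtf)
print("only in FASTA:", only_in_fasta)
```

With real files, the same call would take `open("genome.fa")` and `open("genes.gtf")` instead of the demo streams. Two empty lists would mean the names already agree.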

ADD COMMENT • link 9.4 years ago by Sean Davis 27k

0

Thank you. I am using genome.fa, the GTF, and everything else from the Ensembl source, but as I read on Biostars, changing the names is not enough: I am using the pre-built bowtie2 index from Ensembl, so the genome.fa with Mito may already be indexed while I am changing Mito to Mt for use in cuffmerge. On the other hand, I do not know how to build a genome index when using tophat2. Really confusing.

ADD REPLY • link 9.4 years ago by zizigolu ★ 4.3k

0

Sorry, I changed Mt to Mito in both the GTF and genome.fa, but I get this error again:

	[izadi@lbox161 bowtie2-2.2.5]$ $TOP/tophat -p 8 -G genes.gtf -o C1_R1_thout genome SRR1944914.fastq

	[2015-12-22 22:23:46] Beginning TopHat run (v2.1.0)
	-----------------------------------------------
	[2015-12-22 22:23:46] Checking for Bowtie
	Bowtie version: 2.2.5.0
	[2015-12-22 22:23:46] Checking for Bowtie index files (genome)..
	[2015-12-22 22:23:46] Checking for reference FASTA file
	Warning: Could not find FASTA file genome.fa
	[2015-12-22 22:23:46] Reconstituting reference FASTA file from Bowtie index
	Executing: ./bowtie2-inspect genome > C1_R1_thout/tmp/genome.fa
	[2015-12-22 22:23:47] Generating SAM header for genome
	[2015-12-22 22:23:47] Reading known junctions from GTF file
	[2015-12-22 22:23:47] Preparing reads
	left reads: min. length=25, max. length=47, 17505435 kept reads (2 discarded)
	[2015-12-22 22:25:33] Building transcriptome data files C1_R1_thout/tmp/genes
	[2015-12-22 22:25:34] Building Bowtie index from genes.fa
	[2015-12-22 22:25:42] Mapping left_kept_reads to transcriptome genes with Bowtie2
	[2015-12-22 22:31:42] Resuming TopHat pipeline with unmapped reads
	[2015-12-22 22:31:42] Mapping left_kept_reads.m2g_um to genome genome with Bowtie2
	[2015-12-22 22:32:15] Mapping left_kept_reads.m2g_um_seg1 to genome genome with Bowtie2 (1/2)
	[2015-12-22 22:32:22] Mapping left_kept_reads.m2g_um_seg2 to genome genome with Bowtie2 (2/2)
	[2015-12-22 22:32:22] Searching for junctions via segment mapping
	Coverage-search algorithm is turned on, making this step very slow
	Please try running TopHat again with the option (--no-coverage-search) if this step takes too much time or memory.
	[2015-12-22 22:33:14] Retrieving sequences for splices
	[2015-12-22 22:33:15] Indexing splices
	Building a SMALL index
	[2015-12-22 22:33:15] Mapping left_kept_reads.m2g_um_seg1 to genome segment_juncs with Bowtie2 (1/2)
	[2015-12-22 22:33:19] Mapping left_kept_reads.m2g_um_seg2 to genome segment_juncs with Bowtie2 (2/2)
	[2015-12-22 22:33:19] Joining segment hits
	[2015-12-22 22:33:26] Reporting output tracks
	-----------------------------------------------
	[2015-12-22 22:38:51] A summary of the alignment counts can be found in C1_R1_thout/align_summary.txt
	[2015-12-22 22:38:51] Run complete: 00:15:04 elapsed
	[izadi@lbox161 bowtie2-2.2.5]$ cd /usr/data/nfs6/izadi/angel/cufflinks-2.2.1.Linux_x86_64/
	[izadi@lbox161 cufflinks-2.2.1.Linux_x86_64]$ cufflinks -p 8 -o C1_R1_clout accepted_hits.bam
	You are using Cufflinks v2.2.1, which is the most recent release.
	[22:41:31] Inspecting reads and determining fragment length distribution.
	> Processed 15175 loci. [*************************] 100%
	> Map Properties:
	> Normalized Map Mass: 17168817.00
	> Raw Map Mass: 17168817.00
	> Fragment Length Distribution: Truncated Gaussian (default)
	> Default Mean: 200
	> Default Std Dev: 80
	[22:43:12] Assembling transcripts and estimating abundances.
	> Processed 15241 loci. [*************************] 100%
	[izadi@lbox161 cufflinks-2.2.1.Linux_x86_64]$ cuffmerge -g genes.gtf -s genome.fa -p 8 assemblies.txt

	[Tue Dec 22 22:48:46 2015] Beginning transcriptome assembly merge
	-------------------------------------------

	[Tue Dec 22 22:48:46 2015] Preparing output location ./merged_asm/
	[Tue Dec 22 22:48:46 2015] Converting GTF files to SAM
	[22:48:46] Loading reference annotation.
	[Tue Dec 22 22:48:47 2015] Quantitating transcripts
	You are using Cufflinks v2.2.1, which is the most recent release.
	Command line:
	cufflinks -o ./merged_asm/ -F 0.05 -g genes.gtf -q --overhang-tolerance 200 --library-type=transfrags -A 0.0 --min-frags-per-transfrag 0 --no-5-extend -p 8 ./merged_asm/tmp/mergeSam_fileM85AgV
	[bam_header_read] EOF marker is absent. The input is probably truncated.
	[bam_header_read] invalid BAM binary header (this is not a BAM file).
	File ./merged_asm/tmp/mergeSam_fileM85AgV doesn't appear to be a valid BAM file, trying SAM...
	[22:48:47] Loading reference annotation.
	[22:48:47] Inspecting reads and determining fragment length distribution.
	Processed 6417 loci.
	> Map Properties:
	> Normalized Map Mass: 7139.00
	> Raw Map Mass: 7139.00
	> Fragment Length Distribution: Truncated Gaussian (default)
	> Default Mean: 200
	> Default Std Dev: 80
	[22:48:47] Assembling transcripts and estimating abundances.
	Processed 6417 loci.
	[Tue Dec 22 22:48:51 2015] Comparing against reference file genes.gtf
	You are using Cufflinks v2.2.1, which is the most recent release.
	No fasta index found for genome.fa. Rebuilding, please wait..
	Fasta index rebuilt.
	Warning: couldn't find fasta record for 'MT'!
	[Tue Dec 22 22:48:52 2015] Comparing against reference file genes.gtf
	You are using Cufflinks v2.2.1, which is the most recent release.
	Warning: couldn't find fasta record for 'MT'!
	[izadi@lbox161 cufflinks-2.2.1.Linux_x86_64]$

ADD REPLY • link updated 5.4 years ago by Ram 45k • written 9.4 years ago by zizigolu ★ 4.3k

1

You'll have to rebuild your bowtie index after you make any changes to the fasta file.
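As a rough sketch of that rebuild step, `bowtie2-build` takes the FASTA file followed by an index prefix. The Python wrapper below is only illustrative: the function names are invented here, and the paths `genome.fa` and `genome` are assumed from the commands in the log above. Keeping the edited genome.fa next to the index under the same prefix should also avoid the "Could not find FASTA file genome.fa" warning, after which TopHat reconstitutes the FASTA from the old index.

```python
import shutil
import subprocess

def bowtie2_build_command(fasta_path, index_prefix):
    """Return the argument list for: bowtie2-build <fasta> <index prefix>."""
    return ["bowtie2-build", fasta_path, index_prefix]

def rebuild_index(fasta_path="genome.fa", index_prefix="genome"):
    """Rebuild the Bowtie2 index from the (edited) FASTA, if bowtie2-build is on PATH."""
    cmd = bowtie2_build_command(fasta_path, index_prefix)
    if shutil.which(cmd[0]) is None:
        # Assumption: bowtie2-build may not be installed where this runs.
        print("bowtie2-build not found on PATH; would run:", " ".join(cmd))
        return None
    # check=True raises CalledProcessError if the build fails.
    return subprocess.run(cmd, check=True)
```

Calling `rebuild_index()` in the directory that holds the edited genome.fa would write the new index files with the `genome` prefix, which is the prefix passed to tophat in the run above.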

ADD REPLY • link 9.4 years ago by Sean Davis 27k

0

Thank you. I see that I should find another source where I do not have to change genome.fa; changing only the GTF is fine.
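For illustration, renaming only the GTF could look roughly like the sketch below, which leaves genome.fa and the pre-built index untouched. The function name and the `{"Mt": "Mito"}` mapping are assumptions for the demo; the real mapping has to go from whatever name the GTF uses to whatever name the FASTA and index use.

```python
import io

def rename_gtf_seqname(in_handle, out_handle, mapping):
    """Copy a GTF stream, replacing column 1 (seqname) according to mapping."""
    for line in in_handle:
        if line.startswith("#"):
            # Header and comment lines are copied unchanged.
            out_handle.write(line)
            continue
        seqname, sep, rest = line.partition("\t")
        out_handle.write(mapping.get(seqname, seqname) + sep + rest)

# Hypothetical demo records standing in for genes.gtf.
demo_in = io.StringIO(
    "#!genome-build demo\n"
    'Mt\tensembl\texon\t1\t4\t.\t+\t.\tgene_id "g1";\n'
    'I\tensembl\texon\t1\t4\t.\t+\t.\tgene_id "g2";\n'
)
demo_out = io.StringIO()
rename_gtf_seqname(demo_in, demo_out, {"Mt": "Mito"})
renamed = demo_out.getvalue()
print(renamed, end="")
```

Only the first column is touched, so a gene name or attribute that happens to contain "Mt" is not rewritten, which a blind search-and-replace over the whole file could do.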

ADD REPLY • link 9.4 years ago by zizigolu ★ 4.3k
