Good day, I've been trying to get the Tuxedo pipeline for RNA-seq data analysis to work with my custom GTF, but I haven't succeeded yet. I have tried many things and read many posts, still nothing. I want to look at differential expression of lncRNAs. Some of them are known, some are unknown. I've had success with HTSeq --> DESeq and edgeR, but I want to corroborate my results with cuffdiff.
The GTF with only the known lncRNAs looks like this (it should work but doesn't):
chr1 unknown exon 12567300 12567451 0 + 0 gene_id "SNORA59A"; gene_name "SNORA59A"; transcript_id "NR_003025_1"; tss_id "TSS3103";
chr1 unknown exon 24171572 24172345 0 - 0 gene_id "FUCA1"; gene_name "FUCA1"; p_id "P16785"; transcript_id "NM_000147"; tss_id "TSS3487";
chr1 unknown stop_codon 24172205 24172207 0 - 0 gene_id "FUCA1"; gene_name "FUCA1"; p_id "P16785"; transcript_id "NM_000147"; tss_id "TSS3487";
chr1 unknown CDS 24172208 24172345 0 - 0 gene_id "FUCA1"; gene_name "FUCA1"; p_id "P16785"; transcript_id "NM_000147"; tss_id "TSS3487";
..
..
..
But TopHat returns:
format: fastq
quality scale: phred33 (default)
[2015-07-07 08:59:30] Reading known junctions from GTF file
Warning: TopHat did not find any junctions in GTF file
[2015-07-07 08:59:30] Preparing reads
left reads: min. length=20, max. length=50, 28359825 kept reads (14 discarded)
right reads: min. length=20, max. length=50, 28359834 kept reads (5 discarded)
[2015-07-07 09:05:45] Building transcriptome data files..
[2015-07-07 09:06:01] Building Bowtie index from lncRNA_toUse.fa
[FAILED]
Error: Couldn't build bowtie index with err = 1
My first goal is to fix this. Then I will have to try with my unknown lncRNAs as well, for which the only format I have is:
chr1 unknown exon 47562325 47644943 0 + 0 gene_id "CYP4A22-AS1"; gene_name "CYP4A22-AS1";
It is worth noting that with the "official" Genes.gtf, the pipeline works. I am puzzled, and thank you very much for any insight.
Have a great day,
J.
Just as a total guess, does it have something to do with lncRNA_toUse.fa? (as in, does it use the same labels as the gtf file)
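If you want to test that guess, here is a minimal Python sketch that compares the sequence names in the first column of a GTF with the header names of a FASTA file. The function names are mine, and which FASTA is the right one to compare against (lncRNA_toUse.fa or the genome FASTA behind your Bowtie index) is an assumption you would have to settle for your own setup.

```python
def fasta_names(path):
    """Return the set of sequence names (first word after '>') in a FASTA file."""
    names = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                header = line[1:].split()
                if header:
                    names.add(header[0])
    return names


def gtf_seqnames(path):
    """Return the set of values found in the first column of a GTF file."""
    names = set()
    with open(path) as handle:
        for line in handle:
            # Skip comment lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            # Splitting on any whitespace works for tab- or space-separated files.
            names.add(line.split(None, 1)[0])
    return names


def compare_names(fasta_path, gtf_path):
    """Report which sequence names are shared and which appear in only one file."""
    fa = fasta_names(fasta_path)
    gtf = gtf_seqnames(gtf_path)
    return {
        "shared": sorted(fa & gtf),
        "only_in_gtf": sorted(gtf - fa),
        "only_in_fasta": sorted(fa - gtf),
    }
```

Calling `compare_names("lncRNA_toUse.fa", "lncRNA_toUse.gtf")` (the GTF file name is a placeholder) returns the three lists. Anything listed under `only_in_gtf` is a label the FASTA does not know about, for example `chr1` in one file versus `1` in the other.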
Might be the same problem as this: https://biostar.usegalaxy.org/p/10046/
TopHat creates a log file and stores the exact commands that it uses. Try to run the last one and you'll (presumably) get the actual error that bowtie2 produces.
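While you are at it, it may be worth checking the GTF itself, because of the "did not find any junctions" warning. As far as I know, TopHat takes known junctions from transcripts that have more than one exon, and the Cufflinks tools expect every record to carry a transcript_id, which your unknown-lncRNA line does not have. Here is a rough Python sketch (the function names are made up for illustration) that counts exons per transcript and flags exon records with no transcript_id.

```python
import re
from collections import defaultdict

# Matches attribute pairs of the form: key "value"
ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"')


def parse_attributes(field):
    """Turn a GTF attribute column into a dict of key -> value."""
    return dict(ATTR_RE.findall(field))


def summarize_gtf(lines):
    """Count exons per transcript and exon records lacking a transcript_id."""
    exons_per_transcript = defaultdict(int)
    exons_without_transcript_id = 0
    for line in lines:
        # Skip comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        # Columns 1-8 contain no whitespace, so this works whether the
        # file is tab- or space-separated; column 9 keeps its spaces.
        cols = line.split(None, 8)
        if len(cols) < 9 or cols[2] != "exon":
            continue
        transcript_id = parse_attributes(cols[8]).get("transcript_id")
        if not transcript_id:
            exons_without_transcript_id += 1
            continue
        exons_per_transcript[transcript_id] += 1
    multi_exon = sum(1 for n in exons_per_transcript.values() if n > 1)
    return {
        "transcripts": len(exons_per_transcript),
        "multi_exon_transcripts": multi_exon,
        "exons_without_transcript_id": exons_without_transcript_id,
    }
```

You can pass it an open file, e.g. `summarize_gtf(open("lncRNA_toUse.gtf"))` (placeholder file name). If `multi_exon_transcripts` comes back as 0, that would at least be consistent with the junction warning, although it does not by itself explain why the Bowtie index build fails.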