Recently I updated my TopHat pipeline to tophat2 and bowtie2. I was working on DEGs in Oryza sativa (japonica) using Illumina paired-end reads and got the error "Error: gtf_to_fasta returned an error" from the following command:
tophat2 --GTF all.gff3.txt -o result2 riceindex 5D-con1_1_val_1.fq 5D-con1_2_val_2.fq
From previous posts, I learned that:
- Inconsistencies in chromosome naming between the GTF file and the index are one possible reason. But my GFF3 file from RGAP (ftp://ftp.plantbiology.msu.edu/pub/data/Eukaryotic_Projects/o_sativa/annotation_dbs/pseudomolecules/version_7.0/all.dir/all.gff3) worked well with the older version of TopHat.
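One way to test the naming hypothesis directly is to compare column 1 of the annotation with the sequence names in the FASTA file the index was built from. The following is a minimal sketch, assuming an uncompressed FASTA and a tab-separated GFF3/GTF; the function names are made up for this example and are not part of TopHat or bowtie2.

```python
def fasta_names(path):
    """Return the set of sequence names in a FASTA file.

    The name is taken as the text after '>' up to the first whitespace.
    """
    names = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    names.add(fields[0])
    return names


def gff_seqids(path):
    """Return the set of values found in column 1 of a GFF3/GTF file."""
    seqids = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # embedded sequence section: no feature lines follow
            if not line.strip() or line.startswith("#"):
                continue  # skip blank lines, directives and comments
            seqids.add(line.split("\t", 1)[0])
    return seqids


def compare_names(fasta_path, gff_path):
    """Return (names only in the annotation, names only in the FASTA)."""
    fa = fasta_names(fasta_path)
    gff = gff_seqids(gff_path)
    return sorted(gff - fa), sorted(fa - gff)
```

Calling `compare_names("genome.fa", "all.gff3.txt")` (where `genome.fa` is a placeholder for whichever FASTA was used to build `riceindex`) returns two lists; if the first one is not empty, the annotation refers to sequences the index does not contain. If the original FASTA is no longer at hand, `bowtie2-inspect --names riceindex` should list the names stored in the index.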
Another possibility suggested was orphan reads after adapter removal. But I used Trim Galore (a wrapper around cutadapt) for this purpose, as below:
trim_galore --paired --phred33 -q 20 -a adapterseqence --stringency 5 -e 0.1 -t -r1 35 -r2 35 set1_1.fq set2_2.fq
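To rule out orphan reads, the two trimmed files can be checked for matching read order. In paired mode the `_val_1`/`_val_2` outputs are expected to stay in sync, so this is only a sanity check. The sketch below assumes uncompressed four-line FASTQ records and read names that differ at most by a trailing `/1` or `/2`; the function names are hypothetical.

```python
from itertools import zip_longest


def read_ids(path):
    """Yield the read name from each four-line FASTQ record."""
    with open(path) as handle:
        for i, line in enumerate(handle):
            # Only every fourth line (0, 4, 8, ...) is a header line.
            if i % 4 != 0 or not line.strip():
                continue
            fields = line[1:].split()  # drop '@' and any comment field
            name = fields[0] if fields else ""
            if name.endswith(("/1", "/2")):
                name = name[:-2]  # strip the mate suffix
            yield name


def first_mismatch(fq1, fq2):
    """Return the 1-based record number where the mates stop matching.

    Returns None when both files hold the same read names in the same order.
    """
    pairs = zip_longest(read_ids(fq1), read_ids(fq2))
    for n, (a, b) in enumerate(pairs, start=1):
        if a != b:
            return n
    return None
```

For example, `first_mismatch("5D-con1_1_val_1.fq", "5D-con1_2_val_2.fq")` returning `None` would suggest that orphan reads are not the cause of the error.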
Another post suggested that an incompatible index and annotation file can also cause this problem. I used IRGSP build 5 to build the index. Can anyone please tell me what I am doing wrong? Also, is there a better source of files for rice RNA-Seq analysis?
If you are able to use the sequence/annotation/aligner index bundle for rice from the iGenomes site, you may be able to avoid this sort of issue: http://support.illumina.com/sequencing/sequencing_software/igenome.html
Thanks for the reply, @genomax2. I checked the link you provided, but the files look like older versions. As far as I know, MSU is now at version 7 and IRGSP build 5 is available. Do you have any comments?
Does either of those locations make a package (sequence/annotations) available? If so, I would choose that and then build your own aligner indexes for bowtie2.
Thanks @genomax2. I will give it a try!