I aligned many FASTQ files to GRCh38 with HISAT2, and this ran without problems.
In the next step I am running StringTie with the Gencode 27 GTF to find novel transcripts and their counts:
stringtie Donor1_IL2OKT3ZA.HISAT2.sort.bam -G /illumina/runs/RNASeq/Gencode27/gencode.v27.annotation.gtf -A try.tab -p 4 > stringtie.out 2> stringtie.err
However, I get this warning:
WARNING: no reference transcripts were found for the genomic sequences where reads were mapped!
Please make sure the -G annotation file uses the same naming convention for the genome sequences.
Why doesn't StringTie recognize the Gencode annotation? Do I have to do something to the Gencode data?
Update: BAM files from STAR work with this StringTie command, but HISAT2 output doesn't. Strange.
The third column of StringTie's output file try.tab (cut -f 3 try.tab | sort | uniq) looks like this:
703404669@ssxfisctimga004:~/RNASeq_benchmark/GSE96075/HISAT2$ cut -f 3 try.tab | sort | uniq
1
10
11
12
13
14
15
16
17
18
19
2
20
21
22
3
4
5
6
7
8
9
GL000008.2
GL000009.2
GL000194.1
GL000205.2
GL000214.1
GL000218.1
GL000219.1
GL000220.1
GL000221.1
GL000224.1
KI270442.1
KI270706.1
KI270711.1
KI270713.1
KI270721.1
KI270733.1
KI270734.1
KI270742.1
KI270744.1
KI270745.1
MT
Reference
X
Y
chr1
chr10
chr11
chr12
chr13
chr14
chr15
chr16
chr17
chr18
chr19
chr2
chr20
chr21
chr22
chr3
chr4
chr5
chr6
chr7
chr8
chr9
chrM
chrX
chrY
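The listing above contains both un-prefixed names (1 ... MT) and chr-prefixed names (chr1 ... chrM), which would fit the BAM and the -G annotation using different naming conventions, as the warning suggests. A minimal sketch for checking this directly is below. The function names and the header.sam file are made up for illustration, and the sketch assumes the BAM header was saved first with samtools view -H.

```shell
# Sketch only: function names are hypothetical, not part of any tool.
# Assumes the BAM header was saved beforehand, e.g.
#   samtools view -H Donor1_IL2OKT3ZA.HISAT2.sort.bam > header.sam

# Sequence names declared in the @SQ lines of a SAM header file.
sam_header_seqnames() {
    grep '^@SQ' "$1" | tr '\t' '\n' | grep '^SN:' | sed 's/^SN://' | sort -u
}

# Sequence names used in column 1 of a GTF/GFF file (comment lines skipped).
gtf_seqnames() {
    grep -v '^#' "$1" | cut -f 1 | sort -u
}

# Names present in both files; empty output means they share no names.
shared_seqnames() {
    _hdr_names=$(mktemp)
    _gtf_names=$(mktemp)
    sam_header_seqnames "$1" > "$_hdr_names"
    gtf_seqnames "$2" > "$_gtf_names"
    comm -12 "$_hdr_names" "$_gtf_names"
    rm -f "$_hdr_names" "$_gtf_names"
}
```

Calling shared_seqnames header.sam gencode.v27.annotation.gtf would then print only the names the two files have in common. If that prints nothing for the main chromosomes, StringTie has no reference transcripts to match against the mapped reads.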
UPDATE: I have found several other instances of this error, but no one ever addressed how to solve this:
https://github.com/gpertea/stringtie/issues/113
Warning encountered while transcript abundance estimation using stringtie
Did you use the appropriate gencode HISAT2 index during mapping? In other words did you create a HISAT2 index using the gencode reference genome fasta file?
With which option can I link the Gencode GTF? None of the options I can see offer this.
Are you sure the path to the Gencode GTF annotation is specified correctly?
Yes, the GTF file exists and is readable.
Please, paste the output of
cut -f 3 yourfile.gff | sort | uniq
@Macspider thanks, I've updated the question.
That doesn't look at all like a GFF file, it should contain:
or other stuff like that!
Hi Macspider, I'm getting the same error with the Gencode GFF as I am with the Gencode GTF.
Yes, but what you pasted is not at all the third column of a GFF/GTF file!
http://www.ensembl.org/info/website/upload/gff.html#fields
Hi Macspider, that's the output from stringtie, not the input. The input GTF and GFF were downloaded from Gencode.
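If the mismatch is only the chr prefix (Gencode writes chr1 ... chrM, while the HISAT2-derived names in try.tab are 1 ... MT), one possible workaround is to rewrite column 1 of the GTF so it matches the BAM. The sketch below does that under this assumption; gtf_strip_chr is a made-up helper name, and the other route is to build the HISAT2 index from the Gencode genome FASTA, as asked about in the comment above.

```shell
# Sketch only: gtf_strip_chr is a hypothetical helper, not part of StringTie or HISAT2.
# Rewrites column 1 of a GTF: chrM becomes MT, any other leading "chr" is dropped.
# Comment lines (starting with #) and names without the prefix pass through unchanged.
gtf_strip_chr() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^#/  { print; next }
         { if ($1 == "chrM") $1 = "MT"; else sub(/^chr/, "", $1); print }' "$1"
}
```

For example, gtf_strip_chr gencode.v27.annotation.gtf > gencode.v27.nochr.gtf (the output name is arbitrary) would give a file to pass to stringtie -G in place of the original. It is worth confirming that the renamed sequences really are the same assembly before trusting the counts.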