Subread featureCounts produces zero percent successful alignment
2.9 years ago

Dear All, I am working on a rice transcriptome analysis. I aligned the reads with HISAT2, which reported an alignment rate greater than 80%. I then generated the count matrix using the Subread package.

Subread command

/apps/subread-1.6.2-source/bin/featureCounts -p -B -a all.gtf -o counts  *bam

http://rice.uga.edu/pub/data/Eukaryotic_Projects/o_sativa/annotation_dbs/pseudomolecules/version_7.0/all.dir/

Reference FASTA file

>LOC_Os01g01010 genomic|TBC domain containing protein, expressed
AGATGAGCTGGTGGGGATGCTCTAAGAGAACGAGAGAAGCACAGAGCAGATAAACCACAC
CCACAGGCACCACCGTCCTTGTTGGTAATGAAGAAGACGAGACGACGACTTCCCCACTAG
GAAACACGACGGAGGCGGAGATGATCGACGGCGGAGAGAGCTACAGAAACATCGATGCCT
CCTGTCCAATCCCCCCATCCCATTCGGTAGTTGGATTGAAGACTACCGAATAAGAGAAGC

GTF file all.gtf

Chr1    MSU_osa1r7  exon    2903    3268    .   +   .   transcript_id "LOC_Os01g01010.1"; gene_id "LOC_Os01g01010"; gene_name "LOC_Os01g01010";
Chr1    MSU_osa1r7  exon    3354    3616    .   +   .   transcript_id "LOC_Os01g01010.1"; gene_id "LOC_Os01g01010"; gene_name "LOC_Os01g01010";
Chr1    MSU_osa1r7  exon    4357    4455    .   +   .   transcript_id "LOC_Os01g01010.1"; gene_id "LOC_Os01g01010";gene_name "LOC_Os01g01010";


2.9 years ago

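As you can see in your files, the FASTA file does not match the GTF file.

Note how your reference file names the sequences >LOC_Os01g01010, whereas the feature file designates them as Chr1. featureCounts can only assign a read to a feature when the sequence name the read was aligned to matches the name in the first column of the GTF, so nothing gets counted.

Make sure to use a genomic file, not a transcriptome file, as your alignment target.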
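
A quick way to confirm this kind of mismatch is to compare the sequence names in the two files directly. The following is only a minimal sketch, not part of Subread: the function name `shared_seqnames` is hypothetical, and it assumes a plain-text FASTA file and a tab-separated GTF file. It prints every sequence name that appears both in a FASTA header and in column 1 of the GTF.

```shell
# Hypothetical helper: list sequence names shared by a FASTA file and a GTF file.
# Usage: shared_seqnames reference.fa annotation.gtf
shared_seqnames() {
    awk -F'\t' '
        # First file (FASTA): remember the header text up to the first whitespace.
        NR == FNR {
            if (/^>/) { split(substr($0, 2), a, /[ \t]/); fa[a[1]] = 1 }
            next
        }
        # Second file (GTF): print each column-1 name once if the FASTA has it too.
        !/^#/ && ($1 in fa) && !seen[$1]++ { print $1 }
    ' "$1" "$2"
}
```

If this prints nothing for your reference and all.gtf, the two files share no sequence names and featureCounts has nothing to assign reads to. With a genome FASTA whose headers are Chr1, Chr2, and so on, the shared names should be listed. The same comparison can be made against the `@SQ` lines of the BAM header (for example from `samtools view -H`), since those are the names featureCounts actually sees.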
