Hi All,
I am working with pig genome and I had received some FASTQ transcriptome files from my lab mates.
I have processed the FASTQ files using STAR Aligner
~/Software/STAR/STAR-STAR_2.5.0a/bin/Linux_x86_64/STAR --genomeDir ~/data/genomes/pig/susScr3/star/ --readFilesIn ~/Bureaublad/Mod.GC025680.151030.HiSeq2500.FCB.lane2.gcap_dev.R1.fastqgz.gz --readFilesCommand zcat --sjdbGTFfile ~/data/genomes/pig/susScr3/star/susScr3.04012016.gtf --quantMode TranscriptomeSAM GeneCounts --runThreadN 12 --outFilterMismatchNmax 8 --outFileNamePrefix starGC025680 --outFilterMultimapNmax 10
The GTF file was taken from UCSC
It looks like this
chr1 susScr3_refGene start_codon 133903981 133903983 0.000000 + . gene_id "NM_214429"; transcript_id "NM_214429";
chr1 susScr3_refGene CDS 133903981 133904133 0.000000 + 1 gene_id "NM_214429"; transcript_id "NM_214429";
chr1 susScr3_refGene exon 133903981 133904133 0.000000 + . gene_id "NM_214429"; transcript_id "NM_214429";
chr1 susScr3_refGene CDS 133914113 133914267 0.000000 + 1 gene_id "NM_214429"; transcript_id "NM_214429";
chr1 susScr3_refGene exon 133914113 133914267 0.000000 + . gene_id "NM_214429"; transcript_id "NM_214429";
chr1 susScr3_refGene CDS 133917281 133917449 0.000000 + 2 gene_id "NM_214429"; transcript_id "NM_214429";
chr1 susScr3_refGene exon 133917281 133917449 0.000000 + . gene_id "NM_214429"; transcript_id "NM_214429";
chr1 susScr3_refGene CDS 133919376 133919382 0.000000 + 1 gene_id "NM_214429"; transcript_id "NM_214429";
chr1 susScr3_refGene exon 133919376 133919382 0.000000 + . gene_id "NM_214429"; transcript_id "NM_214429";
chr1 susScr3_refGene CDS 133920137 133920252 0.000000 + 0 gene_id "NM_214429"; transcript_id "NM_214429";
Then I am using HTSeq_count function like this:
python ~/Software/HTSeq/HTSeq-0.6.1/scripts/htseq-count -m intersection-nonempty -t exon -i gene_id -f bam starGC025680Aligned.toTranscriptome.out.bam ~/data/genomes/pig/susScr3/star/susScr3.04012016.gtf -o Test
It processes the file, but dishes out
NM_214311 0
NM_214312 0
NM_214313 0
NM_214314 0
NM_214315 0
NM_214316 0
NM_214317 0
NM_214318 0
NM_214319 0
NM_214320 0
NM_214321 0
NM_214322 0
NM_214323 0
NM_214324 0
NM_214325 0
NR_130444 0
NR_131172 0
NR_132429 0
__no_feature 2844434
__ambiguous 0
__too_low_aQual 0
__not_aligned 0
__alignment_not_unique 534450
I don't know, what I am doing wrong and how can I fix this.
Kindly help
Thank you
Hi,
I did grep on my genome file and here are the results:
It is susScr3.fa