Counting abundance of each gene in SAM/BAM file
5.1 years ago
Hansen_869 ▴ 80

I have just aligned my raw FASTQ files (from a metagenomic sample) to my predicted genes (predicted with Prokka) using BWA, to get an idea of the abundance of each gene in the sample. I now have a SAM and a BAM file. My question is: how do I count the abundance of each gene, i.e. how many reads are mapped to it? I basically want output along these lines (the exact formatting doesn't matter, but you get the idea):

Gene 1 - Mapped reads: 34

Gene 2 - Mapped reads: 85

and so forth.

I could make a script that would extract and count the mapped reads, but I wonder if there is a tool out there that could do it better?

Samtools Abundance gene bam sam • 2.9k views

Prokka also outputs gff / gtf / bed files, correct? Use featureCounts with the BAM and the GTF; read the docs for more details.


Thanks for your response! Prokka only outputs gff files. Is that enough?


featureCounts should work with gff. Out of curiosity, are the raw fastq files you mapped RNAseq reads?


Cool, I'll check it out! They are DNAseq, Illumina :)
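
For anyone landing here later: if the reads were aligned directly to the gene sequences (as described in the question), each reference name in the SAM file is a gene, so a per-gene tally over the RNAME column is enough. Below is a minimal sketch in plain Python, not a replacement for featureCounts. The function name `count_mapped_reads` and the example records are made up for illustration, and the sketch assumes you only want primary alignments (unmapped, secondary, and supplementary records are skipped). With paired-end data each mate is counted separately.

```python
def count_mapped_reads(sam_lines):
    """Tally primary mapped alignments per reference name (RNAME).

    `sam_lines` is any iterable of SAM text lines, e.g. an open file handle.
    """
    counts = {}
    for line in sam_lines:
        if line.startswith("@"):
            # Header line. Register every reference from @SQ so that
            # genes with no mapped reads still appear with a count of 0.
            if line.startswith("@SQ"):
                for field in line.rstrip("\n").split("\t")[1:]:
                    if field.startswith("SN:"):
                        counts.setdefault(field[3:], 0)
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        rname = fields[2]
        # SAM flag bits: 0x4 unmapped, 0x100 secondary, 0x800 supplementary
        if flag & 0x4 or flag & 0x100 or flag & 0x800 or rname == "*":
            continue
        counts[rname] = counts.get(rname, 0) + 1
    return counts


# Made-up example records (hypothetical gene and read names)
example_sam = [
    "@HD\tVN:1.6\n",
    "@SQ\tSN:gene_1\tLN:900\n",
    "@SQ\tSN:gene_2\tLN:1200\n",
    "@SQ\tSN:gene_3\tLN:300\n",
    "r1\t0\tgene_1\t10\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t16\tgene_2\t55\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n",
    "r4\t256\tgene_1\t70\t0\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r5\t0\tgene_2\t90\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r6\t2048\tgene_2\t120\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
]

example_counts = count_mapped_reads(example_sam)
for gene, n in example_counts.items():
    print(f"{gene} - Mapped reads: {n}")
```

For a real file, pass an open handle instead of the list, e.g. `with open(path) as fh: count_mapped_reads(fh)`, where `path` points to your SAM file. If the BAM is sorted and indexed, `samtools idxstats` reports a similar per-reference table of mapped reads without any scripting.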
