featureCounts: difference between assigned reads in the summary file and summed reads in the count matrix
3.0 years ago
Carambakaracho ★ 3.3k

Dear all,

This might be a naive question, but my google-fu fails me. I count reads from a BAM file that was aligned with STAR against a custom hg19 genome and then processed with Picard MarkDuplicates. The reads are counted per exon, using a slightly customized version of the NCBI reference annotation GFF. The customization is mostly adaptation to the genome and propagation of some tags from gene level to exon level.

Then I count reads using featureCounts, where $bam holds all the BAMs in the pipeline. There are quite a lot of transcript variants per gene, and multimapping is allowed on purpose; I wanted to account for both with --fraction.

featureCounts \
    -p -f -T 4 \
    -O -M --fraction \
    -a hg19.gff \
    -F GTF \
    -t "exon" \
    -g "ID" \
    -s 2 \
    --extraAttributes "toplevel_id,gene,transcript_id,GeneID,gbkey,gene_biotype,description,tag" \
    -o out.tsv \
    ${bam}
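To make the effect of -O -M --fraction concrete, here is a minimal sketch of the weighting as I read it in the featureCounts help text: with both -M and -O, each alignment carries a fractional count of 1/(x*y), where x is the number of alignments reported for the read and y is the number of features overlapping that alignment. The function names and the example read are hypothetical and only illustrate the arithmetic; this is not featureCounts code.

```python
from fractions import Fraction


def fractional_weight(n_alignments, n_features):
    """Count one alignment adds to EACH feature it overlaps: 1/(x*y).

    n_alignments (x): alignments reported for the read.
    n_features   (y): features overlapped by this particular alignment.
    """
    if n_alignments < 1 or n_features < 1:
        raise ValueError("both arguments must be >= 1")
    return Fraction(1, n_alignments * n_features)


def matrix_contribution(n_alignments, features_per_alignment):
    """Total one read adds to the count matrix, summed over all features.

    features_per_alignment: for each reported alignment, how many features
    it overlaps (0 means that alignment hits no feature and adds nothing).
    """
    total = Fraction(0)
    for y in features_per_alignment:
        if y > 0:
            # y features each receive 1/(x*y), so this alignment adds 1/x.
            total += y * fractional_weight(n_alignments, y)
    return total


# Hypothetical read with 4 reported alignments: one overlaps 2 exons,
# one overlaps 1 exon, and two overlap no annotated exon at all.
example = matrix_contribution(4, [2, 1, 0, 0])  # 2*(1/8) + 1*(1/4)
```

Under this reading, the hypothetical read adds only 1/2 to the matrix in total. If the .summary file tallies every assigned alignment as a whole unit (an assumption I have not verified), the two numbers cannot agree as soon as some alignments of a multimapper miss every feature.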

When I sum up the counts in the count table, the total differs from the number of assigned reads reported in the corresponding .summary file. Is this a known or expected side effect of fractional counting with multiple exons and multimapping?
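For reference, the comparison can be sketched roughly as follows. This assumes the usual featureCounts layout: a leading `#` comment line in the count table, tab-separated columns, and sample column names in the table that match the header of the .summary file. The function names are my own, and the sketch only reports the difference; it does not explain it.

```python
def read_assigned(summary_text):
    """Return {sample: Assigned value} parsed from .summary file content."""
    lines = [ln for ln in summary_text.splitlines() if ln]
    samples = lines[0].split("\t")[1:]  # header row: Status, sample1, ...
    for ln in lines[1:]:
        fields = ln.split("\t")
        if fields[0] == "Assigned":
            return {s: float(v) for s, v in zip(samples, fields[1:])}
    raise ValueError("no 'Assigned' row found in summary")


def sum_matrix(counts_text, samples):
    """Sum each sample's column of the count table, skipping '#' lines."""
    lines = [ln for ln in counts_text.splitlines()
             if ln and not ln.startswith("#")]
    header = lines[0].split("\t")
    # Look columns up by name, so extra attribute columns do not matter.
    index = {s: header.index(s) for s in samples}
    totals = {s: 0.0 for s in samples}
    for ln in lines[1:]:
        fields = ln.split("\t")
        for s, i in index.items():
            totals[s] += float(fields[i])
    return totals


def compare(counts_text, summary_text):
    """Return {sample: matrix sum minus summary 'Assigned'}."""
    assigned = read_assigned(summary_text)
    totals = sum_matrix(counts_text, list(assigned))
    return {s: totals[s] - assigned[s] for s in assigned}
```

With the command above, the inputs would be the contents of out.tsv and out.tsv.summary, for example `compare(open("out.tsv").read(), open("out.tsv.summary").read())`. A negative value means the matrix holds less than the summary reports as assigned.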

RNAseq featureCounts • 1.3k views

I'd say this is a subtle and specialized use case whose exact behavior only the implementer knows for sure. It feels like one of those issues where the definitions of the terms can be reconciled in multiple ways, and the different reports apply them differently.

Perhaps asking on the software's issue tracker would be more appropriate.

FWIW, I would more or less ignore the discrepancy as something "expected".


Thanks, that sort of matches my gut feeling. Except for the “write-the-developer” part, obviously…
