Question

Best practice for normalizing sequencing depth for differential miRNA expression?

17 months ago

Trivas ★ 1.8k

I am trying to run a differential miRNA expression analysis on small RNA-seq data, comparing samples with and without exogenous expression of miRNAs. The exogenous miRNAs rank among the top 10 expressed miRNAs per sample, and sequencing depth differs enough between samples that I want to make sure I'm normalizing correctly.

I first used the miRge3 pipeline to quantify all human miRNAs. I then took the trimmed and collapsed FASTA files, expanded them, remapped the reads against human miRNAs with bowtie1, kept the unmapped reads, and mapped those to my exogenous expression constructs. Finally, I ran featureCounts on the resulting .bam file to count the exogenous miRNAs. (Side note: I tried building custom references for miRge3 so I wouldn't need these extra steps and had 0 luck.)

My plan was to merge the output from featureCounts with the output from miRge3 to create a "master" counts file containing my exogenous miRs and host miRs and let DESeq2 do the size normalization from there.
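A minimal sketch of that merge step, assuming both count tables have already been parsed into `{miRNA: {sample: count}}` dictionaries. The miRNA names, sample names, and the `merge_counts` helper are made up for illustration and are not part of miRge3 or featureCounts. The point is to fail loudly if an ID appears in both tables or if the sample names disagree before anything is handed to DESeq2.

```python
def merge_counts(host, exo):
    """Combine two {miRNA: {sample: count}} tables into one master table."""
    overlap = sorted(set(host) & set(exo))
    if overlap:
        raise ValueError(f"miRNA IDs present in both tables: {overlap}")
    samples = sorted({s for row in host.values() for s in row})
    exo_samples = sorted({s for row in exo.values() for s in row})
    if samples != exo_samples:
        raise ValueError("sample names differ between the two tables")
    merged = {}
    for table in (host, exo):
        for mirna, row in table.items():
            # Fill a missing sample with 0 so every row has the same columns.
            merged[mirna] = {s: int(row.get(s, 0)) for s in samples}
    return merged


# Toy data only; names and counts are hypothetical.
host = {
    "hsa-miR-21-5p": {"ctrl_1": 500, "exo_1": 900},
    "hsa-let-7a-5p": {"ctrl_1": 120, "exo_1": 260},
}
exo = {"exo-miR-A": {"ctrl_1": 0, "exo_1": 4000}}

master = merge_counts(host, exo)
for mirna, row in master.items():
    print(mirna, row)
```

The merged table would still need to be written out as a raw integer count matrix (miRNAs as rows, samples as columns), since DESeq2 expects unnormalized counts.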

Is there anything else I should account for? One idea is to use this first pass object to identify some control miRNAs to estimate size factors from, but beyond that I don't have any ideas.
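To make the control-miRNA idea concrete, here is a rough sketch of median-of-ratios size factors computed from a chosen subset of miRNAs. It is a plain-Python illustration of the general technique, not DESeq2's own code. The control list below is hypothetical and only there to make the example run; which miRNAs are actually stable would have to come from the data. In DESeq2 itself, `estimateSizeFactors` has a `controlGenes` argument intended for this kind of restriction.

```python
import math
from statistics import median


def size_factors(counts, samples, controls=None):
    """Median-of-ratios size factors from a {miRNA: {sample: count}} table.

    If `controls` is given, only those miRNAs contribute to the estimate.
    """
    names = controls if controls is not None else list(counts)
    log_ratios = {s: [] for s in samples}
    for name in names:
        row = [counts[name][s] for s in samples]
        if min(row) <= 0:
            continue  # a zero makes the geometric mean zero, so skip this miRNA
        log_geo_mean = sum(math.log(c) for c in row) / len(row)
        for s, c in zip(samples, row):
            log_ratios[s].append(math.log(c) - log_geo_mean)
    if not log_ratios[samples[0]]:
        raise ValueError("no usable miRNAs: every candidate has a zero count")
    return {s: math.exp(median(v)) for s, v in log_ratios.items()}


# Toy data only; names and counts are hypothetical.
counts = {
    "hsa-miR-16-5p": {"ctrl_1": 100, "exo_1": 200},
    "hsa-miR-191-5p": {"ctrl_1": 50, "exo_1": 100},
    "hsa-miR-21-5p": {"ctrl_1": 500, "exo_1": 400},
    "exo-miR-A": {"ctrl_1": 0, "exo_1": 4000},
}
samples = ["ctrl_1", "exo_1"]
controls = ["hsa-miR-16-5p", "hsa-miR-191-5p"]  # assumed stable, for illustration

sf = size_factors(counts, samples, controls)
print(sf)
```

Restricting the estimate to controls keeps the highly expressed exogenous miRNAs, and anything they perturb, from influencing the size factors.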

DESeq2 miRge3 normalization miRNA • 607 views

updated 17 months ago by biofalconch ★ 1.3k • written 17 months ago by Trivas ★ 1.8k

I think concatenating the counts is OK as long as there is no overlap between the two libraries. That said, I would normalize by the actual library size (i.e. how many reads were sequenced per sample). It should be close to the sum of all your counts per sample, or at least proportional to it.
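A quick way to check that, sketched with made-up numbers: scale both the sequenced read totals and the per-sample count sums so they average to 1, then compare the two. The `relative_depth` helper and the 10% tolerance are arbitrary choices for illustration.

```python
def relative_depth(totals):
    """Scale per-sample totals so that they average to 1 across samples."""
    mean_total = sum(totals.values()) / len(totals)
    return {s: t / mean_total for s, t in totals.items()}


# Hypothetical numbers: reads sequenced vs. reads that ended up in the count table.
sequenced = {"ctrl_1": 8_000_000, "exo_1": 16_000_000}
counted = {"ctrl_1": 5_000_000, "exo_1": 10_400_000}

depth_seq = relative_depth(sequenced)
depth_cnt = relative_depth(counted)

for sample in sequenced:
    diff = abs(depth_seq[sample] - depth_cnt[sample]) / depth_seq[sample]
    flag = "ok" if diff < 0.10 else "check"  # 10% tolerance is an arbitrary choice
    print(sample, round(depth_seq[sample], 3), round(depth_cnt[sample], 3), flag)
```

If the two sets of proportions agree, either one gives about the same scaling. If they diverge, that is worth understanding before choosing, since factors like these could be supplied to DESeq2 as user-defined size factors instead of the default estimate.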

17 months ago by biofalconch ★ 1.3k