Which reference transcriptome/genome to use for Mus musculus if I know the particular strain involved in the experiment?
21 months ago
e.r.zakiev ▴ 230

I know my samples are from C57BL6.

Should I use the C57BL6-specific reference transcriptome/genome for alignment, or the generic Mus musculus one?

I am worried because the reference transcriptome file for C57BL6 (Mus_musculus_c57bl6nj.C57BL_6NJ_v1.cdna.all.fa.gz, 39.7 MB) is 22% smaller than its generic counterpart (Mus_musculus.GRCm39.cdna.all.fa.gz, 51.2 MB). Biologically, the C57BL6 transcriptome cannot be 22% smaller than that of another strain, so is the C57BL6 annotation simply less detailed?
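Compressed file size is only a rough proxy for annotation content, so a more direct check is to count the records and total bases in each cDNA FASTA. Below is a minimal sketch using only the Python standard library; the function name `fasta_summary` is hypothetical, and it assumes the two files named above have been downloaded locally.

```python
import gzip


def fasta_summary(path):
    """Return (number of records, total sequence length) for a FASTA file.

    Files ending in ".gz" are read through gzip; anything else is read as plain text.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    n_records = 0
    n_bases = 0
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                n_records += 1  # each header line starts one transcript record
            else:
                n_bases += len(line)  # sequence line: add its length
    return n_records, n_bases


# Example usage (assumes both files are in the working directory):
# for name in ("Mus_musculus_c57bl6nj.C57BL_6NJ_v1.cdna.all.fa.gz",
#              "Mus_musculus.GRCm39.cdna.all.fa.gz"):
#     print(name, fasta_summary(name))
```

If the strain-specific file has markedly fewer records rather than shorter sequences, that would point to a sparser annotation rather than a biologically smaller transcriptome.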

What do I gain and what do I lose if I opt for the C57BL6-specific transcriptome/genome?

mice alignment genome transcriptome • 829 views
21 months ago
LChart 4.5k

There's some basic literature on the topic here: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1010552 . It reports slightly more uniquely mapped reads with a strain-specific reference, though mapping parameters appear to have a stronger impact than the choice of reference.

However, for differential expression (BL6/J untreated vs BL6/J treated), the question isn't so much "how do the quantifications change" but "how do the logFCs change", and I don't see published results on this. I would expect that, by aligning to the strain-specific transcriptome, some genes might get slightly higher coverage, enough to bump them over the soft-filtering threshold, but few (if any) logFC values should change based on the reference.

It's probably worth running the analysis twice, once per reference, just to put your mind at ease.
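A rough way to compare the two runs is to correlate the per-gene logFCs and list the genes where they disagree. The sketch below is only an illustration: `compare_logfc` and the `tol` cutoff are hypothetical, and it assumes each run's results have already been loaded into a dict keyed by an identifier shared between the two annotations (for example gene symbol, since stable gene IDs may differ between the strain-specific and generic annotations).

```python
from math import sqrt


def compare_logfc(lfc_a, lfc_b, tol=0.5):
    """Compare logFC values from two runs (dicts mapping gene -> logFC).

    Returns (Pearson r over shared genes, sorted list of shared genes whose
    logFCs differ by more than `tol`). `tol` is an arbitrary illustrative cutoff.
    """
    shared = sorted(set(lfc_a) & set(lfc_b))
    if len(shared) < 2:
        raise ValueError("need at least two shared genes to compute a correlation")

    xs = [lfc_a[g] for g in shared]
    ys = [lfc_b[g] for g in shared]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Pearson correlation from centered sums of products and squares
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    r = cov / sqrt(var_x * var_y) if var_x > 0 and var_y > 0 else float("nan")

    # Genes whose logFC shifts by more than the tolerance between references
    discordant = [g for g in shared if abs(lfc_a[g] - lfc_b[g]) > tol]
    return r, discordant
```

Genes present in only one of the two tables are ignored here; those are worth inspecting separately, since they may be the ones pushed over or under the expression filter by the choice of reference.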
