Question

STAR alignment setting for RNA-Seq data from mouse background expressing a human ortholog

0

8 months ago

marionette.kent • 0

Hi all,

My colleague has a cell line model: the cells are mouse, but a human gene has been inserted and is expressed. I was going to use STAR followed by RSEM for quantification. However, I am not sure whether I need to adjust the settings for multi-mapped reads, since the human gene might be very similar to its mouse ortholog. More generally, I am not sure when options like --outFilterMultimapNmax in STAR should be set manually (I have heard that it is necessary for pseudogenes and repeats, but I am not sure about orthologs).

I assume that I should first modify the FASTA and GTF files to create an artificial chromosome for the gene and then, if the two genes are very similar, try aligning with various settings?

Many thanks

alignment star ortholog rna-seq • 758 views

updated 8 months ago by swbarnes2 14k • written 8 months ago by marionette.kent • 0

1

I'd suggest first checking how similar the two sequences are. The more similar they are, the more likely it is that you can (MAYBE) avoid adding the exact human sequence to the alignment index.

However, I think it's probably best to add the human FASTA sequence to the sequences you use to generate the genome index. That will be most accurate.
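
One rough way to do the similarity check mentioned above is to ask what fraction of read-length windows (k-mers) from the human transcript also occur verbatim in the mouse ortholog, since only reads that fall in such windows could map equally well to both genes. The following is a minimal sketch in plain Python. The function names, the toy sequences and the value of k are illustrative assumptions, and it only looks at exact matches on the forward strand, so it is not a substitute for a proper pairwise alignment.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq (upper-cased)."""
    seq = seq.upper()
    # If the sequence is shorter than k, the range is empty and so is the set.
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_fraction(human, mouse, k=50):
    """Fraction of distinct human k-mers found verbatim in the mouse sequence.

    Only exact, forward-strand matches are counted (an assumption of this sketch).
    """
    human_kmers = kmers(human, k)
    if not human_kmers:
        raise ValueError("human sequence is shorter than k")
    return len(human_kmers & kmers(mouse, k)) / len(human_kmers)


# Toy sequences for illustration only; they are not real gene sequences.
toy_human = "ACGTACGTTTGACCA"
toy_mouse = "ACGTACGTTAGACCA"
print(shared_kmer_fraction(toy_human, toy_mouse, k=5))
```

With real data you would pass the two transcript sequences and set k to roughly the read length. A fraction near 0 suggests very few reads can be ambiguous between the two genes, while a fraction near 1 suggests many reads will multi-map.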

8 months ago by Yogi ▴ 70

score 1 · Answer 1 · 2024-03-06

1

8 months ago

swbarnes2 14k

The safest way is to make a new human genomic index with the mouse gene added on, like a new chromosome. This way, the reads should align to whichever sequence, human or mouse, they best match to. RSEM should help with the fact that some reads might map equally well to both versions.
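
As a rough sketch of how the align-then-quantify steps could be invoked once a combined index exists, the snippet below only assembles the command lines as argument lists and prints them without running anything. All file names, directory names and the helper function names are hypothetical. The STAR and RSEM options shown are standard ones, and --outFilterMultimapNmax is passed explicitly at 10, which is STAR's documented default, so that it is easy to change. It assumes paired-end gzipped reads and an RSEM reference prepared separately (with rsem-prepare-reference) from the same combined FASTA and GTF.

```python
import shlex


def star_align_cmd(genome_dir, read1, read2, out_prefix, threads=8, multimap_nmax=10):
    """Build a STAR alignment command that also writes a transcriptome BAM for RSEM."""
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", read1, read2,
        "--readFilesCommand", "zcat",          # assumes gzipped FASTQ input
        "--outSAMtype", "BAM", "Unsorted",
        "--quantMode", "TranscriptomeSAM",     # writes <prefix>Aligned.toTranscriptome.out.bam
        "--outFilterMultimapNmax", str(multimap_nmax),
        "--outFileNamePrefix", out_prefix,
    ]


def rsem_quant_cmd(transcriptome_bam, rsem_ref, sample_name, threads=8):
    """Build an rsem-calculate-expression command that takes STAR's transcriptome BAM."""
    return [
        "rsem-calculate-expression",
        "--alignments",
        "--paired-end",
        "-p", str(threads),
        transcriptome_bam, rsem_ref, sample_name,
    ]


# Hypothetical paths, for illustration only.
star = star_align_cmd("combined_index", "sample_R1.fastq.gz", "sample_R2.fastq.gz", "sample_")
rsem = rsem_quant_cmd("sample_Aligned.toTranscriptome.out.bam", "combined_rsem_ref", "sample")
print(shlex.join(star))
print(shlex.join(rsem))
```

Because RSEM distributes ambiguous reads probabilistically, the multi-mapping limit mainly needs to be high enough that reads shared between the two orthologs are not discarded before quantification.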

EDIT: I flip flopped the species. The whole cell is mouse, so it needs to be the whole mouse genome + human gene.
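
A minimal sketch of the "whole mouse genome + human gene" reference could look like the following. It formats the human sequence as an extra FASTA contig and writes matching GTF records that treat the whole contig as one unspliced, plus-strand transcript. The contig name, gene and transcript IDs, file names and the placeholder sequence are all hypothetical, and the single-exon layout is an assumption that fits a cDNA-style insert rather than a gene with introns.

```python
def fasta_record(name, seq, width=60):
    """Format one FASTA record, wrapping the sequence at `width` bases per line."""
    seq = "".join(seq.split()).upper()  # drop any whitespace or newlines
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


def transgene_gtf(contig, length, gene_id, transcript_id, source="custom"):
    """GTF lines (gene, transcript, exon) spanning the whole contig on the + strand."""
    gene_attrs = f'gene_id "{gene_id}"; gene_name "{gene_id}";'
    tx_attrs = f'gene_id "{gene_id}"; transcript_id "{transcript_id}"; gene_name "{gene_id}";'
    rows = []
    for feature, attrs in (("gene", gene_attrs), ("transcript", tx_attrs), ("exon", tx_attrs)):
        # GTF columns: seqname, source, feature, start, end, score, strand, frame, attributes
        rows.append("\t".join([contig, source, feature, "1", str(length), ".", "+", ".", attrs]))
    return "\n".join(rows) + "\n"


def append_text(path, text):
    """Append text to a file (use on copies of the mouse FASTA and GTF, not the originals)."""
    with open(path, "a") as handle:
        handle.write(text)


# Placeholder sequence for illustration only; substitute the real insert sequence.
human_seq = "ATGGCCAAGTTGACC" * 10
print(fasta_record("hGENE_tg", human_seq), end="")
print(transgene_gtf("hGENE_tg", len(human_seq), "hGENE_tg", "hGENE_tg_t1"), end="")

# Hypothetical usage on copies of the mouse reference files:
# append_text("mouse_plus_tg.fa", fasta_record("hGENE_tg", human_seq))
# append_text("mouse_plus_tg.gtf", transgene_gtf("hGENE_tg", len(human_seq), "hGENE_tg", "hGENE_tg_t1"))
```

The combined FASTA and GTF would then be used both for STAR's genome generation step and for preparing the RSEM reference, so that the alignments and the quantification see the same extra contig.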

8 months ago by swbarnes2 14k

0

Thanks Yogi and @swbarnes2. Should I also add the sequence before and after the CDS? I assume it would help to include any 5'/3' UTRs as well. From my understanding, the human gene was knocked into the mouse background under a Tet-on/off promoter. I can probably get the sequence before and after the CDS (if any), but if it's not important I might just leave it out.

8 months ago by marionette.kent • 0