How to map reads onto human hg38 gene body regions instead of whole genome?
4.0 years ago

Hi there, I am thinking of mapping reads onto gene bodies only, rather than the whole genome, to make off-target identification less strict. Is there a ready-made file for this, or do I have to extract the regions manually? Thanks!

alignment genome • 1.0k views
4.0 years ago

I would get the FASTA file of the human transcripts and map the reads against it. You can get the RefSeq annotation from here. Then you can build an index with your favorite aligner and align against it.
Another option would be to extract the genes from a GTF annotation and use bedtools getfasta to get the FASTA sequences for the desired intervals.
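
For the GTF route, a minimal Python sketch of the interval-extraction step might look like the following. It assumes the GTF contains rows whose feature column is `gene` (GENCODE and Ensembl GTFs do; some other sources only list transcripts and exons) and that the attribute column carries a `gene_id "..."` entry. The function name `gtf_genes_to_bed` and the example record are made up for illustration.

```python
import re

def gtf_genes_to_bed(gtf_lines):
    """Convert GTF 'gene' records to BED6 tuples.

    Returns (chrom, start, end, name, score, strand) with BED coordinates.
    """
    bed = []
    for line in gtf_lines:
        # Skip blank lines and header/comment lines
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Keep only well-formed rows describing a whole gene
        if len(fields) < 9 or fields[2] != "gene":
            continue
        chrom, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])
        match = re.search(r'gene_id "([^"]+)"', fields[8])
        name = match.group(1) if match else "."
        # GTF is 1-based inclusive; BED is 0-based half-open
        bed.append((chrom, start - 1, end, name, "0", strand))
    return bed

# Hypothetical two-record example: one gene row, one exon row
example = [
    "#comment line",
    'chr1\tsrc\tgene\t11869\t14409\t.\t+\t.\tgene_id "G1"; gene_name "X";',
    'chr1\tsrc\texon\t11869\t12227\t.\t+\t.\tgene_id "G1";',
]
for record in gtf_genes_to_bed(example):
    print("\t".join(map(str, record)))
```

The printed lines can be saved as a BED file and passed to bedtools getfasta together with the genome FASTA (`-fi` for the genome, `-bed` for the intervals, `-fo` for the output) to get one sequence per gene body.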

4.0 years ago

Can't say if it's a good idea or not, but the way I would do that is to mask the genome reference. Just hack up your genome reference to have a bunch of NNNN in the intergenic regions.
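
To illustrate what the masking does, here is a small Python sketch that replaces everything outside a set of kept intervals with N. It assumes 0-based half-open intervals (BED style) on a single sequence held in memory; `mask_outside` is a made-up name, and the toy sequence is only for demonstration.

```python
def mask_outside(seq, keep):
    """Return seq with every base outside the `keep` intervals set to N.

    `keep` is an iterable of (start, end) pairs, 0-based and half-open.
    """
    masked = ["N"] * len(seq)
    for start, end in keep:
        # Clamp intervals to the sequence bounds
        start, end = max(start, 0), min(end, len(seq))
        # Copy the original bases back inside the kept interval
        masked[start:end] = seq[start:end]
    return "".join(masked)

# Toy example: keep two "gene" intervals on a 10-base sequence
print(mask_outside("ACGTACGTAC", [(2, 5), (7, 9)]))
```

One upside of masking over extracting sequences is that coordinates stay in hg38 space, so downstream tools need no coordinate conversion. For a real genome this pure-Python version would be slow and memory-hungry; bedtools complement (to get the intergenic intervals) followed by bedtools maskfasta should do the same job on the full reference.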

Or use a transcriptome mapper like STAR/RSEM directly, and forgo the genome mapping.
