Hi there, I am thinking of mapping reads onto gene bodies to make off-target identification less strict. Is there an existing file for this, or do I have to extract the regions manually? Thanks!
I would get the FASTA file of the human transcripts and map the reads against it. You can get the RefSeq annotation from here.
Then you can build an index with your favorite aligner and align against it.
Another option would be to extract the genes from a GTF annotation and use bedtools getfasta to get the FASTA file for the desired intervals.
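As a rough sketch of the GTF route, the snippet below pulls the gene records out of a GTF and converts them to BED intervals that bedtools getfasta can read. It assumes the annotation has rows whose feature column is "gene" and that the attributes column carries a gene_id (some RefSeq-derived GTFs only have transcript and exon rows, in which case you would need to merge those per gene instead). The function name and the example record are made up for illustration.

```python
import re


def gtf_genes_to_bed(gtf_lines):
    """Yield BED6 tuples (chrom, start, end, name, score, strand) for 'gene' rows."""
    for line in gtf_lines:
        # Skip blank lines and header/comment lines
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A GTF row has 9 tab-separated columns; column 3 is the feature type
        if len(fields) < 9 or fields[2] != "gene":
            continue
        chrom, strand, attrs = fields[0], fields[6], fields[8]
        start, end = int(fields[3]), int(fields[4])
        match = re.search(r'gene_id "([^"]+)"', attrs)
        name = match.group(1) if match else "."
        # GTF coordinates are 1-based inclusive; BED is 0-based half-open
        yield (chrom, start - 1, end, name, ".", strand)


# Hypothetical two-row example: one gene row and one exon row
example = [
    "#!example header\n",
    'chr1\tsrc\tgene\t11874\t14409\t.\t+\t.\tgene_id "DDX11L1";\n',
    'chr1\tsrc\texon\t11874\t12227\t.\t+\t.\tgene_id "DDX11L1";\n',
]

for record in gtf_genes_to_bed(example):
    print("\t".join(str(value) for value in record))
```

Once the output is written to a file (say genes.bed, a placeholder name), something like `bedtools getfasta -fi genome.fa -bed genes.bed -s -name -fo genes.fa` should give you one FASTA record per gene; the `-s` flag makes it respect the strand column.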
Can't say if it's a good idea or not, but the way I would do that is to mask the genome reference. Just hack up your genome reference to have a bunch of NNNN in the intergenic regions.
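To make the masking idea concrete, here is a minimal sketch for a single chromosome held in memory as a string. It assumes the gene intervals are 0-based and half-open (BED style); the function name and the toy sequence are made up for illustration, and for a full genome you would probably want to stream the FASTA rather than load it whole.

```python
def mask_intergenic(sequence, gene_intervals):
    """Return `sequence` with every base outside `gene_intervals` replaced by 'N'.

    gene_intervals: iterable of (start, end) pairs, 0-based half-open.
    """
    # Start fully masked, then copy the genic bases back in
    masked = ["N"] * len(sequence)
    for start, end in gene_intervals:
        # Clip intervals to the sequence bounds
        start = max(start, 0)
        end = min(end, len(sequence))
        if start < end:
            masked[start:end] = sequence[start:end]
    return "".join(masked)


# Toy example: keep positions 2-4 and 7-8, mask everything else
toy = mask_intergenic("ACGTACGTAC", [(2, 5), (7, 9)])
print(toy)
```

Overlapping gene intervals are harmless here because genic bases are only ever copied back in, never re-masked. bedtools also has complement and maskfasta subcommands that could probably do the same job on a whole genome without custom code.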
Or use a transcriptome mapper like STAR/RSEM directly and forgo the genome mapping.