Hi, I have a text file of 7500 genomic positions (chr, start, end) and want to get their nucleotide sequences. Can someone point me to a tool, or share any thoughts on how to do it?
Thank you,
Use getfasta from bedtools, which extracts the sequence for each interval in a BED file from a FASTA file: http://bedtools.readthedocs.io/en/latest/content/tools/getfasta.html
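If you use Python:
import pysam
# FastaFile is the current class name; Fastafile is an older alias
genome = pysam.FastaFile(path_to_genome + 'genome.fa')
sequence = genome.fetch(chrom, start, end)
path_to_genome: the directory (with a trailing slash) holding any genome you have downloaded (e.g. hg19). chrom, start and end are the values for one region; fetch() expects 0-based, half-open coordinates, the same convention as BED.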
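For a file with thousands of regions, the single fetch call above can be wrapped in a loop. Here is a minimal sketch, assuming the text file is tab-separated with chr, start, end in the first three columns, has no header row, and already uses 0-based, half-open coordinates. The helper names (read_regions, fetch_sequences, main) are made up for illustration and are not part of pysam.

```python
import csv


def read_regions(path):
    """Yield (chrom, start, end) tuples from a tab-separated text file.

    Assumes one region per line; blank lines and lines starting
    with '#' are skipped.
    """
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if not row or row[0].startswith("#"):
                continue
            yield row[0], int(row[1]), int(row[2])


def fetch_sequences(genome, regions):
    """Return a list of (chrom, start, end, sequence) tuples.

    `genome` is any object with a fetch(chrom, start, end) method,
    such as an open pysam.FastaFile.
    """
    return [(c, s, e, genome.fetch(c, s, e)) for c, s, e in regions]


def main(fasta_path, regions_path):
    """Print every region as a FASTA record (hypothetical entry point)."""
    import pysam  # imported here so the helpers above work without it

    genome = pysam.FastaFile(fasta_path)
    try:
        regions = read_regions(regions_path)
        for chrom, start, end, seq in fetch_sequences(genome, regions):
            print(f">{chrom}:{start}-{end}\n{seq}")
    finally:
        genome.close()
```

Calling main('genome.fa', 'positions.txt') (both file names are placeholders) would print one FASTA record per region. If your positions are 1-based, subtract 1 from start before fetching.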
I have blog posts on this: http://crazyhottommy.blogspot.com/2013/04/batch-converting-coordinates-to.html and http://crazyhottommy.blogspot.com/2015/02/fetch-genomic-sequences-from-coordinates.html
Thank you. It is handy.