Finally I took the position like this:
from Bio.Blast import NCBIXML
result=open(folder+"Blast-Alignment.xml","r")
records= NCBIXML.parse(result)
item=next(records)
for alignment in item.alignments:
for hsp in alignment.hsps:
if hsp.expect < 0.01:
print('****Alignment****')
print('sequence:', alignment.title)
print('length:', alignment.length)
print("start:", hsp.sbjct_start)
print("end:", hsp.sbjct_end)
print('score:', hsp.score)
print('gaps:', hsp.gaps)
print(hsp.query[0:90] + '...')
print(hsp.match[0:90] + '...')
print(hsp.sbjct[0:90] + '...')
Output:
****Alignment****
sequence: gi|568815580|ref|NC_000018.10| Homo sapiens chromosome 18, GRCh38.p12 Primary Assembly
length: 80373285
start: 63307591
end: 63307572
score: 20.0
gaps: 0
CTGCTTTGTAACAGCTTGTG...
||||||||||||||||||||...
CTGCTTTGTAACAGCTTGTG...
And with pyensembl I donwloaded manually the reference genome and look for the concrete position giving back the gene located there.
import pyensembl
ensembl = pyensembl.EnsemblRelease()
data = Genome(
reference_name='GRCh38',
annotation_name='my_genome_features',
gtf_path_or_url='C:\\Users\\User\\AppData\\Local\\pyensembl\\Homo_sapiens.GRCh38.93.gtf')
# parse GTF and construct database of genomic features
data.index()
gene_names = data.gene_names_at_locus(contig=18, position=63307591)
print(gene_names)
Output:
['BCL2']
see Aligning Two Proteins With Their Domains/Annotations
I forgot to mention that the region I want to blast is intronic and quite short (20 nt), so when I follow your advice it doesn't detect anything. The only way I think it could be possible is taking the position and search for it like here is described with biopython here with exons ( I have to check if it is possible with introns too).
https://biopython.org/wiki/Coordinate_mapping
well, in that case, you can do a blast from the command line using blast-short task, and format the output like a BED file using the text column based output.
After that you can intersect your alignments with an annotation file using bedtools. That way each alignment will get an overlapping feature.