6.1 years ago
shubhra.bhattacharya
Dear all, I have extracted a gene from a reference genome using a GFF file. The gene is p53 (TP53), and its GFF entry is as follows:
AC_000176.1 BestRefSeq%2CGnomon gene 27985492 27997883 . - . ID=gene22694;Dbxref=BGD:BT10936,GeneID:281542;Name=TP53;description=tumor protein p53;gbkey=Gene;gene=TP53;gene_biotype=protein_coding
I extracted this gene using:
bedtools2_2.26/bin/bedtools getfasta -fi sequences.fa -bed p53.bed > p53_from_bed.fa
The BED file (p53.bed) is as follows:
AC_000176.1 27985492 27997883
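A point worth checking when a BED file is built by hand from a GFF entry: GFF coordinates are 1-based and inclusive, while BED starts are 0-based, so the start normally has to be reduced by one. The gene above is also on the minus strand. A minimal sketch of the conversion follows; the function name `gff_gene_to_bed` and the `name` argument are hypothetical, and the example line assumes tab-separated GFF columns.

```python
def gff_gene_to_bed(gff_line: str, name: str) -> str:
    """Convert one 9-column, tab-separated GFF line to a 6-column BED line."""
    fields = gff_line.rstrip("\n").split("\t")
    seqid, _source, _ftype, start, end, _score, strand = fields[:7]
    # GFF start is 1-based inclusive; BED start is 0-based, so subtract 1.
    bed_start = int(start) - 1
    # GFF end is inclusive and BED end is exclusive, so the number is unchanged.
    bed_end = int(end)
    return "\t".join([seqid, str(bed_start), str(bed_end), name, ".", strand])


# The TP53 entry from the post, rebuilt with explicit tabs (attributes shortened).
example = "\t".join([
    "AC_000176.1", "BestRefSeq%2CGnomon", "gene",
    "27985492", "27997883", ".", "-", ".",
    "ID=gene22694;Name=TP53",
])
bed_line = gff_gene_to_bed(example, name="TP53")
print(bed_line)
```

With the strand in column 6, `bedtools getfasta -s` reverse-complements minus-strand features, so the output reads in the gene's own orientation. Used unchanged, the three-column BED file above would most likely return the plus-strand sequence starting one base after the annotated gene start. Neither point is expected to explain the low query coverage.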
However, when I BLAST this extracted sequence against the nucleotide database, the top hit is Bos mutus isolate yakQH1 chromosome 19, whereas my organism of interest is Bos taurus.
Also, the identity against Bos taurus TP53 is 99%, but the query coverage is only 17%:
Description: Bos taurus tumor protein p53 (TP53), mRNA
Max score: 1729, Total score: 3975, Query cover: 17%, E value: 0.0, Identity: 99%, Accession: NM_174201.2
Is this hit only to the mRNA region, and is that why its coverage is low? Can someone tell me where my understanding is wrong?
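For scale, the arithmetic behind the 17% figure can be sketched as below. It assumes that the query is the full genomic locus from the GFF entry (exons plus introns) and that NM_174201.2, being a spliced mRNA, can only align to the exonic part of it. The variable names are illustrative, and BLAST reports coverage as a rounded percentage, so the result is approximate.

```python
# Gene coordinates from the GFF entry (1-based, inclusive).
gene_start, gene_end = 27985492, 27997883
gene_length = gene_end - gene_start + 1

# Query coverage reported by BLAST for the NM_174201.2 hit.
query_coverage = 0.17

# Approximate number of query bases that fall inside the alignment.
aligned_bases = round(gene_length * query_coverage)

print(f"genomic query length: {gene_length} bp")
print(f"bases covered by the mRNA hit: about {aligned_bases} bp")
```

Under that assumption, roughly 2,100 of the 12,392 genomic bases align to the mRNA, and the remainder would be intronic sequence with no counterpart in a transcript record.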
Are you extracting only the top hit from your search? It is possible that BLAST is returning an almost identical sequence from a closely related species as the top hit, which is why you get Bos mutus.
BTW: If you have a known sequence, what is the reason to BLAST it back against nt?
I performed an online BLAST and yes, that was the top hit, i.e., the Bos mutus hit. But should I not still get a query coverage of 100% against my gene of interest in Bos taurus from the nt db?
PS: This query was raised by a third party client :)