3.8 years ago
Tom
I have a section of cow genomic sequence (X59856) coding for a milk protein, and I want to find the corresponding gene clusters in mouse and human. What's the best way to slice out and download just the portion of each genome that aligned? I would also like about 10,000 nucleotides upstream and downstream to capture the flanking regions.
One option is to see if the protein is present in the pre-computed alignments in NCBI HomoloGene (e.g. this entry is for a milk protein of some kind: https://www.ncbi.nlm.nih.gov/homologene/4334 ). You can then take the proteins and map them back onto the genome sequence. Both Ensembl (Compara) and UCSC also have precomputed sections like this.
Can you be a lot more specific?
NCBI's datasets tool can retrieve ortholog data: https://ncbiinsights.ncbi.nlm.nih.gov/2021/02/23/datasets-orthologs
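For the flanking-region part of the question, once you have the chromosome and start/end of the matched region in each species, padding it by 10,000 nt is simple arithmetic. Here is a minimal Python sketch, assuming 1-based inclusive coordinates. The function names (`flanked_region`, `ensembl_region_url`) and the coordinates in the example are made up for illustration, not real gene positions. The URL follows the pattern of the Ensembl REST `sequence/region` endpoint as I understand it, so check it against the current Ensembl REST documentation before relying on it.

```python
from urllib.parse import quote


def flanked_region(chrom, start, end, flank=10_000, chrom_length=None):
    """Expand a 1-based inclusive region by `flank` nt on each side.

    The start is clamped at 1; the end is clamped at `chrom_length`
    only if that length is supplied.
    """
    if start > end:
        raise ValueError("start must be <= end")
    new_start = max(1, start - flank)
    new_end = end + flank
    if chrom_length is not None:
        new_end = min(chrom_length, new_end)
    return chrom, new_start, new_end


def ensembl_region_url(species, chrom, start, end, strand=1):
    """Build a FASTA download URL for a region (assumed Ensembl REST pattern)."""
    region = f"{chrom}:{start}..{end}:{strand}"
    return (
        "https://rest.ensembl.org/sequence/region/"
        f"{quote(species)}/{region}?content-type=text/x-fasta"
    )


if __name__ == "__main__":
    # Placeholder coordinates for illustration only, not a real gene location.
    chrom, start, end = flanked_region("6", 85_000_000, 85_008_000)
    print(ensembl_region_url("bos_taurus", chrom, start, end))
```

You would run this once per species (e.g. `homo_sapiens`, `mus_musculus`) with the coordinates of the orthologous hit, then fetch the printed URL with your tool of choice.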