4.1 years ago
tommy
Hello,
I have a question about BLAST. I am trying to build a phylogenetic tree based on a specific gene, but I ran into trouble at the first step: I don't know how to find this specific gene in the library from the command line, and I couldn't find a tutorial about this either.
What I have is a lot of yeast FASTA files, and I want to build a phylogenetic tree for the AQY gene.
Could someone give me some guidance on this?
Thank you in advance.
If you have raw sequence data, then you would need to align it to a reference and then generate a consensus. If there is no reference available, then you will need to assemble the data and then identify where the gene is located in that assembly.
So, instead of finding the specific gene, aligning it, and building a phylogenetic tree, what I should do first is align the whole sequence? I have found some phylogenetic trees built on one specific gene, so I am curious how to do that.
One of the articles mentioned
But I don't know how to do this.
Thanks for your help.
For building the phylogenetic tree, I think you need to extract the gene sequences, remove the redundant ones, align, ....
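As a rough illustration of the "remove the redundant ones" step, here is a minimal Python sketch that reads FASTA text and keeps only the first copy of each distinct sequence. The strain names and sequences are made up for the example, and "redundant" is assumed to mean exactly identical (ignoring case); real pipelines often use a similarity threshold instead.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def remove_redundant(records):
    """Keep the first record for each distinct sequence (case-insensitive)."""
    seen = set()
    unique = []
    for header, seq in records:
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            unique.append((header, seq))
    return unique


# Hypothetical input: three made-up records, two with the same sequence.
example = """>strainA_AQY1
ATGTCTAACG
>strainB_AQY1
atgtctaacg
>strainC_AQY1
ATGTCTAACC
"""

unique = remove_redundant(read_fasta(example))
for header, seq in unique:
    print(f">{header}\n{seq}")
```

The surviving records can then be written to one FASTA file and passed to a multiple-alignment program.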
What kind of data do you have? Is it next generation sequencing data or some other type?
Yes, I have lots of yeast .fa files.
This is an important piece of information and should have been included in the original question. Are these genome .fa files?
Yes, they are nucleotide FASTA files. Thanks for letting me know; I have added some detail to the question.
If I interpret your question correctly, you want to find sequences in some other genomes that are similar to your yeast query genes. Then you want to retrieve sequences matching your queries, extract the coding sequences, presumably translate them, remove redundant sequences, do a multiple alignment on the proteins and construct a phylogenetic tree from the proteins.
All of these tasks are easy to do in the BIRCH system, which is specifically designed to leverage Unix-style systems like Linux and MacOSX. Perhaps the greatest strength of BIRCH is that, in addition to hundreds of bioinformatics programs, BIRCH has a substantial body of tutorials that take you through exactly these types of tasks step by step. To see BIRCH in action, visit our YouTube channel.
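For readers who want to see what the "translate them" step amounts to outside of any particular package, here is a minimal Python sketch. It assumes the standard genetic code (NCBI translation table 1), an input that is already an in-frame coding sequence, and the example sequence is made up.

```python
from itertools import product

# Standard genetic code (NCBI table 1), with codons ordered by base T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate an in-frame coding sequence, stopping at the first stop codon.

    Codons containing ambiguous bases (e.g. N) are reported as 'X'.
    """
    cds = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Made-up example: ATG TCT AAC TAA
print(translate("ATGTCTAACTAA"))
```

Aligning the translated proteins rather than the nucleotides is the choice described in the answer above; a nucleotide alignment of the gene is also possible if the sequences are closely related.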
Thanks, Brian. But I think what I want to achieve is to build a phylogenetic tree based on a specific gene, like Gallone et al. 2016, who built one on the PAD1 gene (figure 5c): https://www.cell.com/cell/fulltext/S0092-8674(16)31071-6?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS0092867416310716%3Fshowall%3Dtrue
Do you have the accession number for that gene? Could you provide more information?
I am trying to analyze the AQY1 and AQY2 genes of yeast. Thanks for your help.
One of the articles mentioned
I think this might be the right way, but I don't know how to do this.
Thanks for your help.
blastn searches for a query (a nucleotide sequence) in a reference (e.g. one or more reference genomes). So I think they already had gene sequences for AQY1 and AQY2 and used them as queries to search for similar genes in a genome of interest?
You can go to this link, select blastn, and check the box next to "Align two or more sequences": https://blast.ncbi.nlm.nih.gov/BlastAlign.cgi
Then paste a gene sequence in the first text box and a genome ID (NC_001144.5) in the next one, and see the results. You can do the same on the command line, but you need to download the blastn executable.
If you don't have a genome of interest, you can use the link without checking the alignment box and just use a gene query.
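A rough command-line equivalent of the web form can be sketched in Python as follows. This assumes BLAST+ is installed and blastn is on the PATH; the file names (AQY1.fa, strain_genome.fa, hits.tsv) and the example hit line are hypothetical. The sketch builds a blastn call with tabular output (-outfmt 6), parses the 12 default columns, and cuts the matching region out of the genome sequence so it can go into an alignment.

```python
import subprocess

# The 12 default columns of BLAST tabular output (-outfmt 6).
OUTFMT6_FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def build_blastn_command(query_fasta, subject_fasta, out_path):
    """Build a blastn call comparing a gene query against a genome FASTA."""
    return ["blastn", "-query", query_fasta, "-subject", subject_fasta,
            "-outfmt", "6", "-out", out_path]


def run_blastn(query_fasta, subject_fasta, out_path):
    """Run blastn; this only works if the BLAST+ executables are installed."""
    subprocess.run(build_blastn_command(query_fasta, subject_fasta, out_path),
                   check=True)


def parse_outfmt6(text):
    """Parse tabular BLAST output into a list of dicts with typed values."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        row = dict(zip(OUTFMT6_FIELDS, line.split("\t")))
        for key in ("length", "mismatch", "gapopen",
                    "qstart", "qend", "sstart", "send"):
            row[key] = int(row[key])
        for key in ("pident", "evalue", "bitscore"):
            row[key] = float(row[key])
        hits.append(row)
    return hits


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def extract_hit(subject_seq, sstart, send):
    """Return the hit region (1-based, inclusive); sstart > send means minus strand."""
    if sstart <= send:
        return subject_seq[sstart - 1:send]
    return reverse_complement(subject_seq[send - 1:sstart])


# Hypothetical file names; print the command instead of running it.
print(" ".join(build_blastn_command("AQY1.fa", "strain_genome.fa", "hits.tsv")))

# Made-up hit line in outfmt 6 layout.
demo = "AQY1\tchrXVI\t98.50\t200\t3\t0\t1\t200\t500\t699\t1e-100\t350"
hits = parse_outfmt6(demo)
print(hits[0]["sseqid"], hits[0]["sstart"], hits[0]["send"])
```

Repeating this for each strain's genome file and collecting the extracted regions into one FASTA file gives the per-gene input for alignment and tree building.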
Alternatively, you can use tblastn with a protein sequence or protein accession number.
Thanks for your help. That's very helpful.