Dear friends, hi (I am not a native English speaker).
I want to run an ortholog comparison between some candidate genes of my species of interest and Danio rerio (zebrafish).
I have found a similar approach in this paper, under the section "Ortholog comparison between A. sinensis and Danio rerio", but it is not clear to me which software they used.
They searched for the orthologs of 11 genes in three gene families (sox, apolipoprotein and cyclin) in the two previously mentioned fishes.
My questions:
1- Which software is best for this comparison?
2- Where can I collect the orthologs of these 11 genes? (Must I collect their protein sequences, or their nucleotide sequences, from some zebrafish database?)
3- How should I supply my set of genes (I mean the genes of my newly sequenced transcriptome) for the comparison?
Maybe I must use the whole transcriptome assembly, or its translation? (It seems that OrthoFinder needs proteins.)
Thank you in advance
NOTE: my data are from RNA-seq and de novo assembly. I also used TransDecoder to translate the longest ORF for each isoform.
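Since the translation gives one ORF per isoform, a gene with several isoforms shows up several times in the protein file. A minimal Python sketch of reducing that to one longest protein per gene is below. It assumes Trinity-style IDs (for example `TRINITY_DN10_c0_g1_i2.p1`, where the part before `_i2` names the gene); that naming is an assumption, since the assembler is not stated here, and the function names are made up for illustration.

```python
import re

# Assumption: Trinity-style IDs such as TRINITY_DN10_c0_g1_i2.p1,
# where everything up to "_g<number>" identifies the gene.
GENE_RE = re.compile(r"^(.*_g\d+)_i\d+")


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gene_id(header):
    """Map a sequence header to its gene ID; fall back to the full sequence ID."""
    seq_id = header.split()[0]
    match = GENE_RE.match(seq_id)
    return match.group(1) if match else seq_id


def longest_per_gene(records):
    """Return {gene_id: (header, sequence)} keeping the longest protein per gene."""
    best = {}
    for header, seq in records:
        gid = gene_id(header)
        # A trailing "*" marks the stop codon and is not counted as a residue.
        length = len(seq.rstrip("*"))
        if gid not in best or length > len(best[gid][1].rstrip("*")):
            best[gid] = (header, seq)
    return best
```

Reading the protein FASTA with `read_fasta(open(path))` and writing out the values of `longest_per_gene(...)` would give a reduced file with one entry per gene.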
Hi, I still do not know which program the authors used, but the OrthoFinder manual says:
Performing a complete OrthoFinder analysis is simple:
1- Download the amino acid sequences, in FASTA format, for the species you want to analyse. If you have the option, it is best to use a version containing a single representative/longest transcript-variant for each gene.
2- Optionally, you may want to rename the files to something simple since the filenames will be used as species identifiers in the results. E.g if you were using the 'Homo_sapiens.GRCh38.pep.all.fa' file you could rename it to 'Homo_sapiens.fa' or 'Human.fa'.
3- Place the FASTA files all in a single directory.
4- To perform a complete OrthoFinder analysis requires just one command: orthofinder -f fasta_files_directory [-t number_of_threads]
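The four steps above could be scripted roughly as follows. This is only a sketch: the species names and file paths are hypothetical, and the only OrthoFinder options used are the `-f` and `-t` ones quoted from the manual.

```python
import shutil
import subprocess
from pathlib import Path


def stage_proteomes(fasta_by_species, workdir):
    """Copy each protein FASTA into workdir as <species>.fa (steps 2 and 3)."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    for species, src in fasta_by_species.items():
        # The file name becomes the species identifier in the results.
        shutil.copyfile(src, workdir / f"{species}.fa")
    return workdir


def orthofinder_command(workdir, threads=None):
    """Build the command from step 4: orthofinder -f <dir> [-t <threads>]."""
    cmd = ["orthofinder", "-f", str(workdir)]
    if threads is not None:
        cmd += ["-t", str(threads)]
    return cmd


# Example usage (hypothetical file names; needs orthofinder on the PATH):
# workdir = stage_proteomes(
#     {"Danio_rerio": "zebrafish_proteins.fa", "My_species": "my_longest_orfs.pep"},
#     "orthofinder_input",
# )
# subprocess.run(orthofinder_command(workdir, threads=8), check=True)
```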
.............................
So, I have downloaded the protein sequences of those 11 Danio rerio genes from UniProt and made a FASTA file (or must I collect all the zebrafish proteins?), but I do not know how I should use my transcripts: all of them, or only the representatives of those 11 genes, found using tblastn?
Thanks in advance for any help ;-)
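To make the tblastn option concrete: the 11 zebrafish proteins could be searched against the assembled transcripts with something like `makeblastdb -in my_transcripts.fasta -dbtype nucl -out my_transcripts` followed by `tblastn -query zebrafish_11.fasta -db my_transcripts -evalue 1e-5 -outfmt 6 -out hits.tsv`. The file names and the e-value cutoff there are placeholders, not recommended values. A small Python sketch for picking the top-scoring transcript per query from that tabular output might look like this; note that a best hit is only a candidate, not proof of orthology.

```python
def best_hits(blast_lines):
    """Return {query_id: (subject_id, evalue, bitscore)} for the top hit per query.

    Expects the standard 12-column tabular BLAST output (-outfmt 6):
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        # Keep the hit with the highest bit score for each query protein.
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best
```

With a real results file this would be called as `best_hits(open("hits.tsv"))`, where `hits.tsv` is the hypothetical output name used above.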