I have a set of transcriptomes (nucleotide sequences) from five non-model species and one genome from a sixth species. For the genome, I have an annotated CDS .fna file plus nucleotide and amino acid sequences, none of which are available from any of the major genome/proteome databases. I want to select a set of genes that are putatively orthologous, single-copy (unique within each genome), and present in all six species. In the end, I need nucleotide sequences that meet these criteria.
I have very limited bioinformatics skills, so any explanation and advice would be much appreciated, since I am trying to learn. I thought of running a within-species BLAST (alternatively BLAT or Bowtie) to identify unique gene regions for each transcriptome/genome. However, I would expect a very lengthy output from these runs that includes everything (both single-copy and multiple-copy genes). How could I get an output of only the single-copy genes?
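To make the filtering step concrete, here is a minimal sketch of what I imagine, assuming the within-species BLAST is run with tabular output (`-outfmt 6`, whose default columns start with query ID, subject ID, percent identity, and alignment length). The function name `single_copy_candidates` and the identity/length cutoffs are hypothetical placeholders I made up for illustration, not recommended values.

```python
import csv


def single_copy_candidates(blast_lines, min_pident=70.0, min_length=100):
    """Return IDs whose only qualifying hit in a self-BLAST is themselves.

    blast_lines: iterable of tab-separated lines in BLAST -outfmt 6 layout.
    min_pident / min_length: hypothetical cutoffs; tune for real data.
    """
    seen = set()   # every query that produced at least one hit
    multi = set()  # sequences that match some *other* sequence
    for row in csv.reader(blast_lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        qseqid, sseqid = row[0], row[1]
        pident, length = float(row[2]), int(row[3])
        seen.add(qseqid)
        if qseqid != sseqid and pident >= min_pident and length >= min_length:
            multi.add(qseqid)
            multi.add(sseqid)
    return seen - multi


# Made-up example: geneB and geneC hit each other, geneA only hits itself.
example_self_hits = [
    "geneA\tgeneA\t100.0\t900\t0\t0\t1\t900\t1\t900\t0.0\t1600",
    "geneB\tgeneB\t100.0\t800\t0\t0\t1\t800\t1\t800\t0.0\t1400",
    "geneB\tgeneC\t92.0\t700\t50\t2\t1\t700\t1\t700\t0.0\t900",
    "geneC\tgeneC\t100.0\t750\t0\t0\t1\t750\t1\t750\t0.0\t1300",
    "geneC\tgeneB\t92.0\t700\t50\t2\t1\t700\t1\t700\t0.0\t900",
]
print(single_copy_candidates(example_self_hits))
```

With real data, the lines would come from the BLAST output file (for example by passing an open file handle instead of a list). I realise alternative isoforms in a transcriptome would probably also be flagged as multiple-copy by this kind of filter.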
Once I've managed to do that, I would then need to do an across-species BLAST to determine which genes are present in all species. To begin this process, would I need to run makeblastdb on the set of single-copy genes and then query a .fasta file of all single-copy genes against this database (e.g., with blastn)? As with the previous step, I expect the output to contain a mix of data I want and data I don't want, so how would I deal with this? Would reciprocal best hit with BLAST be a better option for producing what I want?
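As I understand it, reciprocal best hit means keeping a pair of genes only when each is the other's top-scoring hit. Below is a rough sketch of that idea for two species, again assuming tabular `-outfmt 6` output with the bit score in the twelfth column. The function names `best_hits` and `reciprocal_best_hits` and the example IDs are hypothetical.

```python
def best_hits(blast_lines):
    """Map each query ID to its highest-bitscore subject ID."""
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b_lines, b_vs_a_lines):
    """Return (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b_lines)
    reverse = best_hits(b_vs_a_lines)
    return {(a, b) for a, b in forward.items() if reverse.get(b) == a}


# Made-up example: species A queried against species B, and the reverse.
a_vs_b = [
    "a1\tb1\t95.0\t800\t40\t0\t1\t800\t1\t800\t0.0\t500",
    "a1\tb2\t80.0\t600\t120\t3\t1\t600\t1\t600\t1e-90\t300",
    "a2\tb2\t90.0\t700\t70\t1\t1\t700\t1\t700\t0.0\t400",
]
b_vs_a = [
    "b1\ta1\t95.0\t800\t40\t0\t1\t800\t1\t800\t0.0\t500",
    "b2\ta1\t82.0\t620\t110\t2\t1\t620\t1\t620\t1e-100\t350",
    "b2\ta2\t88.0\t650\t78\t2\t1\t650\t1\t650\t1e-95\t300",
]
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In this toy case only a1 and b1 are kept, because b2's best hit is a1 rather than a2. For six species, I assume I would need to repeat this for every pair of species and keep only the genes linked across all of them.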
Sorry if this seems way too basic. I admit I do not have a strong or even moderate bioinformatics background, but I am trying to become more experienced.