I'm reading up on fish osmoregulation and have come across many papers discussing the expression levels of a very important gene, the Na+/K+-ATPase alpha subunit (NKAa or atp1a1; it goes by many names). Several "versions" are well known, such as atpa1a, atpa1b, and atpa1c, and each one is expressed differently depending on species, habitat, and tissue.
But half the papers refer to these genes as isoforms, and the other half call them paralogues because of the whole genome duplication event.
I want to sequence these genes in my own species later on, but as I write about it, how can I tell whether these are indeed paralogues or isoforms? When I look up the genes from the papers on Ensembl, they are listed as paralogues, so why do the papers differ? These terms are not interchangeable in my book.
Not a typical bioinformatics question, but I'll give it a try. Probably both are true. This gene must have experienced duplication events (either local or whole genome) that generated paralogous copies. Each copy may also have multiple transcript isoforms. The key distinction is that paralogous genes sit at different physical locations in the genome, while isoforms come from a single locus. To recover the copies in your species, you'll have to find the regions conserved across all known copies, presumably in species related to your target species, and "fish" all copies out of whole genome sequences or by PCR with primers anchored to the conserved regions. You'll need to do RT-PCR or mRNA-Seq to confirm all isoforms.
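To make the locus-based distinction concrete, here is a minimal Python sketch. It assumes you already have a set of homologous transcripts from one genome together with their mapped coordinates. The transcript names, coordinates, and the simple "same chromosome, same strand, overlapping span" rule are all illustrative assumptions, not data from any real annotation or a method taken from the papers.

```python
from itertools import combinations

# Hypothetical records: (transcript_id, chromosome, start, end, strand).
# These coordinates are made up purely for illustration.
TRANSCRIPTS = [
    ("copyA.t1", "chr1", 1000, 9000, "+"),
    ("copyA.t2", "chr1", 1200, 9000, "+"),
    ("copyB.t1", "chr7", 50000, 61000, "-"),
    ("copyC.t1", "chr1", 20000, 28000, "+"),  # tandem copy on the same chromosome
]


def same_locus(a, b):
    """Return True if two transcripts share chromosome and strand and overlap."""
    return a[1] == b[1] and a[4] == b[4] and a[2] <= b[3] and b[2] <= a[3]


def classify_pairs(transcripts):
    """Label every pair of homologous transcripts by whether they share a locus."""
    labels = {}
    for a, b in combinations(transcripts, 2):
        labels[(a[0], b[0])] = "isoforms" if same_locus(a, b) else "paralogues"
    return labels


if __name__ == "__main__":
    for pair, label in classify_pairs(TRANSCRIPTS).items():
        print(pair, label)
```

Note that the overlap rule is only a heuristic: a tandem duplicate on the same chromosome is still treated as a separate locus because its coordinates do not overlap, but real annotations may need a stricter test, such as shared exons.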