I am working on a project with Illumina sequencing reads (FASTQ format) from 100 isolates. The aim of the study is to investigate the phylogenetic relationships among these isolates based on a conserved sequence within the HMG gene, which a paper has suggested as a marker for distinguishing species.
I am seeking guidance on how to align the reads from each isolate to the HMG gene and how to extract each isolate's HMG sequence from the FASTQ data, so that the extracted sequences can be used for phylogenetic analysis.
I already have the HMG gene sequence in FASTA format, and I understand that I may need to assemble the reads and identify the gene region in each assembly before alignment. However, I am unsure which tools would be suitable for this, especially since I have run into memory limitations with SPAdes.
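
To make the goal concrete, here is a minimal Python sketch of the end product I am after. It assumes (hypothetically) that the HMG sequence of each isolate has already been recovered into its own FASTA file named after the isolate, for example `isolate01.fasta`. The function names and file layout are my own placeholders, not part of any existing tool. The sketch only collects those per-isolate sequences into one multi-FASTA file that could then be passed to a multiple sequence aligner; the step I am stuck on is how to get from the FASTQ reads to those per-isolate files.

```python
from pathlib import Path


def read_fasta(path):
    """Return a list of (header, sequence) tuples from a FASTA file."""
    records = []
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if header is not None:
                    records.append((header, "".join(chunks)))
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def combine_isolates(fasta_paths, out_path):
    """Write the first record of each per-isolate FASTA into one multi-FASTA.

    Each record is renamed after its file name (assumed to be the isolate ID).
    Files without any record are skipped. Returns the number of sequences written.
    """
    written = 0
    with open(out_path, "w") as out:
        for path in sorted(fasta_paths):
            records = read_fasta(path)
            if not records:
                continue  # no HMG sequence was recovered for this isolate
            isolate = Path(path).stem
            out.write(f">{isolate}\n{records[0][1]}\n")
            written += 1
    return written
```

Calling `combine_isolates(["isolate01.fasta", "isolate02.fasta"], "hmg_all.fasta")` would write one record per isolate to `hmg_all.fasta` and return how many sequences were written, which also gives a quick check of how many isolates yielded an HMG sequence.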