Hi,
I had this curious question: "Can we use Hi-C sequencing data from one isolate to assemble the scaffolds of another isolate into chromosomes?"
Let me elaborate.
In fungal genome assembly papers, I have seen people use the closest reference genome to assemble contigs into scaffolds with tools like RagTag, e.g. in Fusarium genomes where there is only one genome assembly on NCBI, which is treated as the reference genome for comparative analysis and for scaffolding contigs. Will it introduce some reference bias into the query genome? Maybe, but I don't know how much. Theoretically I feel that it would, but again, people use it for scaffolding.
Similarly, I had a thought: why not take the Hi-C data of the closest relative, within the bounds of the same species, and use it to arrange the contigs into scaffolds/chromosomes? Would that be a good approach?
For example:
Sample A is a plant-pathogenic fungus from a species complex named XYZ; it is known to be pathogenic to Arabidopsis. Someone sequenced it with PacBio + Hi-C to generate a chromosome-level assembly.
Sample B is another fungus belonging to the same species complex XYZ; it is known to be pathogenic, but was sequenced using Nanopore + Illumina.
So, can we use the Hi-C data of sample A to scaffold the sample B assembly better?
I would love to hear your thoughts on this.
While someone will likely be along with an answer based on their own experience, you may want to try asking ChatGPT these two questions and evaluating the responses: "use Hi-C sequencing data to assemble scaffolds to chromosomes of a related species" and "is Hi-C data only usable for the organism it originated from".
The devil will always be in the details: how closely related the two strains are, and how good the data quality is. But it sounds like it would be worth trying.
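If you want a rough number for how closely related the two isolates are before trying anything, one option is to compare the k-mer content of the two assemblies. Below is a minimal Python sketch of that idea, not a substitute for dedicated tools such as Mash or a whole-genome alignment. The function names and the choice of k=21 are assumptions made for illustration, and it loads whole k-mer sets into memory, so it only suits small genomes or test regions.

```python
# Rough relatedness check: Jaccard similarity of canonical k-mer sets.
# All names here are hypothetical helpers, not part of any published tool.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_kmers(seq, k=21):
    """Return the set of canonical k-mers in seq.

    A canonical k-mer is the lexicographically smaller of the k-mer and its
    reverse complement, so strand orientation does not matter.
    K-mers containing anything other than A/C/G/T (e.g. N) are skipped.
    """
    seq = seq.upper()
    valid = set("ACGT")
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= valid:
            revcomp = kmer.translate(COMPLEMENT)[::-1]
            kmers.add(min(kmer, revcomp))
    return kmers


def read_fasta(path):
    """Yield one sequence string per FASTA record in the file at path."""
    chunks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if chunks:
                    yield "".join(chunks)
                chunks = []
            elif line:
                chunks.append(line)
    if chunks:
        yield "".join(chunks)


def assembly_kmers(path, k=21):
    """Union of canonical k-mers over every contig in a FASTA file."""
    kmers = set()
    for seq in read_fasta(path):
        kmers |= canonical_kmers(seq, k)
    return kmers


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)
```

With placeholder file names, usage would look like `jaccard(assembly_kmers("sampleA.fasta"), assembly_kmers("sampleB.fasta"))`. A value close to 1 suggests the isolates share most of their sequence; a low value is a warning that reads from sample A may map poorly to sample B contigs. Note that even near-identical sequence does not rule out large rearrangements, which is exactly what Hi-C contacts from another isolate could get wrong.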