Question

Spoligotyping of whole genome sequences

10.2 years ago

ribas.aca • 0

Hello, I was reading this paper and was wondering.

For some simple organisms, such as Mycobacterium bovis and Mycobacterium tuberculosis, there are a great number of whole genome sequences available, for example in GenBank.

Is it possible to type the organism using only a whole genome sequence?

If I understand correctly, restriction fragment length polymorphism (RFLP) typing consists of amplifying a region of the DNA. If we know the primers, we could simply extract that region from the whole genome sequence, right?

Then we cut it with a restriction enzyme, which cuts at a specific recognition sequence, for example ACCT. Every ACCT in the sequence produces a cut, turning the region extracted in the step above into many fragments of various lengths. We then compare the lengths with a reference. The lengths are the black bands seen in figure 2 of the linked paper, right? So we know that there are fragments of x bases and fragments of y bases, and by comparing these numbers with the reference we give the organism a name, a type?

If I'm not totally misinterpreting this, is the procedure possible, and is it implemented in some language, such as Python or R with Bioconductor? That is, a procedure that takes a whole genome and types it by RFLP against a reference? Also, can someone suggest more literature about this?

Spoligotyping wholegenome typeing rflp • 2.2k views

updated 2.7 years ago by Ram 45k • written 10.2 years ago by ribas.aca • 0

I think you're describing RFLP fingerprinting, where running the same RFLP assay on two samples shows whether they come from the same individual. The patterns won't be perfectly identical, but for the most part it works between real samples. Reference genomes aren't really that accurate, so you should run a known sample to compare against.

updated 2.7 years ago by Ram 45k • written 10.2 years ago by karl.stamm 4.1k

Ram · Answer 1 · 2015-12-22

Spoligotyping results can be derived from a whole genome sequence without much difficulty, but RFLP results are harder. To reliably reproduce the cuts in silico, your genome would need to be completely assembled, and that is not an easy task. If you have closed genomes, it should not be very difficult to write a script in Python or R to do this. Which language to use depends on which one you are most familiar with.
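As a rough illustration of what such a script could look like, here is a minimal Python sketch, not a validated typing tool. It assumes a closed genome held in memory as a plain string, a palindromic recognition site (so only the forward strand is searched), and exact matching with no tolerance for sequencing errors. The function names, the `cut_offset` parameter, and the toy sequences are all made up for the example. The spacer list passed to `spacer_pattern` is a placeholder: a real spoligotype would use the published spacer set for the direct repeat locus.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_cut_positions(genome, site, cut_offset=0, circular=True):
    """Return sorted cut coordinates for every occurrence of `site`.

    `cut_offset` is the distance from the start of the site to the cut
    (hypothetical parameter; e.g. 1 for an enzyme cutting G^AATTC).
    Assumes a palindromic site, so only the forward strand is scanned.
    """
    genome, site = genome.upper(), site.upper()
    n = len(genome)
    # For a circular genome, append the first len(site)-1 bases so that
    # sites spanning the origin are also found.
    text = genome + genome[:len(site) - 1] if circular else genome
    cuts = set()
    start = text.find(site)
    while start != -1:
        cuts.add((start + cut_offset) % n)
        start = text.find(site, start + 1)  # step by 1 to allow overlaps
    if not circular:
        cuts.discard(0)  # a cut at the very end of a linear molecule is no cut
    return sorted(cuts)


def fragment_lengths(genome, site, cut_offset=0, circular=True):
    """Return the sorted fragment lengths of an in silico digest."""
    n = len(genome)
    cuts = find_cut_positions(genome, site, cut_offset, circular)
    if not cuts:
        return [n]
    if circular:
        # Consecutive cuts, plus the fragment that wraps around the origin.
        lengths = [b - a for a, b in zip(cuts, cuts[1:])]
        lengths.append(n - cuts[-1] + cuts[0])
    else:
        bounds = [0] + cuts + [n]
        lengths = [b - a for a, b in zip(bounds, bounds[1:])]
    return sorted(lengths)


def spacer_pattern(genome, spacers):
    """Return a presence/absence string ('1'/'0') for each spacer.

    A spacer counts as present if it, or its reverse complement,
    occurs exactly in the genome.
    """
    genome = genome.upper()
    return "".join(
        "1" if s.upper() in genome or reverse_complement(s.upper()) in genome
        else "0"
        for s in spacers
    )


if __name__ == "__main__":
    toy_genome = "AAGAATTCAAAAGAATTCAA"  # toy sequence, not real data
    print(fragment_lengths(toy_genome, "GAATTC", cut_offset=1, circular=True))
    print(fragment_lengths(toy_genome, "GAATTC", cut_offset=1, circular=False))
    print(spacer_pattern(toy_genome, ["GAATTC", "CCCC", "TTTTG"]))
```

Comparing the resulting fragment lengths or presence/absence string against the same values computed from a reference strain would then give the type. On a draft assembly split into contigs the fragment lengths would be unreliable, since fragments spanning contig breaks are lost, which is the reason a closed genome is needed for the RFLP part.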