Question

What is the best approach to detect an unknown recombinant gene in a genome?

5.3 years ago

Can Holyavkin ▴ 250

We have just sequenced the whole genome of a bacterium. We want to detect whether any recombinant gene has been inserted into the genome. The inserted gene is unknown.

So far:

Mapped the paired-end reads to the reference genome of our organism (via bwa).
Extracted the read pairs in which neither mate mapped (via samtools).
Performed de novo assembly of those unmapped reads (via velvet).
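
A minimal Python sketch of the selection rule in the second step, assuming the alignments are available as SAM text. It keeps a read pair only when both the "read unmapped" (0x4) and "mate unmapped" (0x8) FLAG bits are set, which is the same condition as `samtools view -f 12`. The function names and the example records are hypothetical and only illustrate the logic.

```python
# SAM FLAG bits, as defined in the SAM specification.
READ_UNMAPPED = 0x4
MATE_UNMAPPED = 0x8


def both_mates_unmapped(flag: int) -> bool:
    """True when the read and its mate are both flagged as unmapped."""
    return bool(flag & READ_UNMAPPED) and bool(flag & MATE_UNMAPPED)


def unmapped_pair_names(sam_lines):
    """Collect the names of read pairs in which neither mate mapped."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if both_mates_unmapped(int(fields[1])):
            names.add(fields[0])
    return names


# Hypothetical records: pair r1 is fully unmapped, pair r2 has one mapped mate.
example = [
    "@HD\tVN:1.6",
    "r1\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r1\t141\t*\t0\t0\t*\t*\t0\t0\tTTGA\tIIII",
    "r2\t73\tchr\t100\t60\t4M\t=\t100\t0\tACGT\tIIII",
    "r2\t133\tchr\t100\t0\t*\t=\t100\t0\tGGCC\tIIII",
]
print(unmapped_pair_names(example))
```

Pairs with only one unmapped mate (like r2 here) are dropped by this rule, even though such pairs can span an insertion junction.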

Now I have ~400 contigs and plan to BLAST each of them. Is this a valid approach?
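
With ~400 contigs, the BLAST results are easier to scan once they are reduced to one top hit per contig. The sketch below assumes BLAST+ was run with the default 12-column tabular output (`-outfmt 6`: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The contig and subject names in the example are made up.

```python
def best_hits(blast_tab_lines):
    """Return {contig: (subject, bitscore)} keeping the highest-bitscore hit."""
    best = {}
    for line in blast_tab_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


# Hypothetical hits for two contigs.
example_hits = [
    "contig_1\tplasmid_A\t99.1\t800\t7\t0\t1\t800\t10\t809\t0.0\t1400",
    "contig_1\tphage_B\t85.0\t300\t45\t0\t1\t300\t5\t304\t1e-80\t300",
    "contig_2\tvector_C\t100.0\t500\t0\t0\t1\t500\t1\t500\t0.0\t920",
]
for contig, (subject, score) in sorted(best_hits(example_hits).items()):
    print(contig, subject, score)
```

Contigs with no hit at all will not appear in the BLAST table, so they need to be listed separately from the assembly FASTA.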

Or should I use another method? Maybe I should focus on integration breakpoints instead of unmapped reads?
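
One way to look for integration breakpoints is to count reads whose alignment ends in a long soft clip, since reads that cross an insertion junction usually align on one side and are clipped on the other. This is a rough sketch on plain SAM text, not a replacement for a dedicated structural-variant caller. The function names and the `min_clip` threshold of 20 bases are assumptions.

```python
import re
from collections import Counter

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def clip_breakpoints(pos, cigar, min_clip=20):
    """Return 1-based reference positions where a long soft clip meets aligned bases."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    core = [o for o in ops if o[1] != "H"]  # ignore hard clips at the ends
    points = []
    if core and core[0][1] == "S" and core[0][0] >= min_clip:
        points.append(pos)  # first aligned base; clipped bases lie to its left
    if len(core) > 1 and core[-1][1] == "S" and core[-1][0] >= min_clip:
        ref_len = sum(n for n, op in core if op in REF_CONSUMING)
        points.append(pos + ref_len - 1)  # last aligned base; clip lies to its right
    return points


def count_breakpoints(sam_lines, min_clip=20):
    """Count soft-clip junctions per (reference name, position) over mapped reads."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        f = line.rstrip("\n").split("\t")
        flag, rname, pos, cigar = int(f[1]), f[2], int(f[3]), f[5]
        if flag & 0x4 or cigar == "*":  # skip unmapped reads
            continue
        for point in clip_breakpoints(pos, cigar, min_clip):
            counts[(rname, point)] += 1
    return counts


print(clip_breakpoints(931, "70M30S"))   # read clipped on its right end
print(clip_breakpoints(1001, "30S70M"))  # read clipped on its left end
```

Positions supported by many clipped reads from both directions are candidate junctions, and the clipped sequences themselves can be compared with the contigs assembled from the unmapped reads.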

recombinant sequencing • 941 views

ADD COMMENT • link updated 23 months ago by Ram 45k • written 5.3 years ago by Can Holyavkin ▴ 250

Do you have a reference for the bacterium in question? Do you have any idea of the sequence or function (e.g. antibiotic resistance) of the inserted gene, or are you looking for extraneous sequences that appear to disrupt an ORF?

Past thread that may be useful: Identification of the sequence insertion site in the genome

ADD REPLY • link 5.3 years ago by GenoMax 151k

Thank you for your comment. Yes, I have the reference sequence of the bacterium and have already mapped the reads to it. But I have no idea of the sequence or function of the gene inserted into the genome. I am not interested in whether the inserted gene disrupts an ORF. I want to detect any gene or large sequence that has integrated into the genome.

I'll check the link you mentioned and update the question if it helps.

ADD REPLY • link 5.3 years ago by Can Holyavkin ▴ 250

Are those 400 all of the contigs from the assembly? If so, you should map them to the reference that you have. Using BLAT (as long as your reference is reasonably homologous) may be the fastest option. You could also use minimap2, since it can write SAM output (with `-a`) that you can convert to BAM and view in IGV. These analyses will give you an idea of redundancy and of the contigs, or parts of contigs, that don't map to the genome. You will need to do some additional work (PCR etc.) to prove that the insertion is indeed where you think it is.
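
As a rough illustration of finding contigs, or parts of contigs, that don't map to the reference, the sketch below reads minimap2's default PAF output and reports the fraction of each contig covered by at least one alignment. It uses the first four PAF columns (query name, query length, query start, query end; 0-based, end-exclusive). The `contig_lengths` dictionary is an assumed input built from the assembly FASTA, and the contig names are hypothetical.

```python
def merged_length(intervals):
    """Total length of the union of half-open (start, end) intervals."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def contig_coverage(paf_lines, contig_lengths):
    """Return {contig: fraction of its bases aligned to the reference}."""
    spans = {name: [] for name in contig_lengths}
    for line in paf_lines:
        if not line.strip():
            continue
        f = line.rstrip("\n").split("\t")
        name, qstart, qend = f[0], int(f[2]), int(f[3])
        spans.setdefault(name, []).append((qstart, qend))
    return {
        name: merged_length(spans[name]) / contig_lengths[name]
        for name in contig_lengths
    }


# Hypothetical contigs: c1 maps fully, c2 maps partially, c3 has no alignment.
contig_lengths = {"c1": 1000, "c2": 500, "c3": 800}
paf = [
    "c1\t1000\t0\t1000\t+\tchr\t5000\t100\t1100\t990\t1000\t60",
    "c2\t500\t0\t200\t+\tchr\t5000\t2000\t2200\t198\t200\t60",
    "c2\t500\t150\t300\t-\tchr\t5000\t3000\t3150\t149\t150\t30",
]
for name, frac in sorted(contig_coverage(paf, contig_lengths).items()):
    print(name, round(frac, 2))
```

Contigs with low or zero coverage are the ones worth taking to BLAST first. Partially covered contigs may contain the junction between the genome and the inserted sequence.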

ADD REPLY • link 5.3 years ago by GenoMax 151k