Question

How to find a gene from genome shotgun reads

10.6 years ago

qiyunzhu ▴ 130

I am trying to extract a gene marker, say COII (cytochrome oxidase subunit II; e.g., NCBI: JQ319797), from a sequenced but unassembled insect genome, Musca domestica (house fly; NCBI: AQPM00000000). I downloaded all the scaffolds, created a local BLAST database, and ran blastn with a COII homolog as the query against this database. The result includes multiple hits, each looking like a real COII, but they differ slightly from each other in sequence. I thought a single genome should contain only one version of a gene, but the result seems to be a mixture. Is this an artifact of the incompleteness of the genome sequencing project, did I do something wrong, or, since it is a mitochondrial gene, are there supposed to be different versions within one insect? And how should I correctly get this gene marker from the genome? Thank you!
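For readers following the same workflow, here is a minimal sketch of how the blastn hits could be summarized per scaffold. It assumes the search was run with BLAST+ tabular output (`-outfmt 6`, default column order); the function name `summarize_hits`, the `min_length` cutoff, and the scaffold names in the usage note are hypothetical and not part of the original question.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def summarize_hits(tabular_text, min_length=200):
    """Return the best-scoring hit per subject scaffold, best first.

    min_length is an arbitrary, hypothetical cutoff used to drop short
    fragmentary hits; adjust it to the length of the marker gene.
    """
    best = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        hit["pident"] = float(hit["pident"])
        hit["length"] = int(hit["length"])
        hit["bitscore"] = float(hit["bitscore"])
        if hit["length"] < min_length:
            continue
        previous = best.get(hit["sseqid"])
        if previous is None or hit["bitscore"] > previous["bitscore"]:
            best[hit["sseqid"]] = hit
    return sorted(best.values(), key=lambda h: h["bitscore"], reverse=True)
```

Calling `summarize_hits(open("hits.tsv").read())` on a tabular result file (the file name is illustrative) gives one row per scaffold, which makes it easier to see how many distinct COII-like copies were hit and how similar each one is to the query.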

Assembly sequencing • 3.1k views

ADD COMMENT • link updated 4.3 years ago by Biostar 20 • written 10.6 years ago by qiyunzhu ▴ 130

Were the read counts for the different variants comparable? Also, I would suggest changing the title of the question.

ADD REPLY • link 10.6 years ago by oganm ▴ 60

Yes, they are comparable. The variants align well, with several to a dozen polymorphisms and sometimes short gaps. What would you suggest for the new title?
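To put numbers on "several to a dozen polymorphisms, and sometimes short gaps", one could count the differences between two hits once they have been aligned. The sketch below assumes the two sequences have already been aligned by some external tool and are passed in as equal-length strings with `-` for gaps; the function name `count_differences` is hypothetical.

```python
def count_differences(aligned_a, aligned_b):
    """Count mismatches and gap runs between two pre-aligned sequences.

    Both inputs must have the same length and use '-' as the gap character.
    A run of consecutive gapped columns is counted as a single gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    mismatches = 0
    gap_runs = 0
    in_gap = False
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information for this pair
        if a == "-" or b == "-":
            if not in_gap:
                gap_runs += 1  # a new run of gapped columns starts here
            in_gap = True
        else:
            in_gap = False
            if a != b:
                mismatches += 1
    return mismatches, gap_runs
```

For example, `count_differences("ATGCC--TA", "ATGAACGTA")` returns `(2, 1)`: two substitutions and one two-base gap.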

ADD REPLY • link 10.6 years ago by qiyunzhu ▴ 130

First, identify the mitochondrial reads, try to assemble them, and annotate the resulting mitochondrial genome with a mitochondrial annotation server such as MITOS or DOGMA; the annotation will include the COII gene.

Alternatively, design primers specific to this gene in insects and sequence the gene directly.

ADD REPLY • link 10.6 years ago by cvu ▴ 180

Thanks for your idea! However, I also need some nuclear genomes. And the genomic data does not indicate which reads are from mitochondria. Doing sequencing (you mean experimentally, right?) is not an option for me now, as we don't have those insect samples.

ADD REPLY • link 10.6 years ago by qiyunzhu ▴ 130

Simply map your reads against the closest available reference mitochondrial genome; the reads that map are your mitochondrial reads.

ADD REPLY • link 10.6 years ago by cvu ▴ 180
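As a rough illustration of the mapping step, the sketch below builds the shell commands one might run. The thread does not name a mapper, so the use of `bwa` and `samtools` is an assumption, as are the function name `mito_read_commands`, paired-end input, and all file names. The function only returns command strings and does not execute anything.

```python
import shlex


def mito_read_commands(reference_fasta, reads_1, reads_2, out_prefix, threads=4):
    """Build shell commands that pull out read pairs mapping to a mito reference.

    Assumes paired-end FASTQ input and that bwa and samtools are installed;
    neither tool is named in the thread.
    """
    ref = shlex.quote(reference_fasta)
    r1 = shlex.quote(reads_1)
    r2 = shlex.quote(reads_2)
    bam = shlex.quote(f"{out_prefix}.mito.bam")
    out_1 = shlex.quote(f"{out_prefix}.mito_1.fq")
    out_2 = shlex.quote(f"{out_prefix}.mito_2.fq")

    return [
        # 1. Index the reference mitochondrial genome.
        f"bwa index {ref}",
        # 2. Map the reads and keep only pairs where both mates mapped
        #    (-F 12 drops records with the read or its mate unmapped).
        f"bwa mem -t {int(threads)} {ref} {r1} {r2} | "
        f"samtools view -b -F 12 -o {bam} -",
        # 3. Convert the retained pairs back to FASTQ for assembly.
        f"samtools fastq -1 {out_1} -2 {out_2} -0 /dev/null -s /dev/null -n {bam}",
    ]
```

Printing the result of `mito_read_commands("mito_ref.fa", "reads_1.fq", "reads_2.fq", "fly")` gives three commands to review and run by hand. The extracted read pairs could then go to an assembler and on to an annotation server such as MITOS, as suggested earlier in the thread.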