I have paired-end whole-genome shotgun (WGS) reads from a plant, sequenced on a HiSeq 2000 (read1.fastq, read2.fastq). I want to remove the chloroplast and mitochondrial reads so that I assemble only the nuclear reads. Which of the following is the better and faster way to remove these contaminants?
1. Assemble the genome first, then use BLAST to identify and remove the mitochondrial and chloroplast sequences from the assembly.
2. Remove the mitochondrial and chloroplast reads from the FASTQ files first: map the reads to the respective mitochondrial and chloroplast reference sequences with bwa, extract the unmapped reads as nuclear reads in FASTQ format with samtools and picard, and then do the genome assembly.
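To make option 2 concrete, here is a minimal Python sketch of the "keep the unmapped pairs" step, assuming the reads have already been aligned to the organelle references and the alignments are available as SAM text. The function names (`nuclear_read_names`, `filter_fastq`) are made up for illustration and are not part of bwa, samtools, or picard. The flag bits come from the SAM specification (0x4 = read unmapped, 0x8 = mate unmapped). This is only one way to do it, not a tested pipeline.

```python
# SAM flag bits, as defined in the SAM specification.
UNMAPPED = 0x4          # this read did not align
MATE_UNMAPPED = 0x8     # its mate did not align
SECONDARY = 0x100       # secondary alignment record
SUPPLEMENTARY = 0x800   # supplementary alignment record


def nuclear_read_names(sam_lines):
    """Return names of pairs where neither mate aligned to the organelle reference."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:               # skip blank or malformed lines
            continue
        flag = int(fields[1])
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue
        # Keep the pair only if both this read and its mate are unmapped.
        if flag & UNMAPPED and flag & MATE_UNMAPPED:
            names.add(fields[0])
    return names


def filter_fastq(fastq_lines, keep_names):
    """Yield the lines of 4-line FASTQ records whose read name is in keep_names."""
    record = []
    for line in fastq_lines:
        record.append(line)
        if len(record) == 4:
            name = record[0][1:].split()[0]     # drop '@' and any comment
            if name.endswith("/1") or name.endswith("/2"):
                name = name[:-2]                # drop old-style mate suffix
            if name in keep_names:
                yield from record
            record = []
```

Usage would be along the lines of reading the SAM file once to build the name set, then passing read1.fastq and read2.fastq through `filter_fastq` with that set and writing the results to two new files. Requiring both mates to be unmapped is the strict choice and corresponds to the flag value 12 (0x4 + 0x8); pairs where only one mate hits an organelle are discarded.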
A question on the same topic: do you know how NCBI filters out mitochondrial contigs from submitted nuclear genomes? I couldn't find that information.
Where do you get the chloroplast and mitochondrial genome sequences? NCBI? Thanks.