Hi, I am trying to get full gene sequences of avian influenza virus from Illumina reads. My sequencing guys do the Illumina run, use Velvet to do a de novo assembly, BLAST the resulting contigs to find the best reference, then use BWA to map the reads to that reference, and finally call a new consensus sequence.
My first question: the reference we use is often only the coding region, not the full gene, because that is what is most often published. Is there something I can do with the data to extend the sequence beyond the 3' and 5' ends of the reference that the reads were mapped against?
My second question: the reference might have insertions or deletions compared to the isolate I am trying to sequence. Is there a way for BWA to recognise where my data is longer than the reference and report that data? At the moment I think BWA just trims reads to fit the reference.
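To show what I mean by "longer than the reference", here is a minimal Python sketch of how I could check the CIGAR strings in the SAM output for soft-clipped ends (`S`) and insertions (`I`). The function name `summarize_cigar` and the example CIGAR strings are made up for illustration, and I am assuming the aligner writes soft clips and insertions to the CIGAR field rather than discarding those bases.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM format.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def summarize_cigar(cigar):
    """Return (leading_soft_clip, trailing_soft_clip, inserted, deleted) for one CIGAR string.

    Hypothetical helper for illustration only. Hard clips (H) are ignored
    because those bases are not present in the SAM record.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
    if not ops:
        # Unmapped reads have "*" as their CIGAR, which yields no operations.
        return (0, 0, 0, 0)
    leading = ops[0][0] if ops[0][1] == "S" else 0
    trailing = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    inserted = sum(n for n, op in ops if op == "I")
    deleted = sum(n for n, op in ops if op == "D")
    return (leading, trailing, inserted, deleted)


if __name__ == "__main__":
    # Made-up CIGAR strings, not taken from real data.
    for cigar in ["12S80M3I5M", "50M2D40M10S", "100M"]:
        print(cigar, summarize_cigar(cigar))
```

If reads at the ends of the reference show large soft clips, I assume those clipped bases are the sequence that runs past the reference, but I do not know how to get them into the consensus.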
Thanks for your help,
James