Hello,
I'm looking for recommendations on a method for haplotyping. The input is a .sam file with reads mapped to a known reference. The reads come from a diploid species (human), so rather than reconstructing a single genome I want to reconstruct both of its haplotypes.
The reads are long (~10,000 bp) with a high error rate (3-15% of nucleotides are errors, the majority of them indels).
I've found one method in https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4073643/ but it does not seem to be publicly available for download. My main problem is finding out which methods are currently considered the best while also being easy to install and use.
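To make the indel figure concrete, here is a minimal sketch of how the indel fraction of a single alignment could be estimated from the CIGAR column of the .sam file. The function name `indel_fraction` and the example CIGAR string are made up for illustration. It only counts inserted and deleted bases; mismatches cannot be seen in a plain `M` operation, so this is not the full error rate.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def indel_fraction(cigar):
    """Return inserted + deleted bases divided by aligned + indel bases.

    Hypothetical helper for illustration; soft/hard clips, skips and
    padding (S, H, N, P) are ignored. Returns 0.0 for an unmapped read
    whose CIGAR is '*'.
    """
    counts = {}
    for length, op in CIGAR_OP.findall(cigar):
        counts[op] = counts.get(op, 0) + int(length)
    indels = counts.get("I", 0) + counts.get("D", 0)
    aligned = sum(counts.get(op, 0) for op in "M=X") + indels
    return indels / aligned if aligned else 0.0


# Example CIGAR (invented): 95 aligned bases, 2 inserted, 3 deleted.
print(indel_fraction("50M2I30M3D15M"))
```

Averaging this value over all mapped reads gives a rough idea of how indel-heavy the data are, which matters when choosing a phasing tool.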
Why don't you explicitly write which technology was used? We can guess that it's nanopore data, but there is no need to be obscure about it.
Have a look at WhatsHap; I haven't used it myself, though.
Sorry, Pacific Biosciences Real-time Sequencer.
So Falcon Unzip maybe?
Thank you, will try that one.
It seems the program is currently under development and far from trivial to use. Do you know of something for the slightly less tech-savvy? (Tools like BWA-MEM and samtools I can handle.)
Actually, I want to do the same (in another species). I have worked with Illumina and PacBio reads before, and I think the best way to do this is to sequence the chromosomes separately, at least if you have an overall RDC higher than 500x. You can also look at the dipSPAdes assembler; it is a good approximation:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4449708/