Hi everyone,
I have been analysing a pool-seq dataset with the PoPoolation suite of tools, but I would like to give MULTIPOOL a try as well. However, I am having trouble formatting the input file. According to the wiki (https://github.com/matted/multipool/wiki), it should look something like this:
18698 38 50
19079 38 37
19190 37 34
19235 28 41
19418 45 18
19592 42 47
19607 37 53
Here column 1 is the SNP position, column 2 is the allele count for pool A, and column 3 is the allele count for pool B (the wiki doesn't specify which allele, but I assume these are reference allele counts).
Does anyone have suggestions on how to construct this input (from BAM, mpileup, or another source)? I tried doing it with popoolation2, but its output is not suitable: it gives major and minor allele counts per population rather than counts of one fixed allele, so if you're doing something similar to a bulk segregant analysis, the regions of interest (positions with divergent allele frequencies) will be masked.
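To make the target concrete, here is a rough sketch of the kind of conversion I have in mind. It assumes a synchronized (sync) file as the starting point, where each line is chromosome, position, reference base, and then one column per pool with counts in A:T:C:G:N:del order. The function name `sync_line_to_row` and the choice of which columns are pool A and pool B are made up for illustration, and the three-column output follows my reading of the MULTIPOOL format above (reference allele counts), which may be wrong.

```python
# Index of each base within a sync count column (A:T:C:G:N:del).
BASE_INDEX = {"A": 0, "T": 1, "C": 2, "G": 3}


def sync_line_to_row(line, pool_a=0, pool_b=1):
    """Turn one sync line into 'position countA countB'.

    The counts are those of the reference base in the two chosen pools.
    Returns None when the reference base is not A, T, C or G.
    Assumption: reference allele counts are what MULTIPOOL expects.
    """
    fields = line.split()
    position = fields[1]
    ref_base = fields[2].upper()
    if ref_base not in BASE_INDEX:
        return None
    idx = BASE_INDEX[ref_base]
    # Pool columns start at the fourth field (index 3).
    count_a = int(fields[3 + pool_a].split(":")[idx])
    count_b = int(fields[3 + pool_b].split(":")[idx])
    return f"{position} {count_a} {count_b}"


if __name__ == "__main__":
    # Made-up example line, not real data.
    example = "2L\t18698\tA\t38:12:0:0:0:0\t50:9:0:0:0:0"
    print(sync_line_to_row(example))
```

This only keeps the reference allele count for each pool, so it would still need filtering (for example, dropping sites that are not biallelic) before being useful.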
Thanks!