I want to use PSMC to estimate the divergence time of two closely related human populations. For that, I need to construct a pseudo-diploid sequence. I have a BAM file for each of the two populations. How should I proceed?
Can I use the command below directly?
samtools mpileup -C50 -uf <ref.fa> <file1.bam> <file2.bam> | bcftools view -c - | vcfutils.pl vcf2fq | gzip > diploid.fq.gz
I think this command may be wrong. Another way would be to get a haploid sequence from each BAM file and then use seqtk mergefa to merge the two haploid sequences. But I do not know how to get a haploid sequence from a BAM file.
Could someone give me some suggestions, please?
From the samtools mpileup manual:
If your BAMs do not keep sample (individual) information as @RG tags, your samtools command probably won't produce the correct result.
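To check which samples your BAMs declare, you can look at the SM fields of the @RG header lines. Here is a minimal sketch; `rg_samples` is a hypothetical helper name of my own, not part of samtools, and it only reads header text from stdin:

```shell
# rg_samples: hypothetical helper (not a samtools command).
# Reads SAM header text on stdin and prints the unique sample names
# found in the SM: field of each @RG line.
rg_samples() {
    awk -F'\t' '
        /^@RG/ {
            # Scan the tab-separated tags of the @RG line for SM:<name>
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SM:/) print substr($i, 4)
        }
    ' | sort -u
}
```

Assuming samtools is installed, you would run it as `samtools view -H file1.bam | rg_samples` for each BAM. If nothing is printed, the file has no @RG line with an SM field, so the sample information the manual refers to is missing.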
Thank you.
My BAMs have @RG tags. Does that mean that if I pass both BAMs in a single command, I will get a haploid sequence for each one?