Hi,
I've come across whole exome sequencing data from a couple (consanguineous marriage, cousins) whose variants are so similar that I'm concerned the samples aren't from a couple at all, and may somehow have come from one person by accident! They share a lot of pathogenic variants (based on ACMG), about 50, which is far more than in previous paired samples. I tried to measure the similarity of the FASTQ and BAM files. I ran Commet on the FASTQ files and the output was 92 percent similarity (versus about 90 percent for another pair of samples). I tried multiBamSummary from deepTools and it showed a Pearson correlation of 0.99. Running FastQC showed no contamination or technical errors.
Any suggestions on how to proceed? Is there any other test I can run to find out whether these two FASTQ samples are from different people?
Actually, I found out that both of them have reads aligned to chromosome Y! Is this enough evidence that the samples are, at the very least, not from a couple?
Nope, there are pseudoautosomal regions on the Y and X chromosomes, so reads aligned to Y alone prove nothing. I'd advise you to simply calculate the ratio of Y to X reads; in other words, determine the sex of each sample.
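One way to get the Y-to-X read ratio suggested here is from the per-chromosome mapped read counts reported by `samtools idxstats` (tab-separated: reference name, length, mapped reads, unmapped reads). The snippet below is only a minimal sketch: the function name, the `chrX`/`chrY` contig names, and the example counts are assumptions made for illustration, and it does not mask the pseudoautosomal regions.

```python
def y_to_x_ratio(idxstats_text, x_name="chrX", y_name="chrY"):
    """Length-normalised Y/X mapped-read ratio from `samtools idxstats` text.

    Contig names are assumptions; some references use "X" and "Y" instead.
    """
    density = {}
    for line in idxstats_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, length, mapped = fields[0], int(fields[1]), int(fields[2])
        if name in (x_name, y_name) and length > 0:
            # mapped reads per reference base, so contig size does not bias the ratio
            density[name] = mapped / length
    if x_name not in density or y_name not in density:
        raise ValueError("X or Y contig not found in idxstats output")
    if density[x_name] == 0:
        raise ValueError("no reads mapped to X")
    return density[y_name] / density[x_name]


# Made-up counts, only to show the call; real input would be the text
# produced by `samtools idxstats sample.bam`.
example = "chrX\t1000\t500\t0\nchrY\t500\t50\t0\n*\t0\t0\t10\n"
print(y_to_x_ratio(example))
```

A ratio close to zero is what you would expect from a female sample, while a clearly non-zero ratio points to a male one. For exome data the exact values depend on the capture kit, so it is safer to compare against samples of known sex from the same kit than to rely on a fixed cutoff.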
Using mosdepth, I calculated the mean per-base coverage: one sample had X = 0.72 and Y = 0.31, and the other had X = 0.74 and Y = 0.39. I think this indicates both samples are from the same male person.
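For reference, the Y/X ratio implied by these coverage values can be worked out directly. This is just the arithmetic on the numbers quoted above; the sample labels are placeholders, not names from the data.

```python
# Mean per-base coverage reported by mosdepth (values from the post above).
# "sample_1" and "sample_2" are placeholder labels.
coverage = {
    "sample_1": {"X": 0.72, "Y": 0.31},
    "sample_2": {"X": 0.74, "Y": 0.39},
}


def y_over_x(cov):
    """Ratio of mean Y coverage to mean X coverage for one sample."""
    return cov["Y"] / cov["X"]


ratios = {name: round(y_over_x(cov), 2) for name, cov in coverage.items()}
print(ratios)  # {'sample_1': 0.43, 'sample_2': 0.53}
```

Both ratios are well above zero, which fits two male samples. Note that this speaks only to sex; on its own it does not show whether the two samples come from the same individual.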
If the clinical case states that this is a marriage between a man and a woman, then yes, it is very likely you have two males here.