Hi, we recently received FASTQ files and alignments from another group. This is a clinical exome sequencing study, so the input material is DNA. It is paired-end Illumina sequencing.
We have another candidate that explains the phenotype well, but there is a peculiar pattern in one of the genes, EIF4G1, which raises a few questions.
The BAM file header just says clcgenomicsgridworker, which doesn't give much information about the alignment. I repeated the alignment using sarek v3.5.1, but there were no large deletions, although the coverage peaks looked similar. When I BLAT some of the read sequences manually on UCSC, I get alignments in accordance with the CLC ones, with the same large gaps. BLAT also shows nothing similar in other genomic regions.
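To compare the two BAM files more systematically, one option is to list the reads whose CIGAR string contains a large D (deletion) or N (skipped region) operation. The following is only a rough sketch: the function name `large_gaps`, the 50 bp threshold, and the example CIGAR strings are made up for illustration and are not taken from the actual files.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM spec).
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def large_gaps(cigar, min_len=50):
    """Return (op, length) for every D or N operation of at least min_len bases."""
    gaps = []
    for length, op in CIGAR_RE.findall(cigar):
        if op in "DN" and int(length) >= min_len:
            gaps.append((op, int(length)))
    return gaps


# Hypothetical CIGAR strings, not from the real BAM files.
examples = {
    "read_a": "113M",             # ungapped alignment
    "read_b": "60M1200D53M",      # one large deletion
    "read_c": "40M2D30M900N43M",  # small deletion plus a large skipped region
}

for name, cigar in examples.items():
    print(name, large_gaps(cigar))
```

In practice the CIGAR strings would come from column 6 of the SAM records, for example via `samtools view`.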
The event looks like retrotransposon activity. But in order to detect it, should we align the reads as we would for RNA-seq? Does anyone have experience with the CLC workflow?
Thanks, Barış
BAM header from CLC
@PG ID:0 VN:21.0 PN:clcgenomicsgridworker
Example reads
>ex1
GGAAGGAATTTCTACCTGAAGGCCAGGACATTGGTGCATTCGTCGCTGAACAGAAGGTGGAGTATACCCTGGGAGAGGAGTCGGAAGCCCCTGGCCAGAGGGCACTCCCCTCC
>ex2
CCCCTGGCCAGAGGGCACTCCCCTCCGAGGAGCTGAACAGGCAGCTGGAGAAGCTGCTGAAGGAGGGCAGCAGTAACCAGCGGGTGTTCGACTGGATAGAGGCCAACCTGAGTGAGCAGCAGATAGTATCCAACACGTTAGTTCGAG
>ex3
GTTAGTTCGAGCCCTCATGACGGCTGTCTGCTATTCTGCAATTATTTTTGAGACTCCCCTCCGAGTGGACGTTGCAGTGCTGAAAGCGCGAGCGAAGCTGCTGCAGAAATACCTGTGTGACGAGCAGAAGGAGCTACAGGCGCTCTACG
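The example reads appear to overlap end to start, so a quick check is whether they tile into one contiguous sequence. A minimal sketch, assuming a simple exact suffix-prefix match is good enough here; `longest_overlap` and the `min_len` cutoff of 8 bases are hypothetical choices for illustration.

```python
def longest_overlap(a, b, min_len=8):
    """Length of the longest suffix of a that is also a prefix of b.

    Returns 0 if no exact overlap of at least min_len bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


# The three example reads from the post.
reads = {
    "ex1": "GGAAGGAATTTCTACCTGAAGGCCAGGACATTGGTGCATTCGTCGCTGAACAGAAGGTGGAGTATACCCTGGGAGAGGAGTCGGAAGCCCCTGGCCAGAGGGCACTCCCCTCC",
    "ex2": "CCCCTGGCCAGAGGGCACTCCCCTCCGAGGAGCTGAACAGGCAGCTGGAGAAGCTGCTGAAGGAGGGCAGCAGTAACCAGCGGGTGTTCGACTGGATAGAGGCCAACCTGAGTGAGCAGCAGATAGTATCCAACACGTTAGTTCGAG",
    "ex3": "GTTAGTTCGAGCCCTCATGACGGCTGTCTGCTATTCTGCAATTATTTTTGAGACTCCCCTCCGAGTGGACGTTGCAGTGCTGAAAGCGCGAGCGAAGCTGCTGCAGAAATACCTGTGTGACGAGCAGAAGGAGCTACAGGCGCTCTACG",
}

# Report the overlap between each read and the next one in the list.
names = list(reads)
for left, right in zip(names, names[1:]):
    k = longest_overlap(reads[left], reads[right])
    print(f"{left} -> {right}: {k} bp overlap")
```

If consecutive reads chain together like this, the merged sequence could be submitted to BLAT as a single query instead of read by read.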
About that, we would have seen this kind of pattern in other genes too, but this is the only one. :/ Maybe the CLC aligner uses some hybrid approach?
Yes, I am fiddling with the bwa gap parameters to get something similar, so far without success.