Hi all, I have many BAM files and their corresponding FASTQ files from several different sources of whole-genome sequencing experiments. Some of those sources might have used UMIs in their workflow, while others didn't. Where UMIs were used, I don't know their structure/format.
Is there a way to detect whether FASTQ/BAM files contain UMIs?
I found that UMI-tools or fastp could be used, from this thread: "Use fastp to preprocess FASTQ data with unique molecular identifier (UMI) integrated". But in both cases the user needs to specify the format of the UMIs, which I don't have. In my case I only want to detect whether UMIs were used in each of my samples. I don't need to remove them, just to know whether they are present.
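To make the goal concrete, the kind of check I am after might look something like the sketch below. It is only a heuristic, and it only covers the case where an upstream tool has already moved the UMI into the read name, for example as an extra colon-separated field at the end of an Illumina-style name or as a `_UMI` suffix of the kind `umi_tools extract` appends. The function names, the minimum length of 4 bases, and the `+`/`-` separator for dual UMIs are my own assumptions, not a standard. It would not find UMIs that are still inline at the start of the read sequence.

```python
import re

# Assumption: a UMI shows up as a trailing run of bases on the read name,
# after ':' or '_'. Optionally two runs joined by '+' or '-' (dual UMIs).
# The minimum length of 4 is an arbitrary guess to limit false positives.
UMI_SUFFIX = re.compile(r"[:_]([ACGTN]{4,}(?:[+-][ACGTN]{4,})?)$")


def umi_from_read_name(name):
    """Return the UMI-like suffix of a read name, or None if there is none."""
    parts = name.lstrip("@").split()
    if not parts:
        return None
    # Only the first whitespace-delimited token is the read name proper;
    # the rest (e.g. '1:N:0:GATTACA') is the comment with the sample index.
    match = UMI_SUFFIX.search(parts[0])
    return match.group(1) if match else None


def fraction_with_umi(names):
    """Fraction of read names that carry a UMI-like suffix."""
    names = list(names)
    if not names:
        return 0.0
    hits = sum(umi_from_read_name(n) is not None for n in names)
    return hits / len(names)


if __name__ == "__main__":
    # Hypothetical read names, for illustration only.
    example = [
        "@A00123:45:HXXXX:1:1101:1000:2000:ACGTACGT 1:N:0:GATTACA",
        "@A00123:45:HXXXX:1:1101:1000:2000 1:N:0:GATTACA",
    ]
    print(fraction_with_umi(example))
```

A sample where nearly all of the first few thousand read names match would presumably be UMI-tagged, and a sample where none match would be inconclusive rather than a clear negative. For BAM files, I understand the SAM optional-tag specification reserves `RX` for the UMI sequence, so looking for `RX:Z:` in a few records might be a similar quick check, assuming the aligner or pipeline propagated it.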
Any suggestions would be appreciated. Thanks.