Hello
I am working on preprocessing exome data and need input on some of the steps. I have included the read group in the unmapped BAM file. Should the read group info also be included in the mapped BAM file? Then, after merging the mapped and unmapped BAM files, should the merged BAM file be aligned to the reference sequence? Is there any pipeline available for processing exome data? I don't have access to the GATK cloud. I am following the workflow outline published by GATK, but since GATK keeps updating the workflow in the cloud, I might miss some of the steps used for germline variant analysis.
Thanks
Instead of building pipelines for standard analyses, I recommend using a curated workflow such as https://github.com/nf-core/sarek, which works out of the box, if that is an option for you.
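To give a rough idea of what getting started with sarek might look like, here is a minimal Python sketch that writes a FASTQ samplesheet and assembles a launch command for an exome run. The column names (patient, sample, lane, fastq_1, fastq_2) and the flags (--input, --outdir, --wes, --intervals, --tools) reflect my understanding of the sarek 3.x documentation, so please check them against the usage page of the release you install. The helper names and all file names are hypothetical placeholders.

```python
import csv
import shlex
from pathlib import Path

# Columns assumed from the sarek 3.x FASTQ samplesheet format; verify for your release.
SAMPLESHEET_FIELDS = ["patient", "sample", "lane", "fastq_1", "fastq_2"]


def write_samplesheet(rows, path):
    """Write one CSV row per patient/sample/lane FASTQ pair (hypothetical helper)."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=SAMPLESHEET_FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return Path(path)


def build_sarek_command(samplesheet, outdir, intervals_bed, profile="docker"):
    """Return the launch command as a shell-quoted string; nothing is executed."""
    args = [
        "nextflow", "run", "nf-core/sarek",
        "-profile", profile,                # container engine is an assumption
        "--input", str(samplesheet),
        "--outdir", str(outdir),
        "--wes",                            # mark the data as exome/targeted
        "--intervals", str(intervals_bed),  # capture-kit target regions (BED)
        "--tools", "haplotypecaller",       # germline caller, chosen for illustration
    ]
    return shlex.join(args)


if __name__ == "__main__":
    sheet = write_samplesheet(
        [{
            "patient": "patient1",
            "sample": "sample1",
            "lane": "L001",
            "fastq_1": "sample1_R1.fastq.gz",  # placeholder paths
            "fastq_2": "sample1_R2.fastq.gz",
        }],
        "samplesheet.csv",
    )
    print(build_sarek_command(sheet, "results", "targets.bed"))
```

The script only prints the command so it can be reviewed before anything runs. As far as I know, the pipeline derives read groups from the samplesheet and handles alignment and the later preprocessing steps itself, but the documentation is the place to confirm that.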