Hello everyone! I'm by no means a bioinformatician, but I would like to learn some of the craft (my background is chemistry/computer science/machine learning; I do ML-supported drug design).
I would like to analyse human genetic data. Specifically, the task is as follows: given a pair of FASTQ files (paired-end reads produced by Illumina), I would like to get a list of the mutations in each gene present in the data, either in the form GENE <position> A>C or as an rs number. The data contains cDNA reads.
So far I have been able to run this pipeline to completion: https://gencore.bio.nyu.edu/variant-calling-pipeline-gatk4/ However, I cannot make any sense of the results. The pipeline produced some VCF files, but the SNPs do not seem to be annotated with genes, or at least I cannot read them correctly :(. I used this reference genome: https://ftp.ensembl.org/pub/release-86/fasta/homo_sapiens/cdna/Homo_sapiens.GRCh38.cdna.all.fa.gz
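To make the target output concrete, here is a minimal Python sketch of the conversion I have in mind. It rests on several assumptions: that the CHROM column of the VCF holds the transcript ID from the cDNA FASTA (so POS is an offset within the transcript, not a genomic coordinate), and that the FASTA headers carry a `gene_symbol:` field, which I have not verified for release 86. The function names and the sample records are made up for illustration, and the rs number is a placeholder.

```python
import re


def gene_from_header(header):
    """Return (transcript_id, gene_symbol) from an Ensembl-style cDNA FASTA header.

    Assumes the header contains a 'gene_symbol:NAME' field; falls back to
    the transcript ID when the field is missing.
    """
    transcript_id = header.lstrip(">").split()[0]
    match = re.search(r"gene_symbol:(\S+)", header)
    return transcript_id, (match.group(1) if match else transcript_id)


def build_transcript_to_gene(fasta_lines):
    """Map transcript IDs to gene symbols using only the '>' header lines."""
    mapping = {}
    for line in fasta_lines:
        if line.startswith(">"):
            transcript_id, gene = gene_from_header(line.strip())
            mapping[transcript_id] = gene
    return mapping


def describe_variants(vcf_lines, transcript_to_gene):
    """Turn VCF data lines into 'GENE <position> REF>ALT [rsID]' strings."""
    described = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip meta lines, the column header, and blank lines
        # The first five fixed VCF columns: CHROM, POS, ID, REF, ALT.
        chrom, pos, variant_id, ref, alt = line.rstrip("\n").split("\t")[:5]
        gene = transcript_to_gene.get(chrom, chrom)  # unknown IDs pass through
        for allele in alt.split(","):  # multi-allelic sites list several ALTs
            text = f"{gene} {pos} {ref}>{allele}"
            if variant_id != ".":
                text += f" {variant_id}"
            described.append(text)
    return described


# Made-up records, for illustration only.
example_fasta = [
    ">ENST00000000001.1 cdna chromosome:GRCh38:1:100:200:1 "
    "gene:ENSG00000000001.1 gene_symbol:GENE1 description:made up",
    "ACGTACGT",
]
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "ENST00000000001.1\t42\t.\tA\tC\t50\tPASS\t.",
    "ENST00000000001.1\t77\trs0000000\tT\tG,A\t60\tPASS\t.",
]

for entry in describe_variants(example_vcf, build_transcript_to_gene(example_fasta)):
    print(entry)
```

Since the positions here are relative to a transcript, I suspect they would not match rs numbers from public databases directly, which may be part of my confusion.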
Could you please recommend an easy pipeline, tutorial, or course for learning how to do a basic SNP analysis? Alternatively, please advise on how to use the tools mentioned above.
Thanks for the great links!