8.0 years ago
cristina_sabiers
Well, I'm back with silly questions.
Is there an easy guide to understanding how to filter my VCF files? I have exome sequencing data. I filtered my VCF file to keep only SNPs and to exclude QD < 2 (I need to start somewhere), but I still have over 35,000 genes (too many for my brain to study).
My QUAL values range from 100 to 3000. If I'm not wrong, higher is better, but which ones can I discard: those under 1000, under 800, ...?
The header of my VCF file lists many different fields:
[1]CHROM [2]POS [3]REF [4]ALT [5]QUAL [6]GENE [7]GT [8]GQ [9]FILTER [10]AF [11]AO [12]BKPTID [13]CDF_LD [14]CDF_MAPD [15]CIEND [16]CIPOS [17]CONFIDENCE [18]DP [19]END [20]FAO [21]FDP [22]FR [23]FRO [24]FSAF [25]FSAR [26]FSRF [27]FSRR [28]FWDB [29]FXX [30]HOMLEN [31]HOMSEQ [32]HRUN [33]HS [34]LEN [35]MEINFO [36]MLLD [37]NS [38]NUMTILES [39]OALT [40]OID [41]OMAPALT [42]OPOS [43]OREF [44]PRECISE [45]PRECISION [46]QD [47]RBI [48]REFB [49]REVB [50]RO [51]SAF [52]SAR [53]SRF [54]SRR [55]SSEN [56]SSEP [57]SSSB [58]STB [59]STBP [60]SVLEN [61]SVTYPE [62]TYPE [63]VARB [64]FUNC [65]SF
If anyone can provide a link where I can easily learn how to reduce my VCF file, I would really appreciate it.
Thanks
Take a look here, especially if you used GATK.
https://software.broadinstitute.org/gatk/best-practices/
Thanks Fabio. I didn't produce these VCF files myself; sadly, my PC can't handle that kind of job (I tried once and it was hell). I got this VCF from a company, and I don't think they used GATK; they have their own program.
Thanks for the link.
That depends heavily on the aim of your analysis: looking for a causal variant, an eQTL study, population genetics, ...
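Whatever the aim turns out to be, the first-pass filter described in the question (SNPs only, QD at least 2, some QUAL cutoff) can be sketched in plain Python. This is only a minimal sketch under stated assumptions: the file is an uncompressed, tab-separated VCF, QD sits in the INFO column as `QD=<number>`, and records without a usable QUAL or QD are dropped. The function names and the default `min_qual=100.0` are hypothetical placeholders, not recommended thresholds; there is no universal QUAL cutoff, and the right one depends on the caller and on the analysis.

```python
def parse_info(info_field):
    """Turn an INFO string like 'DP=40;QD=12.5;HS' into a dict.

    Flag entries without '=' (e.g. 'HS') are stored as True.
    """
    info = {}
    for item in info_field.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        info[key] = value if sep else True
    return info


def is_snp(ref, alt):
    """True if REF and every ALT allele are single bases (A, C, G, T or N)."""
    bases = set("ACGTN")
    if ref.upper() not in bases:
        return False
    return all(allele.upper() in bases for allele in alt.split(","))


def keep_record(line, min_qual=100.0, min_qd=2.0):
    """Decide whether one VCF data line passes the SNP/QUAL/QD filter.

    min_qual and min_qd are illustrative defaults, not recommendations.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return False  # malformed line
    ref, alt, qual, info_field = fields[3], fields[4], fields[5], fields[7]

    if not is_snp(ref, alt):
        return False
    if qual == "." or float(qual) < min_qual:
        return False

    qd = parse_info(info_field).get("QD")
    if qd is None or qd is True:
        return False  # assumption: drop records that carry no QD value
    try:
        # If several QD values are listed, use the smallest (conservative).
        qd_value = min(float(v) for v in qd.split(","))
    except ValueError:
        return False
    return qd_value >= min_qd


def filter_vcf(lines, min_qual=100.0, min_qd=2.0):
    """Yield header lines unchanged and only the data lines that pass."""
    for line in lines:
        if line.startswith("#") or keep_record(line, min_qual, min_qd):
            yield line
```

To use it, open the input file for reading and write the result of `filter_vcf(handle, min_qual=..., min_qd=...)` to a new file with `writelines`. Running it with a few different `min_qual` values and counting the surviving records is one way to see how sensitive the result is to the cutoff before committing to one.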