4.6 years ago
bigfoot
Dear all, how can I be sure that I have also removed all the missing values from the GWAS summary statistics? I have already applied some general quality control steps (in the script below, the column names stand in for the column numbers):
cat file_input_sumstats | \
awk 'NR==1 || ($MAF > 0.01) && ($INFO > 0.6) {print}' | \
awk 'NR==1 || (($A1 == "A" && $A2 == "C") || ($A1 == "A" && $A2 == "T") || ($A1 == "A" && $A2 == "G") || ($A1 == "C" && $A2 == "A") || ($A1 == "C" && $A2 == "G") || ($A1 == "C" && $A2 == "T") || ($A1 == "T" && $A2 == "A") || ($A1 == "T" && $A2 == "C") || ($A1 == "T" && $A2 == "G") || ($A1 == "G" && $A2 == "A") || ($A1 == "G" && $A2 == "C") || ($A1 == "G" && $A2 == "T")) {print}' | \
awk 'NR==1 || ($OR > 0) {print}' | \
awk 'NR==1 || ($SE > 0 && $SE < 100000) {print}' | \
awk 'NR==1 || ($P > 0 && $P < 1) {print}' | \
awk 'NR==1 || ($INFO > 0 && $INFO < 1.5) {print}' | \
awk 'NR==1 || ($EAF > 0.01 && $EAF < 0.99) {print}' | \
awk 'NR==1 || ($MAF > 0 && $MAF < 1) {print}' > output
Hello and welcome. For questions like this, you should provide sample input and desired output. Thanks.
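As a rough illustration of the kind of check being asked about, the sketch below counts, for each column, how many data rows are empty or hold a common missing-value token. Everything about the input is an assumption: the toy file, its tab delimiter, the column names, and the token list (`NA`, `NaN`, `.`) are hypothetical and should be adapted to the real summary statistics. A check like this is useful because awk compares a non-numeric field such as `NA` as a string, so a filter like `$OR > 0` can let it through.

```shell
# Hypothetical toy summary statistics (tab-separated); the layout is an assumption.
sumstats=$(mktemp)
printf 'SNP\tA1\tA2\tOR\tSE\tP\n'          >  "$sumstats"
printf 'rs1\tA\tC\t1.10\t0.02\t0.04\n'     >> "$sumstats"
printf 'rs2\tG\tT\tNA\t0.03\t0.50\n'       >> "$sumstats"   # OR is NA
printf 'rs3\tC\tT\t0.95\t\t0.20\n'         >> "$sumstats"   # SE is empty
printf 'rs4\tA\tG\t1.02\t0.01\t.\n'        >> "$sumstats"   # P is "."

# For every column, count data rows whose field is empty or a missing-value token.
# The token list is an assumption; extend it to match the file at hand.
missing_report=$(awk -F'\t' '
    NR == 1 { for (i = 1; i <= NF; i++) name[i] = $i; ncol = NF; next }
    {
        for (i = 1; i <= ncol; i++) {
            v = tolower($i)
            if (v == "" || v == "na" || v == "nan" || v == ".") miss[i]++
        }
    }
    END { for (i = 1; i <= ncol; i++) printf "%s\t%d\n", name[i], miss[i] + 0 }
' "$sumstats")

echo "$missing_report"
rm -f "$sumstats"
```

Running the same awk command on the filtered output file should report zero for every column if no missing values remain, at least for the tokens listed.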