5.0 years ago
mel22
Hi, I have many large files with association results. Each file contains 8 columns (the 3rd one is the p-value). I need to create from each file a new one containing only the observations where the p-value is < 10e-5. How can I do this with bash? Here is a small example from these files:
SNP N P p2 or1 or2 q q1
c10_pos5974849 2 0.1881 0.1881 1.1931 1.1931 0.5707 0.00
c10_pos5975482 2 0.3225 0.3225 0.8670 0.8670 0.8840 0.00
c11_pos68438345 2 0.6537 0.66 0.9705 0.9690 0.2856 12.29
c11_pos107693921 2 0.8938 0.8558 1.0133 1.0250 0.1755 45.52
c12_pos67499221 2 0.8351 0.8351 1.0236 1.0236 0.6413 0.00
c14_pos67844869 2 0.1103 0.1915 0.7334 0.7229 0.2039 38.05
c14_pos68073026 2 0.09954 0.1298 0.6383 0.6215 0.2662 19.11
c14_pos68087872 2 0.3704 0.3704 1.2500 1.2500 0.7319 0.00
Thank you
One word: awk! ;)
In general: if you're working with column-like data, always consider awk for processing it.
With gnu-parallel and awk:
Create a folder named "out" in the current folder and run the script from the current folder. Remove --dry-run to actually execute the command.
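A minimal sketch of how this filtering could look. This is not the original command from this answer. The function name filter_assoc, the *.assoc file extension, and the 1e-5 threshold are assumptions for illustration (note that "10e-5" read literally is 1e-4, so adjust the threshold to what you actually mean). It keeps the header line, skips rows whose P is NA, and writes each result to out/ under the same file name.

```shell
#!/usr/bin/env bash
# Sketch only: function name, file pattern and threshold are assumptions.

# Keep the header line plus every row whose 3rd column (P) is below the threshold.
# Usage: filter_assoc THRESHOLD FILE...   (results are written to ./out/FILE)
filter_assoc() {
    local threshold=$1
    shift
    mkdir -p out
    local f
    for f in "$@"; do
        # NR == 1 keeps the header; "$3 + 0" forces a numeric comparison,
        # so values written like 3e-7 are handled; NA rows are skipped.
        awk -v t="$threshold" \
            'NR == 1 || ($3 != "NA" && ($3 + 0) < (t + 0))' \
            "$f" > "out/$(basename "$f")"
    done
}

# Example call, assuming the inputs are named *.assoc (hypothetical extension):
# filter_assoc 1e-5 *.assoc

# Roughly equivalent GNU parallel form, one awk job per file
# (--dry-run only prints the commands; this variant does not skip NA rows):
# parallel --dry-run "awk 'NR == 1 || \$3 + 0 < 1e-5' {} > out/{/}" ::: *.assoc
```

The plain loop is usually fast enough because awk streams each file once; the parallel form only pays off when there are many files and several free cores.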
That's great, thanks cpad0112!