Question

Identify SNPs with Fst estimates > 0.9 and annotate them

18 months ago

anasjamshed ▴ 140

I have SNPs in a .fst file, with the estimated Fst value in the 6th column, like this:

2R  4459    1   1.000   96.0    1:2=0.01762811
2R  9728    1   1.000   99.0    1:2=0.01340363
2R  9828    1   1.000   100.0   1:2=0.01554609
2R  9928    1   1.000   99.0    1:2=0.01454173
2R  10028   1   1.000   100.0   1:2=0.01317223
2R  10128   1   1.000   100.0   1:2=0.01554917
2R  10228   1   1.000   100.0   1:2=0.01202964
2R  10328   1   1.000   100.0   1:2=0.01316962
2R  10428   1   1.000   100.0   1:2=0.01317223
2R  10528   1   1.000   100.0   1:2=0.01316962
2R  10628   1   1.000   100.0   1:2=0.01778599
2R  10728   1   1.000   100.0   1:2=0.01554609
2R  10828   1   1.000   100.0   1:2=0.01554917

I want to filter for SNPs with an Fst value greater than 0.9, so I am trying this command in Linux:

awk -F"\t" '$6>0.02' file.fst

But it does not select the rows with Fst above 0.9, because every value in the 6th column starts with the prefix 1:2= and so is not compared as a number.

What changes do I need to make to the awk command?

After finding the SNPs, I need to annotate them with SnpEff. Is it possible to apply SnpEff to the .fst file?

snpeff fst SNP • 792 views

ADD COMMENT • link updated 10 weeks ago by Pierre Lindenbaum 164k • written 18 months ago by anasjamshed ▴ 140

Don't forget to follow up on your threads. If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one answer if they all work. If an answer was not really helpful or did not work, provide detailed feedback so others know not to use that answer.

ADD REPLY • link 10 weeks ago by Pierre Lindenbaum 164k

score 1 · Answer 1 · 2024-09-11

10 weeks ago

Yugan Gogul Muthukumar ▴ 10

I don't have much experience with Linux, but I do the same thing in R with this code:

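library(dplyr)

# The .fst file shown in the question has no header row, so columns import as V1..V6
FST_file <- read.delim("YourFSTfile.fst", header = FALSE, stringsAsFactors = FALSE)

# Column 6 looks like "1:2=0.01762811"; drop everything up to "=" to get a numeric Fst
FST_file <- FST_file %>% mutate(FST = as.numeric(sub(".*=", "", V6)))

significant_snp <- FST_file %>% filter(FST > 0.9)

write.csv(significant_snp, "significant_snps.csv", row.names = FALSE)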

score 1 · Answer 2 · 2024-09-11

10 weeks ago

Pierre Lindenbaum 164k

Using '=' as the field separator, the Fst value becomes the 2nd field:

LC_ALL=C awk -F"=" '$2>0.9' file.fst
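If you would rather keep the tab separator (for example, to use the other columns in the same command), a variant is to split only the 6th column on '='. This is a minimal sketch, assuming the file is tab-separated and column 6 holds a single pair of the form 1:2=value, as in the question. The function name filter_fst is made up for illustration.

```shell
# filter_fst THRESHOLD FILE
# Prints every row whose Fst (the part after "=" in column 6) is above THRESHOLD.
# Assumption: tab-separated input, exactly one "pop1:pop2=value" entry in column 6.
filter_fst() {
    LC_ALL=C awk -F'\t' -v min="$1" '
        {
            # split "1:2=0.01762811" into kv[1]="1:2" and kv[2]="0.01762811"
            n = split($6, kv, "=")
            # adding 0 forces a numeric comparison; non-numeric text evaluates to 0
            if (n == 2 && kv[2] + 0 > min + 0) print
        }
    ' "$2"
}
```

It could be called as `filter_fst 0.9 file.fst > high_fst.tsv`. Rows are printed unchanged, so the chromosome and position columns stay available for later steps.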
