5 months ago
Stavroula
Hello all,
I was wondering if anyone can help me. I have a table with the following format:
FORMAT
GT:DP:HF:CILOW:CIUP:SDP (column V9)
Info
0/1:4282:0.001:0.0:0.003:5;0. (column V10)
and I want to filter in R to keep the rows where HF<0.1 or HF>0.99, without caring about the rest of the info.
Is there a way to do that?
I have been trying with this command:
Control1MTL1_Filtered2<-filter(Control1MTL1_Filtered, V10= c(::<=0.1::))
but it doesn't recognise the format.
Any ideas would be more than appreciated.
Best, Stavroula
Why do you want to use R for something that is better addressed by purpose-built utilities such as bcftools?
On second thought, it doesn't look like you have the VCF, just a tab delimited file with VCF columns. You're going to need to do some wrangling.
First off, V10 is not Info. INFO is a completely separate column, probably V8. Call V10 "sample" or something. Split V9 and V10 using ":" as the delimiter and then create a key-value pair with split V9 as the keys and split V10 as the values. It's going to take some serious dplyr/tidyr gymnastics to do this, so rpolicastro is probably the person that can help you there.
Indeed! That looks like a genotype column.
I agree with the others on using bcftools and defining a proper filter, especially if you want to export and use the VCF file later on.
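If only the tab-delimited table is available, the split-and-pair idea suggested above (split V9 and V10 on ":" and use the V9 pieces as keys) could be sketched in R roughly like this. It is a minimal sketch, not a definitive solution: extract_format_field is a hypothetical helper written for this example, the second and third rows of the toy table are made up, and it assumes HF holds a single numeric value per row.

```r
library(dplyr)

# Hypothetical helper (not from any package): for each row, split the FORMAT
# string and the sample string on ":" and return the value whose key is `field`.
extract_format_field <- function(format, sample, field) {
  keys   <- strsplit(as.character(format), ":", fixed = TRUE)
  values <- strsplit(as.character(sample), ":", fixed = TRUE)
  vapply(seq_along(keys), function(i) {
    idx <- match(field, keys[[i]])
    # Return NA when the key is absent or the sample string is too short
    if (is.na(idx) || idx > length(values[[i]])) NA_character_ else values[[i]][idx]
  }, character(1))
}

# Toy table: the first row is the example from the question,
# the other two rows are made up for illustration.
Control1MTL1_Filtered <- data.frame(
  V9  = rep("GT:DP:HF:CILOW:CIUP:SDP", 3),
  V10 = c("0/1:4282:0.001:0.0:0.003:5;0.",
          "0/1:3900:0.5:0.4:0.6:1900;2000",
          "1/1:4100:0.995:0.99:1.0:4080;20"),
  stringsAsFactors = FALSE
)

# Add HF as a numeric column, then keep rows with HF < 0.1 or HF > 0.99.
# Rows where HF could not be parsed become NA and are dropped by filter().
Control1MTL1_Filtered2 <- Control1MTL1_Filtered %>%
  mutate(HF = as.numeric(extract_format_field(V9, V10, "HF"))) %>%
  filter(HF < 0.1 | HF > 0.99)
```

Looking up the position of HF from V9 on every row, rather than hard-coding "third field", should keep this working if the FORMAT string differs between rows. If the data can be kept as a real VCF, bcftools is still likely the simpler route.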