I have a Plink file that contains chr, rs and position for 30K variants. I would like to find a way to convert this information into chr, start and end. Is there an easy way of achieving this? I will be grateful for any advice.
You can use awk to extract the columns with the chromosome name and the position to create a valid bed file:
$ awk -v FS="\t" -v OFS="\t" 'NR>1 {print $1, $4-1, $4, $2}' input.bim > output.bed
The coordinates in bim files are 1-based, but bed uses 0-based, half-open intervals: the start is 0-based and the end is exclusive. That's why we subtract 1 ($4-1) from the given position for the start coordinate and keep the position itself as the end.
This will create:
23 60424 60425 rs34557243
23 60691 60692 rs28419004
23 60881 60882 rs28705946
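To try the conversion without touching real data, here is a minimal sketch. It assumes a standard six-column .bim layout (chr, rsID, genetic distance, bp position, allele 1, allele 2) with no header line, so unlike the command above it does not skip the first row; keep `NR>1` if your file does start with a header. The file names `sample.bim` and `sample.bed` and the allele letters are made up for illustration; the rsIDs and positions are the ones from the output above.

```shell
# Hypothetical three-variant .bim sample; the alleles are invented for illustration.
printf '23\trs34557243\t0\t60425\tA\tG\n23\trs28419004\t0\t60692\tC\tT\n23\trs28705946\t0\t60882\tG\tA\n' > sample.bim

# Default awk field splitting accepts both tab- and space-delimited input.
# start = position - 1 (0-based), end = position (exclusive), name = rsID.
awk -v OFS="\t" '{print $1, $4-1, $4, $2}' sample.bim > sample.bed

# Sanity check: a single-base variant should span exactly one base (end - start == 1).
awk '($3 - $2) != 1 {bad++} END {print (bad ? "bad intervals: " bad : "all intervals have length 1")}' sample.bed
```

Leaving `FS` at its default is a deliberate choice here, since .bim files may be delimited by tabs or spaces; set `-v FS="\t"` only if you know the file is strictly tab-separated.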
fin swimmer
If we are going to read it into R, why create intermediate files? Just do it within R:
library(data.table)
fread("myBim.txt", skip = 1)[, list(V1, V4, V4)]
# V1 V4 V4
# 1: 23 60425 60425
# 2: 23 60692 60692
# 3: 23 60882 60882
Hello,
could you please give some examples of your input? Should the desired output be a bed file? I'm asking because one has to consider the 0-based vs 1-based interval problem.
fin swimmer
Hello. Thank you for your response. My ultimate aim is to create a .txt file that will contain chr, start and end data for my 30K variants. I will then use this .txt file for other analysis. So I really do not mind the format of the output as long as I can read it in R. My current bim file looks like this:
Please let me know if this is helpful. Thank you