Remove Chr:position format to just Chr
3.8 years ago

Hello all,

I attempted to run a PRS analysis and received the error:

84489606 variant(s) not found in previous data
0 variant(s) included

Error: No vairant remained!

I think it is because of the differing format between my base and target data:

##BIM TARGET
22      chr22:10510227:A:G      0       10510227        G       A

##BASE MDD.QC.gz
rs      ref     alt     pval    effref  info    chr     pos     reffrq  N
rs10    A       C       0.9576  -5e-04  1       7       92383888        0.0596  500199

How can I remove the chr:position:allele format from the target bim file? I believe I need awk with print, but I am not entirely sure how to write the command. Google was no help.

I wonder if the rs SNP format in the base data is also causing this error. If so, how can I add the rs format to my target data or remove the rs format from my base data?

Thank you immensely!

imputation PRSice CHR • 2.1k views

Do you want to remove the chr:position:allele format completely?

3.8 years ago

You could use the following on the target bim file and see if that helps. It should remove the column completely from the target bim file

awk '{$2=""; print $0}' target.bim > new.target.bim
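To see what this command does to a single record, here is a minimal sketch that runs it on the sample line from the question. The variable names are only for illustration.

```shell
# One sample line in bim layout (tab-separated), taken from the question.
line=$(printf '22\tchr22:10510227:A:G\t0\t10510227\tG\tA')

# Assigning to $2 makes awk rebuild the record with the default OFS
# (a single space), so the field is emptied rather than removed.
blanked=$(printf '%s\n' "$line" | awk '{$2=""; print $0}')
printf '%s\n' "$blanked"

# Splitting on every single space still yields 6 fields; field 2 is empty.
nfields=$(printf '%s\n' "$blanked" | awk -F'[ ]' '{print NF}')
printf 'fields: %s\n' "$nfields"
```

The output has two adjacent spaces where the ID used to be, and the tabs of the original file are replaced by spaces.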

Thank you, William. Would this remove it completely? Can I keep the chr22 and just remove the following position/allele info?

Also, any idea of how to update my snpid with rs in my target file?


Do your target file and base data have the exact same positions?


That might still leave the column there, just with empty contents. If you want to remove the second column, you can use cut: cut -f1,3- target.bim > new.target.bim. You can adjust the delimiter with -d.
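A minimal sketch of dropping the second field with cut, assuming tab-separated input (tab is cut's default delimiter). It uses the sample line from the question, and the variable names are only for illustration.

```shell
# One sample line in bim layout (tab-separated), taken from the question.
line=$(printf '22\tchr22:10510227:A:G\t0\t10510227\tG\tA')

# Keep field 1 and fields 3 onward, i.e. drop field 2.
dropped=$(printf '%s\n' "$line" | cut -f1,3-)
printf '%s\n' "$dropped"

# For comparison, a field list of the form -2 selects fields 1 through 2 only.
first_two=$(printf '%s\n' "$line" | cut -f-2)
printf '%s\n' "$first_two"
```

Note that the result of dropping a field has five columns, so it is no longer in the six-column bim layout.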


You could also do this if you wanted to keep chr22

awk '{sub(/:.*/,"",$2)}1' OFS='\t' target.bim > new.target.bim
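As a quick check of what this produces, here is a small sketch on two lines. The first is the sample from the question; the second is a made-up variant added only for illustration. It passes OFS with -v, which should behave the same for this one-rule program.

```shell
# Two bim-style lines; the second one is hypothetical.
input=$(printf '22\tchr22:10510227:A:G\t0\t10510227\tG\tA\n22\tchr22:10510300:C:T\t0\t10510300\tT\tC')

# Strip everything from the first colon onward in field 2, keep tabs on output.
trimmed=$(printf '%s\n' "$input" | awk -v OFS='\t' '{sub(/:.*/,"",$2)}1')
printf '%s\n' "$trimmed"

# Count the distinct IDs left in column 2.
distinct=$(printf '%s\n' "$trimmed" | cut -f2 | sort -u | wc -l | tr -d ' ')
printf 'distinct IDs: %s\n' "$distinct"
```

Both lines end up with the ID chr22, so every variant on a chromosome shares one ID after this edit. That may be a problem for tools that expect unique variant IDs.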

Please add tab as OFS. Is it 6 or 1?


It should be 1. Sorry, that was my mistake

3.8 years ago

with sed:

$ echo -e '22\tchr22:10510227:A:G\t0\t10510227\tG\tA'
22      chr22:10510227:A:G      0       10510227        G       A

$ echo -e '22\tchr22:10510227:A:G\t0\t10510227\tG\tA' | sed -r 's/:[0-9]+:[A-Z]+:[A-Z]+//'
22      chr22   0       10510227        G       A
3.8 years ago
Sam ★ 4.8k

Based on your log, it seems like you are using PRSice. You can use --chr-id C:L with the latest version (2.3.3), which should automatically do the chr ID generation for you.

Edit: I just realized it is you. Using --chr-id should solve the problem.


Thank you, Sam. So, in my target file, the SNPs are listed in this format: chr1:17847:T:C, i.e. chr#:position:a1:a2. I have no idea if it's supposed to be formatted this way. My base file SNPs are formatted as rs IDs, and I've specified this in the script by using --snp rs, as you've previously suggested.

Would --chr-id C:L work if my bim file does not have headers? Do you know of a quick way to add headers to my bim if needed? Lastly, does PRSice recognize --chr-id C:L:a1:a2? Is it necessary to format it this way, considering my target file uses chr1:17847:T:C?

Thank you!


It works without headers because the bim file format is well defined, and we know that it always contains the required information in specific columns. And definitely don't manually modify the bim file, as that breaks the format guideline.

--chr-id C:L works because setting it asks PRSice to ignore the SNP column in the base file and to construct a new ID from the chromosome (C) and base pair position (L). You can get A1 and A2 using a and b respectively, so something like --chr-id C:L:a:b will do. However, if you can run your script without a and b, I would definitely do that, because by adding A1 and A2 to your SNP ID you effectively forbid PRSice from performing strand flipping, which can lead to information loss.
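To illustrate the kind of ID that C:L describes, here is a minimal sketch that builds chromosome:position strings from both files and lists the ones they share. This is not PRSice's implementation, only the general idea. Column positions follow the samples in the question (bim: chr in column 1, position in column 4; base: chr in column 7, pos in column 8), and the second base row is made up so that one ID matches.

```shell
# Target bim line from the question (no header line in a bim file).
bim=$(printf '22\tchr22:10510227:A:G\t0\t10510227\tG\tA')

# Base data with the header from the question. The rsHYPO row is hypothetical.
base=$(printf 'rs\tref\talt\tpval\teffref\tinfo\tchr\tpos\treffrq\tN\nrs10\tA\tC\t0.9576\t-5e-04\t1\t7\t92383888\t0.0596\t500199\nrsHYPO\tA\tG\t0.5\t1e-03\t1\t22\t10510227\t0.1\t500199')

# Build chr:pos IDs from each file; skip the header row of the base data.
target_ids=$(printf '%s\n' "$bim" | awk '{print $1 ":" $4}')
base_ids=$(printf '%s\n' "$base" | awk 'NR>1 {print $7 ":" $8}')

# IDs present in both lists (assumes IDs are unique within each list).
shared=$(printf '%s\n%s\n' "$target_ids" "$base_ids" | sort | uniq -d)
printf 'shared: %s\n' "$shared"
```

On real data, an empty result from a check like this would suggest that the two files do not share positions (for example, different genome builds), which renaming IDs would not fix.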
