Cis-eQTL Question
7.4 years ago
mms140130 ▴ 60

I'm trying to do a cis-eQTL analysis, so I have a linear regression where gene expression is the dependent variable and the SNPs are the independent variables. I have 20,000 genes and 700,000 SNPs, each measured in 1,000 patients. I need to reduce the dimension of the SNPs, so for each gene I will choose the SNPs that lie within 1,000 bp above or 1,000 bp below its TSS (cis-eQTL). Then I will combine the SNPs above the TSS into one variable (not sure how yet) and the SNPs below the TSS into another variable, with the aim of reducing the number of SNPs in the model. Those two combined variables will then be added to the model.

Does this make sense?

gene SNP • 2.7k views

I think it would make more sense to use a published method such as FastQTL.

Although it might be very interesting to reinvent the wheel, often that's not necessary.


Thanks for your answer, but my advisor wants me to write my own code, so I'm trying to work out how to do that.


Then it makes sense to look up how published methods do their job, and try to replicate that.

7.4 years ago

I'd highly recommend MatrixEQTL; it's phenomenally quick for the scale of tests you're describing. Follow the tutorial, and once you've got your head around it, it should be relatively simple to apply to your data.


Well, I have used MatrixEQTL before, but my advisor doesn't want me to use it; he wants me to write my own code.


I don't see the point in reinventing the wheel. Your logic sounds fine, but typically cis distances are around 1e6 bases. Take a look at the MatrixEQTL source if you're doing this in R. Applying these operations one by one will take forever, but if you can get your head around the matrix operations, that's what allows MatrixEQTL to be so fast. There are other, smaller speed-ups such as parLapply, a parallel version of lapply.
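To make the "matrix operations" point concrete, here is a minimal sketch in Python with NumPy (my choice for illustration, not the thread's R). It assumes a simple regression of expression on one SNP at a time, with an intercept and no covariates. The function names and the genes x samples / SNPs x samples layout are hypothetical, and this is not MatrixEQTL's actual code, only the same general idea: standardize both matrices once, get every gene-SNP correlation from a single matrix product, and convert the correlations to t-statistics.

```python
import numpy as np

def standardize_rows(m):
    """Center each row and scale it so its sum of squares is 1."""
    m = m - m.mean(axis=1, keepdims=True)
    norm = np.sqrt((m ** 2).sum(axis=1, keepdims=True))
    return m / norm  # assumes no constant rows (filter monomorphic SNPs first)

def all_pairs_tstats(expr, geno):
    """Correlation and t-statistic for every gene-SNP pair.

    expr: genes x samples, geno: SNPs x samples (assumed layout).
    """
    n = expr.shape[1]
    # One matrix product yields all Pearson correlations at once.
    r = standardize_rows(expr) @ standardize_rows(geno).T
    # Slope t-statistic of a simple linear regression with n - 2 d.o.f.
    t = r * np.sqrt((n - 2) / (1.0 - r ** 2))
    return r, t

# Simulated toy data, purely for illustration.
rng = np.random.default_rng(0)
geno = rng.integers(0, 3, size=(50, 200)).astype(float)  # 50 SNPs, 200 samples
expr = rng.normal(size=(10, 200))                        # 10 genes, 200 samples
expr[0] += 0.8 * geno[3]                                 # plant one association
r, t = all_pairs_tstats(expr, geno)
```

For a cis-only analysis you would keep just the entries of `t` whose gene-SNP pair falls inside the chosen window, or run the product in blocks per chromosome so the full genes x SNPs matrix is never held in memory.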

"then I will combine the SNPs that are above the TSS (not sure how yet) into one variable, and combine the SNPs that are below the TSS into one variable, with the aim of reducing the SNPs in the model, and those two combined variables will be added to the model."

This bit doesn't make a lot of sense. If you're just looking for cis-eQTLs, then choose your distance from the gene (TSS is fine), apply the tests, and be done. If you then want to look for trans associations (anything that doesn't fall within your cis distance), you'll need to test those too.
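The "choose your distance and apply the tests" step could be sketched as below, using only the Python standard library. The tuple layouts, the function name `cis_pairs`, and the gene and SNP identifiers are all made up for illustration; the default window of 1e6 bases follows the typical cis distance mentioned above.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

def cis_pairs(genes, snps, window=1_000_000):
    """Yield (gene_id, snp_id) for SNPs within `window` bp of the gene's TSS.

    genes: iterable of (gene_id, chrom, tss)
    snps:  iterable of (snp_id, chrom, pos)
    """
    by_chrom = defaultdict(list)
    for snp_id, chrom, pos in snps:
        by_chrom[chrom].append((pos, snp_id))

    # Per chromosome: sorted positions plus SNP ids in the same order.
    index = {}
    for chrom, entries in by_chrom.items():
        entries.sort()
        index[chrom] = ([p for p, _ in entries], [s for _, s in entries])

    for gene_id, chrom, tss in genes:
        positions, ids = index.get(chrom, ([], []))
        lo = bisect_left(positions, tss - window)   # first SNP inside the window
        hi = bisect_right(positions, tss + window)  # one past the last SNP inside
        for snp_id in ids[lo:hi]:
            yield gene_id, snp_id

# Hypothetical identifiers and coordinates.
genes = [("geneA", "chr1", 5_000_000), ("geneB", "chr2", 1_000)]
snps = [
    ("rs1", "chr1", 4_200_000),
    ("rs2", "chr1", 6_500_000),
    ("rs3", "chr2", 900_000),
    ("rs4", "chr1", 5_999_999),
]
pairs = list(cis_pairs(genes, snps))
```

Each resulting pair then gets its own test of expression against genotype. Anything outside the window is a trans pair and would need to be tested separately.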


Thank you for your answer.
