Does anyone use Python for variant calling?
6
0
8.5 years ago

I have raw NGS data and would like to go from a FASTQ file to a VCF file through a variant calling workflow, using Python for every step. Which tools can I use to process my FASTQ file all the way to VCF and then annotate my variants? Thanks in advance. By the way, I have to use Python; that is my professor's order :/

python SNP calling NGS variant • 6.9k views
ADD COMMENT
6

As much as I love python, that's a stupid requirement -_-;

Your professor should be teaching you to find the best variant calling algorithm, not the one with the fewest curly brackets. Does subprocessing other tools count? :P

ADD REPLY
8

mypythonvariantcaller.py

import subprocess, shlex
subprocess.call(shlex.split('java -jar GenomeAnalysisTK.jar ... ...'))
ADD REPLY
1

With all do respect, questioning his professor's IQ level is neither the right way to solve the issue, nor the proper way to work with a boss!

Professors give orders all the time; your job is to assess how plausible the task is and prepare scientific arguments for why it would or wouldn't work (as in this case).

ADD REPLY
2

due* respect. While I agree with your second statement, John was not questioning the Prof's IQ - he was merely remarking that it was unlikely the Prof seriously meant to enforce a language constraint on reinventing the wheel for a thoroughly solved challenge - if you look at it closely, that does sound stupid.

ADD REPLY
1

What if this is a "learn Python (the hard way)" exercise? Variant calling just happens to be the end point.

ADD REPLY
5
8.5 years ago

You wouldn't do everything in Python; that would waste both CPU cycles and your own programming time. Rather, you'd use something like Snakemake to build a convenient Python-based pipeline. It's quite likely that this is what your professor meant.

ADD COMMENT
2
8.5 years ago

Perhaps Platypus is a solution, a variant caller written (partially) in python: http://www.well.ox.ac.uk/platypus

I don't know what your position in this research is, but simply following your professor's orders is not scientifically correct; be critical and check alternatives. Most people use GATK, AFAIK, so don't make it too hard on yourself by using something exotic.

ADD COMMENT
2
8.5 years ago
ivivek_ngs ★ 5.2k

I am totally in favor of what John is saying: if the requirement is to learn Python and how to code in it, there is no point in reinventing the wheel. You can write a processing script in Python, but that comes with its own time frame, and your professor should understand that. It will not be new, out-of-the-box work, just a processing workflow whose major part will be subprocesses calling BWA, GATK, or other downstream variant annotation tools. Devon is correct about the waste of CPU cycles as well. In that case I would look for an existing Python pipeline framework that meets my requirements, test it, and show the results to the boss. That is how it will work: you have to learn in depth which tools you need, which ones you are using at each step of variant calling, and why you use them. That is more important than any processing script in any scripting language, unless your workplace has a strict policy on which languages to use. So take a look at the links below:

variant_calling_pipeline

gatk_varcall

PHEnix

Enjoy!

ADD COMMENT
1
8.5 years ago

In addition, if you are going to do or have to do everything in python, will you write an aligner in python? I see you state that you start with fastq files.

ADD COMMENT
0
8.5 years ago

Thanks to you all. I know this situation with my professor is kind of weird. But as you suggest, I could subprocess other tools, try an analysis, and show the results to my professor, which might convince him. So could you suggest tools written in Python that I can start with?

My workflow will first need an aligner such as BWA, then a tool such as samtools to convert the BWA output to BAM and BAI format, then bcftools to produce a VCF file, and lastly an annotator such as SnpEff or ANNOVAR.

I hope you can help me as you did with your previous answers. In the meantime I will try your other links and ideas. Thanks again.
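
For reference, the BWA, samtools, and bcftools steps listed above could be driven from Python roughly as follows. This is only a sketch: the function names (`build_pipelines`, `describe`, `run_pipeline`) and the file names are made up for illustration, it assumes the tools are on the PATH and that the reference has already been indexed with `bwa index`, and the annotation step is left out.

```python
import shlex
import subprocess


def build_pipelines(reference, fastq, prefix):
    """Return the workflow as a list of pipelines.

    Each pipeline is a list of argv lists; the stdout of each command
    is piped into the stdin of the next one.
    """
    bam = f"{prefix}.sorted.bam"  # hypothetical output names
    vcf = f"{prefix}.vcf"
    return [
        # Align the reads and sort the alignments into a BAM file.
        [["bwa", "mem", reference, fastq],
         ["samtools", "sort", "-o", bam, "-"]],
        # Index the sorted BAM (writes the .bai file).
        [["samtools", "index", bam]],
        # Pile up the reads and call variants into an uncompressed VCF.
        [["bcftools", "mpileup", "-Ou", "-f", reference, bam],
         ["bcftools", "call", "-mv", "-Ov", "-o", vcf]],
    ]


def describe(pipeline):
    """Render a pipeline as the equivalent shell command line."""
    return " | ".join(shlex.join(argv) for argv in pipeline)


def run_pipeline(pipeline):
    """Run the commands of one pipeline, chaining stdout to stdin."""
    procs = []
    prev_stdout = None
    for i, argv in enumerate(pipeline):
        is_last = i == len(pipeline) - 1
        proc = subprocess.Popen(
            argv,
            stdin=prev_stdout,
            stdout=None if is_last else subprocess.PIPE,
        )
        if prev_stdout is not None:
            # Close our copy so the upstream process sees a broken pipe
            # if the downstream process exits early.
            prev_stdout.close()
        prev_stdout = proc.stdout
        procs.append(proc)
    for proc, argv in zip(procs, pipeline):
        if proc.wait() != 0:
            raise subprocess.CalledProcessError(proc.returncode, argv)
```

Printing `describe(p)` for each `p` in `build_pipelines("ref.fa", "reads.fastq", "s1")` shows the command lines without running anything, which is a handy check before calling `run_pipeline` on real data.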

ADD COMMENT
0

The ones I suggested above are themselves Python workflows. You can use such a workflow directly and batch-run it from a Python processing script for all the samples.
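
Batch-running a workflow over all samples from a Python script might look like the sketch below. The helper names (`find_samples`, `run_batch`), the `.fastq` suffix, and the one-file-per-sample layout are assumptions for illustration; `process` stands for whatever function runs the actual workflow on one sample.

```python
from pathlib import Path


def find_samples(fastq_dir, suffix=".fastq"):
    """Map sample name -> FASTQ path for every file in fastq_dir ending in suffix."""
    samples = {}
    for path in sorted(Path(fastq_dir).iterdir()):
        if path.is_file() and path.name.endswith(suffix):
            # The sample name is the file name without the suffix.
            samples[path.name[: -len(suffix)]] = path
    return samples


def run_batch(fastq_dir, process, suffix=".fastq"):
    """Call process(name, fastq_path) for each sample.

    Returns two dicts: results of the samples that succeeded and the
    exceptions of the samples that failed.
    """
    done, failed = {}, {}
    for name, path in find_samples(fastq_dir, suffix).items():
        try:
            done[name] = process(name, path)
        except Exception as exc:
            # Keep going so one bad sample does not stop the whole batch.
            failed[name] = exc
    return done, failed
```

Collecting failures instead of stopping at the first one is a design choice that suits long batch runs; for a handful of samples, letting the exception propagate may be simpler.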

ADD REPLY
0
8.5 years ago
Zaag ▴ 870

Maybe have a look at this (scroll down for a BWA-samtools workflow example)

http://snakemake.bitbucket.org/snakemake-tutorial.html

From the intro: Snakemake offers a definition language that is an extension of Python with syntax to define rules and workflow specific properties

ADD COMMENT
