I have raw NGS data and would like to take a FASTQ file through a variant-calling workflow to a VCF file, using Python for all of the steps. Which tools can I use to process my FASTQ file all the way to VCF and then annotate my variants? Thanks in advance. By the way, I need to use Python. That is my professor's order :/
As much as I love python, that's a stupid requirement -_-;
Your Professor should be teaching you to find the best variant calling algorithm, not the one with the least number of curly brackets. Does subprocessing other tools count? :P
With all do respect, questioning his professor's IQ level is neither the right way to solve the issue, nor the proper way to work with a boss!
Professors give orders all the time, your task is to search for the plausibility of the task and prepare scientific arguments why it would/wouldn't work (as in this case).
due* respect. While I agree with your second statement, John was not questioning the Prof's IQ - he was merely remarking that it was unlikely the Prof seriously meant to enforce a language constraint on reinventing the wheel for a thoroughly solved challenge - if you look at it closely, that does sound stupid.
You wouldn't do everything in python, that'd be a waste of CPU cycles and your time programming. Rather, you'd use something like snakemake to build a convenient python-based pipeline. It's quite likely that this is what your professor meant.
I don't know what your position in this research is, but simply following your professor's orders is not scientifically correct; be critical and check alternatives. Most people use GATK AFAIK, so don't make it too hard on yourself by using something exotic.
I totally agree with what John is saying: if the requirement is to learn Python and how to code in it, there is no point in reinventing the wheel. You can write a processing script in Python, but that takes its own time, and your professor should understand that. It will not be new, out-of-the-box work, just a processing workflow whose major part is subprocesses calling BWA, GATK, or other downstream variant annotation tools. Devon is correct about the waste of CPU cycles as well. In that case I would look for an existing Python-based workflow that meets my requirements, test it, and show my results to the boss. That is how it will work: you have to learn in depth which tools you need, what you are using at each and every step of variant calling, and why you use them. That is more important than any processing script in any scripting language, unless your workplace has a strict policy on which languages to use. So take a look at the link below.
In addition, if you are going to do or have to do everything in python, will you write an aligner in python? I see you state that you start with fastq files.
Thanks to you all. I know this professor situation is kind of weird. But as you suggest, I could subprocess other tools, try an analysis, and show the results to my professor, which might convince him. So could you suggest tools that are written in Python so that I can start with them?
My workflow will first need an aligner, like BWA; then a tool such as samtools to convert the BWA output to BAM and BAI format; then bcftools to get to VCF format; and lastly an annotator like SnpEff or ANNOVAR.
I hope you can help me as you did in your previous answers. In the meantime I will try your other links and ideas. Thanks again.
The ones I suggested above are Python workflows themselves. You can use such a workflow directly and batch-run it from a Python script for all the samples.
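For what it's worth, the "subprocess other tools" idea from this thread could look roughly like the sketch below. It is only an illustration, not anyone's actual pipeline: the function names (`build_steps`, `run_steps`), the file names, and the intermediate-file layout are made up for the example, and it assumes `bwa`, `samtools`, and `bcftools` are on your PATH and that the reference has already been indexed with `bwa index`. Annotation (SnpEff/ANNOVAR) is left out because the command depends on which genome database you have installed.

```python
import subprocess
from pathlib import Path


def build_steps(ref, fastq, sample, outdir="."):
    """Return the pipeline as a list of (argv, stdout_path) pairs.

    stdout_path is the file that should receive the tool's standard output,
    or None when the tool writes its own output file.
    All file names here are hypothetical, chosen for the example.
    """
    out = Path(outdir)
    sam = out / f"{sample}.sam"
    bam = out / f"{sample}.sorted.bam"
    bcf = out / f"{sample}.mpileup.bcf"
    vcf = out / f"{sample}.vcf"
    return [
        # Align reads; bwa mem writes SAM to stdout.
        (["bwa", "mem", str(ref), str(fastq)], sam),
        # Sort the alignments and write BAM.
        (["samtools", "sort", "-o", str(bam), str(sam)], None),
        # Create the .bai index next to the BAM.
        (["samtools", "index", str(bam)], None),
        # Compute genotype likelihoods (uncompressed BCF).
        (["bcftools", "mpileup", "-f", str(ref), "-Ou", "-o", str(bcf), str(bam)], None),
        # Call variants, keep variant sites only, write plain VCF.
        (["bcftools", "call", "-mv", "-Ov", "-o", str(vcf), str(bcf)], None),
    ]


def run_steps(steps, dry_run=False):
    """Print each command and, unless dry_run is set, execute it."""
    for argv, stdout_path in steps:
        print(" ".join(argv) + (f" > {stdout_path}" if stdout_path else ""))
        if dry_run:
            continue
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "w") as handle:
                subprocess.run(argv, stdout=handle, check=True)


if __name__ == "__main__":
    # Dry run: only prints the commands, so it works without the tools installed.
    run_steps(build_steps("ref.fa", "reads.fastq", "sample1"), dry_run=True)
```

To batch-run all samples you would loop over your FASTQ files and call `run_steps(build_steps(...))` for each one. Once this grows beyond a handful of steps, a workflow manager such as snakemake is probably the more comfortable choice.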
What if this is a "learn python" (the hard way) exercise? Variant calling just happens to be an end point.