Hi, I have 50 samples with gene counts from an RNA-seq experiment, and I want to compare my counts with those from another study. I have two text files, each containing a table of samples and genes; I've linked the data here so you can see its structure. My own counts are stored separately, in one folder per sample. I want to build two dictionaries so that I can compare the values by sample or by gene, but I don't know how to do it. Can anybody help me get started? The count tables I want to compare against:
first 30 samples: https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSE98212&format=file&file=GSE98212%5FH%5FDE%5Fgenes%5Fcount%2Etxt%2Egz
remaining 20 samples: https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSE98212&format=file&file=GSE98212%5FL%5FDE%5Fgenes%5Fcount%2Etxt%2Egz
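For the dictionary part of the question, here is a minimal sketch in plain Python. It assumes each count table is tab-separated, with a header row of sample names and the gene ID in the first column; I have not checked that against the GEO files, so adjust the delimiter and columns if the real layout differs. The function names and the tiny tables at the bottom are made up for illustration.

```python
import csv
import gzip
import io


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt", newline="")
    return open(path, "r", newline="")


def read_count_table(handle, delimiter="\t"):
    """Return {gene: {sample: count}} from a table with a header row.

    Assumption: first column is the gene ID, remaining columns are
    per-sample counts with no missing values.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    samples = header[1:]
    counts = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        gene, values = row[0], row[1:]
        counts[gene] = {s: float(v) for s, v in zip(samples, values)}
    return counts


def by_sample(counts_by_gene):
    """Flip {gene: {sample: count}} into {sample: {gene: count}}."""
    flipped = {}
    for gene, per_sample in counts_by_gene.items():
        for sample, value in per_sample.items():
            flipped.setdefault(sample, {})[gene] = value
    return flipped


def compare_genes(mine, theirs):
    """Split gene IDs into shared / only-mine / only-theirs sets."""
    mine_genes, their_genes = set(mine), set(theirs)
    return {
        "shared": mine_genes & their_genes,
        "only_mine": mine_genes - their_genes,
        "only_theirs": their_genes - mine_genes,
    }


# Tiny made-up tables standing in for the real files.
published = read_count_table(io.StringIO("gene\tS1\tS2\nA\t10\t0\nB\t5\t7\n"))
own = read_count_table(io.StringIO("gene\tS1\nB\t6\nC\t1\n"))
result = compare_genes(own, published)
```

For a real file you would pass an open handle instead of the `io.StringIO` stand-in, e.g. `with open_text(path) as fh: published = read_count_table(fh)`. Keeping one dictionary keyed by gene and a flipped one keyed by sample lets you look things up either way without re-reading the files.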
You can have a look at DESeq2 or edgeR
Thanks for your advice, but the problem is that I want to do it in Python rather than with these packages. Thank you again.
That's a huge constraint for statistical stuff. Why do you absolutely want a Python solution?
That is what my supervisor is asking for, so I have to do it in Python! Also, the project I want to compare my data with used edgeR! I am desperate to make this approach work!
If it is a school assignment, you should have said so straight away in your post. If you're a Master's intern or PhD student, you still have the right to discuss the techniques used with your supervisor. I suggest you read these two papers and look into Python packages for statistical RNA-seq analysis, build your arguments around what you read, and discuss them with your supervisor. The tools for statistical interpretation of RNA-seq data are in R packages.
I am a master's student, yes indeed. I will follow your advice! I really appreciate that! Thanks again! As for the last part about R, we don't use it here. I am a biotechnologist, but my supervisor is a bioengineer, so we work in Python, not R! He doesn't really get into the R language!
There are attempts to port DESeq to Python, but none of them are fully complete AFAIK.
You could 'cheat' and use rpy2 and just make all your function calls to R via Python if you are absolutely hell-bent on adding a layer of complexity to appease your boss.
The problem is that I don't know the R language! But I'll try to handle it! Thanks anyway.
You don't really need to (and it's not very difficult anyway, to be honest). If you're following the DESeq tutorial/walkthrough, they show you exactly what commands are needed at each stage. You'd just be wrapping those commands up in Python instead of running them directly in R.
Unless you have plenty of time on your hands and want to port the whole thing over to Python natively from the ground up...
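To give an idea of what wrapping the R commands in Python could look like, here is a rough sketch using edgeR, since that is what the other project used. It is not a tested pipeline. The file names and group labels are placeholders, the two helper functions are hypothetical, and the edgeR steps are the classic two-group calls (DGEList, calcNormFactors, estimateDisp, exactTest, topTags), which may not match what the original authors ran. Actually running it needs R, edgeR and rpy2 installed.

```python
def build_edger_script(counts_path, groups, out_path):
    """Return R code for a basic two-group edgeR comparison.

    counts_path: tab-separated count table, gene IDs in the first column
                 (an assumption about the file layout).
    groups:      one label per sample column, in column order.
    out_path:    where R should write the results table.
    """
    group_vector = ", ".join('"{}"'.format(g) for g in groups)
    return "\n".join([
        "library(edgeR)",
        'counts <- read.delim("{}", row.names = 1, check.names = FALSE)'.format(counts_path),
        "group <- factor(c({}))".format(group_vector),
        "y <- DGEList(counts = counts, group = group)",
        "y <- calcNormFactors(y)",
        "y <- estimateDisp(y)",
        "et <- exactTest(y)",
        "res <- topTags(et, n = Inf)$table",
        'write.table(res, "{}", sep = "\\t", quote = FALSE, col.names = NA)'.format(out_path),
    ])


def run_r_script(script):
    """Evaluate the R code through rpy2 (imported only when called)."""
    import rpy2.robjects as ro
    return ro.r(script)


# Placeholder inputs; nothing is executed in R until run_r_script is called.
script = build_edger_script("counts.tsv", ["A", "A", "B", "B"], "edger_results.tsv")
```

Calling `run_r_script(script)` would hand the whole string to R in one go, and you can then read `edger_results.tsv` back into Python dictionaries for the comparison. Printing `script` first is a cheap way to check the R commands against the edgeR walkthrough before running anything.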