Hi, I know others have asked similar questions, but they were not quite what I was looking for. I am a Master's Bioinformatics student with a BA in Biology. I need a project idea that would last 3 months (my Professor did not help). I was thinking something along the lines of getting sequencing data, trimming it, aligning it, and doing a differential expression analysis, but I don't know where to get the data for this (perhaps NCBI?). If you have other ideas for projects as well, that would be good too. I was also thinking about a project having to do with GWAS. My only experience in programming is what I learned in my Master's program.
Thanks for all your help
Do you have any experience with Bash scripting? It might be helpful to specify what programming you learned in your Master's program. It would also help if we had some more info about the goal of your project. Are you supposed to learn something about a specific tool, or contribute something unique to the community (etc.)?
Yes, I have experience with Bash scripting, Perl, Python, and R, and I wish to use these in the project. I have statistical analysis experience with R as well.
I would just do something 'simple' like:
Do that and then Bob will be your uncle
Step 1 is Web browser; 2 and 3 are shell / BASH; 4 and 5 are R Programming Language
Of course, please come back here for help if needed.
Kevin
Thank you!! I was thinking of doing the steps you mentioned above multiple times for various datasets to test some sort of hypothesis (because I need this project to last the 3 months mentioned in the question). Is there some sort of hypothesis I could test by comparing multiple datasets in this fashion?
Let me think. Is this your major project for your MSc? How would you rate its importance? From my perspective, a Master's student should not necessarily have to add anything new to the literature; thus, merely doing a re-analysis should be sufficient.
Yeah it's not an official paper or anything like that. But I just need to figure out how to extend it for an entire semester. This is really just something that I'm doing because I could not find an internship/co-op, so my Professor let me choose this option.
Well, you could do something like download multiple cancer datasets and aim to come up with a 'pan-cancer' panel of markers (including non-coding RNAs). After you do your standard differential expression analyses to identify differentially expressed genes (DEGs), you could then do something 'cool' like refining the panel signature using lasso regression. I put some code here, which may help: A: How to exclude some of breast cancer subtypes just by looking at gene expressio
Here are some other posts of mine, which may help to give you further ideas:
Thanks!! But considering the read files are so large (500MB or more), wouldn't they be too big for expression analysis in R on my laptop? I have access to my school server for trimming and abundance estimates, but could the R portion of the differential expression analysis be done completely on the server, or would I need to do some of it in RStudio?
500MB is peanuts these days... we now deal in at least gigabytes. The large projects deal with petabytes (ICGC, TCGA, 1000 Genomes).
If you have a relatively new laptop (?), 500MB should not be a problem. R should also be installed in your compute cluster environment, but that question needs to be directed to your local IT person, or directly to central IT. Installing a version of R for global use is not difficult - just depends on who the System Admin is. I've done it before whilst managing large clusters.
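Before contacting IT, a quick check from your shell session on the server will tell you whether R is already on your PATH. This is only a minimal sketch; the function name `check_r` is made up for illustration, and a negative result does not prove that R is absent, because some servers only expose software after an extra set-up step that your IT person can tell you about.

```shell
#!/usr/bin/env bash
# Report whether Rscript (installed alongside R) is on the PATH of this machine.
# 'check_r' is a hypothetical helper name used only for this example.
check_r() {
    if command -v Rscript >/dev/null 2>&1; then
        echo "R found at: $(command -v Rscript)"
        # Some R versions print the version string to stderr, hence 2>&1.
        Rscript --version 2>&1 | head -n 1
    else
        echo "R not found on PATH"
    fi
}

check_r
```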
Thanks again! I am actually working on my laptop from home, and I connect to the school server through bash...so would I be able to do everything regarding the R-part on vim while on the server? Maybe I could save the graphs that I need as a PDF? Do I really need to use the R software interface?
Are you worried about the plotting window not showing when you use R on the server? To get that part working, you need to set up X Window (X11).
Are you using Windows? If 'yes', then install a program called Xming and leave it running in the background. Then, when you log in to the server using [hopefully] PuTTY, go to:
Connection > SSH > X11
and check the 'Enable X11 forwarding' checkbox, and also enter the following into the text box: localhost:0
That will then transmit plotting windows from the server to your laptop.
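If all you want is the PDF, as you suggested, you should not need a plotting window at all: R's `pdf()` device writes straight to a file, so a script edited in vim and run with `Rscript` is enough. Here is a minimal sketch of that idea; the file names `make_plot.R` and `example_plot.pdf` and the toy plot are placeholders, not part of any real analysis.

```shell
#!/usr/bin/env bash
# Write a small R script to disk (file names here are placeholders).
cat > make_plot.R <<'EOF'
# pdf() opens a file-based graphics device, so no plotting window is needed.
pdf("example_plot.pdf", width = 7, height = 5)
plot(1:10, (1:10)^2, main = "Example plot", xlab = "x", ylab = "x squared")
dev.off()  # close the device so that the file is fully written
EOF

# Run the script non-interactively if R is available on this machine.
if command -v Rscript >/dev/null 2>&1; then
    Rscript make_plot.R && echo "Wrote example_plot.pdf"
else
    echo "Rscript not found; run make_plot.R on the server instead"
fi
```

You can then copy the PDF from the server to your laptop (for example with `scp`) and open it there.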