I put a collection of videos on YouTube to help new users get started with EaSeq (described in this post).
I know that posting links to video tutorials is slightly off compared to the other tutorials here. But now that they are made, I thought it would be appropriate to mention them here.
How to import ChIP-seq data and a set of genomic regions into EaSeq (the example data can be downloaded here).
How to make heatmaps from the imported data above. In the example we visualize histone marks (H3K27me3 and H3K36me3) at CpG-islands and sort them according to their sizes.
One way to get further is to use the tutorials integrated within the program, which guide users step by step through a particular visualization or analysis using their own data or the example data.
All of the steps above are also combined in a 27-minute video.
There is much of EaSeq that is not covered here, and I will have to add new videos / instructions in steps. As I am rather untrained in making video tutorials, I'd appreciate feedback on topics that you would like to see demonstrated or that need to be explained better.
I am making a page on EaSeq's website that also covers this, and it will be updated as new tutorials are made.
Add "for windows" in the title (of this and your original post) and add a "windows" tag as well. People won't realize that the software is for windows by just looking at the title.
I watched the 27-minute video all the way through, and I only have a few minor points:
When programmers make a video on some software they wrote (but it's also true for presentations and live demonstrations) they often make the mistake of conflating "why you should use this software" with "how to use this software". I know this because I am the worst offender :P and you actually do a pretty good job of explaining why feature xyz is useful -- but this is still a bad approach. A "why to use" video is advertising. Sell the strengths of the software (which EaSeq has plenty of!), because that's all most people want to know -- what can this software do for me? I don't care how or where the button to do xyz is, I'll learn that later; for now I just want to know what your software can do and if it's worth downloading. When you make a video on how to do something, then those videos should be framed around a specific analysis goal.
You speak clearly and in a relaxed way, which is enjoyable for someone like me who has 30 minutes to burn on a Saturday afternoon - but it may be a little too slow-paced for most people. Most people outright refuse to watch videos beyond 5-10 minutes, and the best "why you should use this software" is 3 minutes and under. Even the zoom transitions used were slow. Again, I suck at doing this too even though I know better, because it's not easy to cram 1000s of hours of work into 3 minutes, but it's important.
The resolution is pretty low on this YouTube transcoded version. You should perhaps try reuploading to Vimeo in full resolution and see if that looks better. Another option, which is something I do a lot, is to reduce your screen resolution before you hit record, so that what you see as you record is closer to what your viewers will see after some lossy compression, etc.
As for EaSeq itself, I think it's a great project and a huge step in the right direction for Bioinformatics. There are obviously multiple implementation things I don't fully agree with -- and that's true of all software I use -- and EaSeq is no different, but the real single biggest issue I think you're going to face going forwards is the Windows-only thing. I get that the software is aimed at Biologists more than the computer scientists, so there's a case to be made that Windows is what the target audience is running...
but I don't know... maybe. But my gut feeling is that something like 80-90% of people who have access to the data are running Linux or OSX, and the 10% who are using Windows are not going to hear about EaSeq via word of mouth due to... um... herd-immunity. In short, you may find that you need the blessing of the established bioinformatic core before the people you are targeting will find out about the software in the first place, and those established bioinformaticians probably don't have a Windows Licence Key lying around to test your software out. Perhaps I'm being too negative, but it's an honest concern I have for what I believe is a great vision of how bioinformatics could be done.
but the real single biggest issue I think you're going to
face going forwards is the Windows-only thing.
I have not watched the videos/looked at the software (yet) but Windows compatibility may become a big plus for this project. There are no open source/Windows options for doing NGS data analysis now (and at least two or three people each week ask about Windows options on Biostars). Many biologists learn unix because they can't find an option for Windows. Some like it and will use it long term, but for many, NGS analysis is only one part of the whole project, and the quicker they can get the analysis done, the sooner they can move on to the more interesting/difficult part of the experiment. If one is in academic/industrial research, access to a Windows license is not going to be a problem. Only problem @Mads may face is to find additional people to contribute and keep the project vibrant. Having many use this project would inevitably lead to support/feature request demands.
While you raise a good point as always, genomax, I don't think "it doesn't exist so if it did it would be good" is entirely guaranteed. There are currently no NGS data analysis options for the Game Boy Advance for example, and for good reason.
Regarding the Windows licence issue - I think it's a big deal. Neither of us has used the software yet either, and I think that's very telling. I want to - but I first have to figure out how I'm going to run Windows on this computer, and then second, once I have Windows running, how I'm going to get my data off the cluster and into it. That would probably take 50 minutes or so for me to arrange, and, you know, even on a Saturday afternoon I don't have that much time to spare right now to demo an app.
However, if the software would run on Linux, I'd have apt-get'd it and tested it out in less time than it took to write this comment.
You are not the intended audience for this software :-) It is targeted at those who already have/use Windows and would like to hit the ground running (unless @Mads wants to correct me). I have to give you credit. You watched a 30-minute video for software that you knew you were not going to be able to use/test right away.
That is correct. If you speak R or Linux fluently, then the (gain)/(time to learn) is of course more modest and depends on the type and extent of the work.
Seasoned bioinformaticians might however be interested in telling their local wet-lab scientists about EaSeq, so that they can do data visualization / analysis more autonomously.
It is my impression that many bioinformaticians 1) spend a lot of time on extracting some grain of truth from noisy data with no prior quality test, 2) perform the same tasks (heatmaps, log2fd, etc) over and over again, and 3) prefer communicating with wet-lab scientists who actually know what analyses they want and can explain them.
So I thought it would be a win-win to post here, so that wet-lab scientists can make some rough quality tests, do the most mundane analyses themselves, and let their ideas mature, before involving bioinformaticians. Life scientists would on the other hand be able to explore their data and confirm / reject hypotheses faster.
Finally, it is my impression that this forum is visited by a diverse group of people - also people who are unfamiliar with R and Linux. I think that EaSeq might help this group of people to become productive faster.
I agree that playing OS-setup is a rather boring game. I'll prioritize finding out how feasible a cross-platform version is to make - and more importantly how robust it will be...
Thanks for the encouraging viewpoint. Time will tell. I think a cross-platform solution would be preferable and will pursue that if feasible.
Only problem @Mads may face is to find additional people to contribute and keep the project vibrant.
Yes. That is true. I have some plans and work that will improve the project, but my reach is of course quite limited. I will have to "sew the parachute while falling"... :-)
Most people outright refuse to watch videos beyond 5-10 minutes, and the best "why you should use this software" is 3 minutes and under.
I second this sentiment. Personally, I do not watch videos at all, ever, unless absolutely necessary, and one of my biggest pet peeves is video tutorials that are not accompanied by detailed text describing all of the topics demonstrated in the video. It can take me less than 2 minutes to skim a written tutorial and find the information I need, whereas searching a video tutorial for specific information is extremely cumbersome and time consuming. It is also impossible to Google information contained in a video, or Ctrl+F for keywords on a page. IMO, a detailed written tutorial & documentation should be the primary focus with video only as a supplement. A GUI based software package would of course benefit from liberal use of screenshots as well.
Thank you very much for your honest and comprehensive feedback - and for taking the time. I appreciate your line of thinking and agree with many of the points - also the less pleasant ones.
The video was intended as a how-to, and not to "sell" the program. I made a demo movie last year that was aimed at giving editors and users some whys, but I am an amateur when it comes to selling stuff (easeq.net/demo). I'll keep in mind to make a clearer distinction between why and how.
I am usually criticised for speaking too fast, so I did my best to slow down :-) Point taken. Very true. You'll never see a 27-minute video from me again. It is also much less challenging to record shorter stretches.
That is odd. It is HD when I play it.
Windows. I am not going to disagree on this one either. If I am capable of it, I'll port it using Mono - but that is a big if and presumably not done overnight. On the other hand, MS seems very dedicated to making .NET cross-platform, so I might get better tools for that soon.
Implementations. It is hard to make a tool that everyone will agree is just right. I read your posting on peak finding being a poor representation of richer data. It was thought provoking, but I did not manage to make up my mind, and will have to get back on that. It would be interesting to contemplate how that can be improved. I am offline for the next few days, but will post in that thread when I get back.
Thanks again...
Honestly, the alternative to peak calling is exactly what you have created with EaSeq :)
When data has to be transferred from disk to brain, people instinctively want to try to reduce the amount of data to transfer - ideally in the form of a single p value, total number of mapped reads, QC pass/fail, etc. What is really going to push us forwards in Bioinformatics is not reducing/simplifying the data in terms of content, but making the data transfer from computer to brain faster. Real-time analysis programs like yours - I don't know - I think it's the future.
That is great, then :-) You are right in that many abstract and reduced presentations become too simplistic - and it might also be hard to assess their robustness. I have always been troubled by the focus (of PIs in particular) on the absolute number of peaks as a measure of anything.
Add "for windows" in the title (of this and your original post) and add a "windows" tag as well. People won't realize that the software is for windows by just looking at the title.
Cheers. That is something I should have done from the beginning. I'll do that.