Hi colleagues! Have a great weekend =)
1. In your experience, what is the best way to distribute bioinformatics software, and how do you prefer to get software yourself? Do you prefer to download compiled versions from https://sourceforge.net/, or is it crucial for you to see an active GitHub repository and be able to compile the code yourself?
2. When you distribute your tools, what do you measure about their usage? How do you encourage people to cite your tools as papers (not just as web links)? Are there any tools to count how many times your software was cited as a web link in papers? Do you try to collect information about the users of your software, such as email addresses, names, countries, or phone numbers?
3. How do you handle licensing and liability issues, especially for something you built in your spare time as a side project? How do you protect yourself, or maybe even find a way to make some money from your tools?
Maybe you know a good tutorial on this? If not, let's answer these questions in the discussion below and turn it into a tutorial for everybody interested.
UPDATE: two more questions (thanks to genomax2):
4. What do you think about making your tool available as a library or package for Python, Ruby, Perl, R, or other languages? Is this important and needed?
5. Are there any collaborations we can join so that our tools become part of bigger packages, while we keep the ability to publish about them, retain control, provide support, and maybe sell them as well?
UPDATE: two more questions (thanks to Arnaud):
6. As a tool developer and as a user, what do you think of software tools as plugins, and how can one develop a plugin for, say, samtools?
7. Is Galaxy (or similar solutions) a good way to distribute software for tool developers and for users?
Thank you,
Petr
Thank you!
Thank you shenwei356. Could you please say if you
In general, how do you want to download and install the software you use for science?
Also, as a software developer and scientist, what is more important for you (especially for your career, when writing grant proposals, etc.): how many times your paper was cited, how many times your software tools were downloaded, or how many times your tools were used in a paper but cited as a web page?
Source code and binary files are both OK for me, but the latter is better. Compiling from source is sometimes hard for users, especially beginners. So I chose to write scientific software in Go, which lets me cross-compile single binary files for Linux/Windows/OS X. Users really like this.
Is this a poll??
Lots of things are important for my career, but citations come first. You can see my paper citations on Google Scholar.
I haven't counted the cases of being cited as a web page. SeqKit v0.4.3 was downloaded 300+ times, but there are no citations or web page links so far :(. Users may just not have published their papers yet :)
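For anyone who wants to track per-version download counts like the one above, here is a minimal sketch in Python. It assumes the releases are hosted on GitHub, whose REST API reports a `download_count` for every release asset. The function names, the asset names, and the numbers in `sample` are all made up for illustration.

```python
import json
from urllib.request import urlopen


def fetch_releases(owner, repo):
    """Download release metadata for a public GitHub repository.

    Unauthenticated requests are rate limited and return only the
    first page of releases.
    """
    url = f"https://api.github.com/repos/{owner}/{repo}/releases"
    with urlopen(url) as response:
        return json.load(response)


def downloads_per_release(releases):
    """Sum the download_count of every asset, grouped by release tag."""
    totals = {}
    for release in releases:
        assets = release.get("assets", [])
        totals[release["tag_name"]] = sum(a["download_count"] for a in assets)
    return totals


# Hypothetical sample shaped like the API response (made-up numbers).
sample = [
    {"tag_name": "v1.0", "assets": [
        {"name": "tool_linux_amd64.tar.gz", "download_count": 120},
        {"name": "tool_windows_amd64.zip", "download_count": 60},
    ]},
    {"tag_name": "v1.1", "assets": []},
]
print(downloads_per_release(sample))
```

Calling `downloads_per_release(fetch_releases(owner, repo))` with a real owner and repository name would give live numbers. Keep in mind that downloads are not the same as users or citations.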
Kind of a poll, yes. I want to align my own preferences with our community's preferences, standards, and needs, so I thought this would be a great place to discuss how we distribute our software, how we analyse its distribution, and how we personally like to get software (and how we cite it).
For example, I love tools that work out of the box on Linux and macOS: no installation, no dependencies, just put the tool in the PATH or update the PATH and you are good to go (although I prefer to use direct links, which gives me more control over the version of each tool). I do not care about Windows, even though one of my work laptops runs Windows. I like to have access to all versions of a software tool just in case, and I tend to keep every version I have used in an archive. I hate providing any information about myself just to download and try a tool. Moreover, I do not mind helping to make a tool better if I like it, but I have not seen a tool that encourages this. Most tools encourage citing the paper, but to be honest, I sometimes cite a web link instead of a proper reference, especially if I cannot find a paper to cite within a few minutes (maybe even a few seconds). Also, I am OK with software that checks with its web server to tell me about available updates.
To conclude, I think there is a gap between what users want and do and what software tool developers want and do in terms of software distribution. Maybe I am wrong and there is no gap. But if there is one, we can address it from both sides.
I do care about Windows users, who make up about 1/3 of all users of seqkit and csvtk according to the download history.
There's no need to hate everything that asks you to fill in information before downloading; the developers may just want to track users or run a poll. For example, the SPAdes download page encourages users to fill in their information but also explicitly provides direct download links.
GitHub is a good place to communicate with developers and to contribute to or help improve projects.
Published software that encourages citation usually provides links to the paper.
Software that checks for updates is common; I did this too. Using package management tools like conda and brew is also a good way to keep tools updated.
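The comparison step of such an update check can be very small. Below is a rough sketch in Python, assuming purely numeric dotted versions such as `0.4.3`. The function names are hypothetical, and how the tool learns the latest version (a web server, a release feed, etc.) is left out on purpose.

```python
def parse_version(text):
    """Turn 'v0.4.3' or '0.4.3' into a tuple of ints such as (0, 4, 3)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def update_message(current, latest):
    """Return a notice if `latest` is newer than `current`, else None."""
    if parse_version(latest) > parse_version(current):
        return f"Version {latest} is available (you have {current})."
    return None


print(update_message("0.4.3", "0.4.4"))
```

Comparing tuples of integers rather than raw strings matters here: as plain strings, "0.10.0" would sort before "0.9.9".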