FastQC, for example, doesn't seem to have a publication associated with it. How would you cite it?
Just to say, thank you for wanting to cite tools. From our investigations, we've found that over 2/3 of the papers that mention Ensembl do not cite our papers. No idea how many more than that use our stuff and don't even bother to mention us. It's as if people think that building a bioinformatic database or tool is not real science, so doesn't need acknowledgement. In science, citations are currency, and using someone's work without citing them is essentially theft.
I suggest you use the author, date and link.
How does this look?
Andrews S. (2010). FastQC: a quality control tool for high throughput sequence data. Available online at: http://www.bioinformatics.babraham.ac.uk/projects/fastqc
If anyone is looking for BibTeX, here you go:
@misc{andrews2012,
address = {{Babraham, UK}},
title = {{{FastQC}}},
copyright = {GPL v3},
abstract = {FastQC aims to provide a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines. It provides a modular set of analyses which you can use to give a quick impression of whether your data has any problems of which you should be aware before doing any further analysis.},
howpublished = {Babraham Institute},
author = {Andrews, Simon and Krueger, Felix and {Segonds-Pichon}, Anne and Biggins, Laura and Krueger, Christel and Wingett, Steven},
month = jan,
year = {2012}
}
I would also try to include the DOI in the citation. If the software is described in a bioRxiv preprint, it will already have a DOI; otherwise, if the software is on GitHub, the owner of the repository can obtain one from Zenodo (https://zenodo.org/).
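For anyone assembling these entries by hand, here is a minimal Python sketch of the "author, date and link" suggestion, with an optional DOI field. The function name `bibtex_misc` and the citation key are made up for illustration, and the `\url{...}` wrapper assumes your LaTeX document loads the `url` or `hyperref` package.

```python
def bibtex_misc(key, author, title, year, url, doi=None):
    """Render a minimal @misc BibTeX entry for a piece of software.

    Hypothetical helper for illustration; not part of any library.
    """
    fields = [
        ("author", author),
        # Extra braces keep BibTeX styles from lowercasing the tool name.
        ("title", "{" + title + "}"),
        ("year", str(year)),
        # Assumes \usepackage{url} or \usepackage{hyperref} in the document.
        ("howpublished", "\\url{" + url + "}"),
    ]
    if doi:
        # Only added when the tool actually has a DOI (e.g. minted via Zenodo).
        fields.append(("doi", doi))
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields)
    return f"@misc{{{key},\n{body}\n}}"


entry = bibtex_misc(
    key="andrews2010fastqc",  # illustrative citation key
    author="Andrews, Simon",
    title="FastQC: a quality control tool for high throughput sequence data",
    year=2010,
    url="http://www.bioinformatics.babraham.ac.uk/projects/fastqc",
)
print(entry)
```

Passing `doi="..."` appends a `doi` field. No DOI is shown for FastQC here because none is given in this thread.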
2/3 of my wet-lab work never resulted in authorship either, and just like writing a tool, I'm talking about months of work. It's certainly a lot easier to use FastQC or Ensembl datasets without acknowledgement than to use the work of a BSc/Masters/PhD student without it, particularly if there's no 'proof' of usage in the paper ("The quality of our data... ...looked good based on a variety of metrics / ...compared favorably to other publicly available datasets."). However, I think it must have been Socrates who once said: "Don't hate the player, hate the game". More and more these days, science appears to be a zero-sum game, particularly to young scientists, who after their PhD have only a 1 in 200 chance of becoming an independent researcher. Under those conditions, a mentality of "take as much credit as possible, and only hand it out where absolutely necessary" is inevitable. I wish we could go back to the old days, but I think it's more realistic for tool/service developers to forget about asking people nicely to do the right thing, and instead think of ways to ensure citations where tools were used and penalties where they are not cited. Alternatively, research like yours showing the disparity between use and citations needs to be done to really make it known how valuable services like Ensembl are to the life sciences.
Thank you for providing tools for all of us! I have nothing but respect for the people behind good, open source software.
Great point. However, journal reviewers can also comment on the references. In principle, they can use this power to require everyone to give proper credit. Is that what happens? I don't know. So researchers are just one part of the whole story.
But if 2/3 of the people writing papers think they don't need to cite databases or tools, then a similar 2/3 of reviewers probably have the same attitude. Or maybe 80% of both groups think you don't need to cite, and the citation rate only gets as high as 1/3 because the small number of referees who think it's important point it out.
Forgetting or neglecting to cite is one thing, but there is also the problem of the reference limits that some journals impose on certain brief formats, which sadly make it impossible to properly cite every single tool used in the study. Perhaps tools should be excluded from such limits.
I faced this problem recently: I had ~55 citations, but the journal had a limit of 30. So I removed all citations of bioinformatics programs, which brought the count down to, coincidentally, 30.
Note that journals often restrict the number of references included in the manuscript, making the "standard" tools/databases/repositories such as Ensembl the first victim.
I keep telling people this because having the citations also assists others in reproducing analyses. Unfortunately, many journal editors and reviewers don't want to know about the bioinformatics methodologies and, thus, this information is frequently overlooked in published work.