2.4 years ago
grant.hovhannisyan
Hi all,
Is there any software or workflow that can automatically clean and tidy up the file system for bioinformatics projects? For example: remove or tag duplicated datasets, find identical BAM/SAM files and make appropriate symlinks between them, compress uncompressed files, report the space that could be saved, etc. Basically a CCleaner for bioinformatics.
cheers
This type of functionality is built into high-performance storage systems at the OS level (that is, in the operating system of the storage device itself).
One example: NetApp Data ONTAP. https://library.netapp.com/ecmdocs/ECMP1368859/html/GUID-7D804119-9EB1-4CE8-B02D-6AF215D17201.html
Thanks! It seems to be a generic system for HPC, but I was wondering whether something bioinformatics-specific exists.
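For the duplicate-finding and "report potentially saved space" parts of the question, here is a minimal Python sketch, not an existing tool. The function names (`file_digest`, `find_duplicates`, `reclaimable_bytes`) are made up for illustration. It assumes that "duplicate" means byte-identical files, so two BAM files holding the same alignments but written with different headers or compression levels would not be matched. It only reports; it does not delete or symlink anything.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return groups (sorted lists of paths) of byte-identical files under root."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so files that are already linked are not reported.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            by_size[os.path.getsize(path)].append(path)

    groups = []
    for size, paths in by_size.items():
        # A unique size cannot be a duplicate; empty files save no space.
        if size == 0 or len(paths) < 2:
            continue
        # Only files sharing a size are hashed, which avoids reading most data.
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups


def reclaimable_bytes(groups):
    """Bytes that would be freed if each group kept a single copy."""
    return sum(os.path.getsize(g[0]) * (len(g) - 1) for g in groups)
```

Calling `find_duplicates` on a project directory and passing the result to `reclaimable_bytes` gives a rough upper bound on what replacing duplicates with symlinks could save. Hard-linked copies are counted as duplicates here even though they already share storage, so the estimate can be too high on file systems where hard links are used.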