Hello,
I have received Illumina sequencing reads for 100 samples. Each sample has its own subfolder containing 8 R1.fastq.gz and 8 R2.fastq.gz files. I want to run an MD5 check on all the fastq files in each subfolder.
My folder structure looks like:
/mydata/sequencing/clean
/mydata/sequencing/clean/Sample1
/mydata/sequencing/clean/Sample2
......
/mydata/sequencing/clean/Sample100
I am running the following command from the /mydata/sequencing/clean directory:
find . -type f -exec md5sum-lite {} \;
This command prints the MD5 results to the terminal.
How do I get these results written to a .txt file? Also, is there a more elegant way to do this MD5 check?
Thank you.
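There are three ways to do this, depending on where the checksum files should go and how they should be named. The examples match "*.pdf"; change the pattern to "*.fastq.gz" for your reads.
To store each file's md5sum next to the file itself (with --dry-run, parallel only prints the commands it would run; remove --dry-run once the printed commands look right):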
$ find . -type f -name "*.pdf" | parallel --dry-run md5sum {} ">" {.}.md5sum
To store one md5sum file per input file, all in a single location (the current directory in the example below) regardless of where the input files are. Note that input files sharing the same basename in different folders would overwrite each other's md5sum file:
$ find . -type f -name "*.pdf" | parallel --plus --dry-run md5sum {} ">" {/.}.md5sum
To store all the md5sums in a single file, all.md5, in the current directory:
$ find . -type f -name "*.pdf" | parallel --plus md5sum {} > all.md5
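If GNU parallel is not installed, the single-file variant can probably also be done with find and xargs alone. The following is a minimal sketch, not a tested pipeline for your data: it assumes GNU findutils and coreutils (for -print0, sort -z, xargs -0 -r and md5sum), and the function name md5_all and the output name md5_results.txt are made up for illustration.

```shell
#!/usr/bin/env bash
# md5_all is a hypothetical helper, not part of any tool mentioned above.
# It checksums every *.fastq.gz below a root directory and writes the
# results to a single text file, one "checksum  path" line per file.
md5_all() {
    local root="${1:-.}"               # directory holding the Sample* folders
    local out="${2:-md5_results.txt}"  # assumed output file name
    # -print0 / -z / -0 keep paths with spaces intact; sort gives a stable order;
    # -r stops xargs from running md5sum when nothing matches.
    find "$root" -type f -name '*.fastq.gz' -print0 \
        | sort -z \
        | xargs -0 -r md5sum > "$out"
}

# Example usage (paths taken from the question):
#   cd /mydata/sequencing/clean
#   md5_all . md5_results.txt
#   md5sum -c --quiet md5_results.txt   # re-check later; prints only failures
```

Because the paths are recorded relative to the directory where find ran, md5sum -c has to be run from that same directory. Restricting find to '*.fastq.gz' also keeps the output file itself out of the list, which a plain "find . -type f" redirected into the current directory would not do.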
Thanks a lot for the helpful comment.