We used a sequencing service, and they gave us a 100 GB tar file to download. After downloading, I checked the md5sum and it matches theirs. But after I extracted the tar file and found fastq.gz files inside a folder, running gunzip -c filename.fastq.gz | head gives a "not in gzip format" error. Running file filename.fastq.gz reports "data" (not "gzip compressed data", as I would expect). When I just double-click a fastq.gz file, it goes into a .gz/.cpgz loop. Is it possible that they gave us corrupt files?
What's the output of file xxx.tar and file xxxx.gz? You can also ask the service provider for help.
The outputs are "POSIX tar archive (GNU)" and "data". I've contacted them, but no answer so far. I just wanted to ask here to see what else I can try. Thank you.
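When file reports only "data", one further check is to look at the first two bytes of the file directly, since a gzip stream always begins with the magic bytes 1f 8b. The following is a minimal sketch of that check; the function name is_gzip is hypothetical and only used for illustration here.

```shell
# Hypothetical helper: prints "gzip" if the file starts with the
# gzip magic bytes 1f 8b, and "not-gzip" otherwise.
is_gzip() {
    # head -c 2 reads the first two bytes; od prints them as hex;
    # tr removes the spaces and newline that od adds.
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    if [ "$magic" = "1f8b" ]; then
        echo "gzip"
    else
        echo "not-gzip"
    fi
}

# Usage (file name taken from the question):
#   is_gzip filename.fastq.gz
```

If this prints "not-gzip" for a file named .fastq.gz, the content is something other than a gzip stream, whatever the extension says.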
How did you extract the tar file, and on what OS? It is possible that the file was corrupted during download if you did not download it in binary mode. Since the md5sum is OK, that possibility is slim, though.
I just double-clicked the tar file on my Mac (macOS Sierra)... maybe I should try extracting it in other ways. Thank you.
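One of the "other ways" is to extract the archive from the Terminal with tar instead of double-clicking it, which bypasses the graphical Archive Utility. The sketch below assumes nothing about the actual cause of the problem; the function name extract_tar and the directory name in the usage line are hypothetical.

```shell
# Hypothetical helper: list the archive first, then extract it into
# a destination directory.
extract_tar() {
    archive=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # -t lists the members without extracting anything; a failure
    # here suggests the archive itself cannot be read.
    tar -tf "$archive" > /dev/null || return 1
    # -x extracts, -C selects the directory to extract into.
    tar -xf "$archive" -C "$dest"
}

# Usage (archive name as written earlier in the thread):
#   extract_tar xxx.tar extracted_reads
```

Extracting into a fresh directory keeps the new files separate from the ones produced by double-clicking, so the two sets can be compared.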
Is the file gzipped at all? The file extension may be .fastq.gz while the content is simply a plain fastq file. Try head -4 filename.fastq.gz. Also check gzip integrity (gzip -tv <input.gz>) and look at the stored CRC (gzip -lv <input.gz>). An output of "data" from the file command means that it could not determine the content type of the file.
I tried to see the content using the head command, and it shows gibberish (lots of question marks, some numbers and letters). I tried changing the file extension to filename.fastq to see what happens, and it still gives me gibberish. As for the other commands you suggested, I get "gzip: filename.fastq.gz: not in gzip format", "filename.fastq.gz: NOT OK", and "not in gzip format". I guess at this point it's clear that the files I've got are not gzip files, even though the names look like it. Thank you very much for your help.
Could you please paste the result of
This is exactly what it says:
Since you are on macOS, try The Unarchiver from the App Store. It is supposed to handle several formats, including cpgz. My guess (from googling) is that you might have run into the problem explained here: http://osxdaily.com/2013/02/13/open-zip-cpgz-file/. Let us know if any of the methods works, for future reference.
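Whichever method is used to extract the archive again, the result can be checked by running gzip -t over every fastq.gz file in the extracted folder. This is only a sketch: the function name check_gz_dir and the directory name in the usage line are hypothetical.

```shell
# Hypothetical helper: test every *.fastq.gz file in a directory
# with gzip -t and report which ones pass.
check_gz_dir() {
    dir=$1
    bad=0
    for f in "$dir"/*.fastq.gz; do
        # Skip the unexpanded pattern when no file matches.
        [ -e "$f" ] || continue
        if gzip -t "$f" 2>/dev/null; then
            echo "OK      $f"
        else
            echo "FAILED  $f"
            bad=$((bad + 1))
        fi
    done
    # Succeed only if no file failed the test.
    [ "$bad" -eq 0 ]
}

# Usage:
#   check_gz_dir extracted_reads
```

If every file is reported as FAILED even after a clean command-line extraction, the files were probably already in that state inside the archive, and the service provider is the one to ask.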