Hi all,
I'm currently downloading public datasets from the ENA platform.
When I want to download the files of a specific study (for example: https://www.ebi.ac.uk/ena/data/view/PRJEB23709 ), for a given sample I have the choice between "FASTQ file" and "Submitted File".
I was wondering what the difference is between those two files (the "submitted" one being slightly bigger than the corresponding "FASTQ" one).
Thanks
If I remember correctly, you can also upload (un)aligned BAM files, which ENA will then convert back to fastq, I think. Like ATpoint, I also suggest always going for the fastq version.
In this specific case the Submitted file appears to contain the actual sample name. If you get the ENA fastq files then you may need to keep track of metadata for the sample names. So in this case I suggest that you download the ENA and Submitted files for one sample, compare them (they should be identical), and then probably get the Submitted files instead.
I've checked and they are indeed the same files. The difference in size was only due to the modification of read names in the "FastQ" files.
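For anyone who wants to repeat that check, here is a minimal Python sketch of one way to compare the two downloads while ignoring the read names. It assumes both files are plain 4-line FASTQ (optionally gzipped) with the reads in the same order; it will not work if the submitted file is a BAM. The function names `open_fastq` and `same_reads` are made up for this example.

```python
import gzip
from itertools import zip_longest


def open_fastq(path):
    """Open a FASTQ file as text, handling .gz compression by file extension."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def same_reads(path_a, path_b):
    """Return True if both files hold the same sequences and qualities, in order.

    Assumption: standard 4-line FASTQ records. The header line and the '+'
    line of each record are skipped, so renamed reads do not count as a
    difference.
    """
    with open_fastq(path_a) as fa, open_fastq(path_b) as fb:
        for i, (line_a, line_b) in enumerate(zip_longest(fa, fb)):
            if line_a is None or line_b is None:
                return False  # one file has more lines than the other
            # Within each record: 0 = header, 1 = sequence, 2 = '+', 3 = quality
            if i % 4 in (1, 3) and line_a.rstrip("\r\n") != line_b.rstrip("\r\n"):
                return False
    return True
```

With hypothetical file names, `same_reads("submitted.fastq.gz", "ena.fastq.gz")` returning True would mean the only differences are in the header lines, which matches the size difference described above.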
Hi, I am having trouble choosing the correct fastq file, since there are two entries for one sample accession (ERS1042158): https://www.ebi.ac.uk/ena/data/view/ERS1042158
Can anyone please explain why there are two entries for one sample and what the difference could be? The file sizes are also slightly different (87 GB vs 84 GB).
Hi AISHA,
would you mind posting this as a new question (rather than adding it here)? That way we can keep the questions and answers logically structured.
thx
Okay. I am going to post it as a new question.