Hello, I am downloading some data from this project from ENA: https://www.ebi.ac.uk/ena/browser/view/SRX13384934
The data is described as single-end, but there are still two fastq files (both have the same experiment accession number).
I tried to confirm this on NCBI SRA as well: https://www.ncbi.nlm.nih.gov/sra/?term=SRX13384934 Even on NCBI SRA, it is listed as single-end, but two fastq files are given.
I am assuming that the two files contain the same data and that the sample was run twice for confirmation. Is that right?
I have run FastQC and Trimmomatic on these files, and they do differ in sequence length. Should I opt for the one with the best quality?
I have to take one of these files to run Kallisto and then DESeq2.
Thank you in advance!
I cannot use both; to perform Kallisto quantification I need a single file.
@swbarnes is correct. If you look at the "data access" tab for these entries in SRA (one example: https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR17204972&display=data-access ) then you will see that there are L001 and L002 in the original file names. This means the same sample was run on two lanes. You can
cat file1.gz file2.gz > one_file.gz
and use the single file as input. Both samples were run as 75 bp. You will get a range of read lengths after trimming the data. That is normal.
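If it helps, here is a minimal sketch of the concatenation with a quick sanity check that no reads were lost. The file names are hypothetical placeholders, and the two tiny lane files are generated only so the example runs on its own; with real data you would point `lane1` and `lane2` at the downloaded L001 and L002 files and skip that step. It assumes standard 4-line fastq records.

```shell
# Work in a scratch directory so nothing in the current folder is overwritten.
workdir=$(mktemp -d)

# Hypothetical file names; substitute the real lane files from ENA/SRA.
lane1="$workdir/sample_L001.fastq.gz"
lane2="$workdir/sample_L002.fastq.gz"
merged="$workdir/sample_merged.fastq.gz"

# Tiny stand-in lane files (one read each); skip this step with real data.
printf '@r1\nACGT\n+\nIIII\n' | gzip -c > "$lane1"
printf '@r2\nTTGA\n+\nIIII\n' | gzip -c > "$lane2"

# gzip files can be concatenated as-is; no need to decompress first.
cat "$lane1" "$lane2" > "$merged"

# Count reads as (number of lines / 4), assuming 4-line fastq records.
count_reads() { gzip -dc "$1" | awk 'END { print NR / 4 }'; }

n1=$(count_reads "$lane1")
n2=$(count_reads "$lane2")
nm=$(count_reads "$merged")

# The merged count should equal the sum of the two lanes.
echo "lane1=$n1 lane2=$n2 merged=$nm"
```

If the merged count does not equal the sum of the two lane counts, something went wrong during download or concatenation and it is worth checking before quantification.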
I understand what you mean, thank you.