Hi,
This sample at GEO, http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM835232, has its SRA files submitted in a slightly unusual way: the mate pairs were loaded to SRA as single-end runs, resulting in two files per sample.
My problem is: how can I get proper FASTQ files from these two SRA files?
I tried
fastq-dump -A SRR364680.sra
fastq-dump -A SRR384964.sra
and then ran bowtie, but it doesn't work. Has anybody dealt with data like this? If so, how can I get unaligned FASTQ files that can be used for alignment?
Here are the heads of the two FASTQ files:
File 1
@SRR364680.sra.1 SFGF-GA2-1_63:2:112:1559:999 length=80
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
+SRR364680.sra.1 SFGF-GA2-1_63:2:112:1559:999 length=80
################################################################################
@SRR364680.sra.2 SFGF-GA2-1_63:2:112:9048:999 length=80
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
+SRR364680.sra.2 SFGF-GA2-1_63:2:112:9048:999 length=80
################################################################################
@SRR364680.sra.3 SFGF-GA2-1_63:2:112:10809:999 length=80
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
File 2
@SRR384964.sra.1 SFGF-GA2-1_63:2:14:1899:1000 length=80
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
+SRR384964.sra.1 SFGF-GA2-1_63:2:14:1899:1000 length=80
################################################################################
@SRR384964.sra.2 SFGF-GA2-1_63:2:14:11711:999 length=80
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
+SRR384964.sra.2 SFGF-GA2-1_63:2:14:11711:999 length=80
################################################################################
@SRR384964.sra.3 SFGF-GA2-1_63:2:14:13989:1000 length=80
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
Thank you
Can you post the first 5 read names from the two FASTQ files? What error does bowtie give?
Please check the updated question.
I am wondering: if it's paired-end data, it will have the read 1 and read 2 information in the read name (like #1, #2 or /1, /2 etc.) to distinguish the two mates of a pair. But I don't see that here. The read pairs (R1 and R2) should also be in the same order for alignment.
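One way to check whether the two files really hold mates is to compare the original instrument read names (the second field of each header, e.g. SFGF-GA2-1_63:2:112:1559:999) between them. Below is a minimal Python sketch of that check. It assumes plain, uncompressed four-line FASTQ records with headers shaped like the ones posted above. The function names and the file names in the usage comment are only illustrative.

```python
from itertools import islice


def read_names(lines):
    """Yield the original instrument read name from each 4-line FASTQ record.

    Assumes headers like
    '@SRR364680.sra.1 SFGF-GA2-1_63:2:112:1559:999 length=80',
    i.e. the instrument name is the second whitespace-separated field.
    """
    for i, line in enumerate(lines):
        # Every fourth line (0, 4, 8, ...) is a record header; skip blank lines.
        if i % 4 == 0 and line.strip():
            fields = line.split()
            # Fall back to the first field (without '@') if there is no second one.
            yield fields[1] if len(fields) > 1 else fields[0].lstrip("@")


def compare_names(lines1, lines2, limit=None):
    """Return (n_shared, same_order) for the first `limit` records of each input.

    n_shared   : number of distinct read names present in both inputs
    same_order : True only if both inputs list the same names in the same order
    """
    names1 = list(islice(read_names(lines1), limit))
    names2 = list(islice(read_names(lines2), limit))
    shared = len(set(names1) & set(names2))
    return shared, names1 == names2


# Example use (file names are hypothetical):
# with open("SRR364680.fastq") as f1, open("SRR384964.fastq") as f2:
#     print(compare_names(f1, f2, limit=100000))
```

If most names are shared and in the same order, the two files can probably be passed to the aligner as mates. In the heads posted above the tile field already differs (112 in file 1, 14 in file 2), so at least those first records do not correspond to each other.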