Issue When Splitting An SRF File: 4 Files Obtained
13.5 years ago

I am splitting SRF files into FASTQ using the Staden io_lib program srf2fastq:

    srf2fastq  -c -s ./fastq/name_split -a -n   name.srf

But some of the SRF files contain 4 chunks of sequence per read instead of two (it is a paired-end experiment), so I get four FASTQ files, _1, _2, _3 and _4, whose read names end in /1, /2, /3 and /4 respectively.

The problem is that FASTQ files _2 and _4 hold 'technical reads' that will be discarded; only _1 and _3 should be used. But this means that the read names for the reverse reads (FASTQ file _3) end with /3 instead of the usual /2.

Questions:

  • Would leaving the names ending in /3 instead of /2 confuse other people?

  • Should I rename the reads to end in /2 instead of /3?

  • Can I extract only the 2 wanted chunks from the SRF instead of all 4?

Looking at the SRF file with srf_info, I can see which chunks I want:

    > srf_info -l255 name.srf

    Reading archive name.srf.
    trace_name:  + name_456:8:1:404:759 ... name_456:8:1:381:649 x10
    Reads: GOOD : 10
    Reads: TOTAL : 10
    Chunk: BASE : 10 238
    Chunk: CNF1 : 10 409
    Chunk: CNF4 : 10 2890
      Mdata key: SCALE : 10
    Chunk: SMP4 : 10 5780
    Chunk: REGN : 10 130
      Mdata key: NAME : 10
        names=forward:P;skip1:T;reverse:P;skip2:T boundaries=35;36;71 x10
    Bases: A: 306
    Bases: C: 98
    Bases: G: 123
    Bases: T: 193
    Bases: TOTAL: 720

The REGN chunk lists the two 'skip' regions (the 'technical reads') and the two wanted ones, forward and reverse (the 'application reads').
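To pick out the wanted regions in a script rather than by eye, the `names=` line can be parsed. This is only a sketch: `application_regions` is a hypothetical helper, and it assumes that `:P` marks the regions to keep and `:T` the technical ones, as the srf_info output above suggests. It also assumes the position of each region in that list matches the _1 to _4 file suffix.

```shell
# Hypothetical helper: print the 1-based index and name of every region
# marked ":P" in a REGN "names=" line copied from srf_info output.
# Assumption: ":P" = application read, ":T" = technical read.
application_regions() {
    printf '%s\n' "$1" |
    sed -n 's/.*names=\([^ ]*\).*/\1/p' |   # keep only the names= value
    tr ';' '\n' |                            # one region per line
    awk -F: '$2 == "P" { print NR, $1 }'     # index and name of ":P" regions
}

application_regions 'names=forward:P;skip1:T;reverse:P;skip2:T boundaries=35;36;71 x10'
```

For the file shown above this should list regions 1 (forward) and 3 (reverse), i.e. the _1 and _3 FASTQ files.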

next-gen sequencing fastq • 2.4k views
13.5 years ago

Would leaving the names ending in /3 instead of /2 confuse other people? Should I rename the reads to end in /2 instead of /3?

The manual states that if you do not pass the -n flag, the files are labeled by region name (forward/reverse) rather than numerically. That might be the best way to avoid confusion.

Can I extract only the 2 wanted chunks from the SRF instead of all 4?

Not sure, but it does not seem likely.


@Istvan, hmm, but -n only affects the filenames ([+1] it is probably better to leave it out). My problem is still the read names ending in /1 and /3, which come from -a.
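If renaming turns out to be the way to go, one option is to post-process the _3 file. The sketch below is not an srf2fastq feature: `rename_mate_suffix` is a hypothetical helper, and it assumes a strict 4-line-per-record FASTQ in which only the header line (line 1 of each record) carries the /3 suffix.

```shell
# Hypothetical helper: rewrite a trailing "/3" to "/2" on FASTQ header lines.
# Assumption: records are exactly 4 lines, so headers are lines 1, 5, 9, ...
# Sequence, "+" and quality lines are passed through untouched.
rename_mate_suffix() {
    awk 'NR % 4 == 1 { sub(/\/3$/, "/2") } { print }' "$1"
}
```

It writes to standard output, so it would be used as `rename_mate_suffix in.fastq > out.fastq`, where both file names are placeholders. Restricting the substitution to header lines matters because a quality string can legitimately end in the characters `/3`.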
