[Python] reads from fastq file
6.8 years ago
thebioinfo • 0

I am looking for the fastest Python code to extract only the reads from a FASTQ file and store them in a new file.

python fastq • 6.5k views

to extract only reads from fastq file and store it in new file

What do you want to do?


As a Python beginner, I just need to learn how to do it.


You just want to copy all reads to a new file?


Yes, I want to copy all reads to a new file.


Do you mean fastq to fasta?


Yes, FASTQ to FASTA.


Well, you had better be more precise next time when asking questions.

6.8 years ago
said3427 ▴ 120
from Bio import SeqIO

SeqIO.convert('myfile.fastq','fastq','myoutput.fasta','fasta')
6.8 years ago
$ python -c "import subprocess; subprocess.check_call(\"awk '{ if (NR%4==1) { print \\\">\\\" substr(\$0, 2); } else if (NR%4==2) { print \$0; } }' source.fq > destination.fa\", shell=True)"

Or:

#!/usr/bin/env python
import subprocess

# awk keeps line 1 (header, leading "@" replaced by ">") and line 2 (sequence) of each 4-line record
subprocess.check_call("awk '{ if (NR%4==1) { print \">\" substr($0, 2); } else if (NR%4==2) { print $0; } }' source.fq > destination.fa", shell=True)

While technically correct and probably the most efficient, it doesn't really teach the OP anything about Python :-p


Doing things on the command line is going to be the fastest route, so the subprocess module is useful to know if the asker is forced to use Python.
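For anyone who wants to go the subprocess route without shell=True, here is a minimal sketch. The function name fastq_to_fasta_awk and the AWK_PROGRAM constant are made up for illustration, and it assumes awk is on the PATH and that every record is exactly four lines.

```python
import shutil
import subprocess

# awk program: line 1 of each record becomes the FASTA header, line 2 is the sequence.
AWK_PROGRAM = 'NR % 4 == 1 { print ">" substr($0, 2) } NR % 4 == 2 { print }'


def fastq_to_fasta_awk(src, dst):
    """Run awk on src and write its output to dst, without going through a shell.

    Hypothetical helper; assumes strict 4-line FASTQ records.
    """
    if shutil.which("awk") is None:
        raise RuntimeError("awk was not found on PATH")
    with open(dst, "w") as out:
        # Passing a list avoids shell quoting; stdout=out replaces the "> file" redirect.
        subprocess.run(["awk", AWK_PROGRAM, src], stdout=out, check=True)
```

Calling fastq_to_fasta_awk("source.fq", "destination.fa") should give the same result as the one-liner, with less escaping to get wrong.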

6.8 years ago

This is a general code snippet for copying a file, which is what you are asking for, but it is quite useless in my opinion.

with open("myfile.fastq") as infile, open("myoutput.fastq", 'w') as output:
    for line in infile:
        output.write(line)

I guess @thebioinfo wants only the reads, so one line out of every four (the sequence line of each record).
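If that reading is right, a rough sketch with itertools.islice would do it. The function name write_sequences is hypothetical, and it assumes strict four-line records with the sequence on the second line of each.

```python
import io
from itertools import islice


def write_sequences(infile, outfile):
    """Copy only the sequence lines (2nd line of each 4-line record) to outfile.

    Hypothetical helper; assumes the input has exactly four lines per record.
    """
    # Start at index 1 (the first sequence line) and then take every 4th line.
    outfile.writelines(islice(infile, 1, None, 4))


# Small in-memory demonstration with two made-up records.
demo_in = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\nIIII\n")
demo_out = io.StringIO()
write_sequences(demo_in, demo_out)
```

With real files, the same function can be called inside a with open(...) as infile, open(..., "w") as outfile block, as in the copy snippet above.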

6.8 years ago
Buffo ★ 2.4k

As you are looking for FASTQ to FASTA conversion, I think this is a duplicate of a question already asked here.

If you want to learn some python you may try:

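import sys

filename = sys.argv[1]  # path to the input FASTQ file

with open(filename, "r") as infile:
    # enumerate() counts lines from 0, so each record covers positions 0-3 modulo 4
    for line_ct, line in enumerate(infile):
        if line_ct % 4 == 0:
            # Header line: replace the leading "@" with ">"
            print(">" + line[1:], end="")
        elif line_ct % 4 == 1:
            # Sequence line: print unchanged
            print(line, end="")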
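
Since the original question asked to store the result in a new file, a possible variation reads one record at a time and writes to an output file instead of printing. This is only a sketch: the function name fastq_to_fasta is made up, and it assumes strict four-line records with no blank lines.

```python
def fastq_to_fasta(in_path, out_path):
    """Write the header and sequence of each 4-line FASTQ record as FASTA.

    Hypothetical helper; returns the number of records written.
    """
    n_records = 0
    with open(in_path) as infile, open(out_path, "w") as outfile:
        while True:
            header = infile.readline()
            if not header:
                break  # end of file
            sequence = infile.readline()
            infile.readline()  # "+" separator line, ignored
            infile.readline()  # quality line, ignored
            if not header.startswith("@"):
                raise ValueError(
                    "record %d: header does not start with '@'" % (n_records + 1)
                )
            outfile.write(">" + header[1:])
            outfile.write(sequence)
            n_records += 1
    return n_records
```

Usage would be something like fastq_to_fasta("myfile.fastq", "myoutput.fasta"). The header check is a cheap way to notice input that does not follow the four-line layout.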