Biopython - appending single fastq records to existing file
2.9 years ago
mbabic • 0

I'm not sure if this is actually possible, but is there a way to add a FASTQ record to an existing FASTQ file without affecting previously written data?

While that is the entirety of the question, here is some context in case it helps someone suggest an approach. For complicated reasons, I'm sorting through multiple independent, gigantic FASTQ read files, which then need to be sorted into much smaller subfiles for alignment, based on particular barcodes.

That is, I need to go through whatever.fastq and sort out all reads that have BarcodeX into a barcodex.fastq file. Then I need to go through alsothis.fastq and repeat the process, adding more BarcodeX reads to barcodex.fastq, and so on for possibly hundreds of independent files. Concatenating the source .fastq files into one file is not feasible because of the total resulting size, and new data needs to be added from time to time as well.

biopython fastq • 807 views

You can append to a file by opening it in append mode (mode "a") when you write in Python. See here.
You can use Biopython for the reading and screening step to get each individual sequence record, convert that record to a FASTQ-formatted string, and then use pure Python for the writing/appending step. Note that calling str() on a SeqRecord gives a human-readable summary rather than FASTQ text; record.format("fastq") gives the FASTQ text.
The question then becomes how fast you need this to run, and at what scale; hundreds of files may be fine with Python. It may be better to use pure shell, in which case you'd use >> to append, whereas > overwrites the file. See here.
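
As a rough illustration of the append approach, here is a minimal pure-Python sketch. It rests on several assumptions that are not from the original question: each FASTQ record occupies exactly four lines, the barcode sits at the very start of the read sequence, and the function names read_fastq and append_barcode_reads are made up for this example. If you prefer Biopython for the writing step too, SeqIO.write accepts a file handle, so a handle opened with mode "a" should behave the same way.

```python
from itertools import islice


def read_fastq(path):
    """Yield (header, sequence, plus, quality) tuples.

    Assumes the simple layout of exactly four lines per record.
    """
    with open(path) as handle:
        while True:
            record = [line.rstrip("\n") for line in islice(handle, 4)]
            if len(record) < 4:  # end of file (or a truncated last record)
                break
            yield tuple(record)


def append_barcode_reads(source, barcode, destination):
    """Append reads whose sequence starts with `barcode` to `destination`.

    Returns the number of reads appended. The destination is opened in
    append mode, so records already in it are not touched, and the file
    is created if it does not exist yet.
    """
    kept = 0
    with open(destination, "a") as out:
        for header, seq, plus, qual in read_fastq(source):
            if seq.startswith(barcode):  # assumption: barcode is a read prefix
                out.write(f"{header}\n{seq}\n{plus}\n{qual}\n")
                kept += 1
    return kept
```

Calling append_barcode_reads("whatever.fastq", "ACGT", "barcodex.fastq") and then the same with "alsothis.fastq" would add the second batch after the first, where "ACGT" is only a placeholder barcode. Because only one source file is streamed at a time, memory use stays small regardless of file size.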


"which then need to be sorted into much smaller subfiles for alignment, based on particular barcodes."

You may want to use a tool meant for this, e.g. sabre (LINK).
