How to split a whole genome sequence file?
9.7 years ago

How can I split a whole genome sequence file (see also here)? I have 8 GB of RAM, and my computer chokes on large whole genome sequence .gb and FASTA files. Please tell me how to split the file, and recommend software that can do alignments quickly with low memory requirements. I have tried Geneious Pro, CLC Genomics, DNASTAR, and SeqSphere; all of them need at least 16-32 GB of RAM.

sequencing next-gen wgs • 3.2k views

Split them according to what?


Splitting the file should not be a problem, depending on how you want to split it. The simplest way is to pick a sliding window size and write everything that falls inside each window to a separate file. (Let me know if you need a simple parser to do this, or whether you can manage on your own.)

As far as the software goes, these tools are not designed to run with low memory requirements; they usually use in-memory algorithms. Depending on what you are trying to achieve, there is a chance you can get away with splitting the input file, but I would need more information before I can give any advice.
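To make the window idea concrete, here is a minimal sketch of such a parser in Python. It is only one possible way to do it, under some assumptions: the input is a line-wrapped FASTA file (not GenBank), the windows are fixed and non-overlapping rather than truly sliding, and the function name `split_fasta`, its parameters, and the output file naming are made up for illustration. It reads the file line by line, so memory use stays close to one window regardless of genome size.

```python
import os
import re


def split_fasta(path, out_dir, window=1_000_000, width=60):
    """Stream a FASTA file and write each `window`-sized piece of every
    record to its own file. Returns the list of files written.

    Hypothetical helper: the name, arguments and file naming are
    illustrative choices, not part of any existing tool.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = []
    name = None  # ID of the record currently being read
    start = 0    # 0-based offset of the first base held in `buf`
    buf = ""     # bases read but not yet written out

    def write_chunk(seq):
        nonlocal start
        end = start + len(seq)
        # Replace characters that are unsafe in file names (e.g. "|" or "/").
        safe = re.sub(r"[^A-Za-z0-9_.-]", "_", str(name))
        out = os.path.join(out_dir, f"{safe}_{start + 1}-{end}.fasta")
        with open(out, "w") as fh:
            # Header records the 1-based coordinates of this piece.
            fh.write(f">{name}:{start + 1}-{end}\n")
            for i in range(0, len(seq), width):
                fh.write(seq[i:i + width] + "\n")
        written.append(out)
        start = end

    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if buf:  # flush the tail of the previous record
                    write_chunk(buf)
                # Use the first word of the header as the record ID.
                name = (line[1:].split() or ["seq"])[0]
                start, buf = 0, ""
            else:
                buf += line
                while len(buf) >= window:
                    write_chunk(buf[:window])
                    buf = buf[window:]
    if buf:  # flush the tail of the last record
        write_chunk(buf)
    return written
```

Calling `split_fasta("genome.fasta", "chunks", window=5_000_000)` (both paths are placeholders) would write files such as `chr1_1-5000000.fasta` into `chunks/`. Keep in mind that anything spanning a window boundary gets cut in two, so whether this is good enough depends on what you want to align.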

cheers
mxs
