I have fastq files of around 12 GB size. I have tried opening them with Sublime Text and Atom but they are not able to read them. What apps can read really large FASTQ files on Windows?
As others have said, there's usually not much reason to look at large fastq files, but BBTools can process them in Windows if you install Java. For example...
java -cp C:\BBMap jgi.ReformatReads in=file.fastq out=stdout.fq reads=10
...would print the first 10 reads to the screen. It's worth noting that BBTools is developed in Windows so I can guarantee that all of the programs work in Windows as well as Linux.
Windows 10 and later also include the "Windows Subsystem for Linux", which can make this kind of thing much easier. It should (hopefully) let you use BBTools' shell scripts, so the syntax would be simpler:
reformat.sh in=file.fastq out=stdout.fq reads=10
I'm about to reboot my computer to install Windows Subsystem for Linux to see what it can do now... hopefully it has standard utilities like zcat and head, which are convenient for looking at fastq files.
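If zcat and head are not available, the same "peek at the first few reads" idea can be sketched in Python, which runs natively on Windows. This is only a minimal sketch: the function name head_fastq is made up for illustration, and it assumes plain 4-line FASTQ records (no line-wrapped sequences) and that compressed files end in .gz. It streams the file, so it never loads the whole 12 GB into memory.

```python
import gzip
from itertools import islice


def head_fastq(path, n_reads=10):
    """Return the first n_reads records, each as a list of 4 lines.

    Assumption: every record is exactly 4 lines (header, sequence, '+', quality).
    """
    # Pick the opener from the file extension (hypothetical convention: .gz = gzip).
    opener = gzip.open if str(path).endswith(".gz") else open
    records = []
    with opener(path, "rt") as handle:
        while len(records) < n_reads:
            chunk = list(islice(handle, 4))  # read the next 4 lines only
            if len(chunk) < 4:  # end of file or truncated record
                break
            records.append([line.rstrip("\r\n") for line in chunk])
    return records


# Example use (file name is a placeholder):
# for record in head_fastq("file.fastq.gz", 10):
#     print("\n".join(record))
```

Because only four lines are read at a time, the run time and memory use depend on the number of reads requested, not on the size of the file.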
There are a number of bioinformatics workbenches that will run under windows, and will allow you to analyse fastq files.
Some examples are:
Unfortunately these are all quite expensive, although some offer a free trial.
If you just want to look at the file, then I've had luck opening big files with Notepad++. However, if you want to open a 12 GB file in any of these sorts of tools (assuming that's 12 GB uncompressed, not .gz), then you will definitely need at least 12 GB of free memory.
If all you really need to do is look at the file, then you might try opening PowerShell and using
Get-Content PATH/TO/FASTQ/FILE.fastq | Out-Host -Paging
or
Get-Content PATH/TO/FASTQ/FILE.fastq -First 40
You don't want to do this, but one of my colleagues did once: he used a large file viewer from this link
https://stackoverflow.com/questions/159521/text-editor-to-open-big-giant-huge-large-text-files
Why do you want to open up such a large file in a text editor? You're not really going to learn anything new from manually looking at millions of sequences...
Just use the Windows command line to view a few lines of the file instead. Or, if you really want to open it in Sublime Text, select only a few thousand reads from your FASTQ file, put them in a separate file, and open that file in Sublime.
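One way to pull out "a few thousand reads" into a separate file is sketched below. It is an illustration only: write_subset is a hypothetical helper, it takes the first reads rather than a random sample, and it assumes 4-line FASTQ records and a .gz extension for compressed input.

```python
import gzip
from itertools import islice


def write_subset(src, dst, n_reads=5000):
    """Copy the first n_reads 4-line records from src to dst.

    Returns the number of reads actually written.
    """
    # Assumption: compressed inputs are recognisable by a .gz extension.
    opener = gzip.open if str(src).endswith(".gz") else open
    written = 0
    with opener(src, "rt") as fin, open(dst, "w", newline="\n") as fout:
        while written < n_reads:
            record = list(islice(fin, 4))  # next header, sequence, '+', quality
            if len(record) < 4:  # stop at end of file or a truncated record
                break
            fout.writelines(record)
            written += 1
    return written


# Example use (file names are placeholders):
# write_subset("file.fastq", "file.first5000.fastq", n_reads=5000)
```

The resulting file holds at most n_reads records, which is small enough for a regular text editor to open.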
Also, as I already suggested here https://bioinformatics.stackexchange.com/questions/21931/how-to-download-sequencing-data-on-windows-using-sra-toolkit you won't get far in a pure Windows environment. Use WSL2 or any other Unix.
IMHO fewer than 10 reads are enough to figure out what can be discovered in a FASTQ with the naked eye: sequence naming convention, formatting, sequence length, interleaved or not, and maybe quality encoding if one really insists. Apart from this, one will get far more information using an intact FASTQ file (compressed by some sequencing core or the SRA archive) and dedicated tools, e.g. fastp, seqkit etc.
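The things one would check by eye in the first few reads can also be summarised programmatically. The sketch below is illustrative only: summarize_reads is a hypothetical name, it assumes 4-line records, and the quality-encoding guess is a rough heuristic (characters below ASCII 59 only occur in Phred+33 data), so a guess from 10 reads should not be trusted over a dedicated tool.

```python
from itertools import islice


def summarize_reads(lines, n_reads=10):
    """Summarise names, lengths and a quality-encoding guess for the first reads.

    `lines` is any iterable of text lines, e.g. an open file handle.
    """
    it = iter(lines)
    names, lengths, qual_codes = [], [], []
    for _ in range(n_reads):
        record = [line.rstrip("\r\n") for line in islice(it, 4)]
        if len(record) < 4:  # end of input or truncated record
            break
        names.append(record[0][1:])  # header without the leading '@'
        lengths.append(len(record[1]))  # sequence length
        qual_codes.extend(ord(c) for c in record[3])
    # Heuristic guess only; a handful of reads may not be representative.
    if not qual_codes:
        encoding = "unknown"
    elif min(qual_codes) < 59:
        encoding = "Phred+33"
    elif min(qual_codes) >= 64:
        encoding = "possibly Phred+64"
    else:
        encoding = "ambiguous"
    return {"names": names, "lengths": lengths, "encoding_guess": encoding}


# Example use (file name is a placeholder):
# with open("file.fastq") as handle:
#     print(summarize_reads(handle))
```

Read names ending in /1 and /2 on alternating records in the returned list would hint at an interleaved file, which is the kind of detail the paragraph above refers to.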