I used GATK's Funcotator to annotate a VCF file and it produced a MAF file just over 2 GB in size. I've tried opening it with pandas in Python and with maftools in R, and neither has worked. Specifically, the file seems to be too large to open in R, which throws this error:
Error in data.table::fread(file = maf, sep = "\t", stringsAsFactors = FALSE, : File '' does not exist or is non-readable.
and pandas isn't really made for MAF files. Usually, after running this annotation, it was enough to open the result in Excel, but this file is way too big. Does anybody know an application or package (in R, Python, or something else) that can open MAF files of this size? Any help is appreciated.
That error is not a memory problem. It says that no readable file exists at the path held in your maf variable.
Why do you need to open it? Linux commands such as more, grep, and awk can help you view the content. I wrote a simple helper alias for files like these:
Spreadsheet-like view on the terminal!
Yes this works so I can view it! The problem is I need to manipulate it as if it were a dataframe so I can apply certain thresholds to the data and select for specific columns.
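One way to get a manageable table is to stream the file once, keep only the rows and columns of interest, and load the much smaller result into pandas or R afterwards. The sketch below uses only the Python standard library and assumes the MAF is tab-delimited with '#'-prefixed comment lines before the header. The function name filter_maf and the column names in the usage note are illustrative assumptions, not something taken from your file.

```python
def filter_maf(in_path, out_path, keep_columns, threshold_column, min_value):
    """Stream a tab-delimited MAF line by line, keep rows where
    threshold_column >= min_value, and write only keep_columns.

    Returns the number of data rows written. The file is never held
    in memory as a whole, so its size does not matter much.
    """
    kept = 0
    header = None
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            # Skip comment lines (e.g. "#version 2.4") and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            if header is None:
                # The first non-comment line is the column header.
                header = fields
                keep_idx = [header.index(name) for name in keep_columns]
                thr_idx = header.index(threshold_column)
                dst.write("\t".join(keep_columns) + "\n")
                continue
            try:
                value = float(fields[thr_idx])
                selected = [fields[i] for i in keep_idx]
            except (IndexError, ValueError):
                # Short row, or a missing/non-numeric threshold value.
                continue
            if value >= min_value:
                dst.write("\t".join(selected) + "\n")
                kept += 1
    return kept
```

A call might look like filter_maf("annotated.maf", "filtered.tsv", ["Hugo_Symbol", "Chromosome", "tumor_f"], "tumor_f", 0.05), where the paths, the column names, and the 0.05 cutoff are placeholders to replace with whatever your header and thresholds actually are. Rows whose threshold value is empty or non-numeric are dropped rather than kept, which may or may not be what you want.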
Your maf variable doesn't seem to contain the path to the MAF file. Can you show the output of dput(maf)?
This is what I have:
When I use dput(maf) I get this: