I've never seen that format for machine-generated raw mass-spec data. Are you sure it's from a mass spec machine? This is a long shot, but perhaps the first few bytes of the file might be an ASCII representation of the generating instrument?
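If you want to try the "first few bytes" idea, here is a minimal Python sketch that pulls runs of printable ASCII out of the start of a file, much like the Unix `strings` tool. The function names, the 512-byte window and the minimum run length of 4 are my own choices for illustration, and there is no guarantee the instrument or producing software is named in the header at all.

```python
def ascii_runs(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII (bytes 32..126) at least min_len long."""
    runs = []
    current = bytearray()
    for byte in data:
        if 32 <= byte <= 126:
            current.append(byte)
        else:
            # A non-printable byte ends the current run.
            if len(current) >= min_len:
                runs.append(current.decode("ascii"))
            current.clear()
    # Flush a run that reaches the end of the data.
    if len(current) >= min_len:
        runs.append(current.decode("ascii"))
    return runs


def peek_header(path: str, n_bytes: int = 512) -> list[str]:
    """Read the first n_bytes of a file and return its readable ASCII runs."""
    with open(path, "rb") as handle:
        return ascii_runs(handle.read(n_bytes))
```

Calling `peek_header` on one of your files and looking for a vendor, software or database name in the result is a cheap first check before asking around.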
Unsurprising that you're lost if you don't know where the file originated. There's not much you can do without that information, unless you get lucky with a Google search.
You will need to be more specific. Raw files from different machines are not compatible. Perhaps if you post some of the filenames, somebody here will recognise the machine (e.g. Agilent raw data is stored in ".d" directories). Also, I strongly suggest you contact the people who provided you with the files and request more details about the experiment - it will be difficult to perform meaningful analysis without knowing what kind of data you have, e.g. different software is optimised for different machines.
Thanks guys - I think I figured it out: it looks like an Oracle binary dump file. The above tends to substantiate this. Will have to confirm this week...
Different machines generate raw data in different formats. A typical first step is to convert your specific raw data into a commonly used generic format like mzXML or mzML.
You might then want to convert this into other formats for downstream analysis. E.g. you might convert into MGF to allow the software Mascot to identify peptides in your data.
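To give an idea of what MGF looks like, here is a rough Python sketch that formats one MS/MS spectrum as an MGF block. In practice a converter writes this for you. The function name and the number formatting are my own choices, and any values you pass in while trying it out are just illustration, not real data.

```python
def format_mgf_spectrum(title, precursor_mz, charge, peaks):
    """Format one spectrum as an MGF 'BEGIN IONS ... END IONS' block.

    peaks is a list of (m/z, intensity) pairs for the fragment ions.
    """
    lines = [
        "BEGIN IONS",
        f"TITLE={title}",
        f"PEPMASS={precursor_mz:.4f}",  # precursor m/z
        f"CHARGE={charge}+",            # positive precursor charge state
    ]
    # One fragment peak per line: m/z, a space, then intensity.
    for mz, intensity in peaks:
        lines.append(f"{mz:.4f} {intensity:.1f}")
    lines.append("END IONS")
    return "\n".join(lines) + "\n"
```

A search engine such as Mascot reads a file made of many such blocks, one per MS/MS spectrum.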
So your first step should be to identify the machine your data comes from. I strongly suggest you go back to the person who gave you the data and request more details about the experiment - without knowing exactly what you're looking at, it will be very difficult to perform a meaningful analysis (e.g. different software is optimised for different machines). Also, you should know things like whether labeling such as SILAC was used.
If for some reason you can't do this, perhaps you can post some examples of the filenames/directory names you have, and somebody here might be able to recognise the machine. For example, Agilent raw data is stored in ".d" directories.
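As a rough illustration of recognising a machine from file or directory names, here is a small Python sketch. Only the Agilent ".d" hint comes from this thread; the other entries are from my own general knowledge, the table is far from exhaustive, and several vendors share an extension, so treat the result as a hint rather than an identification. `VENDOR_HINTS` and `guess_vendor` are made-up names.

```python
from pathlib import PurePath

# Assumed, non-exhaustive hints. Note that extensions are shared between vendors.
VENDOR_HINTS = {
    ".d": ["Agilent (directory)", "Bruker (directory)"],
    ".raw": ["Thermo (single file)", "Waters (directory)"],
    ".wiff": ["SCIEX"],
    ".mzml": ["open format: mzML"],
    ".mzxml": ["open format: mzXML"],
    ".mgf": ["open format: MGF"],
}


def guess_vendor(name: str) -> list[str]:
    """Return possible origins for a file or directory name, or [] if unknown."""
    # Strip trailing separators so 'sample.d/' is treated like 'sample.d'.
    suffix = PurePath(name.rstrip("/\\")).suffix.lower()
    return VENDOR_HINTS.get(suffix, [])
```

An empty result, as you would get for a database dump, is a good sign that the file did not come straight off an instrument.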
The following website provides a large list of mass spectrometry software, including format conversion utilities: ms-utils.org
Vendor software generally has options to export data in one or several open formats. You could ask the original provider to send the converted file, as the vendor-specific binaries are pretty much useless without the vendor's software.
The state-of-the-art converter is msconvert, part of ProteoWizard. It is also distributed with the ISB's Trans-Proteomic Pipeline (TPP), which will provide you with additional software to do whatever analysis you are looking for. Note that you will need to perform the conversion from binary to text/XML format on a Windows machine, as the vendor libraries that provide access to the binary data only run on the Microsoft OS.
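For what it's worth, here is a minimal sketch of driving msconvert from Python, assuming msconvert is installed and on your PATH. It uses only the `--mzML`, `--mzXML`, `--mgf` and `-o` options; the helper names and the file names you pass in are hypothetical.

```python
import shutil
import subprocess

SUPPORTED_FORMATS = {"mzML", "mzXML", "mgf"}


def build_msconvert_command(raw_path, out_dir, fmt="mzML"):
    """Build the msconvert argument list for converting one raw file."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported output format: {fmt}")
    # -o sets the output directory; --mzML / --mzXML / --mgf picks the format.
    return ["msconvert", raw_path, f"--{fmt}", "-o", out_dir]


def convert(raw_path, out_dir, fmt="mzML"):
    """Run msconvert, failing early if the executable cannot be found."""
    if shutil.which("msconvert") is None:
        raise RuntimeError("msconvert not found on PATH")
    subprocess.run(build_msconvert_command(raw_path, out_dir, fmt), check=True)
```

Keeping the command construction separate from the call makes it easy to print the command and try it by hand first.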
The ms-utils.org link provided by Bio_X2Y is a very helpful resource. To visualise your data once it has been converted, to mzML for instance, I would also suggest EBI's PRIDE Inspector, which will also provide some QC plots for free.
Hope this helps.
written 13.5 years ago by Laurent, updated 5.2 years ago by Ram
Same here, have never seen that format...