How To Extract Data From Excel Sheet
12.1 years ago
sebabiokr ▴ 10

Greetings to all,

I have analyzed my metagenome data in the MG-RAST pipeline. The output table contains a lot of information about the organisms. Could you please tell me how to extract those data into a simple table format so that I can discuss the results?

Thank you

Sebastian

EDIT

The data contain the following columns:

metagenome, source, domain, phylum, class, order, family, genus, species, abundance, avg eValue, avg % ident, avg align len, # hits

I used RDP as the annotation source. I need to extract all RNA data and classify them in %.

Thank you

metagenomics • 4.1k views

Without some idea of (1) what the data look like and (2) what kind of analyses you want to do - no, we cannot. Please provide more detail.


Please edit the original question with this information, rather than posting it in an answer. I have made a start for you, but you need to make it readable.


Thank you, I will do that.


Could you include an example of the input and the desired output? Do you have a preferred programming language?

12.1 years ago
Josh Herr 5.8k

So you uploaded FASTA files into MG-RAST and now have an Excel file as output? You probably have a table of data from MG-RAST and you need to know what to do with it? Why did you sequence your samples in the first place if you have no understanding of your research question or what you will do with your data output?

By adding your sequences to the MG-RAST pipeline, did you actually analyze the data or did you use the default settings on the pipeline for FASTA annotation? You say you used the RDP database; does that mean you have amplicon data? If you have WGS data you will have to use another database for annotation (MG-RAST uses the M5NR database as standard now).

You give a list of the output for the columns of your TSV table, but you do not say how these columns were computed by the pipeline. In MG-RAST there are four options that will provide you a table like this: organism abundance ((1) representative/best hit and (2) lowest common ancestor) and functional abundance ((3) hierarchical and (4) complete annotations). Which one did you choose? Also, you are not describing the entire table so I do not know if you actually analyzed the data properly or don't know how to see all the columns on your table.

Your table annotations are the following:

- metagenome - your sample name
- source - your treatment
- domain - domain of closest BLAT hit
- phylum - phylum of closest BLAT hit
- class - do I need to go on here?
- order
- family
- genus
- species
- abundance - read abundance for this organism/gene
- avg eValue - the average eValue for all the hits (this will vary often at the level of clustering in your pipeline, typically at 95% to 98% for OTUs)
- avg % ident - the average % match of your reads to the database hit
- avg align len - the average length of your sequence alignment
- # hits - number of hits to the database (can be, and usually is, different from the read abundance)
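If "classify them in %" means relative abundance per taxon, one possible approach is to sum the abundance column at a chosen rank and divide by the total. The following is only a minimal sketch under that assumption: the function name `percent_by_rank` and the example rows are made up for illustration, and the column names are taken from the list above rather than checked against a real MG-RAST export.

```python
import csv
import io
from collections import defaultdict

def percent_by_rank(rows, rank="phylum"):
    """Sum the 'abundance' column per taxon at `rank` and convert to percent."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[rank]] += int(row["abundance"])
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {taxon: 100.0 * count / grand_total for taxon, count in totals.items()}

# Made-up rows standing in for an exported table (not real MG-RAST output).
example_tsv = (
    "metagenome\tsource\tdomain\tphylum\tabundance\n"
    "sample1\tRDP\tBacteria\tFirmicutes\t60\n"
    "sample1\tRDP\tBacteria\tProteobacteria\t30\n"
    "sample1\tRDP\tBacteria\tFirmicutes\t10\n"
)
rows = list(csv.DictReader(io.StringIO(example_tsv), delimiter="\t"))
percentages = percent_by_rank(rows)

# Print taxa from most to least abundant.
for taxon, pct in sorted(percentages.items(), key=lambda kv: -kv[1]):
    print(f"{taxon}\t{pct:.1f}%")
```

Passing `rank="genus"` or any other column from the list would give the same summary at that level, provided the column exists in the export.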

Lastly, you say you need to "extract RNA data". What does this mean? Do you have metagenomic transcriptome cDNA library data, or are you confused and looking for ribosomal DNA profiles?

There are many different directions to take from where you are now. Do you know what type of analyses you will be using to assess your metagenomic (or amplicon) data? You'll have to figure this out.

In the future, please provide us with enough information to help you, but first read the manual or instructions, or take a basic course, before asking for help in a forum like this.

12.1 years ago
vaskin90 ▴ 290

Export the table to Comma Separated Value format. Excel can do that. Then a lot of tools can read it.


This question is referring to the spreadsheet from within MG-RAST, which is already in TSV format, but yes, lots of tools can read either a TSV or CSV formatted data matrix.
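As a rough illustration of reading either format, here is a small Python sketch using only the standard library. It guesses the delimiter from the header line, which is a simplifying assumption (a comma-separated header that happens to contain a tab would be misread). The helper name `read_table` and the sample rows are hypothetical.

```python
import csv
import io

def read_table(handle):
    """Read a delimited table, guessing tab vs. comma from the header line."""
    header = handle.readline()
    delimiter = "\t" if "\t" in header else ","
    handle.seek(0)  # rewind so DictReader sees the header again
    return list(csv.DictReader(handle, delimiter=delimiter))

# The same made-up table in both formats.
tsv_rows = read_table(io.StringIO("genus\tabundance\nBacillus\t12\n"))
csv_rows = read_table(io.StringIO("genus,abundance\nBacillus,12\n"))
print(tsv_rows == csv_rows)
```

For a file on disk, something like `with open(path, newline="") as handle: rows = read_table(handle)` should work the same way, where `path` is whatever file was exported.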
