How can I merge or combine different genomic datasets?
2.3 years ago
seda ▴ 20

Hello to everyone!

I pulled raw data from SRA covering 7 different BioProject datasets. All are metagenomic data: some are 16S rRNA sequencing, others are whole-genome sequencing.

I analyzed the raw data from each BioProject separately, according to its metadata (for example, the instrument used).

Next, I plan to build a machine learning model on the analyzed data. My problem is: how can I merge or combine these different datasets before building the model?

Can you advise me on how to proceed?

Thank you, Seda

merging integration dataset genome fusion • 847 views

One simple solution to this problem is to use the Python pandas library. There are two main steps:

  1. Open each file and create a DataFrame with the selected column names.
  2. Merge these DataFrames (pandas supports merge, join, concatenate, and compare).
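
As a rough illustration of the two steps above, here is a minimal sketch. It assumes each BioProject has already been reduced to a table of per-sample taxon abundances; the column names (`sample_id`, `taxon`, `abundance`), the project labels (`PRJ_A`, `PRJ_B`), and the helper `load_table` are all hypothetical, and in-memory CSV text stands in for real files. It only shows the pandas mechanics and does not address differences between sequencing methods.

```python
import io

import pandas as pd

# Hypothetical per-project tables; in practice these would be file paths.
project_a_csv = io.StringIO(
    "sample_id,taxon,abundance,instrument\n"
    "A1,Bacteroides,0.40,MiSeq\n"
    "A1,Prevotella,0.60,MiSeq\n"
)
project_b_csv = io.StringIO(
    "sample_id,taxon,abundance,read_count\n"
    "B1,Bacteroides,0.25,1000\n"
    "B1,Escherichia,0.75,1000\n"
)

# Step 1: keep only the columns shared by every project (assumed names).
selected_columns = ["sample_id", "taxon", "abundance"]


def load_table(source, bioproject):
    """Read one table, keep the selected columns, and tag its origin."""
    df = pd.read_csv(source, usecols=selected_columns)
    df["bioproject"] = bioproject  # keep track of where each row came from
    return df


frames = [
    load_table(project_a_csv, "PRJ_A"),
    load_table(project_b_csv, "PRJ_B"),
]

# Step 2: stack the per-project tables into one long table.
combined = pd.concat(frames, ignore_index=True)

# Optional: reshape to one row per sample and one column per taxon,
# filling taxa that a sample lacks with 0, as a model input matrix.
feature_table = combined.pivot_table(
    index="sample_id", columns="taxon", values="abundance", fill_value=0.0
)
print(feature_table)
```

`pd.concat` fits when every table has the same columns and the rows are simply stacked; `DataFrame.merge` would be the choice when tables need to be joined on a shared key, such as attaching sample metadata by `sample_id`.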

I already have Python code for this up and running. Let me know if you need my help.


Ernest Bonat, Ph.D., thank you very much! This is exactly the information I need.

