Comparing two metagenomics datasets
21 months ago
arshad1292 ▴ 110

Hello all,

I have a shotgun metagenomic dataset (20 samples, paired-end reads). I want to compare it to another dataset that has been published and is available online, but I am not sure how to go about it.

Please let me know if you have an idea.

Thanks in advance!

metagenomics • 1.2k views
21 months ago
Mensur Dlakic ★ 28k

Assembling the reads into metagenomes will in most cases give higher sensitivity than a raw-read comparison. If you want to try that:

https://github.com/MrOlm/inStrain

Program capabilities are described here.

You will most likely need k-mer-based or sketch-based approaches to compare metagenomes from raw reads:
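To make the sketch-based idea concrete, here is a toy bottom-s MinHash comparison in plain Python. It is only an illustration under my own assumptions (k = 21, sketch size 1000, canonical k-mers, and the function names are all made up for this example); it holds every hash in memory, so for real data you would use a dedicated tool rather than this.

```python
import hashlib

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def kmers(seq, k):
    """Yield all k-mers of a read, skipping any that contain N."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if "N" not in kmer:
            yield kmer


def canonical(kmer):
    """Return the smaller of a k-mer and its reverse complement,
    so that both strands of the same sequence hash identically."""
    rc = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, rc)


def sketch(reads, k=21, size=1000):
    """Bottom-s MinHash sketch: the `size` smallest 64-bit hashes
    over all canonical k-mers in the reads (toy version, not memory-efficient)."""
    hashes = set()
    for read in reads:
        for kmer in kmers(read, k):
            digest = hashlib.blake2b(canonical(kmer).encode(), digest_size=8).digest()
            hashes.add(int.from_bytes(digest, "big"))
    return sorted(hashes)[:size]


def jaccard_estimate(sketch_a, sketch_b, size=1000):
    """Estimate Jaccard similarity of the underlying k-mer sets:
    take the `size` smallest hashes of the merged sketches and
    count how many of them occur in both sketches."""
    merged = sorted(set(sketch_a) | set(sketch_b))[:size]
    if not merged:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in shared) / len(merged)
```

You would build one sketch per sample (yours and the published ones) and compare them pairwise; values near 1 mean nearly identical k-mer content, values near 0 mean almost nothing in common.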

21 months ago

Do you need only a qualitative result (who is there?) or also a quantitative one (how many of each?)? Either way, the most accurate approach is to download the raw data for the published dataset as well and run all samples through the same pipeline. Check that k-mer content and sequencing depth are approximately balanced across the inputs.
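If the sequencing depths turn out to be very different, one common way to balance them is to subsample every sample down to the same number of reads before running the pipeline. Below is a minimal sketch of that idea using reservoir sampling on uncompressed FASTQ lines; the function names and the fixed seed are my own choices for illustration, not part of any particular tool.

```python
import random
from itertools import islice


def fastq_records(lines):
    """Group an iterable of FASTQ lines into 4-line records
    (header, sequence, '+', quality). A trailing incomplete record is dropped."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if len(record) < 4:
            return
        yield tuple(line.rstrip("\n") for line in record)


def subsample(records, n, seed=0):
    """Reservoir-sample up to n records uniformly at random in a single pass,
    without knowing the total number of records in advance."""
    rng = random.Random(seed)
    reservoir = []
    for i, rec in enumerate(records):
        if i < n:
            reservoir.append(rec)
        else:
            # Keep the new record with probability n / (i + 1).
            j = rng.randint(0, i)
            if j < n:
                reservoir[j] = rec
    return reservoir
```

For paired-end data, calling `subsample` on the R1 and R2 files with the same `n` and the same `seed` picks the same record positions in both, so the pairs stay in sync (assuming both files list the reads in the same order and have the same number of records).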

21 months ago

You can compare your shotgun metagenomic data to the published dataset with standard bioinformatics tools. Here's a general pipeline:

  • Quality check: You can use FastQC to assess the quality of the reads in your data.
  • Read pre-processing: Trim the reads to remove any adapter sequences, low-quality bases, etc. using tools such as Trimmomatic.
  • Assembly: Assemble the reads into longer contigs using a de novo assembler such as SPAdes (for metagenomes, use its metaSPAdes mode).
  • Taxonomic classification: Use tools like MetaPhlAn2 to generate a taxonomic profile for each sample. Note that MetaPhlAn2 works on the trimmed reads rather than on assembled contigs: it maps them to a database of clade-specific marker genes.
  • Comparative analysis: Compare the taxonomic profiles generated from your data with those from the published dataset. To visualize the results, you can use tools such as Krona or MEGAN.

It's worth noting that this is a general pipeline, and the exact steps you take may vary depending on your specific research goals and the tools available to you.
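For the comparative step, one widely used way to put a number on how different two taxonomic profiles are is the Bray-Curtis dissimilarity. Here is a small sketch, assuming each profile has already been parsed into a dict of taxon name to abundance; the parsing is left out, and the taxon names in the usage lines are placeholders rather than real results.

```python
def normalize(profile):
    """Convert raw abundances to relative abundances that sum to 1."""
    total = sum(profile.values())
    if total <= 0:
        raise ValueError("profile has no abundance to normalize")
    return {taxon: value / total for taxon, value in profile.items()}


def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity between two taxonomic profiles.
    0.0 means identical composition, 1.0 means no taxa in common."""
    a = normalize(profile_a)
    b = normalize(profile_b)
    taxa = set(a) | set(b)
    # Taxa missing from one profile count as zero abundance there.
    numerator = sum(abs(a.get(t, 0.0) - b.get(t, 0.0)) for t in taxa)
    denominator = sum(a.get(t, 0.0) + b.get(t, 0.0) for t in taxa)
    return numerator / denominator


# Placeholder profiles, for illustration only.
mine = {"taxon_A": 50, "taxon_B": 50}
published = {"taxon_A": 100}
distance = bray_curtis(mine, published)
```

Computing this for every pair of samples gives a distance matrix that you can feed into an ordination or clustering of your choice. Because the profiles are normalized first, differences in sequencing depth do not directly inflate the distance.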
