Hello,
I am new to this field. I am doing metagenome analysis with shotgun reads. All reads are single-end. The DNA was obtained from human airways. I just want to find the taxon abundances in the samples. Then I will estimate diversity and identify the core microbes.
My mapping results are terrible. How can I handle bad mappings? Or should I change the tools I used for the analysis? Which tools are more accurate or sensitive for microbiome analysis? Any suggestions would be appreciated!
I followed this pipeline:
- Assembly was done using Megahit
- Short contigs (<200 bp) were removed using prinseq
- Read mapping against contigs was performed using BWA
- Similarity searches against GenBank, KEGG, and eggNOG were done using Diamond
- Binning was done using MaxBin2
These are my mapping results:
# Sample    Total reads  Mapped reads  Mapping perc  Total bases
samples13   21380728     17881628      83.63         1618006383
samples14   109599       22051         20.12         7606328
samples15   258752       119090        46.02         18803788
samples16   340586       147490        43.30         24935657
samples12   7342679      6205921       84.52         524794709
samples11   7741157      6283578       81.17         554721680
samples17   17108901     15213361      88.92         1294292384
samples18   4012626      2850684       71.04         302834087
What makes you say that? I mostly see samples with >70% alignment rates, which is fine.
Friederike, each sample belongs to a different person. The mapping percentages of samples 14, 15, and 16 are under 50%. That is really bad for me, especially sample 14. I don't get any useful information from those samples.
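The numbers behind that concern can be checked with a short Python sketch. It recomputes the mapping percentage from the read counts in the table above and also reports each sample's sequencing depth relative to the deepest sample. The 50% cutoff is just the figure mentioned in this thread, not an established standard, and the names `samples`, `THRESHOLD`, and `mapping_rate` are made up for illustration.

```python
# (total reads, mapped reads) copied from the mapping table in the question.
samples = {
    "samples11": (7741157, 6283578),
    "samples12": (7342679, 6205921),
    "samples13": (21380728, 17881628),
    "samples14": (109599, 22051),
    "samples15": (258752, 119090),
    "samples16": (340586, 147490),
    "samples17": (17108901, 15213361),
    "samples18": (4012626, 2850684),
}

# Assumption: 50% is the cutoff used in this thread, not a community standard.
THRESHOLD = 50.0


def mapping_rate(total, mapped):
    """Return the percentage of reads that mapped back to the contigs."""
    return 100.0 * mapped / total


# Samples whose mapping percentage falls below the cutoff.
low = sorted(
    name
    for name, (total, mapped) in samples.items()
    if mapping_rate(total, mapped) < THRESHOLD
)

# Depth of the most deeply sequenced sample, used as a reference point.
max_total = max(total for total, _ in samples.values())

for name in sorted(samples):
    total, mapped = samples[name]
    rate = mapping_rate(total, mapped)
    depth = total / max_total
    print(f"{name}\t{rate:.2f}%\t{depth:.4f} of deepest sample")

print("below threshold:", low)
```

The depth column is worth a look alongside the mapping rate: the three low-mapping samples are also the three with the fewest total reads in the table, so shallow sequencing may be part of the explanation.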
Have you actually looked at the results? What type of information are you missing from the results? What exactly is throwing you off? (Not saying that these samples didn't fail, but in order to get a sense of why they may have failed, we need more information)
Friederike, in a nutshell, I need to compare the samples in pairs: (11-12), (13-14), (15-16), (17-18).
For example, in the image (assume the samples are ordered 11, 12, 13, ...), we expect samples 13 and 14 to be at an equivalent level, but sample 14 has very low abundance compared to sample 13. I'm not sure, but this depends on the sample amount, right? If so, is it possible to normalize these results and then calculate diversity and core microbes? I need relative abundance for this, right?
yes
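To illustrate what normalizing to relative abundance means here, below is a minimal Python sketch. The taxon names and counts are hypothetical, not taken from the samples in this thread, and `relative_abundance` and `shannon` are illustrative helper names. Each taxon's read count is divided by the sample total, and the Shannon index is then computed from those proportions.

```python
import math


def relative_abundance(counts):
    """Convert raw per-taxon read counts into proportions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no assigned reads")
    return {taxon: n / total for taxon, n in counts.items()}


def shannon(counts):
    """Shannon diversity index (natural log) from raw per-taxon counts."""
    proportions = relative_abundance(counts).values()
    return -sum(p * math.log(p) for p in proportions if p > 0)


# Hypothetical counts: same community composition, 100-fold different depth.
deep = {"taxon_A": 8000, "taxon_B": 1500, "taxon_C": 500}
shallow = {"taxon_A": 80, "taxon_B": 15, "taxon_C": 5}

for label, counts in (("deep", deep), ("shallow", shallow)):
    rel = relative_abundance(counts)
    print(label, {t: round(p, 3) for t, p in rel.items()}, round(shannon(counts), 4))
```

In this toy case the two samples give the same proportions and the same Shannon index despite the difference in depth. With real data that will not hold exactly: a shallow sample tends to miss rare taxa, so richness-based measures and core-microbe calls remain sensitive to depth even after this normalization, which is why subsampling all samples to a common depth (rarefaction) is often considered as well.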