Question

My alignment rate in Bowtie2 was too low


3.4 years ago

Liu Zijia • 0

I'm new to metagenomics. Recently I tried to reproduce the analysis from an article by following its steps. I used the BBMap software suite to process raw data from the ENA database, and then used Bowtie2 to map the reads to the mouse genome, but my alignment results were very poor. Here is my pipeline:

1. Raw data: accession number RJEB15095 (from ENA). I downloaded just one sample (ERR1562570).

2. Remove adapters and low-quality reads (BBMap suite). Here are my command lines:

a. Adapter trimming:

./bbduk.sh in=/home/liuzijia/桌面/Gut/ena_files/ERR1562570/ERR1562570_1.fastq.gz in2=/home/liuzijia/桌面/Gut/ena_files/ERR1562570/ERR1562570_2.fastq.gz out=ERR1562570_1_clean.fastq out2=ERR1562570_2_clean.fastq ref=/home/liuzijia/bbmap/resources/adapters.fa ktrim=r k=23 mink=11 hdist=1 tpe tbo

b. Quality trimming:

./bbduk.sh in=/home/liuzijia/桌面/Gut/ena_files/ERR1562570/ERR1562570_1.fastq.gz in2=/home/liuzijia/桌面/Gut/ena_files/ERR1562570/ERR1562570_2.fastq.gz out=/home/liuzijia/桌面/Gut/BBduk_result/ERR1562570_1_clean_TRIMQ.fastq out2=/home/liuzijia/桌面/Gut/BBduk_result/ERR1562570_2_clean_TRIMQ.fastq qtrim=r trimq=10

c. Force-Trim Modulo:

./bbduk.sh in=/home/liuzijia/桌面/Gut/BBduk_result/ERR1562570_1_clean_TRIMQ.fastq in2=/home/liuzijia/桌面/Gut/BBduk_result/ERR1562570_2_clean_TRIMQ.fastq out=/home/liuzijia/桌面/Gut/BBduk_result/ERR1562570_1_Force-Trim_Modulo.fastq out2=/home/liuzijia/桌面/Gut/BBduk_result/ERR1562570_2_Force-Trim_Modulo.fastq ftm=5
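For comparison, steps a to c could be chained so that each step reads the output of the previous one. This is only a sketch: it assumes the steps are meant to run in sequence (in the commands as posted, step b reads the raw files again rather than the output of step a). `BBDUK`, `ADAPTERS`, the function name `trim_sample`, and all file names are placeholders, not part of the original post.

```shell
#!/usr/bin/env bash
# Sketch only: BBDUK, ADAPTERS and every file name below are placeholders.
BBDUK="${BBDUK:-./bbduk.sh}"
ADAPTERS="${ADAPTERS:-adapters.fa}"

trim_sample() {
    local r1="$1" r2="$2" out="$3"
    # a. adapter trimming on the raw reads
    "$BBDUK" in="$r1" in2="$r2" \
        out="${out}_1_adapter.fastq" out2="${out}_2_adapter.fastq" \
        ref="$ADAPTERS" ktrim=r k=23 mink=11 hdist=1 tpe tbo || return 1
    # b. quality trimming on the adapter-trimmed reads from step a
    "$BBDUK" in="${out}_1_adapter.fastq" in2="${out}_2_adapter.fastq" \
        out="${out}_1_trimq.fastq" out2="${out}_2_trimq.fastq" \
        qtrim=r trimq=10 || return 1
    # c. force-trim modulo on the quality-trimmed reads from step b
    "$BBDUK" in="${out}_1_trimq.fastq" in2="${out}_2_trimq.fastq" \
        out="${out}_1_ftm.fastq" out2="${out}_2_ftm.fastq" ftm=5
}

# Example call (hypothetical file names):
#   trim_sample ERR1562570_1.fastq.gz ERR1562570_2.fastq.gz ERR1562570
```

The parameters are the ones from the commands above; only the input and output wiring differs.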

Then I ran Bowtie2 on the paired-end files ERR1562570_1_Force-Trim_Modulo.fastq and ERR1562570_2_Force-Trim_Modulo.fastq, and got this:

[image: Bowtie2 alignment summary]

And here is my command line:

bowtie2 -x /media/liuzijia/数据包/Gut/Bowite2/INDEX/mus_musculus -1 ERR1562570_1_Force-Trim_Modulo.fastq -2 ERR1562570_2_Force-Trim_Modulo.fastq -S bowtie_seq_mm10_1.sam

Can you help me?

Bowtie2 NGS BBDuk Metagenomics BBMap • 1.5k views

updated 13 months ago by vk ▴ 40 • written 3.4 years ago by Liu Zijia • 0


[image: analysis workflow from the article]

The image above shows the workflow I want to reproduce (DOI: https://doi.org/10.7554/eLife.58609).

But I just cannot get it to work.

3.4 years ago by Liu Zijia • 0

score 3 · Accepted Answer · 2022-03-27


3.4 years ago

vk ▴ 40

In that study they sequenced the gut microbiome and aligned the reads against the mouse genome only to remove host contamination, so the alignment rate should be low (although, as cpad0112 mentions below, it should probably have been around 5-10%?). The sequences come from cecal microbiota, so they would align to various bacterial species; for that you need to map the reads against the NCBI non-redundant database using DIAMOND BLAST.
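As a rough illustration of the decontamination idea, Bowtie2's `--un-conc` option writes the read pairs that fail to align concordantly to the reference, which here would be the non-host (microbial) reads to carry forward. The sketch below is not the pipeline from the paper; the function name `remove_host_reads`, the `BOWTIE2` variable, and the output naming are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Sketch only: index, file names and output prefix are placeholders.
BOWTIE2="${BOWTIE2:-bowtie2}"

remove_host_reads() {
    local index="$1" r1="$2" r2="$3" prefix="$4"
    # --un-conc writes pairs that fail to align concordantly to the host;
    # the '%' in the path is replaced by 1 or 2 for the two mates.
    # The alignments themselves are discarded (-S /dev/null) and the
    # alignment summary that Bowtie2 prints to stderr is saved to a log.
    "$BOWTIE2" -x "$index" -1 "$r1" -2 "$r2" \
        --un-conc "${prefix}_nonhost_%.fastq" \
        -S /dev/null 2> "${prefix}_bowtie2.log"
}

# Example call (hypothetical names):
#   remove_host_reads INDEX/mus_musculus sample_1.fastq sample_2.fastq sample
```

The resulting `*_nonhost_1.fastq` and `*_nonhost_2.fastq` files would then be the input for taxonomic or functional assignment.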

13 months ago by vk ▴ 40


In general, AFAIK, one cannot isolate just the non-host reads in a metagenomics experiment when the sample is taken from the host. Host reads will inevitably be present, yet in OP's case the host alignment rate is 0% (almost negligible). Unless the submitter uploaded only the reads that did not map to the host sequence (i.e. pruned reads), the data cannot be devoid of host reads. It is also possible that a problem with OP's reference, code, or computing environment is behind such a low alignment rate.
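A quick way to check the reference and the reported rate might look like the sketch below. It assumes the Bowtie2 summary (printed to stderr) was saved to a log file; the function names and the file names in the usage comments are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: function names and file names are placeholders.

# Print the percentage from the "<N>% overall alignment rate" line
# of a saved Bowtie2 summary, without the % sign.
overall_alignment_rate() {
    awk '/overall alignment rate/ { sub(/%/, "", $1); print $1 }' "$1"
}

# List the sequence names stored in a Bowtie2 index, to confirm that the
# index really contains the mouse chromosomes and not something else.
list_index_sequences() {
    bowtie2-inspect -n "$1"
}

# Example usage:
#   bowtie2 -x INDEX/mus_musculus -1 r1.fastq -2 r2.fastq -S out.sam 2> bowtie2.log
#   overall_alignment_rate bowtie2.log
#   list_index_sequences INDEX/mus_musculus | head
```

If the index lists the expected chromosomes and the rate is still near 0%, the submitted reads may already have been filtered against the host.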