Hi, I am struggling with the host removal step in a whole genome shotgun sequencing analysis pipeline for paired-end data. When I map my dataset with bowtie2, it generates a SAM file, which I then process with samtools. The resulting BAM file seems to be wrong, because the next step (sorting) reports a truncated file. I really don't know whether the problem is in bowtie2 or in samtools.
- Can anyone suggest a way to check where I am going wrong? Can the SAM file be converted into tabular form to check which reads are mapped and which are not? Is there a script for this?
- I would also like to know if there is another tool for host removal from my WGS paired-end datasets. I have been trying to resolve this for 2 months and it is still not working, so if there is another tool for this step I can try it.
- Can we generate BLAST output from bowtie2?
Can you share the commands you are using?
a) bowtie2 mapping against the host sequence. Host example: human genome hg19 (download bowtie2 hg19 index)
1) create bowtie2 index database (host_DB) from host reference genome
2) bowtie2 mapping against host sequence database, keep both mapped and unmapped reads (paired-end reads)
3) convert the .sam file to .bam
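For reference, the three steps above can be sketched as explicit command lines. This is only a rough sketch, not necessarily the exact commands used here: the file names (`host_genome.fasta`, `SAMPLE_R1.fastq`, `SAMPLE_R2.fastq`), the `SAMPLE` prefix, the thread count and the helper `host_mapping_commands` are all hypothetical placeholders. The script only builds and prints the commands; it does not run them.

```python
import shlex


def host_mapping_commands(host_fasta, index_prefix, reads_1, reads_2, sample, threads=4):
    """Build the command lines for steps 1-3 (all names are placeholders)."""
    sam = f"{sample}_mapped_and_unmapped.sam"
    bam = f"{sample}_mapped_and_unmapped.bam"
    return [
        # 1) create the bowtie2 index (host_DB) from the host reference genome
        ["bowtie2-build", host_fasta, index_prefix],
        # 2) map the paired-end reads against the host index; the SAM output
        #    contains both mapped and unmapped reads
        ["bowtie2", "-p", str(threads), "-x", index_prefix,
         "-1", reads_1, "-2", reads_2, "-S", sam],
        # 3) convert the SAM file to BAM
        ["samtools", "view", "-b", "-o", bam, sam],
    ]


commands = host_mapping_commands(
    "host_genome.fasta", "host_DB", "SAMPLE_R1.fastq", "SAMPLE_R2.fastq", "SAMPLE"
)
for cmd in commands:
    # Print each command in copy-pasteable shell form
    print(shlex.join(cmd))
```

To actually run the steps, each list could be passed to `subprocess.run(cmd, check=True)`, so that a failing step stops the pipeline instead of leaving a partial SAM or BAM file behind. If a later step reports a truncated file, `samtools quickcheck` on the BAM is one way to test whether the conversion finished cleanly.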
b) filter required unmapped reads
SAMtools SAM-flag filter: get unmapped pairs (both ends unmapped)
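To see which reads are mapped and which are not, the SAM FLAG column can be decoded directly. The sketch below is a minimal illustration, assuming a plain-text SAM file. It applies the same logic as a `samtools view -f 12 -F 256` filter: keep records where both the "read unmapped" (4) and "mate unmapped" (8) bits are set and the "secondary alignment" (256) bit is not. The function names and the tiny in-memory example records are made up for illustration.

```python
READ_UNMAPPED = 0x4    # this read is unmapped
MATE_UNMAPPED = 0x8    # its mate is unmapped
SECONDARY = 0x100      # not the primary alignment


def both_ends_unmapped(flag):
    """True if the record would pass a '-f 12 -F 256' style filter."""
    required = READ_UNMAPPED | MATE_UNMAPPED
    return (flag & required) == required and (flag & SECONDARY) == 0


def sam_to_rows(lines):
    """Yield (read_name, flag, reference, status) for each alignment line."""
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        qname, flag, rname = fields[0], int(fields[1]), fields[2]
        if both_ends_unmapped(flag):
            status = "both_unmapped"
        elif flag & READ_UNMAPPED:
            status = "read_unmapped"
        else:
            status = "mapped"
        yield qname, flag, rname, status


# Hypothetical example records (not real data)
example_sam = [
    "@HD\tVN:1.0\tSO:unsorted",
    "r1\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r1\t141\t*\t0\t0\t*\t*\t0\t0\tTTGA\tIIII",
    "r2\t99\tchr1\t100\t42\t4M\t=\t200\t104\tACGT\tIIII",
    "r2\t147\tchr1\t200\t42\t4M\t=\t100\t-104\tTTGA\tIIII",
]

rows = list(sam_to_rows(example_sam))
for row in rows:
    # Tab-separated table: name, flag, reference, status
    print("\t".join(str(value) for value in row))
```

For a real file, the same function can be fed an open file handle, e.g. `with open("SAMPLE_mapped_and_unmapped.sam") as handle: rows = sam_to_rows(handle)` (the file name here is a placeholder). If the script stops with a parsing error partway through, that can hint that the SAM file itself is incomplete.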