no output after running samtools depth
1
0
Entering edit mode
6.6 years ago
Ana ▴ 200

I want to ask another question related to my previous post here: C: compute depth of coverage for certain regions of the genome using samtools depth

I have downloaded bam files of 1000 genomes project phase 3, "not the entire genome but only for certain regions" by using samtools view. I want to compute depth of coverage for these specific regions which can be done by using samtools depth -r .

According to my previous post, samtools depth -r will not work and first I need to reindex my bam files. I re-index them by using

wget -q -O - "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/current.tree" | cut -f 1 | grep '.bam$' | grep -Ev -i ".chrom20.|.chrom11." | parallel --dry-run /data/programs/samtools-1.3.1/samtools view -b -o {/} {} "16:32297737-33456560"

however, when I try samtools depth -r after re-indexing,

/data/programs/samtools-1.3.1/samtools depth -r 16:32297737-33456560 NA21137.mapped.ILLUMINA.bwa.GIH.low_coverage.20120522.bam  > bb

my output file is empty, I also get an error message that

Warning: The index file is older than the data file: NA21137.mapped.ILLUMINA.bwa.GIH.low_coverage.20120522.bam.bai

I wonder what is wrong here? dos anyone have any idea? thanks

samtools bam • 2.6k views
ADD COMMENT
0
Entering edit mode
6.6 years ago

Index the bam file and try again.

wget -q -O - "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/current.tree" | cut -f 1 | grep '.bam$' | grep -Ev -i ".chrom20.|.chrom11." | parallel --dry-run /data/programs/samtools-1.3.1/samtools view -b -o {/} {} "16:32297737-33456560"

Above script will not fetch any bam as you are dry running parallel. To download you are supposed to remove --dry-run.

ADD COMMENT
0
Entering edit mode

thanks @cpad0112, unfortunately I really do not understand your response and what do you mean by indexing bam files and trying the code again! I downloaded the bam files by this command

wget -q -O - "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/current.tree" | cut -f 1 | grep '.bam$' | while read B; do echo -n "$B " && /data/programs/samtools-1.3.1/samtools  view -b -o ${B##*/} "http://ftp.1000genomes.ebi.ac.uk/vol1/$B" "16:32297737-33456560"  && rm *.chrom20.* && rm *.chrom11.* ; done

the outputs are bam and bam.bai files (bam.bai are index files as far as I know). later on I re-index the files by this

wget -q -O - "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/current.tree" | cut -f 1 | grep '.bam$' | grep -Ev -i ".chrom20.|.chrom11." | parallel --dry-run /data/programs/samtools-1.3.1/samtools view -b -o {/} {} "16:32297737-33456560"

Can you explain it bit more for me, I am totally new in human genome project, a bit confused by what you mean exactly.

ADD REPLY
1
Entering edit mode

wget -q -O - "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/current.tree" | cut -f 1 | grep '.bam$' | grep -Ev -i ".chrom20.|.chrom11." | parallel --dry-run /data/programs/samtools-1.3.1/samtools view -b -o {/} {} "16:32297737-33456560" is not for indexing the bam files. This a different way to download the bam files from internet. samtools index input.bam is the right command to index bam files.

Following is a test run for indexing bam and calculating depth by samtools on a single file, in a temporary directory. If these steps are successful, then you can apply them to all the bams in the directory.

  1. make a new directory (name it like test)
  2. Copy NA21137.mapped.ILLUMINA.bwa.GIH.low_coverage.20120522.bam and NA21137.mapped.ILLUMINA.bwa.GIH.low_coverage.20120522.bam.bai (copy, don't move the file)
  3. cd to the new directory
  4. You should see two files: with .bam and .bam.bai extension.
  5. Delete .bai file (rm NA21137.mapped.ILLUMINA.bwa.GIH.low_coverage.20120522.bam.bai)
  6. Now you should see only bam file
  7. Index the bam file with following command:

    samtools index NA21137.mapped.ILLUMINA.bwa.GIH.low_coverage.20120522.bam

  8. Now you should see two files with .bam and .bam.bai extensions

  9. Run samtools depth (example given below):

    samtools depth -r 16:32297737-32297837 NA21137.mapped.ILLUMINA.bwa.GIH.low_coverage.20120522.bam

  10. Step 9 would print the result to the screen. Result can be saved in a separate file.

samtools depth -r 16:32297737-32297837 NA21137.mapped.ILLUMINA.bwa.GIH.low_coverage.20120522.bam > NA21137.mapped.ILLUMINA.bwa.GIH.low_coverage.20120522. text.

Note: I edited post mentioning parallel that parallel is used for downloading bams, as alternative to bash script provided in that post.

ADD REPLY
1
Entering edit mode

If above steps are successful, navigate to wherever bam files are. Try running following command:

$ for i in *.bam;do  samtools index $i;done

This would index all the bam files. Try running samtools depth again on bam files.

ADD REPLY
0
Entering edit mode

thanks so much, excellent help

ADD REPLY

Login before adding your answer.

Traffic: 1869 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6