Hi!
I am working with an organism that has two plasmids and one chromosome. I would like to determine the relative copy number of my plasmids, respectively to my chromosome, which is in one copy. I have whole-genome sequencing data for this organism, so I thought of mapping my reads to the reference (3 sequences), determine the coverage of each sequence, and therefore get the values I am seeking. My code is:
bowtie2-build three-seqs.fasta three-seqs.fasta
bowtie2 --threads 10 -x three-seqs.fasta -U reads.fq.gz | samtools sort --threads 10 - -o mapped.bam
samtools depth mapped.bam >mapped.depth
perl calc.coverage.in.bam.depth.mads-albertsen.pl -i mapped.depth
# the last script is from: https://github.com/MadsAlbertsen/multi-metagenome/blob/master/misc.scripts/calc.coverage.in.bam.depth.pl
I get the following output from the perl script:
Name Average.coverage
first 1198.907
second 286.216
chromosome 508.999
By dividing the coverage of each reference by the total coverage would give me the relative coverage, hence relative copy number:
Name Average.coverage % coverage
first 1198.907 60.1
second 286.216 14.4
chromosome 508.999 25.5
Therefore, my first
plasmid is roughly twice as abundant as my chromosome
, and my second
plasmid roughly 0.5x, which would mean it's only in every second cell.
Is this workflow correct? Or am I missing something here, e.g. do I need to take into account the sequence length?
If there were no biases in sampling/lib prep/sequencing then that sounds about right. Sounds like your plasmids don't share much sequence with your chromosome.
Thanks for the hint, I should have checked that! I ran a blast and calculated ANI, and there's only 3 or 4 chunks that are similar due to conserved genes, but they are just 2 kb in total.
However: I now filtered my
mapped.bam
by Q30, which removes 6 % of my reads. When I calculate coverages, I get roughly equal coverage for mychromosome
and my plasmidfirst
, thesecond
plasmid however stays at a relative abundance of 0.5:Why could that be? Is it multimapping reads that are causing this shift?