Plotting .tab/.bam file
5.1 years ago

Hey all,

I am trying to find a way to create a scatter plot and a histogram using matplotlib for an alignment I generated. I aligned my reads to a bacterial genome, then sorted and indexed the file and computed per-position depth using:

samtools view -b s_oneidensis_alignemnt_sensitive.sam > alignment.bam
samtools sort alignment.bam > alignment.sorted.bam
samtools index alignment.sorted.bam
samtools depth -a alignment.sorted.bam > pileup.tab

Now I'd like to generate a scatter plot with x-axis = position in genome and y-axis = depth of coverage, and then a histogram with x-axis = depth of coverage and y-axis = read count. I'm still new to Python and trying to figure out a method: should I use the .tab file or the .bam file? Any help or nudges in the right direction would be greatly appreciated. Thanks!

matplotlib tab python bam • 2.7k views

A similar topic has been discussed in "How to plot coverage and depth statistics of a bam file". A .tab file is just a text file whose columns are tab-separated, so you can read pileup.tab using pandas and plot it using pyplot.
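As a rough sketch of that suggestion: `samtools depth` writes three tab-separated columns (reference name, 1-based position, depth) with no header line by default, so the reader needs to be told there is no header. The column names `ref`, `pos`, and `depth`, the helper names `load_depth` and `plot_depth_scatter`, and the sample rows are assumptions made up for illustration, not part of the original post.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd


def load_depth(source):
    """Read `samtools depth` output (three tab-separated columns, no header)."""
    # header=None stops pandas from treating the first data row as column labels.
    return pd.read_csv(source, sep="\t", header=None, names=["ref", "pos", "depth"])


def plot_depth_scatter(depth_df):
    """Scatter plot of depth of coverage against genome position."""
    fig, ax = plt.subplots()
    ax.scatter(depth_df["pos"], depth_df["depth"], s=2, color="red")
    ax.set_xlabel("Position in Genome")
    ax.set_ylabel("Depth of Coverage")
    return ax


# Made-up rows standing in for pileup.tab (hypothetical reference name "ref1").
sample = io.StringIO("ref1\t1\t0\nref1\t2\t4\nref1\t3\t7\n")
depth_df = load_depth(sample)
ax = plot_depth_scatter(depth_df)
```

With the real file, `load_depth('pileup.tab')` followed by `plot_depth_scatter(...)` and `plt.show()` should work the same way, assuming the file has a single reference sequence.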


The issue I'm having is extracting the columns of the alignment .tab file into lists. My current code indexes the characters of the first string in a row (the NCBI accession ID for the genome), such as A and E, rather than the columns. Here is the code:

%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

x = []
y = []

table = pd.read_csv('pileup.tab', sep='\t')
alignment = pd.DataFrame(data=table)
for column in alignment:
    x.append(int(column[1]))
    y.append(int(column[2]))

plt.plot(x, y, 'ro')
plt.xlabel('Position in Genome')
plt.ylabel('Depth of Coverage')
plt.show()
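
A likely cause of the behaviour described above: iterating over a DataFrame yields its column labels, not its rows, and without `header=None` pandas takes the first data row as those labels. So `column[1]` is the second character of a label string such as the accession ID. The sketch below shows this on made-up data and then selects the columns by name. The names `ref`, `pos`, `depth`, and `plot_depth_histogram` are assumptions for illustration. Note that a histogram of the depth column counts genome positions per depth value rather than reads, so the y-axis is labelled accordingly.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for pileup.tab (hypothetical reference name "ref1").
SAMPLE = "ref1\t1\t3\nref1\t2\t5\nref1\t3\t5\nref1\t4\t0\n"

# Without header=None, the first data row becomes the column labels,
# and looping over the DataFrame walks those labels.
mis_read = pd.read_csv(io.StringIO(SAMPLE), sep="\t")
labels = [str(label) for label in mis_read]

# Reading with explicit names keeps every row as data.
depth_df = pd.read_csv(
    io.StringIO(SAMPLE), sep="\t", header=None, names=["ref", "pos", "depth"]
)
x = depth_df["pos"].tolist()    # genome positions
y = depth_df["depth"].tolist()  # depth of coverage at each position


def plot_depth_histogram(depths, bins=20):
    """Histogram of how many positions fall into each depth bin."""
    fig, ax = plt.subplots()
    ax.hist(depths, bins=bins)
    ax.set_xlabel("Depth of Coverage")
    ax.set_ylabel("Number of Positions")
    return ax


hist_ax = plot_depth_histogram(y, bins=3)
```

Passing `x` and `y` to `plt.plot(x, y, 'ro')` then gives the scatter plot the original code was aiming for, and `plt.show()` displays both figures.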