I have 3 files: one is a .bam file, the 2nd is its index (.bai) file, and the 3rd is a GTF file.
The name of the BAM file is sample_sort.bam. The two columns of interest for my purpose are column 3, the name of the chromosome, and column 4, the start position of the alignment (in the standard SAM layout, column 5 is the mapping quality). The GTF file contains the chromosome in the first column, the start and end positions of an exon in the 4th and 5th columns, and the gene name in the last column.
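To make the column layout concrete, here is a minimal sketch of pulling those fields out of one SAM text line and one GTF line. It assumes the standard SAM layout (column 3 is the reference name, column 4 is the 1-based leftmost position) and that the gene name is stored as a gene_name "..." attribute in the last GTF column. The helper names parse_sam_line and parse_gtf_line are made up for illustration.

```python
import re

# Assumption: the gene name appears as gene_name "XYZ"; in the attributes column.
_GENE_NAME = re.compile(r'gene_name "([^"]+)"')


def parse_sam_line(line):
    """Return (chromosome, start) for one SAM alignment line, else None."""
    if line.startswith("@"):          # header lines start with '@'
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 4 or fields[2] == "*":   # malformed or unmapped read
        return None
    return fields[2], int(fields[3])  # column 3 = chromosome, column 4 = start


def parse_gtf_line(line):
    """Return (chromosome, start, end, gene_name) for an exon line, else None."""
    if line.startswith("#"):          # comment lines start with '#'
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 9 or fields[2] != "exon":
        return None
    match = _GENE_NAME.search(fields[8])
    if match is None:
        return None
    return fields[0], int(fields[3]), int(fields[4]), match.group(1)
```

If your GTF only carries gene_id, the regular expression would need to be changed accordingly.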
First I want to parse the BAM file, then make an index, and finally compare it to a file that contains gene annotations (GTF).
I started like:
But I don't know how to convert BAM into SAM through pysam. Also, I don't know how to parse the given columns and compare them to the GTF file. My final output should be a matrix with two columns, one for the gene name and the other for the number of reads that matched.
Please help me.
Read count per exon per transcript
Thanks, but I need to compare BAM with SAM using Python.
Well, from a Python point of view, the syntax error in cell 8 is telling you the issue is that caret symbol (>). That is an operator that Python respects for testing conditions (IPython can also use it for writing with the %store magic command), and so Python's not happy because its context is all wrong, being around commas and passed as an argument to a method call. And, more importantly, that symbol isn't being used how you mean it. You probably are thinking you are using it to write to a file?
Maybe try the following in your Jupyter notebook:
The pysam.view part of that is based on https://stackoverflow.com/a/59314536/8508004 showing pysam.view(ops, bamfile, '1:2010000-20200000','2:2010000-20200000') and this post suggesting pysam.view("-S", "file.sam") as proper syntax. The rest is using some Python/IPython in Jupyter.
The first line assigns the output of pysam.view to a Python variable sam_out. The %store line uses Jupyter/IPython magics to save sam_out to a file named samplesort.sam, see here. Normally, the ways to write to a file in Python are a bit more verbose, but that store command is a nice Bash-like shortcut you can use with IPython or Jupyter.
If that doesn't work, try the longer version of writing to a file like here, something like:
Note that I didn't try converting that last suggestion to the specific options you seemed interested in. And so, you'll probably want to adapt that further.
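For the longer, plain-Python route, a minimal sketch might look like the following. It assumes pysam.view returns the SAM text as a string when called with samtools-style arguments. The "-h" option, the function names and the file names are illustrative choices, not the options from the original notebook.

```python
def write_text(text, path):
    """Write a string to a file: the verbose equivalent of `%store var >file`."""
    with open(path, "w") as handle:
        handle.write(text)


def bam_to_sam_file(bam_path, sam_path):
    """Convert a BAM file to SAM text with pysam.view and save it to sam_path."""
    import pysam  # imported here so write_text can be used without pysam

    # Assumption: "-h" asks samtools view to include the header; adapt as needed.
    sam_out = pysam.view("-h", bam_path)
    write_text(sam_out, sam_path)


# Example call (needs pysam and the BAM file to be present):
# bam_to_sam_file("sample_sort.bam", "samplesort.sam")
```

Keeping the file writing in its own small function means the same helper works for any text you have captured in a variable.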
And so if you get something working you may want to post a follow-up to help others.
I want to convert bam into sam
You converted bam into sam in your first two cells you show in your notebook, right?
Did you try, after you ran those two cells, the following to write it to a file as you seem to want from that pysam.view attempt:
It's working, but I want a matrix which contains read counts and gene names.
I want this type of matrix :
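A two-column matrix of that kind (gene name, number of matching reads) could be assembled roughly as follows. This is only a sketch under stated assumptions: a read counts for a gene when its start position falls inside one of that gene's exons on the same chromosome, reads are given as (chromosome, start) pairs, and exons as (chromosome, start, end, gene_name) tuples with 1-based inclusive coordinates. The function name count_reads_per_gene and the toy data are hypothetical.

```python
from collections import defaultdict


def count_reads_per_gene(reads, exons):
    """Return a sorted list of (gene_name, read_count) rows."""
    exons_by_chrom = defaultdict(list)
    counts = {}
    for chrom, start, end, gene in exons:
        exons_by_chrom[chrom].append((start, end, gene))
        counts.setdefault(gene, 0)    # genes with no reads still get a row

    for chrom, pos in reads:
        # Collect genes as a set so a read hitting two overlapping exons
        # of the same gene is counted only once for that gene.
        hit_genes = {
            gene
            for start, end, gene in exons_by_chrom.get(chrom, ())
            if start <= pos <= end
        }
        for gene in hit_genes:
            counts[gene] += 1

    return sorted(counts.items())


# Toy data, invented for illustration only.
exons = [
    ("chr1", 100, 200, "GENE_A"),
    ("chr1", 300, 400, "GENE_A"),
    ("chr2", 50, 150, "GENE_B"),
]
reads = [("chr1", 150), ("chr1", 350), ("chr1", 250), ("chr2", 60)]

for gene, n in count_reads_per_gene(reads, exons):
    print(f"{gene}\t{n}")
```

With a real BAM file, the (chromosome, start) pairs could come from iterating a pysam.AlignmentFile and taking read.reference_name and read.reference_start + 1 (pysam positions are 0-based). The linear scan over exons is simple but slow for large annotations, so an interval index would be worth considering there.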