Hi,
I have RNA-seq data for two samples and have aligned the reads to the reference genome, so I now have BAM files. I would like to get the read density over the first 1000 nucleotides of every transcript and then average those values, so that I end up with one value per sample (the average read density over the first 1000 nt of all transcripts). So far, in Python, I have a dictionary containing one representative transcript per gene (it holds the gene name and the transcript name). Does anyone know how I can get the read density over the first 1000 nt of each transcript? Then I can take the average of that.
Thanks
Why don't you create a BED file with the first 1000 bp of each transcript and get the coverage with bedtools or some other tool? You can even get the coverage at each base using the genomecov function in bedtools. If you want to do it in Python, packages such as deepTools or HTSeq can compute coverage as well.
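As a rough sketch of the BED-file idea, assuming you can look up the exon coordinates and strand of each representative transcript (for example from a GTF): the helper names `first_n_nt` and `to_bed_lines` below are made up for illustration, and the coordinates are assumed to be 0-based, half-open as in BED. The window is measured along the spliced transcript from the 5' end, so on the minus strand it starts at the highest genomic coordinate.

```python
def first_n_nt(exons, strand, n=1000):
    """Return genomic intervals (0-based, half-open) covering the first n
    spliced nucleotides counted from the transcript's 5' end.

    exons  : list of (start, end) tuples for one transcript
    strand : "+" or "-"
    """
    # Walk exons in transcription order: ascending for "+", descending for "-".
    ordered = sorted(exons, reverse=(strand == "-"))
    remaining = n
    intervals = []
    for start, end in ordered:
        if remaining <= 0:
            break
        take = min(end - start, remaining)
        if strand == "+":
            intervals.append((start, start + take))   # 5' side is the left edge
        else:
            intervals.append((end - take, end))       # 5' side is the right edge
        remaining -= take
    # Transcripts shorter than n simply yield all of their exons.
    return sorted(intervals)


def to_bed_lines(chrom, name, strand, intervals):
    """Format intervals as 6-column BED lines (score column set to 0)."""
    return [f"{chrom}\t{s}\t{e}\t{name}\t0\t{strand}" for s, e in intervals]


# Hypothetical transcript with two exons on the plus strand.
exons = [(100, 700), (1000, 2000)]
for line in to_bed_lines("chr1", "tx1", "+", first_n_nt(exons, "+")):
    print(line)
```

Writing these lines to a file (say `first1kb.bed`) gives something you could pass to `bedtools coverage -a first1kb.bed -b sample.bam`. Because a window can be split across several exons, the per-interval counts would still need to be summed per transcript name before averaging.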
Then I would get the read density from the end of all transcripts and average those as well. In the end, I am interested in the ratio of the average from the end to the average from the beginning of each transcript.
The original question does not mention anything about "End" or "ratios".
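For the follow-up about end versus beginning, here is a minimal sketch of the arithmetic, assuming you already have a read count and a window length for the start window and the end window of each transcript (for example from a coverage tool). The function names are hypothetical, and this takes the ratio of the two per-sample averages; if a per-transcript ratio is what is wanted, the division would have to happen before averaging instead.

```python
from statistics import mean


def window_density(read_count, window_length):
    """Reads per nucleotide in one window; the window may be shorter
    than 1000 nt when the transcript itself is short."""
    return read_count / window_length


def end_to_start_ratio(start_windows, end_windows):
    """Each argument is a list of (read_count, window_length) pairs,
    one pair per transcript, for a single sample.

    Returns (mean end density) / (mean start density).
    """
    start_mean = mean([window_density(c, l) for c, l in start_windows])
    end_mean = mean([window_density(c, l) for c, l in end_windows])
    if start_mean == 0:
        raise ValueError("no reads in any start window; ratio is undefined")
    return end_mean / start_mean


# Made-up numbers for two transcripts in one sample.
start_windows = [(200, 1000), (50, 500)]
end_windows = [(100, 1000), (25, 500)]
ratio = end_to_start_ratio(start_windows, end_windows)
print(ratio)
```

Dividing by the actual window length rather than a fixed 1000 keeps short transcripts from being penalised for having fewer bases to cover.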