Counting reads spanning two genes
1
0
Entering edit mode
6.4 years ago
caggtaagtat ★ 1.9k

Hi,

I'm looking for genes, which were mapped by STAR at mitochondrial transcripts, covering two neighboring genes. Is it possible to extract this inforamtion after a STAR run or do I have to do something different?

Edit: So basically, I would like to count all reads of the same scaffold which cover the border of two genes next to each other.

RNA-Seq sequencing • 1.4k views
ADD COMMENT
2
Entering edit mode
6.4 years ago
echo -e "contig\t(start2-1)\tstart2" > gene2.bed
samtools view -bu in.bam "contig:end1-end1" | samtools view -c -L gene2.bed -
ADD COMMENT
0
Entering edit mode

Could you maybe explain the first line? When I execute this using my bam file, I get the error: Could not read file "gene2.bed"

Thank you for the quick answer!

ADD REPLY
1
Entering edit mode

Could you maybe explain the first line?

the first line produce a one-line BED file containing the start position of the second gene. As a bed is a half-open interval, you'll need to subsract '1' for the start position.

ADD REPLY
0
Entering edit mode

Ok so with the first command I create this file:

contig (start2-1) start2

However the second line unfortunatly leads to the error:

[bed_read] Parse error reading gene2.bed at line 1
samtools view: Could not read file "gene2.bed"

Is there maybe somthing missing in your code, because at the end there is a single -

Edit: Oh ok, I thought this would be some sort of regular expression, but it seems like I have to state the specific coordinates, which I want to check.

ADD REPLY
0
Entering edit mode

Thank you again for your response! After inserting the conserning contig and start/end values, the test run for one start/end pair was successfull. I will now use your template embedded in a simple loop to check all the neighboring start/end values of my contig.

Would you however mind explaining the samtools view -bu part ? I didn't find it in the manual and am not sure what it does

ADD REPLY
1
Entering edit mode

I didn't find it in the manual

I found it in the manual http://www.htslib.org/doc/samtools.html

ADD REPLY
0
Entering edit mode

Ah it's a combination of -b and -u, I see :) thanks again

ADD REPLY
0
Entering edit mode

Can I again ask for your advice?

If executed following command

echo -e "MT\t14746\t14747" > gene2.bed
samtools view -bu file.bam "MT:14148-14148" | samtools view -c -L gene2.bed

This gives me the output 1500. Do I understand correctly, that this means, that 1500 reads cover both coordinates 14746 and 14148 simultaneously

ADD REPLY

Login before adding your answer.

Traffic: 1780 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6