Question

how to extract the promoter region from featureCounts output?

0


7.3 years ago

shoujun.gu ▴ 380

Here is what I plan to do:

1. Map reads to the reference
2. Count reads with featureCounts
3. Get DEGs with edgeR
4. Extract the promoter region (-1000 bp) of each DEG

Now I'm at step 2. After getting the count file, I found that some genes have more than one start site in the 'Start' column of the featureCounts output file. For example:

Geneid Chr Start End

4933401J01Rik chr1 3073253 3074322

Gm26206 chr1 3102016 3102125

Xkr4 chr1;chr1;chr1;chr1;chr1;chr1;chr1 3205901;3206523;3213439;3213609;3214482;3421702;3670552 3207317;3207317;3215632;3216344;3216968;3421901;3671498

Gm18956 chr1 3252757 3253236

If each gene had just one start site in the 'Start' column, I think I could extract the promoter region using bedtools. But since some genes have more than one TSS (e.g. Xkr4), how can I extract all of their promoter regions? Any suggestions?

Thanks.

RNA-Seq next-gen gene • 2.6k views


score 0 · Answer 1 · 2017-08-02

0


7.3 years ago

GouthamAtla 12k

If you are quantifying genes using featureCounts, you will have only one line per gene in your output. Can you post your featureCounts command and the head of your GTF?

There is no standard definition of a promoter unless you have histone modification data (H3K4me3) or open-chromatin regions to define one. If you see multiple promoters for a gene, they probably belong to different transcripts. Depending on your goal, you can either consider the promoters of all TSSs, i.e. +/- 500 bp around each TSS (the core promoter), or consider only the promoter of the transcript that shows the highest expression.
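As a rough illustration of the "all TSSs" option, here is a minimal Python sketch that derives a strand-aware TSS for each transcript and builds a +/- 500 bp window around it. The `Transcript` record, the helper names and the example coordinates are all hypothetical; in practice the transcript records would come from the `transcript` lines of your GTF. Note that the TSS of a minus-strand transcript is its End coordinate, not its Start.

```python
from typing import NamedTuple


class Transcript(NamedTuple):
    """Hypothetical record for one GTF 'transcript' line."""
    tx_id: str
    chrom: str
    start: int   # 1-based, inclusive (GTF convention)
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def tss(tx):
    """Return the 1-based TSS: start on '+', end on '-'."""
    return tx.start if tx.strand == "+" else tx.end


def promoter_window(tx, flank=500):
    """Return (chrom, start, end, strand), 1-based inclusive, +/- flank around the TSS."""
    pos = tss(tx)
    # Clamp at 1 so a window near the chromosome start stays valid.
    return (tx.chrom, max(1, pos - flank), pos + flank, tx.strand)


def all_promoters(transcripts, flank=500):
    """Collect one window per distinct TSS, preserving input order."""
    seen = set()
    windows = []
    for tx in transcripts:
        window = promoter_window(tx, flank)
        if window not in seen:
            seen.add(window)
            windows.append(window)
    return windows


# Made-up transcripts for illustration only.
txs = [
    Transcript("tx_a", "chr1", 3205901, 3216344, "-"),
    Transcript("tx_b", "chr1", 3214482, 3671498, "-"),
    Transcript("tx_c", "chr1", 1000, 5000, "+"),
]

for window in all_promoters(txs):
    print(window)
```

Transcripts that share a TSS collapse to a single window, so the number of windows per gene equals the number of distinct TSSs rather than the number of transcripts.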


0


I updated my post.

Yes, it looks like they are different transcripts. My questions are: 1. How do I determine which transcript is dominant? 2. If I want to extract all the promoter region sequences, is there an easy way to do it?
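One possible way to approach both questions, sketched in Python under stated assumptions: per-transcript abundances (e.g. TPM) are assumed to be available from a transcript-level quantifier, since gene-level counts alone do not distinguish transcripts, and "dominant" is taken here to mean the highest mean abundance across samples. The function names and all example values are hypothetical. The second helper writes a BED6 line covering the 1000 bp immediately upstream of a TSS.

```python
from statistics import mean


def dominant_transcript(expr_by_tx):
    """Pick the transcript with the highest mean abundance.

    expr_by_tx: dict mapping transcript id -> list of per-sample values.
    """
    return max(expr_by_tx, key=lambda tx_id: mean(expr_by_tx[tx_id]))


def upstream_bed(chrom, tss_pos, strand, name, upstream=1000):
    """Return a BED6 line (0-based, half-open) for the bases 5' of a 1-based TSS."""
    if strand == "+":
        # Upstream means lower coordinates; the interval ends just before the TSS base.
        start, end = tss_pos - 1 - upstream, tss_pos - 1
    else:
        # Upstream means higher coordinates; the interval starts just after the TSS base.
        start, end = tss_pos, tss_pos + upstream
    start = max(0, start)  # clamp at the chromosome start
    return "\t".join([chrom, str(start), str(end), name, "0", strand])


# Made-up abundances for two transcripts of one gene.
expr = {"tx_a": [1.0, 2.0], "tx_b": [5.0, 7.0]}
print(dominant_transcript(expr))
print(upstream_bed("chr1", 3671498, "-", "Xkr4"))
```

The resulting BED lines could then be passed to `bedtools getfasta` (with `-fi` for the genome FASTA, `-bed` for the intervals and `-s` so that minus-strand regions are reverse-complemented) to obtain the sequences.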

7.3 years ago by shoujun.gu ▴ 380