How to make Cufflinks produce the nucleotide sequence of the gene + 5000 upstream nucleotides
9.7 years ago
Michel Edwar ▴ 80

Hello,

I have a BAM file for a single chromosome. I want to use Cufflinks, or something similar, from the terminal to get the actual genes on the chromosome, each complete with the 5000 nucleotides upstream, so I can check the promoter regions for TF binding sites. I already use the following, but it only gives me the transcripts of the genes.

cufflinks --no-update-check -I 300000 -F 0.100000 -j 0.150000 -p 8 file1.bam
gtf2bed --do-not-sort < transcripts.gtf > transcripts.bed
bedtools getfasta -fi chr.fa -bed transcripts.bed -fo results.fasta

Regards

promoter upstream cufflinks bam • 2.4k views
9.7 years ago
Manvendra Singh ★ 2.2k

I would not run Cufflinks for this; I would do the following.

Modify your GTF file as shown here (this is a simple example; you can include other columns as well):

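# Input is the GTF (strand = column 7, start/end = columns 4/5), not the gtf2bed output.
awk 'BEGIN { FS = OFS = "\t" }
/^#/ { next }                                   # skip header/comment lines
{
    if ($7 == "+") { s = $4 - 5000; e = $5 }    # + strand: upstream lies before the start
    else           { s = $4; e = $5 + 5000 }    # other strands: upstream lies past the end
    if (s < 1) s = 1                            # do not run off the chromosome start
    print $1, s, e                              # 1-based like the GTF; use s - 1 for strict BED
}' transcripts.gtf > your_modified_bedfile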

Now either convert your modified file back into GTF (just replace the 4th and 5th columns of the GTF with the 2nd and 3rd columns from this file) and run featureCounts, or run bamcov with the modified BED file as input.
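The column swap described above could be sketched roughly as follows. This is only an illustration: `extend_gtf` is a hypothetical helper name, and it assumes the interval file is tab-separated, has exactly one line per non-comment GTF record in the same order, and holds 1-based coordinates.

```shell
# extend_gtf GTF INTERVALS
# Hypothetical helper: prints the GTF with columns 4 and 5 replaced by
# columns 2 and 3 of the interval file (matched line by line).
extend_gtf() {
    gtf="$1"
    intervals="$2"
    # paste puts the 3 interval columns in front of the 9 GTF columns,
    # so GTF column k becomes field k + 3 in the awk step.
    grep -v '^#' "$gtf" | paste "$intervals" - |
        awk 'BEGIN { FS = OFS = "\t" }
             { print $4, $5, $6, $2, $3, $9, $10, $11, $12 }'
}

# Example call (file names are placeholders):
# extend_gtf transcripts.gtf your_modified_bedfile > transcripts_extended.gtf
```

If the two files ever differ in line count or order, the coordinates will be attached to the wrong records, so it is worth comparing `wc -l` on both before trusting the result.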

After this, you would need to normalize the counts by the total number of mapped reads.
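One common way to do this normalization is counts per million mapped reads. The sketch below is an assumption about what is meant, not the only option: `cpm` is a hypothetical helper, and it expects a two-column, tab-separated file of feature ID and raw count plus the total number of mapped reads.

```shell
# cpm COUNTS_TSV TOTAL_MAPPED
# Hypothetical helper: prints each feature ID with its count scaled to
# counts per million mapped reads (count * 1e6 / total).
cpm() {
    awk -v total="$2" 'BEGIN { FS = OFS = "\t" }
        { printf "%s\t%.2f\n", $1, $2 * 1e6 / total }' "$1"
}

# Example call (file names are placeholders). samtools view -c -F 4 counts
# alignments that are not flagged as unmapped:
# total=$(samtools view -c -F 4 file1.bam)
# cpm counts.tsv "$total" > counts_cpm.tsv
```

Note that `-F 4` still counts secondary and supplementary alignments, so the total may be slightly higher than the number of distinct mapped reads.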

hth
