Hg38 TSS .bed/.gtf file for chipseq analysis
8.4 years ago
satishedu ▴ 20

Hi Everyone,

How can I obtain a BED file of TSS regions? I am performing a ChIP-seq analysis and, to look at the read coverage around the TSS using deepTools, I would like an hg38 TSS .bed/.gtf file. I searched extensively and found good solutions for the hg19 assembly, but it was hard to find one for hg38. Could someone guide me on how to get the required files?

Thank you.

Satish

ChIP-Seq Assembly TSS bed gtf • 6.4k views
8.4 years ago

We normally download GTF files from Ensembl and then convert using UCSC tools:

awk '{if ($$3 != "gene") print $0;}' file.gtf \
| grep -v "^#" \
| gtfToGenePred /dev/stdin /dev/stdout \
| genePredToBed stdin genes.bed

It's not pretty, but it works.

I should note that the next release of deepTools will accept GTF files, which should make everyone's life easier :)


Thanks Devon, I will try this approach. I would also like to ask exactly what kind of .bed file I need in order to see the read coverage intensity of my ChIP-seq data around the TSS. I am doing ChIP-seq on a FOXO transcription factor.


You want at minimum a 6 column BED file, with entries for each transcript. There are easier ways of making that than what I showed to you, but if you wanted a BED12 file for future use (e.g., the next version of deepTools) then you'd already be good to go with it.
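As one possible "easier way" to get a 6-column BED with one entry per transcript, the transcript lines of a GTF can be reformatted directly with awk. This is only a sketch under stated assumptions: the toy GTF, its file name, and the output name transcripts.bed are made up for illustration, the score column is filled with a placeholder 0, and the attribute column is assumed to contain a transcript_id "..." entry as in Ensembl GTFs.

```shell
# Hypothetical toy GTF (tab-separated) with one transcript on each strand.
tmpdir=$(mktemp -d)
printf 'chr1\ttest\ttranscript\t101\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n' > "$tmpdir/toy.gtf"
printf 'chr1\ttest\ttranscript\t301\t400\t.\t-\t.\tgene_id "g2"; transcript_id "t2";\n' >> "$tmpdir/toy.gtf"

# GTF coordinates are 1-based and inclusive; BED is 0-based and half-open,
# so the BED start is the GTF start minus 1 and the end is unchanged.
awk 'BEGIN {FS = OFS = "\t"}
     $3 == "transcript" {
         name = "."
         # Pull the ID out of: transcript_id "XXX" (15 chars precede the ID)
         if (match($9, /transcript_id "[^"]+"/))
             name = substr($9, RSTART + 15, RLENGTH - 16)
         print $1, $4 - 1, $5, name, 0, $7
     }' "$tmpdir/toy.gtf" > "$tmpdir/transcripts.bed"

cat "$tmpdir/transcripts.bed"
```

Comment lines are skipped automatically here because their third tab-separated field is never "transcript".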


Devon, could you please clarify two more things?

Why are two dollar signs used in $$3 != "gene" instead of just $3? I am curious whether $$ has a different meaning.

Also, if I want a BED file with the TSS regions alone, could you suggest any tweaks to the given script? (I tried the given script, and the output BED seemed to contain a lot more information than just the TSS coordinates.)


This was originally being run inside of a Makefile, so it needed to be escaped properly. That wouldn't be needed otherwise. I probably should have noted that when I posted this originally.
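To illustrate the escaping point: in a Makefile recipe, make turns $$ into a single $ before handing the line to the shell, so $$3 reaches awk as $3. Run directly in a shell, the filter would be written with a single dollar sign. A minimal sketch on a made-up two-line GTF (the file name and records are hypothetical):

```shell
# Hypothetical toy GTF: one gene line and one transcript line (tab-separated).
tmpdir=$(mktemp -d)
printf 'chr1\ttest\tgene\t100\t200\t.\t+\t.\tgene_id "g1";\n' > "$tmpdir/toy.gtf"
printf 'chr1\ttest\ttranscript\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n' >> "$tmpdir/toy.gtf"

# Directly in a shell: a single $ refers to awk's third field.
# Inside a Makefile recipe the same filter would be written with $$3.
kept=$(awk -F'\t' '$3 != "gene"' "$tmpdir/toy.gtf")
echo "$kept"
```

If the doubled form is pasted into a shell unchanged, awk will most likely read $$3 as $($3), which is not the intended test, so the gene lines would probably not be filtered out.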


Devon, Thanks a lot for clarifying.

Regarding the other query, is there a robust way that you would recommend to get a bed/gtf file for TSS alone?


If you have a BED file then it's a bit easier. It'd be something like:

awk 'BEGIN {FS = OFS = "\t"} {if ($6 == "-") {$2 = $3 - 1} else {$3 = $2 + 1} print}' input.bed > output.bed

For a GTF file it'd be something like:

awk 'BEGIN {FS = OFS = "\t"} $3 == "transcript" {if ($7 == "-") {$4 = $5} else {$5 = $4} print}' input.gtf > output.gtf

If there are comment lines then they'd need to be removed, either with grep or directly in awk.
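As a small worked example of the BED case with header lines removed directly in awk, here is a sketch on a made-up three-line file. The file names, the track line, and the records are hypothetical, and tab-separated output is assumed to be what downstream tools expect.

```shell
# Hypothetical toy BED6 with a track header line and one transcript per strand.
tmpdir=$(mktemp -d)
printf 'track name=toy\n' > "$tmpdir/input.bed"
printf 'chr1\t100\t200\tt1\t0\t+\n' >> "$tmpdir/input.bed"
printf 'chr1\t300\t400\tt2\t0\t-\n' >> "$tmpdir/input.bed"

# Skip comment/header lines, then shrink each interval to a 1-bp TSS:
# the first base for "+" entries, the last base for "-" entries.
awk 'BEGIN {FS = OFS = "\t"}
     /^(#|track|browser)/ {next}
     {if ($6 == "-") {$2 = $3 - 1} else {$3 = $2 + 1} print}' \
    "$tmpdir/input.bed" > "$tmpdir/tss.bed"

cat "$tmpdir/tss.bed"
```

On this input the plus-strand entry becomes chr1 100 101 and the minus-strand entry becomes chr1 399 400, with the name, score, and strand columns unchanged.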


Thanks Devon. The two one liners provided would be of great help to me indeed.

Jf
