Yes, I have this bed file:
$ head CITS.bed
chr1 568974 568975 CITS_1[gene=chr1_f_c24][PH=12][PH0=0.29][P=1.01e-12] 12 +
chr1 2239149 2239150 CITS_2[gene=chr1_f_c1136][PH=7][PH0=0.40][P=2.21e-04] 7 +
chr1 2239899 2239900 CITS_3[gene=chr1_f_c1138][PH=6][PH0=0.21][P=3.56e-04] 6 +
chr1 2461199 2461200 CITS_4[gene=chr1_f_c1237][PH=5][PH0=0.17][P=1.46e-04] 5 +
And I want to get something like this (a made-up example), with each annotation attribute in its own column and each column holding exactly one attribute:
chr1 568974 568975 CITS_1[gene=chr1_f_c24][PH=12][PH0=0.29][P=1.01e-12] 12 + Gene_ID:EST000000 Gene_name: GeneX Transcript_name: Transcript X Feature: 5'UTR
chr1 2239149 2239150 CITS_2[gene=chr1_f_c1136][PH=7][PH0=0.40][P=2.21e-04] 7 + Gene_ID:EST0000001 Gene_name: GeneY Transcript_name: Transcript Y Feature: lnRNA
chr1 2239899 2239900 CITS_3[gene=chr1_f_c1138][PH=6][PH0=0.21][P=3.56e-04] 6 + Gene_ID:EST0000002 Gene_name: GeneZ Transcript_name: Transcript Z Feature: miRNA001
The bed file contains only RNA reads (mRNAs, miRNAs, lncRNAs, snRNAs).
I had originally converted the gtf file into a bed file before using bedtools intersect.
But yes, you are correct: the attributes column of the gtf file (gencode.v28.annotation.hg38.gtf) is really messy:
chr1 HAVANA gene 11869 14409 . + . gene_id "ENSG00000223972.5"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1"; level 2; havana_gene "OTTHUMG00000000961.2";
chr1 HAVANA transcript 11869 14409 . + . gene_id "ENSG00000223972.5"; transcript_id "ENST00000456328.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1"; transcript_type "processed_transcript"; transcript_name "RP11-34P13.1-002"; level 2; transcript_support_level "1"; tag "basic"; havana_gene "OTTHUMG00000000961.2"; havana_transcript "OTTHUMT00000362751.1";
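One way to tame that attributes column is to split it into key/value pairs and keep only the keys of interest, one per output column. Below is a minimal Python sketch of that idea, not a definitive solution. It assumes the real GTF is tab-delimited with nine fields (the spaces shown above are presumably just how the lines were pasted), and the names `parse_gtf_attributes` and `gtf_line_to_columns` are hypothetical helpers made up for this example. It also shifts the start coordinate by one, because GTF is 1-based inclusive while BED is 0-based half-open.

```python
import re

# One `key "value";` or `key value;` pair from the GTF attributes column.
ATTR_RE = re.compile(r'\s*(\S+)\s+(?:"([^"]*)"|([^;\s]+))\s*;')


def parse_gtf_attributes(attr_field):
    """Return a dict mapping attribute name -> value.

    Keys that occur more than once (e.g. `tag`) are joined with commas.
    """
    attrs = {}
    for match in ATTR_RE.finditer(attr_field):
        key = match.group(1)
        # Group 2 is a quoted value, group 3 an unquoted one (e.g. `level 2`).
        value = match.group(2) if match.group(2) is not None else match.group(3)
        if key in attrs:
            attrs[key] += "," + value
        else:
            attrs[key] = value
    return attrs


def gtf_line_to_columns(line, wanted=("gene_id", "gene_name", "transcript_name")):
    """Turn one tab-delimited GTF line into a flat list of BED-like columns.

    Attributes missing from the line (e.g. transcript_name on a gene
    record) are written as "." so every row has the same number of columns.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated GTF fields")
    chrom, _source, feature, start, end, _score, strand, _frame, attr_field = fields
    attrs = parse_gtf_attributes(attr_field)
    # GTF start is 1-based; BED start is 0-based, so subtract 1.
    bed_start = str(int(start) - 1)
    return [chrom, bed_start, end, feature, strand] + [attrs.get(k, ".") for k in wanted]


if __name__ == "__main__":
    example = "\t".join([
        "chr1", "HAVANA", "gene", "11869", "14409", ".", "+", ".",
        'gene_id "ENSG00000223972.5"; gene_type "transcribed_unprocessed_pseudogene"; '
        'gene_name "DDX11L1"; level 2; havana_gene "OTTHUMG00000000961.2";',
    ])
    print("\t".join(gtf_line_to_columns(example)))
```

A file flattened this way has one attribute per column, so an overlap with CITS.bed would carry those columns along unchanged. Which keys go into `wanted` is a choice for this example only; any attribute present in the GTF could be listed there.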
Can you show which files you have and what you are trying to get? In general, I wouldn't recommend running bedtools intersect on a gtf file, because bedtools doesn't really understand the relationships between features (gene -> transcript -> exon), so your output file might get very messed up. Definitely check it in a genome browser and make sure all your exons, transcripts, etc. are in place.