I used the GENCODE human reference, release 41 (https://www.gencodegenes.org/human/release_41.html), to align my FASTQ files. Specifically, I used the first GTF file at that link to build the STAR index. Now I am trying to use Picard to get ribosomal metrics, but I am having trouble creating the ribosomal interval file.
First, do I have to use the same GTF file to get the ribosomal intervals for Picard in BED format? I tried to create a BED file of ribosomal intervals with 5 columns (chr, start, stop, strand, gene id) from the basic gene annotation GTF, not the first GTF file at the link (which I used for STAR), and I am getting an "interval not within the sequence" error.
Second, is my overall approach correct? I am using the reference GTF file to get the ribosomal gene records, then converting their positions into a BED file to pass as the ribosome_interval.list file for the ribosomal metrics.
I appreciate any help.
Please show us the commands you ran.
Sorry for not including the command in the original post as I created it using my phone. The command I used is the following:
So from the reference genome GTF file, I extracted the ribosomal genes into a file named ribosomal genes.gtf, and I am then using the above command to get the interval list. I would appreciate it if you could tell me whether I am doing this right. And back to my first question: does the GTF file I used to create the STAR index have to be the same GTF file I use to get the ribosomal genes?
We have solved this issue: we had to format the file with the BED-style columns and also include the header lines that an interval list file requires. An example ribosomal_interval.list file can be found in this post: (Ribosomal Intervals For Collectrnaseqmetrics)
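For anyone landing here later, here is a minimal Python sketch of the fix described above, not the exact script we ran. It assumes you have the SAM header of the STAR-aligned BAM as text (for example from samtools view -H) and the lines of a GENCODE GTF. It writes the @HD/@SQ header lines followed by five tab-separated columns (contig, start, end, strand, name), using 1-based coordinates as in the GTF. The function names, the default gene_type values, and the choice to raise an error on a contig mismatch are my own assumptions for illustration.

```python
import re


def parse_sq_lengths(sam_header):
    """Map contig name -> length using the @SQ lines of a SAM header."""
    lengths = {}
    for line in sam_header.splitlines():
        if line.startswith("@SQ"):
            tags = dict(field.split(":", 1) for field in line.split("\t")[1:])
            lengths[tags["SN"]] = int(tags["LN"])
    return lengths


def rrna_interval_list(sam_header, gtf_lines, gene_types=("rRNA", "Mt_rRNA")):
    """Build interval-list text for rRNA genes found in GTF lines.

    gene_types is an assumption: adjust it to the biotypes you want counted.
    """
    lengths = parse_sq_lengths(sam_header)
    # Keep only the @HD and @SQ header lines from the BAM header.
    out = [l for l in sam_header.splitlines() if l.startswith(("@HD", "@SQ"))]
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue
        gene_type = re.search(r'gene_type "([^"]+)"', cols[8])
        if not gene_type or gene_type.group(1) not in gene_types:
            continue
        chrom, start, end, strand = cols[0], int(cols[3]), int(cols[4]), cols[6]
        # Fail early if the annotation does not match the BAM's contigs.
        if chrom not in lengths or end > lengths[chrom]:
            raise ValueError(f"{chrom}:{start}-{end} is not within the BAM sequences")
        gene_id = re.search(r'gene_id "([^"]+)"', cols[8]).group(1)
        # GTF coordinates are already 1-based inclusive, so no shift is applied.
        out.append("\t".join([chrom, str(start), str(end), strand, gene_id]))
    return "\n".join(out) + "\n"
```

On the first question, as far as I understand it the GTF does not have to be the identical file used for the STAR index, but its contig names and coordinates have to match the sequences in the BAM header, which is what the check in the loop is meant to catch.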