I'm trying to use my .sam file from STAR and a GTF file with bedtools complement to get the following: the number of intergenic reads, the number of genic reads, and the number of reads that didn't map at all.
How can I do this? I saw this post, "Does STAR give the reads that were mapped at intergenic regions? How can I get the regions and count reads?", which suggested some other tools, but I would need to do a lot of preprocessing for RNA-SeqQC and would rather use bedtools (and any downstream tool that would be necessary for this process).
Does anyone know how I would do this?
I converted my SAM file to a BED file and ran the following:
bedtools -i mapped.bed -g contig.lengths.tsv
but it doesn't know where the intergenic regions are...
Calling bedtools on its own doesn't work. You have to name a subcommand, such as 'intersect' or 'complement'.
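A rough sketch of what naming the subcommand could look like for this question. `complement` inverts whatever is passed to `-i`, so intergenic intervals would come from the gene annotation rather than from the reads. The function names here are hypothetical, and the sketch assumes the GTF contains `gene` feature rows (Ensembl/GENCODE style) and that the genome file lists chromosomes in `LC_ALL=C sort` order:

```shell
# Hypothetical helper: convert GTF "gene" rows to sorted 3-column BED.
# GTF is 1-based inclusive, BED is 0-based half-open, so only the start shifts by 1.
gtf_genes_to_bed() {
  awk 'BEGIN { FS = OFS = "\t" } !/^#/ && $3 == "gene" { print $1, $4 - 1, $5 }' "$1" \
    | LC_ALL=C sort -k1,1 -k2,2n
}

# Hypothetical helper: intergenic intervals = complement of the merged gene intervals.
# $1 = GTF, $2 = genome file (chrom<TAB>length), assumed sorted like the BED above.
make_intergenic_bed() {
  gtf_genes_to_bed "$1" \
    | bedtools merge -i stdin \
    | bedtools complement -i stdin -g "$2"
}
```

Usage might be `make_intergenic_bed annotation.gtf contig.lengths.tsv > intergenic.bed`, where `annotation.gtf` is a placeholder name.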
Do you know of any tutorials to do this?
Here is the Bedtools intersect documentation. There are a fairly large number of examples provided.
https://bedtools.readthedocs.io/en/latest/content/tools/intersect.html
For this, would I need to run complement and then intersect?
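One possible way to get the three counts, sketched under some assumptions. It uses `intersect -u` against the gene intervals for genic reads and `intersect -v` for intergenic ones, which gives the same split as intersecting with the complement but without counting boundary-spanning reads twice. `count_read_classes` is a hypothetical name, `samtools` is assumed to be installed, and counts are per alignment record (so each mate of a pair counts separately). Note that STAR leaves unmapped reads out of the SAM by default, so the unmapped count will only be meaningful if the run used `--outSAMunmapped Within`.

```shell
# Hypothetical helper: print unmapped, genic and intergenic alignment counts.
# $1 = SAM/BAM from STAR, $2 = BED of gene intervals (both assumed inputs).
count_read_classes() {
  local aln="$1" genes="$2"
  local unmapped genic intergenic

  # SAM flag 4 marks unmapped records.
  unmapped=$(samtools view -c -f 4 "$aln")

  # -F 2308 drops unmapped (4), secondary (256) and supplementary (2048) records.
  # -split makes spliced reads overlap only through their aligned blocks.
  # -u keeps each read that overlaps at least one gene interval.
  genic=$(samtools view -b -F 2308 "$aln" \
    | bedtools intersect -u -split -abam stdin -b "$genes" \
    | samtools view -c -)

  # -v keeps the reads that overlap no gene interval at all.
  intergenic=$(samtools view -b -F 2308 "$aln" \
    | bedtools intersect -v -split -abam stdin -b "$genes" \
    | samtools view -c -)

  printf 'unmapped\t%s\ngenic\t%s\nintergenic\t%s\n' "$unmapped" "$genic" "$intergenic"
}
```

It could be called as `count_read_classes Aligned.out.sam genes.bed`, with both file names standing in for your own.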