I have output from cellranger with gene expression for each cell (pre-mRNA seq). I need to remove all non-coding genes from it. What is the best way to do this?
If you want to subset the BAM file, you can do:
samtools view -h -b -L genes_coordinates.bed in.bam > out.bam
where genes_coordinates.bed is a BED file with the genomic coordinates of each coding gene. This information can be downloaded from, for example, Ensembl (BioMart).
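If you already have a GTF annotation, the BED file can also be built from it directly. Below is a minimal Python sketch of that idea, not a definitive implementation. The function name `coding_genes_to_bed` is made up for illustration, and the code assumes the GTF has `gene` records carrying either a `gene_biotype` (Ensembl style) or `gene_type` (GENCODE style) attribute. Check that the chromosome names in the GTF match those in your BAM (e.g. `chr1` vs `1`) before using the result with `samtools view -L`.

```python
import re


def coding_genes_to_bed(gtf_lines, biotype="protein_coding"):
    """Return BED-style tuples (chrom, start, end, gene_id) for genes of one biotype.

    Hypothetical helper: assumes 'gene' records with a gene_biotype
    (Ensembl) or gene_type (GENCODE) attribute in column 9.
    """
    rows = []
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # keep gene-level records only
        # Parse quoted key "value" pairs from the attribute column.
        attrs = dict(re.findall(r'(\S+) "([^"]*)"', fields[8]))
        gene_type = attrs.get("gene_biotype") or attrs.get("gene_type")
        if gene_type != biotype:
            continue
        # GTF coordinates are 1-based inclusive; BED is 0-based half-open.
        start = int(fields[3]) - 1
        end = int(fields[4])
        rows.append((fields[0], start, end, attrs.get("gene_id", ".")))
    return rows


def write_bed(rows, path):
    """Write the tuples from coding_genes_to_bed as a tab-separated BED file."""
    with open(path, "w") as out:
        for chrom, start, end, gene_id in rows:
            out.write(f"{chrom}\t{start}\t{end}\t{gene_id}\n")
```

Usage would be something like `write_bed(coding_genes_to_bed(open("annotation.gtf")), "genes_coordinates.bed")`, where `annotation.gtf` is a placeholder for whatever uncompressed annotation file you downloaded.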
You have to download it from any of the standard repositories, e.g. GENCODE or NCBI. It is a reference annotation file. Do you have a background in NGS analysis? No offense, but if you are stuck on removing some genes, then the single-cell analysis will be... "fun". But seriously, you should spend some time with the basics.
What is pre-mRNA seq? Do you mean total RNA seq?
Pre-mRNA is a primary transcript that contains both introns and exons.
I know, but how can you sequence only pre-mRNA? Capping and polyadenylation happen after transcription, but splicing already happens during transcription, so I'm genuinely curious, because I have never heard of pre-mRNA seq.