I want to use 10X scRNA-seq data to analyse alternative last exons. After mapping I get a .bam file. Which software is a good choice for counting exons?
You need to create a custom reference GTF in which each of your exons is treated as a gene, i.e. each exon gets a unique gene_id and gene_name, and then use that GTF in the Cell Ranger pipeline from the beginning.
In a GTF file, the gene_id and gene_name attributes are used to get gene-level counts, which basically sum up the exon-level counts that belong to the same gene. If you want exon-level counts, each exon should have its own unique gene_id and gene_name attributes.
For example, let's say you have a gene definition (dummy example):
chr1 dummy exon 380 401 . + 0 gene_id "001"; gene_name "aa";
chr1 dummy exon 501 650 . + 2 gene_id "001"; gene_name "aa";
chr1 dummy exon 700 707 . + 2 gene_id "001"; gene_name "aa";
chr1 dummy exon 380 382 . + 0 gene_id "001"; gene_name "aa";
chr1 dummy exon 708 710 . + 0 gene_id "001"; gene_name "aa";
That should be changed to:
chr1 dummy exon 380 401 . + 0 gene_id "001_ex1"; gene_name "aa_ex1";
chr1 dummy exon 501 650 . + 2 gene_id "001_ex2"; gene_name "aa_ex2";
chr1 dummy exon 700 707 . + 2 gene_id "001_ex3"; gene_name "aa_ex3";
chr1 dummy exon 380 382 . + 0 gene_id "001_ex4"; gene_name "aa_ex4";
chr1 dummy exon 708 710 . + 0 gene_id "001_ex5"; gene_name "aa_ex5";
Then, instead of getting counts for gene 001, you will get counts for each individual exon in your output.
Note: The exons need not be numbered 1, 2, 3, 4, etc. They can be given any unique value.
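If editing the GTF by hand is not practical, the renaming above can be scripted. The following is only a minimal sketch, not a tested pipeline step. It assumes a standard tab-separated GTF with gene_id and gene_name in the 9th column; the function name per_exon_gtf and the file names in the usage comment are made up for illustration. It keeps only exon records, skips an exon that appears again with the same coordinates and gene (as happens when several transcripts share an exon), and appends a running _exN suffix as in the example. Other attributes such as transcript_id are left untouched, so check what your reference-building step requires.

```python
import re
import sys
from collections import defaultdict

# Patterns for the two attributes that get a per-exon suffix.
ATTR_RE = {key: re.compile(rf'\b{key} "([^"]*)"') for key in ("gene_id", "gene_name")}


def per_exon_gtf(lines):
    """Yield exon records whose gene_id/gene_name are unique per exon."""
    seen = set()                # exons already emitted (coordinates + gene)
    counter = defaultdict(int)  # running exon number per original gene_id
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header/comment and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "exon":
            continue  # assumption: only exon records are kept
        match = ATTR_RE["gene_id"].search(fields[8])
        if match is None:
            continue  # no gene_id, nothing to rename
        gene_id = match.group(1)
        key = (fields[0], fields[3], fields[4], fields[6], gene_id)
        if key in seen:
            continue  # same exon listed again for another transcript
        seen.add(key)
        counter[gene_id] += 1
        suffix = f"_ex{counter[gene_id]}"
        attrs = fields[8]
        for name, pattern in ATTR_RE.items():
            attrs = pattern.sub(
                lambda m, n=name: f'{n} "{m.group(1)}{suffix}"', attrs, count=1
            )
        fields[8] = attrs
        yield "\t".join(fields)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python per_exon_gtf.py genes.gtf > exons_as_genes.gtf
    with open(sys.argv[1]) as handle:
        for record in per_exon_gtf(handle):
            print(record)
```

The suffix is numbered in file order per gene, so it does not necessarily match the biological exon number; as noted above, any unique value works.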
Thanks, but coding is difficult for me. How could I get the special GTF from a normal GTF?
You mean I should change every "exon" type to "gene" in the GTF's 3rd column? I also can't understand why each exon needs a unique gene_id and gene_name. Thanks for your explanation.
I updated my answer. Please read about the GTF/GFF formats and how gene-level counts are obtained by quantification methods.