Dear all,
I have recently started using Cell Ranger from 10x Genomics for scRNA-seq data. Previously I used my own pipeline (with STAR alignment) for Smart-seq2 data to get the count matrix, which I then analyzed with Seurat or Scanpy.
I have, however, run into a strange issue: not all genes appear in the features.csv file. For example, the gene Clec7a is missing, which seems strange to me. I then checked the total number of features in the features.csv file and found approximately 32,000, far fewer than the 53,000 I used to get with my own pipeline. This suggests that the genome annotation files (GTF) differ in some way.
Does 10x filter the genome annotation file in a certain way to decrease the number of genes? Is there an option to control this?
Note: I use the same reference genome version in my own pipeline as the 10x one.
That's really helpful!
In this case, I will have to build a new reference with "cellranger mkref" from an unfiltered annotation file to overcome this problem.
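Before rebuilding, it may be worth checking which genes the filtering actually removed. Below is a minimal Python sketch, assuming both annotations are available as uncompressed GTF files with "gene" records. The function names and the command-line usage are hypothetical, and it looks for the biotype under either gene_biotype (Ensembl style) or gene_type (GENCODE style).

```python
import re
import sys
from collections import Counter

# Matches key "value" pairs in the GTF attribute column (column 9).
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def gene_biotypes(gtf_lines):
    """Map gene name -> biotype for every 'gene' record in a GTF."""
    genes = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # keep gene-level records only
        attrs = dict(ATTR_RE.findall(fields[8]))
        name = attrs.get("gene_name") or attrs.get("gene_id")
        if name is None:
            continue
        # Ensembl GTFs use gene_biotype; GENCODE GTFs use gene_type.
        genes[name] = attrs.get("gene_biotype") or attrs.get("gene_type") or "unknown"
    return genes


def missing_by_biotype(full, filtered):
    """Count genes present in `full` but absent from `filtered`, per biotype."""
    return Counter(bt for name, bt in full.items() if name not in filtered)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python compare_gtf.py full.gtf filtered.gtf
    with open(sys.argv[1]) as fh:
        full = gene_biotypes(fh)
    with open(sys.argv[2]) as fh:
        filtered = gene_biotypes(fh)
    for biotype, n in missing_by_biotype(full, filtered).most_common():
        print(f"{biotype}\t{n}")
    print("Clec7a in filtered annotation:", "Clec7a" in filtered)
```

If the missing genes fall mostly into a few biotypes, that would point to biotype-based filtering; if a gene like Clec7a is absent from both files, the difference lies elsewhere.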