Hi! I have a 10x scRNA-seq dataset that contains pig and human cells. I am trying to generate a hybrid human/pig reference for CellRanger, as described by Cheung et al., Nature Communications 2024.
My approach is to download the Ensembl pig and human FASTA/GTF files. To ensure fully unique gene names, I want to add a prefix ('HUMAN' or 'PIG') to each. However, I am not sure where to add the prefix in the two files because their nomenclature seems to differ. For example, below, I could make the FASTA ID "HUMAN_ENST00000390473.1" and the GTF ID "HUMAN_ENSG00000142611", but the first is a transcript ID and the second is a gene ID, so I'm not sure how CellRanger could map them together.
Any ideas? Please let me know if I am misunderstanding or approaching this wrong.
#FASTA EXAMPLE:
>ENST00000390473.1 cdna chromosome:GRCh38:14:22450089:22450139:1 gene:ENSG00000211825.1 gene_biotype:TR_J_gene transcript_biotype:TR_J_gene gene_symbol:TRDJ1 description:T cell receptor delta joining 1 [Source:HGNC Symbol;Acc:HGNC:12257]
ACACCGATAAACTCATCTTTGGAAAAGGAACCCGTGTGACTGTGGAACCAA
#GTF EXAMPLE:
1 ensembl_havana gene 3069168 3438621 . + . gene_id "ENSG00000142611"; gene_version "17"; gene_name "PRDM16"; gene_source "ensembl_havana"; gene_biotype "protein_coding";
Looks like 10x has a recommendation to create a reference with multiple species: https://www.10xgenomics.com/support/software/cell-ranger/latest/analysis/inputs/cr-3p-references#multiple-species-4f40e4
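If manual prefixing is still wanted, here is a minimal Python sketch of what it might look like. It rests on several assumptions that are not confirmed above: that the FASTA being edited is the genome (dna) file, whose sequence names match column 1 of the GTF, rather than the cdna file in the example; that "_" is an acceptable separator; and that gene_id, transcript_id and gene_name are the attributes worth prefixing. The function names and the file names in the usage comment are hypothetical.

```python
import re

# Attributes that receive the species prefix (an assumption; adjust as needed).
ATTRS_TO_PREFIX = ("gene_id", "transcript_id", "gene_name")


def prefix_fasta_header(line: str, prefix: str) -> str:
    """Prefix the sequence name on a FASTA header line; sequence lines pass through."""
    if line.startswith(">"):
        return f">{prefix}_{line[1:]}"
    return line


def prefix_gtf_line(line: str, prefix: str) -> str:
    """Prefix column 1 (sequence name) and the selected attributes of one GTF line."""
    if line.startswith("#") or not line.strip():
        return line  # keep header comments and blank lines unchanged
    body = line.rstrip("\n")
    newline = line[len(body):]  # preserve the original line ending, if any
    fields = body.split("\t")
    if len(fields) != 9:
        return line  # not a standard tab-delimited GTF record; leave it alone
    fields[0] = f"{prefix}_{fields[0]}"
    for attr in ATTRS_TO_PREFIX:
        # Rewrites e.g. gene_id "ENSG..." to gene_id "HUMAN_ENSG..."
        fields[8] = re.sub(rf'\b{attr} "([^"]+)"', rf'{attr} "{prefix}_\1"', fields[8])
    return "\t".join(fields) + newline


def rewrite_file(src: str, dst: str, prefix: str, transform) -> None:
    """Apply a per-line transform to src and write the result to dst."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            fout.write(transform(line, prefix))


# Hypothetical usage (file names are placeholders):
# rewrite_file("human.dna.fa", "human.prefixed.fa", "HUMAN", prefix_fasta_header)
# rewrite_file("human.gtf", "human.prefixed.gtf", "HUMAN", prefix_gtf_line)
```

The same prefix is applied to the FASTA sequence names and to GTF column 1 so that the two files stay consistent with each other; the transcript and gene IDs only need to change inside the GTF.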