I am trying to run cell labeling with PopV (Popular Vote) using the notebook at https://colab.research.google.com/drive/1Yw4ZDMoPgXNiC1ZQo2eS75Sw8Y_23rrb?usp=sharing#scrollTo=h41Q6U5wMwyP.
When running the part of the code:
from popv import process_query

adata = process_query(query_adata,
ref_adata,
save_folder=save_folder,
query_batch_key=query_batch_key,
query_labels_key=query_labels_key,
unknown_celltype_label=unknown_celltype_label,
pretrained_scvi_path=None,
ref_labels_key=ref_labels_key,
ref_batch_key=ref_batch_key,
n_samples_per_label=n_samples_per_label)
I get the following error:
Sampling 100 per label
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-11-6957ee90eda5> in <module>
10 ref_labels_key=ref_labels_key,
11 ref_batch_key=ref_batch_key,
---> 12 n_samples_per_label=n_samples_per_label)
4 frames
/usr/local/lib/python3.7/dist-packages/popv/annotation.py in process_query(query_adata, ref_adata, save_folder, ref_labels_key, ref_batch_key, ref_cell_ontology_key, query_labels_key, query_batch_key, pretrained_scvi_path, unknown_celltype_label, training_mode, hvg, n_samples_per_label)
153
154 if training_mode == "online":
--> 155 query_adata = query_adata[:, ref_adata.var_names].copy()
156 adata = anndata.concat((ref_adata, query_adata))
157 elif training_mode == "offline":
/usr/local/lib/python3.7/dist-packages/anndata/_core/anndata.py in __getitem__(self, index)
1114     def __getitem__(self, index: Index) -> "AnnData":
1115         """Returns a sliced view of the object."""
-> 1116         oidx, vidx = self._normalize_indices(index)
1117         return AnnData(self, oidx=oidx, vidx=vidx, asview=True)
1118
/usr/local/lib/python3.7/dist-packages/anndata/_core/anndata.py in _normalize_indices(self, index)
1095
1096     def _normalize_indices(self, index: Optional[Index]) -> Tuple[slice, slice]:
-> 1097         return _normalize_indices(index, self.obs_names, self.var_names)
1098
1099     # TODO: this is not quite complete...
/usr/local/lib/python3.7/dist-packages/anndata/_core/index.py in _normalize_indices(index, names0, names1)
34 ax0, ax1 = unpack_index(index)
35 ax0 = _normalize_index(ax0, names0)
---> 36 ax1 = _normalize_index(ax1, names1)
37 return ax0, ax1
38
/usr/local/lib/python3.7/dist-packages/anndata/_core/index.py in _normalize_index(indexer, index)
100 not_found = indexer[positions < 0]
101 raise KeyError(
--> 102 f"Values {list(not_found)}, from {list(indexer)}, "
103 "are not valid obs/ var names or indices."
104 )
KeyError: "Values ['MIR1302-2HG', 'OR4G4P', 'OR4G11P', 'AL627309.1', 'AL627309.3', 'CICP27', 'AL627309.6', 'AL627309.7', 'AL627309.2', 'AL627309.5', 'RNU6-1100P', 'AL627309.4', 'FO538757.1', 'WASH9P', 'AP006222.1', 'RPL23AP24', 'AL732372.1', 'AL732372.2', 'WBP1LP7', 'CICP7', 'AL732372.3', 'RF00026', 'AL669831.3', 'AC114498.1', 'MTND1P23', 'MTND2P28', 'MTCO1P12', 'AC114498.2', 'MTCO2P12', 'MTATP8P1', 'MTATP6P1', 'MTCO3P12', 'WBP1LP6', 'CICP3', 'AL669831.1', 'RNU6-1199P', 'AL669831.2', 'AL669831.5', 'AL669831.7', 'AL669831.4', 'TUBB8P11', 'AL669831.6', 'AL645608.6', 'AL645608.2', 'AL645608.4', 'LINC02593', 'AL645608.7', 'AL645608.3', 'AL645608.1', 'AL645608.5', 'AL390719.1', 'AL645608.8', 'C1orf159', 'AL390719.3', 'AL390719.2', 'TTLL10-AS1', 'C1QTNF12', 'AL162741.1', 'LINC01786', 'INTS11', 'AL139287.1', 'NDUFB4P8', 'MRPL20-AS1', 'RN7SL657P', 'AL391244.2', 'AL391244.1', 'LINC01770', 'AL645728.2', 'AL645728.1', 'AL691432.3', 'FNDC10', 'AL691432.2', 'FO704657.1', 'AL691432.1', 'AL031282.2', 'AL031282.1', 'SLC35E2A', 'AL109917.1', 'AL391845.2', 'AL391845.1', 'AL590822.2', 'PRKCZ-AS1', 'AL590822.1', 'AL589739.1', 'AL513477.1', 'AL513477.2', 'AL139246.1', 'AL139246.4', 'AL139246.5', 'TNFRSF14-AS1', 'AL139246.2', 'AL139246.3', 'PRXL2B', 'AL831784.1', 'AC242022.2', 'AC242022.1', 'AL592464.2', 'AL592464.1', 'AL589702.1', 'AL008733.1', 'AL512383.1', 'AL590438.1', 'AL512413.1', 'AL513320.1', 'AL136528.1', 'AL136528.2', 'RF02197', 'RN7SL574P', 'AL365330.1', 'C1orf174', 'LINC0..
Also, below is the content of the data:
ref_adata.var_names
Index(['DDX11L1', 'WASH7P', 'MIR6859-1', 'MIR1302-2HG', 'MIR1302-2', 'FAM138A', 'OR4G4P', 'OR4G11P', 'OR4F5', 'AL627309.1', ... 'MT-ND4', 'MT-TH', 'MT-TS2', 'MT-TL2', 'MT-ND5', 'MT-ND6', 'MT-TE', 'MT-CYB', 'MT-TT', 'MT-TP'], dtype='object', length=58870)
query_adata.var_names
Index(['DDX11L1', 'WASH7P', 'MIR6859-1', 'MIR1302-2', 'FAM138A', 'OR4F5', 'MIR6859-2', 'OR4F29', 'OR4F16', 'FAM87B', ... 'BPY2B', 'DAZ3', 'DAZ4', 'BPY2C', 'TTTY4C', 'TTTY17C', 'GOLGA2P3Y', 'CSPG4P1Y', 'CDY1', 'TTTY3'], dtype='object', length=23681)
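The two indexes above differ in both length and content, so the mismatch can be checked directly. Below is a minimal sketch, assuming the gene names can be compared as plain strings; `compare_gene_names` is a hypothetical helper written for illustration and is not part of popv or anndata. The toy lists are the first few names from the two indexes shown above.

```python
def compare_gene_names(ref_names, query_names):
    """Split reference gene names into those present in / absent from the query."""
    query_set = set(query_names)
    # Preserve the reference order in both result lists.
    shared = [g for g in ref_names if g in query_set]
    missing = [g for g in ref_names if g not in query_set]
    return shared, missing


# Toy example using the first few names from the indexes printed above.
ref_names = ["DDX11L1", "WASH7P", "MIR6859-1", "MIR1302-2HG",
             "MIR1302-2", "FAM138A", "OR4G4P"]
query_names = ["DDX11L1", "WASH7P", "MIR6859-1", "MIR1302-2",
               "FAM138A", "OR4F5"]

shared, missing = compare_gene_names(ref_names, query_names)
print(f"{len(shared)} shared, {len(missing)} missing from query: {missing}")

# With the real objects the same helper could be called as below.
# Assumption (not verified against popv): restricting both objects to the
# shared genes before calling process_query is acceptable for the workflow.
# shared, missing = compare_gene_names(ref_adata.var_names, query_adata.var_names)
# ref_adata = ref_adata[:, shared].copy()
# query_adata = query_adata[:, shared].copy()
```

On the toy lists this reports 'MIR1302-2HG' and 'OR4G4P' as missing from the query, which matches the first entries of the KeyError above.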
Looks similar to the following issue: https://githubhot.com/repo/theislab/scanpy/issues/2095
I was able to run it successfully with training_mode="online", but the offline mode gives the above error.