Hi.
Single-cell gene expression profiling was performed on the ICELL8 platform, and the data are in this format:
Type gene_id gene_name sample bx READNAME um
exon ENSG00000000457 SCYL3 Miao ACGAAGAGAGG 183 1
exon ENSG00000000457 SCYL3 Miao CTGGCTGGTTG 108 3
exon ENSG00000000457 SCYL3 Miao GAACCTCTATG 1 1
exon ENSG00000000457 SCYL3 Miao GTCTATGGACC 321 4
exon ENSG00000000457 SCYL3 Miao GTCTGCGCTAT 192 2
exon ENSG00000000457 SCYL3 Miao GTTGCCGTCAG 51 2
exon ENSG00000000457 SCYL3 Miao TCGGTATACCA 500 7
exon ENSG00000000460 C1orf112 Miao AACCGCGCCAT 10 1
exon ENSG00000000460 C1orf112 Miao AACCGGCAAGC 8 2
exon ENSG00000000460 C1orf112 Miao AACCTTACGGC 310 4
exon ENSG00000000460 C1orf112 Miao AACGATCAGTT 297 2
Genes such as SCYL3 appear in multiple rows, each with a different barcode, and each barcode corresponds to a well. For example, looking up ACGAAGAGAGG in the well list shows that
the SCYL3 entry with barcode ACGAAGAGAGG belongs to sample S1. The samples S1, S2 and S4 are annotated from the well list, e.g.
("1462" "ACGAAGAGAGG" 58 12 "S1" "C2" NA NA)
Do we need data for DE analysis in the format listed below?
gene_name gene_id S1 S2 S4
SCYL3 ENSG00000000457 183 1 108
However, when we look up the barcode GTCTATGGACC, it also belongs to sample S4, so there are now two counts for SCYL3 in sample S4. The same happens for other genes, which have multiple entries within a sample.
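To make the problem concrete, here is a minimal Python sketch (standard library only) that looks up each barcode's sample and groups the SCYL3 rows by gene and sample. The mappings ACGAAGAGAGG to S1 and GTCTATGGACC to S4 come from the well list as described above; the other two mappings are inferred from the example table and are assumptions, as is reading the last two columns as read and UMI counts.

```python
from collections import defaultdict

# Rows copied from the expression table: (gene_name, barcode, READNAME, um).
# Interpreting READNAME as reads and um as UMIs is an assumption.
rows = [
    ("SCYL3", "ACGAAGAGAGG", 183, 1),
    ("SCYL3", "CTGGCTGGTTG", 108, 3),
    ("SCYL3", "GAACCTCTATG", 1, 1),
    ("SCYL3", "GTCTATGGACC", 321, 4),
]

# Barcode -> sample lookup taken from the well list.
# The CTGGCTGGTTG and GAACCTCTATG entries are inferred, not confirmed.
barcode_to_sample = {
    "ACGAAGAGAGG": "S1",
    "GAACCTCTATG": "S2",
    "CTGGCTGGTTG": "S4",
    "GTCTATGGACC": "S4",
}


def group_by_gene_and_sample(rows, barcode_to_sample):
    """Collect every (barcode, reads, umis) entry under its (gene, sample) pair."""
    grouped = defaultdict(list)
    for gene, barcode, reads, umis in rows:
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            continue  # barcode is not present in the well list
        grouped[(gene, sample)].append((barcode, reads, umis))
    return dict(grouped)


grouped = group_by_gene_and_sample(rows, barcode_to_sample)
for (gene, sample), entries in sorted(grouped.items()):
    print(gene, sample, len(entries), entries)
```

With these mappings, the pair (SCYL3, S4) collects two entries, which is exactly the situation that does not fit into a single cell of the table above.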
A sample of the well list file is below. We already know which sample (e.g. S1, S2 or S4) each row and column belongs to.
Row Col Candidate For dispense Sample Barcode State Cells1 Cells2 Signal1 Signal2 Size1 Size2 Integ Signal1 Integ Signal2 Circularity1 Circularity2 Confidence Confidence1 Confidence2 Dispense tip Drop index Global drop index Source well Sequencing count Image1 Image2
41 56 True True Miao AACGAGTATAT Good 1 0 7113 45 320085 1 0.9506 0.97 0.98 1 158 201 A1 Pos81_1-Hoechst_G10.tif Pos81_4-Rhodamine_G10.tif
33 30 True True Miao CTCCGACCTAG Good 1 0 6654 49 326046 1 0.9220821 0.9409 0.98 7 59 71 D1 Pos65_1-Hoechst_F06.tif Pos65_4-Rhodamine_F06.tif
If I understand correctly, differential expression analysis between samples needs one row per unique gene, with the counts for that gene in each sample. Which count should we use? Please correct me if I'm wrong!
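One option I can think of is to sum the counts over all barcodes of a sample, which gives one row per gene and one column per sample. The sketch below does this in plain Python; I am not sure summing is the right choice, and whether to sum the `READNAME` column or the `um` column is part of my question. The function name `build_count_matrix` is just for illustration, and the barcode-to-sample mappings other than ACGAAGAGAGG to S1 and GTCTATGGACC to S4 are assumptions.

```python
from collections import defaultdict

# A few rows from the expression table, keyed by the original column names.
rows = [
    {"gene_id": "ENSG00000000457", "gene_name": "SCYL3", "bx": "ACGAAGAGAGG", "READNAME": 183, "um": 1},
    {"gene_id": "ENSG00000000457", "gene_name": "SCYL3", "bx": "CTGGCTGGTTG", "READNAME": 108, "um": 3},
    {"gene_id": "ENSG00000000457", "gene_name": "SCYL3", "bx": "GAACCTCTATG", "READNAME": 1, "um": 1},
    {"gene_id": "ENSG00000000457", "gene_name": "SCYL3", "bx": "GTCTATGGACC", "READNAME": 321, "um": 4},
]

# Barcode -> sample lookup from the well list (partly assumed, see above).
barcode_to_sample = {
    "ACGAAGAGAGG": "S1",
    "GAACCTCTATG": "S2",
    "CTGGCTGGTTG": "S4",
    "GTCTATGGACC": "S4",
}


def build_count_matrix(rows, barcode_to_sample, samples, column):
    """Sum one count column per gene and sample; return {(gene_name, gene_id): [counts]}."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in rows:
        sample = barcode_to_sample.get(row["bx"])
        if sample is None:
            continue  # skip barcodes that are not in the well list
        totals[(row["gene_name"], row["gene_id"])][sample] += row[column]
    # Fill in 0 for samples in which the gene has no entry.
    return {
        gene: [per_sample.get(s, 0) for s in samples]
        for gene, per_sample in totals.items()
    }


samples = ["S1", "S2", "S4"]
matrix_um = build_count_matrix(rows, barcode_to_sample, samples, "um")
matrix_reads = build_count_matrix(rows, barcode_to_sample, samples, "READNAME")

print("gene_name", "gene_id", *samples)
for (gene_name, gene_id), counts in matrix_um.items():
    print(gene_name, gene_id, *counts)
```

Summing like this discards the per-cell information, so it only makes sense if the comparison is meant to be between samples rather than between individual cells.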
Appreciate your help!!
Best
Ankush
Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.
Thanks for your help!