Merging single-cell ATAC-seq datasets from multiple samples
7 weeks ago
1769mkc ★ 1.3k

How can I combine single-cell ATAC data from multiple samples into one file? As an example, this is the dataset I am starting with:

zcat GSE268807_G129_D_barcodes.tsv.gz | head
AAACAGCCAAAGCTCC-1
AAACAGCCAAGGTCGA-1
AAACAGCCACAGGAAT-1
AAACAGCCACATTAAC-1
AAACAGCCACCTGGTG-1
AAACAGCCACGCAACT-1
AAACAGCCAGCTAACC-1
AAACAGCCAGCTAATT-1
AAACATGCAAATTCGT-1
AAACATGCAACACCTA-1




zcat GSE268807_G150_D_barcodes.tsv.gz | head
AAACATGCAGACAAAC-1
AAACATGCATAAAGCA-1
AAACATGCATGAATCT-1
AAACCAACACCTACGG-1
AAACCGAAGCAGCTCA-1
AAACCGAAGCTTGCTC-1
AAACCGAAGGCGGATG-1
AAACCGCGTTTGACCT-1
AAACGCGCAAAGCCTC-1
AAACGCGCAATGCCTA-1 


zcat GSE268807_G129_D_matrix.mtx.gz | head
%%MatrixMarket matrix coordinate integer general
%metadata_json: {"software_version": "cellranger-arc-2.0.2", "format_version": 2}
131903 17176 60994557
25 1 1
33 1 1
54 1 1
60 1 1
61 1 1
63 1 1
85 1 1

zcat GSE268807_G150_D_matrix.mtx.gz | head
%%MatrixMarket matrix coordinate integer general
%metadata_json: {"software_version": "cellranger-arc-2.0.2", "format_version": 2}
114970 3068 23594006
69 1 2
137 1 1
158 1 1
248 1 1
465 1 1
469 1 1
476 1 1 


zcat GSE268807_G129_D_features.tsv.gz| head
ENSG00000243485 MIR1302-2HG     Gene Expression chr1    29553   30267
ENSG00000237613 FAM138A Gene Expression chr1    36080   36081
ENSG00000186092 OR4F5   Gene Expression chr1    65418   69055
ENSG00000238009 AL627309.1      Gene Expression chr1    120931  133723
ENSG00000239945 AL627309.3      Gene Expression chr1    91104   91105
ENSG00000239906 AL627309.2      Gene Expression chr1    140338  140339
ENSG00000241860 AL627309.5      Gene Expression chr1    149706  173862
ENSG00000241599 AL627309.4      Gene Expression chr1    160445  160446
ENSG00000286448 AP006222.2      Gene Expression chr1    266854  266855
ENSG00000236601 AL732372.1      Gene Expression chr1    360056  360057  


zcat GSE268807_G150_D_features.tsv.gz| head
ENSG00000243485 MIR1302-2HG     Gene Expression chr1    29553   30267
ENSG00000237613 FAM138A Gene Expression chr1    36080   36081
ENSG00000186092 OR4F5   Gene Expression chr1    65418   69055
ENSG00000238009 AL627309.1      Gene Expression chr1    120931  133723
ENSG00000239945 AL627309.3      Gene Expression chr1    91104   91105
ENSG00000239906 AL627309.2      Gene Expression chr1    140338  140339
ENSG00000241860 AL627309.5      Gene Expression chr1    149706  173862
ENSG00000241599 AL627309.4      Gene Expression chr1    160445  160446
ENSG00000286448 AP006222.2      Gene Expression chr1    266854  266855
ENSG00000236601 AL732372.1      Gene Expression chr1    360056  360057

For each sample I have a features file, a barcodes file, and an mtx file (data source: GSE268807).

My final objective is to make one file of each type: one barcodes file, one features file, and one mtx file.

For the barcodes and features I can think of merging the files and filtering out the duplicates, but I can't figure out how to merge the mtx files.

Any suggestions or help would be really appreciated.
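To make the question concrete, here is a minimal sketch of what such a merge might look like in plain Python (standard library only). It rests on several assumptions that are not confirmed by the data above: the function names `read_lines` and `merge_10x`, the sample labels, and the output file names are all hypothetical. Features are matched on the whole features line, and because the two matrices have different row counts (131903 vs 114970) the sketch takes the union of features and remaps row numbers. Barcodes are prefixed with a sample label instead of being de-duplicated, since the same barcode sequence in two samples refers to two different cells. Peak rows called separately per sample generally will not line up, so for the ATAC part a shared peak set may be needed before a merge like this is meaningful.

```python
import gzip
from pathlib import Path


def read_lines(path):
    """Return the non-empty lines of a gzipped text file, without line endings."""
    with gzip.open(path, "rt") as fh:
        return [line.rstrip("\r\n") for line in fh if line.strip()]


def merge_10x(samples, out_dir):
    """Merge features/barcodes/matrix triplets.

    `samples` maps a sample label to a path prefix, so that
    "<prefix>_features.tsv.gz" and so on exist (hypothetical layout,
    modelled on the file names in the question).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    feature_index = {}  # feature line -> 1-based row in the merged matrix
    barcodes_out = []   # label-prefixed barcodes, in merged column order
    plan = []           # per sample: (matrix path, row map, column offset)
    total_nnz = 0

    # Pass 1: collect the merged dimensions without loading any matrix body.
    for label, prefix in samples.items():
        features = read_lines(f"{prefix}_features.tsv.gz")
        barcodes = read_lines(f"{prefix}_barcodes.tsv.gz")
        # Local row i (0-based here) maps to a row in the union of features.
        row_map = [feature_index.setdefault(f, len(feature_index) + 1)
                   for f in features]
        mtx = f"{prefix}_matrix.mtx.gz"
        with gzip.open(mtx, "rt") as fh:
            size_line = next(ln for ln in fh if not ln.startswith("%"))
        total_nnz += int(size_line.split()[2])
        plan.append((mtx, row_map, len(barcodes_out)))
        barcodes_out.extend(f"{label}_{bc}" for bc in barcodes)

    with gzip.open(out_dir / "features.tsv.gz", "wt") as fh:
        fh.writelines(f + "\n" for f in feature_index)  # insertion order
    with gzip.open(out_dir / "barcodes.tsv.gz", "wt") as fh:
        fh.writelines(bc + "\n" for bc in barcodes_out)

    # Pass 2: stream every entry, remapping its row and shifting its column.
    with gzip.open(out_dir / "matrix.mtx.gz", "wt") as out:
        out.write("%%MatrixMarket matrix coordinate integer general\n")
        out.write(f"{len(feature_index)} {len(barcodes_out)} {total_nnz}\n")
        for mtx, row_map, offset in plan:
            with gzip.open(mtx, "rt") as fh:
                body = (ln for ln in fh
                        if ln.strip() and not ln.startswith("%"))
                next(body)  # skip this sample's "rows cols nnz" line
                for ln in body:
                    r, c, v = ln.split()
                    out.write(f"{row_map[int(r) - 1]} {int(c) + offset} {v}\n")

    return len(feature_index), len(barcodes_out), total_nnz


# Possible call for the two samples shown above (labels are arbitrary):
# merge_10x({"G129": "GSE268807_G129_D", "G150": "GSE268807_G150_D"}, "merged")
```

The two-pass layout is a design choice to avoid holding tens of millions of entries in memory: the first pass only reads the small features and barcodes files plus each matrix's size line, and the second pass streams the entries straight into the output file.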


10x • 314 views