counting from a single BAM file with multiple samples
14 months ago
dr-device ▴ 10

I've got a BAM file that contains RNA-seq data from about 20 samples. The samples are distinguished from one another by a cell barcode, which is appended to the end of the read ID in the first column after an underscore (for example, READ:ID:WITH:ABUNCH:OF:COLONS_GCCCTTAT).

Is there any way to use featureCounts or UMI-tools to generate a gene count matrix with each sample in a different column? I know UMI-tools can do that, but I don't have any UMIs (unique molecular identifiers), just cell barcodes, and I don't see a way of using it without UMIs.

cell-barcode RNA-seq • 1.3k views

You mention cell barcodes. Is this an scRNA-seq BAM?

Yes, it's from a plate-based protocol, but with literally only 20 cells. I'm therefore planning on just using edgeR for my analysis rather than doing traditional single-cell analyses.

You can do something like what has been discussed here.

I think you can use UMI-tools to do this with something like --extract-method=regex --bc-pattern='(?P<cell_1>.{8})$' --ignore-umi. However, I don't think --extract-method=regex is supported natively by umi_tools count.
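To illustrate what a barcode pattern of this kind would capture, here is a minimal Python sketch using the standard re module rather than UMI-tools itself. The 8-base barcode length is an assumption taken from the example read ID in the question, and extract_barcode is a hypothetical helper, not part of any tool.

```python
import re

# Assumption: the cell barcode is the final 8 characters of the read ID,
# as in the example READ:ID:WITH:ABUNCH:OF:COLONS_GCCCTTAT.
BARCODE_PATTERN = re.compile(r"(?P<cell_1>.{8})$")


def extract_barcode(read_id):
    """Return the trailing 8-character barcode, or None if the ID is too short."""
    match = BARCODE_PATTERN.search(read_id)
    return match.group("cell_1") if match else None


print(extract_barcode("READ:ID:WITH:ABUNCH:OF:COLONS_GCCCTTAT"))
```

Checking the pattern on a handful of real read IDs this way is a cheap sanity test before passing it to any tool.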

13 months ago
dr-device ▴ 10

I ended up splitting the BAM file by cell barcode using my own custom script, then running featureCounts on each resulting BAM file individually before combining the results into a count matrix in R.
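The custom script itself was not posted, so the following is only a rough sketch of how such a workflow could look, not the script used above. It assumes the alignments have been converted to SAM text (for example with samtools view -h), that the barcode is everything after the last underscore in the read ID, and that each per-sample featureCounts table has a '#' comment line, a header row starting with Geneid, and the count in the last column. The merge step is done in Python here rather than R, and both function names are hypothetical.

```python
from collections import defaultdict


def split_sam_by_barcode(sam_lines):
    """Group SAM records by the barcode after the last underscore in the read ID.

    Header lines (starting with '@') are copied into every group so that each
    per-sample output is still a valid SAM file.
    """
    header = []
    records = defaultdict(list)
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            header.append(line)
            continue
        read_id = line.split("\t", 1)[0]
        # Assumption: the barcode is everything after the final underscore.
        barcode = read_id.rsplit("_", 1)[-1]
        records[barcode].append(line)
    return {bc: header + recs for bc, recs in records.items()}


def merge_featurecounts(tables):
    """Merge per-sample featureCounts tables into {gene: {sample: count}}.

    `tables` maps a sample name to the lines of that sample's output file.
    """
    matrix = defaultdict(dict)
    for sample, lines in tables.items():
        for line in lines:
            fields = line.rstrip("\n").split("\t")
            # Skip blank lines, the '#' comment line and the header row.
            if not line.strip() or line.startswith("#") or fields[0] == "Geneid":
                continue
            matrix[fields[0]][sample] = int(fields[-1])
    return dict(matrix)
```

Each group returned by split_sam_by_barcode could be written to its own file, converted back to BAM, and passed to featureCounts; the resulting tables can then be read line by line and handed to merge_featurecounts. Reads whose IDs contain no underscore would end up in a group named after the whole read ID, so it is worth checking the group names before counting.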
