I've got a BAM file that contains RNA-seq data from about 20 samples. The samples are distinguished from one another by a cell barcode, which is appended to the end of the read ID in the first column after an underscore (for example, READ:ID:WITH:ABUNCH:OF:COLONS_GCCCTTAT).
Is there any way to use featureCounts or UMI-tools to generate a gene count matrix with each sample in a different column? I know UMI-tools can do that, but I don't have any UMIs (unique molecular identifiers), just cell barcodes, and I don't see a way of using it without UMIs.
You mention cell barcodes. Is this an scRNA-seq BAM?
Yes, it's from a plate-based protocol but with literally only 20 cells. I'm therefore planning on just using edgeR for my analysis rather than doing traditional single-cell analyses.
You can do something like what has been discussed here.
I think you can use UMI-tools to do this with something like
--extract-method=regex --bc-pattern='(?P<cell_1>.{8})$' --ignore-umi
However, I don't think --extract-method=regex is supported natively by umi_tools count.
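If UMI-tools turns out not to fit, one fallback is to do the tally yourself, since without UMIs this is just counting reads per gene per barcode. Below is a rough sketch in Python, not a tested pipeline. It assumes you already have (read ID, gene) pairs, for example parsed from the per-read assignment output that featureCounts can write with its -R option, and that the barcode is always the last 8 bases after the final underscore. The function names (barcode_of, count_matrix, to_tsv) are made up for illustration.

```python
import re
from collections import Counter, defaultdict

# Assumption: the cell barcode is the final 8 bases of the read ID,
# separated from the rest by an underscore (as in the question).
BARCODE_RE = re.compile(r"_(?P<cell>[ACGTN]{8})$")


def barcode_of(read_id):
    """Return the trailing cell barcode, or None if the read ID has none."""
    match = BARCODE_RE.search(read_id)
    return match.group("cell") if match else None


def count_matrix(assignments):
    """Tally reads per gene and per barcode from (read_id, gene) pairs.

    Pairs with no barcode or no assigned gene (gene is None) are skipped.
    """
    counts = defaultdict(Counter)
    for read_id, gene in assignments:
        barcode = barcode_of(read_id)
        if barcode is not None and gene is not None:
            counts[gene][barcode] += 1
    return counts


def to_tsv(counts):
    """Render the counts as a tab-separated table, one column per barcode."""
    barcodes = sorted({bc for row in counts.values() for bc in row})
    lines = ["gene\t" + "\t".join(barcodes)]
    for gene in sorted(counts):
        row = "\t".join(str(counts[gene][bc]) for bc in barcodes)
        lines.append(gene + "\t" + row)
    return "\n".join(lines)
```

The resulting table has genes as rows and barcodes as columns, which is the shape edgeR expects for a count matrix once it is read into R.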