Hi,
I am following the instructions for the PASTA package (https://satijalab.org/seurat/articles/pasta_vignette.html). This package infers alternative polyadenylation usage from scRNA-seq data. Among its many input files, it also requires a fragment file. The authors state the following about obtaining this file:
A fragment file (produced from the aligned BAM file, we recommend using the blocks function in sinto).
So I successfully installed Sinto, but when I followed the instructions on the Sinto website I got a file with 0 bytes. As I understand it, you are supposed to use the possorted_genome_bam.bam file from the Cell Ranger output.
This is my command:
sinto fragments -b possorted_genome_bam.bam -f my.bed --barcode_regex "[^:]*" -p 4
and this is the output:
Function run_fragments called with the following arguments:
bam possorted_genome_bam.bam
fragments my.bed
min_mapq 30
nproc 4
barcodetag CB
cells None
barcode_regex [^:]*
use_chrom (?i)^chr
max_distance 5000
min_distance 10
chunksize 500000
shift_plus 4
shift_minus -5
collapse_within False
func <function run_fragments at 0x7fe9a85dfa60>
Function completed in 0.0 m 0.59 s
I also ran it without specifying barcode_regex but I got the same result.
In case you wonder what possorted_genome_bam.bam looks like, this is the output of samtools view possorted_genome_bam.bam | head -n 3:
SRR17534016.9003356 16 chr1 3014927 0 63M30S 0 0 CCGACTAGGCCATCTTTTGATACATATGCAGCTAGAGACAAGAGCTCCGGGGTACTAGTTAGTCCCATGTACTCTGCGTTGATACCACTGCTT CCCCCCCCCC;C-CCCCCC-CCCCCCCCC-CC;CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC NH:i:10 HI:i:1 AS:i:62 nM:i:0 ts:i:30 RG:Z:young2F:0:1:unknow_flowcell:0 RE:A:I xf:i:0 CR:Z:CCGTTTATCGCTGTTC CY:Z:CC-CCCCCCCCCCCCC CB:Z:CCGTTCATCGCTGTTC-1 UR:Z:TTCAAGGTTCCA UY:Z:CCCCCCCCCCCC UB:Z:TTCAAGGTTCCA
SRR17534016.11370570 16 chr1 3018673 1 92M 0 0 TTTGTTTTAGGATAAAATGTTCTGTAGATATCTGTCAAGTCCATTTGTTTCATCACTTCTGTTAGTTTCACTGTGTCCCTGTTTAGTTTCTG CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC NH:i:3 HI:i:1 AS:i:90 nM:i:0 RG:Z:young2F:0:1:unknow_flowcell:0 RE:A:I xf:i:0 CR:Z:CCGTAGGGTAGGATAT CY:Z:CCCCCCCCCCCCCCCC CB:Z:CCGTAGGGTAGGATAT-1 UR:Z:TGAATGGCTTCT UY:Z:CCCCCCCCCCCC UB:Z:TGAATGGCTTCT
SRR17534018.9851676 16 chr1 3018686 1 63M30S 0 0 AAAATGTTCTGTAGATATCTGTCAAGTCCATTTGTTTCATCACTTCTGTTAGTTTCACTGTGTCCCATGTACTCTGCGTTGATACCACTGCTT CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC NH:i:3 HI:i:1 AS:i:62 nM:i:0 ts:i:30 RG:Z:young2F:0:1:unknow_flowcell:0 RE:A:I xf:i:0 CR:Z:CACGAATAGACCAGCA CY:Z:CCCCCCCCCCCCCCCC CB:Z:CACGAATAGACCAGCA-1 UR:Z:AGCTCCCGGGAT UY:Z:CCCC;CCCCCCC UB:Z:AGCTCCCGGGAT
I understand that Sinto was written for scATAC-seq, and I guess some further steps are needed to prepare the BAM file for Sinto. I am thankful for any help I can get.
What sort of aligner are you using? My guess is that the MAPQ in sinto is set too high :(
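To make the MAPQ guess concrete, here is a minimal Python sketch that compares the MAPQ column of the three records posted above with the min_mapq value of 30 shown in the sinto log. The records are abbreviated to their first five columns, and the helper name mapq is only illustrative. Note that all three records carry NH > 1 (multi-mapped), so they say little about how uniquely mapped reads in the same file would fare.

```python
# First five SAM columns (QNAME, FLAG, RNAME, POS, MAPQ) of the three
# records shown in the original post; the remaining columns are omitted.
records = [
    "SRR17534016.9003356 16 chr1 3014927 0",
    "SRR17534016.11370570 16 chr1 3018673 1",
    "SRR17534018.9851676 16 chr1 3018686 1",
]

MIN_MAPQ = 30  # the min_mapq value printed in the sinto log


def mapq(sam_line: str) -> int:
    """Return the MAPQ value (5th column) of a SAM record."""
    return int(sam_line.split()[4])


# Keep only the records that would survive a MAPQ >= 30 filter.
passing = [r for r in records if mapq(r) >= MIN_MAPQ]
print(f"{len(passing)} of {len(records)} records pass MAPQ >= {MIN_MAPQ}")
```

None of the three example records would pass a threshold of 30, which is consistent with the guess, although three reads are far too few to explain an empty output for the whole file.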
Hi, I am using the output from Cell Ranger, which itself uses STAR. I just want to emphasize that I am dealing with RNA-seq and not ATAC-seq.
I ran
the output is:
and the file is again 0 bytes. At least it was running longer now :P
I also tried with --shift_plus 0 --shift_minus 0, but the result is the same.
Hi, sorry for taking long to respond.
Since it is RNA (and I assume 3' based at that), sinto won't work. sinto requires paired-end reads in order to work (in 3' scRNA-seq, one of the reads usually just stores the cell barcode and UMI rather than cDNA). Do you really need a fragment file for RNA-seq?
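As a rough illustration of the paired-end point, the Python sketch below decodes the FLAG column (16) of the three records posted above using the bit definitions from the SAM specification. Whether sinto tests exactly these bits internally is an assumption on my part; the sketch only shows that the posted reads are not flagged as paired.

```python
# Bit definitions from the SAM specification.
FLAG_PAIRED = 0x1    # template has multiple segments (paired-end)
FLAG_REVERSE = 0x10  # read is reverse-complemented

# FLAG column of the three records shown in the original post.
flags = [16, 16, 16]


def is_paired(flag: int) -> bool:
    """True if the paired-end bit is set in a SAM FLAG value."""
    return bool(flag & FLAG_PAIRED)


def is_reverse(flag: int) -> bool:
    """True if the reverse-strand bit is set in a SAM FLAG value."""
    return bool(flag & FLAG_REVERSE)


for flag in flags:
    print(f"flag={flag} paired={is_paired(flag)} reverse={is_reverse(flag)}")
```

All three reads decode as single-end, reverse-strand alignments, so a tool that builds fragments from read pairs would have nothing to pair up here.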
Satija lab is recommending that sinto be used in the link provided by OP in the original post, though. What they say is to use the blocks function in sinto, and I am not sure how that relates to the sinto fragments command.
The Satija lab just commented on GitHub.
I will try their code and see if it works. In any case I will let you know in this thread.