Hi, I have some 10x Genomics scRNA-seq data, and I use alevin to map the reads to the transcriptome. I want to get counts for the individual isoforms. However, when I run the following code, I get a CB × gene_id matrix. How can I get a CB × transcript_id matrix instead? Thank you in advance.
nohup salmon alevin -l ISR \
-1 /mnt/sda1/houruiyan/1kPBMC/fastqfile/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L001_R1_001.fastq.gz \
/mnt/sda1/houruiyan/1kPBMC/fastqfile/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L002_R1_001.fastq.gz \
-2 /mnt/sda1/houruiyan/1kPBMC/fastqfile/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L001_R2_001.fastq.gz \
/mnt/sda1/houruiyan/1kPBMC/fastqfile/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L002_R2_001.fastq.gz \
--chromiumV3 \
-i /mnt/sda1/houruiyan/humanRef/salmon_index/salmon_index_v34/ \
-p 20 \
-o alevin_output \
--tgMap /mnt/sda1/houruiyan/humanRef/hg38txp2gene.tsv &
from vpolo.alevin import parser
import pandas as pd
pd.set_option('display.max_columns',None)
alevindf=parser.read_quants_bin('/mnt/sda1/houruiyan/1kPBMC/alevin_output')
print(alevindf)
ENSG00000259376.1 ENSG00000259755.1 ENSG00000287892.1 \
TATCGCCTCTCCCAAC 0.0 0.0 0.0
CACTTCGTCACCTACC 0.0 0.0 0.0
TCCTCTTAGCCAAGGT 0.0 0.0 0.0
CCTCCAAAGGCCCGTT 0.0 0.0 0.0
AAGACAACAGATCACT 0.0 0.0 0.0
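The column labels above all start with `ENSG`, the Ensembl gene prefix, which fits the command: alevin was given `--tgMap`, the transcript-to-gene map. A minimal sketch for checking this over the whole matrix rather than by eye, assuming the labels follow the usual Ensembl convention (`ENSG` for genes, `ENST` for transcripts); the helper name `id_level_counts` is made up for illustration:

```python
def id_level_counts(labels):
    """Count labels by Ensembl prefix: ENSG (gene), ENST (transcript), else other."""
    counts = {"gene": 0, "transcript": 0, "other": 0}
    for label in labels:
        if label.startswith("ENSG"):
            counts["gene"] += 1
        elif label.startswith("ENST"):
            counts["transcript"] += 1
        else:
            counts["other"] += 1
    return counts


# Column labels copied from the printed output above;
# with the real data, pass alevindf.columns instead.
example_columns = ["ENSG00000259376.1", "ENSG00000259755.1", "ENSG00000287892.1"]
print(id_level_counts(example_columns))
```

If every label falls under `gene`, the matrix is gene-level throughout, not just in the few columns that happen to be printed.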
Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. I've done it for you this time.
OK, thank you very much!