I am trying to build an index for a single-nucleus experiment using Kallisto, and I was wondering if someone could please help break down the following for kb ref.
I am a bit confused about what exactly t2g.txt, cdna_t2c.txt, and intron_t2c.txt are for.
I am also not 100% sure about the difference between the lamanno and nucleus workflows.
Looking back at kb --help, I understand that all of these files are supposed to be generated:
required arguments:
-i INDEX Path to the kallisto index **to be constructed.**
-g T2G Path to transcript-to-gene mapping **to be generated**
-f1 FASTA [Optional with -d] Path to the cDNA FASTA (lamanno, nucleus) or mismatch FASTA (kite) **to be generated**
required arguments for `lamanno` and `nucleus` workflows:
-f2 FASTA Path to the intron FASTA **to be generated**
-c1 T2C Path **to generate** cDNA transcripts-to-capture
-c2 T2C Path **to generate** intron transcripts-to-capture
Briefly, t2g.txt contains the transcripts-to-gene mappings, cdna_t2c.txt contains all the cDNA (spliced) transcripts, and intron_t2c.txt contains all the "intronic" (i.e. unspliced) transcripts.
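To check whether these three files came out as described (or came out empty), a quick inspection script can help. This is only a minimal sketch: the function name `summarize_kb_ref_outputs` is made up for illustration, and it assumes t2g.txt is tab-separated with the transcript ID in the first column and the gene ID in the second, and that each t2c file lists one transcript ID per line. Check a few lines of your own files before relying on it.

```python
from pathlib import Path


def summarize_kb_ref_outputs(t2g_path, cdna_t2c_path, intron_t2c_path):
    """Return basic counts for the three mapping files (hypothetical helper)."""

    def nonblank_lines(path):
        # Read the whole file and drop blank lines.
        return [ln for ln in Path(path).read_text().splitlines() if ln.strip()]

    # Assumption: t2g columns are transcript ID, gene ID, then optional extras.
    rows = [ln.split("\t") for ln in nonblank_lines(t2g_path)]
    transcript_to_gene = {r[0]: r[1] for r in rows if len(r) >= 2}

    # Assumption: each t2c file holds one transcript ID per line.
    cdna_ids = set(nonblank_lines(cdna_t2c_path))
    intron_ids = set(nonblank_lines(intron_t2c_path))

    paths = (t2g_path, cdna_t2c_path, intron_t2c_path)
    return {
        "t2g_transcripts": len(transcript_to_gene),
        "t2g_genes": len(set(transcript_to_gene.values())),
        "cdna_transcripts": len(cdna_ids),
        "intron_transcripts": len(intron_ids),
        # Files with no non-blank lines at all, i.e. the "EMPTY" case.
        "empty_files": [str(p) for p in paths if not nonblank_lines(p)],
    }
```

Calling `summarize_kb_ref_outputs("t2g.txt", "cdna_t2c.txt", "intron_t2c.txt")` returns a small dictionary; a non-empty `empty_files` list means kb ref did not write anything to those files.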
nucleus is used for single-nucleus data while lamanno is used for RNA velocity. There are subtle differences between the two workflows (e.g. for nucleus, the spliced+unspliced matrices are added up while for RNA velocity, separate matrices are generated that can be fed directly into the velocyto workflow).
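For reference, here is a rough sketch of how the kb ref call for either workflow could be put together. The `-i`, `-g`, `-f1`, `-f2`, `-c1` and `-c2` flags are the ones quoted from kb --help earlier in this thread. The `--workflow` flag and the trailing genome FASTA and GTF arguments are my understanding of the kb-python command line and are not shown in that excerpt, so confirm them (and the workflow names, which may differ between versions) against `kb ref --help` for your installation. The helper name and all file names are placeholders.

```python
import shlex


def build_kb_ref_command(workflow, genome_fasta, gtf, out_prefix="ref"):
    """Assemble (but do not run) a kb ref command line. Hypothetical helper."""
    if workflow not in ("lamanno", "nucleus"):
        raise ValueError("workflow must be 'lamanno' or 'nucleus'")
    return [
        "kb", "ref",
        # All six paths below are OUTPUTS that kb ref is expected to write.
        "-i", f"{out_prefix}_index.idx",
        "-g", f"{out_prefix}_t2g.txt",
        "-f1", f"{out_prefix}_cdna.fa",
        "-f2", f"{out_prefix}_intron.fa",
        "-c1", f"{out_prefix}_cdna_t2c.txt",
        "-c2", f"{out_prefix}_intron_t2c.txt",
        # Assumption: workflow is selected with --workflow.
        "--workflow", workflow,
        # Assumption: the INPUTS (genome FASTA and GTF) come last.
        genome_fasta, gtf,
    ]


if __name__ == "__main__":
    # Placeholder input names; substitute your own genome and annotation.
    print(shlex.join(build_kb_ref_command("nucleus", "genome.fa", "genes.gtf")))
```

The point of the sketch is the split between inputs and outputs: only the genome FASTA and the GTF need to exist beforehand, while everything passed to the flags is a path to be created.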
Do t2g.txt, cdna.fa, intron.fa, cdna_t2c.txt, and intron_t2c.txt get generated?
One of the reasons I am confused is that I was sent files that were built using the Comparative Annotation Toolkit with a few additional items, and I haven't fully made sense of everything.
However, among them I see several t2g.txt files, such as cDNA_introns_t2g.txt, introns_t2g.txt, and cDNA_t2g.txt, as well as cDNA.fa, introns.fa, etc.
So part of me thought these were needed when building the index.
Hi there,
First of all, thank you for opening this issue; it helped me better understand the nature of the command parameters. But here is my issue:
I am building a mouse index using kb ref with the following command line:
In my case, after running kb ref, the t2g.txt, cDNA_t2c.txt and introns_t2c.txt files are... EMPTY. Looking back at kb --help, I understand that all of these files are supposed to be generated.
Something is very unclear to me: what could I be doing wrong? How did you solve your problem, GenoMax?
Thanks in advance for your help
Please create a new question rather than posting this as an answer to an existing question.
Also, you cross-posted here: https://github.com/pachterlab/kallistobustools/issues/44
(and I answered there)