Dear All!
I am very new to bioinformatics and am trying to connect the phenotype of certain cells with the expression of certain genes (all exhibiting the same function). I downloaded FASTA sequences for all genes of interest from NCBI, as well as RNA-seq data from the cells. I trimmed the adapters from the RNA-seq reads and ran kallisto quant on them. Now I am trying to obtain a matrix from the kallisto output to use as input for downstream analysis.
I am looking at this tutorial: https://bioc.ism.ac.jp/packages/3.5/bioc/vignettes/tximport/inst/doc/tximport.html#import-transcript-level-estimates However, I am confused by this part:
files <- file.path(dir, "kallisto_boot", samples$run, "abundance.h5")
As the authors use the tximportData package, I am not sure what files "kallisto_boot" should include. Transcripts? The abundance.tsv and JSON files?
I cannot help with kallisto as I do not use it, but the index it expects is an entire transcriptome, not just a collection of selected genes. Download a reference transcriptome, either from NCBI/RefSeq or Gencode, then run kallisto, then tximport, normalize data with a tool of your choice, e.g. edgeR or DESeq2,
and then do whatever downstream analysis you plan to do. I assume that kallisto_boot is simply the name of the directory in the example data that holds the kallisto output tximport expects, with one subfolder per sample. Simply run the example code and see if it works.
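To make the file.path() line less mysterious, here is a minimal sketch of how the same vector could be built for your own data. The folder name kallisto_out and the sample names sampleA and sampleB are made up for illustration; the only assumption is that each kallisto quant run wrote its output (abundance.h5, abundance.tsv, run_info.json) into its own subfolder. The tximport call is guarded, so it only runs if the package is installed and the files really exist.

```r
# Hypothetical layout, one kallisto quant output folder per sample:
#   kallisto_out/sampleA/abundance.h5
#   kallisto_out/sampleA/abundance.tsv
#   kallisto_out/sampleA/run_info.json
dir <- "kallisto_out"                        # assumed parent folder, not from the tutorial
samples <- data.frame(run = c("sampleA", "sampleB"),   # assumed sample names
                      stringsAsFactors = FALSE)

# file.path() only joins its arguments with "/" and is vectorised over
# samples$run, so this yields one path per sample.
files <- file.path(dir, samples$run, "abundance.h5")
names(files) <- samples$run                  # the names become column names after import
print(files)

# Import only when tximport is installed and every file is actually present.
if (requireNamespace("tximport", quietly = TRUE) && all(file.exists(files))) {
  # txOut = TRUE keeps transcript-level estimates, so no tx2gene table is needed here.
  txi <- tximport::tximport(files, type = "kallisto", txOut = TRUE)
  print(head(txi$counts))
}
```

If reading the .h5 files is a problem, pointing the paths at abundance.tsv instead should also work, since the same vignette imports kallisto TSV files. For gene-level summaries you would drop txOut = TRUE and supply a tx2gene table that maps transcript IDs to gene IDs.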