Hi, I ran into some trouble when using kallisto to analyse the expression of homologous genes. In my case, sample1 and sample2 are two varieties with very similar genome sequences. I merged the transcript FASTA files of the two samples and then ran kallisto with the paired-end reads of sample1. Kallisto ran successfully. In the abundance.tsv file, I compared the TPM values of homologous genes from sample1 and sample2. Although many sample2 genes had much lower TPM values than their sample1 homologs, some genes showed a higher TPM value in sample2 than in sample1. This confuses me: did I make a mistake in the analysis procedure, or can kallisto not handle this kind of analysis effectively?
Thank you very much! Aifu.
While that may be the case for genomes, were the transcript files you merged of similar size/content (# of transcripts etc)? Were there more transcripts in one genome compared to the other?
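One way to check this is to count the records and total bases in each transcript FASTA file before merging. The following is only a minimal sketch: the function name `fasta_stats` is hypothetical, and it assumes plain, uncompressed FASTA text in which every record starts with a `>` header line.

```python
import sys


def fasta_stats(lines):
    """Return (number of records, total sequence length) for FASTA text lines."""
    n_records = 0
    total_len = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            n_records += 1  # each header line starts a new transcript
        else:
            total_len += len(line)  # sequence line: add its bases
    return n_records, total_len


def main(paths):
    # Print one summary line per FASTA file given on the command line.
    for path in paths:
        with open(path) as handle:
            n_records, total_len = fasta_stats(handle)
        print(f"{path}\t{n_records} transcripts\t{total_len} bases")


if __name__ == "__main__":
    main(sys.argv[1:])
```

Saved under any script name, it would be run with the two transcript FASTA paths as arguments; a large difference in transcript count or total length between the two files would be worth looking into before comparing TPM values.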
Thank you genomax, that sounds reasonable. But I still have another question. I used the sample1 transcript FASTA file (which contains every transcript of sample1) with the sample1 reads to run kallisto and obtained sample1.tpm.tsv. I also used the same sample1 transcript FASTA file with the sample2 reads to run kallisto and obtained sample2.tpm.tsv. Since every transcript now has a TPM value for both sample1 and sample2, I expected the sample1 TPM values to be generally higher than the sample2 TPM values. Because sample1 and sample2 differ by only 2-3 SNPs per kb, perhaps not every transcript will be higher in sample1. When I checked, 3241 transcripts had sample1 TPM / sample2 TPM > 1.5, but 4185 transcripts had sample1 TPM / sample2 TPM < 1/1.5. That is strange: I used the sample1 transcript FASTA file, yet it seems to fit the sample2 reads better. Goodness.
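The ratio comparison described above could be reproduced with something like the sketch below. It is not the poster's actual script. It assumes both TSV files use kallisto's standard abundance.tsv layout (tab-separated, with `target_id` and `tpm` columns); the function names and the `min_tpm` filter are hypothetical additions for illustration.

```python
import csv


def read_tpm(lines):
    """Map target_id -> tpm from abundance.tsv-style lines (tab-separated, with header)."""
    reader = csv.DictReader(lines, delimiter="\t")
    return {row["target_id"]: float(row["tpm"]) for row in reader}


def count_fold_changes(tpm_a, tpm_b, fold=1.5, min_tpm=0.0):
    """Count transcripts higher in A, higher in B, or similar, at the given fold threshold.

    min_tpm is an optional (assumed) filter: transcripts whose TPM is below it
    in both samples are skipped. The default of 0.0 skips nothing.
    """
    higher_a = higher_b = similar = 0
    for target_id in tpm_a.keys() & tpm_b.keys():
        a, b = tpm_a[target_id], tpm_b[target_id]
        if max(a, b) < min_tpm:
            continue
        # Comparing a > fold * b is the same as a / b > fold, but safe when b == 0.
        if a > fold * b:
            higher_a += 1
        elif b > fold * a:
            higher_b += 1
        else:
            similar += 1
    return higher_a, higher_b, similar


def compare_files(path_a, path_b, fold=1.5):
    # Convenience wrapper for two TSV files on disk.
    with open(path_a, newline="") as fa, open(path_b, newline="") as fb:
        return count_fold_changes(read_tpm(fa), read_tpm(fb), fold=fold)
```

Calling `compare_files("sample1.tpm.tsv", "sample2.tpm.tsv")` would return the three counts. When reading them, keep in mind that TPM values within a single run are scaled to sum to one million, so over the same transcript set they cannot all be higher in one sample than in the other.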
I am looking for more information. The method framework is at: https://www.rna-seqblog.com/iihg-intro-to-kallisto-for-rna-seq/.