Question

Kallisto and homologous genes expression

6.3 years ago

afli ▴ 190

Hi, I ran into trouble using kallisto to analyse the expression of homologous genes. In my case, sample1 and sample2 are two varieties with very similar genome sequences. I merged the transcript FASTA files of the two samples, then ran kallisto with the paired-end reads of sample1. Kallisto ran successfully. In the abundance.tsv file, I compared the TPM values of the homologous genes from sample1 and sample2. Although many sample2 genes had much lower TPM values than their sample1 homologs, for a portion of genes the sample2 copy showed a higher TPM value than the sample1 copy. I am not sure whether I made a mistake in the analysis procedure, or whether kallisto cannot handle this kind of analysis effectively.

Thank you very much! Aifu.
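The comparison described above can be sketched in Python. This is only an illustration under stated assumptions: it relies on the standard abundance.tsv columns (`target_id`, `length`, `eff_length`, `est_counts`, `tpm`), while the transcript IDs, the homolog pair list, the pseudocount of 1 and the helper names `read_tpm` and `homolog_log2_ratios` are all hypothetical and not part of kallisto.

```python
import csv
import io
import math


def read_tpm(handle):
    """Map target_id -> tpm from an abundance.tsv file handle."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["target_id"]: float(row["tpm"]) for row in reader}


def homolog_log2_ratios(tpm, pairs, pseudocount=1.0):
    """log2 of (first + pseudocount) / (second + pseudocount) for each pair."""
    ratios = {}
    for first, second in pairs:
        num = tpm.get(first, 0.0) + pseudocount
        den = tpm.get(second, 0.0) + pseudocount
        ratios[(first, second)] = math.log2(num / den)
    return ratios


# Toy table in the abundance.tsv column layout (all values are invented).
demo = io.StringIO(
    "target_id\tlength\teff_length\test_counts\ttpm\n"
    "s1_geneA\t1000\t800\t300\t30.0\n"
    "s2_geneA\t1000\t800\t10\t1.0\n"
    "s1_geneB\t1200\t1000\t40\t3.0\n"
    "s2_geneB\t1200\t1000\t90\t7.0\n"
)

# Hypothetical homolog pairs: (sample1 copy, sample2 copy).
pairs = [("s1_geneA", "s2_geneA"), ("s1_geneB", "s2_geneB")]

tpm = read_tpm(demo)
ratios = homolog_log2_ratios(tpm, pairs)
for pair, value in ratios.items():
    print(pair, round(value, 2))
```

A positive value means the sample1 copy received more of the signal, a negative value means the sample2 copy did. With a real run, the `io.StringIO` object would be replaced by an opened abundance.tsv file.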

kallisto homologous genes ASE • 1.4k views


In my case, sample1 and sample2 are two varieties with very similar genome sequence. I merged the transcript fasta files of these two samples,

While that may be the case for genomes, were the transcript files you merged of similar size/content (# of transcripts etc)? Were there more transcripts in one genome compared to the other?
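A quick way to answer this is to count the records in each transcript FASTA before merging. A minimal sketch, assuming plain uncompressed FASTA text; `fasta_stats` and the toy sequences are made up for illustration:

```python
import io


def fasta_stats(handle):
    """Return (number of records, total bases) for a FASTA file handle."""
    n_records = 0
    total_bases = 0
    for line in handle:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            n_records += 1  # each header line starts one record
        else:
            total_bases += len(line)
    return n_records, total_bases


# Toy inputs standing in for the two transcript files (invented sequences).
sample1_fa = io.StringIO(">t1\nACGTACGT\n>t2\nACGT\nAC\n")
sample2_fa = io.StringIO(">t1\nACGTACGA\n")

stats1 = fasta_stats(sample1_fa)
stats2 = fasta_stats(sample2_fa)
print("sample1:", stats1)
print("sample2:", stats2)
```

With real data, each `io.StringIO` object would be replaced by an opened transcript FASTA file. A large difference in record counts between the two files would suggest that many transcripts have no counterpart in the merged reference.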

6.3 years ago by GenoMax 147k

Thank you genomax, this sounds reasonable. But I still have another question. I used the sample1 transcript FASTA file (which contains every transcript of sample1) together with the sample1 reads to run kallisto and obtained sample1.tpm.tsv. I also used the same sample1 transcript FASTA file with the sample2 reads to run kallisto and obtained sample2.tpm.tsv. Since every transcript now has a TPM value for both sample1 and sample2, I expected the sample1 TPM values to be generally higher than the sample2 TPM values. Because sample1 and sample2 differ by 2-3 SNPs per kb, perhaps not every transcript would be higher in sample1. When I checked, 3241 transcripts had sample1 TPM / sample2 TPM > 1.5, but 4185 transcripts had sample1 TPM / sample2 TPM < 1/1.5. That is strange: I used the sample1 transcript FASTA file, yet it seems to fit the sample2 reads better. Goodness.

I am looking for more information. The method framework is at: https://www.rna-seqblog.com/iihg-intro-to-kallisto-for-rna-seq/.
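The count described above can be reproduced with a short script. This is a sketch only: it assumes both tables use the standard kallisto columns (`target_id`, `tpm`), and the `min_tpm` filter, the helper names and the toy values are my own additions. Note that TPM values within each run sum to one million by definition, so the ratios compare relative shares within each sample rather than absolute read support.

```python
import csv
import io


def read_tpm(handle):
    """Map target_id -> tpm from a kallisto abundance table handle."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["target_id"]: float(row["tpm"]) for row in reader}


def count_fold_changes(tpm_a, tpm_b, fold=1.5, min_tpm=1.0):
    """Count transcripts whose TPM differs by more than `fold` in either direction."""
    higher_a = 0
    higher_b = 0
    for tid in tpm_a.keys() & tpm_b.keys():
        a, b = tpm_a[tid], tpm_b[tid]
        if a < min_tpm and b < min_tpm:
            continue  # assumption: skip transcripts barely expressed in both
        if a > fold * b:
            higher_a += 1  # a / b > fold, written without dividing by zero
        elif b > fold * a:
            higher_b += 1  # a / b < 1 / fold
    return higher_a, higher_b


header = "target_id\tlength\teff_length\test_counts\ttpm\n"
# Toy tables standing in for sample1.tpm.tsv and sample2.tpm.tsv (invented values).
sample1 = io.StringIO(
    header
    + "t1\t1000\t800\t300\t30.0\n"
    + "t2\t1000\t800\t50\t5.0\n"
    + "t3\t1000\t800\t100\t10.0\n"
    + "t4\t1000\t800\t1\t0.1\n"
)
sample2 = io.StringIO(
    header
    + "t1\t1000\t800\t100\t10.0\n"
    + "t2\t1000\t800\t200\t20.0\n"
    + "t3\t1000\t800\t110\t11.0\n"
    + "t4\t1000\t800\t5\t0.5\n"
)

counts = count_fold_changes(read_tpm(sample1), read_tpm(sample2))
print("higher in sample1, higher in sample2:", counts)
```

Comparing `a > fold * b` instead of dividing avoids errors when a transcript has a TPM of zero in one sample.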

6.3 years ago by afli ▴ 190