Hello, I have a pathosystem of a host and an obligate pathogen. There are three host strains, which are treated with two different pathogen strains. I have RNA-seq reads for the treatments described below, with 3 replicates for each treatment:
a. Host 1 is treated with Pathogen 1 and Host 2 is also treated with pathogen 1.
b. Host 1 is treated with Pathogen 2 and Host 3 is also treated with pathogen 2.
The experimental setup I have in mind is a pairwise comparison between the two treatments in (a), and likewise in (b). What I'm interested in is finding the genes that are strongly up- or down-regulated in the pathogen strains during infection.
We don't have a control for the pathogen, as it is an obligate parasite. So I would like to know whether this comparison between treatments is sensible.
Would there be any other way to handle this situation? Any suggestions would be very helpful.
I'm not from the pathogen world. What does this mean in terms of availability of RNASeq data?
We cannot make a control of the pathogen and sequence it, as it always comes only with the host. Hence we don't have RNA-seq data for the pathogen alone.
But you have it for the host alone and can therefore filter out the intersection between transcriptomes, leaving it with the host transcriptome only I guess.
I have generated the transcriptomes of pathogens 1 and 2 by filtering out the host reads. However, in order to do differential expression and look for the genes that are activated/repressed during infection, I suppose the usual way is to have control reads. The reads that you filter out from the host are not really a control, as the pathogen is with the host from the start.
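To make the "count them separately" idea concrete, here is a minimal sketch of splitting one combined count table into host and pathogen parts. It assumes reads were counted against a combined annotation in which gene IDs carry an organism prefix; the prefixes `HOST_` and `PATH_`, the gene names, and all the numbers are made up for illustration and are not from the actual data.

```python
# Hypothetical gene-ID prefixes; a real annotation will have its own naming scheme.
HOST_PREFIX = "HOST_"
PATHOGEN_PREFIX = "PATH_"

# Toy raw counts (made-up numbers): gene ID -> counts in three samples.
counts = {
    "HOST_g1": [120, 98, 110],
    "HOST_g2": [15, 22, 18],
    "PATH_g1": [40, 35, 52],
    "PATH_g2": [0, 3, 1],
}


def split_by_organism(table):
    """Split a combined count table into (host, pathogen) tables by ID prefix."""
    host = {g: c for g, c in table.items() if g.startswith(HOST_PREFIX)}
    pathogen = {g: c for g, c in table.items() if g.startswith(PATHOGEN_PREFIX)}
    return host, pathogen


host_counts, pathogen_counts = split_by_organism(counts)

# Per-sample pathogen library size: column sums over pathogen genes only.
pathogen_lib_sizes = [sum(col) for col in zip(*pathogen_counts.values())]
print(pathogen_lib_sizes)
```

Only the pathogen table would then go into the differential expression step, so that the pathogen samples are normalized against pathogen counts rather than against the mixed host-plus-pathogen library.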
Maybe more of a philosophical question:
If you find the pathogen only within the host, how can you tell what is up/down regulated upon infection? If I get it right, this organism lives only IF it infects a host. Therefore, outside of it, you shouldn't have expression at all.
Am I missing something?
I am not sure if I explained it clearly.
An obligate parasite or holoparasite is a parasitic organism that cannot complete its life-cycle without exploiting a suitable host. If an obligate parasite cannot obtain a host it will fail to reproduce. (From wiki.)
In our case we have the parasite, and a few hosts are susceptible, letting it enter the cell and destroy it completely, whereas a few are able to resist the parasite, not letting it enter the host cell. So we are basically interested in the genes that are highly expressed under these resistant (R) and susceptible (S) conditions.
My question is: is it biologically sensible to compare just the treatments in a program like edgeR or DESeq2?
So you are actually interested in the host genes that are up-regulated (and not the parasite genes or both)? Are you able to tell the host/parasite gene expression apart (count them separately)?
You don't really have an alternative option (and you have already done the experiment), so why not do the comparisons as you suggested originally and see what you get?
Edit: You are forced to assume that the pathogen is going to behave the same way with different hosts (if you only need host genes), which is likely not true. Oh well.
I think OP mentioned that he can indeed distinguish between host and pathogen transcriptome and that the focus is on the latter.
This statement indicates that the interesting biology is happening in the host not the parasite. OP will eventually provide some clarity.
The host genes have already been analyzed, and we have already looked into the story of the defense genes in the host. Since we have no idea about the parasite, I have started to dig into the parasite side of the story to look for genes that are highly expressed and regulated. For this I don't take the variation between hosts into account. So with this setup I was not sure whether a treatment-vs-treatment comparison is usual or not.
(H1+P1) ~ (H1+P2)
may be the only valid comparison here in that case. Unless you want to treat the hosts as invariant and do the ((H1+P1)+(H2+P1)) ~ ((H1+P2)+(H3+P2)) comparison.
Thank you. I will try it.
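As a rough sketch of the two comparisons suggested above, the sample groupings can be written out explicitly. The sample names and the `select` helper are hypothetical. The layout (four host/pathogen combinations, 3 replicates each) follows the original post, where Host 3 is treated with Pathogen 2.

```python
# Four host/pathogen combinations from the original post, 3 replicates each.
COMBINATIONS = [("H1", "P1"), ("H2", "P1"), ("H1", "P2"), ("H3", "P2")]
REPLICATES = 3

# Sample names are invented for illustration.
samples = [
    {"sample": f"{host}_{pathogen}_rep{rep}", "host": host, "pathogen": pathogen}
    for host, pathogen in COMBINATIONS
    for rep in range(1, REPLICATES + 1)
]


def select(rows, **criteria):
    """Return names of samples whose columns match every key=value criterion."""
    return [r["sample"] for r in rows if all(r[k] == v for k, v in criteria.items())]


# Comparison 1: host held fixed (H1), so only the pathogen strain differs.
fixed_host = {
    "P1": select(samples, host="H1", pathogen="P1"),
    "P2": select(samples, host="H1", pathogen="P2"),
}

# Comparison 2: hosts treated as invariant, samples pooled by pathogen.
pooled = {
    "P1": select(samples, pathogen="P1"),
    "P2": select(samples, pathogen="P2"),
}

# Which hosts each pathogen was seen with. H2 only meets P1 and H3 only meets P2,
# so in the pooled comparison host and pathogen effects are partly confounded.
hosts_per_pathogen = {
    p: sorted({r["host"] for r in samples if r["pathogen"] == p})
    for p in ("P1", "P2")
}

print(len(fixed_host["P1"]), "vs", len(fixed_host["P2"]))
print(len(pooled["P1"]), "vs", len(pooled["P2"]))
print(hosts_per_pathogen)
```

The fixed-host comparison is 3 vs 3 samples with a clean interpretation. The pooled one is 6 vs 6, but any P1-vs-P2 difference there could also reflect H2-vs-H3 differences.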
It makes perfect sense to compare the two conditions in DESeq2 / edgeR. There is no strict necessity of having control-treatment pairs. The question this comparison would answer is about the transcriptional differences between the two states of the pathogens. Do you actually want to compare the host transcriptome? In that case it would be worth adding non-infected samples.
I think the bottleneck is not edgeR or DESeq2 but the experimental design, or more broadly speaking, Mother Nature being cruel to your project.
For DESeq2 or analogous software, there is no complication in finding a set of DEGs given a p-value threshold, a correction method and raw counts. The point is more to understand whether what you're feeding the programs makes sense or not. I think that comparing two pathogenic infections without a healthy one is a slightly weak design, but that's what you can get with the organism you work on.
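To illustrate the "p-value threshold plus correction method" step, here is a minimal pure-Python sketch of the Benjamini-Hochberg adjustment, one commonly used correction. The gene names and p-values are made up, and this is not how DESeq2 or edgeR compute the p-values themselves; it only shows what happens to them afterwards.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping adjusted values monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values for five hypothetical pathogen genes.
pvals = {"geneA": 0.001, "geneB": 0.008, "geneC": 0.039, "geneD": 0.041, "geneE": 0.60}

padj = dict(zip(pvals, benjamini_hochberg(list(pvals.values()))))

ALPHA = 0.05  # threshold applied to the adjusted p-values
degs = [gene for gene, q in padj.items() if q < ALPHA]
print(degs)
```

In this toy case four genes have a raw p-value below 0.05, but only two remain after adjustment, which is why the threshold and the correction method have to be stated together.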
I would add that you probably lack some samples: