5.9 years ago
zizigolu
Hi,
I have FASTQ files from 2 separate RNA-seq experiments on the same patients. In one experiment 2545 probes were sequenced and in the other 1402 probes; 719 probes are common to both. I want to merge these experiments. For each well I have 4 lanes (I assume so, because I have 4 FASTQ files for each well). How can I merge the FASTQ files for each well from both experiments? For example, for well 1, experiment 1, I have
OBP1_L001-ds.073409c051ac418f83e3e0d75c70fdfc
OBP1_L002-ds.3538648090c14d5bbf34699ee903e3ac
OBP1_L003-ds.ef0dc5bbc14346c3b356dfedb0dad288
OBP1_L004-ds.1ac734677f4b41e793990156cf1c44a7
And for well 1, experiment 2, I have
IOP1_L001-ds.5fe08d0acbbc4f50a47a13ec2c54102b
IOP1_L002-ds.529be7c7e6b947cfa79ed8ab9c573f17
IOP1_L003-ds.f93ad5e9a3b7457cb8611b4caf16c05e
IOP1_L004-ds.5ea5f00c987144a3b2851ba91becce4d
Hello F!
Questions similar to yours can already be found at:
We have closed your question to allow us to keep similar content in the same thread.
If you disagree with this please tell us why in a reply below. We'll be happy to talk about it.
Cheers!
Sorry, but these are from different experiments: in one of them 2545 probes were sequenced and in the other 1402 probes.
I reopened the question. Still, the top-level question does not contain any information about probes whatsoever. Please edit it and provide sufficient details. See: Brief Reminder On How To Ask A Good Question
Does all that about probes actually matter if your question is about fastq merging only?
By the way, I still think you cannot just merge the data from these two capture platforms.
@b.nota believes I must merge the data from scratch (from the FASTQ files).
I never said to merge FASTQ files of technical replicates from different batches. In your previous question you said you wanted to average read counts, which I advised against. You said you had different gene annotations for the two batches, so I recommended doing the alignment and feature counting with the same set of genes. It also became clear that you were not using standard RNA-seq (as your tag claimed). I have never used HTG, so my advice applied to standard RNA-seq. I never advised merging FASTQ files from different batches, though. I think you will get better help from us if you describe in more detail what you have and what you need (and why).
Sorry, that all comes from my limited knowledge; I interpreted things in an unrealistic way.
If I understand correctly, you use a probe-based method, so you only target a subset of the transcriptome. You say that 2 different experiments with different probe sets contain overlapping samples, and only ~700 probes are in common. My question is: why do you want to merge these, or average the read counts of these overlapping samples? What do you want to gain from this? It does not seem logical to merge or average such technical replicates. Why are you interested in these technical replicates?
These are two separate panels, correct? I am not sure this is the way to do things in that case.
Yes, they are different panels, so what would be the option, please?
What exactly are you trying to do and are you certain it is logical? I know you have been working on this for a few days but I may have missed that point, in case it was mentioned before.
With HTG data I have always seen only one file per sample but then we never run HTG pools in more than one lane. Is that what has been done here? Same pool run on multiple lanes?
I guess yes, I have 96 samples but for each sample I have 4 fastq files.
Then it should be fine to cat the L001-L004 files for each sample together for one specific panel. If there is some HTG-specific nuance in having them run like this, I am not aware of it.
I am still not certain what you are doing with data from 2 distinct panels.
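For anyone who prefers to script the per-sample concatenation rather than run cat by hand, here is a minimal Python sketch. It assumes the lane files sit in one directory as gzipped FASTQ named like <sample>_L00N<anything>.fastq.gz; that layout, the function names, and the _merged suffix are assumptions for illustration, not something stated in the question. It simply appends the lane files in lane order, which is valid for gzip because concatenated gzip members still form a readable gzip file. Whether HTG's own software accepts such a merged file is a separate matter.

```python
import re
import shutil
from collections import defaultdict
from pathlib import Path

# Assumed (hypothetical) naming: <sample>_L00<lane><anything>.fastq.gz
LANE_PATTERN = re.compile(r"^(?P<sample>.+?)_L(?P<lane>\d{3})(?!\d).*\.fastq\.gz$")


def group_by_sample(paths):
    """Return {sample: [lane files sorted by lane number]}."""
    groups = defaultdict(list)
    for path in paths:
        path = Path(path)
        match = LANE_PATTERN.match(path.name)
        if match:  # silently skip files that do not follow the assumed naming
            groups[match["sample"]].append((int(match["lane"]), path))
    return {sample: [p for _, p in sorted(items)] for sample, items in groups.items()}


def merge_lanes(input_dir, output_dir):
    """Concatenate the lane files of each sample, byte for byte, like `cat`."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    merged = {}
    groups = group_by_sample(Path(input_dir).glob("*.fastq.gz"))
    for sample, lane_files in groups.items():
        target = output_dir / f"{sample}_merged.fastq.gz"
        with open(target, "wb") as out:
            for lane_file in lane_files:
                with open(lane_file, "rb") as src:
                    shutil.copyfileobj(src, out)
        merged[sample] = target
    return merged
```

Usage would be something like merge_lanes("panel1_fastq", "panel1_merged"), run once per panel so that files from the two panels are never mixed.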
Since Illumina reports 4 lanes for each sample, maybe that means the data come from a NextSeq sequencer.
Actually, they think I must merge the genes from both panels to have more genes and more power to detect differentially expressed genes, because both panels come from the same patients.
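To make the numbers in this thread concrete, here is a small Python sketch of the probe bookkeeping. The probe IDs are synthetic placeholders sized to match the counts given in the question (2545, 1402, 719 in common); the real probe names and the function name are assumptions. It only counts the overlap and says nothing about whether combining the panels is statistically sound.

```python
def summarize_probe_overlap(panel_a, panel_b):
    """Count shared and panel-specific probes between two probe lists."""
    a, b = set(panel_a), set(panel_b)
    return {
        "common": len(a & b),
        "only_a": len(a - b),
        "only_b": len(b - a),
        "union": len(a | b),
    }


# Synthetic probe IDs (hypothetical), constructed so that 719 are shared
panel_1 = [f"probe_{i}" for i in range(2545)]
panel_2 = [f"probe_{i}" for i in range(1826, 1826 + 1402)]

summary = summarize_probe_overlap(panel_1, panel_2)
print(summary)
# {'common': 719, 'only_a': 1826, 'only_b': 683, 'union': 3228}
```

So merging the panels would give at most 3228 distinct probes, of which only the 719 shared ones are measured twice per patient.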
👍 for the NextSeq part.
As for merging things from two different panels, what you say sounds reasonable, but you may be in uncharted territory here. Do you have a link on HTG's website that says it is possible to use this data for DE analysis?
I am not sure about differential expression, but people running HTG experiments do differential expression with a t-test if they have matched samples, or ANOVA if the samples are unmatched. For HTG whole-transcriptome (miRNA) data I have even seen people use DESeq2, and I have seen people use edgeR for non-whole-transcriptome panels. I know these 96 samples are from the same patients (each patient was run on both panels) and we have 719 common probes between the panels, but the rest are not shared. Yesterday I used cat, as @genomax suggested, to merge the lanes, but the HTG parser did not recognise the merged FASTQ files. This week the HTG producers are coming to the university for a meeting; I have to present my results, and my boss asked me to tell them what I think about this assay. I am not sure what I should ask, though.