3.1 years ago
qstefano
Hello everyone!
Is there a way to assign experimental groups and replicates within a GSE properly?
GEO sample metadata does not follow a standard, so groups and replicates cannot be distinguished automatically from the information it contains (I had no luck in my case).
For example, GSE87071 contains 4 samples covering 3 conditions, one of which (t=0) has 2 replicates. I can see that by parsing the metadata:
[1] "time post release: E. coli cells immediately following release from Stationary phase t=0"
[2] "biological replicate: t=0 Replicate 1"
[1] "time post release: E. coli cells 1 hour following release from Stationary phase (t=1)"
[2] "biological replicate: Replicate 1"
[1] "time post release: E. coli cells 2 hour2 following release from Stationary phase (t=2)"
[2] "biological replicate: Replicate 1"
[1] "time post release: E. coli cells immediately following release from Stationary phase t=0"
[2] "biological replicate: t=0 replicate 2"
Do you know of an alternative way or a package that can do this? An approach based on inference would be fine too.
What is your expected metadata output?
In this example, I expect each of the 4 samples to be labeled with the number of the group it belongs to, i.e. a vector [1, 2, 3, 1].
Unfortunately, there is no unique attribute in the metadata, and these strings are highly variable, so extracting the group from the string itself is hardly reproducible.
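One heuristic that avoids parsing the free text is to ignore any field whose name mentions "replicate" and treat samples whose remaining fields match exactly as one group. Below is a minimal Python sketch of that idea. The function `assign_groups` and its `replicate_hint` parameter are hypothetical names made up for illustration, not part of any package, and the approach assumes the replicate information sits in its own field, as it does in GSE87071. It will not work for series where the replicate number is embedded in the same string as the condition.

```python
def assign_groups(samples, replicate_hint="replicate"):
    """Return a 1-based group label per sample.

    `samples` is a list of samples, each a list of "name: value" strings.
    Fields whose name contains `replicate_hint` are ignored; samples whose
    remaining fields are identical share a group label.
    """
    seen = {}      # normalized key -> group label
    labels = []
    for fields in samples:
        key_parts = []
        for field in fields:
            name, _, value = field.partition(":")
            if replicate_hint in name.lower():
                continue  # skip the replicate field (assumed to be separate)
            # Lowercase and collapse whitespace so trivial differences
            # in formatting do not split a group.
            key_parts.append((name.strip().lower(), " ".join(value.lower().split())))
        key = tuple(sorted(key_parts))
        if key not in seen:
            seen[key] = len(seen) + 1
        labels.append(seen[key])
    return labels


# The four samples of GSE87071, copied from the metadata shown above.
samples = [
    ["time post release: E. coli cells immediately following release from Stationary phase t=0",
     "biological replicate: t=0 Replicate 1"],
    ["time post release: E. coli cells 1 hour following release from Stationary phase (t=1)",
     "biological replicate: Replicate 1"],
    ["time post release: E. coli cells 2 hour2 following release from Stationary phase (t=2)",
     "biological replicate: Replicate 1"],
    ["time post release: E. coli cells immediately following release from Stationary phase t=0",
     "biological replicate: t=0 replicate 2"],
]

groups = assign_groups(samples)
print(groups)  # [1, 2, 3, 1]
```

Because the grouping relies on exact matches after normalization, a typo in one sample's condition string would put it in its own group, so the result should still be checked by eye for each series.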