Question

Number of samples in each lane in GEO datasets

6.9 years ago

statfa ▴ 790

Hi

As you know, GC-content bias could arise in RNA-seq data if two or more samples are sequenced in each lane. How can I know how many samples are sequenced in each lane when I get the data from GEO datasets?

For example, look at this link: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE47944

Thank you

RNA-Seq lane gc-content • 1.9k views

ADD COMMENT • link 6.9 years ago by statfa ▴ 790

GC-content bias could arise in RNA-seq data if two or more samples are sequenced in each lane

Not that I know of. Do you have a reference?

ADD REPLY • link 6.9 years ago by GenoMax 148k

Yes, sure. Please read the read count normalization section here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4917940/

ADD REPLY • link 6.9 years ago by statfa ▴ 790

See: Adjust for GC content bias in RNA-seq DE analysis. Consider accounting for GC bias only if you see a need for it in your data. I am not sure that the number of samples per lane has any correlation with GC bias.

Edit: I just noticed that the thread I linked is your own.

BTW: You may have to contact the data submitters to see if you can find which samples ran in which lane/in what combination.
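One thing that may be worth trying before emailing anyone: if the reads still carry their original Illumina names, the flowcell and lane are encoded in the read name itself (`instrument:run:flowcell:lane:tile:x:y`). Original names are often replaced during submission, so this will not work for every dataset. Below is a minimal sketch, assuming you have already pulled one read header per sample; the function names and the example headers are made up for illustration.

```python
import re
from collections import defaultdict

# Illumina (Casava 1.8+) read name layout: instrument:run:flowcell:lane:tile:x:y
# The pattern is searched anywhere in the line, so a leading accession is tolerated.
ILLUMINA_NAME = re.compile(r"([^:\s@]+):(\d+):([^:\s]+):(\d+):(\d+):(\d+):(\d+)")


def lane_of(header):
    """Return (flowcell, lane) parsed from a read header, or None if absent."""
    match = ILLUMINA_NAME.search(header)
    if match is None:
        return None
    return match.group(3), int(match.group(4))


def samples_per_lane(first_headers):
    """Group samples by (flowcell, lane).

    first_headers: dict mapping a sample name to one read header from that sample.
    Samples whose header has no recognizable Illumina name are skipped.
    """
    lanes = defaultdict(set)
    for sample, header in first_headers.items():
        key = lane_of(header)
        if key is not None:
            lanes[key].add(sample)
    return {key: sorted(names) for key, names in lanes.items()}


if __name__ == "__main__":
    # Hypothetical headers, not taken from GSE47944.
    headers = {
        "sampleA": "@HWI-ST1234:100:C1ABCACXX:3:1101:1234:2000 1:N:0:ATCACG",
        "sampleB": "@HWI-ST1234:100:C1ABCACXX:3:1101:5678:2100 1:N:0:CGATGT",
        "sampleC": "@SRR0000000.1 1 length=50",  # original name lost
    }
    for lane, names in samples_per_lane(headers).items():
        print(lane, names)
```

Two samples sharing a (flowcell, lane) pair means they were multiplexed on that lane. A sample that only shows one lane in its first read may still have been split across several lanes, so scanning more than one read per file is safer.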

ADD REPLY • link 6.9 years ago by GenoMax 148k

Thank you very much. That's my post. Well, I read that in some papers. When only one sample was sequenced in each lane, they didn't bother about GC-content bias. I was hoping to find the number of samples per lane in GEO datasets so that I wouldn't need to worry about that bias. DESeq2 corrects the bias; it doesn't check whether the bias exists. The dataset I posted as an example is used by the paper I linked. In the paper, they claim that only one sample was sequenced in each lane in this data, so they didn't normalize for GC-content bias. But nowhere could I find any info about the number of samples in each lane in GEO datasets.

ADD REPLY • link 6.9 years ago by statfa ▴ 790

But nowhere could I find any info about the number of samples in each lane in GEO datasets.

I could be wrong but AFAIK that is not a required piece of information for GEO/SRA submissions.

As @Devon had suggested in your past thread, if you are worried about GC bias then you will have to test for it each time.
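A rough way to test for it, as a sketch only: sort genes by GC content, split them into equal-size bins, and look at the median log2 count ratio between two samples in each bin. If the medians drift steadily from the low-GC to the high-GC bins, a sample-specific GC effect is plausible. The function names, the pseudocount, and the number of bins are arbitrary choices for illustration, and this is a crude diagnostic, not the method used by any particular normalization package.

```python
import math
from statistics import median


def gc_fraction(seq):
    """Fraction of G and C among the A/C/G/T bases of a sequence."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return float("nan")
    return (seq.count("G") + seq.count("C")) / acgt


def median_log_ratio_by_gc(gc, counts_a, counts_b, n_bins=5, pseudo=1.0):
    """Median log2(count_b / count_a) per GC bin, ordered from low to high GC.

    gc, counts_a, counts_b: per-gene sequences of equal length.
    Returns a list of (median GC of the bin, median log2 ratio of the bin).
    """
    rows = sorted(
        (g, math.log2((b + pseudo) / (a + pseudo)))
        for g, a, b in zip(gc, counts_a, counts_b)
    )
    if not rows:
        return []
    size = math.ceil(len(rows) / n_bins)
    bins = (rows[i:i + size] for i in range(0, len(rows), size))
    return [
        (median(g for g, _ in chunk), median(r for _, r in chunk))
        for chunk in bins
    ]


if __name__ == "__main__":
    # Toy numbers, invented for illustration only.
    gc = [0.30, 0.40, 0.50, 0.60]
    sample_a = [1, 1, 1, 1]
    sample_b = [1, 1, 3, 3]
    for gc_mid, ratio in median_log_ratio_by_gc(gc, sample_a, sample_b, n_bins=2):
        print(f"GC ~ {gc_mid:.2f}: median log2 ratio = {ratio:.2f}")
```

A difference in sequencing depth shifts every bin by the same amount, so what matters is the trend across bins, not the absolute level. With real data you would use many more genes per bin and compare several sample pairs.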

ADD REPLY • link 6.9 years ago by GenoMax 148k