DNA SPRITE Experiment Benchmark
3 months ago
Amaal • 0

Hello, I was wondering if anyone could help. I am interested in reproducing the work published in the paper SPRITE-publication for benchmarking purposes. I have accessed the raw files published here: Raw-data.

For the analysis, I merged the 11 technical replicates into a single paired-end sample (forward and reverse) and ran the pipeline as published on GitHub: SPRITE-Pipeline. The configuration file I used, Config.txt, contains the barcodes used for SPRITE tagging in the GM12878 DNA-SPRITE experiment.

According to the publication, "a successful SPRITE experiment typically achieves >75% of total reads tagged with all five identified barcodes, corresponding to a ligation efficiency of approximately 95% per round (0.95^(5 rounds) = 0.75)". However, my results deviate from the reported figure despite using identical data, barcodes, and scripts. Below are the observed benchmark percentages:

5995219 (0.4%) reads found with 0 barcodes.
37246361 (2.3%) reads found with 1 barcode.
101531585 (6.4%) reads found with 2 barcodes.
304877984 (19.2%) reads found with 3 barcodes.
161625469 (10.2%) reads found with 4 barcodes.
**974956107 (61.5%) reads found with 5 barcodes.**
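
For reference, the per-round ligation efficiency implied by these counts can be worked out the same way the publication does it, by taking the fifth root of the fully tagged fraction. This is only a rough sketch: it assumes five independent rounds with equal efficiency, and the variable names are illustrative rather than anything from the SPRITE pipeline.

```python
# Read counts by number of barcodes found, copied from the output above.
reads_by_barcode_count = {
    0: 5_995_219,
    1: 37_246_361,
    2: 101_531_585,
    3: 304_877_984,
    4: 161_625_469,
    5: 974_956_107,
}
NUM_ROUNDS = 5  # five barcode positions per read

total_reads = sum(reads_by_barcode_count.values())
fully_tagged = reads_by_barcode_count[NUM_ROUNDS] / total_reads

# Assumption: every round succeeds independently with the same probability p,
# so fully_tagged = p ** NUM_ROUNDS and p is the NUM_ROUNDS-th root.
per_round = fully_tagged ** (1 / NUM_ROUNDS)
target_per_round = 0.75 ** (1 / NUM_ROUNDS)

print(f"total reads:               {total_reads}")
print(f"fully tagged fraction:     {fully_tagged:.3f}")
print(f"implied per-round (mine):  {per_round:.3f}")
print(f"implied per-round (75%):   {target_per_round:.3f}")
```

Under that simple model, 61.5% fully tagged reads corresponds to roughly 91% per round, compared with roughly 94 to 95% for the 75% target.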

Additionally, the distribution of barcodes across different positions is as follows:

1526493880 (96.2%) barcodes found in position 1.
1511746202 (95.3%) barcodes found in position 2.
1459052384 (92.0%) barcodes found in position 3.
**1141582366 (72.0%) barcodes found in position 4.
1037351062 (65.4%) barcodes found in position 5.**
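
As a quick sanity check on these per-position numbers, one can compare the observed fully tagged fraction with what would be expected if each position were detected independently. This is a sketch under stated assumptions: the total read count is taken as the sum of the per-barcode-count lines above, and independence between positions is a simplification, not something the pipeline reports.

```python
import math

# Assumption: total reads = sum of the "reads found with N barcodes" lines.
TOTAL_READS = 1_586_232_725
OBSERVED_FULLY_TAGGED = 974_956_107  # reads found with 5 barcodes

# Barcodes found per position, copied from the output above.
position_hits = {
    1: 1_526_493_880,
    2: 1_511_746_202,
    3: 1_459_052_384,
    4: 1_141_582_366,
    5: 1_037_351_062,
}

# Fraction of reads with a barcode identified at each position.
rates = {pos: hits / TOTAL_READS for pos, hits in position_hits.items()}

# If positions failed independently, the fully tagged fraction would be
# the product of the per-position rates.
expected_if_independent = math.prod(rates.values())
observed = OBSERVED_FULLY_TAGGED / TOTAL_READS

for pos, rate in rates.items():
    print(f"position {pos}: {rate:.3f}")
print(f"expected if independent: {expected_if_independent:.3f}")
print(f"observed fully tagged:   {observed:.3f}")
```

The observed fraction (about 61%) is well above the independent-position estimate (about 40%), which suggests that reads missing a barcode at position 4 tend to be the same reads missing one at position 5. That would also fit the larger count of reads with 3 barcodes than with 4, though it does not by itself explain the cause.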

I can't find a reason why the benchmark isn't giving me the same result. Any suggestions?

SPRITE Genome • 499 views
3 months ago
dsull ★ 6.9k

You won't always get 75%; it is merely a good target to aim for. If you look at the original publication in Cell (published 4 years before that paper), Figure S1B shows 67.1% efficiency for a human/mouse mixing SPRITE experiment.

There is high variability between experiments. Your efficiency is more than enough to proceed with analysis of the data.

Additionally, feel free to examine some of the other SPRITE datasets available on GEO.

Thank you very much for your response. However, in my case, I used the publication's data, pipeline, and barcodes, but I didn't achieve the same results. What could be causing these differences in ligation efficiency? Is there anything I can do to optimize my results? Thank you again! I will do as you suggested and run other SPRITE data from GEO. Could you please point me to other human DNA-SPRITE data, other than the dataset uploaded to GEO and the one in 4DNF?

Please use "Add comment" rather than "Add your answer" unless it's an answer :)

There is some DNA SPRITE data available at GSE247833 and at GSE151515.

If you followed the steps (i.e. the first three steps: adapter trimming, barcode identification, and efficiency calculation), your results are fine. Honestly, don't worry about it; maybe a different set of reads was used for that figure. If you want to recover more reads, just edit the config.txt file to be more error tolerant.

Thank you.
