Tool Or Software To Determine Tag Redundancy (Summary Of Library Complexity)
11.4 years ago
k2bhide ▴ 80

Hello all, this is Ketaki Bhide from Purdue University. I tried to use ALEXA-seq to get its "Summary of library complexity" (estimated by tag redundancy per million reads) for some of my fastq files. However, I was not able to use ALEXA-seq because I do not have the resources to set up the required web server, local databases, etc. I would really appreciate it if anyone could point me to a tool or software that determines tag redundancy and estimates library complexity from fastq files.

11.4 years ago

FastQC, BAMStats, Samtools, Bamtools, and Picard each have read and alignment QC components.

For example, the 'overrepresented sequences' component of FastQC can be informative in the context of library redundancy.

The Picard Metrics page is a good resource.

Bamtools and Samtools have components that would help you randomly sample reads to create your own custom library complexity plots.
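As a rough illustration of what such a custom complexity plot is built from, here is a minimal Python sketch. It assumes the read sequences (or alignment positions) have already been extracted into a list; `complexity_curve` and its parameters are hypothetical names, not part of Bamtools or Samtools. It counts how many distinct items appear in random subsamples of increasing depth.

```python
import random

def complexity_curve(sequences, depths, seed=0):
    """Return (depth, distinct_count) pairs for random subsamples.

    Assumption: `sequences` fits in memory. Each prefix of one shuffled
    copy is a random subsample drawn without replacement.
    """
    rng = random.Random(seed)
    shuffled = list(sequences)
    rng.shuffle(shuffled)
    curve = []
    for depth in sorted(depths):
        depth = min(depth, len(shuffled))  # cannot sample more reads than exist
        curve.append((depth, len(set(shuffled[:depth]))))
    return curve
```

Plotting the distinct count against depth shows whether the library is saturating: a curve that flattens early suggests high redundancy.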

11.4 years ago
KCC ★ 4.1k

In general, you can remove duplicates using Picard's MarkDuplicates.jar or samtools rmdup. (Remember to sort your reads by coordinate first!)

After you remove duplicates, count the number of reads before and after and subtract the latter from the former. This gives you the number of redundant reads. Dividing that count by the total number of reads and multiplying by one million gives the redundancy per million reads.
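The arithmetic can be sketched in Python as follows. The function name is hypothetical, and the two read counts are assumed to come from whatever counting step you prefer (for example `samtools view -c` on the BAM files before and after duplicate removal).

```python
def redundancy_per_million(total_reads, reads_after_dedup):
    """Redundant reads per million reads, from counts before and after dedup."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= reads_after_dedup <= total_reads:
        raise ValueError("reads_after_dedup must be between 0 and total_reads")
    redundant = total_reads - reads_after_dedup
    # Scale the redundant count to a library of one million reads.
    return redundant * 1_000_000 / total_reads
```

For instance, a library of 20,000,000 reads with 17,000,000 left after duplicate removal has 3,000,000 redundant reads, i.e. 150,000 redundant reads per million.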

Googling ALEXA-seq, I found this: "Library complexity is calculated for the sequence library by randomly sampling 1 million reads and determining the number of unique and redundant reads within the pool. This sampling is repeated (with replacement) at least 3 times and average values across these samples is used."

You can get random samples of your SAM file using the technique described in the post "Sample Sam File". So you can draw the 3 random samples of a million reads yourself and identify duplicates in each as suggested above. Detailed instructions on how to use Picard or samtools for duplicate removal can be found by searching this site.
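If you would rather work directly from the fastq file, the sampling procedure quoted above can be approximated with a short Python sketch. This is not ALEXA-seq's code; several things are assumptions: reads count as redundant when their sequences are identical (not when they align to the same position, as Picard and samtools define duplicates), each fastq record spans exactly four lines, "with replacement" is read as each repeat drawing afresh from the whole library, and all function names are hypothetical.

```python
import random
from statistics import mean

def read_sequences(fastq_lines):
    """Yield the sequence line of each 4-line FASTQ record (assumed layout)."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:
            yield line.strip()

def reservoir_sample(items, k, rng):
    """Draw up to k items uniformly at random from a stream of unknown length."""
    sample = []
    for n, item in enumerate(items):
        if n < k:
            sample.append(item)
        else:
            j = rng.randint(0, n)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample

def estimate_redundancy(open_lines, k=1_000_000, repeats=3, seed=0):
    """Average fraction of redundant reads over `repeats` random samples.

    `open_lines` is a callable returning a fresh iterator over the FASTQ
    lines, so that every repeat samples from the full library again.
    """
    rng = random.Random(seed)
    fractions = []
    for _ in range(repeats):
        sample = reservoir_sample(read_sequences(open_lines()), k, rng)
        if not sample:
            raise ValueError("no reads found")
        # Redundant reads = sampled reads minus distinct sequences.
        fractions.append(1 - len(set(sample)) / len(sample))
    return mean(fractions)
```

With a hypothetical file name, `estimate_redundancy(lambda: open("reads.fastq"))` returns a fraction between 0 and 1; multiply it by one million to express it per million reads.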
