A question about samtools stats output
7.6 years ago
934963534 ▴ 20

The samtools stats output has a section called "ACGT content per cycle". What does "cycle" mean?

I also see lines that start with GCC and give the percentages of A, C, G and T. Is this the distribution of A, C, G and T at each base position of the reads?

Thank you for your answers!

sequencing samtools stats • 1.6k views

What command are you running? Also, please share its output.

7.6 years ago

The wording "ACGT content per cycle" comes from Sanger and Illumina sequencing. A "cycle" in this context is a base position in the read, so the first cycle is the first base in all alignments, the second cycle is the second base, and so on. In some experiments (namely whole genome sequencing) one expects the proportions of A, C, G and T to be constant across cycles. In many other types of experiments (e.g., RNAseq or amplicon sequencing), this is not the case. Either way, the graph output by FastQC is probably more useful than what samtools stats is giving you.
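To make the idea of a cycle concrete, here is a minimal sketch that tallies base percentages by read position. The reads are made up, the function name `acgt_per_cycle` is hypothetical, and this is only an illustration of the concept, not how samtools computes the numbers internally.

```python
from collections import Counter

def acgt_per_cycle(reads):
    """Return one {base: percent} dict per cycle (1st base, 2nd base, ...)."""
    n_cycles = max(len(r) for r in reads)
    result = []
    for i in range(n_cycles):
        # Count only A/C/G/T at position i, skipping reads shorter than i + 1.
        counts = Counter(r[i] for r in reads if i < len(r) and r[i] in "ACGT")
        total = sum(counts.values())
        result.append(
            {b: (100.0 * counts[b] / total if total else 0.0) for b in "ACGT"}
        )
    return result

# Made-up reads, purely for illustration.
reads = ["ACGT", "ACGA", "TCGT", "ACCT"]

for cycle, pct in enumerate(acgt_per_cycle(reads), start=1):
    print(cycle, pct)
```

With these four reads, cycle 1 is 75% A and 25% T, because three reads start with A and one starts with T.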

Yes, the GCC lines have the per-cycle ACGT content, which is why they're preceded by:

# ACGT content per cycle. Use `grep ^GCC | cut -f 2-` to extract this part. The columns are: cycle; A,C,G,T base counts as a percentage of all A/C/G/T bases [%]; and N and O counts as a percentage of all A/C/G/T bases [%]
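If you would rather parse those lines yourself than use `grep` and `cut`, something along these lines should work. It is a rough sketch that assumes the tab-separated layout described in the header above (cycle, then A, C, G, T percentages, then N and O). The helper name `parse_gcc` and the sample rows are made up for illustration.

```python
from typing import Dict, List

def parse_gcc(stats_text: str) -> List[Dict[str, float]]:
    """Collect per-cycle A/C/G/T percentages from the GCC rows of stats output."""
    rows = []
    for line in stats_text.splitlines():
        fields = line.split("\t")
        if fields[0] != "GCC":
            continue  # skip every other section (SN, FFQ, ...)
        a, c, g, t = (float(x) for x in fields[2:6])
        rows.append({"cycle": int(fields[1]), "A": a, "C": c, "G": g, "T": t})
    return rows

# Made-up rows in the assumed layout; not real samtools output.
sample = (
    "GCC\t1\t30.00\t20.00\t20.00\t30.00\t0.00\t0.00\n"
    "GCC\t2\t25.00\t25.00\t25.00\t25.00\t0.00\t0.00\n"
    "SN\tsequences:\t100\n"
)

for row in parse_gcc(sample):
    print(row["cycle"], row["A"], row["C"], row["G"], row["T"])
```

On a real file you would read the saved output of samtools stats into a string and pass it to the function in place of `sample`.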

Thanks, I have got it.
