Imagine I have a set of sequences in FASTQ format where the actual sequences look like this:
TAGGGTTGGGCCTGACAAGTCAT
TAGGGTTGGGCCTGACAAGTCAT
TAGGGTTGGGCCTGACAAGTCAT
TAGGGTTGGGCCTGACAAGTCAT
TAGGGTTGGGCCTGACAAGTCAT
TAGGGTTGGGCCTGACAAGTCAT
TAGGGTTGGGCCTGACAAGTCAG
Imagine these little tags all stem (theoretically speaking) from the same template. The final G in the last sequence is likely a sequencing error. Is there a way to run something like:
./magicalprogram in.fastq > out.fastq
where out.fastq would contain TAGGGTTGGGCCTGACAAGTCAT, as this is the consensus?
Thanks!
Do they all have exactly the same length?
In theory, yes; in practice, no.
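For what it's worth, here is a minimal sketch of the per-position majority vote described in the question. It is not an existing tool, and the function names (read_fastq_sequences, consensus) are made up for illustration. It assumes plain 4-line FASTQ records, that all reads start at the same position (no alignment is done), and that the consensus should have the most common read length. It prints only the consensus sequence rather than a full FASTQ record, since it is not obvious what quality string a consensus should carry.

```python
import sys
from collections import Counter


def read_fastq_sequences(handle):
    """Yield the sequence line of each record, assuming 4 lines per record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            seq = line.strip()
            if seq:
                yield seq


def consensus(seqs):
    """Majority vote per column; reads are assumed to be left-aligned."""
    seqs = list(seqs)
    if not seqs:
        return ""
    # Assumption: the most common read length is the true template length.
    length = Counter(len(s) for s in seqs).most_common(1)[0][0]
    out = []
    for pos in range(length):
        # Only reads long enough to cover this position get a vote.
        column = Counter(s[pos] for s in seqs if len(s) > pos)
        # On a tie, the base that was seen first wins.
        out.append(column.most_common(1)[0][0])
    return "".join(out)


# Only runs when called as: python consensus.py in.fastq
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1]) as fh:
        print(consensus(read_fastq_sequences(fh)))
```

With the seven reads from the question, the last column has six T and one G, so the result is TAGGGTTGGGCCTGACAAGTCAT. If reads can start at different offsets or contain indels, a simple column vote like this is not enough and the reads would need to be aligned first.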