"CAGE (cap analysis of gene expression; Table S1) was as described (Yang et al., 2011) and sequenced using a HiSeq 2000 (100 nt reads). After removing adaptor sequences and checking read quality using Flexbar 2.2 with the parameters of “-at 3 -ao 10 --min-readlength 20 --max-uncalled 70 --phred-pre-trim 10”, we retained only reads beginning with NG or GG (the last two nucleotides on the 5′ adaptor). We then removed the first two nucleotides and mapped the sequences to the mouse genome using TopHat 2.0.4." This is how the paper describes the method. How do I write code to remove the first two nucleotides?
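There are multiple ways.
One is the HEADCROP step in Trimmomatic: HEADCROP:2 removes the first two bases (and their quality scores) from every read. Note that HEADCROP only trims; keeping only the reads that begin with NG or GG is a separate filtering step.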
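If you would rather write the code yourself, here is a minimal Python sketch of the filter-then-trim step described in the quoted methods. It is not the authors' script. The function names `read_fastq` and `trim_cage_reads` are made up for illustration, the input is assumed to be plain 4-line FASTQ (no wrapped sequence lines), and "NG" is matched literally as the characters N and G. If the paper means N as "any base", change `prefixes` accordingly.

```python
from typing import Iterable, Iterator, Tuple

# One FASTQ record: header, sequence, separator ("+"), quality string.
FastqRecord = Tuple[str, str, str, str]


def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Group lines into 4-line FASTQ records (assumes unwrapped FASTQ)."""
    stripped = (line.rstrip("\r\n") for line in lines)
    # zip over the same iterator four times yields consecutive groups of 4;
    # an incomplete trailing group is silently dropped.
    for header, seq, sep, qual in zip(stripped, stripped, stripped, stripped):
        yield header, seq, sep, qual


def trim_cage_reads(
    records: Iterable[FastqRecord],
    prefixes: Tuple[str, ...] = ("NG", "GG"),  # assumption: literal match
    n: int = 2,
) -> Iterator[FastqRecord]:
    """Keep reads starting with one of `prefixes`, then drop the first n bases."""
    for header, seq, sep, qual in records:
        if seq.startswith(prefixes):
            # Trim the quality string too so it stays the same length as the read.
            yield header, seq[n:], sep, qual[n:]


# Small in-memory demo with three hypothetical reads.
demo = [
    "@r1", "GGACGT", "+", "IIIIII",
    "@r2", "ACGTAC", "+", "IIIIII",
    "@r3", "NGTTTT", "+", "#IIIII",
]
trimmed = list(trim_cage_reads(read_fastq(demo)))
for record in trimmed:
    print("\n".join(record))
```

For a real file, pass an open file handle to `read_fastq` instead of the `demo` list and write each returned record to the output file. Reads r1 and r3 are kept and shortened by two bases; r2 is discarded because it does not start with NG or GG.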