Below is part of the FastQC result for a FASTQ file from a single-cell RNA-seq dataset. I noticed that the per-base sequence content over the first 15 nt is much more uneven than over the rest of the read. I cannot find a reasonable explanation.
I apologize for my ignorance, but I'm still new to some of the techniques used in next-gen sequencing. As I understand it, shotgun sequencing can cut the sequence anywhere, so there shouldn't be a reason why, for example, the first nt has a significantly higher frequency of C than of any other base, especially since C is on average less frequent than T and A. It's almost as if some specific sequence appears frequently at the start of all the sequenced fragments, but I don't know why that would be. I'm sure a lot of people have seen a similar pattern before.
Could this be due to barcodes that were not removed, or to specific primers used in sample amplification?
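
One way to check the barcode/primer guess would be to tally the base composition at each of the first 15 positions and list the most common leading k-mers. The following is only a minimal sketch under my own assumptions: the function names, the k-mer length of 6, and the file name `reads.fastq` are hypothetical, and it assumes a plain four-line-per-record FASTQ file.

```python
from collections import Counter


def read_sequences(fastq_lines):
    """Yield the sequence line (2nd of every 4 lines) from FASTQ text lines."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:
            yield line.strip().upper()


def position_frequencies(seqs, n_positions=15):
    """Return one dict per position with the fraction of A, C, G, T and N."""
    counts = [Counter() for _ in range(n_positions)]
    for seq in seqs:
        for pos, base in enumerate(seq[:n_positions]):
            counts[pos][base] += 1
    freqs = []
    for c in counts:
        total = sum(c.values())
        # Leave the dict empty if no read is long enough to cover this position.
        freqs.append({b: c[b] / total for b in "ACGTN"} if total else {})
    return freqs


def top_prefixes(seqs, k=6, top=5):
    """Return the most common k-mers found at the very start of the reads."""
    prefix_counts = Counter(seq[:k] for seq in seqs if len(seq) >= k)
    return prefix_counts.most_common(top)


# Hypothetical usage (file name is an assumption):
# with open("reads.fastq") as fh:
#     seqs = list(read_sequences(fh))
# for pos, f in enumerate(position_frequencies(seqs), start=1):
#     print(pos, f)
# print(top_prefixes(seqs))
```

If one or a few prefixes account for most of the reads, that would point towards a leftover barcode or primer; if the bias is spread over many different prefixes, it would point towards something else.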