16S amplicon sequencing input data (spreadsheet): barcode vs. linker primer seqs?
11 months ago

The header and first row of the metadata for my 16S amplicon seq data look as below. I'm a bit confused as to how to interpret the "Barcode sequence" column. Is it barcode+linker primer? Or a barcode for each end of my paired-end data? Or something else?

Lane    Sample  Barcode sequence    
1   32A TCTCATGATA+TCCACGTGTT   

The other columns in my metadata spreadsheet are: # of Reads, % of the lane, % Perfect barcode, % One mismatch barcode, Yield (Mbases), % PF Clusters, % >= Q30 bases, and Mean Quality Score.

Also, when I look at the R1 and R2 files, neither these barcodes nor their reverse complements seem to appear at the start or end of the sequences. Additionally, each barcode combo (e.g. TCTCATGATA+TCCACGTGTT) appears in my metadata exactly twice, once in lane 1 and once in lane 2, for the same sample.

I've never seen metadata that looked like this and I'm just not too sure how to interpret the Barcode sequence. The original investigators are out on holiday so there is no one to ask, and I'm sure this is a standard format so hoping someone with more experience can help clarify. Thanks!

barcode amplicon 16s • 691 views

Or a barcode for each end of my paired-end data?

Additionally, each barcode combo (e.g. TCTCATGATA+TCCACGTGTT) appears in my metadata exactly twice, once in lane 1 and once in lane 2, for the same sample.

Looking at the example above they appear to be Illumina indexes used to label the sample.

People tend to confuse indexes (used to tag the sample) and barcodes (generally sequences you add to your construct so that they are sequenced as part of the main Illumina read). Illumina index reads are sequenced separately from the main R1/R2 reads, so they are never part of the R1/R2 sequences. During demultiplexing the index sequences are written to the FASTQ headers, which is the only place you will find them.
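If you want to check this on your own files, something like the following should pull the index pair out of a read header. This is only a minimal sketch: it assumes the usual Illumina (CASAVA 1.8+) header layout, where the part after the space is `read:filtered:control:index`. The function name `parse_index` and the example header are made up for illustration, with only the index pair taken from the question.

```python
import re

# Assumed header layout (Illumina CASAVA 1.8+ style):
#   @instrument:run:flowcell:lane:tile:x:y read:filtered:control:index
HEADER_RE = re.compile(
    r"^@(?P<id>\S+)\s+(?P<read>[12]):(?P<filtered>[YN]):(?P<control>\d+):(?P<index>[ACGTN+]+)$"
)

def parse_index(header):
    """Return lane, read number and index sequence(s) from one FASTQ header line.

    Returns None if the header does not follow the assumed layout.
    """
    m = HEADER_RE.match(header.strip())
    if m is None:
        return None
    fields = m.group("id").split(":")
    # Lane is the fourth colon-separated field of the read identifier.
    lane = int(fields[3]) if len(fields) >= 7 else None
    # Dual indexes are joined with "+"; a single index has no second part.
    index1, _, index2 = m.group("index").partition("+")
    return {
        "lane": lane,
        "read": int(m.group("read")),
        "index1": index1,
        "index2": index2 or None,
    }

# Hypothetical header; only the index pair comes from the metadata in the question.
example = "@M00001:1:000000000-ABCDE:1:1101:15000:1500 1:N:0:TCTCATGATA+TCCACGTGTT"
print(parse_index(example))
```

If the index pair in the headers matches the "Barcode sequence" column for that sample, the column is simply reporting the two Illumina indexes used for demultiplexing.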


Thanks GenoMax! Indeed, these indexes do appear in my FASTQ headers. So am I correct to assume that the barcodes have not been removed, and that I need additional info from the investigators (the barcodes and linker primers) before I can analyze these samples? (The metadata I've described is all I got from them.)
