Question

Demultiplexing RNA-seq libraries without Illumina indexes on NovaSeq

0

6.7 years ago

linjc.xmu ▴ 30

Dear all, I made a set of RNA-seq libraries without an embedded Illumina index, but with an inner barcode placed right after the read 1 sequence. They have now been sequenced on a NovaSeq together with other libraries. Where should I look for my data: in the Undetermined reads, or in the reads assigned to the GGGGGG index? Thanks a lot.

sequencing • 3.1k views

ADD COMMENT • link updated 6.7 years ago by Devon Ryan 105k • written 6.7 years ago by linjc.xmu ▴ 30

1

So where exactly is your barcode?

Like this?:

5'--|--Adapter--|--Barcode--|--RNAseq/cDNA--|--Adapter--|--3'

If so, how long are the cDNA fragments, and what was the read length of your run?

ADD REPLY • link 6.7 years ago by ATpoint 88k

0

My humble recommendation, after several long, unresolved discussions with sequencing people, is to search for your internal barcode in both "Undetermined" and G8. For 8-base barcodes at a precisely known location, the probability of a wrong assignment is low.
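As a rough illustration of that kind of search, here is a minimal Python sketch that assigns reads to samples by an inline barcode at a fixed offset in read 1, allowing one mismatch. The barcode sequences, sample names, and the offset of 0 are all made up for the example and would need to be replaced with the real library design; it is not the facility's method, just one plausible way to do it.

```python
import io
from collections import Counter

# Hypothetical 8-base inline barcodes and sample names (assumptions for illustration).
BARCODES = {"ACGTACGT": "sample_A", "TGCATGCA": "sample_B"}
BARCODE_START = 0  # assumed offset of the barcode within read 1


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def assign_sample(seq, barcodes=BARCODES, start=BARCODE_START, max_mismatches=1):
    """Return the sample whose barcode uniquely matches at the fixed offset, else None."""
    length = len(next(iter(barcodes)))
    observed = seq[start:start + length]
    if len(observed) < length:
        return None
    hits = [name for bc, name in barcodes.items()
            if hamming(observed, bc) <= max_mismatches]
    # Ambiguous reads (matching more than one barcode) are left unassigned.
    return hits[0] if len(hits) == 1 else None


def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ handle."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            break
        seq = handle.readline().rstrip()
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip()
        yield header, seq, qual


def count_assignments(handle):
    """Count reads per sample; reads without a unique match count as 'unassigned'."""
    counts = Counter()
    for _, seq, _ in read_fastq(handle):
        counts[assign_sample(seq) or "unassigned"] += 1
    return counts


# Tiny in-memory demo; a real file could be opened with gzip.open(path, "rt").
demo = io.StringIO(
    "@r1\nACGTACGTTTTT\n+\nIIIIIIIIIIII\n"
    "@r2\nTGCATGCTAAAA\n+\nIIIIIIIIIIII\n"
    "@r3\nGGGGGGGGAAAA\n+\nIIIIIIIIIIII\n"
)
print(count_assignments(demo))
```

Running the same function over both the Undetermined and the G8 read 1 files gives a quick comparison of how many reads each one contributes per barcode.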

ADD REPLY • link 6.7 years ago by jomo018 ▴ 730

0

Yes, the barcode location is right. My insert size is ~180-375 bp and the read length is PE150. The sequencing facility sent me the G8 data split by my barcode, but the unique mapping rate is low (~47%) and the multi-mapping rate is ~50%. I usually get 90% unique mapping for Arabidopsis samples on a HiSeq 2500. So I am now splitting the Undetermined data by barcode as well, as Devon said. I am not sure which set should be used. Or should I merge both?

ADD REPLY • link 6.7 years ago by linjc.xmu ▴ 30

score 2 · Accepted Answer · 2018-09-04

2

6.7 years ago

Devon Ryan 105k

If the samples lacked standard Illumina barcodes, they'll mostly be in Undetermined.

ADD COMMENT • link 6.7 years ago by Devon Ryan 105k

0

Thanks. The sequencing company said the NovaSeq naturally generates a file of reads with a GGGGGG index. What is this?

ADD REPLY • link 6.7 years ago by linjc.xmu ▴ 30

1

Machines with 2-color chemistry (NextSeq and NovaSeq) read "no signal" as a G, so an index read with no signal comes out as all Gs. Unless you put that index in your sample sheet (terrible idea), you'll see those reads in Undetermined.

ADD REPLY • link 6.7 years ago by Devon Ryan 105k

0

Thanks. Do you mean it's better to get the data from the Undetermined file?

ADD REPLY • link 6.7 years ago by linjc.xmu ▴ 30

0

I would say there shouldn't be a G8 "sample" to begin with, unless the NovaSeq produces that by default for some reason (it's about the only Illumina machine we don't have, so I can't check). The thing with unbarcoded samples is that signal from neighboring clusters has a way of bleeding over into them during the index reads, so their index reads will often not be pure "no signal" (all Gs in 2-color chemistry).
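One way to get a feel for this is to tally the index sequences recorded in the Undetermined FASTQ headers and see how many are pure poly-G versus mostly-G with a few other bases. The Python sketch below assumes the index sits in the last colon-separated field of each header line (as in typical bcl2fastq output, e.g. `... 1:N:0:GGGGGGGG`); the function names and the 0.8 threshold in the demo are illustrative choices, not anything prescribed by Illumina.

```python
import io
from collections import Counter


def index_from_header(header):
    """Extract the index field, assumed to be the last colon-separated item."""
    return header.rstrip().rsplit(":", 1)[-1]


def is_poly_g(index, min_fraction=1.0):
    """True if at least min_fraction of the index bases are G ('+' separators ignored)."""
    bases = index.replace("+", "")
    return bool(bases) and bases.count("G") / len(bases) >= min_fraction


def tally_indexes(handle):
    """Count index sequences over all header lines of a 4-line-per-record FASTQ."""
    counts = Counter()
    for line_number, line in enumerate(handle):
        if line_number % 4 == 0 and line.strip():
            counts[index_from_header(line)] += 1
    return counts


# Made-up records for demonstration; a real file could be opened with gzip.open(path, "rt").
demo = io.StringIO(
    "@M:1:FC:1:1101:1000:2000 1:N:0:GGGGGGGG\nACGT\n+\nIIII\n"
    "@M:1:FC:1:1101:1001:2001 1:N:0:GGGGGGGG\nACGT\n+\nIIII\n"
    "@M:1:FC:1:1101:1002:2002 1:N:0:GGAGGGGG\nACGT\n+\nIIII\n"
)
for index, n in tally_indexes(demo).most_common():
    print(index, n, "pure poly-G" if is_poly_g(index) else
          "mostly G" if is_poly_g(index, 0.8) else "other")
```

If a large share of the reads carrying the inline barcodes show mostly-G rather than pure-G indexes, that would be consistent with the bleed-over described above.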

ADD REPLY • link 6.7 years ago by Devon Ryan 105k