Jellyfish & Paired end reads
9 months ago

I am currently running k-mer analysis on some 2x150 PE whole-genome sequencing data that I generated for a first-year PhD project.

I carried out five runs, so I have 10 paired-end files from one individual. Given this, do all reads need merging/concatenating before going through Jellyfish, or should only one of the paired-end reads be merged and run that way?

Alternatively, since Jellyfish counts canonical k-mers, would it be enough to supply just the forward reads from one of the runs, as each k-mer would be counted in both forward and reverse orientation anyway?
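
To make the canonical k-mer idea concrete, here is a minimal Python sketch. It is not Jellyfish's implementation, only an illustration, and the function names and the example read are made up. It counts each k-mer under the lexicographically smaller of the k-mer and its reverse complement, so a read and its reverse complement give identical counts.

```python
from collections import Counter

# Translation table for complementing A/C/G/T (other symbols are left as-is).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_kmer_counts(seq, k):
    """Count k-mers in seq, storing each under its canonical form.

    The canonical form is the lexicographically smaller of the k-mer
    and its reverse complement.
    """
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        counts[min(kmer, reverse_complement(kmer))] += 1
    return counts


# Hypothetical read, used only for illustration.
read = "ACGTTGCAAG"
forward = canonical_kmer_counts(read, 4)
reverse = canonical_kmer_counts(reverse_complement(read), 4)
print(forward == reverse)
```

Canonical counting only removes the dependence on strand. The counts still come solely from the reads that are actually supplied, so leaving files out lowers the k-mer coverage.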

Thanks in advance!

Jellyfish genome-assembly kmer • 934 views

You may need to do some QC before k-mer counting. I would look for Illumina adapters/multiplexing sequences just in case. You also want to know whether some inserts are shorter than 300 bp, in which case the ends of R1 and R2 may overlap.
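
The overlap arithmetic can be sketched in a few lines of Python. This is an illustration under assumptions: `expected_overlap` and `kmers_in_overlap` are hypothetical helper names, reads are taken to be full-length 150 bp, and any read-through into adapter is assumed to have been trimmed already. K-mers that lie entirely within the overlap are seen in both reads of the pair.

```python
def expected_overlap(insert_size, read_length=150):
    """Number of bases covered by both R1 and R2 for one fragment.

    Fragments of at least 2 * read_length have no overlap; fragments
    shorter than read_length are covered entirely by both reads.
    """
    return max(0, min(2 * read_length - insert_size, insert_size))


def kmers_in_overlap(insert_size, k, read_length=150):
    """Number of k-mer positions that lie fully inside the overlap."""
    return max(0, expected_overlap(insert_size, read_length) - k + 1)


for insert in (400, 300, 250, 100):
    print(insert, expected_overlap(insert), kmers_in_overlap(insert, k=21))
```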


QC has already been carried out! There are no adapters in there and the reads have been quality trimmed.


I carried out five runs so have 10 paired end files from one individual.

Are these technical replicates of sequencing a single library or are these five independent libraries?


So there are two libraries: three replicates of one and two replicates of the other.

9 months ago
Darked89 4.7k

Before starting Jellyfish, you may want to look at the 2018 benchmarks and try to estimate whether k-mer counting for your organism and data set combination can be processed by Jellyfish in a reasonable time.

The other point from the benchmark article above is that I/O can be a bottleneck. For that reason I would cluster/reorder the reads (the individual paired FASTQs) using clumpify from BBMap and run some benchmarks, maybe starting with one or a few R1 FASTQ files.
I would hope that ordering the reads with clumpify improves lookup times for a given k-mer, but this is likely algorithm-dependent.
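
A small benchmark along these lines could be scripted as follows. This is a sketch under assumptions, not a recommended configuration: the file names, the k-mer size of 21, the hash size and the thread count are placeholders, and the inputs are assumed to be uncompressed FASTQ. The script only builds and prints the `jellyfish count` command; `time_command` is a hypothetical helper that would run and time it on a machine where Jellyfish is installed.

```python
import shlex
import subprocess
import time


def jellyfish_count_cmd(fastq_files, k=21, hash_size="1G", threads=8,
                        out="bench.jf"):
    """Build a `jellyfish count` command line for a list of FASTQ files."""
    if not fastq_files:
        raise ValueError("need at least one FASTQ file")
    return [
        "jellyfish", "count",
        "-C",                # count canonical k-mers
        "-m", str(k),        # k-mer length (placeholder value)
        "-s", hash_size,     # initial hash size (placeholder value)
        "-t", str(threads),  # number of threads (placeholder value)
        "-o", out,           # output database
        *fastq_files,
    ]


def time_command(cmd):
    """Run cmd and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


# Hypothetical file name: start with a single R1 file, then add more.
cmd = jellyfish_count_cmd(["run1_R1.fastq"])
print(shlex.join(cmd))
# elapsed = time_command(cmd)  # uncomment where Jellyfish is installed
```

Running the same command on the original and on the clumpify-reordered copy of a file would show whether reordering makes any measurable difference for this data set.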
