Question

What should the minimum 454 (single-end) read length be for a de novo assembly run?

0


10.4 years ago

muralis.bio • 0

Hi,
I have 454 (single-end) data with 20x coverage for a plant genome of 1.2 Gb. Can anyone give me a pointer on the minimum read length I should use for a de novo assembly run with Celera CABOG?

Thanks in advance...

Assembly • 3.3k views

ADD COMMENT • link updated 10.4 years ago by rtliu ★ 2.2k • written 10.4 years ago by muralis.bio • 0

1


It is difficult to tell without looking at the read length distribution.

ADD REPLY • link 10.4 years ago by GouthamAtla 12k

0


The following is my read length distribution:

Greater than (>)   Less than or equal to (<=)   No. of sequences
0                  60                           163078
60                 120                          1320262
120                180                          1128226
180                240                          1101897
240                300                          1257487
300                400                          3928630
400                500                          4911455
500                600                          2488924
600                700                          2936272
700                800                          3610518
800                900                          4271892
900                1000                         5989034
1000               1100                         5135502
1100               1200                         513789
1200               1300                         11654
1300               1400                         329
1400               1500                         102
1500               1600                         75
1600               1700                         88
1700               1800                         196
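
To see how many reads a given length cutoff would affect, the binned counts above can be tallied with a few lines of Python. This is only a rough sketch: the name `fraction_at_or_below` and the example cutoffs are illustrative assumptions, and because the data are binned, only cutoffs that fall on a bin boundary give exact fractions.

```python
# Bins as (lower bound, exclusive; upper bound, inclusive; read count),
# copied from the distribution posted above.
BINS = [
    (0, 60, 163078), (60, 120, 1320262), (120, 180, 1128226),
    (180, 240, 1101897), (240, 300, 1257487), (300, 400, 3928630),
    (400, 500, 4911455), (500, 600, 2488924), (600, 700, 2936272),
    (700, 800, 3610518), (800, 900, 4271892), (900, 1000, 5989034),
    (1000, 1100, 5135502), (1100, 1200, 513789), (1200, 1300, 11654),
    (1300, 1400, 329), (1400, 1500, 102), (1500, 1600, 75),
    (1600, 1700, 88), (1700, 1800, 196),
]


def fraction_at_or_below(bins, max_len):
    """Fraction of reads in bins whose upper bound is <= max_len."""
    total = sum(count for _, _, count in bins)
    short = sum(count for _, upper, count in bins if upper <= max_len)
    return short / total


total_reads = sum(count for _, _, count in BINS)
# Hypothetical cutoffs, chosen only because they sit on bin boundaries.
for cutoff in (60, 120, 300):
    share = fraction_at_or_below(BINS, cutoff)
    print(f"<= {cutoff} bp: {share:.2%} of {total_reads} reads")
```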

ADD REPLY • link 10.4 years ago by muralis.bio • 0

0


A histogram would be a good way to represent the read length distribution. In general, you need to choose a k-mer value when assembling the reads, so remove reads that are shorter than twice the k-mer length. You will need to try several settings to get the best assembly.
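
The "drop reads shorter than twice the k-mer length" rule could be sketched roughly as below. This assumes the reads have already been converted to FASTA; the helper names `read_fasta` and `filter_by_length`, the toy reads, and the value of `k` are all hypothetical and chosen just for illustration.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, k):
    """Keep reads whose length is at least 2 * k; shorter reads are dropped."""
    min_len = 2 * k
    return [(h, s) for h, s in records if len(s) >= min_len]


# Toy input: r2 is shorter than 2 * k, and r3 spans two sequence lines.
example = [">r1", "ACGTACGTAC", ">r2", "ACGT", ">r3", "ACGTAC", "GT"]
kept = filter_by_length(read_fasta(example), k=4)
print([header for header, _ in kept])
```

For a real file, the same functions can be fed an open file handle instead of the in-memory list, and `k` varied between runs to compare the resulting assemblies.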

ADD REPLY • link 10.4 years ago by GouthamAtla 12k

Ram · Accepted Answer · 2014-11-24

0


10.4 years ago

rtliu ★ 2.2k

It seems this information was hidden: the CA minimum read length remains 64 bp.

ADD COMMENT • link updated 5.5 years ago by Ram 45k • written 10.4 years ago by rtliu ★ 2.2k

0


Hi, here is my read length distribution. Sorry, I could not get a web-link-based image. Can I get some idea of what the minimum read length should be now? Regards