whole genome analysis
8.3 years ago
reza ▴ 300

I am a beginner in bioinformatics and need some help. I have one whole-genome data set sequenced by Illumina, consisting of 4 files named HI*001.index_1.80_R1, 001.index_1.80_R2 and 002.index_180_R1 , 002.index_180_R2. I did my analysis with CLC and got paired distance ranges of 152-630 and 162-630 for the two sets of data (I paired 001.R1 with 001.R2, and 002.R1 with 002.R2). 1. What is the meaning of paired distance? Is it the insert size? Insert sizes reported in papers are a single value (e.g 520 kbp). How can I estimate the insert size with CLC? 2. What is the meaning of 001.R1, R2 and 002.R1, R2? Do I have two whole-genome sequencing runs with two different insert sizes for my individual?

sequencing next-gen Assembly • 1.5k views
8.3 years ago
Fabio Marroni ★ 3.0k

Yes, paired distance should be the insert size. The one thing to check is the definition: insert size is usually computed including the read lengths, and I am not sure whether paired distance also includes them. The insert size is a distribution, not a single number. Papers often report only the mode, the median or the average of that distribution. R1 and R2 are read 1 and read 2 of each pair. The number (001, 002) is usually associated with the sequencing lane. Having both 001 and 002 files probably means that the same library was run in two different sequencing lanes (the insert sizes are indeed very similar, so they are probably from the same library).
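To make the read-length point concrete, here is a minimal Python sketch. It assumes the common convention that insert size = distance between the reads + both read lengths; whether CLC's paired distance already includes the reads is not confirmed here. The function names and the numbers in the usage comment are hypothetical, for illustration only.

```python
import statistics


def insert_size_from_inner_distance(inner_distance, read1_len, read2_len):
    """Convert the gap between two reads into a full insert size.

    Assumes 'inner_distance' excludes the reads themselves. If the tool
    already reports the distance including the reads, no conversion is needed.
    """
    return inner_distance + read1_len + read2_len


def summarize_insert_sizes(sizes):
    """Return the single-value summaries that papers typically report."""
    return {
        "mode": statistics.mode(sizes),
        "median": statistics.median(sizes),
        "mean": statistics.mean(sizes),
    }


# Usage with made-up values:
# insert_size_from_inner_distance(320, 100, 100)
# summarize_insert_sizes([500, 520, 520, 540, 600])
```

A reported range such as 152-630 describes the spread of this distribution, while a single value in a paper is one of the summaries above.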


Thanks, Fabio, for your attention and help. In your opinion, why did Illumina give me 4 files (from approximately the same library) instead of 2? Are 2 files (001.R1 and 001.R2) not enough for my assembly and downstream analysis? Do 4 files from the same library help me get higher accuracy?


I think it's just a technical matter. Many samples are sequenced in at least two lanes (to reach a high enough sequencing yield), and each lane generates two files (R1 and R2). They are returned separately simply because this is how the instrument produces them. However, it is also useful to keep them separate so you can check the properties of each lane. Most assembly and mapping software can take multiple input files, so it's not a problem.
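If a tool only accepts one R1 and one R2 file, the lanes can be concatenated. The sketch below is one possible way to do it in Python, assuming uncompressed FASTQ files with four lines per record. The function names and the file names in the usage comment are hypothetical. The important point is to list the lanes in the same order for R1 and R2, so that read pairs stay in sync.

```python
import shutil


def concat_files(input_paths, output_path):
    """Append the input files to output_path, in the order given."""
    with open(output_path, "wb") as out:
        for path in input_paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)


def count_fastq_records(path):
    """Count records in an uncompressed FASTQ file (4 lines per record)."""
    with open(path, "rb") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


# Usage with hypothetical file names (same lane order for R1 and R2):
# concat_files(["lane1_R1.fastq", "lane2_R1.fastq"], "all_R1.fastq")
# concat_files(["lane1_R2.fastq", "lane2_R2.fastq"], "all_R2.fastq")
# The two merged files should then contain the same number of records:
# count_fastq_records("all_R1.fastq") == count_fastq_records("all_R2.fastq")
```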
