5.7 years ago
genya35
Hello,
I have two Illumina FASTQ files (read 1 and read 2) that contain reads from three different frameworks. Each framework has a different sequence length distribution from the other two. Are there any existing tools that can separate each framework into its own file based on the sequence length distribution? Do you have any other suggestions for how to separate reads from three different frameworks?
Thanks
What does "frameworks" exactly mean? Different sequencing platforms, or the same platform but different sequencers/run types (e.g. Illumina 2 x 100 bp, 2 x 50 bp, etc.)? You could look at using reformat.sh from the BBMap suite with the minlength= and maxlength= options if all the reads are the same length (but each with a distinct value) for the three platforms.
The same platform, but different primers for each framework (Illumina MiSeq; the average read length is 168 after joining the two reads). I will not know the exact size range for each framework, but the sequence sizes from the three frameworks will not overlap. I was hoping to find a tool that would group reads based on sequence length using statistical methods. Thanks
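If the three length ranges really do not overlap, the boundaries can probably be found without any heavy statistics by looking for gaps in the observed lengths. The following is only a minimal sketch under that assumption; read_lengths, length_groups and the min_gap threshold are hypothetical names chosen for illustration, and the input is assumed to be uncompressed FASTQ with exactly four lines per record (e.g. the joined reads).

```python
def read_lengths(fastq_lines):
    """Yield the sequence length of each record in an iterable of FASTQ lines.

    Assumes exactly four lines per record; the sequence is the second line.
    """
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:
            yield len(line.strip())


def length_groups(lengths, min_gap=5):
    """Split observed lengths into (low, high) ranges.

    A new range starts wherever two consecutive observed lengths differ by
    more than min_gap. min_gap is an arbitrary illustrative threshold.
    """
    observed = sorted(set(lengths))
    if not observed:
        return []
    groups = []
    low = prev = observed[0]
    for n in observed[1:]:
        if n - prev > min_gap:
            groups.append((low, prev))
            low = n
        prev = n
    groups.append((low, prev))
    return groups
```

With three non-overlapping frameworks this should return three ranges; if it returns more or fewer, min_gap likely needs adjusting, or the distributions are not as cleanly separated as assumed.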
Still not clear about what you mean by "framework". If the sizes are distinct, then you may be able to use the program above, or you may need to roll your own custom solution that just looks at the length of reads and bins them.
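A custom solution of the kind described, one that looks at each read's length and bins it, might look roughly like the sketch below. It is not an existing tool: bin_fastq_by_length, the output naming scheme and the "unbinned" catch-all file are assumptions made for illustration. It expects a single uncompressed FASTQ file with four lines per record (for example the joined reads) and inclusive (low, high) length ranges supplied by the user.

```python
from contextlib import ExitStack
from itertools import islice


def bin_fastq_by_length(fastq_path, ranges, out_prefix):
    """Write each FASTQ record to the file for the first range containing its length.

    ranges is a list of inclusive (low, high) tuples. Records matching no
    range go to an "unbinned" file. Returns a dict of record counts per bin.
    """
    labels = [f"{lo}-{hi}" for lo, hi in ranges] + ["unbinned"]
    counts = dict.fromkeys(labels, 0)
    with ExitStack() as stack:
        src = stack.enter_context(open(fastq_path))
        outs = {
            label: stack.enter_context(open(f"{out_prefix}_{label}.fastq", "w"))
            for label in labels
        }
        while True:
            record = list(islice(src, 4))  # header, sequence, '+', qualities
            if len(record) < 4:
                break
            n = len(record[1].strip())
            label = "unbinned"
            for (lo, hi), name in zip(ranges, labels):
                if lo <= n <= hi:
                    label = name
                    break
            outs[label].writelines(record)
            counts[label] += 1
    return counts
```

For example, bin_fastq_by_length("joined.fastq", [(100, 140), (150, 180), (200, 250)], "framework") would write framework_100-140.fastq and so on; the file name and ranges here are placeholders. For unjoined read 1 / read 2 files, both mates would need to be written together based on one agreed length criterion so that the pairs stay in sync, which this sketch does not handle.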