23 months ago
Apex92
Dear all,
I have a situation where I need to use all sequencing reads (a concatenated FASTA file) as a reference. The concatenated FASTA file has 50,829,402 reads. I tried to build the reference index with bowtie 1 using
bowtie-build concatenated.fasta ref
but I get the following error:
Error: Reference sequence has more than 2^32-1 characters! Please divide the
reference into batches or chunks of about 3.6 billion characters or less each
and index each independently.
How can I solve this? Instead of running bowtie-build on the concatenated FASTA file, can I run bowtie-build -f *.fasta ref?
Preferably I need to use bowtie 1.
Thanks.
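The error message above suggests dividing the reference into batches and indexing each one independently. A minimal sketch of that batching step in Python is shown here. The function names `read_fasta` and `chunk_records`, and the idea of counting only sequence characters against the budget, are assumptions for illustration and are not part of bowtie.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, seq = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        else:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)


def chunk_records(records: Iterable[Record], max_chars: int) -> Iterator[List[Record]]:
    """Group records into batches whose total sequence length stays at or
    below max_chars. A single record longer than max_chars still gets a
    batch of its own."""
    batch: List[Record] = []
    size = 0
    for header, seq in records:
        if batch and size + len(seq) > max_chars:
            yield batch
            batch, size = [], 0
        batch.append((header, seq))
        size += len(seq)
    if batch:
        yield batch


# Hypothetical usage (file names are placeholders, not from the thread):
# with open("concatenated.fasta") as fh:
#     for i, batch in enumerate(chunk_records(read_fasta(fh), 3_600_000_000)):
#         with open(f"ref_chunk_{i}.fasta", "w") as out:
#             for header, seq in batch:
#                 out.write(f">{header}\n{seq}\n")
```

Each resulting chunk file would then be indexed separately with bowtie-build, and the query sequences aligned against every index in turn.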
May I ask why you are doing that? I cannot imagine any situation where concatenating reads (which are randomly fragmented DNA, after all) would make any sense.
These are sequenced PEA products that I need to check against the expected PEA sequences. I first tried to map the reads, which are long (due to UMIs and primers), to the shorter expected PEA sequences, allowing zero mismatches, but I got no alignments that made sense. So I decided to do it the other way around and map the expected PEA sequences to the sequenced PEA products.
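When the expected sequences are shorter than the reads and no mismatches are allowed, the check described above amounts to asking whether each expected sequence occurs as an exact substring of a read. A minimal sketch of that idea follows, assuming plain DNA strings and that either strand may be present. The names `find_expected` and `reverse_complement` and the example sequences are made up for illustration and do not come from the assay.

```python
from typing import Dict, List

# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_expected(read: str, expected: Dict[str, str]) -> List[str]:
    """Return the names of expected sequences that occur exactly
    (zero mismatches) inside the read, on either strand."""
    read = read.upper()
    hits = []
    for name, seq in expected.items():
        seq = seq.upper()
        if seq in read or reverse_complement(seq) in read:
            hits.append(name)
    return hits


# Hypothetical example data, not real PEA sequences.
expected = {"probe1": "ACGTAC", "probe2": "GGGTTT"}
print(find_expected("TTTTACGTACCCCC", expected))
```

This scans every expected sequence against every read, so for tens of millions of reads an index-based aligner would be much faster; the sketch only illustrates what "zero mismatches" means here.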
What is PEA? I think it would make sense to include a layout and a brief technical description of what you did and what the R1/R2 structure and the reference look like; then one can suggest an alignment strategy. So far this sounds quite non-standard.
https://olink.com/our-platform/our-pea-technology/
Perhaps the kit includes access to data analysis software? https://olink.com/our-platform/our-pea-technology/data-generation-and-qc/
It is not clear if it is possible to analyze the data using standard software.
Yes, use their software if possible, or contact customer support. This does not seem to be a standard assay.