Question

Align short sequence against ONT reads

0

2.4 years ago

kirillkirilenko ▴ 40

Hi there! I have many .fq files containing long reads (generated on an ONT MinION). I also have a .fasta file containing one specific short sequence (an exon from a different organism). I want to find out whether this short sequence aligns to any of the long reads (i.e. is there orthology between the two organisms?). I do not know which aligner to use: I used to work with short reads, and I usually aligned them against a large reference genome (with bwa mem, for example). For now I do not want to assemble the .fastq files. Thanks in advance!

long-reads ONT alignment genome • 1.4k views

updated 8 months ago by Ram 44k • written 2.4 years ago by kirillkirilenko ▴ 40

0

Are the long reads very different from each other (e.g. covering different regions of the genome) or do they all represent the same (more or less) region? Why don't you align the long reads against the genome of the organism from which you've derived the (one?) exon sequence?

2.4 years ago by Friederike 9.0k

score 0 · Answer 1 · 2022-06-27

0

2.4 years ago

colindaven 7.0k

Very weird question, but ....

just use the gene/exon as a reference sequence
use minimap2 to align the ONT reads against this

Not sure if minimap2 will work well if the reads are so much longer than a (tiny) exon reference sequence, but you can try it.
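A minimal sketch of what that could look like, assuming minimap2 is installed and using placeholder file names (exon.fasta, ONT.fq, hits.paf). The `map-ont` preset and the default PAF output are standard minimap2 features; the helper `reads_covering_exon` and its 0.8 coverage cutoff are made up for illustration, and whether minimap2 seeds well on a short, divergent exon is exactly the open question here.

```shell
# Sketch: exon as the reference (target), ONT reads as the queries.
# File names and the 0.8 cutoff are placeholders, not tested values.

# Print the names of reads whose alignment covers at least min_frac of the exon.
# PAF columns used: 1 = read name, 7 = target length, 8 = target start, 9 = target end.
reads_covering_exon() {
    local paf=$1 min_frac=${2:-0.8}
    awk -v f="$min_frac" -F'\t' '($9 - $8) / $7 >= f { print $1 }' "$paf" | sort -u
}

# Run the alignment only if the tool and the input files are actually present.
if command -v minimap2 >/dev/null 2>&1 && [ -f exon.fasta ] && [ -f ONT.fq ]; then
    minimap2 -x map-ont exon.fasta ONT.fq > hits.paf   # PAF is minimap2's default output
    reads_covering_exon hits.paf 0.8
fi
```

If nothing comes back, that does not prove the exon is absent; it may only mean the seeds were too divergent to anchor an alignment.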

2.4 years ago by colindaven 7.0k

0

It won't work because minimap2 aligns against a large reference sequence. What I want to do is align each read in the .fastq files against a short reference sequence (the exon).

2.4 years ago by kirillkirilenko ▴ 40

1

ONT reads can have a lot of errors, and if the exon is too short the following solution might not work for noisy reads.
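To put a rough number on that concern: bbduk.sh (used below with k=31) matches exact k-mers by default, so a fragment is only caught if at least one 31-mer shared with the exon is free of errors. A back-of-the-envelope sketch, assuming independent per-base errors at rate e; the rates in the loop are illustrative, not measured on these data, and `kmer_intact_prob` is just a helper name made up here.

```shell
# Probability that one k-mer contains no sequencing error,
# assuming independent errors at per-base rate e: (1 - e)^k
kmer_intact_prob() {
    awk -v e="$1" -v k="$2" 'BEGIN { printf "%.4f\n", (1 - e) ^ k }'
}

# Illustrative error rates only
for e in 0.02 0.05 0.10; do
    printf 'e=%s  P(intact 31-mer)=%s\n' "$e" "$(kmer_intact_prob "$e" 31)"
done
```

Sequence divergence between the two organisms acts the same way as sequencing error here, so a shorter k may be worth trying if nothing matches.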

Use shred.sh from BBMap to generate 300 bp fragments from the ONT reads:

shred.sh in=ONT.fq out=ONT_frag.fq length=300 # ONT_frag.fq should retain the read header similar to the original but with a small modification

Example:

@NB501138:291:H7FCVBGXH:1:11101:15246:1057 1:N:0:1 # original header before shred

@NB501138:291:H7FCVBGXH:1:11101:15246:1057 1:N:0:1_0-19 # header after shred 

@NB501138:291:H7FCVBGXH:1:11101:15246:1057 1:N:0:1_20-39 # header after shred 

@NB501138:291:H7FCVBGXH:1:11101:15246:1057 1:N:0:1_40-59 # header after shred

Use bbduk.sh to identify the 300 bp fragment with the exon

bbduk.sh in=ONT_frag.fq outm=ONT_exon_frag.fq k=31 ref=exon.fasta # the headers in ONT_exon_frag.fq should tell you which ONT reads have the exon
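To go from matching fragments back to a list of original reads, something like the sketch below might help. It assumes the shredded headers have the form `<original header>_<start>-<end>`, as in the example above; check your own ONT_exon_frag.fq first and adjust the pattern if the suffix looks different. The function name and the output file reads_with_exon.txt are placeholders.

```shell
# Sketch: collapse matching fragments back to the reads they came from.
# Assumes each shredded header ends in "_<start>-<end>".
reads_with_exon() {
    # FASTQ headers are every 4th line, starting at line 1.
    awk 'NR % 4 == 1' "$1" \
        | sed -E 's/^@//; s/_[0-9]+-[0-9]+$//' \
        | sort -u
}

# Only run on the real file if it exists.
if [ -f ONT_exon_frag.fq ]; then
    reads_with_exon ONT_exon_frag.fq > reads_with_exon.txt
fi
```

The resulting list can then be used to pull the full-length reads out of the original .fq files for a closer look.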