Hi all,
I want to align the coding sequence of a gene (which I cropped out from hg38) to the genomic contigs of a non-human ape. All contigs are in a single FASTA file (2.8 GB), and the coding sequence that I want to align to it is ~400 bp long. I've been using minimap2 with this command:
minimap2 -ax map-ont ape_genome_contigs.fasta human_gene_coding_seq.fa > aln.sam
And here is the output of this command:
goekberk@bonobo:/disk1/goekberk$ minimap2/minimap2 -ax map-ont ape_genome_contigs.fasta human_gene_coding_seq.fa.fa > aln.sam
[M::main::11.782*1.00] loaded/built the index for 2771 target sequence(s)
[M::mm_mapopt_update::13.971*1.00] mid_occ = 630
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 2771
[M::mm_idx_stat::15.343*1.00] distinct minimizers: 99452876 (39.18% are singletons); average occurrences: 5.363; average spacing: 5.304
[M::main] Version: 2.17-r943-dirty
[M::main] CMD: minimap2/minimap2 -ax map-ont ape_genome_contigs.fasta human_gene_coding_seq.fa.fa
[M::main] Real time: 16.021 sec; CPU: 16.022 sec; Peak RSS: 7.437 GB
However, for some reason, the SAM files that I generate contain 0 alignments.
$ samtools view -c aln.sam
0
I tried the map-pb and asm10 options, as well as the command minimap2 -a [-x preset] target.mmi query.fa > output.sam, but could not get a successful alignment. I was wondering if I'm missing something crucial about long-read alignment to a set of genomic contigs. Any help is much appreciated.
Cheers, Gökberk
Have you tried using blast+?
I tried that, but since the genome is about 2.8 GB, I could not manage to upload it and received errors even though I tried multiple times.
You should try this with a local install of blast+. Even blat may be worth a try. No indexing needed.
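A minimal sketch of what such a local search could look like, assuming blast+ is installed and on your PATH. The input file names are taken from the question. The output name blast_hits.tsv, the e-value cutoff, and the choice of the dc-megablast task are assumptions made for illustration, not settings suggested in this thread. The -subject option lets blastn search a FASTA file directly without building a database first; for a 2.8 GB target, building one with makeblastdb beforehand is likely to be faster.

```shell
#!/bin/sh
# Input names come from the question; OUT is a hypothetical output name.
QUERY="human_gene_coding_seq.fa"
TARGET="ape_genome_contigs.fasta"
OUT="blast_hits.tsv"

# Run only if blastn and both input files are actually present.
if command -v blastn >/dev/null 2>&1 && [ -f "$QUERY" ] && [ -f "$TARGET" ]; then
    # -subject searches the FASTA file directly, so no makeblastdb step is needed.
    # dc-megablast is the blastn task aimed at cross-species comparisons (assumed choice).
    # -outfmt 6 writes tab-separated output, one hit per line.
    blastn -task dc-megablast -query "$QUERY" -subject "$TARGET" \
           -outfmt 6 -evalue 1e-5 -out "$OUT"
    # With tabular output, the line count equals the number of reported hits.
    wc -l < "$OUT"
else
    echo "blastn or input files not found; skipping search" >&2
fi
```

Because the coding sequence was cut out of a genome, hits against genomic contigs may come back as several shorter segments (one per exon) rather than one full-length match, so it is worth looking at the query start and end columns of each hit.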