4.7 years ago
eennadi
Hello,
I have been trying to find repeats using RepeatModeler. It keeps dying at the same point, and I have reinstalled it several times with no success.
The log file is below:
Building database mucdb:
Reading ~/mysseq.fasta...
Number of sequences (bp) added to database: 546 ( 489829010 bp )
RepeatModeler Version 2.0.1
Search Engine = rmblast 2.10.0+
Dependencies: TRF 4.09, RECON , RepeatScout 1.0.6, RepeatMasker
LTR Structural Analysis: Enabled ( GenomeTools 1.6.1, LTR_Retriever v2.8.5,
Ninja 0.95-cluster_only, MAFFT 7.455,
CD-HIT 4.8.1 )
Random Number Seed: 1585277998
Database = mucadb .
Sequences = 546
Bases = 489829010
N50 = 3469483
Contig Histogram:
Size(bp) Count
15019389-16092152 | [ 1 ]
13946626-15019388 | [ ]
12873863-13946625 | [ ]
11801100-12873862 | [ 1 ]
10728338-11801100 | [ 2 ]
9655575-10728337 | [ 1 ]
8582812-9655574 | [ 5 ]
7510049-8582811 | [ 1 ]
6437286-7510048 | [ 5 ]
5364524-6437286 | [ 3 ]
4291761-5364523 | [ 7 ]
3218998-4291760 |* [ 13 ]
2146235-3218997 |*** [ 31 ]
1073472-2146234 |***** [ 45 ]
710-1073472 |************************************************** [ 430 ]
Using output directory = ~/maker/repeat/RM_20957.FriMar270400042020
Storage Throughput = poor ( 234.38 MB/s )
RepeatModeler Round # 1
Searching for Repeats
-- Sampling from the database...
Gathering up to 40000000 bp
Final Sample Size = 40033332 bp ( 40033332 non ambiguous )
Num Contigs Represented = 230
Sequence extraction : 00:00:05 (hh:mm:ss) Elapsed Time
-- Running RepeatScout on the sequences...
RepeatScout: Running build_lmer_table ( l = 14 )..
RepeatScout: Running RepeatScout.. : 1307 raw families identified
RepeatScout: Running filtering stage.. 1073 families remaining
RepeatScout: 00:11:02 (hh:mm:ss) Elapsed Time
Large Satellite Filtering.. : 2 found in 00:00:13 (hh:mm:ss) Elapsed Time
Collecting repeat instances...
-- Refining Family R=146 / 0 ( RS Elements: 2286, Using 100 )
-- Refining Family R=233 / 1 ( RS Elements: 1970, Using 100 )
-- Refining Family R=1 / 2 ( RS Elements: 1947, Using 100 )
-- Refining Family R=212 / 3 ( RS Elements: 1896, Using 100 )
-- Refining Family R=37 / 4 ( RS Elements: 1761, Using 100 )
-- Refining Family R=209 / 5 ( RS Elements: 1758, Using 100 )
-- Refining Family R=25 / 6 ( RS Elements: 1754, Using 100 )
-- Refining Family R=429 / 691 ( RS Elements: 16, Using 16 )
Family Refinement: 02:37:38 (hh:mm:ss) Elapsed Time
RepeatModeler Round # 2
Searching for Repeats
-- Sampling from the database...
Gathering up to 3000000 bp
Sequence extraction : 00:00:00 (hh:mm:ss) Elapsed Time
-- Running TRFMask on the sequence...
360 Tandem Repeats Masked
TRFMask time 00:00:35 (hh:mm:ss) Elapsed Time
-- Masking repeats from the previous rounds...
Masking 1 - 5 of 78
Masking 16 - 30 of 78
Masking 41 - 65 of 78
Masking 76 - 78 of 78
TE Masking time 00:00:23 (hh:mm:ss) Elapsed Time
-- Sample Stats:
Sample Size 3011011 bp
Num Contigs Represented = 53
Non ambiguous bp:
Initial: 3011011 bp
After Masking: 2607822 bp
Masked: 13.39 %
-- Input Database Coverage: 3011011 bp out of 489829010 bp ( 0.61 % )
Sampling Time: 00:00:58 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
2% completed, 00:2:13 (hh:mm:ss) est. time remaining.
3% completed, 00:1:12 (hh:mm:ss) est. time remaining.
6% completed, 00:1:53 (hh:mm:ss) est. time remaining.
9% completed, 00:1:50 (hh:mm:ss) est. time remaining.
11% completed, 00:1:26 (hh:mm:ss) est. time remaining.
12% completed, 00:1:28 (hh:mm:ss) est. time remaining.
14% completed, 00:1:19 (hh:mm:ss) est. time remaining.
17% completed, 00:1:12 (hh:mm:ss) est. time remaining.
18% completed, 00:1:11 (hh:mm:ss) est. time remaining.
19% completed, 00:1:11 (hh:mm:ss) est. time remaining.
19% completed, 00:1:08 (hh:mm:ss) est. time remaining.
21% completed, 00:1:07 (hh:mm:ss) est. time remaining.
23% completed, 00:0:58 (hh:mm:ss) est. time remaining.
24% completed, 00:0:57 (hh:mm:ss) est. time remaining.
26% completed, 00:0:52 (hh:mm:ss) est. time remaining.
Everything looks fine in this log. What exactly is the problem?
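Since the log above ends mid-run without an error message, it may help to state exactly where it stops. Below is a minimal sketch, not part of RepeatModeler itself, that scans a saved copy of the log and reports the last round and the last "% completed" value it saw. The function name `last_progress` is hypothetical, and the patterns assume the log lines look like the ones pasted in the question.

```python
import re

def last_progress(log_text):
    """Return (last round number, last percent completed) found in the log text.

    Either value is None if no matching line was seen.
    """
    last_round = None
    last_percent = None
    for line in log_text.splitlines():
        # Round headers look like "RepeatModeler Round # 2" in the pasted log.
        m = re.search(r"RepeatModeler Round #\s*(\d+)", line)
        if m:
            last_round = int(m.group(1))
            last_percent = None  # progress is reported per round, so reset it
            continue
        # Progress lines look like "26% completed, 00:0:52 (hh:mm:ss) est. time remaining."
        m = re.search(r"(\d+)% completed", line)
        if m:
            last_percent = int(m.group(1))
    return last_round, last_percent

# Lines copied from the log in the question.
sample = """RepeatModeler Round # 1
RepeatModeler Round # 2
Running all-by-other comparisons...
24% completed, 00:0:57 (hh:mm:ss) est. time remaining.
26% completed, 00:0:52 (hh:mm:ss) est. time remaining.
"""
print(last_progress(sample))  # (2, 26)
```

Reporting that pair (here, round 2 at 26% of the all-by-other comparisons) together with any message printed to the terminal or the scheduler's error file would make it clearer where the run actually dies.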