Transposable element search using Ion Torrent short single-end reads and assembled data from a plant
9.2 years ago
mirza ▴ 180

Hello everyone,

I have Ion Torrent short single-end reads and assembled data (CLC Genomics Workbench) for a plant. Can anyone tell me:

  1. Whether I should use the reads or the assembled contigs
  2. What the best tools are for such data
  3. Where to find instructions on how to run the suggested software
clc transposon iontorrent • 3.8k views

Hi,

Sorry for this late reply.

I have single-end Ion Torrent data, not paired-end data. Is it possible to run it in Transposome?


Yes, sorry, I thought it was clear above (in my comment). The only difference is that you change the default overlaps for long reads. Otherwise, you can run it just the same as with paired-end data (and the results will be comparable).


Hello Staton,

I was trying to install your tool, Transposome, using the commands given. I installed all the dependencies successfully, but I am stuck at the last step of the installation. I am copying the commands and the error here. I would appreciate your help.

smarla@smarla-HP-Z400-Workstation:~$ cd Transposome
smarla@smarla-HP-Z400-Workstation:~/Transposome$ perl Makefile.PL
g++ -o graph_binary.o -c graph_binary.cpp -ansi -O5 -Wall
g++ -o community.o -c community.cpp -ansi -O5 -Wall
g++ -o main_community.o -c main_community.cpp -ansi -O5 -Wall
g++ -o louvain_community graph_binary.o community.o main_community.o -ansi -lm -Wall
g++ -o graph.o -c graph.cpp -ansi -O5 -Wall
g++ -o main_convert.o -c main_convert.cpp -ansi -O5 -Wall
g++ -o louvain_convert graph.o main_convert.o -ansi -lm -Wall
g++ -o main_hierarchy.o -c main_hierarchy.cpp -ansi -O5 -Wall
g++ -o louvain_hierarchy main_hierarchy.o -ansi -lm -Wall
Generating a Unix-style Makefile
Writing Makefile for Transposome
Writing MYMETA.yml and MYMETA.json
smarla@smarla-HP-Z400-Workstation:~/Transposome$ make
Skip blib/lib/Transposome/SeqUtil.pm (unchanged)
Skip blib/lib/Transposome/Cluster.pm (unchanged)
Skip blib/lib/Transposome/Annotation/Mapping.pm (unchanged)
Skip blib/lib/Transposome.pm (unchanged)
Skip blib/lib/Transposome/Test/TestFixture/TestConfig.pm (unchanged)
Skip blib/lib/Transposome/SeqIO.pm (unchanged)
Skip blib/lib/Transposome/Annotation/Typemap.pm (unchanged)
Skip blib/lib/Transposome/Role/Config.pm (unchanged)
Skip blib/lib/Transposome/PairFinder.pm (unchanged)
Skip blib/lib/Transposome/SeqFactory.pm (unchanged)
Skip blib/lib/Transposome/Run/Blast.pm (unchanged)
Skip blib/lib/Transposome/SeqIO/fastq.pm (unchanged)
Skip blib/lib/Transposome/Annotation.pm (unchanged)
Skip blib/lib/Transposome/Annotation/Methods.pm (unchanged)
Skip blib/lib/Transposome/Annotation/Search.pm (unchanged)
Skip blib/lib/Transposome/Annotation/Summary.pm (unchanged)
Skip blib/lib/Transposome/Role/File.pm (unchanged)
Skip blib/lib/Transposome/SeqIO/fasta.pm (unchanged)
Skip blib/lib/Transposome/Role/Types.pm (unchanged)
Skip blib/lib/Transposome/Role/Util.pm (unchanged)
Skip blib/lib/Transposome/Test/TestFixture.pm (unchanged)
cp bin/louvain_community blib/bin/louvain_community
"/usr/bin/perl" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/bin/louvain_community
cp bin/formatdb blib/bin/formatdb
"/usr/bin/perl" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/bin/formatdb
cp bin/louvain_convert blib/bin/louvain_convert
"/usr/bin/perl" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/bin/louvain_convert
cp bin/transposome blib/bin/transposome
"/usr/bin/perl" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/bin/transposome
cp bin/mgblast blib/bin/mgblast
"/usr/bin/perl" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/bin/mgblast
cp bin/louvain_hierarchy blib/bin/louvain_hierarchy
"/usr/bin/perl" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/bin/louvain_hierarchy
Manifying 1 pod document
Manifying 21 pod documents
smarla@smarla-HP-Z400-Workstation:~/Transposome$ make test
PERL_DL_NONLAZY=1 "/usr/bin/perl" "-MExtUtils::Command::MM" "-MTest::Harness" "-e" "undef *Test::Harness::Switches; test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/00-load.t ............. 7/13 # Testing Transposome 0.09.8, Perl 5.014002, /usr/bin/perl
t/00-load.t ............. ok
t/01-utils_config.t ..... ok
t/02-utils_seq.t ........ ok
t/03-utils_blast.t ...... ok
t/04-seqio.t ............ ok
t/05-seqio-fasta-fh.t ... ok
t/06-seqio-fastq-fh.t ... ok
t/07-seqstore.t ......... ok
t/08-seqsample.t ........ ok
t/09-megablast.t ........ ok
t/10-pairfinder.t ....... ok
t/11-cluster.t .......... ok
t/12-annotation.t ....... ok
t/13-allmethods.t ....... ok
t/14-analysis_steps.t ... ok
t/15-transposome_app.t .. ok
All tests successful.
Files=16, Tests=1052, 95 wallclock secs ( 0.17 usr  0.03 sys + 14.14 cusr  3.46 csys = 17.80 CPU)
Result: PASS
smarla@smarla-HP-Z400-Workstation:~/Transposome$ make install
Manifying 1 pod document
Manifying 21 pod documents
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
ERROR: Can't create '/usr/local/bin'
Do not have write permissions on '/usr/local/bin'
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 at -e line 1.
make: *** [pure_site_install] Error 13
smarla@smarla-HP-Z400-Workstation:~/Transposome$

This is weird. Nothing gets installed into /usr/local/bin or under /usr/local, so I'm not sure what is going on. What OS/distribution are you using?


I am using Ubuntu 12.04 LTS.


I tested on Ubuntu 12.04, and it does install under /usr/local on that system if you are using the system Perl. You just need to type "sudo make install" to install it. Otherwise, I suggest setting up perlbrew (it is very easy, and there are copy-and-paste commands to do it on the Transposome wiki under "installing dependencies") so you don't need admin rights to do anything with Perl.


Hi,

I used sudo make install, and here is the result:

Appending installation info to /usr/lib/perl/5.14/perllocal.pod
smarla@smarla-HP-Z400-Workstation:~/Transposome-0.09.7$

It's installed, thanks! :)

Now, I have the following questions:

  1. What should be the size of my input file?
  2. What's the better strategy: using raw reads or contigs?

  1. I would start with 100,000 reads sampled at random from the whole data set. Then you can increase the number after seeing how long the analysis takes.
  2. Definitely use raw reads.
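
The sampling step suggested above can be sketched in Python with reservoir sampling, which draws a uniform random subset in a single pass without holding the whole data set in memory. This is only a minimal sketch and not part of Transposome: the names `read_fastq` and `sample_fastq` and the file names in the comments are hypothetical, and it assumes a well-formed FASTQ file with exactly four lines per record.

```python
import random
from itertools import islice


def read_fastq(handle):
    """Yield FASTQ records as 4-line tuples (assumes strict 4-line records)."""
    lines = iter(handle)
    while True:
        record = tuple(islice(lines, 4))
        if len(record) < 4:
            return
        yield record


def sample_fastq(handle, n, seed=None):
    """Return n records drawn uniformly at random via reservoir sampling.

    If the input holds fewer than n records, all of them are returned.
    """
    rng = random.Random(seed)
    reservoir = []
    for i, record in enumerate(read_fastq(handle)):
        if i < n:
            # Fill the reservoir with the first n records.
            reservoir.append(record)
        else:
            # Keep the new record with probability n / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = record
    return reservoir


# Example usage (hypothetical file names):
#
# with open("all_reads.fastq") as fin, open("sample_100k.fastq", "w") as fout:
#     for record in sample_fastq(fin, 100_000, seed=42):
#         fout.writelines(record)
```

Fixing `seed` makes the subsample reproducible, which helps when comparing runs at different sample sizes.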

Thanks SES, I appreciate it.


Ok thanks!!

9.2 years ago
SES 8.6k

1. Use the raw reads.

2. That depends a little on what it is you want to do and the species, but RepeatExplorer and Transposome can be used.

3. RepeatExplorer and Transposome are a good place to start.

For some context, genome assembly is complicated by repeats, and the regions that are typically missing, compressed, or misassembled are the repetitive regions. Therefore, you don't want to do a low-coverage assembly to look for repeats. If you have a high-quality draft supported by genetic and physical maps, cytogenetic data, etc., then use the assembly. If not, you are going to be telling lies!

RepeatExplorer and Transposome (which I developed) were both designed around solving problems with plant genomes, so this is an ideal use case. RepeatExplorer underestimates the repeat abundance (sometimes by a lot), so this is important to consider if you are planning a biological or evolutionary study. On the other hand, it may be easier to use (web vs. command line) depending on your background, albeit much slower. I don't have experience running RepeatExplorer with single-end data, but I can tell you that Transposome seems to do better, in terms of biological expectations, with long reads, including single-end reads, so this should work well. If you have any questions, feel free to ask.


Has anyone developed a k-mer based approach to estimating TE abundance as % of the genome?


Yes, you can use k-mer frequency to look at repeat properties in the genome (uniqueness, occurrence ratios, etc.) but this is only really informative (biologically) when combined with information about what TEs are in a genome. Without that, you can't say a whole lot based on k-mers alone. This approach is super useful for comparative purposes though, and for visualizing the genomic abundance of repeats (in Fig. 2 of this paper I did this to show the genomic abundance of repeats in a BAC).
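
As a rough illustration of the k-mer idea, the sketch below counts every k-mer in a set of reads and reports the fraction of k-mer occurrences that come from k-mers seen at least a chosen number of times. It is an assumption for illustration only, not the method used in the paper mentioned: the function names, `k`, and the `min_count` threshold are hypothetical, only the forward strand is counted, and, as said above, the resulting number is not a TE abundance estimate without knowing which TEs are present.

```python
from collections import Counter


def kmer_counts(sequences, k):
    """Count every k-mer (forward strand only) across the sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" not in kmer:  # skip k-mers containing ambiguous bases
                counts[kmer] += 1
    return counts


def repeated_fraction(counts, min_count):
    """Fraction of k-mer occurrences whose k-mer occurs >= min_count times."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    repeated = sum(c for c in counts.values() if c >= min_count)
    return repeated / total
```

For real data sets a dedicated k-mer counter would be far faster; the point here is only to show what "occurrence ratio" means in practice.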


Thanks, SES, for your reply. I did go through your tool's home page, and it says that it's for paired-end reads?!


If you are referring to Transposome, see this page for how to use long-read data (those directions should be good for Ion Torrent data, assuming read lengths of ~400 bp).
