Hello Biostars,
I was trying out a relatively new piece of software, proTRAC, to map piRNA clusters. The software calls for ELAND-formatted input files. While looking for conversion tools I found pyicos, which converts SAM to ELAND, but its ELAND output had many more fields than the proTRAC documentation specifies. It looks like that is a different version of the ELAND format. The bottom line is that input files for proTRAC need to be in the format described in the proTRAC documentation excerpt shown below.
My question is:
Does anyone have experience using this software, or has anyone converted SAM to ELAND3 with a tool? Any suggestions would help :)
Thanks :)
Geneart.
4. Input file

proTRAC uses a list of mapped sequence reads (ELAND3) generated by the SeqMap mapping tool (Jiang, H., Wong, W.H. (2008) SeqMap: Mapping Massive Amount of Oligonucleotides to the Genome, Bioinformatics, 24(20)). SeqMap is freely available at http://www-personal.umich.edu/~jianghui/seqmap/. Map your sequence dataset in FASTA-format to a genome of your choice. Many genomes are available at ftp://ftp.ncbi.nih.gov/genomes/. To obtain the correct output format, run SeqMap with the option /output_all_matches. Use the generated output file without any changes as input file for proTRAC. If your sequence dataset contains transcriptional information (a non-redundant FASTA file where each FASTA title refers to the number of identical sequence reads),
>1
ATGGCTCGACTCGCGATAC
>45
TGGCTTTATTGCGCTTTTAACA
>12
ATTCGCTAACGGGCGAAAAG
this information can be used to display different transcription rates within one cluster, since FASTA titles are saved and can be extracted from the SeqMap output file:
trans_id  trans_coord  target_seq              probe_id  probe_seq               num_mismatch  strand
Chr1      10368        ATGGCTCGACTCGCGATAC     1         ATGGCTCGACTCGCGATAC     0             -
Chr1      44754        ATTCGCTAACGGGCGAAAAG    12        ATTCGCTAACGGGCGAAAAG    0             -
Chr1      56834        TGGCTTTATTGCGCTTTTAACA  45        TGGCTTTATTGCGCTTTTAACA  0             -
Chr1      96823        ATTCGCTAACGGGCGAAAAG    12        ATTCGCTAACGGGCGAAAAG    0             -
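
To make the target format concrete, here is a rough Python sketch of how the seven columns above might be filled in from SAM records. It is not a tested converter and not part of proTRAC, SeqMap or pyicos. The function names (`sam_line_to_row`, `convert`) are made up for illustration, and it rests on several assumptions that would need checking against real SeqMap output: only perfect full-length matches are kept (so target_seq can simply repeat the read), reads on the minus strand are turned back into their original orientation (as the example rows above seem to show), the SAM read name holds the FASTA title, and the SAM 1-based leftmost position is used unchanged as trans_coord.

```python
# Sketch only: writes SeqMap-style 7-column rows from SAM alignment lines.
# All names here are hypothetical; the column layout is copied from the
# proTRAC documentation excerpt above.

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

HEADER = ["trans_id", "trans_coord", "target_seq", "probe_id",
          "probe_seq", "num_mismatch", "strand"]


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def sam_line_to_row(line):
    """Return the 7 output fields for one SAM line, or None to skip it."""
    if line.startswith("@"):           # SAM header line
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:               # not a complete alignment record
        return None
    qname, flag, rname, pos = fields[0], int(fields[1]), fields[2], fields[3]
    cigar, seq = fields[5], fields[9]
    if flag & 0x4 or seq == "*":       # unmapped, or sequence not stored
        return None
    # ASSUMPTION: keep only ungapped, unclipped, full-length alignments.
    if cigar not in ("%dM" % len(seq), "%d=" % len(seq)):
        return None
    # ASSUMPTION: NM:i:0 (edit distance 0) marks a perfect match, so the
    # genomic target sequence is identical to the read.
    edit_distance = None
    for tag in fields[11:]:
        if tag.startswith("NM:i:"):
            edit_distance = int(tag[5:])
    if edit_distance != 0:
        return None
    if flag & 0x10:
        # SAM stores minus-strand reads reverse-complemented; restore the
        # original read. ASSUMPTION: SeqMap reports it this way too.
        strand, probe = "-", reverse_complement(seq)
    else:
        strand, probe = "+", seq
    # ASSUMPTION: SAM POS (1-based, leftmost) matches SeqMap's trans_coord.
    return [rname, pos, probe, qname, probe, "0", strand]


def convert(sam_lines):
    """Return the output lines (header first) for an iterable of SAM lines."""
    out = ["\t".join(HEADER)]
    for line in sam_lines:
        row = sam_line_to_row(line)
        if row is not None:
            out.append("\t".join(row))
    return out


# Small in-memory demo built from two of the reads in the excerpt above.
example_sam = [
    "@SQ\tSN:Chr1\tLN:100000",
    "1\t16\tChr1\t10368\t255\t19M\t*\t0\t0\tGTATCGCGAGTCGAGCCAT\t*\tNM:i:0",
    "45\t0\tChr1\t56834\t255\t22M\t*\t0\t0\tTGGCTTTATTGCGCTTTTAACA\t*\tNM:i:0",
]
for output_line in convert(example_sam):
    print(output_line)
```

For a real file, `convert` could be given an open file handle instead of the demo list. Since the documentation asks for SeqMap's /output_all_matches, the SAM file would presumably also need to contain every hit of each read, with its sequence, for the result to be comparable.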