Recently, I came across two easy, handy online tools for genomic analysis and data visualization. When you are processing large amounts of sequencing data, these two tools offer a portable approach and quick results, with no need to install anything on your own PC. I'd like to share them with you; neither is complicated to use.
**1. getorf - ORF prediction**
getorf is a software tool (also available online) that finds and extracts the open reading frames (ORFs) present in any sequence the user supplies as input. It is a command-line program from EMBOSS (the European Molecular Biology Open Software Suite) and belongs to the “Nucleic: Gene finding” command group.
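To make the idea of ORF extraction concrete, here is a minimal Python sketch. It is not getorf's implementation: it assumes the standard genetic code, treats an ORF as a region from an ATG start codon to the next in-frame stop codon, and scans only the three forward frames. The function name `find_orfs` and its `min_len` parameter are made up for this illustration.

```python
# Minimal sketch of ORF finding (NOT the EMBOSS algorithm).
# Assumptions: standard genetic code, ATG as the only start codon,
# forward strand only.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(seq, min_len=30):
    """Return (start, end, frame) tuples for ATG..stop regions.

    Coordinates are 0-based and half-open; the stop codon is included.
    Only ORFs of at least `min_len` nucleotides are reported.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_len:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and look for the next ATG
    return orfs


if __name__ == "__main__":
    # Toy sequence invented for this example.
    print(find_orfs("CCATGAAATTTGGGTAGCC", min_len=9))
```

A real run on genomic data would also need the reverse strand and the right genetic code for the organism, which is exactly what the tool's options take care of.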
getorf can work with a single nucleotide sequence or with several. The program takes a standard EMBOSS sequence query (also known as a ‘USA’) as input; the main database sources are srs:embl, srs:uniprot and ensembl, as defined in EMBOSS installations. Alternatively, data can be read from sequence files written by EMBOSS or a third-party application, as long as the format is supported. The input format can be specified with the command-line qualifier -sformat ‘xxx’, where ‘xxx’ is replaced by the format name. The available formats are: gff (gff3), gff2, embl (em), genbank (gb, refseq), ddbj, refseqp, pir (nbrf), swissprot (swiss, sw), dasgff and debug.
Once the input format and sequence are defined, the tool can be used in a straightforward way from the interface, as shown below. The search can be customized with parameters such as the organism, the minimum and maximum size of ORF to report, the type of sequence (circular or linear), and the number of flanking nucleotides to report. The type of output and its format can also be set as needed. When dealing with larger sequences, one can also opt to receive the results by email.
Figure 1 Screenshot of getorf interface
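If you happen to have EMBOSS installed locally (an assumption; the web interface needs none of this), the same options can be passed on the command line. The sketch below only assembles the argument list in Python and runs it if a `getorf` executable is found. The helper names `build_getorf_command` and `run_getorf` and the file names are made up for illustration; I am also assuming that the "organism" choice in the interface corresponds to the genetic code table set with `-table`.

```python
import shutil
import subprocess


def build_getorf_command(infile, outfile, minsize=30, maxsize=None,
                         table=0, circular=False, flanking=None,
                         sformat=None):
    """Assemble a getorf command line as a list of arguments.

    table    : genetic code table number (0 is the standard code)
    minsize  : minimum ORF size to report, in nucleotides
    maxsize  : maximum ORF size to report (left at the default if None)
    circular : treat the input sequence as circular
    flanking : number of flanking nucleotides to report
    sformat  : input format name, e.g. "embl" (auto-detected if None)
    """
    cmd = ["getorf", "-sequence", infile, "-outseq", outfile,
           "-minsize", str(minsize), "-table", str(table)]
    if maxsize is not None:
        cmd += ["-maxsize", str(maxsize)]
    if flanking is not None:
        cmd += ["-flanking", str(flanking)]
    if sformat is not None:
        cmd += ["-sformat", sformat]
    if circular:
        cmd.append("-circular")
    return cmd


def run_getorf(cmd):
    """Run the command only if getorf is on PATH; return True if it ran."""
    if shutil.which(cmd[0]) is None:
        return False
    subprocess.run(cmd, check=True)
    return True


if __name__ == "__main__":
    # Hypothetical file names; bacterial genetic code (table 11).
    cmd = build_getorf_command("genome.fa", "orfs.fa", minsize=300,
                               table=11, circular=True)
    print(" ".join(cmd))
```

Printing the command before running it makes it easy to compare against the settings you chose in the web form.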
There are three ways to upload the sequences. After entering the sequence, click ‘Run getorf’ and you will get output like the sequences shown below. Finally, copy and save the results; you can then use blastp for the follow-up analysis.
Figure 2 My Output example
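Before sending the saved results to blastp, it can help to drop very short ORFs. The following is a minimal sketch that assumes the output was saved in FASTA format; the function names, the length cutoff and the example records are all made up for illustration.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines may be wrapped
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def filter_by_length(records, min_aa=100):
    """Keep only records whose sequence has at least `min_aa` residues."""
    return [(h, s) for h, s in records if len(s) >= min_aa]


def to_fasta(records, width=60):
    """Format records back into FASTA text, wrapped at `width` columns."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Invented example records, not real getorf output.
    example = ">seq1_1 [2 - 16]\nMKFG\n>seq1_2 [30 - 70]\nMATLLKQ\nPRST\n"
    kept = filter_by_length(read_fasta(example), min_aa=10)
    print(to_fasta(kept), end="")
```

The filtered text can then be pasted into blastp or saved to a file for a local search.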
**2. jvenn - Venn diagram drawing**
In biology, Venn diagrams show the differences and overlaps between gene lists that originate from different analyses, making comparisons easy to display. However, when more than four lists are compared, the diagram becomes cumbersome to read and analyze. To address this shortcoming of existing software packages, jvenn was developed as a jQuery plugin.
jvenn can handle up to six input lists, using either classical or Edwards-Venn layouts. User inputs can be easily customized to optimize the output, and jvenn can easily be embedded in a web page, making the page more dynamic. The jvenn library comes with full documentation and examples, can export figures as .png/.svg and statistics as .csv, and is freely available at http://bioinfo.genotoul.fr/jvenn. Another attractive feature is that the web application needs no local installation. Statistics charts built into jvenn give a simple, quick overview of the sizes of the lists as well as their overlaps, giving it an edge over comparable packages available so far.
Three input formats are accepted by the jvenn library: “lists”, “intersection counts” and “count lists”. With “lists”, each list gives its elements in the “data” field. With “intersection counts”, each list is given a label (‘A’, ‘B’ and so on), which links the list to its counts. “Count lists” give a count for each individual element in a list. The display is intuitive to read: for example, when the user points at an intersection count, all lists sharing that intersection are highlighted.
Figure 3 My Example from jvenn
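To show how plain lists relate to intersection counts, here is a small Python sketch that counts, for every region of the diagram, the elements found in exactly that combination of lists. This is not jvenn's code (jvenn is a JavaScript library); the function name and the gene identifiers are invented, and the single-letter labels simply follow the ‘A’, ‘B’ convention described above.

```python
from collections import Counter


def exclusive_intersection_counts(named_lists):
    """Count elements per Venn region.

    `named_lists` maps a label such as "A" to a list of elements.
    The result maps a region label such as "AB" to the number of
    elements present in exactly those lists and in no others.
    """
    membership = {}  # element -> set of labels of the lists containing it
    for label, items in named_lists.items():
        for item in set(items):  # ignore duplicates within a list
            membership.setdefault(item, set()).add(label)
    counts = Counter("".join(sorted(labels))
                     for labels in membership.values())
    return dict(counts)


if __name__ == "__main__":
    # Invented gene lists for three groups.
    lists = {
        "A": ["g1", "g2", "g3", "g4"],
        "B": ["g3", "g4", "g5"],
        "C": ["g4", "g6"],
    }
    for region, n in sorted(exclusive_intersection_counts(lists).items()):
        print(region, n)
```

With six lists there are up to 63 such regions, which is why a tool that highlights them interactively is so much easier to read than a table of counts.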
Hence, jvenn provides a user-friendly interface with enhanced readability for a range of analytical visualizations, for example of OTUs (operational taxonomic units). jvenn makes it much easier in bioinformatic analysis to profile the counts of common and unique differentially expressed genes across multiple groups.
Sourced from Novogene Website: