I have three TSV files, each with two columns: the first holds gene name and effect, the second probe and effect, and the third gene and probe. The first is by far the largest, and I want to visualize the full network in Cytoscape. However, that first file has about 10M edges, and Cytoscape has trouble loading it.
The other two files are much smaller, and in particular the number of probes is relatively small. So I am looking for a command-line tool that keeps only those lines (interactions) from the gene-effect file for which the gene is found in the third file and the effect is found in the second.
What command-line setup can I use for this, or do I need to hack something up in Perl, Python, or Groovy?
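To make the intended filter concrete, here is a minimal Python sketch of what I have in mind. It assumes the column order described above (gene in column 1 of the gene-probe file, effect in column 2 of the probe-effect file), no header rows, and that the two small files fit in memory. The function names and the command-line argument order are my own placeholders, not an existing tool.

```python
import sys


def load_column(path, index):
    """Collect the distinct values found in one column of a TSV file."""
    values = set()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) > index:
                values.add(fields[index])
    return values


def filter_edges(lines, genes, effects):
    """Yield only the gene-effect lines whose gene and effect are both allowed."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2 and fields[0] in genes and fields[1] in effects:
            yield line


def main(gene_effect_path, probe_effect_path, gene_probe_path):
    # Assumed layout: gene-probe file has the gene in column 1,
    # probe-effect file has the effect in column 2.
    genes = load_column(gene_probe_path, 0)
    effects = load_column(probe_effect_path, 1)
    # The large file is streamed line by line, so it never has to fit in memory.
    with open(gene_effect_path, encoding="utf-8") as handle:
        sys.stdout.writelines(filter_edges(handle, genes, effects))


# Hypothetical usage: python filter_edges.py gene_effect.tsv probe_effect.tsv gene_probe.tsv
if __name__ == "__main__" and len(sys.argv) == 4:
    main(*sys.argv[1:4])
```

Only the two small files are held as sets; the 10M-line file is read once and written straight to standard output, which can then be redirected into a file for Cytoscape.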
How would I do that? I have no sqlite3 experience. Can it easily import TSV files?
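From what I can tell, one way to try the sqlite3 route without learning its shell is Python's built-in `sqlite3` module. The following is only a rough sketch under my own assumptions: the table and column names are made up, the files have no header rows, and every line has at least two tab-separated fields.

```python
import sqlite3


def read_tsv(path):
    """Yield the first two tab-separated fields of each line as a tuple."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 2:
                yield fields[0], fields[1]


def build_db(conn, gene_effect_rows, probe_effect_rows, gene_probe_rows):
    """Create one table per file (hypothetical names) and bulk-insert the rows."""
    conn.execute("CREATE TABLE gene_effect (gene TEXT, effect TEXT)")
    conn.execute("CREATE TABLE probe_effect (probe TEXT, effect TEXT)")
    conn.execute("CREATE TABLE gene_probe (gene TEXT, probe TEXT)")
    conn.executemany("INSERT INTO gene_effect VALUES (?, ?)", gene_effect_rows)
    conn.executemany("INSERT INTO probe_effect VALUES (?, ?)", probe_effect_rows)
    conn.executemany("INSERT INTO gene_probe VALUES (?, ?)", gene_probe_rows)
    conn.commit()


def filtered_edges(conn):
    """Return gene-effect rows whose gene and effect both occur in the small tables."""
    query = (
        "SELECT gene, effect FROM gene_effect "
        "WHERE gene IN (SELECT gene FROM gene_probe) "
        "AND effect IN (SELECT effect FROM probe_effect)"
    )
    return conn.execute(query).fetchall()


# Hypothetical usage with placeholder file names:
# conn = sqlite3.connect("network.db")
# build_db(conn, read_tsv("gene_effect.tsv"), read_tsv("probe_effect.tsv"),
#          read_tsv("gene_probe.tsv"))
# for gene, effect in filtered_edges(conn):
#     print(gene, effect, sep="\t")
```

Because `executemany` accepts a generator, the large file is inserted row by row rather than being loaded into a list first.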