Say you have a text file containing a list of organisms:
$ cat input.txt
Homo Sapiens
Drosophila melanogaster
Canis lupus familiaris
Escherichia coli
The following bash script sends the requests with curl and extracts the distance with xmllint/XPath:
#!/bin/bash
IFS="
"
# Loop over every pair of organisms; spaces become "+" for use in the URL
cat input.txt | tr " " "+" | while read O1
do
    cat input.txt | tr " " "+" | while read O2
    do
        # Query each unordered pair only once, and skip self-pairs
        if [[ "${O1}" < "${O2}" ]]
        then
            curl -s "http://timetree.org/index.php?taxon_a=${O1}&taxon_b=${O2}&submit=Search" |\
            xmllint --html --format --xpath 'concat("insert into SPECIES(org1,org2,dist) values (__QUOTE____A____QUOTE__,__QUOTE____B____QUOTE__,__QUOTE__",normalize-space(//span[@class="panel year block"][h1]),"__QUOTE__);#")' - 2> /dev/null |\
            tr "#" "\n" |
            sed -e "s/__A__/${O1}/g" |
            sed -e "s/__B__/${O2}/g" |
            sed -e "s/__QUOTE__/'/g" |
            tr "+" " "
        fi
    done
done
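Since each pair costs one HTTP request, it can help to check which pairs the nested loops will visit before hitting the site. The following is a minimal offline sketch of just the pairing logic; the function name `unique_pairs` is hypothetical and not part of the script above, and it prints tab-separated pairs instead of calling curl.

```bash
#!/bin/bash
# Hypothetical helper: print each unordered pair of lines from a file once,
# tab-separated, using the same [[ a < b ]] string comparison as the script.
unique_pairs() {
    local file="$1" o1 o2
    while IFS= read -r o1; do
        while IFS= read -r o2; do
            # Only keep the pair when o1 sorts before o2, so (a,b) appears
            # once, and neither (b,a) nor (a,a) is printed.
            if [[ "$o1" < "$o2" ]]; then
                printf '%s\t%s\n' "$o1" "$o2"
            fi
        done < "$file"
    done < "$file"
}
```

Running `unique_pairs input.txt` lists the pairs without any network traffic; for n organisms it prints n(n-1)/2 lines.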
Result:
~$ bash organisms.sh
insert into SPECIES(org1,org2,dist) values ('Drosophila melanogaster','Homo Sapiens','782.7 Million Years Ago');
insert into SPECIES(org1,org2,dist) values ('Drosophila melanogaster','Escherichia coli','2535.8 Million Years Ago');
insert into SPECIES(org1,org2,dist) values ('Canis lupus familiaris','Homo Sapiens','94.2 Million Years Ago');
insert into SPECIES(org1,org2,dist) values ('Canis lupus familiaris','Drosophila melanogaster','782.7 Million Years Ago');
insert into SPECIES(org1,org2,dist) values ('Canis lupus familiaris','Escherichia coli','2535.8 Million Years Ago');
insert into SPECIES(org1,org2,dist) values ('Escherichia coli','Homo Sapiens','2535.8 Million Years Ago');
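Statements of the shape shown above can also be built directly in bash, without the `__QUOTE__` placeholders. This is only a sketch under assumptions: the helper names `sql_quote` and `make_insert` are made up for illustration, and doubling single quotes is the usual SQL way to escape them inside a string literal.

```bash
#!/bin/bash
# Hypothetical helper: wrap a value in single quotes, doubling any single
# quote inside it so the result is a valid SQL string literal.
sql_quote() {
    local s="$1" q="'"
    s=${s//$q/$q$q}
    printf "'%s'" "$s"
}

# Hypothetical helper: print one insert statement for the SPECIES table
# from two organism names and a distance string.
make_insert() {
    printf 'insert into SPECIES(org1,org2,dist) values (%s,%s,%s);\n' \
        "$(sql_quote "$1")" "$(sql_quote "$2")" "$(sql_quote "$3")"
}
```

For example, `make_insert "Canis lupus familiaris" "Homo Sapiens" "94.2 Million Years Ago"` prints a line in the same format as the result above.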
Any chance you have or know of a new solution to this problem? Would love to get some of the data off the site.
No, sorry. I stopped using timetree.org: without permission to extract data automatically, it is of little use in science. It is just a curiosity to show friends on a phone.
You could give DateLife.org a try (see the last response). It didn't work for me, and I don't know whether it's still under development. Test it and report your results!