Biological Databases Geographical Distribution
7.6 years ago

Hi,

I would like to know how biological databases are distributed worldwide. I think most of them are located in the US and Europe. Am I right? Are any other countries significantly involved?

Biological databases geographical distribution • 2.2k views
7.6 years ago

If it helps: using my tools http://lindenb.github.io/jvarkit/XsltStream.html and http://lindenb.github.io/jvarkit/PubmedDump.html, I extracted the affiliation of the first author of each article in the 2015 NAR Database Issue.

java -jar dist/pubmeddump.jar '"nucleic acids research"[JOURNAL] AND "Database Issue"[issue] and 2015[PDAT] ' | java -jar dist/xsltstream.jar -t biostar270498.xsl -n PubmedArticle
<?xml version='1.0' encoding="UTF-8" ?>
<xsl:stylesheet
xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
version='1.0'
>
<xsl:output method="text" />
<xsl:template match="/">
<xsl:apply-templates select="PubmedArticle"/>
</xsl:template>
<xsl:template match="PubmedArticle">
<xsl:apply-templates select="MedlineCitation/Article/AuthorList/Author[1]"/>
</xsl:template>
<xsl:template match="Author">
<xsl:value-of select="../../../PMID"/>
<xsl:text>&#9;</xsl:text>
<xsl:value-of select="AffiliationInfo/Affiliation/text()"/>
<xsl:text>&#10;</xsl:text>
</xsl:template>
</xsl:stylesheet>
PMID Author/Affiliation
25593349 CALIPHO group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland, 1211 Department of Human Protein Sciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland, 1211 pascale.gaudet@isb-sib.ch.
25593348 Wellcome Trust Sanger Institute, Hinxton Cambridge, CB10 1SA, UK.
25593347 National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA nardatabase@gmail.com.
25514926 Cell Signaling Technology, 3 Trask Lane, Danvers, MA 01923, USA phornbeck@cellsignal.com.
25510499 Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA.
25510495 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A 8600 Rockville Pike, Bethesda, MD 20894, USA. tatiana@ncbi.nlm.nih.gov.
25477388 Center for the Science of Therapeutics, Broad Institute, 415 Main Street, Cambridge, MA 02142, USA.
25477381 DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan.
25429973 T.T.Chang Genetic Resources Center, IRRI, Los Baños, Laguna 4031, Philippines n.alexandrov@irri.org.
25429972 Stockholm Bioinformatics Center, Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, SE-17121 Solna, Sweden erik.sonnhammer@scilifelab.se.
25428375 RCSB Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA pwrose@ucsd.edu.
25428374 Center for Biomolecular Science and Engineering, CBSE, UC Santa Cruz, 1156 High Street, Santa Cruz, CA 95064, USA kate@soe.ucsc.edu.
25428371 European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
25428369
25428365 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 2094, USA.
25428363 Institute for Research in Immunology and Cancer, Université de Montréal, Montréal, Quebec H3C 3J7, Canada.
25428362 Boyce Thompson Institute for Plant Research, Ithaca, NY 14853, USA.
25428361 Cardiovascular Epidemiology and Human Genomics Branch, National Heart, Lung, and Blood Institute, The Framingham Heart Study, Framingham, MA 01702, USA.
25428358 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA jamesbr@ncbi.nlm.nih.gov.
25428357 Institute of Molecular Life Sciences, University of Zurich, 8057 Zurich, Switzerland Swiss Institute of Bioinformatics, 8057 Zurich, Switzerland Center of Growth, Metabolism, and Aging, Key Laboratory of Bio-Resources and Eco-Environment, College of Life Sciences, Sichuan University, Chengdu 610064, Sichuan, China haoyang.cai@gmail.com.
25428351 Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland evgenia.kriventseva@unige.ch.
25428349 McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA joanna@peas.welch.jhu.edu.
25416803 DIANA-Lab, Department of Electrical & Computer Engineering, University of Thessaly, 382 21 Volos, Greece Laboratory for Experimental Surgery and Surgical Research 'N.S. Christeas', Medical School of Athens, University of Athens, 11527 Athens, Greece arhatzig@uth.gr dalamag@imis.athena-innovation.gr ivlachos@lessr.eu.
25416797 Computational RNA Biology Group, Max Planck Institute for Biology of Ageing, 50931 Cologne, Germany.
25414358 University of Münster, Faculty of Medicine, Institute of Bioinformatics, Niels-Stensen Strasse 14, 48149 Münster, Germany.
25414356 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bldg. 38 A, Room 8N805, 8600 Rockville Pike, Bethesda, MD 20894, USA bauer@ncbi.nlm.nih.gov.
25414355 Department of Internal Medicine, University of Michigan, Ann Arbor, MI 48109, USA.
25414353 Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850, P.R.China.
25414350 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA.
25414348 European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK viji@ebi.ac.uk.
25414346 Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain.
25414345 Computer Science, University of Bristol, Bristol, BS8 1UB, UK Matt.Oates@bristol.ac.uk.
25414341 Anthony Nolan Research Institute, Hampstead, London, NW3 2QG, UK UCL Cancer Institute, University College London, Hampstead, London, NW3 2QG, UK.
25414340 Department of Plant Biology and Crop Science, Rothamsted Research, Harpenden, Herts, AL5 2JQ, UK martin.urban@rothamsted.ac.uk.
25414339 Bioinformatics and Drug Design Group, Department of Pharmacy, and Center for Computational Science and Engineering, National University of Singapore, Singapore 117543 State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, West China Medical School, Sichuan University, Chengdu 610041, China Computational and Systems Biology, Singapore-MIT Alliance, National University of Singapore, Singapore.
25414335 Department of Biomedical Engineering and the Center for Biological Systems Engineering, Washington University, St Louis, MO 63130, USA.
25414328
25414324 Plant Genomics, J. Craig Venter Institute, Rockville, MD 20850, USA vkrishna@jcvi.org.
25414323 Institute of Integrative Biology, University of Liverpool, Liverpool, UK Center for Biomedical Research, Faculty of Medicine, Autonomous University of Coahuila, Torreon, Mexico.
25404137 EMBL/CRG Research Unit in Systems Biology, Centre for Genomic Regulation (CRG), 08003 Barcelona, Spain Bioinformatics Core Facility, Centre for Genomic Regulation (CRG), 08003 Barcelona, Spain.
25404132 Laboratory for Conservation and Utilization of Bioresource & Key Laboratory for Microbial Resources of the Ministry of Education, Yunnan University, Kunming 650091, China State Key Laboratory of Genetic Resources and Evolution, and Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China Kunming College of Life Science, University of Chinese Academy of Sciences,Kunming 650204, China.
25404130 European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK nicole@ebi.ac.uk.
25404129 Laboratory of Bioinformatics, National Institute of Biomedical Innovation (NIBIO), 7-6-8 Saito-Asagi, Ibaraki, Osaka 567-0085, Japan Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan k-tomii@aist.go.jp.
25404128 Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101 High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101 CDO Analytics, Seattle Children's, Seattle, WA, USA 98101 Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101.
25399423 Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan.
25399422 School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China.
25399418 University College London, Gower Street, London WC1E 6BT, UK Swiss Institute of Bioinformatics, Universitätstr. 6, 8092 Zurich, Switzerland ETH Zurich, Computer Science, Universitätstr. 6, 8092 Zurich, Switzerland.
25399417 CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
25399415 Graduate School of Biological Sciences, Nara Institute of Science and Technology, Ikoma, Nara 630-0101, Japan.
25398906
25398905 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA federhen@ncbi.nlm.nih.gov.
25398903 State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China.
25398902 Research Center for Tumor Medical Science, China Medical University, Taichung 40402, Taiwan.
25398901 Department of Biological Science and Technology, National Chiao Tung University, Hsin-Chu 300, Taiwan Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu 300, Taiwan.
25398900 Laboratory of Genome Informatics, National Institute for Basic Biology, National Institutes of Natural Sciences, Nishigonaka 38, Myodaiji, Okazaki, Aichi 444-8585, Japan Data Integration and Analysis Facility, National Institute for Basic Biology, National Institutes of Natural Sciences, Nishigonaka 38, Myodaiji, Okazaki, Aichi 444-8585, Japan uchiyama@nibb.ac.jp.
25398898 Bioinformatics Core Laboratory, Chang Gung University, Taoyuan 333, Taiwan Molecular Medicine Research Center, Chang Gung University, Taoyuan 333, Taiwan.
25398897 Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia.
25398896 The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA dossantos@morgan.harvard.edu.
25392426 The Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA.
25392425 HHMI Janelia Farm Research Campus, Ashburn, VA, USA.
25392424 'Momentum' Membrane Protein Bioinformatics Research Group, Institute of Enzymology, RCNS, HAS, Budapest PO Box 7, H-1518, Hungary.
25392422 Department of Biochemistry and Molecular Genetics, University of Virginia School of Medicine, Charlottesville, VA 22901, USA.
25392421 Department of Biomedical Engineering, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China.
25392420 Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai 980-8679, Japan.
25392419 Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh 160036, India.
25392418 CEITEC-Central European Institute of Technology, Masaryk University Brno, Kamenice 5, 625 00 Brno, Czech Republic National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Kotlářská 2, 611 37 Brno, Czech Republic Faculty of Informatics, Masaryk University Brno, Botanická 68a, 602 00 Brno, Czech Republic.
25392417 College of Computer Science and Technology, Guizhou University, Guiyang, Guizhou 550025, P.R. China.
25392416 The center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, 500 Dongchuan Road, Shanghai 200241, China.
25392415 Bioinformatics and Systems Biology Program, Sanford-Burnham Medical Research Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA.
25392413 UMR Résistance des Plantes aux Bioagresseurs (RPB), Institut de Recherche pour le Développement (IRD), BP 64501, 34394 Montpellier Cedex 5, France alexis.dereeper@ird.fr.
25392412 Addgene, Cambridge, MA 02139, USA joanne.kamens@addgene.org.
25392411 Institute for Cancer Research, Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, PA 19111, USA Program in Molecular and Cell Biology and Genetics, Drexel University College of Medicine, 245 N. 15th St. Philadelphia, PA 19102, USA.
25392410 The University of Queensland Diamantina Institute, The University of Queensland, Translational Research Institute, Brisbane, QLD, Australia.
25392409 INRA, Unité de Recherche en Génomique Végétale, UMR 1165, ERL CNRS 8196, Saclay Plant Sciences, CP 5708, F-91057 Evry, France UEVE, Unité de Recherche en Génomique Végétale, UMR 1165, ERL CNRS 8196, Saclay Plant Sciences, CP 5708, F-91057 Evry, France.
25392408 Center for Biomolecular Science and Engineering, University of California at Santa Cruz, Santa Cruz, CA 95064, USA mary@soe.ucsc.edu.
25392407 Systems Biology Program, Centro Nacional de Biotecnología (CNB-CSIC), 28049 Cantoblanco-Madrid, Spain.
25392406 PRABI, Rhône Alpes Bioinformatics Center, UCBL, Lyon1, Université de Lyon, Lyon, France.
25392405 Department of Microbiology, University of Washington, Seattle, WA 98109, USA Washington National Primate Research Center, Seattle, WA 98109, USA.
25378343 Swiss Institute of Bioinformatics (SIB), CH-1015 Lausanne, Switzerland.
25378341 Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, 37085, Germany mako@nmr.mpibpc.mpg.de.
25378340
25378338 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA.
25378337 State Key Laboratory of Biocontrol, Guangdong Province Key Laboratory of Pharmaceutical Functional Genes, School of Life Sciences, Sun Yat-Sen University, Higher Education Mega Center, Guangzhou 510006, People's Republic of China School of Basic Medical Sciences, Beijing University of Chinese Medicine, Beijing 100029, People's Republic of China.
25378336 European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK huntley@ebi.ac.uk.
25378335 School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China.
25378330 Department of Medicinal Chemistry, University of Michigan, 428 Church St, Ann Arbor, MI 48109-1065, USA.
25378329 UMR 5086 CNRS - Université Lyon 1, 69367 Lyon Cedex 07, France.
25378328 EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), Dr. Aiguader 88, 08003 Barcelona, Spain Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, 08003 Barcelona, Spain Theoretical Biophysics, Humboldt-Universität zu Berlin, Invalidenstr 42, 10115 Berlin, Germany guglielmo.roma@crg.eu luis.serrano@crg.eu.
25378326 Ecole Normale Supérieure, Institut de Biologie de l'ENS, IBENS, Paris F-75005, France Inserm U1024, Paris F-75005, France CNRS, UMR 8197, Paris F-75005, France alouis@biologie.ens.fr.
25378322 Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA Bioinformatics Graduate Program, Northeastern University, Boston, MA 02115, USA.
25378319 Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark.
25378318 Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan.
25378316 IMGT, the international ImMunoGeneTics information system, Université de Montpellier, Laboratoire d'ImmunoGénétique Moléculaire LIGM, UPR CNRS 1142, Institut de Génétique Humaine IGH, 141 rue de la Cardonille, Montpellier, 34396 cedex 5, France Marie-Paule.Lefranc@igh.cnrs.fr.
25378313 Center for Medical Genetics, Ghent University, Ghent 9000, Belgium.
25378312 Department of Computing Science, University of Alberta, Edmonton, AB T6G 2E8, Canada Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada National Institute for Nanotechnology, 11421 Saskatchewan Drive, Edmonton, AB T6G 2M9, Canada david.wishart@ualberta.ca.
25378311 Sandia National Laboratories, Department of Systems Biology, Livermore, CA 94551, USA.
25378310 Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Langer Kamp 19 B, D-38106 Braunschweig, Germany.
25378308 New England Biolabs, Ipswich, MA 01938, USA roberts@neb.com.
25378306 CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, P. R. China Stem Cell Laboratory, UCL Cancer Institute, University College London, London WC1E 6BT, UK junyu@big.ac.cn.
25378303 Johann-Friedrich-Blumenbach Institute of Zoology and Anthropology, GZMB, Department of Evolutionary Developmental Genetics, Georg-August-University Göttingen, 37075 Göttingen, Germany contact@ibeetle-base.uni-goettingen.de.
25378302 Sandia National Laboratories, Department of Systems Biology, Livermore, CA 94550, USA.
25378301 Department of Biomedical Engineering, Washington University, St. Louis, MO 63130, USA Department of Radiation Oncology, Washington University School of Medicine, St. Louis, MO 63108, USA.
25361979 Institute of Bioinformatics, University Medical Center Göttingen, Georg August University, D-37077 Göttingen, Germany geneXplain GmbH, D-38302 Wolfenbüttel, Germany edgar.wingender@bioinf.med.uni-goettingen.de.
25361974 European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK.
25361973 Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland.
25361972 Department of Biomedical Sciences, University of Padua, 35131 Padova, Italy.
25361971 Integrative Genomics of Ageing Group, Institute of Integrative Biology, University of Liverpool, Liverpool, UK.
25361970 European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK mcdowall@ebi.ac.uk.
25361969 Department of Molecular Life Science, Tokai University School of Medicine, Isehara, Kanagawa 259-1193, Japan.
25361968 HUGO Gene Nomenclature Committee, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
25361966 State Key Laboratory of Stress Cell Biology, School of Life Sciences, Xiamen University, Xiamen, Fujian 361102, P.R. China.
25361965 European Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, 69117 Heidelberg, Germany.
25355519 Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK, CB10 1SA. saf@sanger.ac.uk.
25355515 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA.
25355513 Department of Liver Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing, China.
25355511 Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA Department of Surgery, Medical College of Wisconsin, Milwaukee, WI 53226, USA shimoyama@mcw.edu.
25355510 Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea.
25352555 European Molecular Biology Laboratory, Hamburg Outstation, Notkestr. 85, Geb. 25a, 22603 Hamburg, Germany.
25352553 Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland.
25352552 European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
25352549 INSERM, U1081, Institute for Research on Cancer and Aging of Nice (IRCAN), F-06100 Nice, France CNRS, UMR 7284, Institute for Research on Cancer and Aging of Nice (IRCAN), F-06100 Nice, France Faculty of Medicine, Institute for Research on Cancer and Aging of Nice (IRCAN), University of Nice-Sophia-Antipolis, F-06100 Nice, France.
25352545 Centre for Molecular and Biomolecular Informatics, CMBI, Radboud university medical center, Geert Grooteplein Zuid 26-28 6525 GA Nijmegen, The Netherlands.
25352543
25348409 Center for Biomedical Informatics and Information Technology, National Cancer Institute, 9609 Medical Center Drive, Rockville, MD 20850, USA.
25348408 Institute of Structural and Molecular Biology, UCL, 636 Darwin Building, Gower Street, WC1E 6BT, UK i.sillitoe@ucl.ac.uk.
25348407 Institute of Structural and Molecular Biology, UCL, 636 Darwin Building, Gower Street, London, WC1E 6BT, UK.
25348405
25348404 Lab of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen, 518055, P. R. China.
25348402 Prokaryotic Super Program, DOE Joint Genome Institute, Walnut Creek, CA 94598, USA tbreddy@lbl.gov.
25348401 The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA janan.eppig@jax.org.
25348399 Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, CMU, 1 rue Michel-Servet, CH-1211 Geneva 4, Switzerland.
25348397 Department of Medical Chemistry, Semmelweis University, Budapest, Hungary.
25336621 Human Genome Center, The Institute of Medical Science, The University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Japan.
25336620 Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, SE-751 08 Uppsala, Sweden joakim.galli@igp.uu.se.
25336619 Synthetic Biology and Bioengineering Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Daejeon 305-806, Republic of Korea Bio-Medical Science Co., Ltd., Daejeon 305-301, Republic of Korea moncher@kribb.re.kr.
25332403 National Agricultural Library, Beltsville, MD 20705, USA monica.poelchau@ars.usda.gov.
25332401 Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, British Columbia, Canada Department of Oral Biological and Medical Sciences, University of British Columbia, Vancouver, British Columbia, Canada Centre for Blood Research, University of British Columbia, Vancouver, British Columbia, Canada Centre for High Throughput Biology, University of British Columbia, Vancouver, British Columbia, Canada.
25332399 The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, USA carol.bult@jax.org.
25332398 European Molecular Biology Laboratory, Genome Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.
25332396 Centre for Molecular Oncology, Barts Cancer Institute, Queen Mary University of London, Charterhouse Square, London EC1M 6BQ, UK.
25332395 Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, CH-1206, Switzerland Genoscope-LABGeM, CEA, Evry, F-91057, France anne.morgat@isb-sib.ch.
25332394 Garvan Institute of Medical Research, 384 Victoria Street, Sydney, NSW 2010, Australia St Vincent's Clinical School, University of New South Wales, Sydney, NSW 2052, Australia.
25332392 Department of Epidemiology and Biostatistics, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei 430030, PR China Department of Biomedical Engineering, Key Laboratory of Molecular Biophysics of the Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, PR China.
25326331 Department of Immunology, Key Laboratory of Medical Immunology, Ministry of Health, School of Basic Medical Sciences, Peking University Health Science Center, No. 38 Xueyuan Road, Beijing 100191, China Peking University Center for Human Disease Genomics, No. 38 Xueyuan Road, Beijing 100191, China wangpzh@bjmu.edu.cn.
25326329 Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA.
25326323 Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA apdavis3@ncsu.edu.
25324316 Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo Waterfront Bio-IT Research Building, 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan n.nagano@aist.go.jp.
25324314 Department of Genome and Gene Expression Data Analysis, Bioinformatics Institute, 138671, Singapore Interdisciplinary Graduate Program in Genetic Engineering, Graduate School, Kasetsart University, Bangkean, Bangkok 10900, Thailand.
25324312 Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou 325000, China.
25324309 University of Potsdam, Institute of Biochemistry and Biology, Karl-Liebknecht-Straße 24-25, Haus 20, 14476 Potsdam-Golm, Germany Max-Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany.
25324307 Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK dpires@dcc.ufmg.br.
25324305 Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60607, USA.
25324303 CEA, IBEB, Lab Ecol Microb Rhizosphere & Environ Extrem, Saint-Paul-lez-Durance F-13108, France CNRS, UMR 7265 Biol Veget & Microbiol Environ, Saint-Paul-lez-Durance F-13108, France Aix Marseille Université, BVME UMR7265, Marseille F-13284, France.
25313161 European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom bmeldal@ebi.ac.uk.
25313160 Toxicogenomics Informatics Project, National Institute of Biomedical Innovation, Osaka 567-0085, Japan.
25313158 German Center for Diabetes Research, Neuherberg 85764, Germany Institute of Bioinformatics and Systems Biology, Helmholtz Zentrum München, Neuherberg 85764, Germany.
25313157 University of Calgary-Computer Science, Calgary, Alberta, Canada.
25300491 Institut National de la Recherche Agronomique (INRA), UMR1331, TOXALIM (Research Centre in Food Toxicology), Université de Toulouse, Toulouse, France.
25300487 Structural Bioinformatics Group, Charite-University Medicine Berlin, Institute of Physiology, Lindenberger Weg 80, 13125 Berlin, Germany Graduate School of Computational Systems Biology, Humboldt-Universität zu Berlin Invalidenstrasse 42, 10115 Berlin, Germany.
25300483 Laboratoire d'innovation thérapeutique, Medalis Drug Discovery Center, UMR7200 CNRS-Université de Strasbourg, F-67400 Illkirch, France.
25300482 Division of Vaccine Discovery, La Jolla Institute for Allergy and Immunology, La Jolla, 9420 Athena Circle, CA 92037, USA rvita@liai.org.
25300481 Biobyte solutions GmbH, Bothestr 142, 69126 Heidelberg, Germany.
25294826 CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
25274737 National Key Laboratory of Crop Genetic Improvement, National Center of Plant Gene Research (Wuhan), Huazhong Agricultural University, Wuhan 430070, China.
25274736 College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China Institute of Cardiovascular Sciences and Key Laboratory of Molecular Cardiovascular Sciences, Peking University Health Science Center, Beijing, China.
25270878 Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh 160036, Punjab, India.
25270877 Department of Haematology, Wellcome Trust-MRC Cambridge Stem Cell Institute & Cambridge Institute for Medical Research, Cambridge University, Cambridge CB2 0XY, UK.
25262355 IISc Mathematics Initiative, Indian Institute of Science, Bangalore 560 012, Karnataka, India.
25262351 Department of Biomedical Engineering, Key Laboratory of Molecular Biophysics of the Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, PR China.
25232097 Integrative Genomics of Ageing Group, Institute of Integrative Biology, University of Liverpool, Liverpool, UK.
25217587 Department of Biology, University of Rome Tor Vergata, Rome, Italy sinnefa@gmail.com.
25190456 Department of Obstetrics, Gynecology & Women's Health, University of Minnesota, Minneapolis, MN 55455, USA.
25106869 Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Kemitorvet, Building 208, DK-2800 Lyngby, Denmark.
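To turn the affiliation list above into a rough per-country tally, something like the following Python sketch could be used. It is only a heuristic: `COUNTRY_ALIASES`, `first_country` and `tally_countries` are hypothetical names, the alias table is deliberately incomplete, and the rule "the first country name that follows a comma is the primary affiliation" is an assumption that will misclassify some records.

```python
import re
import sys
from collections import Counter

# Illustrative, incomplete alias table (an assumption, not part of the original post).
COUNTRY_ALIASES = {
    "USA": "USA",
    "UK": "UK",
    "United Kingdom": "UK",
    "Switzerland": "Switzerland",
    "Germany": "Germany",
    "France": "France",
    "China": "China",
    "Taiwan": "Taiwan",
    "Japan": "Japan",
    "Canada": "Canada",
    "Spain": "Spain",
    "India": "India",
    "Korea": "South Korea",
    "Singapore": "Singapore",
    "Netherlands": "Netherlands",
}

# A country name must follow a comma and must not start an institution name,
# so "..., China Medical University, Taichung 40402, Taiwan" resolves to Taiwan.
_COUNTRY_RE = re.compile(
    r",\s*(?:P\.?\s?R\.?\s?|People's Republic of |Republic of |The )?("
    + "|".join(re.escape(name) for name in COUNTRY_ALIASES)
    + r")\b(?!\s+(?:Medical|Agricultural|University))"
)


def first_country(affiliation):
    """Return the canonical name of the first recognised country, or None."""
    match = _COUNTRY_RE.search(affiliation)
    return COUNTRY_ALIASES[match.group(1)] if match else None


def tally_countries(lines):
    """Count first-author countries from 'PMID<whitespace>affiliation' lines."""
    counts = Counter()
    for line in lines:
        fields = line.strip().split(None, 1)
        if not fields or not fields[0].isdigit():
            continue  # skip the header row and blank lines
        affiliation = fields[1] if len(fields) > 1 else ""
        counts[first_country(affiliation) or "unknown"] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python tally.py output.tsv
    with open(sys.argv[1], encoding="utf-8") as handle:
        for country, n in tally_countries(handle).most_common():
            print(country, n, sep="\t")
```

Records with an empty affiliation, or with a country missing from the alias table, are counted as "unknown", so the table would need extending before drawing conclusions from the totals.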


@Pierre, I am using your xsltstream to parse an XML file I downloaded using NCBI eutils. I modified the above XSL file (biostar270498.xsl) to get my desired output (title and abstract text). It works fine, but after a certain number of entries (~500) it throws an error. Could you please check whether I have done anything wrong in the XSL or in how I use your tool?

Usage: cat ~/Downloads/test_e_renal_kidney.xml | java -jar dist/xsltstream.jar -t ~/Downloads/test_e_kid_renal.xsl -n PubmedArticle

xsl file:

   
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>
<xsl:output method="text" encoding="UTF-8"/>   
<xsl:output method="text"/>  
<xsl:template match="/">  
<xsl:apply-templates select="PubmedArticle"/>  
</xsl:template>  
<xsl:template match="PubmedArticle">  
<xsl:apply-templates select="MedlineCitation/Article/Abstract/AbstractText"/>  
<xsl:text>  
</xsl:text>  
</xsl:template>  
</xsl:stylesheet>  

The error I get after ~500 or so outputs:

[SEVERE][XsltStream]ParseError at [row,col]:[98160,6]  
Message: The processing instruction target matching "[xX][mM][lL]" is not allowed.  
javax.xml.stream.XMLStreamException: ParseError at [row,col]:[98160,6]  
Message: The processing instruction target matching "[xX][mM][lL]" is not allowed.  
    at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:596)  
    at com.sun.xml.internal.stream.XMLEventReaderImpl.nextEvent(XMLEventReaderImpl.java:83)  
    at com.github.lindenb.jvarkit.tools.misc.XsltStream.doWork(XsltStream.java:590)  
    at com.github.lindenb.jvarkit.util.jcommander.Launcher.instanceMain(Launcher.java:763)  
    at com.github.lindenb.jvarkit.util.jcommander.Launcher.instanceMainWithExit(Launcher.java:926)  
    at com.github.lindenb.jvarkit.tools.misc.XsltStream.main(XsltStream.java:627)  
[INFO][Launcher]xsltstream Exited with failure (-1)  

PS: I am still working on getting the abstract using the above XSL.

Thank you for your time and thank you for your tool.


what is the output of

xmllint --stream --noout ~/Downloads/test_e_renal_kidney.xml

?


@Pierre Thank you for your response,

Output is:

/home/dell/Downloads/test_e_renal_kidney.xml:98160: parser error : XML declaration allowed only at the start of the document  
<?xml version="1.0" ?>  
     ^  
/home/dell/Downloads/test_e_renal_kidney.xml : failed to parse  

Now I see the problem: it is in the XML file, which contains the XML declaration line multiple times.

grep -c '?xml version=' ~/Downloads/test_e_renal_kidney.xml
1426

Is there any way I can parse this XML file to get the title and abstract? It is a very large file (7.9 GB).

Thank you for your time
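One possible workaround, sketched here in Python rather than with xsltstream, is to treat the file as many XML documents concatenated together: start a new document at every line that begins with `<?xml`, then parse each piece separately. The sketch assumes that every declaration sits at the start of a line (as the xmllint message suggests) and that each individual document is small enough to parse in memory. `split_documents` and `titles_and_abstracts` are hypothetical helper names, and `ArticleTitle` is assumed to sit next to `Abstract` under `MedlineCitation/Article`.

```python
import sys
import xml.etree.ElementTree as ET


def split_documents(lines):
    """Yield each concatenated XML document as bytes.

    A new document is assumed to start at every line beginning with b"<?xml".
    """
    chunk = []
    for line in lines:
        if line.startswith(b"<?xml") and chunk:
            document = b"".join(chunk).strip()
            if document:
                yield document
            chunk = []
        chunk.append(line)
    document = b"".join(chunk).strip()
    if document:
        yield document


def _text(node):
    """Flatten an element (including inline markup) to single-spaced text."""
    return " ".join("".join(node.itertext()).split()) if node is not None else ""


def titles_and_abstracts(document):
    """Yield (PMID, title, abstract) for every PubmedArticle in one document."""
    root = ET.fromstring(document)
    for article in root.iter("PubmedArticle"):
        pmid = article.findtext("MedlineCitation/PMID", default="")
        title = _text(article.find("MedlineCitation/Article/ArticleTitle"))
        abstract = " ".join(
            _text(node)
            for node in article.iterfind("MedlineCitation/Article/Abstract/AbstractText")
        )
        yield pmid, title, abstract


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python extract.py test_e_renal_kidney.xml > titles_abstracts.tsv
    with open(sys.argv[1], "rb") as handle:  # binary mode: the file is read line by line
        for document in split_documents(handle):
            for row in titles_and_abstracts(document):
                print(*row, sep="\t")
```

Only one document is held in memory at a time, so the 7.9 GB total should not be a problem as long as no single document is very large. The same splitting step could instead write each piece to its own file for use with xsltstream.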
