-----------------------------------------------------------------------------
EMBnet: European Molecular Biology Data Network
Document version: October 1990
-----------------------------------------------------------------------------
EMBL Data Library, European Molecular Biology Laboratory,
Postfach 10.2209, 6900 Heidelberg, Germany
Tel: +49 6221 387258    Fax: +49 6221 387306
E-mail: embnet@embl.bitnet
-----------------------------------------------------------------------------

Summary
-------

EMBnet is an initiative to develop the European infrastructure for academic
and commercial information services in biotechnology. The project includes
the formation of a computer network for the access to, and exchange and
analysis of data of importance to molecular biology and biotechnology. The
network consists of nationally-appointed nodes in European countries,
appropriately staffed and equipped with computing facilities to provide a
biocomputing service and to develop network-based services within their
country. Additional special nodes will be involved as hosts of databases or
specialised facilities. EMBL acts both as a database supplier and
coordinator of the network development.

The need for biotechnology information services
-----------------------------------------------

The recent explosion in data as a result of research in molecular biology
will have a major impact on research and development in chemical,
pharmaceutical, biotechnological, medical and agricultural technology.
Computer access to the underlying databases is becoming increasingly
important, and the requirement for comprehensive information services is
well-recognised throughout the world [1,2,3]. These services should be
coordinated within Europe, and a clear interface should exist to enable
other initiatives, notably in the USA, to collaborate efficiently. A wide
range of databases should be made available throughout Europe, and links
built between them for ease of access and analysis.
Current important databases include information on: gene and genomic
sequences of proteins and nucleic acids, structures of biological
macromolecules, human genome maps, genetic diseases, microbial strains,
hybridoma, restriction enzymes, cloning vectors, industrial enzymes,
taxonomic classification, toxicological data, and abstracts from biological
journals.

Structure of EMBnet
-------------------

The European Molecular Biology Data Network is based on a network of
national biocomputing centres in European countries. These national nodes
provide a biocomputing service for their user community. Other special
network nodes, such as database hosts and commercial information services,
will be encouraged in addition to national service nodes. EMBL acts as both
network coordinator and a database supplier, tasks which could eventually be
assumed by a proposed European Institute of Bioinformatics [4]. The network
must remain accessible to commercial as well as academic users.

Some of the outstanding advantages of a decentralized data network are:

o  A large pool of expertise can be brought to bear on data flow, research
   and communication problems in a coordinated yet diversified fashion.

o  The user community varies from one country to the next, due to language
   and cultural differences, differences in national computer networks and
   local facilities. National and regional centres linked to a central node
   can specifically address these issues. Special attention can be focussed
   on the technologically less developed regions.

o  Remote login across international boundaries is costly and difficult. A
   system which enables users to retrieve and submit data from a computer
   within the same country is simpler and cheaper.

EMBnet Network Functions
------------------------

(a) Activities at EMBnet national nodes

DATABASE ACCESS
Latest releases of the molecular biology databases and retrieval software
should be made available to the user community.
This includes sequence data distributed daily to national nodes from the
EMBL Data Library between releases.

SOFTWARE ACCESS
Licensed and unlicensed molecular biology software for sequence analysis,
including locally-developed software, should be available as part of a
comprehensive on-line service.

USER SUPPORT
On-site, electronic mail and telephone user support should be provided, and
training courses in the use of molecular biology databases and software
organised.

DATABASE DEVELOPMENT
Where possible, research in aspects of database development or theoretical
biology should be performed in order to be actively involved in novel
aspects of bioinformatics. The results of such research can be shared by the
EMBnet community.

(b) Network Activities

DATA DISTRIBUTION
EMBL distributes complete releases of all data several times each year on
mass storage media such as magnetic tape or CD-ROM. New and updated data
from EMBL and GenBank is distributed daily from EMBL to national nodes.
National nodes should therefore be supplied with the most comprehensive
collection of sequence data. Similar mechanisms could be useful for
distribution of other databases.

DATA ENTRY
Systems for the electronic submission of data to the EMBL Data Library and
other database groups should be supported.

COMPUTER CONFERENCING SYSTEM
The development of a bulletin-board and conferencing system on the network
will provide the vehicle for information exchange between all network
partners. To be effective, this must involve relevant research topics and be
easily accessible to laboratory scientists on the computers they commonly
use for their work.

ACCESS TO REMOTE FACILITIES
Systems for interactive access to remote facilities should be developed, for
example: access to specialised hardware and software, database hosts,
software collections or other resources.

TRAINING
Where necessary, training for staff of the national service nodes should be
arranged.
Possible topics include the installation and support of particular software
packages and technical networking issues.

DATABASE DEVELOPMENT
Research in database design for future generations of molecular biology
databases should be continued.

Implementation
--------------

Identification of National Nodes

Care has been taken to identify national nodes within European countries
that have been mandated for that task by their host government or relevant
research council. The first EMBnet nodes were established during 1988 at
EMBL, CITI2, CAOS/CAMM Centre, the SERC Daresbury Laboratory, and an
industrial trial link at Hoffman-La Roche. EMBL Council scientific delegates
have been generally successful in initiating appropriate procedures within
their countries for the establishment of further national nodes. These nodes
are indicated in Appendix I.

Other Nodes

Other special purpose nodes are important for EMBnet, and these are
implemented in addition to the national nodes. The configuration of these
will be more varied than those of national nodes. These nodes may be centres
of large-scale sequencing projects, database providers, or users of national
node services. Gateways can be provided to some commercial information
services.

Configuration of EMBnet nodes

The computer systems of EMBnet national nodes have either VMS or UNIX
operating systems. The scale of the system required is of course dependent
on the required usage within each country. Major databases of importance to
molecular biology (eg nucleotide and protein sequences, mapping and
structure databases) should be made available, together with query and
analysis software. Typically 2-5 staff are required at a national node with
combined expertise in database maintenance, sequence analysis, software
development, theoretical molecular biology, computer science and networking.

The initial network connections used DECnet on the public X.25
packet-switching networks.
This policy has changed over time due to persistent DECnet addressing
problems, increasing availability of Internet connectivity in Europe, and
the increasing importance of non-DEC computer systems in molecular biology.
TCP/IP was therefore adopted as a supported protocol for EMBnet. Use of
existing international networks is desirable, eg Internet (IP), and the
emerging IXI network (OSI-based, but initially allowing use of IP or
DECnet).

Sequence Data Distribution

Automatic procedures have been operating at EMBL since Autumn 1988 for
distribution of new and updated nucleotide sequence data from the EMBL Data
Library, GenBank and DDBJ combined. This data is distributed to national
nodes via EMBnet's DECnet and IP connections. Remote copies of the
nucleotide sequence database, updated daily, are in this way maintained
across Europe. The data is made available to software packages as part of
on-line services, and also redistributed by national nodes to other
organisations within their country. This same sequence data continues to be
available via the EMBL File Server [5].

Conferencing System

A schedule for a conferencing/bulletin-board system based on the Usenet News
model has been drawn up. Most national nodes are operating VAX/VMS systems
and will install ANU News software. News distribution will be based on the
NNTP protocol using IP or DECnet connections. The existing BIOSCI
bulletin-boards will be supported.

Funding
-------

Initial costs involved in establishing EMBnet have been borne by the
participants, some of whom have received national funding. EMBL has borne
the costs of sequence distribution. A funding proposal is currently being
lodged with the EC BRIDGE programme. This is a joint application for funding
of network-based activities involving several EMBnet nodes. The outcome
should be known by the end of 1990.

EMBnet Workshops
----------------

There have been three EMBnet workshops involving representatives from
existing and potential network participants.

- 1st EMBnet Workshop, EMBL, July 19 1988. [6]
- 2nd EMBnet Workshop, EMBL, May 30-31 1989. [7]
- 3rd EMBnet Workshop, Uppsala, July 9-10 1990. [8]

References
----------

[1] CEFIC (March 1987) Bio-Informatics in Europe. A position paper
    submitted by CEFIC on behalf of the chemical industry.
[2] Franklin J. (January 1988) The Role of Information Technology and
    Services in the Future Competitiveness of Europe's Bio-Industries.
    A report compiled for the Commission of the European Communities.
[3] CEFIC (March 1990) Bio-Informatics in Europe. I - Strategy for a
    European Biotechnology Information Infrastructure.
[4] The EMBL Data Library (August 1989) Proposal for a European Institute
    of Bioinformatics.
[5] Stoehr P.J., Omond R.A. (1989) The EMBL Network File Server.
    Nucleic Acids Res. 17:6763-6764.
[6] Report of 1st EMBnet Workshop, EMBL, June 1988.
[7] Report of 2nd EMBnet Workshop, EMBL, May 1989.
[8] Report of 3rd EMBnet Workshop, Uppsala, July 9-10th 1990.

Participants in EMBnet Project
------------------------------

- European Molecular Biology Laboratory, Postfach 10.2209,
  Meyerhofstr.1, 6900 Heidelberg, West Germany
  Contact: Peter Stoehr
  Tel.: +49 6221 387435
  E-mail: Stoehr@embl.bitnet

- CITI2, 45 rue des Saint-Peres, 75270 Paris, Cedex 06, France
  Contact: Philippe Dessen
  Tel.: +33 1 60194181
  E-mail: Dessen@frciti51.bitnet

- SERC Daresbury Laboratory, Warrington, Cheshire WA4 4AD, UK
  Contact: Howard Sherman
  Tel.: +44 925 603000
  E-mail: BA@daresbury.ac.uk

- CAOS/CAMM Centre, Faculty of Science, University of Nijmegen,
  Toernooiveld, 6525 ED Nijmegen, The Netherlands
  Contact: Jack Leunissen
  Tel.: +31 80 516847
  E-mail: CAOS@hnykun52.bitnet

- Hoffman-La Roche, CH-4002 Basel, Switzerland
  Contact: Dan Doran
  Tel.: +41 61 6886812
  E-mail: Doran@embl.bitnet

- Centro de Biologia Molecular, Universidad Autonoma,
  Campus de Cantoblanco, Madrid 28049, Spain
  Contact: Jose-Maria Carazo
  Tel.: +34 1 3975070 (x272)
  E-mail: Carazo@emdcci11.bitnet

- Biocomputing Biozentrum, Klingelbergstr. 70, CH-4056 Basel, Switzerland
  Contact: Reinhard Doelz
  Tel.: +41 61 253880
  E-mail: Doelz@urz.unibas.ch

- Biomedical Centre, University of Uppsala, Box 570,
  S-751 23 Uppsala, Sweden
  Contact: Nils-Einar Eriksson
  Tel.: +46 18174017
  E-mail: Nisse@bmc1.bmc.uu.se

- Dept. of Biological Services, Weizmann Institute of Science,
  Rehovot 76100, Israel
  Contact: Leon Esterman
  Tel.: +972 8 482470
  E-mail: Lsestern@weizmann.bitnet

- Institute of Medical Biochemistry, University of Oslo,
  P.O. Box 1112, Blindern, N-0317 Oslo 3, Norway
  Contact: Eyvind Paulssen
  Tel.: +47 2 454093
  E-mail: L_Paulssen_E@use.uio.uninett

- Institute of Biotechnology, University of Helsinki,
  Valimotie 7, 00380 Helsinki, Finland
  Contact: Christophe Roos
  Tel.: +358 0 4346022
  E-mail: Roos@cc.helsinki.fi

- Istituto di Chimica Biologica, Universita di Bari,
  Via Amendola 165/A, I-70100 Bari, Italy
  Contact: Cecilia Saccone
  Tel.: +39 80 243303
  E-mail: Saccone@ba.infn.it

- Institute of Molecular Biology and Biotechnology,
  Research Centre of Crete, P.O. Box 1527, Heraklion, Crete, Greece
  Contact: Babis Savakis
  Tel.: +30 81 233554
  E-mail: Savakis@grimbb.bitnet

- D.K.F.Z., Im Neuenheimer Feld 280, 6900 Heidelberg, West Germany
  Contact: Matthias Hage
  Tel.: +49 6221 401271
  E-mail: dok248@dhddkfz1.bitnet

- BIOBASE, Institut for Medicinsk Biokemi, Ole Worms Alle, Bygning 170,
  Aarhus Universitet, DK-8000 Aarhus C, Denmark
  Contact: Hans Ullitz-Moller
  Tel.: +86 139784
  E-mail: Hum@bio.aau.dk

- UK HGMR, Resource Centre, Watford Rd, Harrow, Middlesex HA1 3UJ, UK
  Contact: Francis Rysavy
  Tel.: +44 81 8693291
  E-mail: f.rysavy@mrc-crc.ac.uk

- CEPH, 27 rue Juliette Dodu, 75010 Paris, France
  Contact: Bob Cottingham
  Tel.: +33 1 42499867
  E-mail: bc@frceph51.bitnet

- ICGEB, Padriciano 99, 34012 Trieste, Italy
  Contact: Mark Vandeyar
  Tel.: +39 40 226555
  E-mail: lr4ts1h3@icineca2.bitnet