Subsections

Computer Hardware available to DL Collaborators

The 1960s

The Flowers Report [21] contains comments on the status of computing in late 1965. The following paragraphs refer to the Science Research Council and Daresbury Laboratory, at that time one of its nuclear physics research institutes.

The S.R.C. is unique in possessing two very large nuclear physics laboratories [Rutherford Laboratory and Daresbury Laboratory], created under the National Institute for Research in Nuclear Science, which are operated for and largely by the universities as central facilities. All of the work of these two laboratories is concerned with large accelerators which cannot be used effectively except in close association with correspondingly large computers operating, in part, on line to the accelerators, either directly or through small control computers. The computing requirements of nuclear physics were recently examined by a joint working party of the former D.S.I.R. and N.I.R.N.S., and it was recommended that both laboratories should acquire large central computers of their own.

In the case of Daresbury the issue is clear because there is no other computer available to them for close, constant and reliable operation with the accelerator. A wealth of experience exists to show that in work of this kind computer demand doubles each year and a large expandable machine is therefore essential. There is no British machine of the required characteristics available at the right time, and the application is for an IBM 360/50 together with on line control computers. We support the application and recommend that the order be placed at once, but envisage that this machine may require upgrading to 360/65 standard within two or three years.

In the case of Rutherford High Energy Laboratory the issue might appear not quite as clear because of the presence of the Chilton Atlas nearby of which the Rutherford Laboratory already has one quarter of the use in addition to its own much smaller Orion. The joint working party examined the technical possibility of using Atlas rather than purchasing a further machine. However, the estimated demand exceeds the total capacity of Atlas especially when allowance is made for the fact that in the years 1966 and 1967 there will be a most serious shortage of computer time for the university film analysis groups most of which are involved in the international programme of C.E.R.N. This shortage alone amounts to at least one third of Atlas and the Rutherford Laboratory have taken the responsibility for meeting this demand. Further the Chilton Atlas is making a vital contribution to general university computing, and that of certain Government establishments, the need for which cannot possibly decline before the programme of new provision recommended in this Report is at least in its third year. The Rutherford Laboratory must therefore have a machine of its own.

The very first IBM computer on the Daresbury site, an IBM 1800 data acquisition and control system swiftly arrived in June 1966 and acted as a data logger and controls computer for the NINA synchrotron. It was rapidly followed by the first IBM mainframe computer at Daresbury, an IBM 360/50 which started service for data analysis in July 1966 [52]. This was replaced by an IBM 360/65 in November 1968 as forseen in the Flowers Report.

This growth in local computing power led to employment opportunities. A note in Computer Weekly of 3/11/1966 reads as follows. If you are a scientist, mathematician or physicist, but want to get into computers, there is rather a good opportunity for you just south of Liverpool. Daresbury Nuclear Physics Laboratory, near Warrington, Lancs, is setting up a computer group to provide a service for an experimental physics programme with a 4 GeV electron accelerator.

There are vacancies at all levels for programmers or physicist/ programmers. Educational requirements are high. Physics or maths graduates or physicists and mathematicians who have completed their Ph.D. thesis are the standard sought.

Another publication noted that the salary would be between £1,000 and £3,107 p.a.

It is worth noting that other IBM systems were in used for academic research purposes throughout the country, with an IBM 360 at UCL (University College London) and a joint IBM 370/67 service for Newcastle and Durham installed by mid 1967.

In the early years the main task at Daresbury was to provide computational power for the nuclear physics groups. Compared to the present, computing was very different in those days. The normal way of telling the computer what work to do was by punched cards, although some stalwarts were still holding out with 5 hole paper tape. Good old FORTRAN was there - what a marvellous language it seemed then, PL/1 was also sometimes used. Typically one prepared a job on punched cards and placed it on a trolley. Later an operator would come along, bringing back the previous trolley load of punched cards and the line printer output that had been produced. Turn around was measured in tens of minutes at the very least. The mean time between failures towards the end of the 1960s was a day. However these computer crashes were ``unseen'' by the users who were waiting impatiently for the trolley to re-appear. This was an early form of batch service where jobs ran one after another.

Figure 2: Original Computer Hall
Image original_computer_hall


The 1970s

The IBM 360/65 was replaced by an IBM 370/165 in January 1973 [17]. This had a stunning 12.5 MHz cpu and 3 MB of main memory, which was actually a lot for the time.

1973 also saw the arrival of TSO as an interactive service. It was one of those changes, like seeing colour television for the first time, which was not only here to stay but which could transform the whole character of computing. Several users could simultaneously edit programs interactively (cards no longer required), compile them and submit them to the batch queue.

Some more photos and information about the IBM 370 can be found here.

A PDP-11/05 computer was used as a central network controller at Daresbury from 1974-80. This had a single 16 bit 100 kHz processor and 16 kB micro core memory. They were popular at the time for real time applications and the C programming language was essentially written for this type of computer. The first version of UNIX ran on a PDP-11/20 in 1970. In many ways the PDP-11 could be considered to be the first ``modern'' computer. In addition to local terminals at Daresbury this one supported remote job entry (RJE) stations at Daresbury, Lancaster, Liverpool, Manchester and Sheffield. There was also a connection over the SRCNet packet switching network to a GEC 4080 computer fulfilling a similar function at the Rutherford site.

In Oct'1977 a cluster of six GEC 4070 computers was bought from Borehamwood at a cost of £170k. These had core memory and were deployed for nuclear physics data analysis and control in the NSF. They did not come on line for users until the early 1980s, but were then kept running and upgraded over a long period. They were particularly suitable for handling large quantities of data streaming from the experimental facility. Further data analysis was carried out on the IBM 370/165. Four of the GEC machines were connected in pairs, each pair forming a ``data station''. One was used interactively and the second for preparation of the subsequent experiment. These machines were connected to printers, plotters, tape units and graphic display units through the other two systems.

Some more information and photos of the PDP-11 and LSI-11 can be found here.

Some more photos of the GEC 4000 cluster and NSF control room can be found here.

Figure 3: Network Components in 1978
Image network_1978

Some notes on a visit to Daresbury in 1974 by F.R.A. Hopgood and D.G. House of the Atlas Computing Laboratory is re-produced in Appendix A. This gives a flavour of the environment at that time.

In 1975 the SRCnet was established linking the 360/195 at Rutherford with the 370/165 at Daresbury and the 256 kB ICL 1906A on the Atlas site. Also at that time two Interdata model 85s each with 64 kB memory were purchased at £100k to front end the IBM at Daresbury. All systems were linked using CAMAC. There were also experiments with microwave communication between different Laboratory buildings.

Starting in 1974, the SRS control system [39] was designed to use a two level computer network linked to the Daresbury site central computer, the latter being used for software preparation and applications requiring bulk data storage. Four mini computers were used, one dedicated to each of the three accelerators, the fourth providing controls for the experimental beam lines. These computers with Perkin-Elmer (Interdata) systems models 7/16s and 8/16s with 64 kB memory each. These were 16 bit computers running the proprietary OS/16-MT real time operating system enhanced to support CAMAC and network communications.

It was suggested [39] that these computer systems would be upgraded and used for eleven years, at which time (1985) the 16 bit machines would be replaced by two 32 bit control systems with a larger 32 bit central computer. These were Perkin-Elmer 3200 series systems (3205 and 3230 respectively) which became available from 1982, Perkin-Elmer had by then spun off Concurrent Computer Corporation which sold these machines. A further upgrade was being planned in 1994 [36].

By the time the SRS commenced operation in 1980 [38], the main control computers were Perkin-Elmer (Interdata) 7/32s with 320 kB memory. These was linked to three or more 7/16s, each of which oversaw elements of the SRS. In total there were two 7/32s and ten 7/16s which were ordered in 1976. The 7/32s initially has 128 kB of memory, 20 MB of disc, a line printer and three VDUs. The 7/16s had only 32 kB memory. CAMAC was also used to interface everything. The 7/32 was a 32 bit machine running OS/32-MT, this machine could also be used for plotting graphs of SRS operations parameters. Applications, mainly for plant control, were programmed using the real time RTL-2 language first used at Daresbury in 1974 in preference to Coral 66 1. Whilst every effort was made to ensure the whole system was reliable, it was noted that the mean time between failures of a 7/32 was typically a few days. In addition to Perkin-Elmer, a number of Honeywell H-160 computers were in use for data collection and local instrument control.


The Cray-1

The first Cray-1 supercomputer in the UK was installed temporarily at Daresbury Laboratory in November 1977. This was on loan from Cray Inc. and was one of the first Cray vector supercomputers outside the USA [66]. It was actually Cray serial number 1 (SN1) which had been installed at Los Alamos National Laboratory, USA for a six month trial in March 1976. Over the next two years it was upgraded to a Cray-1A and then Cray-1S/500 SN28 which finally became a 1S/1000 with 1 M word (64-bit) memory. The front end system was the IBM 370/165 with the PDP-11 still acting as a packet switching gateway. Not only was the Cray more powerful than the IBM, but was said to be more user friendly, in particular offering metter diagnostics for debugging programs.

The IBM was replaced with the AS/7000 in 1981 fulfilling the same purpose but with an increase in performance and throughput. The PDP was upgraded with the addition of a GEC 4065 in 1981 and GEC 4190 in November 1982. A contemporary account read as follows.

The Cray-1 system is a high speed computer well suited to the needs of the scientific community which Daresbury Laboratory serves. It is a large scale, general purpose digital computer which features vector as well as scalar processing; it achieves its speed with very fast memory and logic components, and a high degree of parallel operations. The basic clock period is 12.5 nanoseconds, ... the memory cycle is 50 nanoseconds. The machine can operate at speeds in excess of eighty million calculations per second.

The Cray-1 system at Daresbury consists of a processor with half a million (64 bit) words of memory, a maintenance controller and four large scale disk drives, each capable of storing 2.424 x10**9 bits of data. The system will be linked directly to the SRC network through the IBM 370/165.

Areas of science in which the Cray-1 computer system will be used include plasma physics, oceanography and engineering where three dimensional modelling can be undertaken. More complex study than is now possible will be done in protein crystallography, atomic physics and aerodynamics. The enhanced computing capabilities will advance research in astrophysics, nuclear theory and theoretical chemistry.

Users of the system will be drawn from many Universities and research groups throught the United Kingdom.

It was reported in the Chester Chronicle on 28/8/1981 that 300 ``boffins'' from around the world had met at Chester College for a conference entitled Vector and Parallel Processors in Computational Science. It was organised by staff from Daresbury and chaired by Prof. Mike Delves, a mathematician from University of Liverpool. The range and success of applications of the Cray is evident from the collection of papers published in 1982 [14]. The Cray system was thus demonstrated to be of tremendous benefit in physics and chemistry applications, with help from staff at Daresbury tuning codes to run on it, in some cases re-writing in machine code [18]. This was also the start of the Collaborative Computational Projects mentioned above.

The Cray system was bought outright for the UK and moved to the University of London Computer Centre in May 1983 where it was controlled by an Amdahl V8 front end machine. ULCC had been established in 1968 as a consequence of the Flowers Report. A full service for the 738 active users continued with an additional second hand Cray-1 (a Cray-1B?) in 1986 as recommended in the Forty Report [22]. It is possible that the size of memory of the ex. Daresbury Cray was also doubled at this time. The two Crays then became known by users as ``the Cray Twins'' or ``Ronnie and Reggie'' and ran until 1989 when they were replaced by a Cray-XMP/28.

Daresbury often pioneered new technology, but others put it into service. SN1 eventually returned to the USA and is now in the Cray Museum at Chippewa Falls, see http://www.craywiki.co.uk/index.php?title=Cray_museum.

Some more information and photos of the Cray-1S can be found here.

Figure 4: Cray 1 at Daresbury in 1978. The IBM 370/165 console is in the foreground.
Image cray_1978

In those days data and programs were stored on magnetic tape, there were thousands of them stored for our users in the computer room. Output could be provided on tape or printed on line printer paper which users collected from a hatch. Both storage and user interfaces have come a long way since.

Figure 5: Part of tape store at far end of computer room in 1979.
Image tapes_1979

Some more photos of the main Computer Hall at Daresbury can be found here.

The 1980s

It was noted in 1981 [31] that the NSF data handling system which consists of a network of five GEC 4070 processors and associated CAMAC equipment is nearing the point at which it can be tested as a complete system. Initially, there will be two independent data taking stations or event managers into which data from the experimental electronics will be organised and buffered before being transferred via a serial highway to the processors for sorting. There will be five graphics terminals for on line or off line interactive analysis of experimental data. The data handling system is linked, via the Daresbury central IBM 370/165, to workstations situated in user universities.

A National Advanced Systems NAS AS/7000 (IBM ``clone'') from Hitachi was installed in June 1981 as a central computer. It gave an enormous increase in power, a significant increase in reliability and facilitated the move to the then modern operating system MVS. The NAS AS/7000 was a Japanese equivalent of an IBM 370 running IBM's MVS operating system at around 2.7 MIPS. The system was installed in 1981 and had 8 MBytes of main memory, later upgraded to 16 MB. At the same time the network packet switching was upgraded to a GEC 4090, the NAS and GEC together costing some £850k. Dr. Brian Davies, head of Daresbury's Computer Systems and Electronics Division, explained: We have had an IBM 370/165 for about eight years now, which has been the workhorse for the laboratory. It is attached to the SERC network which includes most universities in the UK. All the applications are scientific, using Fortran, with a large scientific batch load and about 50 concurrently active terminals working under IBM's Time Sharing Option.

Figure 6: Comparison of 32 kbits of 2 us ferrite store from the IBM 370/165 (left) with 1 Mbits 360 ns MOS store from NAS AS/7000 (right).
Image memory_1980

FPS-164 and 264 array processors were in use between 1987-91. The FPS-164 had an 11 MHz 64-bit processor and 4 MB memory. FPS systems could be fitted with MAX boards for matrix operations, each giving 22 Mflop/s. Up to 15 of these could be fitted to a single FPS host. Three MAX boards were installed in the FPS-164/MAX at Daresbury. The more general purpose FPS-264 performed at 38 Mflop/s but did not have MAX boards. These FPS systems were installed based on the recommendations of the Forty report [22] which noted the strong demand for special purpose computers at the time.

Figure 7: The FPS-164/MAX shortly after installation in 1987.
Image fps164

Figure 8: FPS-264 in 1989.
Image fps264

On 16/12/1988 the AS/7000 user service was terminated, signalling the end of over two decades of IBM compatible computing at Daresbury. Substantial amounts of scientific computation had been completed, so everyone involved in the service joined in a traditional wake, by no means an entirely sad occasion, to mark its ending.

The transputer which came on the market in 1985 was programmed using a special parallel language called Occam. Some exhibits show a selection of the hardware used in transputer based systems, instruction manuals and samples of Occam code.

The Meiko M10 was the first explicitly parallel computer at Daresbury, requiring applications to be significantly re-written. It had 13x T800 transputers which comprised the hardware used at the Laboratory in its status of ``Associated Support Centre'' to the Engineering Board- DTI Transputer Initiative. The Inmos Transputer was a British invention, hence the interest from the DTI. The system (on loan from the Initiative) was upgraded to its final size in January 1989 with the installation of three ``quad boards''. A large amount of software was developed on this system. This included SRS data analysis software MORIA and XANES but also the Occam version of the FORTNET message passing harness which enabled other application codes written in Fortran to be used on Transputer systems. The system was however obsolete and switched off by 1994.

The FPS T20 had 16x T414 transputers with additional vector processing chips to give higher performance. It was operational under Ultrix on a microVax with JANET access. The Occam-2 compiler for FPS was developed at Daresbury by Bill Purvis. A vector maths library was also being written. A high speed interface was developed allowing data to be transferred between the Meiko M10 and the FPS, for instance allowing the FPS to access the graphics hardware in the Meiko. This machine however never realised its contractual upgrade path and when in May 1988 it was no longer supported by Floating Point Systems it was replaced by the Intel iPSC/2.

More information about Meiko parallel machines can be found here.


The 1990s

Following work on early transputer systems, parallel processing at Daresbury as a service focused around Intel iPSC/2 and iPSC/860 hypercube computers with respectively 32 and 64 processors and a Meiko M60 computing surface with 10 processors used for development work. Each node of the iPSC/860 and M60 consisted of a high performance Intel 64 bit i860 microprocessing chip, memory and an internal network interface to send data to other nodes. For a period of around nine months an 8 processor Alliant FX/2800 system was maintained at the Laboratory which it was developed into a stable system to be installed in the Engineering Department at the University of Manchester. There was a smaller system in use at Jodrell Bank. Descriptions of Parsytec transputer systems and other hardware on test have also appeared from time to time [53].

From 1990 to 1993 the Intel iPSC/860 evolved from an initial parallel development system to a resource acting as the focus of a National Supercomputing Service. The iPSC/860 system had a peak performance of around 2.5 Gflop/s, equivalent to the Cray Y-MP/8I at RAL but at considerably lower capital cost. Some 100 users were registered on the Intel with 140 projects having made use of the system, many of which were surveyed in ``Parallel Supercomputing '92'' and ``Parallel Supercomputing '93. The associated newsletter, ``Parallel News'', was circulated to around 1,000 people, providing up to date information on the service, together with a variety of articles, programming hints etc.

More information about Intel parallel machines can be found here.

Daresbury was thus instrumental in demonstrating the successful operation of a national parallel supercomputing service. Another service for the ``grand challenge'' projects in molecular simulation and high energy physics was operated with a similar 64 processor Meiko M60 system at the University of Edinburgh Parallel Computing Centre. During 1993 it was decided by the Advisory Board to the Research Councils, after much consultation and independent tendering and benchmarking exercises, that a new parallel supercomputer should be bought and installed at Edinburgh. This was a 256 processor Cray T3D system. Again Daresbury had pioneered the use of a novel type of computer system which was later installed at a different site.

Circa 1995, staff at Daresbury and visitors had access to many on site computers and to the Cray Y-MP/8I at RAL. The on site computers were by now linked by a local area network consisting of Ethernet, FDDI and dedicated optical fibres. Access from outside the laboratory was via the JANET service to local hosts such as the Convex and the Sun workstation which was a front end for the Intel hypercube. Some of the systems then in use are listed below.

Multi-user Data Processing and Computation

Convex:
C220 - this early Convex C2 system was installed in 1988 initially with one processor and 128 MBytes of main memory. It was some four times faster than the AS/7000. Early in 1989 a second 200 MHz processor and further 128 MBytes of memory was delivered. This two processor system was in use until 1994.
Vax:
3800, 11/750 and 8350.
Silicon Graphics:
4D/420 - quad-processor MIPS 3000 system with dedicated visualisation hardware.
Sun Microsystems:
470/MP - dual-processor system (front end to Intel).

Figure 9: Intel iPSC/2 in 1989
Image ipsc2

Distributed Memory Parallel Computers

Intel:
iPSC/2 - 32 SX/VX nodes were provided by the SERC Science Board in 1988 for this second generation Intel hypercube. Each node had an Intel 80386 plus Weitek 1167 floating point processor set and AMD VX vector co-processors with a total of 5 MByte of memory split between the two processor boards (4 MByte scalar and 1 MByte vector memory). A further 32 SX nodes and concurrent i/o system with 1.5 GBytes of disc were funded by ICI plc. The system was reduced again to 32 nodes to fund ICI's stake in the iPSC/860. The system became obsolete and too expensive to maintain as a service in 1993, it was however still being used in 1994 for system code development.
Intel:
iPSC/860 - 64x Intel i860 nodes. Initially 32 nodes were bought with Science Board funding and money from the trade in of ICI's iPSC/2 nodes. The system was installed in June 1990. It was upgraded to 64 nodes by the Advisory Board to the Research Councils (ABRC) Supercomputer Management Committee (SMC) in 1993 following the first successful year of a national peer reviewed service.
Meiko:
M60 Computing Surface - 4x T800 Transputer nodes were moved from the M10 and 10 Intel i860 nodes were added in a new Meiko M60 chassis funded by Engineering Board in 1990 to support similar systems in a number of UK Universities. This ran until 1993.

Figure 10: Convex C220 in 1989
Image c220

When it was first installed, the Intel iPSC/860, much like its predecessor the iPSC/2, remained very much an experimental machine. Indeed it was purchased to investigate the feasibility of parallel computing, but it proved to be a much valued resource by researchers, remainded in great demand, despite the introduction of alternative facilities at other UK sites. The rapid conversion of a number of scientific applications to run on the machine provided that parallel computing was a viable technique and was by no means as difficult as some were claiming. Even early teething troubles with the operating system did not deter users from exploiting the power of the machine.

The Inte iPSC/860 was made generally available as a UK wide service in January 1992, when Access to the machine was by peer reviewed grant application, and the take up was immediate, with 40 applications awarded a total of 100,000 node hours in the first year. Subsequent years saw the applications swell to 60, with a total of 320,000 node hours in 1994.

The Intel was essentially a collection of 64 single board computers, which operated in parallel and communicated with each other over a high speed inter-connection network. Although the connection had a fixed hypercube topology, the machine differed from earlier ones by having ``intelligent'' message routing, so that the programmer did not have to worry about the details of the topology and could pass a message between any pair of processors without significant overhead. Nowadays, all parallel machines use similar techniques and topology has become a detail that concerns only manufacturers and system administrators.

When first delivered, in March 1990, the iPSC/860 consisted of 32 processors, each with 8 MB of memory. This limited node memory provided arguably the major bottleneck in the process of developing parallel applications, a situation aggravated by the Intel developed operating system (NX/2) not supporting virtual memory. Driven by the early impact of the machine, EPSRC's Science Board funded an upgrade in October 1991 that increased the memory to 16 MBd per processor. A subsequent upgrade to 64 nodes provided an even greater increase in the power of the machine.

In addition to the processing nodes, a number of additional processors were provided to handle input and output to a collection of disks. Initially, 4 disks each of 750 MB were installed, a configuration that was later upgraded to 8 disks. When the machine was extended to 64 nodes, a simultaneous disk upgrade replaced all 8 disks with 1.5 GB disks with improved performance and reliability (as well as doubling the capacity). The i/o subsystem also included an Ethernet port so that the i/o system could be accessed directly from the network using the FTP protocol.

Shared Memory Parallel Computers

Alliant:
FX/2808 - eight Intel i860 processors sharing memory over a crossbar switch. This was on loan for a period of evaluation before being transferred to University of Manchester.

Workstation Clusters

Apollo:
DN10020 - two systems with a total of five processors running at 18 MHz and a shared disc pool. Apollo was a rival of Sun building graphics workstations in the mid 1980s.
HP:
9000 Series Models 750 and 730 - 5 nodes.
IBM:
RISC/6000 Series Models 530H - 3 nodes.

In the HP cluster, individual machines were connected with an FDDI network. They were used for parallel computing with the MPI, PVM or TCGMSG software, the latter was developed at Daresbury. The ``head'' node was an HP model 755 with 40 Mflop/s peak performance and 128 MB memory and 4.8 GB disk (of which 2GB was scratch space). ``Worker'' machines were HP 735 models with 40 Mflop/s performance, 80 MB memory and 3 GB scratch disk. Users would log into the head node and submit jobs to the worker nodes using DQS - the Distributed Queuing System.

Super Workstations

Stardent:
1520 - (formerly Ardent Titan-2) two systems with a total of three processors. The two processor machine was bought with industrial funding in 1989 to run molecular modelling software (such as BIOGRAF) which requires good graphical facilities, principally for the simulation and modelling of polymers. Both machines were switched off in mid March 1994 as they had become obsolete and too expensive to maintain.
Silicon Graphics:
SGI 4D/420-GTX - upgraded to four processors at the end of 1992.
IBM:
RISC/6000 model 530H - dedicated for development and support of the CRYSTAL code.
DEC:
Alpha - one had been in use for development of advanced networking methods since 1993, two more systems were purchased for the CFD group to support development of code to run on the Cray T3D parallel system (which also uses DEC alpha processors) in 1994.

Figure 11: Stardent consoles in 1989
Image stardent

Desktop Systems

Some 100 systems are in use, mostly Sun Sparc and 3/60 models, Silicon Graphics Indigo and DEC Workstations but also IBM PowerPC models and IBM and Apple PCs.

Figure 12: Sun 470 console in 1989
Image sun

Beowulf Computers

The first commodity cluster, then known as a Beowulf (from Norse mythology) was built in 1994 and used as a test and development system until 1998. This was built from off-the-shelf PCs linked together with ethernet - a total of 32x 450 MHz Pentium III processors. It occupied a hefty rack and the cables looked like coloured spaghetti. The PCs each had memory and disk but no keyboard or monitor. They were connected by dual fast Ethernet switches - 2x Extreme Summit48, one network for IP traffic (e.g. nfs) and the other for MPI message passing. Additional 8-port KVM switches can be used to attach a keyboard and monitor to any one of the nodes for administrative purposes. The whole cluster had a single master node (with a backup spare) for compilation and resource management.

Some of the applications running on these systems and the national Cray T3E supercomputing facility in Edinburgh, many arising from the work of the CCPs, were showcased at the HPCI Conference, 1998 [7].

Figure 13: So called Beowulf Cluster of PCs
Image beowulf1

Graphical Visualisation and Program Analysis

A contemporary account read: We now have the possibility to log into a graphical workstation to ``dial up'' another machine with intensive parallel or vectorial capability. We can make it perform a task and display the results as they are calculated, perhaps over a number of computational steps, or modify parameters of the problem between each frame to study the effects. A software package called DISPLAY has been developed. It is intended as a graphical ``shell'' or ``front end'' for computational codes of all types and has a simple menu based interface which the user may configure to run his or her application. The application can be run as one component of a parallel task within the distributed harness.

Other graphical front end software was used with Silicon Graphics workstations to ease interactive use of programs maintained by the data analysis group for our SRS users.

Performance analysis of complex parallel programs is also essential to obtain good efficiency. Graphical tools, such as those provided on the Intel hypercube, originally part of the Express programming harness, helped to do this.

More recently

Loki and Scali Clusters

Figure 14: Loki Cluster
Image loki_fromtop

A more adventurous cluster known as Loki (also a Norse name) was purchased in 1999 and ran until 2003. This eventually had 64x high performance DEC Alpha EV6/7 processors running in 64 bit at 667 MHz. Each processor had 1/2 GB memory. This was a big step up from the earlier systems. THe master node had a similar processor but 1GB of memory and a 16GB SCSI disk.

Loki was purchased for departmental use and initially contained 17x dual processor UP2000 boards from API including the master node. The principal interconnect solution was the proprietary Qnet, in which the network interface is based on QSW's Elan III ASIC. A secondary fast Ethernet was used for cluster management and a further Ethernet was available for communication traffic enabling comparison of the two interconnects on the same hardware platform.

Thanks to support from EPSRC Loki was upgraded to contain 66 Alpha processors including the master node in 2001. The new dual processor nodes were UP2000 and CS20 from API. This has necessitated upgrading the switch from the 16 node version to one with the potential to support up to 128.

Some more photos of the Loki cluster can be found here.

The slightly larger Scali cluster was bought in 2003 and used 64x AMD K7 processors, now with a huge 1 GB memory each - remember, this was before the demanding Java era and memory was still relatively expensive.

Both Loki and Scali had high performance switched networks connecting the nodes for parallel computing applications as opposed to the earlier quad or hypercube networks.

IBM SP2

An IBM SP2 machine, also essentially a cluster, was installed in mid-1995 and used intensively for development purposes [4]. It paved the way for the introduction of the HPCx national supercomputing service based at Daresbury.

SP stands for ``Scaleable POWERparallel''. This SP2 consisted of fourteen so called IBM Thin Node 2s each having 64 MB memory and two wide nodes each with 128 MB. Each RS/6000 node had a single POWER2 SuperChip processor running at 66.7 MHz and performing at a peak of 267 Mflop/s from its quad instruction floating point core. The nodes, which were the first of their type in the UK, were connected internally by a high performance internal network switch. They were housed in two frames with an additional Power PC RS/6000 control workstation.

More information about the IBM SP2 can be found here.

Year 2000 onwards

HPCx

HPCx started operation at Daresbury in 2002 as the 9th most powerful computer in the world (20th Top500 list, see http://www.top500.org). At that time it was an IBM p690 Regatta system with 1280x 1.3 GHz POWER4 processors and a total of 1.3 TB memory. The HPCx system was located at Daresbury Laboratory but operated by the a consortium legaly entitled HPCx Ltd.

By 2007, the HPCx system had been upgraded to IBM eServer 575 nodes for the compute and IBM eServer 575 nodes for login and disk I/O. Each eServer node then contained 16x 1.5 GHz POWER5 processors. The main HPCx service provided 160 nodes for compute jobs for users, giving a total of 2560 processors. There was a separate partition of 12 nodes reserved for special purposes. The peak computational power of the HPCx system was 15.3 Tflop/s.

The frames in the HPCx system were connected via IBM's High Performance Switch (HPS). Each eServer frame had two network adapters and there were two links per adapter, making a total of four links between each of the frames and the switch network.

The HPCx service ended on 31/1/2010 after a very successful and trouble free eight years. For more information about HPCx and the work carried out upon it see the Capability Computing newsletters on the Web site http://www.hpcx.ac.uk/about/newsletter/.

Some more information and photos of HPCx can be found here.

Figure 15: HPCx Phase 3.
Image HPCxPhoto


2005-2012: NW-GRID, the North West Grid

Computing collaborations in the north west of England go back a long way. Since the start, Daresbury had good links with local universities and research instutions. For instance POL, the Proudman Oceanographic Laboratory in Bidston, Wirral was linked into the Laboratory's IBM 370/165 in the early 1970s. Their own IBM 1130 was replaced with a new Honywell 66/20 in 1976. This was used to set up the British Oceanographic Data Service among other things.

Lancaster University, founded in 1965, was also an important local player. Back on 29/5/1975 Dr. Ken Beauchamp, their director of computer services, was reported in Computer Weekly as saying users don't care where their computing power comes from, as long as its available. At that time the university was operating a six year old ICL 1905F computer with 128 kB memory plus a number of micro-computers in various departments. This was augmented with an ICL 7905 in July 1975. Users also had access to ICL systems at Manchester University and to a new ICL 1906S at Liverpool University, as did Salford and Keele. All these local universities were also using the IBM 370/165 at Daresbury via remote job entry links.

After lengthy discussionns and negociation, NW-GRID was established in April 2005. It initially had four core Sun compute clusters from Streamline Computing funded by NWDA 2 which were accessible using Grid middleware and connected using a dedicated fibre network:

Daresbury - dl1.nw-grid.ac.uk - 96 nodes (some locally funded);
Lancaster - lancs1.nw-grid.ac.uk - 192 nodes (96 dedicated to local users);
Liverpool - lv1.nw-grid.ac.uk - 44 nodes;
Manchester - man2.nw-grid.ac.uk - 27 nodes

These were installed in 2006. The clusters all had 2x Sun x4200 head nodes each with dual-processor single core AMD Opteron 2.6 GHz (64-bit) processors and 16 GB memory. These had Panasas hardware supported file store of between 3-10 TB per cluster (Manchester had client only). The compute nodes were all Sun x4100 dual processor dual core AMD Opteron 2.4 GHz (64-bit) with 2-4 GB memory per core and 2x 73 GB disks per node. They were connected with Gbit/s ethernet.

Additional clusters and upgrades were procured at the core sites, helped by a second phase of NWDA funding as agreed in the original project plan. Other partners were encouraged to join and share their resources. This resulted in the following configuration in late 2008.

Daresbury: 96 nodes 2.4 GHz twin dual core CPU;
Lancaster: 48 nodes 2.6 GHz twin dual core CPU;
Lancaster: 67 nodes 2.6 GHz twin quad-core CPU;
Liverpool: 104 nodes, 2.2 GHz twin dual core and 2.3 GHz twin quad-core CPU;
Liverpool: 108 nodes, 2.4 GHz twin dual core and 2.3 GHz twin quad-core CPU and InfiniPath network;
Manchester: 25 nodes 2.4 GHz twin dual core CPU plus other local systems;

Daresbury, Lancaster and Liverpool have 8 TB of storage accessed by the Panasas file servers. In addition to this, there are RAID arrays of 2.8 TB at Manchester and 24 TB at each of Lancaster and Liverpool. Nodes are connected by separate data and communications interconnect using Gigabit Ethernet and Liverpool's second 108 node cluster is connected with InfiniPath.

Around this core are other computer systems that are connected to the NW-GRID.

Daresbury: IBM BlueGene-L (2048 cores);
Daresbury: IBM BlueGene-P (4096 cores);
Daresbury: 2560 node IBM 1.5 GHz POWER5 (HPCx - subject to approval);
Daresbury: 32 node Harpertown cluster with Nehalem processors and nVidia Tesla GPUs;
Daresbury 32 node Woodcrest cluster with ClearSpeed accelerators;
Lancaster: 124 node Streamline/ Sun cluster 2.6 GHz twin dual core;
Liverpool: 96 node, 196 core Xeon x86 cluster, contributed by Proudman Oceanographic Laboratory, with 5.7 TB of GPFS storage;
Liverpool: 960 node Dell cluster, Pentium IV processors (Physics);
Manchester: 44 node dual processor Opteron cluster, 2.5 TB RAID storage based on 2 GHz Opterons with 2 GB RAM;
Manchester: SGI Prism with 8 Itanium2 processors and 4x ATI FireGL X3 graphics pipes. There was a similar system at Daresbury;
University of Central Lancashire (UCLan): SGI Altix 3700 with 56x Intel Itanium CPUs and an Intel based cluster with 512 cores of 2.66GHz Xeon processore;
Huddersfield: several Beowulf style clusters plus a share of the enCore service hosted at Daresbury (see below). Huddersfield later acquired part of the original Lancs1 NW-GRID system.

It should also be noted that there are still many separate computer systems in use for academic research in the region. A somethat separate grid of large compute clusters is dedicated to high energy physics users, specifically for analysis of data from the Large Hadron Collider at CERN. This is known as NorthGrid and has partners from physics departments at Lancaster, Liverpool, Manchester and Sheffield with support from the networking team at Daresbury Laboratory.

Further developments are explained below where we refer to the UK e-Infrastructure.

Sun Cluster

Some more information and photos of the NW-GRID Sun cluster at Daresbury can be found here.

Figure 16: NW-GRID Daresbury Sun Cluster.
Image NW-Grid

IBM BlueGene

Two of the futuristic BlueGene systems were installed at the Laboratory, a BlueGene/L and a BluGene/P.

Some more information and photos of BlueGene can be found here.

iDataPlex

Known as SID for STFC iDataPlex, this was a partnership cloud service for University of Huddersfield, STFC staff and for commercial partners via an agreement with OCF plc. The OCF commercial service was referred to as the enCore Compute Cloud, see http://www.ocf.co.uk/what-we-do/encore.

Figure 17: STFC iDataPlex (SID)
Image sid_small

SID comprises a double width compute rack containing 40x nodes. Each node contains 2x six core 2.67 GHz Intel Westmere X56 processors with 24 GB of memory and 250 GB local SATA disk. Two of them have 2x attached nVidia Tesla GPU cards. The nodes are connected by a high performance QLogic Infiniband QDR network switch and GPFS storage.

Like the SP2, this system paved the way for something larger.

Rob Allan 2016-09-20