

Distributed Resource Management and Parallel Computation

Dr Michael Rudgyard

Streamline Computing Ltd

Streamline Computing Ltd

• Spin-out of Warwick (& Oxford) University

• Specialising in distributed (technical) computing

– Cluster and GRID computing technology

• 14 employees & growing; focussed expertise in:

– Scientific Computing

– Computer systems and support

– Presently 5 PhDs in HPC and Parallel Computation

– Expect growth to 20+ people in 2003

Strategy

• Establish an HPC systems integration company…

• …but re-invest profits into software

– Exploiting IP and significant expertise

– First software product released

– Two more products in prototype stage

• Two complementary ‘businesses’

– Both high growth

Track Record (2001 – date)

• Installations include:

– Largest Sun HPC cluster in Europe (176 proc)

– Largest Sun / Myrinet cluster in UK (128 proc)

– AMD, Intel and Sun clusters at 21 UK Universities

– Commercial clients include Akzo Nobel, Fujitsu, McLaren F1, Rolls-Royce, Schlumberger, Texaco…

• Delivered a 264 proc Intel/Myrinet cluster:

– 1.3 Tflop/s Peak !!

– Forms part of the White Rose Computational Grid

Streamline and Grid Computing

• Pre-configured ‘grid’-enabled systems:

– Clusters and farms

– The SCore parallel environment

– Virtual ‘desktop’ clusters

• Grid-enabled software products:

– The Distributed Debugging Tool

– Large-scale distributed graphics

– Scalable, intelligent & fault-tolerant parallel computing

‘Grid’-enabled turnkey clusters

• Choice of DRMs and schedulers:

– (Sun) GridEngine

– PBS / PBS-Pro

– LSF / ClusterTools

– Condor

– Maui Scheduler

• Globus 2.x gatekeeper (Globus 3 ???)

• Customised access portal

The SCore parallel environment

• Developed by the Real World Computing Partnership in Japan (www.pccluster.org).

• Unique features that are unavailable in most parallel environments:

– Low latency, high bandwidth MPI drivers

– Network transparency: Ethernet, Gigabit Ethernet and Myrinet

– Multi-user time-sharing (gang scheduling)

– O/S level checkpointing and failover

– Integration with PBS and SGE

– MPICH-G port

– Cluster management functionality

‘Desktop’ Clusters

• Linux Workstation Strategy

– Integrated software stack for HPTC (compilers, tools & libraries) – cf. UNIX workstations

• Aim to provide a GRID at point of sale:

– Single point of administration for several machines

– Files served from front-end

– Resource management

– Globus enabled

– Portal

• A cluster with monitors !!

The Distributed Debugging Tool

• A debugger for distributed parallel applications

– Launched at Supercomputing 2002

• Aim is to be the de facto HPC debugging tool

– Linux ports for GNU, Absoft, Intel and PGI

– IA64 and Solaris ports; AIX and HP-UX soon…

– Commodity pricing structure !

• Existing architecture lends itself to the GRID:

– Thin client GUI + XML middleware + back-end

– Expect GRID-enabled version in 2003

Distributed Graphics Software

• Aims

– To enable very large models to be viewed and manipulated using commodity clusters

– Visualisation on (local or remote) graphics client

• Technology

– Sophisticated data-partitioning and parallel I/O tools (see the sketch below)

– Compression using distributed model simplification

– Parallel (real-time) rendering

• To be GRID-enabled within e-Science ‘Gviz’ project
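
The following is a minimal sketch, assuming MPI-IO, of the kind of partitioned parallel input the data-partitioning and parallel I/O bullet refers to. It is not Streamline's actual tooling; the file name and the flat array-of-doubles layout are illustrative assumptions.

```c
/* Sketch only: each rank reads its own contiguous block of a large model
 * file, so no single node has to hold the whole model in memory. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "model.dat",
                  MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);

    /* Total number of values in the file (assumed known, e.g. from a header). */
    MPI_Offset total = 1000000;
    MPI_Offset chunk = total / nprocs;
    MPI_Offset offset = rank * chunk * (MPI_Offset)sizeof(double);
    if (rank == nprocs - 1)              /* last rank picks up the remainder */
        chunk = total - chunk * (nprocs - 1);

    double *local = malloc(chunk * sizeof(double));

    /* Collective read: each rank pulls only its own partition. */
    MPI_File_read_at_all(fh, offset, local, (int)chunk,
                         MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    printf("rank %d read %lld values\n", rank, (long long)chunk);

    free(local);
    MPI_Finalize();
    return 0;
}
```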

Parallel Compiler and Tools Strategy

• Aim to invest in new computing paradigms

• Developing parallel applications is far from trivial

– OpenMP does not map well onto cluster architectures

– MPI is too low-level (see the sketch below)

– Few skills in the marketplace !

– Yet growth of MPPs is exponential…

• Most existing applications are not GRID-friendly

– # of processors fixed

– No Fault Tolerance

– Little interaction with DRM
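
As an illustration of the "MPI is too low-level" point, the sketch below shows a one-dimensional halo exchange: neighbour ranks, message tags, buffer layout and boundary cases are all left to the programmer. It is a generic example, not taken from any Streamline product.

```c
/* Sketch only: a 1-D halo exchange written directly in MPI. */
#include <mpi.h>

#define N 1024            /* local points per process (assumed) */

int main(int argc, char **argv)
{
    double u[N + 2];      /* local data plus one halo cell on each side */
    int rank, nprocs;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    for (int i = 0; i < N + 2; ++i)
        u[i] = (double)rank;

    int left  = (rank > 0)          ? rank - 1 : MPI_PROC_NULL;
    int right = (rank < nprocs - 1) ? rank + 1 : MPI_PROC_NULL;

    /* Exchange halo cells with both neighbours; MPI_PROC_NULL turns the
     * boundary processes' sends and receives into no-ops. */
    MPI_Sendrecv(&u[N], 1, MPI_DOUBLE, right, 0,
                 &u[0], 1, MPI_DOUBLE, left,  0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Sendrecv(&u[1], 1, MPI_DOUBLE, left,  1,
                 &u[N + 1], 1, MPI_DOUBLE, right, 1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    MPI_Finalize();
    return 0;
}
```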

DRM for Parallel Computation

• Throughput of parallel jobs is limited by:

– Static submission model: ‘mpirun -np …’

– Static execution model: # processors fixed (see the sketch below)

– Scalability: many jobs use too many processors !

– Job Starvation

• Available tools can only solve some issues

– Advanced reservation and back-fill (eg Maui)

– Multi-user time-sharing (gang scheduling)

• The application itself must take responsibility !!
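
The static execution model criticised above typically looks like the sketch below: the decomposition is derived once from the processor count fixed on the mpirun line and can never change during the run. The problem size and names are assumptions made for illustration.

```c
/* Sketch of the static execution model: the domain decomposition is
 * derived once from the processor count chosen at 'mpirun -np' time,
 * so the DRM cannot reshape the job while it runs. */
#include <mpi.h>
#include <stdlib.h>

#define GLOBAL_CELLS 4096   /* assumed global problem size */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);   /* fixed at submission time */

    /* Block partition computed once; adding or freeing processors later
     * would invalidate it. */
    int local_cells = GLOBAL_CELLS / nprocs;
    double *cells = malloc(local_cells * sizeof(double));

    /* ... time-stepping loop over the fixed partition ... */

    free(cells);
    MPI_Finalize();
    return 0;
}
```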

Dynamic Job Submission

• The job scheduler should decide the available processor resources !

• The application then requires:

– In-built partitioning / data management

– Appropriate parallel I/O model

– Hooks into the DRM (see the sketch below)

• DRM requires:

– Typical memory and processor requirements

– LOS information

– Hooks into the application
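
One possible shape for the "hooks into the DRM" idea is sketched below: the front-end asks the scheduler how many processors it has actually been granted and spawns that many workers with standard MPI-2 dynamic process management. drm_granted_processors(), the GRANTED_PROCS environment variable and the "worker" binary are hypothetical placeholders, not a real scheduler interface.

```c
/* Sketch, assuming a hypothetical DRM hook: the scheduler, not the user,
 * decides how many workers to launch. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical hook into the DRM: here it simply reads an assumed
 * environment variable that a scheduler might export, with a fallback. */
static int drm_granted_processors(void)
{
    const char *s = getenv("GRANTED_PROCS");   /* assumed variable name */
    return s ? atoi(s) : 4;
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int np = drm_granted_processors();          /* DRM decides the size */

    /* Spawn the workers with MPI-2 dynamic process management.
     * "worker" is a placeholder binary name. */
    MPI_Comm workers;
    MPI_Comm_spawn("worker", MPI_ARGV_NULL, np, MPI_INFO_NULL,
                   0, MPI_COMM_SELF, &workers, MPI_ERRCODES_IGNORE);

    printf("spawned %d workers as granted by the DRM\n", np);

    MPI_Finalize();
    return 0;
}
```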

Dynamic Parallel Execution

• Additional resources may become available or be required by other applications during execution…

• Ideal situation:

– DRM informs application

– Application dynamically re-partitions itself (see the sketch below)

• Other issues:

– DRM requires knowledge of the application (benefit of data redistribution must outweigh cost !)

– Frequency of dynamic scheduling

– Message passing must have dynamic capabilities
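
The ideal handshake described above might look like the sketch below: the solver periodically polls the DRM for a new target processor count and re-partitions only when the estimated benefit outweighs the redistribution cost. All of the hooks (drm_poll_target_size(), repartition(), the cost estimates) are stubs invented for illustration; only the control flow is the point.

```c
/* Sketch of a dynamic re-partitioning loop with placeholder DRM hooks. */
#include <mpi.h>

static int current_size;

/* Placeholders standing in for real DRM hooks and a real repartitioner. */
static int    drm_poll_target_size(void)             { return current_size; }
static void   repartition(int n)                      { current_size = n; }
static double estimate_benefit(int n)                 { (void)n; return 1.0; }
static double estimate_redistribution_cost(int n)     { (void)n; return 0.5; }
static void   compute_time_step(void)                 { /* numerical work */ }

#define CHECK_INTERVAL 50   /* how often to consult the DRM (assumed) */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &current_size);

    for (int step = 0; step < 1000; ++step) {
        compute_time_step();

        /* Periodically ask the DRM whether the allocation has changed and
         * re-partition only when the benefit outweighs the cost of moving
         * the data. */
        if (step % CHECK_INTERVAL == 0) {
            int target = drm_poll_target_size();
            if (target != current_size &&
                estimate_benefit(target) > estimate_redistribution_cost(target))
                repartition(target);
        }
    }

    MPI_Finalize();
    return 0;
}
```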

The Intelligent Parallel Application

• Optimal scheduling requires more information:

– How well the application scales

– Peak and average memory requirements

– Application performance vs. architecture

• The application ‘cookie’ concept:

– Application (and/or DRM) should gather information about its own capabilities (see the sketch below)

– DRM can then limit # of available processors

– Ideally requires hooks into the programming paradigm…
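
A minimal sketch of what such a ‘cookie’ might contain is given below; every field name and the output format are assumptions made for illustration, not a defined interface.

```c
/* Sketch of the 'cookie' idea: a small record the application (or the DRM)
 * fills in about its own behaviour, which the scheduler can later use to
 * cap the processor count it hands out. */
#include <stdio.h>

typedef struct {
    char   app_name[64];
    int    max_useful_procs;       /* beyond this, parallel efficiency collapses */
    double parallel_efficiency;    /* measured at max_useful_procs */
    double avg_memory_mb_per_proc;
    double peak_memory_mb_per_proc;
    char   architecture[32];       /* e.g. "ia32/myrinet" */
} app_cookie;

/* Write the cookie somewhere a DRM scheduler could read it before dispatch. */
static void write_cookie(const app_cookie *c, const char *path)
{
    FILE *f = fopen(path, "w");
    if (!f) return;
    fprintf(f, "app=%s procs<=%d eff=%.2f mem_avg=%.0fMB mem_peak=%.0fMB arch=%s\n",
            c->app_name, c->max_useful_procs, c->parallel_efficiency,
            c->avg_memory_mb_per_proc, c->peak_memory_mb_per_proc,
            c->architecture);
    fclose(f);
}

int main(void)
{
    app_cookie c = { "cfd_solver", 64, 0.82, 450.0, 900.0, "ia32/myrinet" };
    write_cookie(&c, "cfd_solver.cookie");
    return 0;
}
```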

Fault Tolerance

• On large MPPs, processors/components will fail !

• Applications need fault tolerance:

– Checkpointing + RAID-like redundancy (cf. SCore; see the sketch below)

– Dynamic repartitioning capabilities

– Interaction with the DRM

– Transparency from the user’s perspective

• Fault-tolerance relies on many of the capabilities described above…
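
To complement the O/S-level approach above, the sketch below shows application-level checkpointing with MPI-IO: each rank writes its block of the global state at its global offset, so a restarted run can re-read the file on a different number of processors. The file name and data layout are assumptions, not SCore's mechanism.

```c
/* Sketch of application-level checkpointing to one layout-independent file. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Toy local state: one block of a global array (sizes assumed). */
    const int local_n = 1000;
    double *u = malloc(local_n * sizeof(double));
    for (int i = 0; i < local_n; ++i) u[i] = (double)rank;

    /* Collective checkpoint: every rank writes its block at its global
     * byte offset, so a restart on a different processor count can
     * re-read and re-partition the same file. */
    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "checkpoint.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_Offset offset = (MPI_Offset)rank * local_n * sizeof(double);
    MPI_File_write_at_all(fh, offset, u, local_n,
                          MPI_DOUBLE, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    free(u);
    MPI_Finalize();
    return 0;
}
```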

Conclusions

• Commitment to near-term GRID objectives

– Turnkey clusters, farms and storage installations

– Ongoing development of ‘GRID-enabled’ tools

– Driven by existing commercial opportunities…

• ‘Blue-sky’ project for next-generation applications

– Exploits existing IP and advanced prototype

– Expect moderate income from focussed exploitation

– Strategic positioning: existing paradigms will ultimately be a barrier to the success of (V-)MPP computers / clusters !
