Introduction to DISTRIBUTED SYSTEMS

Tran, Van Hoai
Department of Systems & Networking
Faculty of Computer Science & Engineering
HCMC University of Technology
2009-2010
Outline
• Why are distributed systems needed?
• Examples
• Definitions
• Goals in building distributed systems
Why are distributed systems needed? (1)
• Functional distribution: computers have different functional capabilities
– Client/server
– Host/terminal
– Data gathering/data processing
→ sharing of resources with specific functionalities
• Inherent distribution: stemming from the application domain, e.g.,
– cash register and inventory systems for supermarket chains
– computer-supported collaborative work
Why are distributed systems needed? (2)
• Load distribution/balancing: assign tasks to computers such that overall performance is optimized
• Replication of processing power: independent computers working on the same task
– a collection of microcomputers may have processing power that no supercomputer will ever achieve
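To make the load-distribution bullet concrete, here is a minimal sketch. It assumes that every task has a known cost and that all computers are equally fast; neither assumption comes from the slides. It uses a greedy "longest task first" heuristic, which is simple but not guaranteed to be optimal. The names `assign_tasks` and `makespan` are hypothetical.

```python
import heapq

def assign_tasks(task_costs, num_machines):
    """Greedy heuristic: give the next-largest task to the least-loaded machine."""
    # Heap of (current_load, machine_index); the least-loaded machine is on top.
    heap = [(0, m) for m in range(num_machines)]
    heapq.heapify(heap)
    assignment = {m: [] for m in range(num_machines)}
    for cost in sorted(task_costs, reverse=True):
        load, m = heapq.heappop(heap)
        assignment[m].append(cost)
        heapq.heappush(heap, (load + cost, m))
    return assignment

def makespan(assignment):
    """Finishing time of the busiest machine (the quantity we try to keep small)."""
    return max(sum(costs) for costs in assignment.values())

plan = assign_tasks([7, 5, 4, 3, 1], num_machines=2)
print(plan, makespan(plan))
```

With these five tasks and two machines, both machines end up with a load of 10, which is the best possible split of a total cost of 20.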
Why are distributed systems needed? (3)
• Physical separation: relying on the fact that computers are physically separated (e.g., to satisfy reliability requirements)
• Economics: collections of microprocessors offer a better price/performance ratio than large mainframes
– mainframes: 10 times faster, but 1000 times as expensive
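The price/performance argument can be checked with a few lines of arithmetic. The sketch below uses normalized, purely illustrative units (one microprocessor has price 1 and speed 1); only the factors 10 and 1000 come from the slide. The function name `price_per_performance` is hypothetical.

```python
def price_per_performance(price, performance):
    """Cost of one unit of computing speed (lower is better)."""
    return price / performance

# Normalized, illustrative units: one microprocessor has price 1 and speed 1.
micro = price_per_performance(price=1.0, performance=1.0)
# Slide figures: the mainframe is 10 times faster and 1000 times as expensive.
mainframe = price_per_performance(price=1000.0, performance=10.0)

ratio = mainframe / micro
print(ratio)  # 100.0
```

Under these figures, each unit of speed costs 100 times more on the mainframe. Ten microprocessors would match its speed at 1% of the price only if the workload parallelized perfectly, which real workloads rarely do.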
Examples (1)
• Network of workstations
– all files are accessible from all machines in the same way, using the same path name
– the system looks for the best place to execute a command
→ distributed system
• Workflow information system: automatic order processing
– involves people from several departments at different locations
– users are unaware of how an order is processed
→ distributed system
Examples (2)
• World Wide Web: offers a uniform model of distributed documents
– in theory, there is no need to know where a document is fetched from
– in practice, users must be aware of its location
Examples (3)
• Internet
– an interconnected collection of computer networks of many different types
– computers interact by passing messages using a common means of communication
[Figure: a portion of the Internet with intranets, ISPs, a backbone, and a satellite link; legend: desktop computer, server, network link]
Examples (4)
• Intranet
– resources are shared among different computers
[Figure: an intranet in which desktop computers, an email server, a Web server, a file server, and print and other servers are connected by local area networks; a router/firewall links it to the rest of the Internet]
Definitions (1)
• “A system in which hardware or software components located at networked computers communicate and coordinate their actions only by message passing”. [Coulouris]
• “A system that consists of a collection of two or more independent computers which coordinate their processing through exchange of synchronous or asynchronous message passing”.
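Both definitions hinge on coordination by message passing alone. The following minimal sketch illustrates the idea on a single machine: two threads stand in for two independent computers, and two queues stand in for the network. This is a teaching simplification, not how real nodes are connected, and the names `server` and `run_exchange` are hypothetical.

```python
import queue
import threading

def server(inbox, outbox):
    """Stand-in for one computer: it shares no variables, only messages."""
    while True:
        msg = inbox.get()          # blocking receive
        if msg == "stop":
            break
        outbox.put(("ack", msg))   # reply with a message of its own

def run_exchange(messages):
    to_server, to_client = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=server, args=(to_server, to_client))
    worker.start()
    replies = []
    for m in messages:
        to_server.put(m)                  # the send itself returns immediately
        replies.append(to_client.get())   # waiting for the reply makes the exchange synchronous
    to_server.put("stop")
    worker.join()
    return replies

print(run_exchange(["a", "b"]))  # [('ack', 'a'), ('ack', 'b')]
```

Sending all messages first and collecting the replies later would turn the same exchange into an asynchronous one.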
Definitions (2)
• “A distributed system is a collection of independent computers that appear to the users of the system as a single computer”. [Tanenbaum]
• “A distributed system is a collection of autonomous computers linked by a network with software designed to produce an integrated computing facility”.
Computer networks vs. distributed systems
• Computer network: autonomous computers are
explicitly visible (have to be explicitly addressed)
• Distributed system: existence of multiple
computers is transparent
• However,
– many problems in common
– in some sense networks (or parts of them, e.g. name
services) are also distributed systems
– normally, every distributed system relies on services
provided by a computer network
Which examples are distributed systems?
• Network of workstations
→ distributed system
• Workflow information system: automatic order processing
→ distributed system
• World Wide Web
→ not fully qualified as a distributed system (Tanenbaum)
→ distributed system (Coulouris)
Middleware service
[Figure: distributed applications run on top of a middleware service layer that spans Machines A, B, and C, each with its own local OS]
• The middleware layer is meant to
– support heterogeneous computers
– provide a single view to users
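One concrete way a middleware layer can support heterogeneous computers is to agree on a single wire format, so that machines with different native byte orders still exchange identical bytes. The sketch below uses Python's standard `struct` module with network (big-endian) byte order. The length-prefixed layout and the names `marshal` and `unmarshal` are illustrative assumptions, not the format of any particular middleware.

```python
import struct

def marshal(values):
    """Encode a list of 32-bit ints as a count followed by the values, big-endian."""
    return struct.pack(f"!I{len(values)}i", len(values), *values)

def unmarshal(data):
    """Decode bytes produced by marshal() back into a list of ints."""
    (count,) = struct.unpack_from("!I", data, 0)
    return list(struct.unpack_from(f"!{count}i", data, 4))

wire = marshal([1, -2, 300])
print(wire.hex())
print(unmarshal(wire))  # [1, -2, 300]
```

Because the byte order is fixed by the format string rather than by the host CPU, sender and receiver do not need to know anything about each other's hardware.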
Goals in building distributed systems (1)
• Connecting users and resources
– sharing resources
– easier collaboration and exchange of information
→ disadvantages: security (intrusion), privacy violation (communication tracking)
Goals in building distributed systems (2)
• Transparency
Transparency   Description
Access         Hide differences in data representation and how a resource is accessed
Location       Hide where a resource is located
Migration      Hide that a resource may move to another location
Relocation     Hide that a resource may be moved to another location while in use
Replication    Hide that a resource may have many copies
Concurrency    Hide that a resource may be shared by several competitive users
Failure        Hide the failure and recovery of a resource
Persistence    Hide whether a (software) resource is in memory or on disk
• Trade-off between a high degree of transparency and the performance of the system
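Location and migration transparency can be illustrated with a name service: clients use a logical name, and only the name service knows which machine currently holds the resource. The sketch below is a single-process toy in which a dictionary stands in for the remote machines. `NameService`, `Client`, and `migrate` are hypothetical names introduced for illustration only.

```python
class NameService:
    """Hypothetical registry mapping logical names to current hosts."""
    def __init__(self):
        self._locations = {}

    def register(self, name, host):
        self._locations[name] = host

    def lookup(self, name):
        return self._locations[name]

class Client:
    """Reads resources by logical name and never mentions a host itself."""
    def __init__(self, names, storage):
        self._names = names
        self._storage = storage  # host -> {name: data}; stands in for remote machines

    def read(self, name):
        host = self._names.lookup(name)
        return self._storage[host][name]

def migrate(names, storage, name, new_host):
    """Move a resource to another host and update the name service."""
    old_host = names.lookup(name)
    storage.setdefault(new_host, {})[name] = storage[old_host].pop(name)
    names.register(name, new_host)

storage = {"hostA": {"/docs/report": "v1"}}
names = NameService()
names.register("/docs/report", "hostA")
client = Client(names, storage)

print(client.read("/docs/report"))   # v1
migrate(names, storage, "/docs/report", "hostB")
print(client.read("/docs/report"))   # v1, although the resource has moved
```

The extra lookup on every read is a small instance of the transparency/performance trade-off: hiding the location costs one more step per access.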
Goals in building distributed systems (3)
• Openness
– Offering services according to standard rules that describe the syntax and semantics of those services
• syntax specification: in an interface definition language
• semantic specification: in natural language
– Interoperability and portability
– Flexibility: using different components from different developers
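The split between syntax and semantics can be sketched in Python, with `typing.Protocol` standing in for an interface definition language: the method signature is the syntax specification, and the docstring is the natural-language semantic specification. All class and function names below are hypothetical. A real system would use an actual IDL rather than Python type hints.

```python
from typing import Protocol

class FileService(Protocol):
    def read(self, path: str) -> bytes:
        """Return the full contents of `path`; raise KeyError if it does not exist."""
        ...

class InMemoryFileService:
    """One implementation: stores raw bytes."""
    def __init__(self, files):
        self._files = dict(files)  # path -> bytes

    def read(self, path: str) -> bytes:
        return self._files[path]

class TextFileService:
    """A second, independently written implementation with a different internal layout."""
    def __init__(self, texts):
        self._texts = dict(texts)  # path -> str

    def read(self, path: str) -> bytes:
        return self._texts[path].encode("utf-8")

def word_count(service: FileService, path: str) -> int:
    """Client code written only against the interface."""
    return len(service.read(path).split())

print(word_count(InMemoryFileService({"/a": b"hello open world"}), "/a"))  # 3
print(word_count(TextFileService({"/a": "hello open world"}), "/a"))       # 3
```

Because `word_count` depends only on the interface, either component can be swapped in, which is the flexibility the last bullet describes.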
Goals in building distributed systems (4)
• Scalability
– Measured along three dimensions
• size: more users and resources can be added easily
• geography: users and resources may lie far apart
• administration: still easy to manage even when spanning many independent administrative organizations
– Some problems must be solved
• size: centralization
– centralized service: a single server for all users
– centralized data: a single online telephone book
– centralized algorithm: routing based on complete information
Goals in building distributed systems (5)
• geography: synchronous and unreliable communication
– some systems are designed only for LANs (blocking communication depends strongly on quick responses)
• administration: conflicting policies with respect to resource usage, management, and security
Scaling techniques
• Asynchronous communication
• Distribution
• Replication, caching
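Two of these techniques can be sketched together, using the telephone-book example of centralized data. Distribution spreads the entries over several servers by hashing the name, and caching lets a client avoid repeated remote lookups. This is a single-process sketch in which dictionaries stand in for servers. `PartitionedDirectory` and `CachingClient` are hypothetical names.

```python
import hashlib

class PartitionedDirectory:
    """Spreads entries over several servers instead of one central table."""
    def __init__(self, num_servers):
        self._servers = [dict() for _ in range(num_servers)]

    def _server_for(self, name):
        # A stable hash, so every client maps a name to the same server.
        digest = hashlib.sha256(name.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self._servers)

    def put(self, name, number):
        self._servers[self._server_for(name)][name] = number

    def get(self, name):
        return self._servers[self._server_for(name)][name]

class CachingClient:
    """Keeps local copies of answers to avoid repeated remote lookups."""
    def __init__(self, directory):
        self._directory = directory
        self._cache = {}
        self.remote_lookups = 0

    def lookup(self, name):
        if name not in self._cache:
            self.remote_lookups += 1
            self._cache[name] = self._directory.get(name)
        return self._cache[name]

directory = PartitionedDirectory(num_servers=4)
directory.put("alice", "555-0100")
client = CachingClient(directory)
client.lookup("alice")
client.lookup("alice")
print(client.remote_lookups)  # 1
```

The cache in this sketch is never invalidated, so a client could keep serving an outdated number after an update. Keeping copies consistent is the usual price of replication and caching.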
Some numbers (1)
• Computers in the Internet
Date         Computers     Web servers
1979, Dec.   188           0
1989, July   130,000       0
1999, July   56,218,000    5,560,866
2003, Jan.   171,638,297   35,424,956
Some numbers (2)
• Computers vs. Web servers in the Internet
Date         Computers     Web servers   Percentage
1993, July   1,776,000     130           0.008
1995, July   6,642,000     23,500        0.4
1997, July   19,540,000    1,203,096     6
1999, July   56,218,000    6,598,697     12
2001, July   125,888,197   31,299,592    25
Textbooks & materials
• Andrew S. Tanenbaum, Maarten van Steen, Distributed Systems: Principles and Paradigms, Prentice Hall, Second Edition, 2007
• George Coulouris, Jean Dollimore, Tim Kindberg, Distributed Systems: Concepts and Design, Addison Wesley, Fourth Edition, 2005
• Google
How are you evaluated?
• Homework & quizzes: 30%
• Mid-term exam: 30%
• Final exam: 40%
How to reach me
• hoai@cse.hcmut.edu.vn
or hoaitv@gmail.com
• http://www.cse.hcmut.edu.vn/~hoai