Experiences With Internet Traffic Measurement and Analysis

Vern Paxson
ICSI Center for Internet Research
International Computer Science Institute and Lawrence Berkeley National Laboratory
vern@icir.org
March 5th, 2004

Outline
• The 1990s: How is the Internet used?
  – Growth and diversity
  – Fractal traffic, "heavy tails"
  – End-to-end dynamics
  – Difficulties with measurement & analysis
• The 2000s: How is the Internet abused?
  – Prevalence of misuse
  – Detecting attacks
  – Worms

The 1990s: How is the Internet Used?

[Figure: Internet host counts over time, growing at ≈ 80%/year. Data courtesy of Rick Adams.]

Internet Growth: Exponential
• Growth of 80%/year
• Sustained for at least ten years …
• … before the Web even existed.
The Internet is always changing. You do not have a lot of time to understand it.

Characterizing Site Traffic
• Methodology: passively record traffic in/out of a site
• Danzig et al (1992)
  – 3 sites, 24 hrs, all packet headers
• Paxson (1994)
  – TCP SYN/FIN/RST control packets
    • Gives hosts, sizes, start time, duration, application
    • Large filtering win (≈ 10-100:1 packets, 1000s:1 bytes)
  – 7 month-long traces at Lawrence Berkeley Natl. Laboratory
  – 8 day-long traces from 6 other sites

Findings from Site Studies
• Traffic mix (which protocols are used; how many connections/bytes they contribute) varies widely from site to site.
• Mix also varies at the same site over time.
• Most connections have much heavier traffic in one direction than the other:
  – Even interactive login sessions (20:1)

Findings from Site Studies, con't
• Many random variables associated with connection characteristics (sizes, durations) are best described with log-normal distributions
  – But often these are not particularly good fits
  – And often their parameters vary significantly between datasets
• The largest connections in bulk transfers are very large
  – Tail behavior is unpredictable
Many of these findings differ from assumptions used in 1990s traffic modeling.

Theory vs. Measured Reality: Scaling Behavior in Internet Traffic

Burstiness
• Long-established framework: Poisson modeling
• Central idea: network events (packet arrivals, connection arrivals) are well-modeled as independent
• In its simplest form, there's just a rate parameter λ
• It then follows that the time between "calls" (events) is exponentially distributed, and the # of calls in an interval is Poisson distributed
• Implications (if the assumptions are correct):
  – Aggregated traffic will smooth out quickly
  – Correlations are fleeting, bursts are limited

Burstiness: Theory vs. Measurement
• For Internet traffic, Poisson models have a fundamental problem: they greatly underestimate burstiness
• Consider an arrival process: X_k gives the # of packets arriving during the kth interval of length T.
  – Take a 1-hour trace of Internet traffic (1995)
  – Generate (batch) Poisson arrivals with the same mean and variance

[Figure: measured traffic vs. the synthesized batch Poisson arrivals, viewed at successively coarser time scales (10, 100, 600), with each panel's previous region marked; the Poisson traffic smooths out while the measured traffic stays bursty.]

Burstiness Over Many Time Scales
• Real traffic has strong, long-range correlations
• Power spectrum:
  – Flat for Poisson processes
  – For measured traffic, diverges to ∞ as the frequency ω → 0
• To build Poisson-based models that capture this characteristic takes many parameters
• But due to the great variation in Internet traffic, we are desperate for parsimonious models (few parameters)

Describing Traffic with Fractals
• Landmark 1993 paper by Leland et al proposed capturing such characteristics (in Ethernet traffic) using self-similarity, a form of fractal-based modeling:
  – Parameterized by mean, variance, and the Hurst parameter H (see the variance-time sketch following these slides)
• Models predict burstiness on all time scales
• Queueing delays / drop probabilities are much higher than predicted by Poisson-based models

Heavy Tails
• Key prediction from fractal modeling: one way fractal traffic can arise in aggregate is if individual connections have activity periods (durations, sizes) whose distribution has infinite variance.
• Infinite variance manifests in the distribution's upper tail
• Consider the Pareto distribution, 1 - F(x) = P[X > x] = (x/a)^(-α)
  – If α < 2, then the distribution has infinite variance
  – Can test for a Pareto fit by plotting log (1 - F(x)) vs. log x (see the tail-index sketch following these slides)
    • A straight line indicates a Pareto distribution; its slope estimates -α

[Figure: log-log complementary distribution of Web connection sizes (226,386 observations); the upper 28,000 observations give α = 1.3, i.e., infinite variance.]

Self-Similarity & Heavy Tails, con't
• We find heavy-tailed sizes in many types of network traffic. Just a few extreme connections dominate the entire volume.
• Theorems then give us that this traffic aggregates to self-similar behavior.
• While self-similar models are parsimonious, they are not (alas) "simple".
• You can have self-similar correlations for which the magnitude of the variations is small, so it is still possible to obtain a statistical multiplexing gain, especially at very high aggregation.
• Smaller time scales behave quite differently.
  – When very highly aggregated, they can appear Poisson!

End-to-End Internet Dynamics: Routing & Packets

End-to-End Dynamics
• Ultimately what the user cares about is not what's happening on a given link, but the concatenation of behaviors along all of the hops in an end-to-end path.
• Measurement methodology: deploy measurement servers at numerous Internet sites, measure the paths between them
• Exhibits N² scaling: as the # of sites grows, the # of paths between them grows rapidly.

[Figures: "measurement infrastructure" sites in the 1994-1995 end-to-end dynamics study, and the paths in the study, illustrating the N² scaling effect.]

End-to-End Routing Dynamics
• Analysis of 40,000 "traceroute" measurements between 37 sites, 900+ end-to-end paths.
• Route prevalence:
  – Most end-to-end paths through the Internet are dominated by a single route.
• Route persistence:
  – 2/3 of routes remain unchanged for days/weeks
  – 1/3 of routes change on time scales of seconds to hours
• Route symmetry:
  – More than half of all routes visited at least one different city in each direction
  – Very important for tracking connection state inside the network!
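A minimal illustrative sketch (not the studies' actual code) of the scaling comparison above: estimate the Hurst parameter H from a variance-time plot, the kind of check that distinguishes Poisson-like traffic (H ≈ 0.5, smooths out quickly under aggregation) from self-similar traffic (H in the 0.7-0.9 range, stays bursty). It assumes a trace has already been reduced to `counts`, a NumPy array of packet counts in fixed-size time bins; the names and parameters are assumptions of the sketch.

```python
import numpy as np

def aggregate(counts, m):
    """Sum counts over non-overlapping blocks of m bins (the aggregated process)."""
    n = (len(counts) // m) * m
    return counts[:n].reshape(-1, m).sum(axis=1)

def hurst_estimate(counts, scales=(1, 2, 4, 8, 16, 32, 64, 128)):
    """Slope of log(variance of block means) vs. log(block size) gives 2H - 2."""
    log_m = np.log10(scales)
    log_var = np.log10([np.var(aggregate(counts, m) / m) for m in scales])
    slope = np.polyfit(log_m, log_var, 1)[0]
    return 1 + slope / 2

# Batch-Poisson "traffic" comes out near H = 0.5; the measured LAN/WAN traces
# described above typically give substantially larger H.
fake_poisson = np.random.poisson(lam=5.0, size=200_000)
print(hurst_estimate(fake_poisson))
```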
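And a minimal sketch of the tail test described on the "Heavy Tails" slide: plot log(1 - F(x)) against log(x); an approximately straight line suggests a Pareto tail, and its slope estimates -α. Illustrative only: NumPy, the variable `sizes`, and fitting over the top 10% of observations are assumptions of the sketch, not the talk's actual analysis.

```python
import numpy as np

def tail_index(sizes, tail_fraction=0.10):
    """Estimate alpha from the slope of the empirical log-log complementary CDF."""
    x = np.sort(np.asarray(sizes, dtype=float))
    n = len(x)
    ccdf = 1.0 - (np.arange(1, n + 1) - 0.5) / n     # empirical P[X > x]
    k = max(int(n * tail_fraction), 10)              # fit only the upper tail
    slope = np.polyfit(np.log10(x[-k:]), np.log10(ccdf[-k:]), 1)[0]
    return -slope

# Synthetic Pareto(alpha = 1.3) "connection sizes", like the Web-size figure above;
# alpha < 2 means infinite variance.
sizes = (np.random.pareto(1.3, 200_000) + 1.0) * 1_000.0
print(tail_index(sizes))   # should come out near 1.3
```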
End-to-End Packet Dynamics
• Analysis of 20,000 TCP bulk transfers of 100 KB between 36 sites
• Each traced at both ends using tcpdump
• Benefits of using TCP:
  – Real-world traffic
  – Can probe fine-grained time scales, though only as shaped by TCP's congestion control
• Drawbacks to using TCP:
  – Endpoint TCP behavior is a major analysis headache
  – TCP's loading of the transfer path also complicates analysis

End-to-End Packet Dynamics: Unusual Behavior
• Out-of-order delivery:
  – Not uncommon: 0.6%-2% of all packets.
  – Strongly site-specific.
  – Generally little impact on performance.
• Replicated packets:
  – Very rare, but does occur (e.g., 1 packet in, 22 out)
• Corrupted packets (bad checksum):
  – Overall, 1 in 5,000 (!)
  – Stone/Partridge (2000): between 1 in 1,100 and 1 in 32,000
  – Undetected by the checksum: between 1 in 16 million and 1 in 10 billion

End-to-End Packet Dynamics: Loss
• Half of all 100 KB transfers experienced no loss
  – 2/3 for paths within the U.S.
• The other half experienced significant loss:
  – Average 4-9%, but with wide variation
• TCP loss is not well described as independent (see the sketch following these slides)
• Losses are dominated by a few long-lived outages
• (Keep in mind: this is 1994-1995!)
• Subsequent studies:
  – Loss rates have gotten much better
  – Loss episodes are well described as independent
  – The same holds for regions of stable delay and throughput
  – Time scales of constancy are minutes or more

Issues / Difficulties for Analyzing Internet Traffic: Measurement, Simulation & Analysis

There is No Such Thing as "Typical"
• Heterogeneity in:
  – Traffic mix
  – Range of network capabilities
    • Bottleneck bandwidth (orders of magnitude)
    • Round-trip time (orders of magnitude)
  – Dynamic range of network conditions
    • Congestion / degree of multiplexing / available bandwidth
    • Proportion of traffic that is adaptive/rigid/attack
• Immense size & growth
  – Rare events will occur
• New applications explode on the scene (doubling every 7-8 weeks for 2 years)

There is No Such Thing as "Typical", con't
• New applications explode on the scene
  – Not just the Web, but: Mbone, Napster, KaZaA etc., IM
• Even robust statistics fail.
  – E.g., median size of FTP data transfer at LBL:
    • Oct. 1992: 4.5 KB
    • Mar. 1993: 2.1 KB
    • Mar. 1998: 10.9 KB
    • Dec. 1998: 5.6 KB
    • Dec. 1999: 10.9 KB
    • Jun. 2000: 62 KB
    • Nov. 2000: 10 KB (60,000 samples)
• Danger: if you wrongly assume that something is "typical", nothing tells you that you are wrong!

The Search for Invariants
• In the face of such diversity, identifying things that don't change has immense utility
• Some Internet traffic invariants:
  – Daily and weekly patterns
  – Self-similarity on time scales of 100s of msec and above
  – Heavy tails
    • both in activity periods and elsewhere, e.g., topology
  – Poisson user session arrivals
  – Log-normal sizes (excluding tails)
  – Keystroke interarrival times follow a Pareto distribution

The Danger of Mental Models
[Figure: a distribution that appears to be "exponential plus a constant offset", but it is not exponential: it is Pareto, with a heavy tail, α ≈ 1.0.]
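A sketch under simple assumptions (a hypothetical per-packet loss indicator list, not the study's data) of one quick way to see whether losses look independent, as discussed on the loss slide. Under independence, P[loss | previous packet lost] should be close to the overall loss rate; traces dominated by long outages show a far larger conditional rate.

```python
def loss_rates(lost):
    """lost: list of booleans in transmission order, True if that packet was lost."""
    overall = sum(lost) / len(lost)
    after_loss = [b for a, b in zip(lost, lost[1:]) if a]
    conditional = sum(after_loss) / len(after_loss) if after_loss else 0.0
    return overall, conditional

# Example: a 1,000-packet transfer with one 50-packet outage.  The overall loss
# rate is only 5%, but a loss is followed by another loss 98% of the time.
trace = [False] * 900 + [True] * 50 + [False] * 50
overall, conditional = loss_rates(trace)
print(f"overall loss rate {overall:.3f}, P[loss | prior loss] {conditional:.3f}")
```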
Versus the Power of Modeling to Open Our Eyes
• Fowler & Leland, 1991: Traffic 'spikes' (which cause actual losses) ride on longer-term 'ripples', which in turn ride on still longer-term 'swells'
• They lacked the vocabulary that came from self-similar modeling (1993)
• Similarly, the 1993 self-similarity paper: we did so without first studying and modeling the behavior of individual Ethernet users (sources)
• Modeling led to the suggestion to investigate heavy tails

Measurement Soundness
• How well-founded is a given Internet measurement?
• We can often use additional information to help calibrate.
• One source: protocol structure
  – E.g., was a packet dropped by the network … or by the measurement device?
  – For TCP, can check: did the receiver acknowledge it? (see the sketch following these slides)
    • If Yes, then dropped by the measurement device
    • If No, then dropped by the network
• Can also calibrate using additional information

Calibration Using Additional Information: Packet Timings
[Figures: packet-timing plots annotated with apparent anomalies: a possible routing change, and a clock adjustment.]

Reproducibility of Results (or lack thereof)
• It is rare, though sometimes occurs, that raw measurements are made available to other researchers for further analysis or for confirmation.
• It is more rare that analysis tools and scripts are made available, particularly in a coherent form that others can actually get to work.
• It is even rarer that measurement glitches, "outliers," analysis fudge factors, etc., are detailed.
• In fact, often researchers cannot reproduce their own results.

Towards Reproducible Results
• Need to ensure a systematic approach to data reduction and analysis
  – I.e., a "paper trail" for how the analysis was conducted, particularly when bugs are fixed
• A methodology to do this:
  – Enforce the discipline of using a single (master) script that builds all analysis results from the raw data
  – Maintain all intermediary/reduced forms of the data as explicitly ephemeral
  – Maintain a notebook of what was done and to what effect
  – Use version control for scripts & notebook
  – But also really need: ways to visualize what's changed in analysis results after a re-run

The 2000s: How is the Internet Abused?

Magnitude of Internet Attacks
• As seen at Lawrence Berkeley National Laboratory, on a typical day in 2004:
  – > 70% of Internet connections (20 million out of 28 million) reflect clear attacks.
  – 60 different remote hosts scan one of LBL's two blocks of 65,536 addresses in its entirety
  – More than 10,000 remote hosts engage in scanning activity
• Much of this activity reflects "worms"
• Much of the rest reflects automated scan-and-exploit tools

How is the Internet Abused? Detecting Network Attacks

Design Goals for the "Bro" Intrusion Detection System
• Monitor traffic in a very high performance environment
• Real-time detection and response
• Separation of mechanism from policy
• Ready extensibility of both mechanism and policy
• Resistant to evasion

How Bro Works
[Diagram, built up over the next several slides: Network → libpcap (tcpdump filter) → filtered packet stream → event engine → event stream → policy script interpreter (which also pushes event control back down) → real-time notification / record to disk.]
• Taps a GigEther fiber link passively, sends up a copy of all network traffic.
• The kernel filters down the high-volume stream via the standard libpcap packet capture library.
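A minimal sketch of the ACK-based calibration check above (hypothetical data structures; not Bro or the study's actual tooling). Suppose a data packet appears in the sender-side trace but not in the receiver-side trace: if the receiver's cumulative ACKs nonetheless cover that data, the packet must have reached the receiver, so the measurement device dropped it; otherwise the network dropped it.

```python
def classify_missing_packet(seq_end, receiver_acks):
    """seq_end: highest sequence number carried by the missing data packet.
    receiver_acks: cumulative ACK values observed from the receiver.
    (A fuller analysis must also rule out the data arriving later via a
    retransmission rather than via the missing packet itself.)"""
    if any(ack >= seq_end for ack in receiver_acks):
        return "dropped by measurement device"   # the data was acknowledged anyway
    return "dropped by network"

# Example: a packet covering data up to sequence 14600 is absent from the
# receiver-side trace, yet the receiver ACKed past it.
print(classify_missing_packet(14600, receiver_acks=[13140, 14600, 16060]))
```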
How Bro Works, con't
• The "event engine" distills the filtered stream into high-level, policy-neutral events reflecting underlying network activity
  – E.g., connection_attempt, http_reply, user_logged_in
• The "policy script" processes the event stream, incorporating:
  – Context from past events
  – The site's particular policies
• … and takes action:
  – Records to disk
  – Generates alerts via syslog, paging
  – Executes programs as a form of response

Experiences with Bro
• Exciting research because it is used operationally (24x7) at several open sites (LBL, UCB, TUM)
• Key enabler: the sites' threat model
  – Occasional break-ins are tolerable
  – "Jewels" are additionally protected (e.g., firewalls)
• Significant real-world concern: policy management
• Dynamic blocking is critical to success
  – Currently, 100-200 blocks/day

The Problem of Evasion
• Fundamental problem with passively measuring traffic on a link: network traffic is inherently ambiguous
• Generally not a significant issue for traffic characterization
• But it is in the presence of an adversary: attackers can craft traffic to confuse/fool the monitor

Evading Detection Via Ambiguous TCP Retransmission
[Diagram: a TCP retransmission carrying different data than the original, leaving the monitor uncertain which version the receiver actually processed; see the sketch following these slides.]

The Problem of "Crud"
• There are many such ambiguities attackers can leverage.
• Unfortunately, they occur in benign traffic, too:
  – Legitimate tiny fragments, overlapping fragments
  – Receivers that acknowledge data they did not receive
  – Senders that retransmit different data than they originally sent
• In a diverse traffic stream, you will see these
• Approaches for defending against evasion:
  – Traffic "normalizers" that actively remove ambiguities
  – "Mapping" of local hosts to determine their behaviors
  – Active participation by local hosts in intrusion detection

How is the Internet Abused? The Threat of Internet Worms

What is a Worm?
• Self-replicating/self-propagating code.
• Spreads across a network by exploiting flaws in open services.
  – As opposed to viruses, which require user action to quicken/spread.
• Not new --- Morris Worm, Nov. 1988
  – 6-10% of all Internet hosts infected
• Many more since, but none on that scale … until …

Code Red
• Initial version released July 13, 2001.
• Exploited a known bug in Microsoft IIS Web servers.
• 1st through 20th of each month: spread. 20th through end of each month: attack.
• Payload: web site defacement.
• Spread: via random scanning of the 32-bit IP address space.
• But: failure to seed the random number generator ⇒ linear growth.

Code Red, con't
• Revision released July 19, 2001.
• Payload: flooding attack on www.whitehouse.gov.
• Bug led to it dying for dates ≥ 20th of the month.
• But: this time the random number generator was correctly seeded. Bingo!

Network Telescopes
• Idea: monitor a cross-section of the IP address space to measure network traffic involving random addresses (flooding "backscatter"; worm scanning)
• LBL's cross-section: 1/32,768 of the Internet.
• UCSD's cross-section: 1/256.
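A toy illustration (segment contents and reassembly policies invented for illustration) of the ambiguity behind the evasion slide above: an attacker retransmits overlapping TCP segments whose payloads disagree. A passive monitor must guess whether the end host keeps the first or the last copy of overlapping bytes; guessing wrong means the monitor reconstructs a different byte stream than the one the host actually processed, which is exactly the ambiguity a traffic normalizer removes.

```python
def reassemble(segments, prefer_old=True):
    """segments: list of (byte offset, payload).  Resolve overlaps in favor of the
    earliest copy (prefer_old=True) or the latest copy (prefer_old=False)."""
    stream = {}
    for offset, data in segments:
        for i, byte in enumerate(data):
            pos = offset + i
            if prefer_old and pos in stream:
                continue            # keep the copy we saw first
            stream[pos] = byte      # otherwise the newer copy wins
    return bytes(stream[p] for p in sorted(stream))

# The same two segments yield two different request lines, depending on policy:
segments = [(0, b"GET /index.html HTTP/1.0"), (5, b"evil.cgi  ")]
print(reassemble(segments, prefer_old=True))    # b'GET /index.html HTTP/1.0'
print(reassemble(segments, prefer_old=False))   # b'GET /evil.cgi   HTTP/1.0'
```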
Spread of Code Red
• Network telescopes give a lower bound on the # of infected hosts: 360K.
• The course of infection fits the classic logistic curve (see the simulation sketch following these slides).
• Note: the larger the vulnerable population, the faster the worm spreads.
• That night (the 20th), the worm dies …
  – … except for hosts with inaccurate clocks!
  – It just takes one of these to restart the worm on August 1st …

Striving for Greater Virulence: Code Red 2
• Released August 4, 2001.
• Comment in code: "Code Red 2."
• But in fact a completely different code base.
• Payload: a root backdoor, resilient to reboots.
• Bug: crashes NT; only works on Windows 2000.
• Localized scanning: prefers nearby addresses.
• Kills Code Red I.
• Safety valve: programmed to die Oct 1, 2001.

Striving for Greater Virulence: Nimda
• Released September 18, 2001.
• Multi-mode spreading:
  – attack IIS servers via infected clients
  – email itself to address books, as a virus
  – copy itself across open network shares
  – modify Web pages on infected servers with a client exploit
  – scan for Code Red II backdoors (!) ⇒ worms form an ecosystem!
• Leaped across firewalls.

[Figures: network-telescope traffic "Life Just Before Slammer" and "Life Just After Slammer".]

A Lesson in Economy
• Slammer exploits a connectionless UDP service, rather than connection-oriented TCP.
• The entire worm fits in a single packet! When scanning, the worm can "fire and forget".
• The worm infected 75,000+ hosts in 10 minutes (despite a broken random number generator).
• Progress limited by the Internet's carrying capacity!

[Figures: the usual logistic growth vs. Slammer's bandwidth-limited growth.]

Blaster
• Released August 11, 2003.
• Exploits a flaw in an RPC service ubiquitous across Windows.
• Payload: attack Microsoft Windows Update.
• Despite a flawed scanning and secondary-infection strategy, it rapidly propagated to 100Ks of hosts.
• Actually, the bulk of infections are really Nachia, a Blaster counter-worm.
• Key paradigm shift: firewalls don't help.

What if Spreading Were Well-Designed?
• Observation (Weaver): much of a worm's scanning is redundant.
• Idea: coordinated scanning
  – Construct a permutation of the address space
  – Each new worm starts at a random point
  – A worm instance that "encounters" another instance re-randomizes.
  ⇒ Greatly accelerates the worm in later stages.

What if Spreading Were Well-Designed?, con't
• Observation (Weaver): accelerate the initial phase using a precomputed hit-list of, say, 1% of the vulnerable hosts.
  – At 100 scans/worm/sec, can infect a huge population in a few minutes.
• Observation (Staniford): compute a hit-list of the entire vulnerable population, propagate via divide & conquer.
  – At 10 scans/worm/sec, infect in 10s of seconds!

Defenses
• Detect via honeyfarms: collections of "honeypots" fed by a network telescope.
  – Any outbound connection from the honeyfarm = worm.
  – Distill a signature from inbound/outbound traffic.
  – If the telescope covers N addresses, expect detection when the worm has infected 1/N of the population.
• Thwart via scan suppressors: network elements that block traffic from hosts that make failed connection attempts to too many other hosts.

Defenses?
• Observation: worms don't need to randomly scan
• Meta-server worm: ask a server for hosts to infect. E.g., query Google for "index.html".
• Topological worm: fuel the spread with local information from infected hosts (web server logs, email address books, config files, SSH "known hosts")
  ⇒ No scanning signature; with a rich interconnection topology, potentially very fast.
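A toy simulation (all parameters invented for illustration, not measurements) of a random-scanning worm, showing the logistic growth noted on the "Spread of Code Red" slide and why a larger vulnerable population spreads faster. NumPy is assumed, used only for its fast binomial sampler.

```python
import numpy as np

def simulate(vulnerable, address_space=2**32, scans_per_tick=4_000, ticks=800, seed=0):
    """Number of infected hosts after each tick, starting from a single infection.
    scans_per_tick is the number of random probes each infected host sends per
    tick (e.g., roughly 40 seconds of scanning at 100 scans/sec)."""
    rng = np.random.default_rng(seed)
    infected, history = 1, []
    for _ in range(ticks):
        scans = infected * scans_per_tick
        # Chance that a single random probe lands on a not-yet-infected victim.
        p_hit = (vulnerable - infected) / address_space
        # Ignores two probes hitting the same victim in one tick; fine for a toy.
        infected = min(infected + int(rng.binomial(scans, p_hit)), vulnerable)
        history.append(infected)
    return history

# Growth is slow at first, explodes exponentially, then saturates (the logistic
# shape).  Doubling the vulnerable population roughly halves the time to spread,
# since every probe is twice as likely to find a fresh victim.
for v in (100_000, 200_000):
    h = simulate(v)
    t_half = next(t for t, n in enumerate(h) if n >= v // 2)
    print(f"{v} vulnerable hosts: half infected by tick {t_half}")
```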
Defenses??
• Contagion worm: propagate parasitically along with normally initiated communication.
  – E.g., using 2 exploits - Web browser & Web server - infect any vulnerable servers visited by the browser, then any vulnerable browsers that come to those servers.
  – E.g., using 1 KaZaA exploit, glide along an immense peer-to-peer network in days/hours.
  ⇒ No unusual connection activity at all! :-(

Some Observations
• Today's worms have significant real-world impact:
  – Code Red disrupted routing
  – Slammer disrupted elections, ATMs, airline schedules, operations at an off-line nuclear power plant …
  – Blaster possibly contributed to the North American blackout of Aug. 2003
• But today's worms are amateurish
  – Frequent bugs, algorithm/attack botches
  – Unimaginative payloads

Next-Generation Worm Authors
• Potential for major damage with nastier payloads :-(
• Military ("cyberwarfare")
• Criminals:
  – Denial-of-service, spamming for hire
  – "Access for Sale: A New Class of Worm" (Schechter/Smith, ACM CCS WORM 2003)
• Money on the table ⇒ arms race

Summary
• Internet measurement is deeply challenging:
  – Immense diversity
  – The Internet never ceases to be a moving target
  – Our mental models can betray us: the Internet is full of surprises!
  ⇒ Seek invariants
• Many of the last decade's measurement questions -- "What are the basic characteristics and properties of Internet traffic?" -- have returned …
• … but now regarding Internet attacks
• What on Earth will the next decade hold??