ppt - Jon Schipp

What's Under Your Hood?
Implementing a Network Monitoring System
4/9/2015
jonschipp@gmail.com
Who am I?

Jon Schipp

Unix Admin

Linux & Unix User Group

Southern Indiana Computer Klub
and...
I like computers a lot
What's Network Monitoring?
Monitoring?

Monitoring your network

Collecting data, i.e. network traffic

Interpreting the data
Why?

Network issues

Attack detection

Record keeping

Fun
Focus

Small/Medium size business

Basement endeavors

Cheap goods

Working with what you have
where the magic happens
gimme the data

hubs

monitor/SPAN ports, port mirroring

taps

ip forwarding/relaying/tunneling, whatev
Forwarding/Relaying


Wireshark Remote Feature
NetworkMiner Professional: Pcap-over-IP
tcpdump -nni eth0 -s0 -w - | nc 192.168.1.254 33246
SSL/Encryption: ssh, socat, ncat, cryptcat, stunnel

Netfilter's iptables
iptables -t mangle -A PREROUTING -i eth0 -p tcp -m multiport --dports 80,443,22,20,21 -j TEE --gateway 192.168.1.254
iptables -t mangle -A POSTROUTING -o eth0 -p tcp -m multiport --dports 80,443,22,20,21 -j TEE --gateway 192.168.1.254

OpenBSD's PF
pass out on em0 dup-to (em1 192.168.1.254) proto tcp from any to any port { 80, 443, 22, 20, 21 }
pass in on em0 dup-to (em1 192.168.1.254) proto tcp from any to any port { 80, 443, 22, 20, 21 }
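
On the receiving end of the tcpdump-to-nc pipeline, the stream should begin with the 24-byte pcap global header. A minimal sketch of checking it before storing anything, assuming the classic pcap format (not pcapng); the function name is made up for illustration and is not part of any tool named above:

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic pcap with microsecond timestamps


def parse_pcap_global_header(data: bytes) -> dict:
    """Decode the 24-byte header that starts a classic pcap stream."""
    if len(data) < 24:
        raise ValueError("need at least 24 bytes")
    # The sender's byte order is unknown, so try both and trust the magic number.
    for endian in ("<", ">"):
        magic, major, minor, _tz, _sigfigs, snaplen, linktype = struct.unpack(
            endian + "IHHiIII", data[:24]
        )
        if magic == PCAP_MAGIC:
            return {
                "endian": endian,
                "version": (major, minor),
                "snaplen": snaplen,
                "linktype": linktype,  # 1 means Ethernet
            }
    raise ValueError("not a classic pcap stream")


# Hand-built example header: version 2.4, snaplen 65535, Ethernet
sample = struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0, 65535, 1)
header = parse_pcap_global_header(sample)
print(header)
```

In practice the first 24 bytes read from the listening socket on port 33246 would be passed in instead of the hand-built sample.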
Architecture
High Speed Packet Capture

High-end equipment is expensive

DIY: tuning and compiling

Hardware is pretty fast nowadays but...

We are using software that isn't designed for efficient packet capture
NICs

Get a quality card

NAPI is good

DMA is good

Intel PRO/1000 MT Gigabit models are generally good, $30 on eBay
PCI buses
(bus speed in MHz) * (bus width in bits) / 8 = speed in megabytes/second

Bus           Clock (MHz)   Width (bits)   Raw MB/s   Overhead   Usable MB/s
PCI           66            32             264        -          264
PCI-X         66            64             528        20%        ~420
PCI-X         133           64             1064       20%        ~850
PCI-X         266           64             2128       20%        ~1700
PCI-X         533           64             4264       20%        ~3400
PCIe v1 x1    2500          1 lane         312.5      20%        250
PCIe v2 x1    5000          1 lane         625        20%        500
PCIe v2 x2    5000          2 lanes        1250       20%        1000
PCIe v2 x4    5000          4 lanes        2500       20%        2000
PCIe v2 x8    5000          8 lanes        5000       20%        4000
PCIe v2 x16   5000          16 lanes       10000      20%        8000
PCIe v2 x32   5000          32 lanes       20000      20%        16000
PCIe v3 x32   8000          32 lanes       32000      1.5%       ~31500

Network line rates for comparison:
1 Gb/s: 1000 / 8 = 125 megabytes/second
10 Gb/s: 10000 / 8 = 1250 megabytes/second
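
The bus formula is easy to check by hand or in a few lines of code. A small sketch, assuming the overhead is applied as a flat fraction of the raw figure (the function name is illustrative only):

```python
def bus_mb_per_s(clock_mhz: float, width_bits: int, overhead: float = 0.0) -> float:
    """(clock in MHz) * (width in bits) / 8, less a fractional overhead."""
    return clock_mhz * width_bits / 8 * (1.0 - overhead)


def line_rate_mb_per_s(link_mbit_per_s: float) -> float:
    """Network line rate converted from megabits to megabytes per second."""
    return link_mbit_per_s / 8


pci = bus_mb_per_s(66, 32)               # plain 32-bit PCI
pcie2_x4 = bus_mb_per_s(5000, 4, 0.20)   # PCIe v2 x4 with 20% overhead
gige = line_rate_mb_per_s(1000)          # one gigabit port, one direction

print(f"PCI: {pci:.0f} MB/s, PCIe v2 x4: {pcie2_x4:.0f} MB/s, GigE: {gige:.0f} MB/s")
print(f"GigE ports a plain PCI bus could feed at most: {int(pci // gige)}")
```

The comparison is the useful part: a single gigabit port already takes about half of a 32-bit/66 MHz PCI bus, and the bus is shared with every other device on it.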
Other things




Decent commodity CPU, e.g. Opteron beats Xeon in capture
SMP is good
If you plan on storing the data, writing to disk will be a bottleneck
RAID striping, SATA? for sure
SSD (maybe?) nah
Typical Frame Processing








Frame reaches NIC
Ethernet preamble is removed
FCS is calculated; if bad, the frame is dropped
If the interface is in promiscuous mode, capture everything
Else, only process when dst MAC is mine (unicast), broadcast, or multicast (if enabled)
Frame moves from the NIC FIFO to the kernel ring buffer, via CPU or DMA
NIC generates an interrupt; the interrupt handler is called
Passed to host stack → ip_input module → tcp/udp module → userspace
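
The accept-or-ignore decision in the steps above can be sketched in a few lines. This is a simplified model of the address filter only, not driver code; the function names are invented, and real NICs typically filter multicast per group rather than with a single on/off switch:

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def is_multicast(mac: str) -> bool:
    """Group addresses have the least-significant bit of the first octet set."""
    return (int(mac.split(":")[0], 16) & 1) == 1


def nic_accepts(dst: str, own: str, promiscuous: bool = False,
                multicast_on: bool = False) -> bool:
    """Model of the NIC address filter for a frame with destination MAC dst."""
    dst, own = dst.lower(), own.lower()
    if promiscuous:
        return True                      # sniffing: take everything
    if dst == own or dst == BROADCAST:
        return True                      # unicast to me, or broadcast
    return multicast_on and is_multicast(dst)


me = "00:02:b3:9a:c2:03"
print(nic_accepts("00:11:22:33:44:55", me))                    # someone else's unicast
print(nic_accepts("00:11:22:33:44:55", me, promiscuous=True))  # same frame while sniffing
```

This is why a capture interface has to be in promiscuous mode: without it, unicast traffic between other hosts never makes it past the NIC.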
Frame Processing
Specimen

FreeBSD 8.2-RELEASE

Ubuntu Server 10.04
mbuf kernel structure

FreeBSD - data and headers are stored in mbufs and mbuf clusters
$ netstat -m | head -n 3
82/653/735 mbufs in use (current/cache/total)
0/648/648/25600 mbuf clusters in use (current/cache/total/max)
0/256 mbuf+clusters out of packet secondary zone in use (current/cache)
man mbuf: The total size of an mbuf, MSIZE, is a constant defined in <sys/param.h>.
$ grep -H -n MSIZE /sys/sys/param.h
sys/sys/param.h:145:#define MSIZE 256 /* size of an mbuf */
sysctl kern.ipc.nmbclusters=25600 (default)
$ vmstat -z | grep mbuf_cluster
mbuf_cluster:   2048,   25600
               ^size^  ^limit^
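
The size and limit columns give an upper bound on how much memory the cluster pool can hold. A quick worked calculation using the two values shown above (the function name is illustrative only):

```python
CLUSTER_SIZE = 2048    # bytes per mbuf cluster, "size" column of vmstat -z
CLUSTER_LIMIT = 25600  # kern.ipc.nmbclusters, "limit" column


def cluster_pool_bytes(size: int = CLUSTER_SIZE, limit: int = CLUSTER_LIMIT) -> int:
    """Maximum bytes the mbuf cluster pool can hold."""
    return size * limit


pool = cluster_pool_bytes()
print(f"{pool} bytes = {pool / 2**20:.0f} MiB")
```

With the default limit that is 50 MiB of packet data at most, so raising kern.ipc.nmbclusters raises the ceiling proportionally.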
sk_buff kernel structure

Linux - data and headers are stored in sk_buffs
/usr/include/linux/skbuff.h
Problems

Each packet generates an interrupt; this can lead to receive livelock (an interrupt storm)

Context switches

System Calls
Solutions

Device Polling

NAPI

Shared memory, mmap(), and Zero Copy

Bypassing host stack
Solutions, less so

Checksum offloading

Large Receive Offload (LRO)

Larger on-board memory size

More data descriptors
Capture Mechanisms/Subsystems

Berkeley Packet Filter (BPF)
Filter packets before they get to user space

Linux Socket Filter (LSF)
Extended BPF (kinda)


and PF_RING (Linux)
Others: CSPF, NDIS, xPF, MPF, DPF, Swift and so on...
libpcap

C library for packet capture
Provides link-layer access to data available on the network through interfaces attached to the system.

Runs on almost all modern Unices
WinPcap for Windows

When data reaches user space, it's stored in the libpcap buffer; applications read from it
FreeBSD Frame Processing
FreeBSD Processing cont.

3 copies due to double buffer

Deals with smaller buffers compared to Linux

Half of the double buffer is copied to user space


Packet is passed to each BPF device, /dev/bpf[0-9] (where the application binds via libpcap)
App reads from the HOLD buffer; data is copied from the STORE buffer into the HOLD buffer
Linux Frame Processing
Linux Processing cont.

2 copies

Deals with larger buffers compared to FreeBSD

Smart queue, pointers


Packets are copied individually, not as whole buffers full of packets
If packets are available, wake up user space (libpcap) to grab data from LSF
Tuning: Interrupt Livelock

Interrupt usage high?

Most modern Linux kernels are compiled with device polling

FreeBSD does not have it on by default
options DEVICE_POLLING
options HZ=1000
make buildkernel KERNCONF=NEWKERN
make installkernel KERNCONF=NEWKERN
ifconfig em0 polling

Get a New API (NAPI) card
Tuning: Buffers

Kernel dropping lots of packets?

Increase the size of your kernel buffers

FreeBSD
sysctl net.bpf.bufsize=4096
sysctl net.bpf.maxbufsize=524288

Linux
sysctl net.core.rmem_default=114688
sysctl net.core.rmem_max=131071
sysctl net.core.netdev_max_backlog=1000

Increase kernel virtual memory size
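
To see why small buffers drop packets, it helps to work out how long a buffer can absorb traffic while the reader is stalled. A rough sketch, assuming the link runs at full line rate and ignoring any per-packet bookkeeping stored alongside the data (the function name is illustrative only):

```python
def seconds_until_full(buffer_bytes: int, link_mbit_per_s: float) -> float:
    """Time for a stalled reader's buffer to fill at full line rate."""
    bytes_per_second = link_mbit_per_s * 1_000_000 / 8
    return buffer_bytes / bytes_per_second


# Buffer sizes taken from the sysctl values above, on a 1 Gb/s link
for size in (4096, 131071, 524288):
    ms = seconds_until_full(size, 1000) * 1000
    print(f"{size:>7} bytes fills in {ms:.3f} ms")
```

Even the 512 KB figure covers only a few milliseconds of saturated gigabit traffic, which is why a brief stall while writing to disk is enough to cause drops.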
Tuning: Drivers

Bad NIC performance?

FreeBSD: man driver e.g. man em:
hw.em.rxd
Number of receive descriptors allocated by the driver. The
default value is 256. The 82542 and 82543-based adapters can
handle up to 256 descriptors, while others can have up to 4096.
echo hw.em.rxd=4096 >> /boot/loader.conf

Linux: ethtool, find driver README file (/usr/src/linux/)
ethtool -g eth0
ethtool -G eth0 rx 4096
tcpdump tests, average loss
6,000,000 packets in 60 seconds using iperf

OS defaults, hardware: Dell PowerEdge 2850, Xeon (Quad), 4GB RAM

tcpdump -nni em0 -w test96.pcap | FreeBSD: 0%, Linux: 8%

tcpdump -nni em0 -w /dev/null | FreeBSD: 0%, Linux: 0%

tcpdump -nni em0 -s0 -w test65535.pcap | FreeBSD: 1.6%, Linux: 22%

tcpdump -nni em0 -s0 -w /dev/null | FreeBSD: 0%, Linux: 0.02%
libpcap buffers

libpcap initializes its buffer to 32 KB if the BPF value is less than 32 KB
if ((ioctl(fd, BIOCGBLEN, (caddr_t)&v) < 0) || v < 32768)
    v = 32768;
Linux initializes its buffer size at 512 KB

Increase the BPF buffer size globally, for all apps, remember?
net.bpf.bufsize, net.bpf.maxbufsize

libpcap will initialize its buffer to the size in net.bpf.bufsize

To set the buffer for tcpdump only, use -B 512 (512 KB; -B takes its value in KiB)
FreeBSD, interface drop counts
netstat
$ netstat -dI em0
Name  Mtu   Network   Address            Ipkts    Ierrs  Idrop  Opkts    Oerrs  Coll  Drop
em0   1500  <Link#2>  00:02:b3:9a:c2:03  2083316  0      0      1043607  0      0     0
$ netstat -B
Pid    Netif  Flags    Recv     Drop  Match    Sblen  Hblen  Command
90460  em0    p--s---  103      0     103      632    0      tcpdump
43960  em0    p--s---  3803363  0     3803363  712    0      ntop
$ sysctl dev.em.0.dropped
dev.em.0.dropped: 0
$ grep -R -H -n if_iqdrops /usr/src/
sys/dev/e1000/if_lem.c:3470: ifp->if_iqdrops++;
usr.bin/netstat/if.c:289: idrops = ifnet.if_iqdrops
Linux, interface drop counts
ifconfig
$ ifconfig -a | egrep -e "(^eth|drop)"
From the ifconfig source:
static int get_dev_fields(char *bp, struct interface *ife)
{
    switch (procnetdev_vsn) {
    case 3:
        sscanf(bp,
               "%llu %llu %lu %lu %lu %lu %lu",
               &ife->stats.rx_bytes,
               &ife->stats.rx_packets,
               &ife->stats.rx_errors,
               &ife->stats.rx_dropped,
               ...
$ ethtool -S eth0
$ awk '{ print $1, $5 }' /proc/net/dev
Inter-|
face drop
lo: 0
br0: 3354
eth0: 0
eth1: 0
eth2: 0
eth3: 14
eth4: 0
eth5: 103395
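
The same receive-drop column can be pulled out of /proc/net/dev programmatically, for example to log it over time. A minimal sketch; the function name is made up, and the sample text is fabricated for illustration except for the eth5 drop count, which matches the output above:

```python
def rx_drops(proc_net_dev_text: str) -> dict:
    """Map interface name to its receive drop counter."""
    drops = {}
    for line in proc_net_dev_text.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        name, stats = line.split(":", 1)
        fields = stats.split()
        if len(fields) < 4:
            continue
        # Receive columns are: bytes packets errs drop ...
        drops[name.strip()] = int(fields[3])
    return drops


SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth5:9000000   60000    0 103395  0     0          0         0        0       0    0    0    0     0       0          0
"""

print(rx_drops(SAMPLE))
```

Splitting on the colon rather than on whitespace keeps this working when a large byte counter runs up against the interface name, as in the eth5 line of the sample. On a live system the text would come from reading /proc/net/dev.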
tcpdump/libpcap drops



“Packets captured” – Packets processed by tcpdump
“Received by filter” – Passed the filter (LSF, BPF)
“Dropped by kernel” - Not enough space in kernel buffer
FreeBSD (kernel drops):
libpcap gets its drop count from the kernel (BPF)
ps_drop from pcap_stats() is bs_drop from BIOCGSTATS
Linux (kernel drops):
libpcap gets its drop count from PF_PACKET's PACKET_STATISTICS
ps_drop from pcap_stats()
ps_ifdrop – Ubuntu addendum/patch (Linux, Tru64 Unix only) from /proc/net/dev
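
When tcpdump exits it prints these three counters, and they are easy to scrape for a quick drop-rate check. A small sketch with invented helper names and sample numbers; it assumes the received-by-filter count includes the packets that were later dropped, which varies by platform:

```python
import re

_COUNTERS = {
    "captured": r"(\d+) packets? captured",
    "received_by_filter": r"(\d+) packets? received by filter",
    "dropped_by_kernel": r"(\d+) packets? dropped by kernel",
}


def parse_tcpdump_summary(text: str) -> dict:
    """Extract the three exit counters from tcpdump's summary output."""
    stats = {}
    for name, pattern in _COUNTERS.items():
        match = re.search(pattern, text)
        if match:
            stats[name] = int(match.group(1))
    return stats


def kernel_drop_percent(stats: dict) -> float:
    """Kernel drops as a percentage of packets received by the filter."""
    received = stats["received_by_filter"]
    return 100.0 * stats["dropped_by_kernel"] / received if received else 0.0


# Illustrative numbers only, not measured output
SAMPLE = """\
4680000 packets captured
6000000 packets received by filter
1320000 packets dropped by kernel
"""

stats = parse_tcpdump_summary(SAMPLE)
print(stats, f"{kernel_drop_percent(stats):.1f}% dropped")
```

Note that tcpdump writes the summary to standard error, so a wrapper script would need to capture that stream.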
PF_RING for Linux

Creates new socket called PF_RING
Works with existing PF_PACKET apps

Shared memory

Can bypass host stack, sniffing only

PF_RING aware drivers for faster capture: e1000, igb, ixgbe
PF_RING for Linux

Compile PF_RING

Compile PF_RING aware libpcap and tcpdump

Load PF_RING kernel module
modprobe pf_ring transparent_mode=2 enable_debug=0 enable_tx_capture=0 enable_ip_defrag=0 quick_mode=0

Recompile all apps to use the new shared libraries, libpcap and PF_RING
./configure CPPFLAGS="-I/usr/local/include" LDFLAGS="-L/usr/local/lib -lpfring -lpcap" \
&& make && make install
PF_RING DNA

Direct NIC Access, pure speed

Map NIC memory and registers to user land



Packet copy from the NIC to the DMA ring is done by the NIC's NPU
One application at a time can use the DMA ring
Requires DNA driver
PF_RING TNAPI

Threaded NAPI
vPF_RING

Virtual PF_RING

Hypervisor bypass

Zero-Copy
netmap FreeBSD

mmap() shared memory

Uses fewer system calls

Creates new device, /dev/netmap


A 1 GHz CPU can generate the 14.8 Mpps needed to saturate a 10GigE interface
Supports ixgbe, e1000, re
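
The 14.8 Mpps figure falls out of Ethernet framing arithmetic: a minimum-size 64-byte frame occupies 84 bytes on the wire once the 8-byte preamble/SFD and the 12-byte inter-frame gap are counted. A short worked sketch (the function name is illustrative only):

```python
PREAMBLE_SFD_BYTES = 8     # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12  # mandatory idle time between frames


def max_frames_per_second(link_bits_per_s: float, frame_bytes: int = 64) -> float:
    """Upper bound on frames per second for a given Ethernet frame size."""
    wire_bytes = frame_bytes + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES
    return link_bits_per_s / (wire_bytes * 8)


ten_gig_min = max_frames_per_second(10e9)         # 64-byte frames
ten_gig_full = max_frames_per_second(10e9, 1518)  # full-size frames
print(f"{ten_gig_min / 1e6:.2f} Mpps at 64 bytes, {ten_gig_full / 1e6:.2f} Mpps at 1518 bytes")
```

Small packets are the hard case: the same link carries roughly eighteen times as many minimum-size frames per second as full-size ones, and per-packet cost is what capture software pays.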
Others to check out

Ringmap – FreeBSD – code.google.com/p/ringmap/

Zero-copy sockets – FreeBSD: man zero_copy
Requires specific NICs
Recompile kernel with “options ZERO_COPY_SOCKETS”
The zero copy send and zero copy receive code can be individually turned
off via the kern.ipc.zero_copy.send and kern.ipc.zero_copy.receive sysctl
variables respectively.

MMAP() libpcap – Linux – http://public.lanl.gov/cpw/
Interface Configuration
Linux
/etc/network/interfaces
auto eth0
iface eth0 inet manual
    up ifconfig eth0 0.0.0.0 -arp up
    up ip link set eth0 promisc on
    up ip link set eth0 multicast on
    up ip link set eth0 mtu 1514
    down ip link set eth0 promisc off
    down ifconfig eth0 down
auto eth1
iface eth1 inet manual
    up ifconfig eth1 0.0.0.0 -arp up
    up ip link set eth1 promisc on
    up ip link set eth1 multicast on
    up ip link set eth1 mtu 1514
    down ip link set eth1 promisc off
    down ifconfig eth1 down
FreeBSD
/etc/rc.conf
ifconfig_em0="inet 0.0.0.0 -arp promisc multicast mtu 1514 polling"
ifconfig_em1="inet 0.0.0.0 -arp promisc multicast mtu 1514 polling"
Bridging two interfaces (Linux)
brctl addbr br0
brctl addif br0 eth0 eth1
ifconfig br0 up
Useful Applications








snort, ntop, tcpdump, iftop
trafshow, wireshark, tshark, tcpick
tcpflow, etherape, ngrep, tcptrack
suricata, bro-ids, ttt
xplico, ifstat
iptraf, bmon, bwm-ng, slurm
dsniff, p0f, tcptrace, tcpreplay
ipsumdump, speedometer
ntop
ntop -d -L -u ntop --access-log-file=/var/log/ntop/access.log -b -C \
  --output-packet-path=/var/log/ntopsuspicious.log \
  --local-subnets 192.168.1.0/24,192.168.2.0/24,192.168.3.0/24 -o -M \
  -p /etc/ntop/protocol.list -i br0,eth0,eth1,eth2,eth3,eth4,eth5 -o /var/log/ntop
netsniff-ng
Linux, libpcap independent, zero-copy mechanism
Kernel compiled with CONFIG_PACKET_MMAP
Daemonlogger
Packet Logger & Soft Tap
This is a libpcap-based program. It has two runtime modes:
1) It sniffs packets and spools them straight to the disk and can daemonize itself for background packet logging.
2) It sniffs packets and rewrites them to a second interface, essentially acting as a soft tap. It can also do this in daemon mode.
etherape
iftop
IPTraf
Trafshow
tcpick
tcpstat
speedometer
bmon
Contact


Questions, comments, criticism:
jonschipp@gmail.com
More info:
sickbits.networklabs.org/other/packetcapt
dclinux.org