Packet flows are quickly becoming one of the most powerful tools to understand network dynamics and a variety of network-based security incidents. Flows are powerful because they are compact and easy to acquire, but nevertheless track the movements of every single packet that travels over your network. As a result, you can use flows not only to diagnose network inefficiencies and bottlenecks, but also to trace the source of virus infections and even gauge the extent of a hacker's snooping.
A packet flow is really nothing more than a record of how many packets, travelling between two specific computers, crossed a particular point on your network. But this record has an incredible amount of detail.
For example, a single flow record might indicate that between 6:15:03 and 6:15:08 a total of 531 packets moved from port 80 on computer HUT1 to port 5535 on computer PANDA2. Since port 80 is reserved for Web servers, you might reasonably infer from this flow record that the computer HUT1 was running a Web server from which PANDA2 downloaded a webpage. That's probably good news if HUT1 is one of the servers on your department's intranet. It's bad news if HUT1 is the CEO's laptop and PANDA2 is an unknown computer connected to your wireless network.
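A flow record like the one above boils down to a handful of fields. Here's a minimal sketch in Python; the `FlowRecord` class and its field names are illustrative, not any particular NetFlow export format:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class FlowRecord:
    """One unidirectional packet flow, NetFlow-style (illustrative)."""
    src_host: str
    src_port: int
    dst_host: str
    dst_port: int
    start: datetime
    end: datetime
    packets: int

    def looks_like_web_download(self) -> bool:
        # Traffic sourced from port 80 suggests an HTTP server replying.
        return self.src_port == 80

# The flow from the example in the text (date is arbitrary):
flow = FlowRecord("HUT1", 80, "PANDA2", 5535,
                  datetime(2005, 1, 1, 6, 15, 3),
                  datetime(2005, 1, 1, 6, 15, 8),
                  packets=531)
```

Note that the record says nothing about what was inside those 531 packets; everything an analyst concludes comes from the endpoints, ports, timing and counts.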
The most popular format for flow records is the Cisco NetFlow, a format that is generated automatically by many Cisco routers. Here's how it works: The job of every router on the Internet is to look at each packet it receives, decide which of the router's neighbours would be the appropriate next hop and send the packet along. For a home router with just two interfaces, routing is relatively easy. Packets either go to the home LAN or to an upstream Internet provider. But for a medium-size corporate router that has five or 10 different interfaces, routing decisions can become quite complex. Rather than recomputing the next hop for every packet, the router computes the answer once and saves it in a piece of high-speed memory called the route cache.
Each entry in the route cache corresponds to an individual packet flow. Of course, a router's route cache isn't infinitely large; whenever a new flow starts up, the router needs to take the oldest flow out of the cache to make room. A few years ago, these expired cache entries were thrown away. But Cisco and others realized they could be useful, so now most routers make it possible to send the old cache entries to a logging server.
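The caching-and-eviction behaviour described above can be sketched in a few lines. This is a toy model of the idea, not Cisco's actual implementation; the `RouteCache` class and its `export` callback (standing in for the logging server) are hypothetical:

```python
from collections import OrderedDict

class RouteCache:
    """Toy route cache mapping a flow key to a next hop.

    When the cache is full, the oldest entry is evicted and, rather
    than being thrown away, handed to an export callback -- the
    "logging server" in the text.
    """
    def __init__(self, capacity, export):
        self.capacity = capacity
        self.export = export
        self.cache = OrderedDict()  # flow key -> next hop

    def lookup(self, flow_key, compute_next_hop):
        if flow_key in self.cache:
            return self.cache[flow_key]      # hit: no recomputation
        next_hop = compute_next_hop(flow_key)
        if len(self.cache) >= self.capacity:
            old_key, old_hop = self.cache.popitem(last=False)
            self.export(old_key, old_hop)    # expired entry becomes a flow record
        self.cache[flow_key] = next_hop
        return next_hop

# Usage: a tiny two-entry cache that logs evicted flows.
exported = []
rc = RouteCache(2, lambda key, hop: exported.append(key))
rc.lookup(("A", "B"), lambda k: "if0")
rc.lookup(("C", "D"), lambda k: "if1")
rc.lookup(("E", "F"), lambda k: "if0")   # forces eviction of ("A", "B")
```

The expensive next-hop computation runs once per flow rather than once per packet, which is exactly why the cache exists.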
Monitor the Flow

The first significant use of flow data was for billing by ISPs. With appropriate post-processing, it's not hard to determine total data sent within a particular time and to measure peak throughput.
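That post-processing is little more than summing and bucketing. A sketch under simplifying assumptions (flows reduced to a start second and a byte count, fixed one-minute windows):

```python
def billing_summary(flows, window_seconds=60):
    """Total bytes and peak per-window throughput (bytes/sec).

    flows: iterable of (start_second, byte_count) tuples.
    """
    total = sum(nbytes for _, nbytes in flows)
    buckets = {}
    for start, nbytes in flows:
        window = start // window_seconds
        buckets[window] = buckets.get(window, 0) + nbytes
    peak = max(buckets.values()) / window_seconds if buckets else 0.0
    return total, peak

# Two flows in the first minute, one in the second:
total, peak = billing_summary([(0, 6000), (30, 6000), (90, 12000)])
```

Real billing systems have to apportion long-lived flows across windows, but the principle is the same.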
These days, however, flow data increasingly is being used to detect, diagnose and understand security incidents. Take the case of the CEO's laptop. If laptops aren't supposed to run Web servers in your organization, it's a simple matter to detect HTTP flows from unauthorized hosts and generate an alarm.
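Such an alarm is a one-line filter over the flow records. In this sketch, the allow-list contents and host names are hypothetical:

```python
AUTHORIZED_WEB_SERVERS = {"HUT1", "INTRANET1"}  # hypothetical allow-list

def http_server_alerts(flows):
    """Flag hosts answering on port 80 that aren't known Web servers.

    flows: iterable of (src_host, src_port, dst_host, dst_port) tuples.
    """
    return sorted({src for src, sport, _, _ in flows
                   if sport == 80 and src not in AUTHORIZED_WEB_SERVERS})

alerts = http_server_alerts([
    ("HUT1", 80, "PANDA2", 5535),          # legitimate intranet server
    ("CEO-LAPTOP", 80, "UNKNOWN1", 40000), # laptop serving HTTP: suspicious
])
```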
Flow analysis can even detect worms. Normally you would expect to see that laptop making connections to the organization's servers and computers beyond the organization's firewall. But if you see the laptop systematically opening up connections to other machines throughout your network, this might indicate that it has been infected. If one of the computers the laptop touches starts opening up connections all over the organization, then almost certainly you have a worm. You can review the machines that connected to the CEO's laptop, before it started acting funny, to determine where the infection started.
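The worm signal described above is fan-out: one host opening connections to far more distinct peers than normal. A minimal detector, with the threshold chosen arbitrarily for illustration:

```python
from collections import defaultdict

def scan_suspects(flows, fanout_threshold=20):
    """Hosts that contacted an unusually large number of distinct peers.

    flows: iterable of (src_host, dst_host) pairs.
    """
    peers = defaultdict(set)
    for src, dst in flows:
        peers[src].add(dst)
    return {host for host, contacted in peers.items()
            if len(contacted) >= fanout_threshold}

# An infected laptop touching 25 machines vs. a normal desktop touching 3:
flows = [("laptop", f"host{i}") for i in range(25)]
flows += [("desk1", "mail"), ("desk1", "file"), ("desk1", "web")]
suspects = scan_suspects(flows)
```

Replaying historical flow records through the same logic is how you'd trace the infection back to the machines that touched the laptop first.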
Working with flow data at the record-by-record level is difficult and time-consuming. Fortunately there are a growing number of tools that will monitor your flows, produce reports and also generate alarms on suspicious behaviours.
One of the most sophisticated enterprise flow management tools is the Mazu Networks Profiler. Mazu can collect flow data using either the company's proprietary sensors or directly from Cisco or Juniper routers. The data is sent over your network to a Profiler appliance, where it is stored on a multiterabyte storage array. You interact with the system through a slick Web-based application using the appliance's built-in Web server.
After you've installed the Mazu hardware, you need to teach the system about the topology of your internal network. Ideally, you do this by importing your carefully maintained list of all the desktops, servers, laptops and other network hosts in your organization. Of course, most organizations don't maintain such lists, so Profiler has an "autodiscovery" mode in which it figures this information out for you by surveying your traffic. In this mode, the system will try to identify clients and servers within your network, group together hosts with similar behaviours and then present that data graphically to the operator.
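One crude way to group hosts by behaviour, as autodiscovery does, is to cluster them by the set of server ports they answer on. This is a stand-in for Profiler's (proprietary, unspecified) grouping logic; the host names and ports are hypothetical:

```python
from collections import defaultdict

def group_by_role(observed):
    """Group hosts that serve the same set of ports.

    observed: iterable of (server_host, server_port) pairs seen in traffic.
    Returns {sorted port tuple: sorted list of hosts}.
    """
    ports = defaultdict(set)
    for host, port in observed:
        ports[host].add(port)
    roles = defaultdict(list)
    for host, port_set in ports.items():
        roles[frozenset(port_set)].append(host)
    return {tuple(sorted(k)): sorted(v) for k, v in roles.items()}

groups = group_by_role([
    ("WEB1", 80), ("WEB1", 443),
    ("WEB2", 80), ("WEB2", 443),
    ("MAIL1", 25),
])
```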
Mazu Vice President of Marketing Tom Corn describes the company's experience with "autodiscovery." On one of Mazu's first deployments, Profiler discovered a group of laptops that were establishing VPN connections with an outside network. It turns out that Mazu's customer had hired a system integration firm to do some custom database work. The consultants had been given IP addresses inside the customer's network and had proceeded to open VPN connections back to their home network so that they could check their mail, access files and generally get their work done.
So far, so good. But upon further investigation, it turned out that the consultants were doing a lot more than e-mail and file sharing: They were exploring their client's internal network, systematically opening connections to more than 30 different locations. Some of these connections were legitimate, but others were probably inappropriate poking around. Once the issue was identified, Profiler was handed a policy that described where in the client's network the consultants were allowed access and where they were not. The system was programmed to generate an alert if this policy was violated so that the client could handle the infringement in an appropriate manner.
One challenge with enforcing policy based on flow information is that many computers have dynamically assigned IP addresses. Mazu handles this by analyzing the log files generated by Dynamic Host Configuration Protocol (DHCP) servers. This allows policies and reports to be based on actual names rather than on dynamically changing IP addresses.
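The DHCP correlation amounts to a time-indexed lookup: which host held this IP address when the flow occurred? A sketch assuming lease events reduced to (timestamp, ip, hostname) tuples; the function names are illustrative:

```python
from bisect import bisect_right

def build_lease_index(leases):
    """Index DHCP lease events by IP, sorted by assignment time.

    leases: iterable of (timestamp, ip, hostname) tuples, any order.
    """
    index = {}
    for ts, ip, host in sorted(leases):
        index.setdefault(ip, []).append((ts, host))
    return index

def who_had_ip(index, ip, at_time):
    """Return the hostname holding `ip` at `at_time`, or None."""
    history = index.get(ip, [])
    times = [ts for ts, _ in history]
    pos = bisect_right(times, at_time) - 1
    return history[pos][1] if pos >= 0 else None

idx = build_lease_index([
    (100, "10.0.0.5", "ceo-laptop"),
    (200, "10.0.0.5", "guest-pc"),   # same IP reassigned later
])
```

With this mapping in place, a flow timestamped between the two leases is attributed to ceo-laptop rather than to whoever holds 10.0.0.5 today.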
Analyze the Flow

Flow analysis can detect many kinds of security incidents that might make it past a signature-based antivirus or intrusion detection system (IDS). That's because signature-based systems typically look inside the data of each packet for specific kinds of attacks - they might generate an alert when a packet containing part of the Code Red virus passes over the network, for example. Signature identification fails to detect new attacks, of course, but it also fails to find attacks protected by encryption or those that are spread over a long period of time - the so-called slow-and-stealthy attacks.
Because flow analysis looks at behaviour, it can raise an alert on worms and other kinds of malicious programs that haven't been seen before but that have characteristic behavior. Mazu delivers its system with heuristics to detect a number of potentially hostile situations, including slow-stealthy scans, worms and distributed denial-of-service attacks. The system will also report new services or hosts, services or hosts that have gone silent, and a wide range of policy violations.
Although systems like Profiler are usually deployed to increase an enterprise's security, frequently they become operational tools to improve performance. For example, one of Mazu's early customers called the company up shortly after the technology was deployed to complain that Profiler wasn't working properly. "It showed that a significant portion of their traffic was IPv6," and the customer was sure it wasn't running Internet Protocol version 6, recalls Corn.
Although most computers on the Internet today use IPv4, many organizations plan to switch to IPv6. As a result, practically every modern operating system has support for both IPv4 and IPv6. Most organizations, though, keep IPv6 turned off.
More investigation revealed that the network was in fact running IPv6 on all of its desktops, but on none of its servers. As a result, every connection to every server was first tried with IPv6, and then when that failed, it was tried again with IPv4. Not only did this generate a lot of extraneous traffic, it also slowed down the perceived network performance. But the network managers had no idea that this was happening. Once the problem was identified, it was a simple matter to reconfigure the desktops. Things ran considerably faster after that.
If you are interested in flow analysis but don't want to shell out the money for Mazu, check out ntop (www.ntop.org), an open-source traffic-monitoring tool that can collect and analyze NetFlow data.
And remember: While flow analysis is useful, it has some serious shortcomings. That's because flow-based systems look only at the packet headers. They don't look at the actual data that's moving over your network. They can't tell the difference between an e-mail message that has attached a photo of someone's high school sweetheart and one that has a copy of your organization's confidential product development plans.
Simson Garfinkel, CISSP, is a technology writer based in the Boston area.