Notes about “iptables” use

Updated: 2006-03-20
Created: 2002

Licensing and disclaimer of warranty

This document is an incomplete draft.

Motivation

The iptables netfilter implementation in Linux 2.4 is quite cute and elegant, but its documentation is sometimes incomplete and baffling. This is in part because in the Linux community some people seem to think that writing bafflingly clever code gives satisfaction, and it is highly amusing to see other people try to figure it out.

The foremost example of this attitude seems to me the exceedingly clever and undocumented ip-routing system; the TC subsystem in particular is quite subtle, and some vital details of it are only available as archived News messages of its author.

There are some bits of documentation for ip-routing (for example 1, 2, 3, 4, 5 and 6), but they are far from complete or equal to the task of illustrating an entirely novel and very subtle routing architecture.

There is some official and unofficial documentation for iptables, which I have consulted in order to prepare this document; for example Rusty's packet filtering HOWTO which is a somewhat shallow introduction, another introduction with an example here, another introduction that is also quite shallow, a very much deeper guide by James Stephens, and a tutorial on network gateway configuration at YoLinux, as well as of course the iptables(8) manual page, which documents the iptables command itself, and only incidentally the iptables subsystem.

This document is designed to provide both insight and information that is missing in other documents.

First there will be a description of what iptables is like structurally, and then a description of how it can be used for implementing session level policies and how they relate to subject-object level ones.

The higher level discussion will contain a number of illustrative examples.

If you are impatient for a small scale example, you can look below, and for a large example look at this firewall configuration script, which is the one that I use for my home PCs.

Background information

General networking

Safety, not security
Security is a difficult process that depends on having a threat model and involves subjects and objects and access control between them.
Firewalling is instead about generic misbehaviour and involves network connections and computers.
Firewalling is not even remotely about security; it is about safety; merely about screening computing resources from simple problems caused by mistakes or malice.
Firewalling does not add to security; it merely makes existing insecurities harder to notice in some very limited ways.
In particular, firewalling in general offers some safety against outsider malice or mistakes, but the vast majority of security problems are caused by insiders, and firewalling does not really help that much even as to safety with insiders.
Interfaces, not computers
In the IP protocol architecture there is no notion of computer; all communication happens strictly between interfaces, which may have one or more addresses.
The mapping between interfaces and addresses is also potentially many-to-many, even if one-to-one and one-to-many are far more common.
Domain names label interfaces, not computers.
Applications bind to network addresses, which are associated with interfaces not computers.
Whether interfaces belong to a specific computer is an accident as far as the IP protocol architecture is concerned.
Sessions
While protocols can be connectionless or not, the pattern of communication between processes usually involves sessions, in which there are recognisable session setup, data transfer and session close (logical) phases.
Many of these sessions are asymmetrical, in which the processes involved follow a request/response communication pattern, where one process is a client and the other a server.
It is important to realise that packet filtering deals with protocol packets, but is really meant to apply to sessions and is often intended to determine whether and how client processes can use server processes or server processes can support client processes.
A lot of the issues in configuring iptables are a consequence of the aim to define fairly high level policies related to client/server interactions while using a mechanism that deals with the lowest levels of such interactions, individual packets.

iptables configuration as a result requires first defining policies at the level of sessions, and then figuring out which rules enforce those policies on packets traveling between interfaces; this requires the ability to conceptualize how agents like a user or a process and actions like web browsing map onto lower level entities like interfaces and packet attributes.
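As a small illustration of going from a session level policy to packet rules, the sketch below expresses "local users may browse remote web sites" as three rules. It assumes that the connection tracking state match is available and that web servers listen on TCP port 80; IPT is a helper variable introduced only for this sketch, so that the commands are merely printed unless it is set to iptables.

```shell
# Dry run by default: the commands are printed, not executed.
# IPT is a helper variable assumed for this sketch; run with
# IPT=iptables (as root) to really install the rules.
IPT="${IPT:-echo iptables}"

web_browsing_policy() {
  # Session setup: a local client may open a new TCP session to port 80.
  $IPT -A OUTPUT -p tcp --dport 80 -m state --state NEW -j ACCEPT
  # Data transfer and session close: packets of sessions that were
  # already accepted may flow in both directions.
  $IPT -A OUTPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
  $IPT -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
}

web_browsing_policy
```

Reading the three rules back gives the policy again: only local clients can start sessions, and only towards web servers.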

It is also very useful to be able to do the reverse, that is given an iptables configuration to infer what kind of policy it will end up implementing.

This is extremely important, because it is really hard and subtle to devise a good security policy, and even harder to map from a high level security policy expressed in terms of subjects and objects to a network one expressed in terms of processes, computers, clients and servers, and finally to map this onto an iptables configuration expressed in terms of packets and interfaces.

It is difficult to get any of these things right, and it may be more secure to define a security policy only in terms of general principles, devise some simple, easy to check and maintain iptables configuration and check that it implements a security policy compatible with the general principles.

Linux and iptables specificities

The Linux network subsystem is by now a very sophisticated and complex work, that has been grown more than designed; but when the growth becomes too unruly, someone provides a redesign.

Some principles are however quite distinctive, and one is very important and somewhat different from those of other network subsystems, and needs some explaining.

While the IP protocol architecture is about interfaces and traffic among them, Linux and its network subsystem are designed to run a computer rather than merely deal with interfaces, so they do define a notion of computer, and also of the set of interfaces that are local to the computer.

Now these interfaces are sort of connected with each other, by being part of the same computer; the kernel can move data between them, just like a serial line or an Ethernet cable can.

The Linux network subsystem provides some automagic functionality that considers all local interfaces, to some large extent, as if they were one and the same interface. In particular, by default, packets arriving on a local interface but with the address of another are not forwarded to that interface; they are immediately deemed arrived and handed to the program bound at the other local interface.

The Linux kernel carries this automagic to an extreme degree; for example any local interface will by default respond to ARP queries for other local interfaces.

Note that this is all by default, and things can be changed by tweaking various settings under /proc/sys/net/.
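A minimal sketch for looking at some of these settings follows. It assumes that the relevant ones are arp_filter, arp_ignore and rp_filter under /proc/sys/net/ipv4/conf/all/, which depends on the kernel version; show_setting is a helper name made up for this example, and it only reads values, it changes nothing.

```shell
# Print the value of a kernel setting, or a note if this kernel lacks it.
# show_setting is a hypothetical helper, not a standard command.
show_setting() {
  if [ -r "$1" ]; then
    printf '%s = %s\n' "$1" "$(cat "$1")"
  else
    printf '%s = (not available)\n' "$1"
  fi
}

# Settings that influence how much the local interfaces are treated
# as one (assumed names, present only on some kernel versions).
for S in arp_filter arp_ignore rp_filter
do
  show_setting "/proc/sys/net/ipv4/conf/all/${S}"
done
```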

In other words the Linux network subsystem does not route among its local interfaces, and this is meant to make the task of configuring the system easier, as well as to make the system work even if there is some misconfiguration; various bits of automagic happen to paper over minor ambiguities.

This, however, especially if one is considering network packet filtering, must be kept in mind, because it means that packets may arrive on unexpected interfaces, either intentionally, accidentally or maliciously.

This largely requires setting up policies that don't assume that packets with a given destination address arrived via a particular interface.

General structure of iptables

The iptables subsystem involves the following entities:

packets
Each packet has a set of attributes.
tables
Each table defines a different type of processing; the list of implemented tables is a property of the kernel.
Each table contains a set of predefined or user defined chains.
Operations on tables:
cat /proc/net/ip_table_names
List all the system defined tables.
iptables -t table -F
Delete all the rules in all the chains of the table; the user defined chains themselves remain, and are deleted with -X.
chains
Each chain belongs to a table, and defines what kind of processing a packet in a certain situation might be subject to.
Each chain contains a possibly empty set of user defined rules.
These are the commands that manipulate chains:
iptables -t table -N chain
Create a new chain.
iptables -t table -X chain
Delete a chain.
iptables -t table -F chain
Delete all rules in a chain.
iptables -t table -P chain TARGET
Define the default target (the policy) for packets that don't match any rule in the chain. This applies only to predefined chains, and the target must be a system target such as ACCEPT or DROP.
rules
Each rule in a chain defines what to do to a packet being examined if it satisfies a particular condition.

The list of rules in a chain is given by the command iptables -t table -L chain -n.

Each rule contains a set of conditions that determine whether the rule applies to the packet or not, and one target which indicates which action is carried out if all the conditions match.
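The life cycle of a user defined chain can be sketched as follows. The chain name EXAMPLE and the ssh rule are made up for illustration, and IPT is a helper variable assumed here so that the commands are only printed unless it is set to iptables.

```shell
# Dry run by default: the commands are printed, not executed.
# IPT is a helper variable assumed for this sketch; run with
# IPT=iptables (as root) to really execute the commands.
IPT="${IPT:-echo iptables}"

chain_lifecycle() {
  $IPT -t filter -N EXAMPLE            # create an empty user chain
  # One rule: two conditions (protocol, port) and one target.
  $IPT -t filter -A EXAMPLE -p tcp --dport 22 -j ACCEPT
  $IPT -t filter -A INPUT -j EXAMPLE   # use the chain from INPUT
  $IPT -t filter -L EXAMPLE -n         # list its rules
  $IPT -t filter -D INPUT -j EXAMPLE   # remove the reference to it
  $IPT -t filter -F EXAMPLE            # delete all its rules
  $IPT -t filter -X EXAMPLE            # delete the now empty chain
}

chain_lifecycle
```

The order of the last three commands matters: a chain can only be deleted when it is empty and no rule refers to it.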

All tables may be applied to a packet, typically at different stages in the processing of the packet; usually in each table only one chain is applied to a packet, depending on the packet's attributes.

Tables

The default set of tables is:

filter
This table contains chains to process a packet to determine whether to allow it to exist or to delete it.
nat
This table contains chains within which targets are available that change the address fields of a packet and record the changes so that they are reversible on a session basis.
mangle
This table contains chains within which targets are available to change some non address fields of a packet.

Of these the filter table is the most commonly used, as it is that which is used to implement security policies. For this reason it is also the default table for the iptables command, and most existing documentation concentrates on setting it up.

The nat table is commonly used in the case where a set of local hosts should be either hidden (masquerading) or renumbered with respect to some remote hosts.

The mangle table is more rarely used to implement traffic control policies, for example altering the quality of service options of packets.
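To make the division of labour concrete, here is one typical rule for each table. This is only a sketch: the ports, the interface name held in EXT_IF and the helper variable IPT are assumptions made for the example, and by default the commands are only printed.

```shell
# Dry run by default: the commands are printed, not executed.
# IPT and EXT_IF are helper variables assumed for this sketch.
IPT="${IPT:-echo iptables}"
EXT_IF="${EXT_IF:-ppp0}"

table_examples() {
  # filter: decide whether a packet may exist (here: no incoming telnet).
  $IPT -t filter -A INPUT -p tcp --dport 23 -j DROP
  # nat: hide local hosts behind the address of the outgoing interface.
  $IPT -t nat -A POSTROUTING -o "$EXT_IF" -j MASQUERADE
  # mangle: change a non address field, here the type of service of ssh.
  $IPT -t mangle -A OUTPUT -p tcp --dport 22 -j TOS --set-tos Minimize-Delay
}

table_examples
```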

The filter table

This table contains three predefined chains, and exactly one of them is applied to a packet depending on its transit status:

INPUT
This chain is applied to all packets that have arrived on a local interface and are addressed to that interface or another local interface.

Such packets are to be passed directly to a local application, and have no destination interface.

These packets are usually either connections to a local server, or replies sent back to local clients.

Determining which of these packets is deleted or not determines which local servers are visible to remote clients, or which remote servers can reply to local clients.
FORWARD
This chain is applied to packets that have arrived on a local interface and whose destination address is not associated with any of the local interfaces.

Such packets have as attributes both an input and an output interface.
OUTPUT
This chain is applied to all packets that have been created directly by a local application, whether they are addressed to a local interface or not.

Such packets have no source interface.

It is very important to note that the chains in the filter table are applied to packets after routing, as the output interface, if any, is known.
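The difference between the three chains shows in which interface conditions make sense in each of them. In the sketch below the interface names in LAN_IF and WAN_IF, the ports and the helper variable IPT are assumptions for illustration; by default the commands are only printed.

```shell
# Dry run by default: the commands are printed, not executed.
# IPT, LAN_IF and WAN_IF are helper variables assumed for this sketch.
IPT="${IPT:-echo iptables}"
LAN_IF="${LAN_IF:-eth0}"
WAN_IF="${WAN_IF:-ppp0}"

filter_chain_examples() {
  # INPUT: the packet is for a local application, only -i is meaningful.
  $IPT -A INPUT -i "$WAN_IF" -p tcp --dport 22 -j ACCEPT
  # FORWARD: the packet is in transit, both -i and -o are known.
  $IPT -A FORWARD -i "$LAN_IF" -o "$WAN_IF" -j ACCEPT
  # OUTPUT: the packet was created locally, only -o is meaningful.
  $IPT -A OUTPUT -o "$WAN_IF" -p udp --dport 53 -j ACCEPT
}

filter_chain_examples
```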

The targets that can be applied to packets that match the condition of a rule in one of these chains are:

-j chain
Continue processing the packet with the rules in the indicated user defined chain. This is analogous to a procedure call.
-j RETURN
Skip the rest of the current chain, and return to the previous one.
-j ACCEPT
The packet is accepted for further processing by the kernel network subsystem.
-j LOG
Write a list of packet attributes to the system log, and continue with the next rule.
-j REJECT
Send an ICMP error packet related to the packet back to its sender, and delete the packet.
-j QUEUE
The packet is passed to a user process, which then decides what happens to it.
-j DROP
The packet is deleted.

Some of the targets above are built-in to the iptables subsystem, and some are implemented by extension modules; however this is usually unimportant.
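How several of these targets combine can be seen in a small user defined chain that is called like a procedure. The chain name LOGREJECT, the log prefix and the telnet port are invented for this sketch, and IPT is a helper variable assumed here so that the commands are only printed by default.

```shell
# Dry run by default: the commands are printed, not executed.
# IPT is a helper variable assumed for this sketch.
IPT="${IPT:-echo iptables}"

log_and_reject_chain() {
  $IPT -N LOGREJECT
  # RETURN: loopback traffic goes back to the calling chain untouched.
  $IPT -A LOGREJECT -i lo -j RETURN
  # LOG does not end processing: the next rule is still examined.
  $IPT -A LOGREJECT -j LOG --log-prefix "rejected:"
  # REJECT ends processing: an ICMP error is sent, the packet is deleted.
  $IPT -A LOGREJECT -j REJECT --reject-with icmp-port-unreachable
  # The "procedure call": incoming telnet is handed to the chain.
  $IPT -A INPUT -p tcp --dport 23 -j LOGREJECT
}

log_and_reject_chain
```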

The goals of a firewall configuration

Ideally they are like these:

Accepting suitable packets

Given that sessions are not immediately evident in the flow of packets, reliable guesses must be used.

Session inference and validity is usually based on a combination of the flags in the packet header(s), the interface and addresses of the packet, including the ports for higher level protocols.

Logging of interesting packets

Logging must be done for only a few packets that are deemed interesting, to avoid generating lots of noise.

For the same reason, and to prevent log-based denial of service attacks, all logging must be subject to frequency limits.

Logging can be done before accepting or dropping a packet, matching it twice, or in the same chain that accepts or drops it.

Despite the latter looking more appealing, my current preference is to use logging by double matching for session opening packets, which are probably rare in any case.
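Logging by double matching with a frequency limit might look like the sketch below. The rate of 6 per minute, the burst of 10, the log prefix and the ssh port are arbitrary choices for illustration; IPT is a helper variable assumed here so that the commands are only printed by default.

```shell
# Dry run by default: the commands are printed, not executed.
# IPT is a helper variable assumed for this sketch.
IPT="${IPT:-echo iptables}"

log_then_accept_ssh() {
  # First match: log session opening packets, but never more often
  # than the limit allows, so the log cannot be flooded.
  $IPT -A INPUT -p tcp --dport 22 -m state --state NEW \
    -m limit --limit 6/minute --limit-burst 10 \
    -j LOG --log-prefix "ssh-open:"
  # Second match: the same conditions again, this time to accept.
  $IPT -A INPUT -p tcp --dport 22 -m state --state NEW -j ACCEPT
}

log_then_accept_ssh
```

Since LOG does not end the processing of a packet, packets over the limit simply skip the first rule and are still accepted by the second.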

Dropping invalid packets

A packet is not suitable for the current node if it tries to open a non acceptable session or does not otherwise belong into an existing session.

Note that this principle applies to packets whether entering, leaving or passing through the current node.

Logical structure of a firewall configuration

A firewall configuration is essentially a predicate, a logical formula: if the packet's attributes satisfy the predicate, it is passed on, otherwise it gets deleted.

Unfortunately this logical formula can have a dozen free variables, and hundreds of terms.

It is very hard to build logic formulas that complex that work correctly, and they can also be very slow to evaluate.

The way to reduce the impact of the complexity is to split the big formula into a hierarchy of smaller formulas, and to short-circuit evaluation. In iptables this is done by splitting the firewall configuration into a number of chains.

Each chain contains a number of terms which are evaluated sequentially, and that can be simple terms or invocations of other chains. A good organization of these chains is essential both for maintainability (and thus correctness) and for the speed with which packets are checked.

There are three principles I like for organizing a set of chains:

The order of checks should reflect the linear and hierarchical order of fields in a packet.
Packets have a header(s) and a body; the body may contain another header(s) and a body for a higher level protocol, and so on.
In each chain put first the rules most likely to match the majority of packets or the packets that are most useful.
This depends a bit on the expected patterns of traffic, but some general rules are possible, and then a few common cases.
Design chains so that each checks only one attribute of a packet.
This makes checking faster and allows sharing of chains, because the checking of a second aspect can be done in the rule that jumps to that chain.
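The third principle can be sketched like this: one chain checks nothing but the destination port, and is shared by callers that check the interface in the rule that jumps to it. The chain name SVC_PORTS, the ports and the interface names are made up for the example, and IPT is a helper variable assumed here so that the commands are only printed by default.

```shell
# Dry run by default: the commands are printed, not executed.
# IPT is a helper variable assumed for this sketch.
IPT="${IPT:-echo iptables}"

shared_port_chain() {
  # This chain checks a single attribute: the TCP destination port.
  $IPT -N SVC_PORTS
  $IPT -A SVC_PORTS -p tcp --dport 80 -j ACCEPT
  $IPT -A SVC_PORTS -p tcp --dport 22 -j ACCEPT
  # The second attribute, the input interface, is checked by the
  # jumping rules, so both interfaces can share the same chain.
  $IPT -A INPUT -i eth0 -p tcp -j SVC_PORTS
  $IPT -A INPUT -i ppp0 -p tcp -j SVC_PORTS
}

shared_port_chain
```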

Chains in the order of attributes in a packet

Packets are hierarchies of attributes; first the envelope attributes, then the attributes of the header, then the attributes of any contained header, then the contents of packets carried in the body.

Envelope attributes
The envelope attributes are about the packet and not in the packet, and include:
  • The input and output interface(s) associated with the packet (-i and -o).
  • The MAC source address associated with the packet (-m mac --mac-source).
  • The mark associated with a packet if any (-m mark --mark).
  • The state of the packet; this requires connection tracking (-m state --state).
Attributes of the header
The attributes of the header contain mostly information about IP routing, and they include:
  • The protocol type, such as IPv4 or IPv6. This is usually implicit in iptables.
  • The source address (-s).
  • The destination address (-d).
  • The size of the packet (-m length --length).
  • The type-of-service (-m tos --tos).
  • The time-to-live (-m ttl --ttl).
  • The sub-protocol type (-p).
Contained header attributes
The attributes of contained headers depend on the sub protocol, which usually is TCP, UDP or ICMP or ICMPv6, and they include:
TCP
  • The source port (-p tcp --sport).
  • The destination port (-p tcp --dport).
  • The flags (-p tcp --tcp-flags).
  • The options (-p tcp --tcp-option).
  • The MSS (-p tcp -m tcpmss --mss).
UDP
  • The source port (-p udp --sport).
  • The destination port (-p udp --dport).
ICMP
  • The type and subtype (-p icmp --icmp-type).
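A single rule can list its conditions in the same order: envelope first, then the IP header, then the contained header. In the sketch below the interface, the source network 192.0.2.0/24 (a documentation range), the length bounds and the port are arbitrary values chosen for illustration; IPT is a helper variable assumed here so that the command is only printed by default.

```shell
# Dry run by default: the command is printed, not executed.
# IPT is a helper variable assumed for this sketch.
IPT="${IPT:-echo iptables}"

ordered_rule() {
  # Line 2: envelope attributes (input interface, session state).
  # Line 3: IP header attributes (source, length, sub-protocol).
  # Line 4: TCP header attributes (destination port, flags).
  $IPT -A INPUT \
    -i ppp0 -m state --state NEW \
    -s 192.0.2.0/24 -m length --length 40:1500 -p tcp \
    --dport 22 --tcp-flags SYN,RST,ACK SYN \
    -j ACCEPT
}

ordered_rule
```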

Chains checking a single attribute of a packet

TBD

Rules in most likely to match a useful packet first order

TBD

How to debug a configuration

It is not easy to get Netfilter actions logged in a nice sort of debugging way by default, so one must use static analysis, indirect means or explicit instrumentation:

Static inspection
After running a script that creates a Netfilter configuration it is very important to inspect the resulting chains and rules. The best way to do this is with the command line:
iptables -L -v -n | less -SCi
and remembering to scroll right to see the important details produced by -v.
Counters
Each rule and chain is associated with counters that record the number of packets that have matched it and their cumulative size. In figuring out which rules a packet has matched, in order to reconstruct the flow of the packet through the chains and rules, it can be useful to zero all counters, and then check which counters have become non zero.
It may be useful to add some rules that do nothing but match some packets to see if their counters register the passage of those packets.
Logging
It is possible to add explicit logging rules to chains. It is usually very important to do so for debugging.
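The counter technique can be sketched as follows, using a rule that has conditions but no target, which therefore does nothing except count. The choice of ICMP as the probe traffic is just an example, and IPT is a helper variable assumed here so that the commands are only printed by default.

```shell
# Dry run by default: the commands are printed, not executed.
# IPT is a helper variable assumed for this sketch.
IPT="${IPT:-echo iptables}"

count_icmp_probe() {
  $IPT -I INPUT 1 -p icmp    # probe rule with no target: it only counts
  $IPT -Z INPUT              # zero the counters of the chain
  # ... now generate the traffic of interest, for example with ping ...
  $IPT -L INPUT -v -n -x --line-numbers   # exact counters, numbered rules
  $IPT -D INPUT 1            # remove the probe rule again
}

count_icmp_probe
```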

Examples

Leaf node on the internet with some incoming connections

This example is about a simple leaf machine that accesses the internet via a single PPP connection; all output sessions are allowed, but only a few types of input sessions.

Preliminaries

Argument processing
PPPIP="$1"
PPPDV="$2"

: ${PPPDV:='ppp0'}
Set defaults
for C in INPUT FORWARD OUTPUT
  do
    iptables -F "${C}"
    iptables -P "${C}" DROP
  done
Create empty chains
Flush, delete and recreate each of the chains we need.
We need input and output chains for envelope rules, IP header rules; for the IP header we also have two subchains to check source and destination addresses.
We have chains also for TCP, UDP or ICMP header rules, and a sub chain for each of those for port or packet type.
for D in I O
do
  for C in E IP TCP TCP_P UDP UDP_P ICMP ICMP_T
  do
    iptables -F "${C}_${D}"
    iptables -X "${C}_${D}"
    iptables -N "${C}_${D}"
  done
done

for D in S D
do
  for C in IP_A
  do
    iptables -F "${C}_${D}"
    iptables -X "${C}_${D}"
    iptables -N "${C}_${D}"
  done
done
The names of the chains follow some simple patterns:
  • Chain E for “envelope” field checking.
  • Chains related to checking fields in the headers of the various protocols like IP and those embedded inside IP, like TCP, UDP and ICMP.
  • Chains for checking specific services with suffix _P or ICMP types with suffix _T.
  • All these suffixed by _I or _O for input or output packets.
  • Chains for checking IP address validity with prefix IP_A and then suffix _S or _D depending on whether the source or destination address is checked.
Define simple address checks
These subchains for IP header checking verify that the source or destination address of a packet is not an invalid address (using a very minimal check: loopback and private addresses are dropped).
iptables -A IP_A_S -j DROP -s 127.0.0.0/8
iptables -A IP_A_S -j DROP -s 10.0.0.0/8
iptables -A IP_A_S -j DROP -s 172.16.0.0/12
iptables -A IP_A_S -j DROP -s 192.168.0.0/16

iptables -A IP_A_D -j DROP -d 127.0.0.0/8
iptables -A IP_A_D -j DROP -d 10.0.0.0/8
iptables -A IP_A_D -j DROP -d 172.16.0.0/12
iptables -A IP_A_D -j DROP -d 192.168.0.0/16

Input processing

Check the envelope and the IP header
We accept all traffic on lo unconditionally. This is the first rule because we expect a lot of traffic on lo, for a desktop. If this were a loaded server we would probably put this rule below the next one.
Then we always accept traffic that belongs to a session that has already been checked, or is related to it. This means that later on we will only be checking session initiation packets, that are supposedly fairly rare.
Finally we check that the packet is coming in via the PPP device, else we drop it.
iptables -A INPUT -j E_I

  iptables -A E_I -j ACCEPT -i lo
  iptables -A E_I -j ACCEPT -m state --state ESTABLISHED,RELATED
  iptables -A E_I -j RETURN -i "$PPPDV"
  iptables -A E_I -j DROP
Then we check the source address, that should be valid, and the destination address must be the PPP address.
iptables -A INPUT -j IP_I

  iptables -A IP_I -j DROP \! -d "$PPPIP"
  iptables -A IP_I -j RETURN -s "$PPPIP"
  iptables -A IP_I -j IP_A_S
Check the TCP, UDP or ICMP header attributes
For input we accept only valid connection initiation packets (remember that packets belonging to existing connections are accepted very early on) and only on a few ports for TCP, none on UDP, and some essential ICMP packet types incoming.
iptables -A INPUT -j TCP_I -p tcp

  iptables -A TCP_I -j TCP_P_I -p tcp -m state --state NEW

    iptables -A TCP_P_I -j ACCEPT -p tcp -m multiport --dports ftp,ssh,http,ident
iptables -A INPUT -j UDP_I -p udp

  iptables -A UDP_I -j UDP_P_I -p udp -m state --state NEW

    iptables -A UDP_P_I -j DROP
iptables -A INPUT -j ICMP_I -p icmp

  iptables -A ICMP_I -j ICMP_T_I

    for T in 'destination-unreachable' 'source-quench' \
      'echo-reply' 'time-exceeded' 'parameter-problem'
    do
      iptables -A ICMP_T_I -j ACCEPT -p icmp --icmp-type "${T}"
    done

Output processing

Output is basically the same as input, but allowing all session initiations, so there won't be repeated comments.

Check the envelope and the IP header
iptables -A OUTPUT -j E_O

  iptables -A E_O -j ACCEPT -o lo
  iptables -A E_O -j ACCEPT -m state --state ESTABLISHED,RELATED
  iptables -A E_O -j RETURN -o "$PPPDV"
  iptables -A E_O -j DROP
iptables -A OUTPUT -j IP_O

  iptables -A IP_O -j DROP \! -s "$PPPIP"
  iptables -A IP_O -j RETURN -d "$PPPIP"
  iptables -A IP_O -j IP_A_D
Check the TCP, UDP or ICMP header attributes
iptables -A OUTPUT -j TCP_O -p tcp

  iptables -A TCP_O -j TCP_P_O -p tcp -m state --state NEW

    iptables -A TCP_P_O -j ACCEPT
iptables -A OUTPUT -j UDP_O -p udp

  iptables -A UDP_O -j UDP_P_O -p udp -m state --state NEW

    iptables -A UDP_P_O -j ACCEPT
iptables -A OUTPUT -j ICMP_O -p icmp

  iptables -A ICMP_O -j ICMP_T_O

    iptables -A ICMP_T_O -j ACCEPT