Updated: 2006-03-20
Created: 2002
Licensing and disclaimer of warranty
This document is an incomplete draft.
The iptables netfilter implementation in Linux 2.4 is quite cute and elegant, but its documentation is sometimes incomplete and baffling. This is in part because in the Linux community some people seem to think that writing bafflingly clever code gives satisfaction, and that it is highly amusing to see other people try to figure it out.
The foremost example of this attitude seems to me the exceedingly clever and undocumented ip-routing system; the TC subsystem in particular is quite subtle, and some vital details of it are only available as archived News messages from its author.
There are some bits of documentation for ip-routing (for example 1, 2, 3, 4, 5 and 6), but they are far from complete or equal to the task of illustrating an entirely novel and very subtle routing architecture.
There is some official and unofficial documentation for iptables, which I have consulted in order to prepare this document; for example Rusty's packet filtering HOWTO, which is a somewhat shallow introduction, another introduction with an example here, another introduction that is also quite shallow, a very much deeper guide by James Stephens, and a tutorial on network gateway configuration at YoLinux, as well as of course the iptables(8) manual page, which documents the iptables command itself, and only accidentally the iptables subsystem.
This document is designed to provide both insight and information that is missing in other documents.
First there will be a description of what iptables is like structurally, and then a description of how it can be used for implementing session level policies and how they relate to subject-object level ones.
The higher level discussion will contain a number of illustrative examples.
If you are impatient for a small scale example, you can look below, and for a large example look at this firewall configuration script, which is the one that I use for my home PCs.
iptables configuration as a result requires first defining policies at the level of sessions, and then figuring out which rules enforce those policies on packets traveling between interfaces; this requires the ability to conceptualize how agents like a user or a process and actions like web browsing map onto lower level entities like interfaces and packet attributes.
It is also very useful to be able to do the reverse, that is given an iptables configuration to infer what kind of policy it will end up implementing.
This is extremely important, because it is really hard and subtle to devise a good security policy, and even harder to map from a high level security policy expressed in terms of subjects and objects to a network one expressed in terms of processes, computers, clients and servers, and finally to map this onto an iptables configuration expressed in terms of packets and interfaces.
It is difficult to get any of these things right, and it may be more secure to define a security policy only in terms of general principles, devise some simple, easy to check and maintain iptables configuration and check that it implements a security policy compatible with the general principles.
The Linux network subsystem is by now a very sophisticated and complex work, that has been grown more than designed; but when the growth becomes too unruly, someone provides a redesign.
Some principles are however quite distinctive, and one is very important and somewhat different from those of other network subsystems, and needs some explaining.
While the IP protocol architecture is about interfaces and traffic among them, Linux and its network subsystem are designed to run a computer rather than merely deal with interfaces, so they do define a notion of computer, and also of the set of interfaces that are local to the computer.
Now these interfaces are sort of connected with each other, by being part of the same computer; the kernel can move data between them, just like a serial line or an Ethernet cable can.
The Linux network subsystem provides some automagic functionality that considers all local interfaces, to some large extent, as if they were one and the same interface. In particular, by default, packets arriving on a local interface but with the address of another are not forwarded to that interface; they are immediately deemed arrived and handed to the program bound at the other local interface.
The Linux network subsystem carries this automagic quite far; for example any local interface will by default respond to ARP queries for the addresses of other local interfaces.
Note that this is all by default, and things can be changed by tweaking various settings under /proc/sys/net/.
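As an illustration of the kind of settings meant here, the sketch below prints a few per-interface tunables that influence this behaviour. On the kernels I am aware of they live under /proc/sys/net/ipv4/conf; the selection of arp_ignore, arp_announce, arp_filter and rp_filter is my own, and the show_conf function and its optional directory argument exist only for this illustration.

```shell
# Print selected per-interface IPv4 tunables as "interface/name=value".
# The directory argument defaults to the usual location; it is a
# parameter only so that the function can be tried on a copy.
show_conf() {
    base="${1:-/proc/sys/net/ipv4/conf}"
    for dir in "$base"/*/; do
        [ -d "$dir" ] || continue
        ifname=$(basename "$dir")
        for name in arp_ignore arp_announce arp_filter rp_filter; do
            [ -r "$dir$name" ] && echo "$ifname/$name=$(cat "$dir$name")"
        done
    done
    return 0
}

show_conf
```

The function only reads values; changing them is a matter of writing to the same files as root.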
In other words the Linux network subsystem does not route among its local interfaces, and this is meant to make the task of configuring the system easier, as well as to make the system work even if there is some misconfiguration; various bits of automagic happen to paper over minor ambiguities.
This, however, especially if one is considering network packet filtering, must be kept in mind, because it means that packets may arrive on unexpected interfaces, either intentionally, accidentally or maliciously.
This largely requires setting up policies that don't assume that packets with a given destination address arrived via a particular interface.
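A minimal sketch of a rule that checks the interface and the address together, instead of assuming that one implies the other. The interface eth0, the address 192.168.1.1 and the pin_address function are hypothetical; by default the commands are only printed, and setting IPT=iptables as root would apply them.

```shell
# Dry run by default: print each command instead of running it.
IPT="${IPT:-echo iptables}"

# Drop packets addressed to a local address unless they arrived on the
# interface that owns it.
pin_address() {
    ifname="$1" addr="$2"
    $IPT -A INPUT ! -i "$ifname" -d "$addr" -j DROP
}

# Locally generated traffic to that address arrives via lo, so lo is
# accepted before the check.
$IPT -A INPUT -i lo -j ACCEPT
pin_address eth0 192.168.1.1
```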
The iptables subsystem involves the following entities:
cat /proc/net/ip_table_names
iptables -t table -F
iptables -t table -L chain -n
All tables may be applied to a packet, typically at different stages in the processing of the packet; usually in each table only one chain is applied to a packet, depending on the packet's attributes.
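To see every table and the chains in it, something like the following sketch can be used. The list_tables function is hypothetical; it reads the table names from /proc when that file is readable (which usually needs root) and otherwise assumes the three default names. By default the commands are only printed; setting IPT=iptables as root would run them.

```shell
# Dry run by default: print each command instead of running it.
IPT="${IPT:-echo iptables}"

# List every chain of every table, with counters and numeric addresses.
list_tables() {
    names_file="${1:-/proc/net/ip_table_names}"
    if [ -r "$names_file" ]; then
        tables=$(cat "$names_file")
    else
        tables="filter nat mangle"   # assumed default set
    fi
    for t in $tables; do
        $IPT -t "$t" -L -n -v
    done
}

list_tables
```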
The default set of tables is:
filter
nat
mangle
Of these the filter table is the most commonly used, as it is that which is used to implement security policies. For this reason it is also the default table for the iptables command, and most existing documentation concentrates on setting it up.
The nat table is commonly used in the case where a set of local hosts should be either hidden (masquerading) or renumbered with respect to some remote hosts.
The mangle table is more rarely used to implement traffic control policies, for example altering the quality of service options of packets.
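A sketch of one typical rule for each of the two less common tables. The nat_mangle_demo function, the ppp0 default and the choice of ssh traffic are assumptions for illustration only; by default the commands are printed rather than run.

```shell
# Dry run by default: print each command instead of running it.
IPT="${IPT:-echo iptables}"

nat_mangle_demo() {
    ext_if="${1:-ppp0}"    # assumed external interface
    # nat: hide local hosts behind the address of the external interface.
    $IPT -t nat -A POSTROUTING -o "$ext_if" -j MASQUERADE
    # mangle: ask for low delay on outgoing ssh packets.
    $IPT -t mangle -A OUTPUT -p tcp --dport 22 -j TOS --set-tos Minimize-Delay
}

nat_mangle_demo
```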
filter table
This table contains three predefined chains, and exactly one of them is applied to a packet depending on its transit status:
INPUT
FORWARD
OUTPUT
It is very important to note that the chains in the filter table are applied to packets after routing, when the output interface, if any, is known.
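One consequence, sketched below, is which interface options can usefully appear in each chain. The interface_rules function and the names eth0 and ppp0 are hypothetical; by default the commands are only printed.

```shell
# Dry run by default: print each command instead of running it.
IPT="${IPT:-echo iptables}"

interface_rules() {
    lan="$1" wan="$2"
    # INPUT: the packet is for this computer, so only -i can match.
    $IPT -A INPUT -i "$lan" -j ACCEPT
    # FORWARD: routing has already picked the output interface.
    $IPT -A FORWARD -i "$lan" -o "$wan" -j ACCEPT
    # OUTPUT: locally generated, so only -o can match.
    $IPT -A OUTPUT -o "$wan" -j ACCEPT
}

interface_rules eth0 ppp0
```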
The targets that can be applied to packets that match the condition of a rule in one of these chains are:
-j chain
-j RETURN
-j ACCEPT
-j LOG
-j REJECT
-j QUEUE
-j DROP
Some of the targets above are built-in to the iptables subsystem, and some are implemented by extension modules; however this is usually unimportant.
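A small sketch that uses several of these targets together. The chain name SSH_IN, the network 192.0.2.0/24 and the target_demo function are made up for the example; by default the commands are only printed.

```shell
# Dry run by default: print each command instead of running it.
IPT="${IPT:-echo iptables}"

target_demo() {
    # A user-defined chain, reached with "-j chain" from INPUT.
    $IPT -N SSH_IN
    $IPT -A INPUT -p tcp --dport 22 -j SSH_IN
    # RETURN: packets from the assumed trusted network go back to INPUT,
    # where the rules after the jump decide their fate.
    $IPT -A SSH_IN -s 192.0.2.0/24 -j RETURN
    # LOG does not end the traversal; the next rule still sees the packet.
    $IPT -A SSH_IN -j LOG --log-prefix "ssh-refused: "
    # REJECT answers the sender; DROP would discard the packet silently.
    $IPT -A SSH_IN -p tcp -j REJECT --reject-with tcp-reset
}

target_demo
```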
Ideally they are like these:
Given that sessions are not immediately evident in the flow of packets, reliable guesses must be used, either by:
Session inference and validity is usually based on a combination of the flags in the packet header(s), the interface and addresses of the packet, including the ports for higher level protocols.
Logging must be done for only a few packets that are deemed interesting, to avoid generating lots of noise.
For the same reason, and to prevent log-based denial of service attacks, all logging must be subject to frequency limits.
Logging can be done before accepting or dropping a packet, matching it twice, or in the same chain that accepts or drops it.
Despite the latter looking more appealing, my current preference is to use logging by double matching for session opening packets, which are probably rare in any case.
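Logging by double matching with a frequency limit might look like the sketch below. The log_and_accept function, the limit values and the log prefix are arbitrary choices of mine; by default the commands are only printed.

```shell
# Dry run by default: print each command instead of running it.
IPT="${IPT:-echo iptables}"

# The same match is written twice: first with the rate limited LOG
# target, then with ACCEPT.
log_and_accept() {
    port="$1"
    $IPT -A INPUT -p tcp --dport "$port" -m state --state NEW \
        -m limit --limit 6/minute --limit-burst 10 \
        -j LOG --log-prefix "new-session: "
    $IPT -A INPUT -p tcp --dport "$port" -m state --state NEW -j ACCEPT
}

log_and_accept 22
```

The limit applies only to the LOG rule, so packets beyond the limit are still accepted by the second rule, just not logged.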
A packet is not suitable for the current node if it tries to open an unacceptable session or does not otherwise belong to an existing session.
Note that this principle applies to packets whether entering, leaving or passing through the current node.
A firewall configuration is essentially a predicate, a logical formula: if the packet's attributes satisfy the predicate, it is passed on, otherwise it gets deleted.
Unfortunately this logical formula can have a dozen free variables, and hundreds of terms.
It is very hard to build logic formulas that complex that work correctly, and they can also be very slow to evaluate.
The way to reduce the impact of the complexity is to split the big formula into a hierarchy of smaller formulas, and to short circuit evaluation. In iptables this is done by splitting the firewall configuration into a number of chains.
Each chain contains a number of terms which are evaluated sequentially, and that can be simple terms or invocations of other chains. A good organization of these chains is essential both for maintainability (and thus correctness), and for the speed with which packets are checked.
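A sketch of such a hierarchy, in which a packet only reaches the terms of a subchain if it matched the condition on the jump to it. The subchain helper and the chain names TCP_IN and TCP_NEW_IN are hypothetical; by default the commands are only printed.

```shell
# Dry run by default: print each command instead of running it.
IPT="${IPT:-echo iptables}"

# Create a user chain and hook it below a parent chain for packets
# matching the remaining arguments.
subchain() {
    parent="$1" child="$2"
    shift 2
    $IPT -N "$child"
    $IPT -A "$parent" "$@" -j "$child"
}

# Only TCP packets ever reach the rules in TCP_IN, and only new
# sessions reach TCP_NEW_IN; all other packets skip those rules.
subchain INPUT TCP_IN -p tcp
subchain TCP_IN TCP_NEW_IN -m state --state NEW
$IPT -A TCP_NEW_IN -p tcp --dport 22 -j ACCEPT
```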
There are three principles I like for organizing a set of chains:
Packets are hierarchies of attributes; first the envelope attributes, then the attributes of the header, then the attributes of any contained header, then the contents of packets carried in the body.
-i and -o
-m mac --mac-source
-m mark --mark
-m state --state
-s
-d
-m length --length
--tos
--ttl
-p
-p tcp --sp
-p tcp --dp
-p tcp --tcp-flags
-p tcp --tcp-option
-p tcp --mss
-p udp --sp
-p udp --dp
-p icmp --icmp-type
TBD
TBD
It is not easy to get Netfilter actions logged in a nice sort of debugging way by default, so one must use static analysis, indirect means or explicit instrumentation:
Listing the rules and their counters with
iptables -L -v -n | less -SCi
and remembering to scroll right to see the important details produced by -v.
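One indirect means that I find handy is to zero the counters, let some traffic pass, and then see which rules counted packets. A sketch, in which the count_matches function and its default of ten seconds are arbitrary; by default the commands are only printed.

```shell
# Dry run by default: print each command instead of running it.
IPT="${IPT:-echo iptables}"

# Zero the counters of a chain, wait, then list exact counts with
# rule numbers to see which rules matched in the meantime.
count_matches() {
    chain="${1:-INPUT}" wait="${2:-10}"
    $IPT -Z "$chain"
    sleep "$wait"
    $IPT -L "$chain" -v -n -x --line-numbers
}
```

For example count_matches OUTPUT 30 while reproducing a problem from another terminal.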
This example is about a simple leaf machine that accesses the internet via a single PPP connection; all output sessions are allowed, but only a few types of input sessions.
PPPIP="$1"
PPPDV="$2"
: ${PPPDV:='ppp0'}
for C in INPUT FORWARD OUTPUT
do
  iptables -F "${C}"
  iptables -P "${C}" DROP
done
for D in I O
do
  for C in E IP TCP TCP_P UDP UDP_P ICMP ICMP_T
  do
    iptables -F "${C}_${D}"
    iptables -X "${C}_${D}"
    iptables -N "${C}_${D}"
  done
done
for D in S D
do
  for C in IP_A
  do
    iptables -F "${C}_${D}"
    iptables -X "${C}_${D}"
    iptables -N "${C}_${D}"
  done
done
The names of the chains follow some simple patterns:
E for “envelope” field checking.
Protocol names for header checking: IP and those embedded inside IP, like TCP, UDP and ICMP.
Ports with suffix _P or ICMP types with suffix _T.
Suffix _I or _O for input or output packets.
Addresses with IP_A and then suffix _S or _D depending on whether the source or destination address is checked.
# Loopback and private (RFC 1918) ranges must never appear on the PPP link.
iptables -A IP_A_S -j DROP -s 127.0.0.0/8
iptables -A IP_A_S -j DROP -s 10.0.0.0/8
iptables -A IP_A_S -j DROP -s 172.16.0.0/12
iptables -A IP_A_S -j DROP -s 192.168.0.0/16
iptables -A IP_A_D -j DROP -d 127.0.0.0/8
iptables -A IP_A_D -j DROP -d 10.0.0.0/8
iptables -A IP_A_D -j DROP -d 172.16.0.0/12
iptables -A IP_A_D -j DROP -d 192.168.0.0/16
Packets on lo are accepted unconditionally. This is the first rule because we expect a lot of traffic on lo, for a desktop. If this were a loaded server we would probably put this rule below the next one.
iptables -A INPUT -j E_I
iptables -A E_I -j ACCEPT -i lo
iptables -A E_I -j ACCEPT -m state --state ESTABLISHED,RELATED
iptables -A E_I -j RETURN -i "$PPPDV"
iptables -A E_I -j DROP
Then we check the source address, which should be valid, and the destination address, which must be the PPP address.
iptables -A INPUT -j IP_I
iptables -A IP_I -j DROP \! -d "$PPPIP"
iptables -A IP_I -j RETURN -s "$PPPIP"
iptables -A IP_I -j IP_A_S
iptables -A INPUT -j TCP_I -p tcp
iptables -A TCP_I -j TCP_P_I -p tcp -m state --state NEW
iptables -A TCP_P_I -j ACCEPT -p tcp -m multiport --dports ftp,ssh,http,ident
iptables -A INPUT -j UDP_I -p udp
iptables -A UDP_I -j UDP_P_I -p udp -m state --state NEW
iptables -A UDP_P_I -j DROP

iptables -A INPUT -j ICMP_I -p icmp
iptables -A ICMP_I -j ICMP_T_I
for T in 'destination-unreachable' 'source-quench' \
  'echo-reply' 'time-exceeded' 'parameter-problem'
do
  iptables -A ICMP_T_I -j ACCEPT -p icmp --icmp-type "${T}"
done
Output is basically the same as input, but allowing all session initiations, so there won't be repeated comments.
iptables -A OUTPUT -j E_O
iptables -A E_O -j ACCEPT -o lo
iptables -A E_O -j ACCEPT -m state --state ESTABLISHED,RELATED
iptables -A E_O -j RETURN -o "$PPPDV"
iptables -A E_O -j DROP
iptables -A OUTPUT -j IP_O
iptables -A IP_O -j DROP \! -s "$PPPIP"
iptables -A IP_O -j RETURN -d "$PPPIP"
iptables -A IP_O -j IP_A_D

iptables -A OUTPUT -j TCP_O -p tcp
iptables -A TCP_O -j TCP_P_O -p tcp -m state --state NEW
iptables -A TCP_P_O -j ACCEPT

iptables -A OUTPUT -j UDP_O -p udp
iptables -A UDP_O -j UDP_P_O -p udp -m state --state NEW
iptables -A UDP_P_O -j ACCEPT

iptables -A OUTPUT -j ICMP_O -p icmp
iptables -A ICMP_O -j ICMP_T_O
iptables -A ICMP_T_O -j ACCEPT