Viewing posts for the category Omarine User's Manual

Building a fault-tolerant firewall system with virtual machines: Writing a firewall ruleset

Rules are the building material of a firewall. A firewall without a ruleset is empty, like a wall of air: it lets everything through, unwanted packets included.

We speak of writing rules because rule-setting is flexible. There are no hard rules, and the same goal can be expressed in many ways. For example, for the conntrack state we can rely on the internal state, the conntrack status, or on the external state, ctinfo. ctinfo is conntrack's state information, but it is updated according to context, so the two are not identical. ctinfo shows the direction of the flow; status does not. Conversely, status indicates that a conntrack is expected, while ctinfo no longer holds this information once the connection has been established in the conntrack sense. In some situations, however, they mean the same thing. For example, ct state established is equivalent to ct status seen-reply; ct state related is equivalent to ct status expected tcp flags syn; and ct state established ct direction original is equivalent to ct status assured.
How the rules are written is therefore up to the person building the firewall. You can use goto or jump to break the ruleset into chains and make it clearer.
It is not required, but you can create a flow table (flowtable) to accelerate packet forwarding and offload flows. Once the conntrack is established, you can place the flow entry into the table with a flow add rule. Each entry is represented by a seven-element tuple: source address, destination address, source port, destination port, layer 3 protocol, layer 4 protocol, and input interface. The entry also caches the output interface. At the ingress hook, if the flow entry is found in the table, the packet bypasses the classic forwarding path: it does not traverse the netfilter hooks after ingress but goes directly to the output interface via the neigh_xmit() function, called from the hook function nf_flow_offload_inet_hook().
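
The fast-path decision described above can be pictured with a small toy model. This is only a sketch in Python, not kernel code: the names FlowKey, FlowTable and ingress are made up for illustration, and a real flowtable keeps more per-entry state than a single output interface.

```python
from typing import NamedTuple, Optional


class FlowKey(NamedTuple):
    """The seven-element tuple that identifies a flow entry."""
    src: str
    dst: str
    sport: int
    dport: int
    l3proto: str
    l4proto: str
    iif: str


class FlowTable:
    """Toy flow table: maps a flow key to the cached output interface."""

    def __init__(self) -> None:
        self._entries = {}

    def add(self, key: FlowKey, oif: str) -> None:
        # Corresponds to offloading an established flow into the table.
        self._entries[key] = oif

    def lookup(self, key: FlowKey) -> Optional[str]:
        return self._entries.get(key)


def ingress(table: FlowTable, key: FlowKey) -> str:
    """Decide at ingress whether the packet can skip the classic path."""
    oif = table.lookup(key)
    if oif is not None:
        return f"fast path -> {oif}"
    return "classic forwarding path"
```

A packet whose key is not in the table takes the classic path; once the same key has been added with add(), the next lookup returns the cached output interface and the packet takes the fast path.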

In our example the firewall allows only the following services:

     1. DNS (UDP, port 53)
     2. www (TCP, port 80, 443)
     3. File Transfer (TCP, port 21 with helper)
     4. Secure remote login - ssh (TCP, port 22)
     5. ping (ICMP, echo-request and echo-reply types)
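
As a quick way to reason about this policy, the service list can be restated as a small Python check. This is only an illustrative model of the list above, not the firewall ruleset itself: the function name is_allowed and its parameters are assumptions, and connection states (established, related) are deliberately left out.

```python
# Allowed services, transcribed from the list above.
ALLOWED_UDP_PORTS = {53}                # DNS
ALLOWED_TCP_PORTS = {80, 443, 21, 22}   # www, FTP control, ssh
ALLOWED_ICMP_TYPES = {"echo-request", "echo-reply"}  # ping


def is_allowed(proto, dport=None, icmp_type=None):
    """Return True if a new connection matches one of the listed services."""
    if proto == "udp":
        return dport in ALLOWED_UDP_PORTS
    if proto == "tcp":
        return dport in ALLOWED_TCP_PORTS
    if proto == "icmp":
        return icmp_type in ALLOWED_ICMP_TYPES
    return False
```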


The simple ruleset is as follows:

Building a fault-tolerant firewall system with virtual machines: expectation: part 3: diagram

When the dust of time covers the long lines of code, this image is easy to remember

Building a fault-tolerant firewall system with virtual machines: expectation: part 2: expectation

Recovering helper

The defining feature of a fault-tolerant firewall is the ability to recover connections. But if a connection loses its helper, recovery fails. When conntrackd injects a conntrack that carries a helper into the kernel conntrack table, the netlink subsystem creates the helper for it. Unfortunately, the NAT processing that follows takes the helper away (automatic helper assignment should not be used; the safe way now is to define the helper explicitly with the firewall rule ct helper set). To fix this, we add the following code to the __nf_ct_try_assign_helper() function:

Building a fault-tolerant firewall system with virtual machines: expectation: part 1: helper

Some application protocols, such as FTP, H.323 and SIP, divide a transaction into two flows with two separate connections: a control connection followed by a data connection. In FTP passive mode, the first connection, to port 21 of the file server, is the control connection. After the user has logged in and runs an FTP command such as 'ls', a data connection begins on a port from the very large range 1024 - 65535. The firewall has to allow these two connections, but no more than that: it should not open that many ports. The server and client first negotiate the destination port number on which the server will listen. But this information belongs to the application protocol and is not available to normal connection tracking, which inspects protocol headers only as deep as layer 4. Analyzing the data behind the TCP header requires a protocol aid called a helper. The helper reads the destination port number and creates a so-called expectation: the expectation that a data connection will arrive and become 'related' to the control connection, which acts as the master connection. We then only need to write a 'ct state related' rule, without knowing the specific destination port.
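
To make the idea concrete, here is a minimal Python sketch of what a helper does for passive-mode FTP. It is a toy model, not the kernel's FTP helper: the names Expectation, help_ftp and is_related are invented for illustration. It relies only on the standard FTP 227 reply format, in which the server announces its address and port as six decimal numbers (h1,h2,h3,h4,p1,p2), the port being p1 * 256 + p2.

```python
import re
from typing import NamedTuple, Optional


class Expectation(NamedTuple):
    """An expected data connection tied to its master (control) connection."""
    master: tuple   # hypothetical identifier of the control connection
    dst: str        # address the server will listen on
    dport: int      # port the server will listen on


# Matches the six numbers in e.g. "227 Entering Passive Mode (10,0,0,5,195,80)."
PASV_RE = re.compile(r"227 [^(]*\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)")


def help_ftp(master: tuple, line: str) -> Optional[Expectation]:
    """Read the negotiated port from a control-connection line."""
    m = PASV_RE.search(line)
    if m is None:
        return None
    h1, h2, h3, h4, p1, p2 = (int(g) for g in m.groups())
    return Expectation(master, f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2)


def is_related(expectations, dst: str, dport: int) -> bool:
    """A new connection is 'related' if it matches a pending expectation."""
    return any(e.dst == dst and e.dport == dport for e in expectations)
```

Only the one port announced on the control connection becomes acceptable as 'related'; every other port in the range stays closed.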
Helper registration is done by the nf_conntrack_helper_register() function, which takes a pointer to an nf_conntrack_helper structure. This structure represents the helper and includes the following fields, explained here for the FTP case:
    • Helper's name, "ftp".
    • A tuple, an nf_conntrack_tuple structure. The information it needs to hold is the layer 3 protocol number, the layer 4 protocol number and the working port number of the master connection. The layer 3 protocol is IPv4 or IPv6, the layer 4 protocol is TCP, and the working port is 21.
    • The function pointer help, which handles the application protocol. This is the main job of the helper.
    • hnode, an hlist_node structure used to insert this helper at the head of a list in a bucket of the helper hash table.
    • The function pointer from_nlattr. The function is called when the master conntrack is injected from user space, typically when the firewall recovers the connection. It is meant to handle netlink attributes, but for FTP it only sets the NF_CT_FTP_SEQ_PICKUP flag so that sequence number checking of the data is skipped, because the former backup firewall does not know the sequence numbers (only the active firewall can update them).
    • Flags, expectation policy and some other things.

The registration function computes an index from the helper's tuple, uses it to select a bucket in the hash table array, and inserts the helper at the head of the list at that index. Once the node is in the hash table, the helper can be retrieved from it, because the helper is the structure that contains the node.
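
The registration and lookup steps can be sketched as a toy hash table in Python. This is an illustration only: the bucket count, the hashing via Python's built-in hash() and the names Helper, register and find are assumptions, not the kernel implementation. AF_INET (2) and IPPROTO_TCP (6) are the usual Linux constant values.

```python
AF_INET = 2        # layer 3 protocol number for IPv4
IPPROTO_TCP = 6    # layer 4 protocol number for TCP
NBUCKETS = 8       # illustrative table size


class Helper:
    def __init__(self, name, l3num, protonum, port):
        self.name = name
        # The three pieces of information the helper tuple needs to hold.
        self.key = (l3num, protonum, port)


helper_hash = [[] for _ in range(NBUCKETS)]


def bucket_index(key):
    # Stand-in for the kernel's hash of the helper tuple.
    return hash(key) % NBUCKETS


def register(helper):
    # Insert at the head of the list in the selected bucket.
    helper_hash[bucket_index(helper.key)].insert(0, helper)


def find(l3num, protonum, port):
    key = (l3num, protonum, port)
    for helper in helper_hash[bucket_index(key)]:
        if helper.key == key:
            return helper
    return None
```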

When the first packet of the master connection arrives, the connection tracking system (hereafter called the conntrack system) creates an entry called a conntrack, as it does for any normal connection. A conntrack is an nf_conn structure that holds connection information, including:
    • The status field holds the connection state: whether packets have been seen in both directions, whether the packet has left the box (confirmed, meaning the conntrack has been inserted into the official hash table at the last hook, postrouting), whether the connection is expected, whether it is new or dying, and so on.
    • tuplehash is an array of nf_conntrack_tuple_hash structures, one for the original direction and one for the reply direction. Each structure has two fields: a tuple holding connection information and an hnnode. Each structure is inserted into the hash table through its hnnode. When the system needs to access a conntrack that has a node in the hash table, it computes the index to get the bucket containing the node and then finds the node in the bucket. Once the node is found, the pointer is moved back by the node's offset to the beginning of the nf_conn structure, which yields the conntrack. The two tuples are the most important members of a conntrack. Unlike the helper tuple, which needs only three pieces of information to access its hash table, a conntrack's tuples contain full information: source address, destination address, source port, destination port, layer 3 protocol, layer 4 protocol and direction. One tuple is for the packet's original direction and the other for the reply direction.
    • master conntrack: if this is a data connection, this element points to the conntrack of the control connection.
    • timeout defines the lifetime of a conntrack. When it expires, the conntrack is destroyed.
    • ct_general is an nf_conntrack structure. This structure has only one member, use, which manages the reference count of the conntrack object. When the reference count drops to 0, it is safe to release the object. In practice, the nf_ct_put() function decrements the reference count by 1 and releases the object if the count reaches zero. The nf_ct_expect_put() function does the same for an expectation object.
    • The ext pointer points to an nf_ct_ext structure. This structure has a data field which holds extension structures that are added as needed. The offset field is an array holding the offset of each extension structure from the beginning of the container structure, indexed by extension id. offset[NF_CT_EXT_HELPER] is the offset of the nf_conn_help structure. The nf_conn_help structure helps the master conntrack manage its expectations and communicate with the helper. The nf_conn_help structure has four members:
        ◦ The helper pointer points to the helper.
        ◦ expectations is an hlist_head structure. It is the head of the list of expectations of the master conntrack. A newly created expectation is inserted at the head of this list. This is how the master conntrack manages its own expectations, independently of the general management of expectations in the expectation hash table.
        ◦ expecting is an array of integers holding the number of current expectations per class. Currently FTP uses only one class, 0.
        ◦ data is a 32-byte field for helper-specific information. FTP uses it for the nf_ct_ftp_master structure, which holds the NF_CT_FTP_SEQ_PICKUP flag and the sequence number information described above.
      Another extension structure is nf_conntrack_ecache. This structure has a cache field that holds reporting events such as IPCT_NEW, IPCT_DESTROY, IPCT_HELPER.
    • Some other things.
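
The role of the two tuples and of the reference count can be pictured with a toy model in Python. This is a sketch under simplifying assumptions: there is no NAT, so the reply tuple is just the inverted original tuple; a dictionary stands in for the hash table and its buckets; and the names Tuple, Conntrack, insert, find and put are illustrative, not the kernel's.

```python
from typing import NamedTuple, Optional


class Tuple(NamedTuple):
    src: str
    dst: str
    sport: int
    dport: int
    l3proto: str
    l4proto: str


def invert(t: Tuple) -> Tuple:
    """Reply-direction tuple for a connection without NAT."""
    return Tuple(t.dst, t.src, t.dport, t.sport, t.l3proto, t.l4proto)


class Conntrack:
    def __init__(self, original: Tuple, master: Optional["Conntrack"] = None):
        # One tuple per direction, as in the tuplehash array.
        self.tuplehash = {"original": original, "reply": invert(original)}
        self.master = master   # set for a data connection
        self.use = 1           # reference count


conntrack_table = {}


def insert(ct: Conntrack) -> None:
    # Both tuples are inserted, so a packet in either direction finds ct.
    for direction, t in ct.tuplehash.items():
        conntrack_table[t] = (direction, ct)


def find(t: Tuple):
    """Return (direction, conntrack) for a packet tuple, or None."""
    return conntrack_table.get(t)


def put(ct: Conntrack) -> bool:
    """Drop one reference; True means the object may now be released."""
    ct.use -= 1
    return ct.use == 0
```

Looking up either the original tuple or its inverse leads to the same conntrack object, and the stored direction tells which side the packet came from.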

During conntrack initialization, helper assignment is performed if automatic helper assignment is configured. This consists of finding the helper, adding the nf_conn_help structure and pointing its helper pointer at the helper. At the last hook, postrouting, before the packet leaves the box, the conntrack is confirmed by the nf_conntrack_confirm() function. If the packet is accepted, this function inserts the conntrack into the hash table. It then checks for the helper with the nfct_help() function, which returns a pointer to the nf_conn_help structure. Because the helper was assigned, this succeeds, so the function records the event with nf_conntrack_event_cache(), setting bit 1 << IPCT_HELPER. Finally, nf_conntrack_confirm() delivers the events with the nf_ct_deliver_cached_events() function (in the nf_conntrack_core.h source file). This function first looks up the cached events using nf_ct_ecache_find(). Since the helper event is set, it is found together with the other events. The cached events are thus delivered once and cleared as soon as they are read (this is done by the statement: events = xchg(&e->cache, 0);).
In the nf_ct_deliver_cached_events() function there is a notify pointer to an nf_ct_event_notifier structure. This structure has an fcn field, a function pointer that handles event messages. Meanwhile, the source file nf_conntrack_netlink.c declares a struct nf_ct_event_notifier whose fcn pointer is initialized to the ctnetlink_conntrack_event() function. Let us now briefly analyze network operations to understand how the event is handed over.
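
The cache-then-deliver-once behaviour can be modelled in a few lines of Python. This is a sketch only: the class EventCache and the functions event_cache and take_cached_events are illustrative names, the bit positions are chosen for the example, and the tuple assignment merely imitates the read-and-clear effect of xchg() without its atomicity.

```python
# Illustrative event bit positions (assumed for this example).
IPCT_NEW = 0
IPCT_HELPER = 6


class EventCache:
    """Stands in for the cache field of nf_conntrack_ecache."""

    def __init__(self):
        self.cache = 0


def event_cache(e: EventCache, event: int) -> None:
    # Record an event by setting its bit: cache |= 1 << event.
    e.cache |= 1 << event


def take_cached_events(e: EventCache) -> int:
    # Mimics events = xchg(&e->cache, 0): read the bits and clear them.
    events, e.cache = e.cache, 0
    return events
```

Because the cache is cleared in the same step in which it is read, a second delivery attempt finds nothing to report.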

The net structure holds the network operations. It has a gen pointer field that points to a net_generic structure (generic.h source file), whose ptr field is an array indexed by the ids of the net's network operations. Each network operation is registered with the register_pernet_subsys() function, which takes a pointer to a pernet_operations structure (net_namespace.c source file). The pernet_operations structure represents a network operation. It has a list field for insertion into the list of operations, an id pointer, an init function pointer, and several others. The register_pernet_subsys() function calls the register_pernet_operations() function. This function calls ida_alloc_min() to generate an id for the operation and then calls __register_pernet_operations() to do the actual registration. In turn, __register_pernet_operations() adds the operation to the list and calls ops_init() to initialize the operation with an already initialized net structure named init_net. This is the default net, and the only net in the system if we do not create additional network namespaces.
The ops_init() function calls net_assign_generic() to assign the new net_generic structure (if necessary) to the net's gen pointer, and then calls the init function of the pernet_operations structure to initialize the network operation. This completes the registration of the network operation.
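
The registration sequence can be traced with a simplified Python model. It is only a sketch: a counter stands in for ida_alloc_min(), a Python list stands in for the ptr array of net_generic, every operation is given an id, and MIN_ID is an assumed starting value. The function names mirror the kernel ones for readability, but the bodies are illustrative.

```python
class Net:
    def __init__(self):
        self.gen = []   # stands in for the ptr array of net_generic


class PernetOps:
    def __init__(self, name, init):
        self.name = name
        self.init = init   # called with the net being initialized
        self.id = None


MIN_ID = 1              # assumed lowest id handed out
init_net = Net()        # the default network namespace
pernet_list = []        # list of registered operations
_next_id = MIN_ID


def net_assign_generic(net, op_id):
    # Grow the per-net array so that index op_id exists.
    while len(net.gen) <= op_id:
        net.gen.append(None)


def ops_init(ops, net):
    net_assign_generic(net, ops.id)
    net.gen[ops.id] = {}   # per-net private data of this operation
    ops.init(net)


def register_pernet_subsys(ops):
    global _next_id
    ops.id = _next_id      # allocate an id for the operation
    _next_id += 1
    pernet_list.append(ops)
    ops_init(ops, init_net)
```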

We return to event delivery in ctnetlink. The ctnetlink subsystem registers its operations with the register_pernet_subsys() function, passing the ctnetlink_net_ops structure. ctnetlink_net_ops has ctnetlink_net_init() as its init function, so registration leads to a call to ctnetlink_net_init(). The ctnetlink_net_init() function in turn calls nf_conntrack_register_notifier() with a pointer to the nf_ct_event_notifier structure described above.
As for the net structure, it has a ct field, a netns_ct structure that manages conntracks. The netns_ct structure has a field nf_conntrack_event_cb, a pointer to an nf_ct_event_notifier structure whose purpose is to hold the event notification callback function. The nf_conntrack_register_notifier() function therefore assigns the address of the nf_ct_event_notifier structure above to the net's nf_conntrack_event_cb pointer.
Back in the nf_ct_deliver_cached_events() function, rcu_dereference() is used to obtain net->ct.nf_conntrack_event_cb, which is assigned to the notify pointer. The notify pointer then runs fcn, i.e. it calls the ctnetlink_conntrack_event() function, passing the events and the address of an nf_ct_event structure that contains the conntrack pointer.
nf_ct_deliver_cached_events() only hands over the events, while ctnetlink_conntrack_event() is what actually broadcasts them.
The event we are interested in here is IPCT_HELPER, i.e. the event that creates the helper for the master conntrack.
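
The handover from the delivery function to the ctnetlink callback follows a plain register-then-call pattern, sketched here in Python. This is a toy model: RCU is not represented, a list stands in for the netlink broadcast, the event bits are passed in directly, and the shortened names register_notifier and deliver_cached_events are illustrative.

```python
class EventNotifier:
    """Stands in for nf_ct_event_notifier: it only holds the callback."""

    def __init__(self, fcn):
        self.fcn = fcn


class Net:
    def __init__(self):
        self.nf_conntrack_event_cb = None   # filled in at registration


broadcast_log = []   # stands in for the netlink broadcast


def ctnetlink_conntrack_event(events, item):
    # The callback is the part that actually broadcasts the events.
    broadcast_log.append((events, item["ct"]))


ctnl_notifier = EventNotifier(ctnetlink_conntrack_event)


def register_notifier(net, notifier):
    net.nf_conntrack_event_cb = notifier


def ctnetlink_net_init(net):
    # The init function of the operation registers the notifier.
    register_notifier(net, ctnl_notifier)


def deliver_cached_events(net, events, ct):
    """Hand the events to the registered callback, if there is one."""
    notify = net.nf_conntrack_event_cb
    if notify is None or events == 0:
        return False
    notify.fcn(events, {"ct": ct})
    return True
```

Before ctnetlink_net_init() has run, delivery has no callback to call and nothing is broadcast; afterwards, the same call reaches ctnetlink_conntrack_event().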

Building a fault-tolerant firewall system with virtual machines: Load balancing

Load balancing goes hand in hand with high availability (HA). Two servers, srv-1 and srv-2, are added to the network topology.

The omarine server running keepalived acts as a virtual server that distributes connections equally across the real servers srv-1 and srv-2. All service access to the virtual server is routed to the real servers. The real servers are health checked to monitor the state of the network. A quorum is set (the required minimum total weight of all live servers in the pool). If a real server has a problem and the total weight falls below the quorum, access goes to a sorry server. In this example we create a virtual Web service. The contents of each server's homepage are as follows:

    • Real server srv-1: Hello, I am server 1.
    • Real server srv-2: Hello, I am server 2.
    • The sorry server: Sorry, the quorum was not achieved!

The quorum is set to 2. At startup both real servers are healthy and the quorum is met. From the client we visit http://omarine.omarine.co several times, and the connections go to srv-1 and srv-2 in turn, equally. Then we stop the service on srv-1. At this point the quorum is no longer reached and we are redirected to the home page of the sorry server at omarine.
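
The behaviour observed in this experiment can be reproduced with a small Python model. This is a sketch, not keepalived's implementation: it assumes each real server has weight 1 and that connections are handed out in simple round-robin order, and the class names RealServer and VirtualServer are made up for illustration.

```python
class RealServer:
    def __init__(self, name, weight=1):
        self.name = name
        self.weight = weight
        self.alive = True   # result of the health check


class VirtualServer:
    def __init__(self, servers, quorum, sorry):
        self.servers = servers
        self.quorum = quorum
        self.sorry = sorry
        self._next = 0

    def pick(self):
        """Return the name of the server that gets the next connection."""
        live = [s for s in self.servers if s.alive]
        # Quorum: minimum total weight of all live servers in the pool.
        if sum(s.weight for s in live) < self.quorum:
            return self.sorry
        server = live[self._next % len(live)]
        self._next += 1
        return server.name
```

With both servers alive the connections alternate between srv-1 and srv-2; once srv-1 is marked dead, the total weight is 1, below the quorum of 2, and every connection goes to the sorry server.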


The configuration file is as follows: