
CHAOS Technical Information

Current CHAOS feature summary:
CHAOS 1.6 has been released with the following high-level feature set:
Live CD or PXE; runs from RAM after loading from media - nil installation.
6Mbyte OS footprint; fits on a business card
Feature packed Linux Kernel (2.4.27)
Latest openMosix software (kernel 2.4.27-20040808, tools 0.3.6-2)
Automatic IP configuration; boot with DHCP/BOOTP/RARP or manually
3DES encrypted network communications; IPSEC - fully meshed!
Stateful packet filtering; 500/UDP and ESP network accessible in IPSEC mode
Custom INIT binary; fast, zombie-free, clear (color coded), and inflexible!
Autodiscovery daemon tyd with multicast (local) and unicast (multi-site) support
Supports most i586/PCI hardware (including recent Compaq/Dell desktops)
The proposed CHAOS roadmap
CHAOS version 1.6 was intended to be a package upgrade from CHAOS 1.5, and
an effort to restore compatibility between CHAOS, ClusterKnoppix and Quantian.
Unfortunately, due to IPSEC issues in ClusterKnoppix, this effort has not
been successful. As a new version of Knoppix has been released, there
is, understandably, little interest in repairing or upgrading the older version
(clusterKNOPPIX_V3.6-2004-08-16-EN-cl1).
The plan going forward is, therefore:
CHAOS version 1.6 has been released as a package upgrade from CHAOS 1.5 only.
CHAOS version 1.7 will be released along with the latest version of
ClusterKnoppix, providing compatible/inter-operable versions of CHAOS,
ClusterKnoppix and Quantian. CHAOS 1.7 will also see the beginning of
the security and implementation improvements in the CHAOS-specific
code (tyd, init, etc).
Understanding CHAOS' networking
CHAOS has been designed with large ad-hoc roll-outs in mind. Wherever
a design trade-off arose, the decision was almost always
made in favor of security, simplicity and automation. To achieve the
largest possible gains in time/labour saving through automation in
networking, the boot screen features a matrix of prefabricated boot
options. These options allow combinations of DHCP, BOOTP or static
addressing to be quickly selected - the default being dynamic (DHCP/BOOTP).
If the network has no DHCP or BOOTP server, pressing F5 at the boot
prompt will display instructions for manually entering static IP interface
data.
All CHAOS automagic IP configuration options are managed by init, to
provide the fastest flexible boot sequence possible. This implementation method
ensures that, where required, supporting drivers (such as PCMCIA support) can
be loaded before attempting to obtain network connectivity.
The services used by CHAOS are listed in its /etc/services file:
#
# CHAOS-1.0 /etc/services
#
#
# Network services, Internet style
#
ssh       22/tcp     # Secure Shell Login
bootps    67/udp     # BOOTP server
bootpc    68/udp     # BOOTP client
tftp      69/udp     # Trivial File Transfer Protocol
http      80/tcp     # WorldWideWeb HTTP
ike       500/udp    # IPSEC IKE
#
#
# Local openMosix
#
om-mfs    723/tcp    # openMosix FileSystem port
om-disc   1334/udp   # openMosix autodiscovery protocol
om-mig    4660/tcp   # openMosix Migration Daemon port
om-info   5428/udp   # openMosix Info Daemon port
#
#
# Local CHAOS
#
mgetm     2727/udp   # multicast get-m protocol
ugetm     2728/udp   # unicast get-m protocol
tnp       3278/tcp   # terrence-n-phillip protocol
Understanding CHAOS' tyd
The CHAOS auto discovery daemon is called "tyd" - pronounced
"tie-dee" (like "tidy"). It is an implementation of the
TNP protocol, through a brutally butchered omdiscd source framework; it is
a single process/single thread binary that operates as both client and server
in the TNP process. Tyd can be executed from /sbin/tyd on a running CHAOS
node.
Tyd has two components, Terrence and Phillip. Terrence acts as a client,
interrogating Phillip, the server. On the first node in a cluster, Tyd is
executed without any parameters, forcing it to run without Terrence; we'll
come back to Phillip later. On every other node, Tyd is executed with one
parameter; an IP address for another node. Tyd can now tell Terrence where
one Phillip is. Terrence connects to Phillip and, once they have met and
exchanged their formal greeting, Terrence begins to interrogate Phillip.
Terrence is going to be busy for a little while. First, he retrieves the mosix
map from the Phillip he was told about; he updates the local CHAOS node with
this map. Then, he connects to every Phillip in the map (cluster), one by
one, starting with the first (lowest numbered node-id) and moving through
to the last. As he visits each Phillip, Terrence asks him to add the new
CHAOS node to his own map. Phillip does this and returns the current
total number of nodes to Terrence. If the total number of nodes that a
given Phillip knows about is more than the total number of nodes Terrence
is expecting, then Terrence has been passed by another Terrence; he aborts
his cluster addition interrogation and starts again, asking the first
node for a new map.
Phillip, relatively speaking, has a much easier job. Once Terrence has been
successful in joining the cluster (map) he goes away, leaving a Phillip in
place on the new CHAOS node. This Phillip is like any other: he holds
a map of the entire cluster and is ready to serve maps to a passing Terrence
and to add new Terrences as they arrive.
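The join procedure described above can be sketched as a toy, in-memory Python model. This is not tyd's code: the class and function names, the use of an ordered list of addresses as the "map", the idempotent add, and the max_attempts safety limit are all assumptions made for illustration. A dictionary stands in for the TCP connections between nodes.

```python
class Phillip:
    """Server side: holds one node's copy of the cluster map (hypothetical model)."""

    def __init__(self, address, cluster_map=None):
        self.address = address
        # Ordered list of node addresses; the first node knows only itself.
        self.cluster_map = list(cluster_map or [address])

    def get_map(self):
        return list(self.cluster_map)

    def add_node(self, address):
        # Adding is idempotent here so that a restarted Terrence is harmless.
        if address not in self.cluster_map:
            self.cluster_map.append(address)
        return len(self.cluster_map)  # current total number of nodes


def join(new_address, known_address, network, max_attempts=10):
    """Terrence: join the cluster reachable through known_address.

    network maps address -> Phillip and stands in for real connections.
    """
    for _ in range(max_attempts):
        cluster_map = network[known_address].get_map()
        if new_address not in cluster_map:
            cluster_map.append(new_address)
        expected = len(cluster_map)
        peers = [a for a in cluster_map if a != new_address]
        # Visit each Phillip in map order; all() stops at the first Phillip
        # that reports more nodes than expected (we were passed), and the
        # loop then starts again with a fresh map.
        if all(network[a].add_node(new_address) <= expected for a in peers):
            network[new_address] = Phillip(new_address, cluster_map)
            return cluster_map
    raise RuntimeError("could not join: the cluster map kept changing")
```

For example, starting from a single node and joining two more leaves every Phillip with the same three-entry map. The real daemon works on the openMosix map (node-ids and IP addresses), which this sketch reduces to a list of addresses.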
An ASCII diagram of the TNP communications follows:
/*
* tnp proto
*
* port 3278/tcp
*
* hello
*
* [t] -------> [p] "let's look for treasure!" (hello,)
* [t] <------- [p] "yes, let's look for treasure!" (hello.)
*
*
* interrogate - map
*
* [t] -------> [p] "aaaahahahaha .. spattered your face!" (map me)
* [t] <------- [p] [map_count][map data] - close
*
*
* interrogate - getpubkey
*
* [t] -------> [p] "get_ssh_pub_key" (add me)
* [t] <------- [p] [rsa key ent]
* [t] -------> [p] "get_ssh_pub_key_ack"
* [t] <------- [p] [dsa key ent]
* [t] -------> [p] "get_ssh_pub_key_ack"
*
* interrogate - setpubkey
*
* [t] -------> [p] "set_ssh_pub_key" (add me)
* [t] <------- [p] "set_ssh_pub_key_ack"
* [t] -------> [p] [rsa key ent]
* [t] <------- [p] "set_ssh_pub_key_ack"
* [t] -------> [p] [dsa key ent]
* [t] <------- [p] "set_ssh_pub_key_ack"
*
* interrogate - add
*
* [t] -------> [p] "aaaahahahaha .. just kidding!" (add me)
* [t] <------- [p] [node_count] - close
*
*
* interrogate - del
*
* [t] -------> [p] "*phht* daaaahahahaha!" (del me)
* [t] <------- [p] close
*
*
*
* tyd help can be found with "tyd --help"
*
*/
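The "hello" exchange at the top of the diagram can be illustrated with a small Python sketch. Only the two greeting strings come from the diagram. The newline framing, the function names, and the use of a local socket pair instead of a TCP connection to port 3278 are assumptions, since the source does not describe tyd's wire format.

```python
import socket
import threading

HELLO = b"let's look for treasure!"             # Terrence -> Phillip
HELLO_REPLY = b"yes, let's look for treasure!"  # Phillip -> Terrence


def recv_line(sock):
    """Read up to a newline (assumed framing; not taken from tyd)."""
    data = bytearray()
    while not data.endswith(b"\n"):
        chunk = sock.recv(1)
        if not chunk:
            raise ConnectionError("peer closed before end of line")
        data += chunk
    return bytes(data[:-1])


def terrence_hello(sock):
    """Client side: send the greeting and check the reply."""
    sock.sendall(HELLO + b"\n")
    return recv_line(sock) == HELLO_REPLY


def phillip_hello(sock):
    """Server side: answer only a correct greeting."""
    if recv_line(sock) != HELLO:
        return False
    sock.sendall(HELLO_REPLY + b"\n")
    return True


def demo():
    """Run both roles over a local socket pair and report Terrence's view."""
    t_sock, p_sock = socket.socketpair()
    with t_sock, p_sock:
        server = threading.Thread(target=phillip_hello, args=(p_sock,))
        server.start()
        greeted = terrence_hello(t_sock)
        server.join()
    return greeted
```

Calling demo() returns True when both sides complete the greeting. The interrogation messages (map, add, del and the key exchanges) would follow the same request/reply pattern, but their payload encodings are not given in the source and are left out.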
The TNP protocol was created to solve the shortcomings of omdiscd.
Omdiscd does an awesome job of dynamically configuring the kernel
with a common cluster-wide openMosix map, provided your cluster is
within a single VLAN or LAN segment. This segment boundary limitation
is not intentional - the multicast traffic used to maintain the cluster
configuration is only isolated due to the routing infrastructure
surrounding a given segment. However, having spent three days trying to
get mrouted to bridge VLANs on a proxy-arp'd class-b network, it became
evident that writing a unicast autodiscovery daemon would be far more
productive.
At this stage tyd, like omdiscd, does not support node removal. Hooks
have been coded, but the openMosix kernel was not accepting the node
removal request -- more investigation is required. There is also a
limitation in node addition, based on cluster integrity. Should a node
go missing mid-process, the map chain will be broken and new nodes will fail
to be added correctly. This will be corrected in a coming version of tyd.
Understanding CHAOS' get-m
The CHAOS autodiscovery daemon, tyd, needed a little help. As mentioned
above, tyd needs to know the IP address of just one node in the cluster
(any node), so that the Terrence component of tyd can download the
cluster map, and find all of the other nodes in the cluster. As you will
see below, to specify this "master" node, you pass tyd the "-m"
parameter.
This new daemon get-m, pronounced "get em" (like a British
"get them"), removes the need to manually key in an IP address
for Terrence to retrieve a map from. Instead, get-m uses a multicast sonar
to try to find any nodes that may already exist on this LAN segment. If it
finds any, it records them in /var/run/get-m.info for tyd to find.
Understanding CHAOS' init
The initd (or just plain "init") is the unix initialisation process. It is
the first user space software to be executed after the kernel has loaded.
Init's job is to literally initialise the operating environment. It does
this by setting start-up kernel values, mounting file systems, loading
server software, etc.
The CHAOS init daemon works very much like the traditional init (though
far less gracefully).
At startup, each of the desired processes is initialised and launched. At
shutdown (ctrl-alt-del or "kill 1") the init process gracefully brings
the system down. Actually, it is the shutdown process that is the more
important of the two; during shutdown, init asks openMosix to expel all of
the locally retained processes, so that no parcels are lost on reboot.
The real power of the CHAOS init won't be of interest to you unless you try
to embed CHAOS, or try to customise it in some way that requires special
deployment. For those who want to serve CHAOS to a network, or make their
own CDs, etc, the CHAOS init supports a number of environment options that
parameterise the startup characteristics of the operating
environment. On the boot prompt/boot command line, after the init= specifier,
you can add PARAM=value options, each separated by spaces. Unfortunately
there is a very limited number of these values (32 chars worth?) that the
kernel will accept - even though /proc/cmdline may reflect all of the options
entered. A future version of init will probably rectify this issue by
reading /proc/cmdline, downloading an options file from a tftp server, or
doing both.
The init options are listed here:
/*
 * ACPID    ACPID Start/Allow
 *          = 0 Don't
 *          = 1 Do
 *
 * BOOT     Boot Type
 *          = 0 Unknown/ISO
 *          = 1 Network
 *          = 2 Local/Fixed Disk
 *
 * DHCP     DHCP client Start/Allow
 *          = 0 Don't
 *          = 1 Do
 *
 * EJECT    CDROM Find/Eject
 *          = 0 Don't
 *          = 1 Do
 *
 * HTTPD    HTTPD Start/Allow
 *          = 0 Don't
 *          = 1 Do
 *          = 2 Do + Admin
 *
 * IPFILT   Ipfilter Start/Allow
 *          = 0 Don't
 *          = 1 Do
 *
 * IPSEC    IPSEC Start/Allow
 *          = 0 Don't
 *          = 1 Do
 *
 * MASTER   Allocate discovery host (tyd)
 *          = ip.ad.dr.ess
 *          = host.na.me
 *
 * OMAUTO   Autodiscovery Type
 *          = 0 Unknown/None
 *          = 1 Omdiscd
 *          = 2 Tyd
 *
 * SETI     SETI Start/Allow
 *          = 0 Don't
 *          = 1 Do
 *
 * SHELL    Boot to Interactive Shell
 *          = 0 Don't
 *          = 1 Do
 *
 * SSHD     SSHD Start/Allow
 *          = 0 Don't
 *          = 1 Do
 *
 * TYD      TYD behaviour
 *          = 0 None/Off
 *          = 1 Slave
 *          = 2 Master
 *          = 3 PXE-Master
 *
 */
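As an illustration of how PARAM=value options like those above could be picked out of a boot command line (for instance the contents of /proc/cmdline), here is a minimal Python sketch. The function name and the example command line, including the /sbin/init path and the root= argument, are hypothetical. The real CHAOS init is a custom binary and its parsing rules are not documented here.

```python
def parse_init_options(cmdline):
    """Return the upper-case PARAM=value tokens that follow init= as a dict.

    Values are kept as strings; tokens before init= and tokens without
    an "=" are ignored.
    """
    options = {}
    seen_init = False
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "init":
            seen_init = True
        elif seen_init and sep and key.isupper():
            options[key] = value
    return options


# Hypothetical command line using option names from the listing above.
example = "root=/dev/ram0 init=/sbin/init DHCP=1 OMAUTO=2 TYD=1 MASTER=10.0.0.1"
options = parse_init_options(example)
```

With the example line, options holds the four entries DHCP, OMAUTO, TYD and MASTER. Reading the whole of /proc/cmdline this way is one of the approaches the text suggests a future init might take.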
But why create a new init? In a dedicated task environment, the flexibility
of a complete unix OS is simply not required. Removing said flexibility allows
for focus in the distribution; CHAOS' init does only what is required to start
the cluster environment and without the use of a myriad of shell scripts. This
technique also makes it easier to integrate features comprehensively (such as
PXE boot options).
CHAOS has stateful packet filtering
Yes. The Mosix/openMosix architecture is incredibly insecure,
from vulnerabilities in the respective implementations all the way back
to their insecure design. Since CHAOS 0.5, tyd has supported packet
filtering using the Linux netfilter kernel-level stateful packet filtering
code.
In CHAOS versions after 0.7, if cryptography is disabled, the
filters allow only the node's http daemon and tyd to be
visible to hosts other than the nodes registered in the cluster. If
cryptography is enabled, the filters allow only the node's
http daemon, IPSEC's ESP protocol, and IKE's service port to be visible to
hosts other than the nodes registered in the cluster. With cryptography
enabled, all openMosix and tyd communications must be routed via
the ipsec0 interface only.
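The two filtering modes can be summarised as a small lookup, sketched here in Python. This is a stateless model of the policy as described, not the actual netfilter rule set. The labels reuse names and ports from the /etc/services listing, and treating tyd's visible port as the tnp entry (3278/tcp) is an assumption. ESP has no port; it is IP protocol number 50.

```python
# Services reachable by hosts outside the cluster, per mode (illustrative).
PLAIN_VISIBLE = frozenset({"http 80/tcp", "tnp 3278/tcp"})
IPSEC_VISIBLE = frozenset({"http 80/tcp", "ike 500/udp", "esp"})


def visible_services(crypto_enabled):
    """Services that hosts outside the cluster can reach on a node."""
    return IPSEC_VISIBLE if crypto_enabled else PLAIN_VISIBLE


def outsider_may_reach(service, crypto_enabled):
    """True if a non-cluster host is allowed to reach the given service."""
    return service in visible_services(crypto_enabled)
```

In this model the openMosix ports (for example om-mig 4660/tcp) are never reachable from outside the cluster, and tyd stops being directly reachable once IPSEC is enabled.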
CHAOS has cryptography - 3DES under IPSEC
From CHAOS 0.7, tyd has supported the layer-3 VPN standard - IPSEC - with
a default encryption transform of 3DES (Triple DES). CHAOS uses
pre-shared keys (PSKs) to authenticate the tunnel.
The default value for the PSK is stored in /etc/ipsec.secrets. It is long
and complicated, but it would still be most prudent to avoid using the
default. Note that using the default will ensure that differing tyd
versions will not successfully communicate within the same cluster. See
also the following sections on compatibility for details on CHAOS nodes
in non-CHAOS-homogeneous clusters.
NB: CHAOS applies the default Openswan encrypt, tunnel, compress and pfs
transforms in a fully meshed topology ("n-1" tunnels per node). All tyd and
openMosix communications are encrypted and encapsulated in the IPSEC
tunnels.
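To make the cost of the full mesh concrete: with n nodes, each node terminates n-1 tunnels and the cluster as a whole carries n(n-1)/2 of them. A short Python sketch, with a hypothetical function name and example addresses, enumerates the tunnel pairs:

```python
from itertools import combinations


def mesh_tunnels(nodes):
    """Return every unordered pair of nodes, one pair per tunnel."""
    return list(combinations(sorted(nodes), 2))


# Hypothetical four-node cluster.
nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
tunnels = mesh_tunnels(nodes)
per_node = {n: sum(n in pair for pair in tunnels) for n in nodes}
```

For the four example nodes this gives six tunnels in total and three per node, so the tunnel count grows quadratically as the cluster gets larger.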