                             ==Phrack Inc.==

               Volume 0x0b, Issue 0x39, Phile #0x07 of 0x12

|=---=[ ICMP based remote OS TCP/IP stack fingerprinting techniques ]=---=|
|=-----------------------------------------------------------------------=|
|=---------------=[ Ofir Arkin & Fyodor Yarochkin ]=---------------------=|


--[ICMP based fingerprinting approach]--

TCP based remote OS fingerprinting is quite old(*1) and well known these
days. Here we would like to introduce an alternative method of determining
an OS remotely, based on the ICMP responses received from the host. A
certain level of accuracy has been achieved with different platforms which,
for some systems or classes of platforms (e.g. Win*), is significantly
more precise than that demonstrated with TCP based fingerprinting methods.

Like the TCP based method mentioned above, ICMP fingerprinting utilizes
several tests to probe a remote OS TCP/IP stack, but unlike TCP
fingerprinting, the number of tests required to identify an OS can vary
from 1 to 4 (at the current development stage).

The ICMP fingerprinting method is based on discoveries of differences in
the ICMP replies of various operating systems (mostly due to incorrect or
inconsistent implementation), which were found by Ofir Arkin during his
"ICMP Usage in Scanning" research project. Later these discoveries were
summarised into a logical decision tree, which Ofir entitled "X project"
and implemented in practice in the 'Xprobe' tool.


--[Information/Noise ratio with ICMP fingerprints]--

As has been noted, the number of datagrams we need to send and receive in
order to remotely fingerprint a targeted machine with ICMP based probes is
small. Very small. In fact we can send one datagram and receive one reply,
and this will help us identify up to eight different operating systems (or
classes of operating systems). The maximum number of datagrams our tool
will use, at the current stage of development, is four. This is also the
number of replies we will need to analyse.
This makes ICMP based fingerprinting very time-efficient.

ICMP based probes can be crafted to be very stealthy. At the moment, no
malformed/broken/corrupted datagrams are used to identify the remote OS
type, unlike the common fingerprinting methods. The current core analysis
targets validation of the ICMP responses received to valid packets, rather
than crafting invalid packets. Heaps of such packets appear in an average
network on a daily basis, and very few IDS systems are tuned to detect such
traffic (and those which are, are presumably very noisy and badly
configured).


--[Why it still works?]--

The inherited mess among various TCP/IP stack implementations, with ICMP
handling that follows different RFC standards (the original RFC 792, the
additional RFC 1122, etc), partial or incomplete ICMP support (various
ICMP requests are not supported everywhere), the low significance of ICMP
Error message data (who verifies all the fields of the original
datagram?!), and mistakes and misunderstandings in ICMP protocol
implementation made our method viable.


--[What do we fingerprint:]--

Several OS-specific differences are utilized in ICMP based fingerprinting
to identify the remote operating system type:

IP fields of an 'offending' datagram to be examined:

  * IP total length field
    Some operating systems (e.g. the BSD family) will add 20 bytes
    (sizeof(ipheader)) to the original IP total length field (this occurs
    due to internal processing mistakes of the datagram; please note that
    when the same packet is read from SOCK_RAW the same behaviour is seen:
    the returned packet's ip_len field is off by 20 bytes). Some other
    operating systems will subtract 20 bytes from the original IP total
    length field value of the offending packet. A third group of systems
    will echo this field correctly.

  * IP ID
    Some systems are seen not to echo this field correctly
    (bit order of the field is changed).

  * 3 bits flags and offset
    Some systems are seen not to echo this field correctly
    (bit order of the field is changed).

  * IP header checksum
    Some operating systems will miscalculate this field, others just zero
    it out. A third group of systems echoes this field correctly.

  * UDP header checksum (in case of a UDP datagram)
    The same thing can happen with the UDP header checksum.

IP headers of responded ICMP packet:

  * Precedence bits
    Each IP datagram has an 8-bit field called the 'TOS Byte', which
    represents the IP support for prioritization and Type-of-Service
    handling.

    The 'TOS Byte' consists of three fields.

    The 'Precedence field' (RFC 791), which is 3 bits long, is intended to
    prioritize the IP datagram. It has eight levels of prioritization.
    Higher priority traffic should be sent before lower priority traffic.

    The second field, 4 bits long, is the 'Type-of-Service' field. It is
    intended to describe how the network should make tradeoffs between
    throughput, delay, reliability, and cost in routing an IP datagram.

    The last field, the 'MBZ' (must be zero), is 1 bit long. It is unused
    and must be zero. Routers and hosts ignore this last field.

    The TOS bits and MBZ fields are being replaced by the DiffServ
    mechanism for QoS.

    RFC 1812 requires the following for IP Version 4 routers:

    "4.3.2.5 TOS and Precedence

    ICMP Source Quench error messages, if sent at all, MUST have their IP
    Precedence field set to the same value as the IP Precedence field in
    the packet that provoked the sending of the ICMP Source Quench message.
    All other ICMP error messages (Destination Unreachable, Redirect, Time
    Exceeded, and Parameter Problem) SHOULD have their precedence value set
    to 6 (INTERNETWORK CONTROL) or 7 (NETWORK CONTROL). The IP Precedence
    value for these error messages MAY be settable".

    Linux kernels 2.0.x, 2.2.x and 2.4.x will act as routers and will set
    the Precedence bits field of their ICMP error messages to 0xc0 (that
    is, precedence 6 in the TOS byte). Networking devices that act the same
    way are Cisco routers based on IOS 11.x-12.x and Foundry Networks
    switches.
  * DF bits echoing
    Some TCP/IP stacks will echo the DF bit with ICMP Error datagrams,
    others (like Linux) will copy the whole octet, zeroing certain bits,
    and others will ignore this field and set their own.

  * IP ID field (Linux 2.4.0 - 2.4.4 kernels)
    Linux machines based on kernels 2.4.0-2.4.4 will set the IP
    Identification field of their ICMP query request and reply messages to
    a value of zero. This was later fixed in Linux kernels 2.4.5 and up.

  * IP TTL field (the TTL distance to the target has to be precalculated
    to guarantee accuracy)
    "The sender sets the time to live field to a value that represents the
    maximum time the datagram is allowed to travel on the Internet".

    The field value is decreased at each point where the IP header is
    processed. RFC 791 states that this decrement reflects the time spent
    processing the datagram. The field value is measured in units of
    seconds. The RFC also states that the maximum time to live value can be
    set to 255 seconds, which equals 4.25 minutes. The datagram must be
    discarded if this field value reaches zero before the datagram reaches
    its destination.

    Treating this field as a measure of time is a bit misleading. Some
    routers may process the datagram in less than a second, and some may
    take longer than a second.

    The real intention is to have an upper bound on the datagram lifetime,
    so that infinite loops of undelivered datagrams will not jam the
    Internet.

    Having a bound on the datagram lifetime also prevents old duplicates
    from arriving after a certain time has elapsed. So when we retransmit a
    piece of information which was not previously delivered, we can be
    assured that the older duplicate has already been discarded and will
    not interfere with the process.

    The IP TTL field with ICMP has two separate values, one for ICMP query
    messages and one for ICMP query replies.

    The IP TTL field value helps us identify certain operating systems and
    groups of operating systems.
    It also provides us with the simplest means of adding another check
    criterion when we are querying other host(s) or listening to traffic
    (sniffing). TTL-based fingerprinting requires the TTL distance to the
    target to be precalculated (unless a system on the local network is
    being fingerprinted).

    The ICMP Error messages will use the values used by ICMP query request
    messages.

    Good statistics on the dependency of TTL on OS type have been gathered
    at:
    http://www.switch.ch/docs/ttl_default.html (Research paper on default
    ttl values)

  * TOS field
    RFC 1349 defines the usage of the Type-of-Service field with ICMP
    messages. It distinguishes between ICMP error messages (Destination
    Unreachable, Source Quench, Redirect, Time Exceeded, and Parameter
    Problem), ICMP query messages (Echo, Router Solicitation, Timestamp,
    Information request, Address Mask request) and ICMP reply messages
    (Echo reply, Router Advertisement, Timestamp reply, Information reply,
    Address Mask reply).

    Simple rules are defined:

    * An ICMP error message is always sent with the default TOS (0000).

    * An ICMP request message may be sent with any value in the TOS field.
      "A mechanism to allow the user to specify the TOS value to be used
      would be a useful feature in many applications that generate ICMP
      request messages".

      The RFC further specifies that although ICMP request messages are
      normally sent with the default TOS, there are sometimes good reasons
      why they would be sent with some other TOS value.

    * An ICMP reply message is sent with the same value in the TOS field
      as was used in the corresponding ICMP request message.

    Some operating systems will ignore RFC 1349 when sending ICMP echo
    reply messages, and will not send the same value in the TOS field as
    was used in the corresponding ICMP request message.
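The IP header checks listed above (Precedence bits, DF bit, IP ID, TTL,
TOS) all come down to reading a handful of fixed-offset fields from the
first 20 bytes of the reply. The following is only a minimal sketch of
that step, not Xprobe's code: the function names parse_ipv4_header and
estimate_initial_ttl and the dictionary keys are our own assumptions, and
the sample packet is hand-built rather than captured from a real host.

```python
import struct


def parse_ipv4_header(packet: bytes) -> dict:
    """Pull out the IPv4 header fields that the checks above look at.

    Hypothetical helper: only the first 12 header bytes are decoded.
    """
    if len(packet) < 20:
        raise ValueError("need at least 20 bytes for an IPv4 header")
    (ver_ihl, tos, total_len, ip_id,
     flags_frag, ttl, proto, cksum) = struct.unpack("!BBHHHBBH", packet[:12])
    return {
        "tos_byte": tos,
        "precedence": tos >> 5,            # top 3 bits of the TOS byte
        "tos_bits": (tos >> 1) & 0x0F,     # 4-bit Type-of-Service field
        "mbz": tos & 0x01,                 # must-be-zero bit
        "total_length": total_len,
        "ip_id": ip_id,
        "df": bool(flags_frag & 0x4000),   # Don't Fragment flag
        "ttl": ttl,
        "protocol": proto,
        "checksum": cksum,
    }


def estimate_initial_ttl(observed_ttl: int, hop_distance: int) -> int:
    """Add the precalculated hop distance back onto the observed TTL."""
    return observed_ttl + hop_distance


if __name__ == "__main__":
    # Hand-built header: TOS byte 0xc0, IP ID 0, DF set, TTL 255, ICMP.
    sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0xC0, 56, 0, 0x4000,
                         255, 1, 0,
                         bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2]))
    print(parse_ipv4_header(sample))
```

A TOS byte of 0xc0 decodes here to precedence 6, which is the value RFC
1812 names INTERNETWORK CONTROL. The hop distance passed to
estimate_initial_ttl is assumed to have been measured separately.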
ICMP headers of responded ICMP packet:

  * ICMP Error Message Quoting Size
    All ICMP error messages consist of an IP header, an ICMP header and a
    certain amount of data from the original datagram which triggered the
    error (aka the offending datagram).

    According to RFC 792, only 64 bits (8 octets) of the original datagram
    are supposed to be included in the ICMP error message. However, RFC
    1122 (issued later) recommends that up to 576 octets be quoted.

    Most "older" TCP stack implementations will include 8 octets in the
    ICMP Error message. Linux, HPUX 11.x, Solaris, MacOS and others will
    include more.

    Notably interesting is the fact that the Solaris engineers probably
    could not read the RFC properly, since instead of 64 bits Solaris 2.x
    includes 64 octets (512 bits) of the original datagram.

  * ICMP error message echoing integrity
    Another artifact which has been noticed is that some stack
    implementations, when sending back an ICMP error message, may alter
    the offending packet's IP header and the underlying protocol data,
    which is echoed back with the ICMP error message.

    Since the mistakes made by TCP/IP stack programmers are different and
    specific to an operating system, an analysis of these mistakes could
    give a potential attacker the possibility of making assumptions about
    the target operating system type.

Additional tweaks and twists:

  * Using non-zero code fields in ICMP echo requests
    When an ICMP code field value other than zero (0) is sent with an ICMP
    Echo request message (type 8), hosts running one of the Microsoft
    based operating systems will answer our query with an ICMP Echo Reply
    message whose ICMP code field is set to zero. Other operating systems
    (and networking devices) will echo back the ICMP code field value we
    used in the ICMP Echo Request.
    The Microsoft based operating systems act in contrast to the RFC 792
    guidelines, which instruct the answering operating system only to
    change the ICMP type to Echo reply (type 0), recalculate the checksums
    and send the ICMP Echo reply away.

  * Using DF bit echoing with ICMP query messages
    As in the case of ICMP Error messages, some TCP/IP stacks will respond
    to these queries, while others will not.

  * Other ICMP messages:

    * ICMP timestamp request
    * ICMP Information request
    * ICMP Address mask request

    Some TCP/IP stacks support these messages and respond to some of these
    requests.


--[Xprobe implementation]--

Currently Xprobe deploys a hardcoded logic tree, developed by Ofir Arkin in
'Project X'. Initially a UDP datagram is sent to a closed port in order to
trigger an ICMP Error message: ICMP unreachable/port unreach. (This imposes
the limitation that the target system must have at least one unfiltered
port with no service running. Generally speaking, other methods of
triggering an ICMP unreach packet could be used; this will be discussed
further.) Moreover, a few tests (ICMP unreach content, DF bits, TOS ...)
can be combined within a single query, since they do not affect each
other's results.

Upon receipt of the ICMP unreachable datagram, the contents of the received
datagram are examined and a diagnostic decision is made. If any further
tests are required according to the logic tree, further queries are sent.


--[Logic tree]--

Quickly recapping the logic tree organization:

Initially all TCP/IP stack implementations are split into 2 groups: those
which echo the precedence bits back, and those which do not. Those which do
echo the precedence bits (Linux 2.0.x, 2.2.x, 2.4.x, Cisco IOS 11.x-12.x,
Extreme Network Switches etc) are differentiated further based on ICMP
error quoting size. (Linux sticks with RFC 1122 here and echoes up to 576
octets, while others in this subgroup echo only 64 bits (8 octets)).
Further echo integrity checks are used to differentiate Cisco routers from
Extreme Network switches.

The time-to-live and IP ID fields of the ICMP echo reply are used to
recognize the version of the Linux kernel.

The same approach is used to recognize other TCP/IP stacks: data echoing
validation (number of octets of the original datagram echoed, checksum
validation, etc). If additional information is needed to tell apart two
'similar' IP stacks, an additional query is sent.

(Please refer to the diagram at
http://www.sys-security.com/html/projects/X.html
for a more detailed explanation/graphical representation of the logic
tree.)

One of the serious problems with the logic tree is that adding new
operating system types to it becomes extremely painful. At times part of
the whole logic tree has to be reworked to 'fit' a single description.
Therefore a signature based fingerprinting method attracted our closer
attention.


--[Signature based approach]--

The signature based approach is what we are currently focusing on, and
which we believe will become a more stable, reliable and flexible method of
remote ICMP based fingerprinting.

The signature based method currently relies on five different tests, each
of which may optionally be included in an operating system fingerprint.

Initially the systems whose fingerprints contain the fewest tests are
examined (normally starting with the ICMP unreach test). If no single OS
stack is found to match the received signature, the stacks which match part
of it are grouped again, and another test (chosen on the principle of
issuing the fewest tests) is selected and executed. This verification is
repeated until an OS stack completely matching the signature is found, or
we run out of tests.
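The narrowing loop described above can be sketched roughly as follows.
This is our own reading of the description, not the Xprobe implementation:
the function identify, the fixed test ordering, the result strings and the
stack names (stack_a, stack_b, stack_c) are all hypothetical placeholders,
and probes are replaced by a lookup in a canned dictionary so the example
runs without touching the network. Choosing the next test dynamically, so
that the fewest tests are issued, is simplified here to a fixed order that
starts with the unreach test.

```python
from typing import Callable, Dict, List

# One fingerprint: test name -> expected result. Tests are optional,
# so different stacks may list different subsets of the five tests.
Signature = Dict[str, str]

TEST_ORDER = ["unreach", "echo", "timestamp", "info", "addrmask"]


def identify(db: Dict[str, Signature],
             run_test: Callable[[str], str],
             test_order: List[str] = TEST_ORDER) -> List[str]:
    """Return the stacks that match; stop as soon as one matches fully."""
    candidates = set(db)
    observed: Dict[str, str] = {}
    for test in test_order:
        # Do not send a probe that no remaining candidate lists.
        if not any(test in db[name] for name in candidates):
            continue
        observed[test] = run_test(test)
        # Drop every stack whose fingerprint contradicts the new result.
        candidates = {name for name in candidates
                      if db[name].get(test, observed[test]) == observed[test]}
        # A stack matches completely once all of its tests were observed.
        complete = sorted(name for name in candidates
                          if all(t in observed for t in db[name]))
        if complete:
            return complete
        if not candidates:
            return []
    return sorted(candidates)  # only partial matches are left


if __name__ == "__main__":
    # Hypothetical fingerprint database, for illustration only.
    db = {
        "stack_a": {"unreach": "quote_8_octets"},
        "stack_b": {"unreach": "quote_more", "echo": "code_echoed"},
        "stack_c": {"unreach": "quote_more", "echo": "code_zeroed"},
    }
    replies = {"unreach": "quote_more", "echo": "code_zeroed"}
    print(identify(db, replies.__getitem__))
```

With the canned replies shown, the unreach result rules out stack_a and
the echo result then separates stack_b from stack_c, so two probes are
enough. A target answering like stack_a would be identified after a
single probe.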
Currently the following tests are deployed:

  * ICMP unreachable test (UDP closed port based, host unreachable,
    network unreachable (for systems which are believed to be gateways))
  * ICMP echo request/reply test
  * ICMP timestamp request
  * ICMP information request
  * ICMP address mask request


--[future implementations/development]--

The following items are planned (we always welcome discussions/suggestions
though):

  * Fingerprints database (currently being tested)
  * Dynamic, AI based logic (long-term project :))
  * Tests will depend heavily on network topology (pre-test network
    mapping will take place).
  * Path-to-target test (to calculate the hop distance to the target),
    filtering device probes.
  * Future implementations will use packets with actual application data
    to reduce the chances of being detected.
  * Other network mapping capabilities shall be included (network role
    identification, search for a closed UDP port, reachability tests, etc).


--[code for kids]--

Currently implemented code and further documentation are available at the
following locations:

http://www.sys-security.com/html/projects/X.html
http://xprobe.sourceforge.net
http://www.notlsd.net/xprobe/

Ofir Arkin
Fyodor Yarochkin

|=[ EOF ]=---------------------------------------------------------------=|