Page History

Turn Off History

What should be changed?

All the deprecated BSD socket functions. These are complete list. (from Userlevel IPv6 Programming Introduction)


numbers in right denotes lines of code that has the keyword.

struct sockaddr_in : 237
gethostbyname      : 165
gethostbyname2     : 0
getservbyname      : 5
gethostbyaddr      : 23
getservbyport      : 0
inet_addr          : 13
inet_aton          : 7
inet_nsap_addr     : 0
inet_ntoa          : 114
inet_nsap_ntoa     : 0
inet_makeaddr      : 0
inet_netof         : 0
inet_network       : 0
inet_neta          : 0
inet_net_netop     : 0
inet_net_pton      : 0
rcmd               : 10
rexec              : 2
rrsevport          : 0

total 576 lines of code *must* be changed.

IP address is no longer fixed 4 bytes nor fixed 16 bytes. It can be both of them. So, every storage class should be changed. For some of source codes that use 'int' as storage for IP address, this is most troublesome because it could be hidden to simple text search.

Reading and parsing IP address from config file should be changed as well. If an existing code base entirely relied on BSD socket interface such as inet_aton or inet_addr, it would be easier. However, if the code has proprietary parsing/converting functions, every incident should be found and changed.

Zach mentioned that security and authentication code mangled with IPv4.(IpVerifier)

IPv6 address string is far longer than IPv4 address. IPv6 address [0001:0002:0003:0004:0005:0006:0007:0008], length is 41. IPv4 address 101.202.103.203, length is 15. About 3 times longer.

Printing IP address to log file should be considered since existing layout may not work well.

Current Condor Code Base

As noted earlier, Condor source has 576 lines that uses obsolete BSD functions.

What makes problem complicated is that unsigned long is used to hold IP address. There are many places to change. For example, in sock.h,

unsigned int Sock::get_ip_int()
unsigned int Sock::peer_ip_int()
It uses 'unsigned int' to pass IP address.

class Sock already has abstraction of BSD socket interface. But, IP address escapes from this class by returning it as an 4 byte int and sockaddr_in. It is required to check whether there are codes that directly calls BSD socket interface.

The method of Attack

Having single IP address class that deals both IPv4 and IPv6. So, nobody uses sockaddr, sockaddr_in, or plain int to represent IP address.

Extend existing Sock class to be able to deal with both IPv4 and IPv6.

Extend sinful string handler to be able to deal with both IPv4 and IPv6 address.

Here is Boost IP address class. In that class, they hold both IPv4 address and IPv6 address. But I think we only need one 16 byte array instead of having both. Also, we may not need to have ipv4/v6 flag because there is IPv4 address mapping in IPv6. [Zach commented that although we do not need separate storage for each IPv4 and IPv6 address, we still need a type because when we tries to connect to other machines, we do not know whether it should be done through IPv4 network or IPv6 network]

Change every IP address storage to that IP address class.

It is not decided whether to have host byte order or network byte order. sockaddr_in always have network byte order in it. However, the class itself should be convertible to sockaddr_in or sockaddr_in6. However, byte-level compatibility may not be required. So, we need something like IPaddress::to_sockaddr_in6() function.

Unified Address Class

I juggled many design possibilities. I came to the conclusion that unified address class should contain sockaddr_in6. Here is how sockaddr_in6 look like

// code from <linux/in6.h>

struct sockaddr_in6 {
    unsigned short int  sin6_family;    /* AF_INET6 */
    __u16           sin6_port;      /* Transport layer port # */
    __u32           sin6_flowinfo;  /* IPv6 flow information */
    struct in6_addr     sin6_addr;      /* IPv6 address */
    __u32           sin6_scope_id;  /* scope id (new in RFC2553) */
};

Notable difference from sockaddr_in(IPv4 address/port storage) is that it has scope id. Scope id denotes a ethernet interface in a system. Scope id binds IP address shown in sockaddr_in6 to specific ethernet interface. That binding is necessary because some IPv6 addresses are limited to some ethernet interface. Google about 'link local address' would be helpful.

Tentative unified address class is as follows

class IpAddr
{
    sockaddr_in6 storage;

    unsigned char* get_addr() {
        // field name of in6_addr is different from platform to platform
        return (unsigned char*)&storage.sin6_addr;
    }

    unsigned int& get_addr_header() { return *(unsigned int*)get_addr(); }
    unsigned int& get_v4_addr() { return *(unsigned int*)(get_addr()+12); }

public:
    IpAddr()
    {
        memset(&storage, 0, sizeof(storage));
    }

    IpAddr(int ip, unsigned port = 0) : IpAddr()
    {
        storage.sin6_family = AF_INET;
        storage.sin6_port = htons(port);
        // IPv4-mapped region
        get_addr_header() = 0xffff0000;
        get_v4_addr() = (unsigned int)ip;
    }

    IpAddr(in_addr ip, unsigned port = 0) : IpAddr(ip.S_un.S_addr, port)
    {
    }

    IpAddr(sockaddr_in* sin) : IpAddr(sin->sin_addr, sin->sin_port)
    {
    }

    IpAddr(sockaddr_in6* sin6)
    {
        storage = *sin6;
    }

    sockaddr_in to_sin()
    {
        sockaddr_in ret;
        memset(&ret, 0, sizeof(ret));
        if (storage.sin_family != AF_INET) {
            // error..
            return ret;
        }

        ret.sin_family = AF_INET;
        ret.sin_port = storage.sin6_port;
        ret.sin_addr.S_un.S_addr = get_v4_addr();
    }

};

As you see here, sockaddr_in6 is super-set of sockaddr_in. So, this implementation considers IPv6 as basis and providing compatibility to IPv4 usage.

There should be issue when a compiler does not have IPv6 related header files. In that case, we may supply definition of sockaddr_in6. Here only dependency is to AF_INET6 constant and struct sockaddr_in6.

In IPv6, you cannot simply have 16byte IP address portion. You should have scope id as well. Separating { IP address, scope id } and port number could be possible but it will only make source codes complex. Some might argue that in this way, we could save some memory foot-print but I might say that considering size of sockaddr_in6, it is no big deal.

In bottom line, I recommend you to have class IpAddr (which is basically sockaddr_in6 with helper functions) even you only want IP address.

Converting obsolete BSD interface

Since we have few hundreds lines of code, I think I can change it by manually. But scary thing is how to test all of them. Do you have any idea? I am desperate.

Changing every lines which dealing with IP address

My plan is to use compiler generated error. This plan has some analogy to 'escape analysis'. Consider "old" IP address structure is escaping from socket interfaces to every leaf of source codes. If we disallow from the top, there will be no IPv4 address flowing to the leaf. Here is flow.

  1. Find every BSD socket functions that returns IP address or accepts IP address argument
  2. Make proxy function for all of them.
  3. Change every lines where calls old functions to proxy functions. Definitely, proxy functions should allow only new class "IpAddr" instead of sockaddr, and sockaddr_in.
  4. You will have tons of error message.
  5. Resolving this all error message with class IpAddr.
  6. You have the code base that compatible with IPv6 sockets.

How proxy socket interface look like?

My initial plan is to place prefix "condor_". inet_ntop becomes condor_inet_ntop. connect becomes condor_connect.

I do not think we need any class that encapsulating those BSD socket interfaces. The reason is that I do not change any semantics of BSD socket interface but only syntax. Or, if you suggest just a c++ namespace, I would certainly accept that.

Also, I am skeptical to another socket interface layer since we already have socks, reli_sock, and so on. BSD socket interface is kinda universal since it even ported to Playstation Portable programming interface.

Todo

Manageable daily work-list.

  1. discuss with staffs about new sinful string format. <1.2.3.4:80> will not work! encapsulate ipv6 address by [] ? like <[1234:abcd:1234::5678]:5000> ??
  2. remove every obsolete interface.
  3. change every incidents where use sockaddr,sockaddr_in, and int.
  4. change IpVerify not to depend on IPv4 address

Done

Note

This is Condor specific term. We need to extend definition to include IPv6 address. For example, <[a:b:c:d:e:f:g:h]:pppp> would work.

Current answer is no. But, we may need a 'type' flag.

*** currently, this is just temporary page