Page History
- 2012-Nov-13 16:18 adesmet
- 2011-Apr-12 14:23 m
- 2010-Sep-08 11:25 adesmet
- 2010-Aug-17 14:53 adesmet
- 2010-Apr-27 21:43 m
- 2010-Apr-07 06:02 m
- 2010-Mar-29 10:27 m
- 2010-Jan-28 18:46 m
- 2010-Jan-28 18:46 m
- 2010-Jan-26 18:35 m
- 2010-Jan-25 17:20 m
- 2010-Jan-25 17:19 m
- 2010-Jan-25 17:09 m
- 2010-Jan-25 17:06 m
- 2010-Jan-24 21:07 m
- 2010-Jan-24 21:04 m
- 2010-Jan-24 21:03 m
- 2010-Jan-24 20:52 m
- 2010-Jan-24 19:36 m
- 2010-Jan-24 19:07 m
- 2010-Jan-24 18:58 m
- 2010-Jan-22 17:07 m
- 2010-Jan-22 17:06 m
- 2010-Jan-19 10:40 m
- 2010-Jan-19 10:36 m
- 2010-Jan-18 11:05 m
- 2010-Jan-18 11:05 m
- 2010-Jan-12 16:15 m
- 2010-Jan-12 15:52 m
- 2010-Jan-12 15:43 m
- 2010-Jan-12 15:42 m
- 2010-Jan-12 15:34 m
- 2010-Jan-12 15:18 m
- 2010-Jan-12 15:11 m
- 2010-Jan-12 15:10 m
- 2010-Jan-12 10:36 m
- 2010-Jan-12 10:36 m
- 2009-Dec-23 16:34 m
- 2009-Dec-23 16:32 m
- 2009-Dec-23 16:25 m
- 2009-Dec-23 16:23 m
- 2009-Dec-23 16:22 m
- 2009-Dec-23 14:47 m
- 2009-Dec-23 14:47 m
- 2009-Dec-23 14:36 m
- 2009-Dec-23 11:33 matt
- 2009-Dec-21 18:21 m
- 2009-Dec-21 18:21 m
- 2009-Dec-21 18:17 m
- 2009-Dec-21 18:14 m
- 2009-Dec-21 18:13 m
- 2009-Dec-21 04:29 m
- 2009-Dec-21 04:27 m
- 2009-Dec-21 04:17 m
- 2009-Dec-21 04:14 m
- 2009-Dec-21 04:08 m
- 2009-Dec-21 04:05 m
- 2009-Dec-21 04:03 m
- 2009-Dec-21 04:01 m
- 2009-Dec-21 03:54 m
- 2009-Dec-21 03:42 m
- 2009-Dec-21 03:35 m
- 2009-Dec-18 17:14 m
- 2009-Dec-11 16:09 m
- 2009-Dec-11 16:07 m
- 2009-Dec-11 16:06 m
- 2009-Dec-11 16:05 m
- 2009-Dec-11 16:04 m
- 2009-Dec-11 16:01 m
- 2009-Dec-11 15:34 m
- 2009-Dec-08 04:54 m
- 2009-Dec-08 04:53 m
- 2009-Dec-08 04:50 m
- 2009-Dec-08 04:48 m
- 2009-Dec-08 04:36 m
- 2009-Dec-08 04:36 m
- 2009-Dec-08 04:35 m
- 2009-Dec-08 04:34 m
- 2009-Dec-08 04:33 m
- 2009-Dec-08 04:26 m
- 2009-Dec-08 04:24 m
- 2009-Dec-08 04:23 m
- 2009-Dec-08 04:23 m
- 2009-Dec-08 04:17 m
- 2009-Dec-08 04:11 m
- 2009-Dec-08 04:08 m
- 2009-Dec-08 04:03 m
- 2009-Dec-04 17:24 m
- 2009-Dec-04 17:17 m
- 2009-Dec-04 17:15 m
- 2009-Dec-04 17:04 m
What should be changed?
All the deprecated BSD socket functions. These are complete list. (from Userlevel IPv6 Programming Introduction)
numbers in right denotes lines of code that has the keyword. struct sockaddr_in : 237 gethostbyname : 165 gethostbyname2 : 0 getservbyname : 5 gethostbyaddr : 23 getservbyport : 0 inet_addr : 13 inet_aton : 7 inet_nsap_addr : 0 inet_ntoa : 114 inet_nsap_ntoa : 0 inet_makeaddr : 0 inet_netof : 0 inet_network : 0 inet_neta : 0 inet_net_netop : 0 inet_net_pton : 0 rcmd : 10 rexec : 2 rrsevport : 0 total 576 lines of code *must* be changed.
IP address is no longer fixed 4 bytes nor fixed 16 bytes. It can be both of them. So, every storage class should be changed. For some of source codes that use 'int' as storage for IP address, this is most troublesome because it could be hidden to simple text search.
Reading and parsing IP address from config file should be changed as well. If an existing code base entirely relied on BSD socket interface such as inet_aton or inet_addr, it would be easier. However, if the code has proprietary parsing/converting functions, every incident should be found and changed.
Zach mentioned that security and authentication code mangled with IPv4.(IpVerifier)
IPv6 address string is far longer than IPv4 address. IPv6 address [0001:0002:0003:0004:0005:0006:0007:0008], length is 41. IPv4 address 101.202.103.203, length is 15. About 3 times longer.
Printing IP address to log file should be considered since existing layout may not work well.
Current Condor Code Base
As noted earlier, Condor source has 576 lines that uses obsolete BSD functions.
What makes problem complicated is that unsigned long is used to hold IP address. There are many places to change. For example, in sock.h,
unsigned int Sock::get_ip_int() unsigned int Sock::peer_ip_int()
class Sock already has abstraction of BSD socket interface. But, IP address escapes from this class by returning it as an 4 byte int and sockaddr_in. It is required to check whether there are codes that directly calls BSD socket interface.
The method of Attack
Having single IP address class that deals both IPv4 and IPv6. So, nobody uses sockaddr, sockaddr_in, or plain int to represent IP address.
Extend existing Sock class to be able to deal with both IPv4 and IPv6.
Extend sinful string handler to be able to deal with both IPv4 and IPv6 address.
Here is Boost IP address class. In that class, they hold both IPv4 address and IPv6 address. But I think we only need one 16 byte array instead of having both. Also, we may not need to have ipv4/v6 flag because there is IPv4 address mapping in IPv6. [Zach commented that although we do not need separate storage for each IPv4 and IPv6 address, we still need a type because when we tries to connect to other machines, we do not know whether it should be done through IPv4 network or IPv6 network]
Change every IP address storage to that IP address class.
It is not decided whether to have host byte order or network byte order. sockaddr_in always have network byte order in it. However, the class itself should be convertible to sockaddr_in or sockaddr_in6. However, byte-level compatibility may not be required. So, we need something like IPaddress::to_sockaddr_in6() function.
Unified Address Class
I juggled many design possibilities. I came to the conclusion that unified address class should contain sockaddr_in6. Here is how sockaddr_in6 look like
// code from <linux/in6.h> struct sockaddr_in6 { unsigned short int sin6_family; /* AF_INET6 */ __u16 sin6_port; /* Transport layer port # */ __u32 sin6_flowinfo; /* IPv6 flow information */ struct in6_addr sin6_addr; /* IPv6 address */ __u32 sin6_scope_id; /* scope id (new in RFC2553) */ };
Notable difference from sockaddr_in(IPv4 address/port storage) is that it has scope id. Scope id denotes a ethernet interface in a system. Scope id binds IP address shown in sockaddr_in6 to specific ethernet interface. That binding is necessary because some IPv6 addresses are limited to some ethernet interface. Google about 'link local address' would be helpful.
Tentative unified address class is as follows
class IpAddr { sockaddr_in6 storage; unsigned char* get_addr() { // field name of in6_addr is different from platform to platform return (unsigned char*)&storage.sin6_addr; } unsigned int& get_addr_header() { return *(unsigned int*)get_addr(); } unsigned int& get_v4_addr() { return *(unsigned int*)(get_addr()+12); } public: IpAddr() { memset(&storage, 0, sizeof(storage)); } IpAddr(int ip, unsigned port = 0) : IpAddr() { storage.sin6_family = AF_INET; storage.sin6_port = htons(port); // IPv4-mapped region get_addr_header() = 0xffff0000; get_v4_addr() = (unsigned int)ip; } IpAddr(in_addr ip, unsigned port = 0) : IpAddr(ip.S_un.S_addr, port) { } IpAddr(sockaddr_in* sin) : IpAddr(sin->sin_addr, sin->sin_port) { } IpAddr(sockaddr_in6* sin6) { storage = *sin6; } sockaddr_in to_sin() { sockaddr_in ret; memset(&ret, 0, sizeof(ret)); if (storage.sin_family != AF_INET) { // error.. return ret; } ret.sin_family = AF_INET; ret.sin_port = storage.sin6_port; ret.sin_addr.S_un.S_addr = get_v4_addr(); } };
As you see here, sockaddr_in6 is super-set of sockaddr_in. So, this implementation considers IPv6 as basis and providing compatibility to IPv4 usage.
There should be issue when a compiler does not have IPv6 related header files. In that case, we may supply definition of sockaddr_in6. Here only dependency is to AF_INET6 constant and struct sockaddr_in6.
In IPv6, you cannot simply have 16byte IP address portion. You should have scope id as well. Separating { IP address, scope id } and port number could be possible but it will only make source codes complex. Some might argue that in this way, we could save some memory foot-print but I might say that considering size of sockaddr_in6, it is no big deal.
In bottom line, I recommend you to have class IpAddr (which is basically sockaddr_in6 with helper functions) even you only want IP address.
Converting obsolete BSD interface
Since we have few hundreds lines of code, I think I can change it by manually. But scary thing is how to test all of them. Do you have any idea? I am desperate.
Changing every lines which dealing with IP address
My plan is to use compiler generated error. This plan has some analogy to 'escape analysis'. Consider "old" IP address structure is escaping from socket interfaces to every leaf of source codes. If we disallow from the top, there will be no IPv4 address flowing to the leaf. Here is flow.
- Find every BSD socket functions that returns IP address or accepts IP address argument
- Make proxy function for all of them.
- Change every lines where calls old functions to proxy functions. Definitely, proxy functions should allow only new class "IpAddr" instead of sockaddr, and sockaddr_in.
- You will have tons of error message.
- Resolving this all error message with class IpAddr.
- You have the code base that compatible with IPv6 sockets.
How proxy socket interface look like?
My initial plan is to place prefix "condor_". inet_ntop becomes condor_inet_ntop. connect becomes condor_connect.
I do not think we need any class that encapsulating those BSD socket interfaces. The reason is that I do not change any semantics of BSD socket interface but only syntax.
Another reason is that BSD socket interface is kinda universal and well formed. For application socket classes, we already have Sock, ReliSock, and so on. Any abstraction other than proxy between application socket classes and BSD socket interface? I did not find any appealing reason. If you have any idea, please let me know.
Todo
Manageable daily work-list.
- discuss with staffs about new sinful string format. <1.2.3.4:80> will not work! encapsulate ipv6 address by [] ? like <[1234:abcd:1234::5678]:5000> ??
- remove every obsolete interface.
- change every incidents where use sockaddr,sockaddr_in, and int.
- change IpVerify not to depend on IPv4 address
Done
- investigate ipv6 address handling in C++ Boost library
- investigate security and authentication code that deals IPv4 address directly. IP verifier?
- investigate every source code where calls obsolete BSD socket interface.
- make a git branch
- IpAddr class, unified address class
Note
- In a server, we want to accept both IPv4 and IPv6 connections in single program(process). There are two choices with dealing IPv4 and IPv6 co-existence.
- We may create only one IPv6 socket. Then, OS will fake a server program with IPv6-IPv4 mapped region. ffff::1.2.3.4 is really IPv4 host and came through IPv4 router and everything.
- We may create two sockets, IPv4 and IPv6. Call non-blocking accept to both of them. And poll() or select() or kqueue() on both sockets. In current Linux implementation, I think we do not need this kind of complexity. Linux kernels deal smoothly with faked IPv6 address. I expect it will be true for commercial Unix like AIX and Solaris.
- internet.h seems to have a lot of IPv6 incompatible functions.
- why do you call <a.b.c.d:pppp> as sinful string? what does it mean 'sinful'? - Now that I know.
This is Condor specific term. We need to extend definition to include IPv6 address. For example, <[a:b:c:d:e:f:g:h]:pppp> would work.
- do we really need unified ip address class? IPv6 address space contains IPv4 and there is 1-to-1 mapping.
Current answer is no. But, we may need a 'type' flag.
- Do you have any good arguments for having network-byte-order IP address instead of having host-byte-order IP address?
- check point server and syscall library. will they eventually replaced by DMTCP? they contains IPv4 functions but maybe I don't need to care about them.
*** currently, this is just temporary page