Todo

More testing in IPv6 environments. Test IPv6 unicast addresses.

Clean up the DNS functions in ipv6_hostname.h/cpp.

Phase 3

What has been done:

condor_netaddr, which represents a network (e.g. 128.105.0.0/24), was added. NetStringList now uses condor_netaddr for matching.
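To make the network-matching idea concrete, here is a minimal sketch of prefix matching. It is not the real condor_netaddr: the Addr and NetAddr structs and the parse_addr(), parse_net(), and net_match() helpers are hypothetical names made up for this note.

```cpp
#include <arpa/inet.h>
#include <sys/socket.h>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-ins, not the actual HTCondor classes.
struct Addr { int family; unsigned char bytes[16]; };
struct NetAddr { Addr base; unsigned prefix; };

// Parse an IPv4 or IPv6 literal; the family is guessed from the presence of ':'.
bool parse_addr(const std::string& text, Addr& out) {
    std::memset(&out, 0, sizeof(out));
    out.family = (text.find(':') == std::string::npos) ? AF_INET : AF_INET6;
    return inet_pton(out.family, text.c_str(), out.bytes) == 1;
}

// Parse a network written as "address/prefix-length", e.g. "128.105.0.0/24".
bool parse_net(const std::string& text, NetAddr& net) {
    const std::string::size_type slash = text.find('/');
    if (slash == std::string::npos) return false;
    if (!parse_addr(text.substr(0, slash), net.base)) return false;
    const char* num = text.c_str() + slash + 1;
    char* end = nullptr;
    const long bits = std::strtol(num, &end, 10);
    const long max_bits = (net.base.family == AF_INET) ? 32 : 128;
    if (end == num || *end != '\0' || bits < 0 || bits > max_bits) return false;
    net.prefix = static_cast<unsigned>(bits);
    return true;
}

// True if the first `prefix` bits of addr equal those of the network base.
bool net_match(const NetAddr& net, const Addr& addr) {
    if (net.base.family != addr.family) return false;
    const unsigned full = net.prefix / 8;   // whole bytes to compare
    const unsigned rest = net.prefix % 8;   // leftover bits in the next byte
    if (std::memcmp(net.base.bytes, addr.bytes, full) != 0) return false;
    if (rest == 0) return true;
    const unsigned char mask = static_cast<unsigned char>(0xff << (8 - rest));
    return (net.base.bytes[full] & mask) == (addr.bytes[full] & mask);
}
```

With these helpers, 128.105.0.5 matches 128.105.0.0/24 and 128.105.1.5 does not.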

condor_gethostname() was reimplemented. It is better to call get_local_hostname() instead of condor_gethostname(). We still need to decide what to do for the standard universe.

convert_ip_to_hostname() in condor_netdb.cpp implicitly accepted an in_addr as a const char*. It was removed.

Phase 3 is going to have protocol divergence. Safe_sock has a message that contains the IP address as a 4-byte integer.

ipv6_hostname is getting bigger and bigger. I feel it is accumulating duplication, but I am not sure what to do about it. It has 4-5 variants of the get_hostname feature, while it is merely a wrapper around getaddrinfo().

ckpt_server has its own network implementation stack. Do I need to overhaul it?

Yes, I'm working on it.

Still remaining

NetworkDeviceInfo

same_host()

IPv6 Mode

Phase 3 is going to have an IPv6 mode. IPv6 mode basically makes HTCondor prefer IPv6 addresses.

With IPv6 mode turned on, HTCondor creates every socket as an IPv6 socket (AF_INET6).

However, it is still possible to interact with an IPv4 HTCondor, and it is also possible to bind to an IPv4 address!

If the config allows any network interface to be used, HTCondor uses the interface with an IPv4 address to connect to an IPv4 host, and the interface with an IPv6 address to connect to an IPv6 host.

If the config requires an IPv4 address, HTCondor binds to the IPv4 address using an IPv6 socket. (The IPv4 address has to be converted to an IPv4-mapped IPv6 address, though.) In this case, no IPv6 connection is allowed.

Thus, IPv6 mode is harmless in any case (in theory).
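The IPv4-to-IPv4-mapped conversion mentioned above can be sketched as follows. make_v4mapped() is a hypothetical helper name, not an existing HTCondor function; it only builds the socket address and does not call bind().

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <cstdint>
#include <cstring>

// Build an AF_INET6 socket address that refers to an IPv4 endpoint,
// using the IPv4-mapped form ::ffff:a.b.c.d.
sockaddr_in6 make_v4mapped(const in_addr& v4, uint16_t port_host_order) {
    sockaddr_in6 sa;
    std::memset(&sa, 0, sizeof(sa));
    sa.sin6_family = AF_INET6;
    sa.sin6_port = htons(port_host_order);
    sa.sin6_addr.s6_addr[10] = 0xff;                         // bytes 0..9 stay zero
    sa.sin6_addr.s6_addr[11] = 0xff;
    std::memcpy(&sa.sin6_addr.s6_addr[12], &v4.s_addr, 4);   // already network order
    return sa;
}
```

The result could be passed to bind() or connect() on an AF_INET6 socket, assuming the platform accepts IPv4-mapped addresses on that socket (i.e. IPV6_V6ONLY is off).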

IPv6 link-local address

IPv6 link-local addresses are currently desperately needed, as the CS dept does not have global unicast IPv6 addresses.

A link-local address is assigned automatically to a network interface, even when no other valid address is configured on it.

One interesting thing about a link-local address is that its textual form contains the interface name. For example, fe80::862b:2bff:fe98:65f2%eth0. The interface name is required because the address is only valid on that interface.

In HTCondor IPv6 testing, I'd like to use link-local IPv6 addresses. However, one caveat is that the interface name differs from system to system. Thus, we cannot use the link-local address together with the interface name to identify machines.

Instead, each HTCondor system that wants to use a link-local address has to specify which network interface to use. NETWORK_INTERFACE is required in condor_config.

Each link-local address (which can be identified by looking at the address) will be bound to the pre-configured NETWORK_INTERFACE.
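A rough sketch of that binding step, assuming the configured NETWORK_INTERFACE value is available as an interface name such as "eth0". make_sockaddr() is a hypothetical helper, not HTCondor code; it uses the standard IN6_IS_ADDR_LINKLOCAL macro and if_nametoindex().

```cpp
#include <arpa/inet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <cstring>
#include <string>

// Fill `sa` from an IPv6 literal that carries no "%ifname" suffix.
// Link-local addresses get the scope id of the configured interface;
// other addresses are left with scope id 0.
bool make_sockaddr(const std::string& ip, const std::string& configured_if,
                   sockaddr_in6& sa) {
    std::memset(&sa, 0, sizeof(sa));
    sa.sin6_family = AF_INET6;
    if (inet_pton(AF_INET6, ip.c_str(), &sa.sin6_addr) != 1) return false;
    if (IN6_IS_ADDR_LINKLOCAL(&sa.sin6_addr)) {
        const unsigned idx = if_nametoindex(configured_if.c_str());
        if (idx == 0) return false;   // unknown interface name
        sa.sin6_scope_id = idx;
    }
    return true;
}
```

This way the address exchanged between machines stays free of the interface name, and each machine attaches its own local scope.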

NODNS and IPv6

In NODNS mode, the hostname is derived from the machine's IP address and DEFAULT_DOMAIN_NAME from condor_config. For example, if the IP address of the machine is 127.0.0.1 and DEFAULT_DOMAIN_NAME is nodns.com, the hostname is 127-0-0-1.nodns.com.

IPv6 addresses follow the same scheme. Each ':' character in the IPv6 address is changed to '-'. For example, if the IP address is fe80::862b:2bff:fe98:65f2 and DEFAULT_DOMAIN_NAME is nodns.com, the hostname is fe80--862b-2bff-fe98-65f2.nodns.com. (Notice that the '::' used for zero compaction becomes '--'.)

When converting a NODNS hostname back to an IP address, we follow these criteria:

1) If there are 7 dashes in the hostname, it is an IPv6 address.
2) If there is a '--' in the hostname, it is an IPv6 address.

A valid IPv6 address always contains either 7 colons or one '::' (which compacts multiple zeroes), while an IPv4 address yields only 3 dashes. Thus, we can determine the type of address without confusion.
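The conversion in both directions can be sketched like this. nodns_hostname(), nodns_to_ip(), and AddrKind are hypothetical names for illustration; the sketch assumes dashes are counted only in the first label, so that dashes in the domain part do not confuse the check.

```cpp
#include <algorithm>
#include <string>

enum class AddrKind { IPv4, IPv6 };

// IP literal -> NODNS hostname: '.' and ':' both become '-'.
std::string nodns_hostname(std::string ip, const std::string& domain) {
    std::replace(ip.begin(), ip.end(), ':', '-');
    std::replace(ip.begin(), ip.end(), '.', '-');
    return ip + "." + domain;
}

// NODNS hostname -> IP literal, applying the two criteria above
// to the first label (everything before the first '.').
bool nodns_to_ip(const std::string& hostname, std::string& ip, AddrKind& kind) {
    std::string label = hostname.substr(0, hostname.find('.'));
    if (label.empty()) return false;
    const auto dashes = std::count(label.begin(), label.end(), '-');
    const bool v6 = (dashes == 7) || (label.find("--") != std::string::npos);
    kind = v6 ? AddrKind::IPv6 : AddrKind::IPv4;
    std::replace(label.begin(), label.end(), '-', v6 ? ':' : '.');
    ip = label;
    return true;
}
```

The sketch does not validate the resulting literal; a caller would still pass it through inet_pton() or similar.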

network_adapter.* and IPv6

The current method of enumerating network adapters is totally incompatible with IPv6 addresses. HTCondor currently uses ioctl() with SIOCGIFCONF, which only returns IPv4 addresses.

getifaddrs() is introduced to solve this problem. One caveat of getifaddrs() is that it does not tell you the network interface id.

We need to cross-match the results of getifaddrs() and ioctl(,SIOCGIFCONF,).
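For reference, a minimal getifaddrs() enumeration looks roughly like the sketch below. It assumes that "interface id" means the numeric interface index and looks it up with if_nametoindex(); that is a swapped-in alternative for illustration, not the SIOCGIFCONF cross-match described above. IfaceAddr and list_interface_addrs() are hypothetical names.

```cpp
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <net/if.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <string>
#include <vector>

struct IfaceAddr {
    std::string name;   // e.g. "eth0"
    unsigned index;     // from if_nametoindex(); 0 if the lookup failed
    int family;         // AF_INET or AF_INET6
    std::string ip;     // textual address
};

// Collect every IPv4 and IPv6 address reported by getifaddrs().
bool list_interface_addrs(std::vector<IfaceAddr>& out) {
    ifaddrs* head = nullptr;
    if (getifaddrs(&head) != 0) return false;
    for (ifaddrs* p = head; p != nullptr; p = p->ifa_next) {
        if (p->ifa_addr == nullptr) continue;   // interface without an address
        const int family = p->ifa_addr->sa_family;
        const void* raw = nullptr;
        if (family == AF_INET) {
            raw = &reinterpret_cast<sockaddr_in*>(p->ifa_addr)->sin_addr;
        } else if (family == AF_INET6) {
            raw = &reinterpret_cast<sockaddr_in6*>(p->ifa_addr)->sin6_addr;
        } else {
            continue;                           // skip packet sockets etc.
        }
        char buf[INET6_ADDRSTRLEN] = "";
        if (inet_ntop(family, raw, buf, sizeof(buf)) == nullptr) continue;
        out.push_back({p->ifa_name, if_nametoindex(p->ifa_name), family, buf});
    }
    freeifaddrs(head);   // the list is owned by the caller and must be released
    return true;
}
```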

ckpt server

The ckpt server has C functions that make it hard to convert. I'm relying on class condor_sockaddr, which is a C++ class. Investigate whether the C functions are really necessary.

ckpt_server sends packets that contain IP addresses. Still thinking about how to deal with that.

getSockAddr(): calls getsockname() for the given fd. If the result is INADDR_ANY (0.0.0.0), it replaces the IP address with the local IP address. Only ckpt server code calls it.

Why does the ckpt server require this strange behavior?

On-going *all finished*

Working on Phase 2.

*Remember to fix command.cpp: sin_to_hostname() is converted to to_sinful(), not get_hostname(). Same for file_transfer_db.cpp, condor_auth_x509.cpp, and schedd.cpp --> command.cpp is not. --> Done; needs a double-check. Confusing to_sinful() with get_hostname() produces an error that is hard to debug. Be sure to check by comparing with the git history.

Qmgmt, StdUnivSock, and soap all have IPv4 data structures. Have a look!

Modify the hashing function in condor_ipverify.cpp. Find a better hashing function for in6_addr.
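One candidate for hashing an in6_addr is 32-bit FNV-1a over the 16 address bytes. This is only a sketch of a possible replacement; it is not what condor_ipverify.cpp currently does, and hash_in6_addr() is a hypothetical name.

```cpp
#include <netinet/in.h>
#include <cstddef>
#include <cstdint>

// 32-bit FNV-1a over all 16 bytes of the address.
uint32_t hash_in6_addr(const in6_addr& addr) {
    uint32_t h = 2166136261u;          // FNV offset basis
    for (size_t i = 0; i < sizeof(addr.s6_addr); ++i) {
        h ^= addr.s6_addr[i];
        h *= 16777619u;                // FNV prime
    }
    return h;
}
```

Hashing every byte matters here because V4MAPPED addresses share their first 12 bytes, so a hash that only looks at the leading words would put all IPv4 peers in one bucket.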

NO_DNS re-implementation

In case you want to know the details:

1) NO_DNS is transparently implemented over the existing DNS functions (gethostbyaddr -> condor_gethostbyaddr, ...).
2) These functions and structures are now obsolete. The interfaces and structures are limited to IPv4 or are not protocol independent. (The current IPv6 socket functions are all protocol independent: they can work on any network-layer protocol.)
3) If the caller uses an obsolete structure, it is usually intertwined with the caller's loop as well. (DNS structures usually contain a linked list.)
4) HTCondor offers the same function in many different forms (condor_gethostname+get_full_hostname, my_hostname, and my_full_hostname are essentially the same). Even with this abundance of implementations, callers still need to do the housekeeping work themselves.

I carefully looked at each case and extracted the common use cases. These are:

1) Resolving a single IP address or all IP addresses for a given hostname.
2) Resolving the DNS name for an IP address, with or without DNS aliases.

It looks very simple and obvious. But there were 5-6 different implementations of each of them.
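The first use case (all IP addresses for a hostname) can be sketched with getaddrinfo() as below. resolve_all() is a hypothetical helper, not the actual ipv6_hostname API; the point is that the addrinfo linked list is walked and released in one place, so callers only see a plain vector.

```cpp
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <string>
#include <vector>

// Return every IPv4/IPv6 address for `host` in textual form.
// An empty vector means the lookup failed.
std::vector<std::string> resolve_all(const std::string& host) {
    std::vector<std::string> result;
    addrinfo hints{};
    hints.ai_family = AF_UNSPEC;       // both IPv4 and IPv6
    hints.ai_socktype = SOCK_STREAM;   // avoid one entry per socket type
    addrinfo* head = nullptr;
    if (getaddrinfo(host.c_str(), nullptr, &hints, &head) != 0) return result;
    for (addrinfo* p = head; p != nullptr; p = p->ai_next) {
        const void* raw = nullptr;
        if (p->ai_family == AF_INET) {
            raw = &reinterpret_cast<sockaddr_in*>(p->ai_addr)->sin_addr;
        } else if (p->ai_family == AF_INET6) {
            raw = &reinterpret_cast<sockaddr_in6*>(p->ai_addr)->sin6_addr;
        } else {
            continue;
        }
        char buf[INET6_ADDRSTRLEN];
        if (inet_ntop(p->ai_family, raw, buf, sizeof(buf)) != nullptr) {
            result.push_back(buf);
        }
    }
    freeaddrinfo(head);                // the list must be released by the caller
    return result;
}
```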

unmodified sockaddr_in

finished sockaddr_in

Conversion of sockaddr_in is all finished.

job log of collector_engine/hashkey

hashkey.h/cpp has sockaddr_in in its function definitions. But the implementation does not use it at all.

collector_engine.h/cpp has sockaddr_in in collect() and other functions. They just pass the sockaddr_in to functions in hashkey.h. Thus, practically nobody uses it.

Design choice: just convert sockaddr_in to ipaddr. No other conversion is required.

Socket connections that do not use class Sock

-- StdUnivSock was handled. StdUnivSock calls do_connect() (condor_util_lib/do_connect.c) to establish a connection with the scheduler. The entire code path uses IPv4-specific constants and functions. Should I just convert them or introduce another socket class?

-- Don't mind it. The current Stream and Sock are somewhat bloated and not functionally coherent: diverse functions are packed into a single class. The concept "Stream" is a feature of a socket; it should not be the parent of Sock. This is where multiple interface inheritance kicks in. Consider an "IStream" interface for that functionality, so that class Sock could remain just a socket wrapper.

Sinful string construction

There are some places that construct sinful strings without using wrapper functions.

  // shadow_common.cpp
  char sinfulstring[SINFUL_STRING_BUF_SIZE];

  snprintf(sinfulstring, SINFUL_STRING_BUF_SIZE, "<%s:%d>", sock->peer_ip_str(), ports.port1);

The caveat is that the sinful grammar is changing due to IPv6 addresses, and these hand-built strings are simply incompatible. If the address is IPv6, it should be of the form <[%s]:%d>. We need another abstraction layer.

-- I made a wrapper function. generate_sinful()
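The idea behind such a wrapper can be sketched as below. This is not the actual generate_sinful() implementation or signature; make_sinful() is a hypothetical name, and it assumes an address is IPv6 exactly when its textual form contains ':'.

```cpp
#include <string>

// Build "<ip:port>" for IPv4 and "<[ip]:port>" for IPv6.
std::string make_sinful(const std::string& ip, int port) {
    const bool is_v6 = ip.find(':') != std::string::npos;
    std::string s = "<";
    if (is_v6) {
        s += "[" + ip + "]";   // brackets keep the port separator unambiguous
    } else {
        s += ip;
    }
    s += ":" + std::to_string(port) + ">";
    return s;
}
```

For example, make_sinful("127.0.0.1", 9618) gives <127.0.0.1:9618> and make_sinful("::1", 9618) gives <[::1]:9618>.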

Where to put NODNS?

NO_DNS was completely re-implemented. The notes below are obsolete.

Previously, NODNS was implemented over existing DNS query functions such as condor_gethostbyname, condor_gethostname, and condor_gethostbyaddr.

In IPv6, this problem is a little bit more complicated, because getaddrinfo() returns a pointer to an addrinfo structure that must be released by the user code. Previously, gethostbyname() returned a pointer to a static buffer, so the user was not responsible for releasing it.

The addrinfo structure is a linked list and its allocation method is unknown. Thus, I cannot allocate my own addrinfo structure filled with NODNS information.

One possible way is to provide a condor_freeaddrinfo() that overrides freeaddrinfo().

But what I really wonder is: is it really required? I prefer better abstractions.

-- NODNS issue was resolved by introducing newer DNS functions.

Test required

Don't forget the lists below.

-- All done

Converting utility functions

struct hostent

I initially only looked at struct sockaddr_in and wrote class ipaddr as a replacement. However, that is only half of the world. I should devise an IPv6-compliant structure to replace struct hostent and make struct hostent obsolete. I think it would make more sense to have class ipaddr and class MyString instead of sockaddr* and char*. Write some design possibilities here!

-- Ok, all done.

// from MSDN
typedef struct hostent {
  char FAR      *h_name;
  char FAR  FAR **h_aliases;
  short         h_addrtype;
  short         h_length;
  char FAR  FAR **h_addr_list;
} HOSTENT;

typedef struct addrinfo {
  int             ai_flags;
  int             ai_family;
  int             ai_socktype;
  int             ai_protocol;
  size_t          ai_addrlen;
  char            *ai_canonname;
  struct sockaddr *ai_addr;
  struct addrinfo *ai_next;
} ADDRINFOA;

Converting Sock

Sock is used not only for TCP/UDP connections but also for Unix domain sockets (AF_UNIX). Basically, Sock can only create TCP or UDP sockets internally, but SharedEndPoint and SharedPortClient create a Unix domain socket and call Sock::assign() to pass in the descriptor.

I guess this is a broken abstraction, since Sock has IP-related functions such as peer_addr() or peer_port() which are not available for Unix domain sockets.

It seems assign() is only used by these Shared* things. So maybe we could end up refactoring it.

-- do not need to think about this

List of HTCondor IP addr functions

-- All done here.

These are almost done. At some point, we should deprecate internet.c and my_hostname.cpp.

// from condor_c++_util/get_full_hostname.cpp

extern char* get_full_hostname( const char*,
                                struct in_addr *sin_addrp = NULL );

extern char* get_full_hostname_from_hostent( struct hostent* host_ptr,
                                             const char* host );

// from condor_util_lib/condor_netdb.c

struct hostent *
condor_gethostbyname(const char *name);

struct hostent *
condor_gethostbyaddr(const char *addr, SOCKET_LENGTH_TYPE len, int type);

int
condor_gethostname(char *name, size_t namelen);

// from condor_util_lib/internet.c
// internet.c is a total failure...... tons of old-style codings
int is_ipaddr(const char *inbuf, struct in_addr *sin_addr);
int is_ipaddr_no_wildcard(const char *inbuf, struct in_addr *sin_addr);
int is_ipaddr_wildcard(const char *inbuf, struct in_addr *sin_addr, struct in_addr *mask_addr);
int is_valid_network( const char *network, struct in_addr *ip, struct in_addr *mask);
int is_valid_sinful( const char *sinful );

// there are many more from internet.c. omitted at this time.

HTCondor NetDB

HTCondor defines its own netdb-related functions (proxy functions) for its own DNS system called NO_DNS.

-- all done. reimplemented HTCondor netdb

What is it?

TCP_FORWARDING_HOST: it seems to be used by the startd. When the job (startd) tries to connect to the shadow and TCP_FORWARDING_HOST is defined, it connects to TCP_FORWARDING_HOST instead. It seems like some kind of NAT or proxy.

solved

KeyCache

KeyCache uses sockaddr_in for storing the IP address. Before modifying it, be sure to look at the overall picture. It is only used in condor_secman and daemon core.

KeyCacheEntry stores a sockaddr_in. KeyCache converts the sockaddr_in into a sinful string and uses the string as a key.

Sock::my_addr() can fail? *RESOLVED

One asymmetry in the Sock class is that peer_addr() does not fail and always returns a value, but my_addr() can fail. (peer_addr() tells you the address of the peer; my_addr() tells you the address of the local socket.)

One possible case where my_addr() fails is when the Sock has not been assigned a socket descriptor.

In this case, peer_addr() can also fail.

Ulrich Mystery

Ulrich's Userlevel IPv6 Programming Introduction - http://people.redhat.com/drepper/userapi-ipv6.html

He describes gethostbyaddr() as obsolete. However, gethostbyaddr() accepts an address family as a parameter (AF_INET, AF_INET6). That means it should work well with IPv6. Why is it obsolete?

We should continue to use gethostbyaddr() because there is no other way to get DNS aliases using the newer socket functions.

Try getaddrinfo() on www.yahoo.com. You only get 'any-fp.wa1.b.yahoo.com'. But if you call gethostbyname() on www.yahoo.com, you also get additional hostnames (aliases), which are www.yahoo.com and fp.wg1.b.yahoo.com.

Returning a pointer to a static buffer from a function

Returning a pointer to a static buffer from a function is quite common practice in the HTCondor source code. It was perfectly fine in a world where pthreads were rare and multi-core was unknown, but it is obviously non-reentrant and not thread-safe. I could have kept doing it that way, but I am slightly uncomfortable writing such functions. So, I decided to return a MyString instead of a pointer to a static buffer.
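The difference can be illustrated with a small sketch. It uses std::string as a stand-in for MyString, and local_hostname() is a hypothetical function, not one of the HTCondor APIs named in these notes.

```cpp
#include <unistd.h>
#include <string>

// Returns the hostname by value. Each caller gets its own copy, so there is
// no shared static buffer for a second call or another thread to overwrite.
std::string local_hostname() {
    char buf[256];
    if (gethostname(buf, sizeof(buf)) != 0) return std::string();
    buf[sizeof(buf) - 1] = '\0';   // gethostname() may not terminate on truncation
    return std::string(buf);
}
```

The cost is one extra copy per call, which should be negligible next to the lookup itself.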

get_full_hostname *RESOLVED

my_ip_addr, my_hostname, my_full_hostname() and get_full_hostname() are now obsolete. Now implementing ipv6_hostname.{h,cpp}.

get_full_hostname() uses gethostbyname(), which is obsolete. It should call getaddrinfo() instead. The function has about 150-200 lines of code. At the end of the function, it tail-calls get_full_hostname_from_hostent(). struct hostent is also a deprecated structure.

It requires an overhaul from interface to implementation.

** I reimplemented get_full_hostname* and my_hostname*. The newer implementation lives in ipv6_hostname.*

IpVerify

[zach] Implemented mostly by Todd, and a little bit by Zach and Dan.

IpVerify verifies the remote process's IP address against the configuration file. IpVerify uses the in_addr structure heavily internally. I would simply change it to in6_addr and map IPv4 addresses into IPv6 (so-called V4MAPPED). It would make HTCondor slightly slower and use more memory. However, having two different code paths for IPv4 and IPv6 does not seem nice. Going from 4 bytes to 16 bytes would not be much overhead and would make everything cleaner and easier. Also, I guess any future IP address system will surely be a superset of current IPv6, as IPv6 is a superset of IPv4.

V4MAPPED addresses work fine. One caveat is that the log will have IPv6-formatted IP addresses, e.g. ::ffff:127.0.0.1. If somebody minds, it is possible to remove the "::ffff:" prefix.
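Removing the prefix for log output could look roughly like this. format_for_log() is a hypothetical helper, not existing IpVerify code; it relies on the standard IN6_IS_ADDR_V4MAPPED macro and inet_ntop().

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string>

// Print V4MAPPED addresses as plain dotted quads; everything else as IPv6.
std::string format_for_log(const in6_addr& addr) {
    char buf[INET6_ADDRSTRLEN];
    if (IN6_IS_ADDR_V4MAPPED(&addr)) {
        // The last 4 bytes hold the IPv4 address in network byte order.
        if (inet_ntop(AF_INET, &addr.s6_addr[12], buf, sizeof(buf)) == nullptr) {
            return std::string();
        }
        return std::string(buf);
    }
    if (inet_ntop(AF_INET6, &addr, buf, sizeof(buf)) == nullptr) {
        return std::string();
    }
    return std::string(buf);
}
```

Only the textual output changes; the stored in6_addr stays V4MAPPED, so matching and hashing keep a single code path.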

Interface Changes - should I do?

Sock::get_port() -> Sock::my_addr().get_port() *for consistency. <- Well, I guess it should be done in the future, but not now.

-- I won't do this.