{section: Document Moved to Google Drive} As part of an experiment with Google Drive, this document has been moved to https://docs.google.com/document/d/1BgKK9aHWPJGiePl4Kcv-cl1bbm8MPShG5d8gUf38N_s/edit?usp=sharing If you think you need edit access, contact Alan (adesmet@cs) See GoogleDriveWisdom for more on this experiment. . . . . . . . . . . . . ====================================== Design Doc: Mixed Mode IPv4/IPv6 #3538: Support mixed IPv4/IPv6 mode In progress. Author: Alan De Smet (adesmet@cs). {section: Goal} Be able to run an HTCondor with some nodes only speaking IPv4, some only speaking IPv6, and some speaking both. Matchmaking must avoid matching an IPv4-only computer to an IPv6-only computer. {section: Assumptions} Many of these are things we would hope to change in the future, but are not required for the initial implementation of mixed-mode. *: Any computer running the condor_collector in such an environment must have both IPv4 and IPv6 connectivity. *: Computers running daemons other than the condor_collector may speak IPv4, IPv6, or both. *: Only the vanilla universe is required to work. *: It is not necessary to support listening on more than one IPv4 and one IPv6 address for a given host. This mirrors today's limitation that only one IPv6 address can be used by a given host. *: It is acceptable to require computers speaking IPv6 to specify which address to use, including for IPv4 if the computer also speaks that. This mirrors today's limitation that IPv6 hosts must specify the address to use. *: We can ignore the complexities of private and link local addresses. *: It is acceptable for a computer that only speaks IPv4/IPv6 to be unable to communicate in any way with a computer that only speaks IPv6/IPv4. For example, from an IPv6-only computer, "condor_q -name 123.45.67.89" is allowed to fail. {section: Open design questions} *: Do any of these things work in IPv6? If not, identify them as assumptions to not work above. *:: BIND_ALL_INTERFACES=TRUE *:: ENABLE_ADDRESS_REWRITING=TRUE *:: Asterisks in NETWORK_INTERFACE *:: Host and IP based security *: How to specify COLLECTOR_HOST? Do the various querying mechanisms actually try a list, or can we only broadcast to multiple (via CONDOR_VIEW_HOST). A very cursory look suggests lots of places are trying to parse COLLECTOR_HOST on their own; we'll probably need to unify this. *: What are we passing in on connect attempts? sinful string or object equivalent? {section: Problem: Sinful versus condor_sockaddr} We currently have 2 objects that represent an address, Sinful and condor_sockaddr. Both can serial/deserialize a Sinful string. Sinful is a relatively simple set of key/value pairs (plus an unlabelled basic address). It doesn't know how to convert things to system API structures (sockaddr_in/sockaddr_in6), it's really just storing a pile of strings. On the up side, it knows how to store CCB addresses, private network addresses, and shared port socket information. It is also easily extensible. It knows nothing about IPv6 addresses, although an IPv6 addresses can probably be passed through the object safely since they are just another string. condor_sockaddr essentially wraps sockaddr_in/sockaddr_in6. It only knows how to hold a single address; CCB, shared port, private networks, and the like don't work. The two were created in parallel around the same time. Code tends to use with one or the other. I'm slightly mystified that the system is working as well as it does. These need to be unified. Mixed mode is going to require advertising at least 2 different addresses simultaneously, and connection attempts may want to try both. We want to be future proof against multiple addresses per protocol in the future. This points to using Sinful as our exclusive or nearly exclusive object for passing around contact information, relegating condor_sockaddr to low level, last minute work (to take advantage of the sockaddr_in/sockaddr_in6 support. Probably keep condor_sockaddr to holding a single address. I believe condor_sockaddr has permeated higher levels than it should have given. {section: Plan} 1: Dual mode collector 1:: 1-2 weeks 2013-07-08 Teach collector to respond to commands on both IPv4 and IPv6. (Likely side effect: all daemons do.) 2:: 0-1 weeks 2013-07-15 Teach collector to hold both IPv4 and IPv6 ads. (Probably no work at all.) *:: Not required: advertising both addresses (see next step) 1: ?? ????-??-?? Resolve Sinful/condor_sockaddr *:: 2: 1-3 weeks ????-??-?? Dual address sinful strings *:: Allow encoding both IPv4 and IPv6 addresses in a sinful string. Needs to simultaneously work with CCB and shared port. If we bother with backward compatbility, make the IPv4 address the backward compatible one. *:: May need to support different port numbers for different addresses; trying to synchronize port numbers across TCP and UDP for both IPv4 and IPv6 seems like a pain in the ass. *:: Design such that if in the future we wanted to have arbitrarily long lists of addresses, we could do so while maintaining backward compatbility (older versions would just use the head of a list). This seems likely to be functionality we'll want in the future, and while we're messing with the sinful strings in the time to ensure the future-proofing is done. *:: Rejected: MyAddress (implicitly IPv4) and new attribute MyAddressIPv6. It means multiple places need to look in multiple attributes to find the appropriate address. We want to be able to hand a sinful string around and assume it's everything you need to contact someone. This is the same decision we made for CCB and shared port. 3: 1-3 weeks ????-??-?? Teach collector to advertise both IPv4 and IPv6 addresses. (Likely side effect: all daemons do.) *:: If not implemented in previous step, the code that identifies a daemon's own MyAddress should now identify both addresses if appropriate. This may have been done as a side effect of the "Dual address sinful strings" step. 4: 1-3 weeks ????-??-?? Teach startd to listen on and advertise both IPv4 and IPv6 addresses. (Likely side effect: all daemons do.) *:: Likely just done at the DaemonCore level where the attributes are automatically added. Indeed, likely even lower, at the "Gimme a sinful string for myself" level. *:: There will be breakage at this point, as the negotiator begins trying to match IPv4-only hosts to IPv6-only hosts. 5: 0-2 weeks ????-??-?? Teach schedd to listen on and advertise both IPv4 and IPv6 addresses. (Possibly done by "teach startd" step.) *:: There will be breakage at this point, as the negotiator begins trying to match IPv4-only hosts to IPv6-only hosts. 6: Teach negotiator to make matchmaking decisions based on IPv4 and IPv6 capabilities of the involved parties. 1:: 1 week ????-??-?? New function: CanCommunicate, takes two sinful strings, returns true or false. 2:: 2 weeks ????-??-?? Negotiator implicitly adds CanCommunicate to Requirements for the Machine (or the Job; whichever). *::: Rejected: Explicitly add. Risks trashing autoclusters, as every machine and job has a unique sinful string. *:: Rejected: add SpeaksIPv6=TRUE; SpeaksIPv4=TRUE then adding to Job/Machine adds (Requirements=SpeaksIPv6 && TARGET.SpeaksIPv6)||(Requirements=SpeaksIPv4 && TARGET.SpeaksIPv4). Not backward compatible. Could assume anyone lacking both only speaks IPv4, but breaks existance IPv6 pools. Not future proof: no hope for subtle things like "are these hosts on the same subnet and thus able to use link local IPv6?" 7: 0-2 weeks ????-??-?? Teach other daemons to listen on and advertise both IPv4 and IPv6 addresses. (Possibly done by "teach startds" step). 8: 1-3 weeks ????-??-?? Gracefully handle impossible connection requests. 1:: Daemons that only handle one protocol shouldn't ever have a reason to connect to the other. But just in case, ensure the failure is graceful: complain to the log and carry on. EXCEPTing is absolutely not acceptable. 2:: Command line tools could easily be asked to do the impossible. "condor_q -name ipv6-only-host.example.com" on an IPv4 host, for example. We need to give a reasonable message to the end user.