HTCondorWiki: How To Mix Firewalls And HTCondor

How to get HTCondor and a NAT Firewall to Cooperate

You have an HTCondor machine, either a submit node or an execute node, and you would like to share your computing resources with others outside your department or division. But you do not want crooks using your systems to wreak havoc, so there is a firewall between your HTCondor resource and the Internet. Assume that the firewall is a separate host from the HTCondor resource. To be explicit about which commands to run, we will assume that it is a bastion host running Linux and that the firewall is iptables with Network Address Translation (NAT). Either you or your firewall admin should be able to translate these instructions to your firewall installation.

We are also assuming that you are not going to be using CCB. CCB allows HTCondor daemons in a private network (outgoing connections only) to communicate with daemons in a public network (bidirectional connections allowed). It is therefore not a complete solution when daemons in two separate private networks need to communicate: one or the other network must allow bidirectional connections for CCB to help.

In the case described below, we want a submit node in a private network to communicate with execute nodes in other private networks, such as Open Science Grid. This is the private-to-private case that cannot be solved with CCB alone. The solution below uses port forwarding to make the submit node effectively public, which allows the execute nodes in the remote private network to use CCB to get bidirectional connectivity with your submit node. Bidirectional connectivity could also be achieved without CCB by applying the same port-forwarding solution to the execute nodes of the remote private network, but that may not be possible, either because of security concerns or because you do not administer machines on the remote network.

Let us first assume that you have HTCondor installed and running. For this, you should follow the instructions in section 3.2 of the HTCondor manual.

Let us assume you have the following setup. The HTCondor schedd is installed on a machine (named S) with IP address 192.168.0.1; this is your submit machine. An HTCondor startd is installed on a machine (named E) at 192.168.0.2; this is your execute node. The firewall has an external---that is, facing the Internet---IP address of 10.0.0.1, and an internal---that is, facing toward your local network---IP address of 192.168.0.250; we will call this machine F. We know 10.0.0.1 is actually not a routable address, but pretend that it is for the duration of this document. S and E are in the domain mydomain.net.

To find your HTCondor configuration files, run condor_config_val -dump; the files are listed in the header of its output. Then make the following changes to condor_config.local on S (the schedd):

USE_SHARED_PORT = True
SHARED_PORT_ARGS = -p 9617
PRIVATE_NETWORK_NAME = mydomain.net
PRIVATE_NETWORK_INTERFACE = eth0
TCP_FORWARDING_HOST = 10.0.0.1

In the configuration settings above, port 9617 was picked out of a hat; any free port on the system will do. 9618 is often chosen, since it is the well-known port of the HTCondor collector. In our setup we are not assuming that there is a collector in the 192.168.0.0/24 network that will be contacted from outside that network, so 9618 is also a valid port number; but you may well want to avoid 9618 if you have an internal collector. Note that TCP_FORWARDING_HOST must match the external address of the firewall F (10.0.0.1 here), since that is the address that outside clients connect to.

On the execute node E, we have similar configuration changes, except for the shared port:

USE_SHARED_PORT = True
SHARED_PORT_ARGS = -p 9616
PRIVATE_NETWORK_NAME = mydomain.net
PRIVATE_NETWORK_INTERFACE = eth0
TCP_FORWARDING_HOST = 10.0.0.1

Setting the same PRIVATE_NETWORK_NAME on S and E allows them to communicate directly without going through the firewall F. The port choice of 9616 is arbitrary.
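The effect of PRIVATE_NETWORK_NAME can be pictured with a small sketch. This is a simplified model written for illustration, not HTCondor's actual code: the function name choose_address and its parameters are hypothetical, and the only rule it encodes is that a client in the same named private network uses the advertised private address, while everyone else uses the public (forwarded) one.

```python
def choose_address(my_private_network, public_addr, priv_addr=None, priv_net=None):
    """Return the address a client would contact under this simplified model.

    my_private_network: the client's own PRIVATE_NETWORK_NAME, or None.
    public_addr:        the advertised public address (the firewall's).
    priv_addr/priv_net: the advertised private address and network name.
    """
    same_network = (
        my_private_network is not None
        and priv_net is not None
        and my_private_network == priv_net
    )
    if same_network and priv_addr is not None:
        # Both sides are in the same private network: bypass the firewall.
        return priv_addr
    # Otherwise go through the forwarded port on the firewall.
    return public_addr


# E (inside mydomain.net) contacting S connects directly.
print(choose_address("mydomain.net", "10.0.0.1:9617",
                     "192.168.0.1:9617", "mydomain.net"))  # 192.168.0.1:9617
# A client with no matching private network goes through F.
print(choose_address(None, "10.0.0.1:9617",
                     "192.168.0.1:9617", "mydomain.net"))  # 10.0.0.1:9617
```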

Now, on the firewall F, we run the following commands to redirect connections from the Internet to ports 9617 and 9616 on F to the corresponding ports on S and E:

iptables -t nat -A PREROUTING -p tcp -d 10.0.0.1 --dport 9617 -j DNAT --to-destination 192.168.0.1
iptables -t nat -A PREROUTING -p tcp -d 10.0.0.1 --dport 9616 -j DNAT --to-destination 192.168.0.2
iptables -t nat -A POSTROUTING -o eth1 -j SNAT --to-source 10.0.0.1

The first rule above says that inbound connections to 10.0.0.1 on port 9617 will be rewritten to be connections to 192.168.0.1, port 9617. The second rule does the same for 192.168.0.2, port 9616. The third rule is probably superfluous, as it is likely already in the firewall rules: all outbound connections leaving through eth1 (assumed here to be the external interface of F) will look like they emanate from 10.0.0.1.
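As a way of picturing what the two PREROUTING rules do, here is a toy lookup-table model. It is only an illustration of the destination rewrite, not how iptables is implemented; the names DNAT_RULES and apply_dnat are made up for this sketch.

```python
# Toy model of the two PREROUTING rules:
# (destination IP, destination port) -> new destination IP.
DNAT_RULES = {
    ("10.0.0.1", 9617): "192.168.0.1",  # forwarded to the schedd host S
    ("10.0.0.1", 9616): "192.168.0.2",  # forwarded to the startd host E
}


def apply_dnat(dst_ip, dst_port):
    """Return the (ip, port) a packet is delivered to after the rewrite."""
    new_ip = DNAT_RULES.get((dst_ip, dst_port), dst_ip)
    # --to-destination names no port in these rules, so the port is unchanged.
    return new_ip, dst_port
```

In this model a connection to 10.0.0.1 port 9617 ends up at 192.168.0.1 port 9617, while a connection to any port without a rule is left untouched.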

Finally, run condor_reconfig on S and E to incorporate the configuration changes.
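Once the daemons have picked up the new configuration, a quick way to check the forwarding is to attempt a TCP connection from a machine outside the firewall. The following is a minimal sketch using only the Python standard library; can_connect is a hypothetical helper name. A successful connection shows only that something is accepting connections on the forwarded port, not that HTCondor's security settings will admit you.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

With the example setup, calling can_connect("10.0.0.1", 9617) and can_connect("10.0.0.1", 9616) from an outside host should both return True if the rules on F and the shared ports on S and E are in place.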

One may well ask how this works for HTCondor. Simply put, HTCondor packs all of the address information into a single string. condor_status -schedd -l will return a schedd ClassAd, which will contain a line that looks something like:

MyAddress = "<10.0.0.1:9617?PrivAddr=%3c192.168.0.1:9617%3fsock%3d936_480b_8%3e&PrivNet=mydomain.net&noUDP&sock=936_480b_8>"

This string contains the needed information for another HTCondor client or daemon to contact the schedd and begin using high throughput computing. Examining this string may be helpful in debugging if you are unable to connect to remote HTCondor services.
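When debugging, it can help to pull such a string apart. The sketch below assumes the address has the shape seen in the example above, that is, a host:port pair followed by URL-encoded parameters, all inside angle brackets; it is not a complete parser for every address form HTCondor can produce, and parse_sinful is a hypothetical helper name.

```python
from urllib.parse import parse_qs


def parse_sinful(address):
    """Split an address like <host:port?k=v&flag> into host, port, and params."""
    inner = address.strip().strip('"')
    if not (inner.startswith("<") and inner.endswith(">")):
        raise ValueError("expected an address of the form <host:port?params>")
    inner = inner[1:-1]
    endpoint, _, query = inner.partition("?")
    host, _, port = endpoint.rpartition(":")
    # keep_blank_values retains flag-style parameters such as noUDP.
    parsed = parse_qs(query, keep_blank_values=True)
    params = {key: values[0] for key, values in parsed.items()}
    return {"host": host, "port": int(port), "params": params}


example = ("<10.0.0.1:9617?PrivAddr=%3c192.168.0.1:9617%3fsock%3d936_480b_8%3e"
           "&PrivNet=mydomain.net&noUDP&sock=936_480b_8>")
public = parse_sinful(example)
# The private address is itself an encoded address, so parse it again.
private = parse_sinful(public["params"]["PrivAddr"])
print(public["host"], public["port"])    # 10.0.0.1 9617
print(private["host"], private["port"])  # 192.168.0.1 9617
print(public["params"]["PrivNet"])       # mydomain.net
```

If the public host and port are not the firewall's external address and the forwarded port, TCP_FORWARDING_HOST or SHARED_PORT_ARGS is the first place to look.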