How to get HTCondor and a NAT Firewall to Cooperate

You have an HTCondor machine, either a submit node or an execute node, and you would like to share your new computing resources with others outside your department or division. You do not, however, want crooks using your systems to wreak havoc, so there is a firewall between your HTCondor resource and the Internet. Assume that the firewall is a separate host from the HTCondor resource: a bastion host running Linux, using iptables with Network Address Translation (NAT). This assumption allows the description to include explicit commands to run. You or your firewall administrator should translate these instructions to your particular firewall installation.

We also assume that you are not going to be using CCB. CCB allows communication between HTCondor daemons in a private network (outgoing connections only) and daemons in a public network (bidirectional connections allowed). It is therefore not a complete solution when daemons in two separate private networks need to communicate; one or the other network must allow bidirectional connections for CCB to help. In the case described below, we want a submit node, which is in a private network, to communicate with execute nodes in other private networks, such as Open Science Grid. This is the private-to-private case that cannot be solved with CCB alone. The solution below uses port forwarding to make the submit node effectively public. This allows the execute nodes in the remote private network to use CCB to have bidirectional connectivity with your submit node. Bidirectional connectivity could also be achieved without CCB by applying the port-forwarding solution below to the execute nodes of the remote private network as well, but that may not be possible, either because of your own security concerns or because you do not administer machines on the remote network.

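Assume that HTCondor is installed and running with the following setup, using the addresses that appear in the configuration and firewall rules below: the submit node S has the private address 192.168.0.1, the execute node E has the private address 192.168.0.2, and the firewall F has the external address 10.0.0.1.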

Make the following changes to the file condor_config.local on machine S. To find this configuration file, see the very beginning of the output generated by the command condor_config_val -dump.

USE_SHARED_PORT = True
SHARED_PORT_ARGS = -p 9617
PRIVATE_NETWORK_NAME = mydomain.net
PRIVATE_NETWORK_INTERFACE = eth0
TCP_FORWARDING_HOST = 10.0.0.1
In these configuration settings, the port 9617 was chosen arbitrarily; any free port on the system would do. 9618 is often chosen, as it is the well-known port of the HTCondor collector. In our setup, we are not assuming that there is a collector in the 192.168.0.0/24 network that will be contacted from outside 192.168.0.0/24, so 9618 is also a valid choice; but you may well want to avoid 9618 if you have an internal collector. Note that TCP_FORWARDING_HOST must match the external address of the firewall F (10.0.0.1 here), since that is the address to which outside daemons connect.

On the execute node E, we have similar configuration changes, except for the shared port:

USE_SHARED_PORT = True
SHARED_PORT_ARGS = -p 9616
PRIVATE_NETWORK_NAME = mydomain.net
PRIVATE_NETWORK_INTERFACE = eth0
TCP_FORWARDING_HOST = 10.0.0.1
The use of PRIVATE_NETWORK_NAME on S and E allows them to communicate directly without going through the firewall F. The port choice of 9616 is arbitrary.

Now, on the firewall F, we run the following commands to redirect connections from the Internet to ports 9617 and 9616 on F to the corresponding ports on S and E:

iptables -t nat -A PREROUTING -p tcp -d 10.0.0.1 --dport 9617 -j DNAT --to-destination 192.168.0.1
iptables -t nat -A PREROUTING -p tcp -d 10.0.0.1 --dport 9616 -j DNAT --to-destination 192.168.0.2
iptables -t nat -A POSTROUTING -o eth1 -j SNAT --to-source 10.0.0.1
The first rule says that inbound connections to 10.0.0.1 on port 9617 will be rewritten to be connections to 192.168.0.1, port 9617. The second rule does the same for 192.168.0.2, port 9616. The third rule is probably superfluous, as it is likely already in the firewall rules: all outbound connections leaving through eth1 will look like they emanate from 10.0.0.1.
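If more machines are added behind F, the DNAT rules follow the same pattern, one rule per forwarded port. The Python sketch below only prints the commands for review; it does not run them. The names EXTERNAL_IP, FORWARDS, and dnat_rule are hypothetical and introduced here for illustration, and the addresses and ports are the ones assumed in this setup.

```python
# Sketch: print one DNAT rule per forwarded shared port.
# EXTERNAL_IP, FORWARDS and dnat_rule are illustrative names, not part of HTCondor.

EXTERNAL_IP = "10.0.0.1"  # external address of firewall F (assumed setup)

# Shared port on F -> private address of the machine behind F.
FORWARDS = {
    9617: "192.168.0.1",  # submit node S
    9616: "192.168.0.2",  # execute node E
}


def dnat_rule(external_ip, port, private_ip):
    """Return the iptables command that forwards external_ip:port to private_ip."""
    return (
        f"iptables -t nat -A PREROUTING -p tcp -d {external_ip} "
        f"--dport {port} -j DNAT --to-destination {private_ip}"
    )


if __name__ == "__main__":
    for port, private_ip in FORWARDS.items():
        print(dnat_rule(EXTERNAL_IP, port, private_ip))
```

Because --to-destination is given without a port, the destination port is left unchanged, which is why each machine's SHARED_PORT_ARGS port must equal the port forwarded on F.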

Finally, run condor_reconfig on S and E to incorporate the configuration changes.
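Once the daemons have been reconfigured, it can be useful to check from a machine outside the private network that the forwarded ports accept TCP connections. The following Python sketch does only that; it says nothing about whether HTCondor authentication or authorization will succeed. The helper name port_is_open is hypothetical, and the address and ports in the comments are the ones assumed in this setup.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example use, run from outside the firewall (addresses assumed from this setup):
#   port_is_open("10.0.0.1", 9617)   # submit node S through F
#   port_is_open("10.0.0.1", 9616)   # execute node E through F
```

A result of False points at the firewall rules or at the shared port daemon not listening, rather than at HTCondor's security configuration.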

One may well ask how this works for HTCondor. Simply put, HTCondor packs all of the address information into a single string. condor_status -schedd -l will return a Schedd ClassAd, which will contain a line that looks something like:

MyAddress = "<10.0.0.1:9617?PrivAddr=%3c192.168.0.1:9617%3fsock%3d936_480b_8%3e&PrivNet=mydomain.net&noUDP&sock=936_480b_8>"
This string contains the needed information for another HTCondor client or daemon to contact the schedd and begin using high throughput computing. Examining this string may be helpful in debugging if you are unable to connect to remote HTCondor services.
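When debugging, it can help to pull the address string apart. The sketch below uses only the Python standard library and is not HTCondor's own parser. It assumes the IPv4 form shown above, with the public host and port first and URL-encoded parameters after the question mark. The function name parse_address is hypothetical.

```python
from urllib.parse import parse_qsl


def parse_address(my_address):
    """Split an address string of the form <host:port?key=value&flag&...>.

    Returns (host, port, params). Parameters without a value, such as
    noUDP, map to an empty string. Assumes an IPv4 address.
    """
    inner = my_address.strip().lstrip("<").rstrip(">")
    endpoint, _, query = inner.partition("?")
    host, _, port = endpoint.rpartition(":")
    # parse_qsl also decodes percent-escapes such as %3c and %3e.
    params = dict(parse_qsl(query, keep_blank_values=True))
    return host, int(port), params


if __name__ == "__main__":
    example = (
        "<10.0.0.1:9617?PrivAddr=%3c192.168.0.1:9617%3fsock%3d936_480b_8%3e"
        "&PrivNet=mydomain.net&noUDP&sock=936_480b_8>"
    )
    host, port, params = parse_address(example)
    print("public address:", host, port)
    print("private network:", params["PrivNet"])
    # The decoded PrivAddr is itself an address string and can be parsed again.
    print("private address:", parse_address(params["PrivAddr"])[:2])
```

The public part should show the firewall's external address and the forwarded port, while PrivAddr should show the machine's address inside the private network. A mismatch in either usually indicates a wrong TCP_FORWARDING_HOST or SHARED_PORT_ARGS setting.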