restart firewalld after installing tftp *client*

edit: the title of the post is wrong, and I’ll repost it when I’ve got around to crossing all the tees. 

  • There’s a firewalld service called tftp-client;  the fix for machines with a tftp client is thus probably to enable that.
  • This post covers off a number of other topics as well – turned out to be a trip down the rabbit hole.  So I’ll leave it here for the moment.
  • In the very unlikely event that anyone wants to link to this blog post in the meantime, the link will go dead when I fix it, as I’ll name it more appropriately.
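
For the record, enabling that tftp-client service would look roughly like the following. This is a sketch I haven’t run on the machines in this post. The function name is made up, and it’s wrapped in a function so that nothing happens until you call it. The --info-service call only prints the service definition, so you can see what it will change before it changes anything.

```shell
# Sketch only; run on the machine with the tftp client, not the server.
# enable_tftp_client is a made-up name; the firewalld service is "tftp-client".
enable_tftp_client() {
    # Print the service definition first, and stop if firewalld doesn't know it.
    sudo firewall-cmd --info-service=tftp-client || return 1
    # Add it to the permanent config for the default zone, then reload so the
    # running firewall picks it up.
    sudo firewall-cmd --permanent --add-service=tftp-client &&
        sudo firewall-cmd --reload
}
```

Call enable_tftp_client, then retry the tftp get with firewalld still running.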

 

I was trying to test out all the components of my PXE build server, i.e. HTTP and TFTP.

  • xinetd and tftpd installed on CentOS 7 (on a Pi).
  • tftp client installed on CentOS 7.
  • Both with firewalld running and SELinux enforcing.

I’d already tried to test it with a PXE boot, which failed. The cause of that would have been ..

Running tftpd as non-root. Not.

First things first, the ‘user’ field in the xinetd configuration for tftp must be set to root. Setting it to an unprivileged account doesn’t do what you want it to do. Or rather it does, but at the expense of tftp not working.

  • create an account, switch out user=root in /etc/xinetd.d/tftp, and you’ll get:
in.tftpd[2120]: cannot set groups for user nobody

Which is confusing, because you thought you told xinetd to run it as a specific account .. and probably not ‘nobody.’

  • The xinetd config user field determines what account xinetd spawns the daemon as.
  • The parameters on the daemon determine what account it will do the tftp as.
  • Don’t specify parameters for the daemon and it’ll try and use nobody.
  • If it isn’t running as root, it cannot switch to the nobody account, or any other account for that matter.
  • The daemon doesn’t handle this gracefully; it just fails to work. I wasn’t the first to hit this, and I won’t be the last.

The ‘correct’ way to do it is:

	user			= root
	server_args		= -u user -s /some/path

By default the process persists for 15 minutes afterwards (change this with --timeout in server_args), so you can see it runs as root. But when an actual tftp session kicks off, it forks, and then does this (from strace output):

[pid 3503] setgroups32(1, [10069]) = 0
[pid 3503] chroot(".") = 0
[pid 3503] setregid32(10069, 10069) = 0
[pid 3503] setreuid32(10069, 10069) = 0

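Which, without reading up on syscalls, I take to mean that it sets its group list, chroots into its current directory, and then switches gid and uid to the unprivileged account (10069 here).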

I’m running SELinux enforcing, which you should be doing as well. That provides an alternative way to assure containment.

I spent a long time googling and trying to find a way to make that work nicely, so having given up ..

now it should work .. but the files transfer empty?

Once I’d got it running cleanly, unhelpful stuff was happening.  I tried it a lot, tried various things, in summary:

client:

$ tftp -4 -v 192.168.1.223 -c get foo
Connected to 192.168.1.223 (192.168.1.223), port 69
getting from 192.168.1.223:foo to foo [netascii]
@����
$ ls -al foo
-rw-rw-r--. 1 ben ben 0 Mar 12 17:06 foo

server:

in.tftpd[2235]: RRQ from 192.168.1.245 filename foo

Yeah, complete with random noise from the client.  I tried this a lot, and when passing the get command on the CLI, it pukes a little. Nice.

The file’s empty.  There’s no error anywhere, and there’s nothing else logged.  (There should be.)

  • I turned off the server’s firewalld.
  • I set SELinux to permissive on the server, and the following is clean.
ausearch -m AVC,USER_AVC -ts recent

Remember that the tftp daemon stays running?  That makes it much easier to trace, which at this point became necessary.  Ran out of ideas.  I’m not a syscall guru by any means, but it all started with truss on Solaris, and I still fall back to it sometimes.

I thought: it must be a problem reading the file, in which case, the syscalls will show this.

I’ve annotated the end of it here; the PID is that root process.

# strace -fp <PID>

   # server IP
[pid  2249] bind(0, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("192.168.1.223")}, 16) = 0
   # client IP
[pid  2249] connect(0, {sa_family=AF_INET, sin_port=htons(53450), sin_addr=inet_addr("192.168.1.245")}, 16) = 0
[pid  2249] setsockopt(0, SOL_IP, IP_MTU_DISCOVER, [0], 4) = 0
   # open requested file
[pid  2249] open("foo", O_RDONLY|O_LARGEFILE) = 1
[pid  2249] fstat64(1, {st_mode=S_IFREG|0444, st_size=29, ...}) = 0
[pid  2249] fcntl64(1, F_SETLK64, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = 0
[pid  2249] fcntl64(1, F_GETFL)         = 0x20000 (flags O_RDONLY|O_LARGEFILE)
[pid  2249] gettimeofday({1552408345, 873122}, NULL) = 0
   # syslog 'Mar 12 16:32:25 in.tftpd[2249]: RRQ from 192.168.1.245 filename foo'
[pid  2249] send(3, "<29>Mar 12 16:32:25 in.tftpd[224"..., 72, MSG_NOSIGNAL) = 72
[pid  2249] fstat64(1, {st_mode=S_IFREG|0444, st_size=29, ...}) = 0
[pid  2249] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x76f88000
   # the file foo contains the output of 'date'
[pid  2249] read(1, "Tue Mar 12 16:26:48 UTC 2019\n", 4096) = 29
[pid  2249] read(1, "", 4096)           = 0
   # send the contents of the file ..
[pid  2249] send(0, "\0\3\0\1Tue Mar 12 16:26:48 UTC 2019"..., 33, 0) = 33
[pid  2249] poll([{fd=0, events=POLLIN}], 1, 1000) = 1 ([{fd=0, revents=POLLERR}])
   #
   # here's the key error - E(rror) HOST UNREACH(able)
   #
[pid  2249] recv(0, 0x9690b0, 516, 0)   = -1 EHOSTUNREACH (No route to host)
   # also a bit weird; I think this is error logging to stderr (fd 2),
   # which fails with EBADF, presumably because the daemon has no stderr open.
[pid  2249] write(2, "fatal: ", 7)      = -1 EBADF (Bad file descriptor)
[pid  2249] write(2, "recvfrom_flags_with_timeout: rec"..., 54) = -1 EBADF (Bad file descriptor)
[pid  2249] fstat64(1, {st_mode=S_IFREG|0444, st_size=29, ...}) = 0
[pid  2249] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x76f87000
[pid  2249] write(1, "\n", 1)           = -1 EBADF (Bad file descriptor)
[pid  2249] exit_group(1)               = ?
[pid  2249] +++ exited with 1 +++

Not a problem reading the file.

It reads the file, tries to pass the data to the client, loses the client.

A network issue.
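
To convince myself the server really did send the whole file, the send() in the trace can be decoded by hand. A minimal sketch, assuming the standard TFTP layout of a 2-byte opcode followed by a 2-byte block number, with opcode 3 meaning DATA. The function name is made up; the bytes are the ones from the trace, with the trailing newline taken from the read() just before it.

```shell
# Rebuild the 33 bytes from the send() call in the trace.
# The first four bytes are the TFTP header, the rest is the content of foo.
tftp_data_packet() {
    printf '\000\003\000\001Tue Mar 12 16:26:48 UTC 2019\n'
}

# Header as unsigned decimal bytes: opcode (0 3 = DATA), then block number (0 1).
tftp_data_packet | od -An -tu1 -N4

# Total length: 4 header bytes plus the 29 bytes read from the file.
tftp_data_packet | wc -c
```

The first command should print 0 3 0 1 and the second 33, which matches the 33 in the trace: one DATA packet, block 1, carrying the entire 29-byte file.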

firewalld on the client

Process of elimination.

I don’t know tftp that well, but I do know that FTP traffic is a bit odd and involves two sessions, and firewalls have to cope with that.

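Here we have a hand-off between daemons on the server side, so perhaps there’s some oddness with ports (the strace does show the forked process binding a new socket on port 0, i.e. an ephemeral port, rather than replying from 69), and the *client* firewall has to keep track of it. Otherwise, you’ve got incoming data from an unexpected source.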
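
One way to check this from the client side, as a sketch: capture the traffic while re-running the get. The function name is made up. I’d expect to see the request go out to port 69 and the data come back from some other, high-numbered port. Any ICMP ‘unreachable’ or ‘prohibited’ packets heading back to the server would tie up with the EHOSTUNREACH in the trace.

```shell
# Sketch only: watch TFTP traffic to and from one server, run on the client.
# watch_tftp is a made-up name; pass the server address as the argument.
watch_tftp() {
    # -n: no name lookups; -i any: listen on all interfaces.
    # udp catches the request and the data; icmp catches any rejections.
    sudo tcpdump -n -i any "host $1 and (udp or icmp)"
}
```

Run watch_tftp 192.168.1.223 in one terminal and the tftp get in another.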

Of course, you don’t usually run the TFTP client – that’s a Cisco device or appliance doing a backup, or it’s your PXE client.  It just works, right?

So client side:

$ sudo systemctl stop firewalld 
[sudo] password for ben: 
$ sudo systemctl status firewalld 
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Tue 2019-03-12 20:19:32 GMT; 6s ago
     Docs: man:firewalld(1)
$ tftp -4 -v 192.168.1.223 -c get foo
Connected to 192.168.1.223 (192.168.1.223), port 69
getting from 192.168.1.223:foo to foo [netascii]
Received 29 bytes in 0.1 seconds [1854 bit/s]

#
# server side
#

in.tftpd[2724]: RRQ from 192.168.1.245 filename foo
in.tftpd[2724]: Client 192.168.1.245 finished foo

#
# client side
#

$ sudo systemctl start firewalld 
$ tftp -4 -v 192.168.1.223 -c get foo
Connected to 192.168.1.223 (192.168.1.223), port 69
getting from 192.168.1.223:foo to foo [netascii]
Received 29 bytes in 0.1 seconds [2203 bit/s]

Some caveats

  • YMMV.  When I restarted firewalld, a bunch of noise was generated by the service along the following lines. Not sure whether TFTP should still have been working.
firewalld[28663]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w2 -w --table mangle 
--delete POSTROUTING --out-interface virbr0 --protocol udp --destination-port 68 
--jump CHECKSUM --checksum-fill' failed: iptables: No chain/target/match by that name.
  • I rebooted and the problem came back. I had assumed the ‘fix’ was shipped in the tftp package and that firewalld simply hadn’t been reloaded to pick it up. Not so.
  • Post reboot, stopping firewalld fixed it again, but this time starting it again broke it again.

So, in other words, the fix may not be a restart. The fix may be to stop firewalld. Assuming anyone else has this issue, and it’s not just a fault in the machine I’m sat in front of.

Oooh, nice one systemd.

In the process, I tried to eliminate xinetd.

I started up the tftp socket:

● tftp.socket - Tftp Server Activation Socket
   Loaded: loaded (/usr/lib/systemd/system/tftp.socket; disabled; vendor preset: disabled)
   Active: active (listening) since Tue 2019-03-12 17:03:11 UTC; 3h 16min ago
   Listen: [::]:69 (Datagram)

And the really cool thing about that is that it works just like xinetd. tftp.service is still disabled, but the socket acts as a listener and starts that service when a connection arrives. You then have to supply modifications to the tftp.service definition to change its parameters. It does NOT use the xinetd configuration file.

Do NOT edit the packaged unit files under /usr/lib/systemd/system/ directly. There are a number of ways to effect changes to systemd unit files; at some point I’ll blog about them, maybe.
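
For what it’s worth, a drop-in file is one of those ways. A minimal sketch, assuming the unit is called tftp.service (to match tftp.socket above) and that the daemon lives at /usr/sbin/in.tftpd; check both with systemctl cat tftp.service first. The function name and its optional directory argument are mine, and -u user -s /some/path are the same placeholders as in the xinetd config earlier.

```shell
# Sketch only: write a drop-in that overrides the daemon's arguments.
# Defaults to the real drop-in directory; pass another path to try it safely.
write_tftp_override() {
    dir=${1:-/etc/systemd/system/tftp.service.d}
    mkdir -p "$dir" || return 1
    cat > "$dir/override.conf" <<'EOF'
[Service]
# The empty ExecStart= clears the packaged command; the next line replaces it.
ExecStart=
ExecStart=/usr/sbin/in.tftpd -u user -s /some/path
EOF
}
```

Run it from a root shell with no argument, then run systemctl daemon-reload so systemd picks up the change. Pointing it at a scratch directory first lets you look at the file it would write.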

Ideally I’d now go and switch out xinetd for this, but I think I’ll cut my losses. The build server was an enabler for other stuff, not an end in itself!!
