Summary
The goal was to create a syslog server on a Raspberry Pi, so I can ship all my logs to a low power machine that’ll be on all the time.
tl;dr – a bit optimistic. The limited memory would probably require Elasticsearch (ES) and Logstash (LS) on different machines. I think Logstash really wants 0.5-1.0gb of heap, which is going to be hard on a Pi (its RSS runs to around 0.5gb even with a much smaller heap configured).
It’s possible to get ES and LS crammed into a single Raspberry Pi 3 B+ (~1gb RAM). With little or no configuration, and no data yet, memory use is as follows (figures in kB; total memory being 949416):
- elastic search RSS: 150680
- logstash RSS: 636272
- swap: 389236 used
- free memory: 10244
- cache: 75036
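Figures like these can be pulled with the stock proc tools. A sketch of one way to do it (the match patterns are my assumptions – both ES and LS show up as `java`, so matching on the full command line via `pgrep -f` is the safer bet):

```shell
# Sum the resident set size (kB) of all processes whose full command
# line matches a pattern; pgrep -f matches against the command line.
rss_kb() {
  total=0
  for pid in $(pgrep -f "$1"); do
    r=$(awk '/^VmRSS/ {print $2}' "/proc/$pid/status" 2>/dev/null)
    total=$((total + ${r:-0}))
  done
  echo "$total"
}

rss_kb org.elasticsearch.bootstrap   # elasticsearch RSS
rss_kb logstash.runner               # logstash RSS (pattern is a guess)

# free, cache and swap, in kB
free -k | awk '/^Mem:/  {print "free:", $4, "cache:", $6}
               /^Swap:/ {print "swap used:", $3}'
```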
Additionally, LS is consuming CPU all the time; the Pi has a 15-minute load average of 1.27. Probably because:
Often times CPU utilization can go through the roof if the heap size is too low, resulting in the JVM constantly garbage collecting.
Startup time is extended; on their own, ES takes 50 minutes and LS takes 1hr 35m. I’ve only managed to get LS started once so far, and it restarted of its own accord at some point overnight, so that points to other problems.
Kibana: I couldn’t track down node.js for 32-bit ARM on CentOS 7. Elastic no longer support 32-bit, and anyway, there’s no memory left. I’d already adjusted my plan towards running Kibana elsewhere, just for the visualisation, with ES and LS on the Pi all the time.
An interesting learning exercise. I understand Graylog replaces LS and Kibana in the stack (but still needs ES). It also needs MongoDB, which yum didn’t offer up, so it looks like I’d waste a bunch of time going there too.
So, I’ll fall back to basic rsyslog on the Pi, maybe with a Beats agent shipping the logs to a VM that pops up every so often. Follow the logserver tag to find out.
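For the fallback, the rsyslog side is simple enough; a sketch, assuming the stock rsyslog on CentOS 7 (the port and hostname are placeholders of mine, not from any standard):

```
# On the Pi (/etc/rsyslog.conf): listen on UDP and TCP 5514
module(load="imudp")
input(type="imudp" port="5514")
module(load="imtcp")
input(type="imtcp" port="5514")

# On each client: forward everything (@@ = TCP, @ = UDP)
*.* @@pi.example.com:5514
```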
Packages
This is getting puppetised, so I’m not downloading things and copying them around.
Repo is the same for elasticsearch and logstash, documented here and here.
[elasticsearch-6.x]
name=Elasticsearch repository for 6.x packages
baseurl=https://artifacts.elastic.co/packages/6.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
Logstash specifies java 1.8, and NOT java 9. So, I’ll go with that rather than OpenJDK 11 which was also available.
rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
yum install java-1.8.0-openjdk-headless elasticsearch logstash
Versions, for reference:
# rpm -qa | egrep 'logstash|elast|openjdk'
elasticsearch-6.5.4-1.noarch
java-1.8.0-openjdk-headless-1.8.0.191.b12-1.el7_6.armv7hl
logstash-6.5.4-1.noarch
elasticsearch summary
Took guidance from blog entries by MW Preston.
- Set cluster and node names.
- Out of the box, 6.x doesn’t like being bound to a real interface, so I’ve just bound it explicitly to loopback.
- The xpack change is needed on ARM CPUs.
More detail below.
I’ve read that systemd gets the Java heap size from somewhere else, but this worked. Initially I set initial and max heap to 350mb (documentation states they should be the same); the resulting RSS immediately after it finished starting was 460756 kB, so I cut it back to 256mb. I’ve still got to cram logstash, and probably Beats, in there as well.
# diff /etc/elasticsearch/elasticsearch.yml_orig /etc/elasticsearch/elasticsearch.yml
17c17
< #cluster.name: my-application
---
> cluster.name: es_cluster_0
23c23
< #node.name: node-1
---
> node.name: es_cluster_0_node_0
55c55
< #network.host: 192.168.0.1
---
> network.host: 127.0.0.1
59c59
< #http.port: 9200
---
> http.port: 9200
88a89,90
>
> xpack.ml.enabled: false

# diff /etc/elasticsearch/jvm.options /etc/elasticsearch/jvm.options_orig
22,23c22,23
< -Xms256m
< -Xmx256m
---
> -Xms1g
> -Xmx1g

# systemctl start elasticsearch
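To keep an eye on the heap cap versus what the JVM actually resides at, something like this works; a sketch, with systemd’s MainPID as the only assumption beyond the paths above (CentOS 7’s systemd predates `--value`, hence the `cut`):

```shell
# Configured heap settings, per the jvm.options edited above
if [ -r /etc/elasticsearch/jvm.options ]; then
  grep -E '^-Xm[sx]' /etc/elasticsearch/jvm.options
fi

# Resident set size (kB) of a pid, straight from /proc
rss_of() { awk '/^VmRSS/ {print $2}' "/proc/$1/status"; }

pid=$(systemctl show -p MainPID elasticsearch 2>/dev/null | cut -d= -f2)
if [ -n "$pid" ] && [ "$pid" != "0" ]; then
  rss_of "$pid"
fi
```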
logstash summary
I’ve left node name and http host unchanged. I’ll need to expose the port in due course, but let’s deal with the other issues first, whatever they turn out to be.
With 1gb RAM, I think it’s best to put the queue on disk rather than in memory, and for the moment I’ve reduced its maximum size, as I only created a 1gb filesystem for it to start with.
# diff /etc/logstash/logstash.yml /etc/logstash/logstash.yml_orig
41c41
< pipeline.workers: 2
---
> # pipeline.workers: 2
130c130
< queue.type: persisted
---
> # queue.type: memory
154c154
< queue.max_bytes: 512mb
---
> # queue.max_bytes: 1024mb

# diff /etc/logstash/jvm.options /etc/logstash/jvm.options_orig
6,7c6,7
< -Xms128m
< -Xmx128m
---
> -Xms1g
> -Xmx1g

# cat /etc/logstash/conf.d/syslog-5514.conf
input {
  tcp {
    port => 5514
    type => syslog
  }
  udp {
    port => 5514
    type => syslog
  }
}
filter {
}
output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
  }
}

# systemctl start logstash
Failed to start logstash.service: Unit not found.
A reinstall of the package fixed that, but there might be a nicer way to do it; detail below.
issues getting elasticsearch working
Elastic no longer support 32-bit, and I doubt they expect it to run on a 1gb machine, so some issues are hardly a surprise.
The package took 12 minutes to install.
jna/jni (for completeness)
tl;dr – ignore this ‘error’ in the logs.
ES failed to start after a few minutes. From /var/log/elasticsearch/es_cluster_0.log:
java.lang.UnsatisfiedLinkError: Native library (com/sun/jna/linux-arm/libjnidispatch.so) not found in resource path
I spent a while looking into this; comments online indicate it’ll run without JNA.
But, instead I went here .. don’t.
# yum provides */libjnidispatch.so
jna-3.5.2-8.el7.armv7hl : Pure Java access to native libraries
Repo        : base
Matched from:
Filename    : /usr/lib/jna/libjnidispatch.so
There’s a good explanation of the behaviour here, and sure enough, jvm.options specifies that the native library shouldn’t be used:
# use our provided JNA always versus the system one
-Djna.nosys=true
Elastic expect ES to use the JNA library it ships with. That’s not going to work on a 32-bit Raspberry Pi, so fall back to ‘it’ll run without it.’
Are you on a 32 bit system? We don’t support 32 bit
# unzip -t /usr/share/elasticsearch/lib/jna-4.5.1.jar | grep aarch
    testing: com/sun/jna/linux-aarch64/   OK
    testing: com/sun/jna/linux-aarch64/libjnidispatch.so   OK
write access to /var/lib/elasticsearch
On the other hand, it won’t cope with this:
java.lang.IllegalStateException: Failed to create node environment
Caused by: java.nio.file.AccessDeniedException: /var/lib/elasticsearch/nodes
This was caused by me mounting a filesystem on /var/lib/elasticsearch and forgetting that puppet wasn’t fixing the permissions. I’d told it to, but hadn’t got round to working out why the dependencies I’d set up prevented puppet from applying the changes.
x-pack machine learning
org.elasticsearch.ElasticsearchException: X-Pack is not supported and Machine Learning is not available for [linux-arm]; you can use the other X-Pack features (unsupported) by setting xpack.ml.enabled: false in elasticsearch.yml
- Added ‘xpack.ml.enabled: false’ to elasticsearch.yml
non loopback address
Binding to the ethernet interface, which I tried initially, caused issues.
[o.e.n.Node             ] [es_cluster_0_node_0] starting ...
[o.e.t.TransportService ] [es_cluster_0_node_0] publish_address {192.168.1.246:9300}, bound_addresses {192.168.1.246:9300}
[o.e.b.BootstrapChecks  ] [es_cluster_0_node_0] bound or publishing to a non-loopback address, enforcing bootstrap checks
[o.e.b.Bootstrap        ] [es_cluster_0_node_0] node validation exception
[1] bootstrap checks failed
[1]: system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk
[o.e.n.Node             ] [es_cluster_0_node_0] stopping ...
Documentation suggests the fix for the moment is to bind to loopback. I’d look into putting a reverse proxy in front of it before binding ES directly to an interface, if it ever becomes that much of an issue.
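If exposing ES beyond loopback ever does matter, a minimal reverse proxy in front of the loopback binding is one way to do it without tripping the bootstrap checks; a sketch assuming nginx (the addresses are placeholders for my LAN):

```
server {
    # bind to the LAN interface only, so it doesn't clash
    # with ES on 127.0.0.1:9200
    listen 192.168.1.246:9200;
    allow 192.168.1.0/24;
    deny  all;
    location / {
        proxy_pass http://127.0.0.1:9200;
    }
}
```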
startup time
On a Raspberry Pi 3 B+ the service takes just under 50 minutes from starting via systemctl to opening port 9300. Startup appears to be single-threaded, so it’s slow.
timeouts
Lots of warnings about timeouts being exceeded; I’m not bothered as long as it doesn’t cause an abend.
failed to process cluster event (put-pipeline-xpack_monitoring_6) within 30s
cluster state update task [maybe generate license for cluster] took [4.4m] above the warn threshold of 30s
committed version [12] source [maybe generate license for cluster]])] took [2.2m] above the warn threshold of 30s
issues getting logstash working
no boot script
Seems like this issue: https://github.com/elastic/logstash/issues/9403
However, puppet installed logstash for me, with a dependency on OpenJDK, so what went wrong? The logs show everything was there in the right order.
It could be the issue at the end of that thread: the installer kicks off a java process with -Xmx1g -Xms1g, which is exciting on a 1gb machine. I monitored the reinstall, and in practice it didn’t get near that in RSS, only virtual. Puppet swallowed the output, so who knows.
A reinstall fixed it.
  Installing : 1:logstash-6.5.4-1.noarch        1/1
Using provided startup.options file: /etc/logstash/startup.options
OpenJDK Zero VM warning: TieredCompilation is disabled in this release.
Successfully created system startup script for Logstash
  Verifying  : 1:logstash-6.5.4-1.noarch

# systemctl status logstash
● logstash.service - logstash
   Loaded: loaded (/etc/systemd/system/logstash.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
There might be another way to do it:
# head -9 /etc/logstash/startup.options | grep '^# '
# These settings are ONLY used by $LS_HOME/bin/system-install to create a custom
# startup script for Logstash and is not used by Logstash itself. It should
# automagically use the init system (systemd, upstart, sysv, etc.) that your
# Linux distribution uses.
# After changing anything here, you need to re-run $LS_HOME/bin/system-install
# as root to push the changes to the init script.
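Going by those comments, a gentler fix than reinstalling would presumably be to re-run the generator by hand (untested here; `LS_HOME` as laid down by the RPM):

```shell
# Regenerate the unit file the same way the RPM scriptlet does
LS_HOME=/usr/share/logstash
if [ -x "$LS_HOME/bin/system-install" ]; then
  "$LS_HOME/bin/system-install"   # writes /etc/systemd/system/logstash.service
  systemctl daemon-reload
fi
```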
memory and start time
Unlike elasticsearch, logstash doesn’t start logging immediately.
Starting both in parallel extended the ES start time to an hour. At that point, logstash hadn’t started logging or begun listening on any ports.
After 1hr 10 mins, logstash had an RSS of 485000 kB (rather more than the 256mb of heap allocated) and in total there was only around 100,000 kB left for Linux, with no prospect of it backing off yet.
I restarted it with the heap configured as 128m. This didn’t make much difference; once logstash was up and running it had an RSS of 637000 kB. There’s a bit more about this in the summary.