ELK 6.5 stack on Raspberry Pi / CentOS 7

Summary

The goal was to create a syslog server on a Raspberry Pi, so I can ship all my logs to a low power machine that’ll be on all the time.

tl;dr – a bit optimistic. The limited memory would probably require Elasticsearch (ES) and Logstash (LS) on different machines. I think Logstash really wants 0.5-1.0gb of heap, which is going to be hard on a Pi, given that it also takes about 0.5gb of RSS on top of the heap.

It’s possible to get ES and LS crammed into a single Raspberry Pi 3 B+ (~1gb RAM). With little or no configuration, and no data yet, memory use is as follows (total memory being 949416):

  • elasticsearch RSS: 150680
  • logstash RSS: 636272
  • swap: 389236 used
  • free memory: 10244
  • cache: 75036
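As a rough sketch of the arithmetic behind those figures, assuming they are all in the same unit (kB, as ps and free report by default), the share of memory taken by the two Java processes can be worked out like this. The function name pct_of_total is made up for illustration.

```shell
# Hypothetical helper: percentage of total memory taken by a given amount.
# Both arguments must be in the same unit.
pct_of_total() {
    awk -v used="$1" -v total="$2" 'BEGIN { printf "%.1f\n", used * 100 / total }'
}

# Figures from the list above: ES RSS + LS RSS against the 949416 figure.
pct_of_total $((150680 + 636272)) 949416
```

On a live system the RSS values could come from something like `ps -C java -o rss=,args=`, though that is not how the numbers above were necessarily collected.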

Additionally, LS is consuming CPU all the time; the Pi has a 15 min load average of 1.27. Probably because:

Often times CPU utilization can go through the roof if the heap size is too low, resulting in the JVM constantly garbage collecting.

Startup time is extended: on their own, ES takes 50 minutes and LS takes 1hr 35m. I’ve only got LS started once so far, and it restarted of its own accord at some point overnight, which points to other problems.

Kibana: I couldn’t track down node.js for 32bit ARM CentOS 7. Elastic no longer support 32bit. And anyway, there’s no memory left. I’d already adjusted my plan towards running it elsewhere just to do the visualisation, with ES and LS running all the time on the Pi.

An interesting learning exercise. I understand Graylog replaces LS and Kibana in the stack (but needs ES). It also needs MongoDB, which yum didn’t offer up. Looks like I might waste a bunch of time going there.

So, I’ll fall back to basic rsyslog on the Pi; maybe with a Beats agent shipping the logs to a VM that pops up every so often. Follow the logserver tag to find out.

Packages

This is getting puppetised, so I’m not downloading things and copying them around.

Repo is the same for elasticsearch and logstash, documented here and here.

[elasticsearch-6.x]
name=Elasticsearch repository for 6.x packages
baseurl=https://artifacts.elastic.co/packages/6.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

Logstash specifies java 1.8, and NOT java 9. So, I’ll go with that rather than OpenJDK 11 which was also available.

rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
yum install java-1.8.0-openjdk-headless elasticsearch logstash

Versions for reference:

# rpm -qa | egrep 'logstash|elast|openjdk'
elasticsearch-6.5.4-1.noarch
java-1.8.0-openjdk-headless-1.8.0.191.b12-1.el7_6.armv7hl
logstash-6.5.4-1.noarch

elasticsearch summary

Took guidance from blog entries by MW Preston.

  • Cluster/node name
  • Out of the box, 6.x doesn’t like being bound to an interface, so have just made it explicitly loopback.
  • xpack change is needed on arm cpus.

More detail below.

I’ve read that systemd gets the java heap size from somewhere else, but editing jvm.options worked! I initially set 350mb for both the initial and max heap (the documentation states they should be the same); the resulting RSS immediately after it finished starting was 460756, so I cut it back to 256mb. I’ve still got to cram logstash and probably beats in there as well.
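Since the initial and max heap are meant to match, a quick check of a jvm.options file might look like the sketch below. The function name heap_settings_match is hypothetical, and it only looks at uncommented -Xms/-Xmx lines in the file it is given; it knows nothing about heap sizes set elsewhere (for example via an environment variable).

```shell
# Hypothetical helper: succeed only if -Xms and -Xmx in a jvm.options file match.
heap_settings_match() {
    local file="$1" xms xmx
    # Take the last uncommented line of each setting.
    xms=$(sed -n 's/^-Xms//p' "$file" | tail -n 1)
    xmx=$(sed -n 's/^-Xmx//p' "$file" | tail -n 1)
    echo "Xms=${xms:-unset} Xmx=${xmx:-unset}"
    [ -n "$xms" ] && [ "$xms" = "$xmx" ]
}

# Usage on the Pi (not run here):
#   heap_settings_match /etc/elasticsearch/jvm.options
```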

# diff /etc/elasticsearch/elasticsearch.yml_orig /etc/elasticsearch/elasticsearch.yml
17c17
< #cluster.name: my-application
---
> cluster.name: es_cluster_0
23c23
< #node.name: node-1
---
> node.name: es_cluster_0_node_0
55c55
< #network.host: 192.168.0.1
---
> network.host: 127.0.0.1
59c59
< #http.port: 9200
---
> http.port: 9200
88a89,90
> 
> xpack.ml.enabled: false

# diff /etc/elasticsearch/jvm.options /etc/elasticsearch/jvm.options_orig
22,23c22,23
< -Xms256m
< -Xmx256m
---
> -Xms1g
> -Xmx1g
# systemctl start elasticsearch

logstash summary

I’ve left node name and http host unchanged. I’ll need to expose the port in due course, but let’s deal with the other issues first, whatever they turn out to be.

With 1gb RAM, I think it’s best to keep the queue on disk rather than in memory. For the moment I’ve reduced the maximum queue size to 512mb, as I only created a 1gb filesystem to start with.

# diff /etc/logstash/logstash.yml /etc/logstash/logstash.yml_orig 
41c41
< pipeline.workers: 2
---
> # pipeline.workers: 2
130c130
< queue.type: persisted
---
> # queue.type: memory
154c154
< queue.max_bytes: 512mb
---
> # queue.max_bytes: 1024mb
# diff /etc/logstash/jvm.options /etc/logstash/jvm.options_orig
6,7c6,7
< -Xms128m
< -Xmx128m
---
> -Xms1g
> -Xmx1g

# cat /etc/logstash/conf.d/syslog-5514.conf 
input {
  tcp {
    port => 5514
    type => syslog
  }
  udp {
    port => 5514
    type => syslog
  }
}
filter {
}
output {
  elasticsearch { hosts => ["127.0.0.1:9200"] }
}
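Once the tcp input is listening, one way to push a test line at it is sketched below. This is an assumption-laden example rather than anything from the setup above: the function names are made up, the message is only loosely syslog-shaped, and it relies on bash’s /dev/tcp redirection being available.

```shell
# Syslog priority value: facility * 8 + severity (user=1, info=6 gives 14).
syslog_pri() {
    echo $(( $1 * 8 + $2 ))
}

# Hypothetical helper: send one test line to the tcp input on port 5514.
# Needs bash (for /dev/tcp) and something listening on the port.
send_test_line() {
    local host="${1:-127.0.0.1}" port="${2:-5514}"
    printf '<%s>%s %s test: hello from the pi\n' \
        "$(syslog_pri 1 6)" "$(date '+%b %e %H:%M:%S')" "$(hostname)" \
        > "/dev/tcp/$host/$port"
}

# Usage (not run here):
#   send_test_line 127.0.0.1 5514
#   curl -s 'http://127.0.0.1:9200/_cat/indices?v'
```

With an empty filter block the line is stored unparsed, so the curl call is only there to see whether an index turns up at all.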

# systemctl start logstash
Failed to start logstash.service: Unit not found.

A reinstall of the package fixed that, but there might be a nicer way to do it; detail below.

issues getting elasticsearch working

Elastic no longer support 32 bit, and I doubt they expect it running on a 1gb machine. So, some issues are hardly a surprise.

The package took 12 minutes to install.

jna/jni (for completeness)

tl;dr – ignore this ‘error’ in the logs.

ES failed to start after a few minutes. From /var/log/elasticsearch/es_cluster_0.log:

java.lang.UnsatisfiedLinkError: Native library (com/sun/jna/linux-arm/libjnidispatch.so) not found in resource path

I spent a while looking into this; comments online indicate it’ll run without JNA.

But instead I went looking for a system copy. Don’t.

# yum provides */libjnidispatch.so 
jna-3.5.2-8.el7.armv7hl : Pure Java access to native libraries
Repo : base
Matched from:
Filename : /usr/lib/jna/libjnidispatch.so

There’s a good explanation of behaviour here and, sure enough, jvm.options specifies that the system JNA library shouldn’t be used.

# use our provided JNA always versus the system one
-Djna.nosys=true

Elastic expect ES to use the JNA library bundled with it. The error above shows the bundled jar has nothing under linux-arm, so that’s not going to work on a 32bit Raspberry Pi; fall back to ‘it’ll run without it.’

Are you on a 32 bit system? We don’t support 32 bit

# unzip -t /usr/share/elasticsearch/lib/jna-4.5.1.jar | grep aarch
testing: com/sun/jna/linux-aarch64/ OK
testing: com/sun/jna/linux-aarch64/libjnidispatch.so OK
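To check the jar for the machine’s own architecture rather than grepping for aarch by hand, something like this sketch could be used. The function name jna_dir_for is hypothetical, and the mapping only covers the two ARM directory names that appear in the error message and the unzip output above.

```shell
# Hypothetical helper: map `uname -m` output to the JNA resource directory.
# Only the two ARM cases seen in this post are covered.
jna_dir_for() {
    case "$1" in
        aarch64) echo "com/sun/jna/linux-aarch64/" ;;
        arm*)    echo "com/sun/jna/linux-arm/" ;;
        *)       echo "unknown"; return 1 ;;
    esac
}

# Usage on the Pi (not run here):
#   unzip -l /usr/share/elasticsearch/lib/jna-4.5.1.jar | grep "$(jna_dir_for "$(uname -m)")"
```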

write access to /var/lib/elasticsearch

On the other hand, it won’t cope with this:

java.lang.IllegalStateException: Failed to create node environment
Caused by: java.nio.file.AccessDeniedException: /var/lib/elasticsearch/nodes

This was caused by me mounting a filesystem on /var/lib/elasticsearch and forgetting that puppet wasn’t fixing the permissions. I’d told it to, but hadn’t got round to working out why the dependencies I’d set up prevented puppet applying the changes.
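A manual check for this kind of ownership problem could look like the sketch below. The function name owned_by is made up, and the chown in the usage comment assumes the package runs ES as the elasticsearch user and group.

```shell
# Hypothetical helper: succeed if the directory is owned by the given user.
# Uses GNU stat (-c), as found on CentOS.
owned_by() {
    local dir="$1" user="$2"
    [ "$(stat -c %U "$dir")" = "$user" ]
}

# Usage on the Pi (not run here):
#   owned_by /var/lib/elasticsearch elasticsearch || \
#       chown elasticsearch:elasticsearch /var/lib/elasticsearch
```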

x-pack machine learning

org.elasticsearch.ElasticsearchException: X-Pack is not supported and Machine 
Learning is not available for [linux-arm]; you can use the other X-Pack 
features (unsupported) by setting xpack.ml.enabled: false in elasticsearch.yml

  • Added ‘xpack.ml.enabled: false’ to elasticsearch.yml

non loopback address

Binding to the ethernet interface, which I tried initially, caused issues.

[o.e.n.Node ] [es_cluster_0_node_0] starting ...
[o.e.t.TransportService ] [es_cluster_0_node_0] publish_address {192.168.1.246:9300}, 
bound_addresses {192.168.1.246:9300}
[o.e.b.BootstrapChecks ] [es_cluster_0_node_0] bound or publishing to a non-loopback 
address, enforcing bootstrap checks
[o.e.b.Bootstrap ] [es_cluster_0_node_0] node validation exception
[1] bootstrap checks failed
[1]: system call filters failed to install; check the logs and fix your 
configuration or disable system call filters at your own risk
[o.e.n.Node ] [es_cluster_0_node_0] stopping ...

Documentation suggests the fix for the moment is to bind to loopback. I’d look into using a reverse proxy before binding ES to an interface, if it turns out to matter that much.

startup time

On a Raspberry Pi 3 B+ the service takes just under 50 minutes from starting via systemctl to opening port 9300. Startup seems to be single threaded, so it’s slow.
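One way to time a start like this is to poll the port until it opens. The sketch below is only an illustration: wait_for_port is a made-up name, it needs bash for /dev/tcp and $SECONDS, and the 5 second poll interval is arbitrary.

```shell
# Hypothetical helper: poll a TCP port until it opens or the timeout (in
# seconds) expires. Requires bash for /dev/tcp and $SECONDS.
wait_for_port() {
    local host="$1" port="$2" timeout="$3" start=$SECONDS
    while [ $(( SECONDS - start )) -lt "$timeout" ]; do
        # The subshell opens the connection and closes it again on exit.
        if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
            echo "port $port open after $(( SECONDS - start ))s"
            return 0
        fi
        sleep 5
    done
    echo "port $port still closed after ${timeout}s"
    return 1
}

# Usage on the Pi (not run here), allowing an hour:
#   systemctl start elasticsearch && wait_for_port 127.0.0.1 9300 3600
```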

timeouts

Lots of warnings about timeouts being exceeded. I’m not bothered as long as they don’t cause it to abend.

failed to process cluster event (put-pipeline-xpack_monitoring_6) within 30s
cluster state update task [maybe generate license for cluster] took [4.4m] 
above the warn threshold of 30s
committed version [12] source [maybe generate license for cluster]])] took [2.2m] 
above the warn threshold of 30s

issues getting logstash working

no boot script

Seems like this issue: https://github.com/elastic/logstash/issues/9403

However, puppet installed logstash for me, with a dependency on openjdk, so what went wrong? The logs show everything was there in the right order.

It could be the issue at the end of the thread: the installer kicks off a java process with -Xmx1g -Xms1g, which is exciting on a 1gb machine. I monitored the reinstall, and in practice it didn’t get near this in RSS, only virtual. Puppet swallowed the output, so who knows.

A reinstall fixed it.

Installing : 1:logstash-6.5.4-1.noarch 1/1 
Using provided startup.options file: /etc/logstash/startup.options
OpenJDK Zero VM warning: TieredCompilation is disabled in this release.
Successfully created system startup script for Logstash
Verifying : 1:logstash-6.5.4-1.noarch 

# systemctl status logstash
● logstash.service - logstash
Loaded: loaded (/etc/systemd/system/logstash.service; disabled; vendor preset: disabled)
Active: inactive (dead)

There might be another way to do it:

# head -9 /etc/logstash/startup.options | grep '^# '
# These settings are ONLY used by $LS_HOME/bin/system-install to create a custom
# startup script for Logstash and is not used by Logstash itself. It should
# automagically use the init system (systemd, upstart, sysv, etc.) that your
# Linux distribution uses.
# After changing anything here, you need to re-run $LS_HOME/bin/system-install
# as root to push the changes to the init script.
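Going by those comments, a check-then-suggest step might look like the sketch below. It is untested against a real broken install: check_logstash_unit is a made-up name, /usr/share/logstash is assumed to be $LS_HOME for the RPM, and the helper only prints the command rather than running it.

```shell
# Hypothetical helper: report whether a logstash unit file exists in the given
# directory and, if not, print the command the startup.options comments point to.
check_logstash_unit() {
    local unit_dir="${1:-/etc/systemd/system}"
    local ls_home="${2:-/usr/share/logstash}"   # assumed $LS_HOME for the RPM install
    if [ -f "$unit_dir/logstash.service" ]; then
        echo "present"
    else
        echo "missing: run $ls_home/bin/system-install as root"
        return 1
    fi
}

# Usage on the Pi (not run here):
#   check_logstash_unit
```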

memory and start time

Unlike elasticsearch, logstash doesn’t start logging immediately.

Starting both in parallel extended ES start time to an hour. At this point, logstash hadn’t started logging or started listening on any ports.

After 1hr 10 mins, logstash had an RSS of 485000 (rather more than the 256mb of heap allocated) and in total, there was only 100,000 bytes left for linux, with no prospect of it backing off yet.

I restarted it with the heap configured as 128m. This didn’t make much difference; once logstash was up and running it had an RSS of 637000. There’s a bit more in the summary about this.
