ctrl-alt-del.cc: 2018

Saturday, 17 November 2018

Slimming down 1-node Elastic cluster

If you have ever run Elasticsearch quick and dirty - a single node with the default config - you will have noticed that the cluster health always shows yellow and that it's a proper resource hog. Well, yes, it will be, especially in the default config, as my good friend Justin Borland pointed out.

I'm a complete newbie when it comes to Elastic. I have deployed a few nodes in Docker containers to quickly ingest data and dig in with Kibana, but that was it. Luckily for me, Justin is an absolute beast when it comes to all things Elastic - he just looked at my node and explained on the spot what was wrong with it and how to fix and improve it.

Basically, my default setup was running 5 shards for each of the indices stored in the system, and I already had quite a few daily indices in there - we're talking months of DNS research data and web spider runs across thousands of websites... all repeated daily. This means that, to be really effective, the optimisation also needs to deal with what is already in there, not just the new data I will be adding.

Plan:

Change the default template to use only 1 shard and 0 replicas - it's a single-node deployment, so anything more complex doesn't make much sense, and replicas that can never be assigned to a second node are what keep the health yellow.
Use the reindex API to rewrite all of the existing indices as single-shard versions, then delete the old 5-shard ones - the number of primary shards cannot be changed in place on an existing index, so the data has to be copied into a new one.
My indices are treated as append-only on the day, then become read-only, so we can merge the segments - leaving the technical details aside, this will mean no random access later, just linear file reads, but that's perfectly acceptable in my particular use scenario.

Let's do it!
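
The three steps of the plan can be sketched as plain HTTP requests. This is a minimal sketch, not the exact commands I ran: it assumes an Elasticsearch 6.x-era node at http://localhost:9200 without authentication, uses the legacy _template endpoint, and the template name (single_node_defaults), the "-1shard" suffix and the example index name are made up for illustration. The functions only build the requests; nothing is sent unless send() is called.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local single node, no auth


def template_request(name="single_node_defaults"):
    """Catch-all index template: 1 primary shard, 0 replicas.

    The template name is hypothetical; "index_patterns" is the 6.x syntax.
    """
    body = {
        "index_patterns": ["*"],
        "order": 0,
        "settings": {"number_of_shards": 1, "number_of_replicas": 0},
    }
    return "PUT", f"/_template/{name}", body


def reindex_request(old_index, new_index):
    """Copy all documents from the old index into a new one."""
    body = {"source": {"index": old_index}, "dest": {"index": new_index}}
    return "POST", "/_reindex", body


def forcemerge_request(index):
    """Merge a read-only index down to a single segment."""
    return "POST", f"/{index}/_forcemerge?max_num_segments=1", None


def delete_request(index):
    """Drop the old 5-shard index once the copy has been verified."""
    return "DELETE", f"/{index}", None


def plan_for(old_index, suffix="-1shard"):
    """Ordered requests for one existing index (suffix is an assumption)."""
    new_index = old_index + suffix
    return [
        reindex_request(old_index, new_index),
        forcemerge_request(new_index),
        delete_request(old_index),
    ]


def send(method, path, body=None, base_url=ES_URL):
    """Send one request to the node and return the decoded JSON reply."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        base_url + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Dry run: print the requests instead of sending them.
    for method, path, body in [template_request()] + plan_for("dns-2018.11.01"):
        print(method, path, json.dumps(body) if body is not None else "")
```

The template has to go in first, so that the new index created by the reindex picks up the 1 shard / 0 replicas settings. Check the document counts of the old and new index before sending the delete.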

Friday, 2 November 2018

Solution - Rancher 2 (k8s), private registry, self-signed certificates

Since Rancher switched to Kubernetes in version 2.x, I have been exposed to a lot of the stupidity and limitations k8s introduced, but I can live with that, at least for the moment... What I couldn't accept was that I could no longer use my private registry (with a self-signed certificate), which worked perfectly fine with the older Rancher (1.6 - before the move to k8s).

That is now resolved!

My cluster setup

Rancher 2 cluster (based on Kubernetes), all running on latest RancherOS
Private registry available only within the LAB network - hence self-signed certificate
Registry has an internal host name, resolvable via internal DNS server
Registry does not require user accounts, so there is no need for credentials, but the self-signed certificate prevents it from working, resulting in the following error when an image is pulled:

x509: certificate signed by unknown authority

Dead ends

First of all, please ignore the RancherOS documentation - the last version I found was for 1.2, while the current RancherOS is 1.4.2... anyway, it no longer works (it did for older RancherOS and Rancher 1.6, but the new Rancher is more Kubernetes than anything else). In my research I also read a bunch of bug reports, feature requests, Stack Exchange articles, etc... mostly a waste of time, but they gave me a good idea of which rabbit holes to avoid. Some of the more useful reads are here and here; I also have a feeling this will be useful for me quite soon.
Another catch I noticed was that when I followed the RancherOS docs above, the registry CA certificate was overwritten with something else on node reboot.

Solution (a.k.a "works for me")

Go old school Linux admin style:

SSH to the RancherOS node (user is rancher@<node>), with your private CA certificate at hand.
As user rancher, try docker pull <registry:port>/<my image> - you should get a CA error.
Check your /etc/resolv.conf - mine was regularly overwritten by dhcp, which was not writing the name servers correctly. This should be easily fixed by writing what you want to /etc/resolv.conf.tail (in the hope that dhcp will append it when it regenerates resolv.conf).
Now the key element - edit the OS-wide trusted CA list and add your CA certificate there (hint hint - it may disappear after sudo ros os upgrade, but this can be fixed with sudo chattr +i /etc/resolv.conf). Running vi /etc/ssl/certs/ca-certificates.crt and a copy'n'paste does the trick!
Try docker pull again - this time it worked for me.
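
The copy'n'paste step can also be scripted so it is easy to repeat when the bundle gets replaced. Below is a minimal sketch, assuming a Python interpreter is available on the node (which may not be the case on the stock RancherOS console) and that it runs with enough privileges to write the bundle. The function names are made up for illustration; the bundle path is the one used above.

```python
from pathlib import Path

BUNDLE = Path("/etc/ssl/certs/ca-certificates.crt")  # OS-wide trusted CA list


def _squash(text):
    """Drop all whitespace so line-ending differences do not matter."""
    return "".join(text.split())


def ca_in_bundle(ca_pem, bundle_text):
    """True if the PEM certificate already appears in the bundle text."""
    return _squash(ca_pem) in _squash(bundle_text)


def append_ca(ca_path, bundle_path=BUNDLE):
    """Append the CA certificate to the bundle unless it is already there.

    Returns True if the bundle was modified, False if nothing was needed.
    """
    ca_pem = Path(ca_path).read_text()
    if "-----BEGIN CERTIFICATE-----" not in ca_pem:
        raise ValueError(f"{ca_path} does not look like a PEM certificate")

    bundle = Path(bundle_path)
    current = bundle.read_text() if bundle.exists() else ""
    if ca_in_bundle(ca_pem, current):
        return False

    # Make sure the new block starts on its own line.
    separator = "" if current == "" or current.endswith("\n") else "\n"
    with bundle.open("a") as handle:
        handle.write(separator + ca_pem.strip() + "\n")
    return True
```

Calling append_ca("my-ca.crt") a second time is a no-op, so it is safe to re-run after a reboot or an upgrade to check that the certificate is still in place (my-ca.crt is a placeholder for your own CA file).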