I am very fortunate in that many of my hobbies and interests at home are in close alignment with my career in industrial automation. One of the foundational components of many of these interests, specifically around IoT and Data Science, is a home server for edge computing. I have maintained a Linux-based home server since finishing university and getting a dedicated home internet connection. This server was typically salvaged from my last upgraded desktop computer but, as of a few years ago, that pool of hardware had run dry. Between work laptops and a personal Chromebook, I haven't brought a new desktop computer into the house in well over 5 years.
With the end of hand-me-down hardware, I decided to invest in dedicated home server hardware and take advantage of virtualization to split my servers by function rather than maintaining one instance to do everything. This includes the edge applications in my IoT system as well as web-hosting and software development. This setup has proven to be useful in my recent data science and analytics courses and interests. Many of the tools for data science, like Jupyter Notebook and RStudio, can be implemented as web services which means I can consolidate my development to one place and I'm no longer tied to an operating system or a specific desktop instance. This helps to make working from the Chromebook a feasible option.
Why not go full cloud for the non-edge applications? To implement just the three core Linux instances on GCP or AWS with some degree of provisioned SSD space, the cost is estimated at around $60 to $100 CAD per month. This implies that the payback period on this home server is only 6 to 10 months. There is no doubt that the cloud instances would be orders of magnitude more reliable, but I would be paying for features I don't need; none of this is critical infrastructure. While I hope to get more than 3 years of use out of this hardware, I could replace it annually and still come out ahead. Recently I have been working through a few projects on GCP and AWS, but I have been trying to limit my use to the boundaries of their free tiers. If anything, I would consider a hybrid-cloud implementation and keep as much computing on the edge as practicable.
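The payback arithmetic can be sketched in a few lines of Python. The $600 CAD hardware figure is an assumption, back-calculated from the 6 to 10 month range rather than taken from an itemized total, and the function name is purely illustrative. Electricity and my own time are ignored.

```python
def payback_months(hardware_cost_cad: float, monthly_cloud_cost_cad: float) -> float:
    """Months of avoided cloud fees needed to cover the hardware cost."""
    if monthly_cloud_cost_cad <= 0:
        raise ValueError("monthly cloud cost must be positive")
    return hardware_cost_cad / monthly_cloud_cost_cad


# Assumption: roughly $600 CAD of hardware, inferred from the payback range
# quoted in the text, not from an itemized parts list.
HARDWARE_COST_CAD = 600.0

# Low and high ends of the estimated monthly cloud cost from the text.
for monthly in (60.0, 100.0):
    months = payback_months(HARDWARE_COST_CAD, monthly)
    print(f"${monthly:.0f}/month cloud estimate: payback in {months:.0f} months")
```

Under the same assumption, even the low cloud estimate adds up to $720 CAD over twelve months, which is why an annual hardware replacement would still come out ahead.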
The following hardware makes up the components of this home server and the costs are in $CAD from January 2019.
In addition to the core components, there are two identical 1TB USB drives connected to the system for file archiving in a soft RAID 1 (mirror) configuration. The system is plugged into a small consumer-grade UPS for surge protection and to handle small power bumps. If you have an old platter HDD you would like to get some use out of, either part it out for some cool magnets or start a History of Computing shelf display. Buy an SSD for your home server!
The hypervisor is the system layer that allows for the implementation of virtual machines. At a high level, virtualization can either be hardware-based, where the virtual machine is largely ignorant that it's not running directly on dedicated hardware, or software-based, where the operating system, desktop or a specific application is tailored to work under supervisory software. VMware ESXi, Microsoft Hyper-V, and Oracle VirtualBox are common examples of hardware virtualization, whereas Virtual Desktop Infrastructure (VDI) implementations and container platforms such as Docker, typically orchestrated with Kubernetes, are examples of software-based virtualization. Industrial Control Systems (ICS) still largely implement hardware virtualization and the majority of those I've been exposed to are VMware based. At work, we are a VMware partner to support these efforts. Due to this familiarity and an available free version, my home server hypervisor is VMware ESXi.
The product offerings from VMware have exploded over the last few years as they have gone heavily into cloud and hybrid-cloud deployments, among other things. What we're looking for is the VMware vSphere Hypervisor (a.k.a. ESXi). It's installed as an evaluation, and you then register for a free license and apply it to your system. This write-up is more of an as-built than a how-to guide but, fortunately, there is a ton of information available on setting up ESXi on a NUC if you search for "VMware NUC homelab". The Intel NUC hardware is widely supported by VMware and aligns nicely with the limitations of the free version of the ESXi hypervisor. As of this writing, it appears that ESXi version 7.0 still supports the 5th generation Intel NUC platform.
Once installed, all interaction with the hypervisor is through the embedded HTML5 ESXi web interface or through SSH, which is enabled and disabled as required. The use of vSphere Client or VMware Workstation is not required.
As the intent of this server was not specifically to serve as a full VMware homelab beyond the features of the free version of ESXi, there was no consideration for a vCenter server, vSphere High Availability, vMotion, vStorage or vSAN. These are all components that should be reviewed and considered for an enterprise VMware deployment where some degree of hardware redundancy is included.
Data that doesn't exist in two places doesn't really exist! For a home server deployment, every hardware component in the NUC is a single point of failure for every server deployed on it. With no other mitigation, an unrecoverable failure of the SSD would be a complete system rebuild. That said, I honestly don't know if there is a recoverable failure of an SSD. RAID on the SSD would reduce this risk considerably but there is still the environmental and physical risk that comes with a single location susceptible to flooding, fire, theft, power surges, etc. For these reasons, it's best to accept the fragility of your server and make sure your backup regime aligns with how much work you're willing to put in when (not if) you lose everything. The other major consideration is how long you can afford your system to be down. In my case, other than a very frustrating experience, the sun will still rise if my server completely dies and if it dies from an environmental issue, I've got bigger problems to worry about.
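One way to act on the "two places" rule is to confirm that a backup copy really matches its original. The following is a minimal sketch using SHA-256 checksums from the Python standard library; the function names are hypothetical and this is not a description of my actual backup scripts.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(original: Path, backup: Path) -> bool:
    """True only if the backup exists and its contents are identical to the original."""
    return backup.is_file() and sha256_of(original) == sha256_of(backup)
```

Calling `backup_matches(Path("site.tar.gz"), Path("/mnt/archive/site.tar.gz"))` (both paths are placeholders) returns `False` if the copy is missing or differs. A matching checksum only shows the copy is intact; it says nothing about whether that copy lives somewhere safe from the same flood, fire or theft.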
The design and distribution of the virtual machines were based on focusing functionality from my previous dedicated server into distinct fit-for-purpose instances. Sometimes things go wrong or directions need to be changed and it's nice to limit this impact to a specific server. The cost associated with this configuration is the CPU and RAM dedicated to redundant operating system functions across the servers. This is mitigated by primarily using headless Linux distributions which tend to sip hardware resources rather than gulping them down.
Drive space can be expanded and most of my applications spike but never overwhelm the available CPU; the limitation on this system is RAM. I have not had an issue with assigning the recommended 2 GB of RAM to each Linux instance and my Windows 10 machine is 'comfortable' with 6 GB of RAM. My Kali Linux instance runs with a GUI and is assigned 4 GB of RAM but is powered down when not in use. I try to stagger any CPU-intensive scheduled tasks across all servers.
My recommendation for designing your system is to plan out your VMs based on the RAM they will consume and then plan your drive space to comfortably fit this configuration. Then, before you hit Add to cart, double it or at least take the next size up. Keep the RAM as the bottleneck on your system; you'll beat yourself up if you run out of drive space before you run out of RAM.
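A RAM-first plan like this can be sketched as a simple tally. The per-VM allocations below come from the figures above; the `VM` class, the short VM names and the assumption that the Windows workstation is always powered on are illustrative only, and hypervisor overhead is not modelled.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    ram_gb: int
    always_on: bool = True  # assumption: everything except Kali stays powered on


def ram_committed(vms, include_powered_down: bool = False) -> int:
    """Sum the RAM assigned to VMs, optionally counting ones that are normally off."""
    return sum(vm.ram_gb for vm in vms if vm.always_on or include_powered_down)


vms = [
    VM("web", 2),
    VM("iot", 2),
    VM("dev", 2),
    VM("windows10", 6),
    VM("kali", 4, always_on=False),  # powered down when not in use
]

print("Always-on RAM (GB):", ram_committed(vms))
print("Everything running (GB):", ram_committed(vms, include_powered_down=True))
```

Comparing both totals against the RAM physically installed in the host, with some left over for the hypervisor itself, shows how much headroom remains before adding another instance.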
The web server hosts the public-facing web pages and is a headless Ubuntu system configured as a LAMP stack (Linux, Apache, MySQL & PHP). It includes the following components:
With the use of the Apache reverse proxy, the web server can manage the HTTPS encryption and domain certificates for any content behind the proxy. Sites behind the proxy can run over plain HTTP and will be encrypted with a valid certificate once they pass through the web server. This isn't ideal for a defence-in-depth network security strategy, since traffic between the proxy and the servers behind it is unencrypted, but it simplifies the management and configuration of those servers.
The IoT server facilitates the majority of the edge computing applications. It manages my IoT projects and home automation applications. This is a headless Ubuntu server with the following main components:
The development server hosts the programming platforms and is another headless Ubuntu system with the following packages installed:
With JupyterHub and RStudio, the development is web-based and can be done from any machine on the network with a browser. Also, using the Visual Studio Code Remote-SSH extension, this machine can serve as the central repository for all development with VS Code. This allows me to jump between my Windows laptop and my Chromebook seamlessly. The development server runs Apache so data visualizations can be published to the public-facing website without having to move files around. It also hosts an internal replica of the website for testing upgrades, patches and plugins on a sandbox version before touching the production site.
The following two workstations live on the server as a matter of convenience and serve as virtual desktops.
The Kali Linux workstation is my only Linux distribution with a GUI. This was created to support a couple of courses I took on cybersecurity and penetration testing. I normally keep this powered down but still use it to do security spot checks on work servers. It was also fun to mess around with while watching "Mr. Robot" and poking back at some of the IP addresses that were logging attacks to my router (in the name of education of course).
The Windows 10 workstation was the only software cost on my server and pulls more than its fair share of resources. I'm trying to migrate to more network-based software and rely less on applications loaded on a specific desktop. This Windows instance is my fallback for when that simply doesn't work with some of the software I need to use. With growing Linux support on ChromeOS and improved Android application performance, I'm able to do most of my tasks on a Chromebook. When I need a Windows O/S or application, I simply RDP to this workstation. With the VPN software on the web server, I can do this from anywhere.
Why not just a Raspberry Pi or two? A Raspberry Pi is a sufficient edge appliance to handle a few specific edge functions for a few specific applications. Expanding this to a server with more horsepower gives the flexibility of spinning up a Raspberry Pi's worth of computing resources as you need it. In my projects where I need I/O or a dedicated screen and GUI, I have used Raspberry Pis as nested IoT appliances where this server acts as more of an Edge Gateway. Following this scheme further, the Raspberry Pi is often overkill and these IoT applications can be implemented on cheaper IoT hardware like the ESP8266 and ESP32 based platforms. A robust edge server can be used to manage the security holes and offload some of the CPU heavy lifting that comes with using the cheaper and less capable IoT devices.