[Fixed?] Tiny, But Heroic Oracle Free Tier VPS Runs Out Of Memory During DNF Upgrade!
Oracle Free Tier Powers The MetalVPS Website
The MetalVPS website is served by a tiny, but heroic Oracle Cloud Free Tier VPS. The specs are:
VM.Standard.E2.1.Micro.
OCPU count: 1
Memory (GB): 1
Boot Volume Size: 47 GB
Network bandwidth (Gbps): 0.48
Launched: Sun, Jul 17, 2022, 23:25:22 UTC
The instance has been up for the best part of a year:
[opc@instance-20220717-1620 ~]$ uptime
15:27:35 up 301 days, 16:01, 1 user, load average: 0.06, 0.08, 0.02
[opc@instance-20220717-1620 ~]$
Recent Downtime Alerts From Hetrix Tools
Recently I started receiving alerts from Hetrix Tools about the MetalVPS website being down briefly, just a few minutes at a time, but repeatedly. I confirmed that I couldn't access the website from my Chromebook during the periods when Hetrix said the site was down. Additionally, my ssh connection to the instance stopped working when Hetrix said the site was down. Luckily, I had redundant backups, so all the files were safe.
Checking the logs suggested that periodic DNF updates were causing the tiny, but valiant VPS to run out of memory. When there was no memory, the web server and the ssh server couldn't work. After the OOM killer decided it was time to kill the DNF update, then the web server and the ssh server could serve again.
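For anyone wanting to make the same check, here is a minimal sketch of how the kernel log could be searched for OOM killer activity. The helper name oom_events is made up for illustration, and the patterns assume the usual kernel wording ("invoked oom-killer" and "Out of memory: Killed process"), which can vary between kernel versions.

```shell
#!/usr/bin/env bash
# Sketch only: oom_events is a hypothetical helper, not part of any tool.
# It reads log text on stdin and keeps lines that look like OOM killer activity.
oom_events() {
    # Typical kernel messages (wording may differ between kernel versions):
    #   "<process> invoked oom-killer: ..."
    #   "Out of memory: Killed process <pid> (<name>) ..."
    grep -Ei 'invoked oom-killer|out of memory: kill'
}

# Typical use on the server (kernel messages via journald):
#   sudo journalctl -k --no-pager | oom_events
```

If the output names dnf as the killed process around the times of the downtime alerts, that supports the theory above.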
Just for fun, I repeatedly ran free -h during a manually initiated dnf upgrade to see what happened. Watch the free swap go all the way down to zero:
Swap Drops To Zero During DNF Upgrade
chronos@penguin:~/servers/oracle/metalvps$ ssh m
Activate the web console with: systemctl enable --now cockpit.socket
Last login: Fri May 5 01:56:22 2023 from 187.189.238.1
[opc@instance-20220717-1620 ~]$ free -h
total used free shared buff/cache available
Mem: 680Mi 189Mi 364Mi 1.0Mi 126Mi 384Mi
Swap: 1.3Gi 463Mi 896Mi
[opc@instance-20220717-1620 ~]$ free -h
total used free shared buff/cache available
Mem: 680Mi 244Mi 236Mi 1.0Mi 199Mi 329Mi
Swap: 1.3Gi 460Mi 899Mi
[opc@instance-20220717-1620 ~]$ free -h
total used free shared buff/cache available
Mem: 680Mi 258Mi 50Mi 1.0Mi 371Mi 316Mi
Swap: 1.3Gi 460Mi 899Mi
[opc@instance-20220717-1620 ~]$ free -h
total used free shared buff/cache available
Mem: 680Mi 397Mi 50Mi 1.0Mi 231Mi 176Mi
Swap: 1.3Gi 460Mi 899Mi
[opc@instance-20220717-1620 ~]$ free -h
total used free shared buff/cache available
Mem: 680Mi 583Mi 47Mi 0.0Ki 49Mi 21Mi
Swap: 1.3Gi 546Mi 813Mi
[opc@instance-20220717-1620 ~]$ free -h
total used free shared buff/cache available
Mem: 680Mi 576Mi 46Mi 0.0Ki 56Mi 24Mi
Swap: 1.3Gi 719Mi 640Mi
[opc@instance-20220717-1620 ~]$ free -h
total used free shared buff/cache available
Mem: 680Mi 552Mi 57Mi 0.0Ki 70Mi 42Mi
Swap: 1.3Gi 870Mi 489Mi
[opc@instance-20220717-1620 ~]$ free -h
total used free shared buff/cache available
Mem: 680Mi 563Mi 42Mi 0.0Ki 74Mi 29Mi
Swap: 1.3Gi 1.0Gi 289Mi
[opc@instance-20220717-1620 ~]$ free -h
total used free shared buff/cache available
Mem: 680Mi 559Mi 58Mi 0.0Ki 62Mi 39Mi
Swap: 1.3Gi 1.2Gi 123Mi
[opc@instance-20220717-1620 ~]$ free -h
total used free shared buff/cache available
Mem: 680Mi 604Mi 31Mi 0.0Ki 43Mi 3.0Mi
Swap: 1.3Gi 1.3Gi 0.0Ki
After OOM
[opc@instance-20220717-1620 ~]$ free -h
total used free shared buff/cache available
Mem: 680Mi 166Mi 433Mi 0.0Ki 80Mi 422Mi
Swap: 1.3Gi 484Mi 875Mi
[opc@instance-20220717-1620 ~]$
Memory Issues Fixed By Adding Swap
The memory issues seem to have been fixed by adding additional swap. Here's a link to a tutorial which I found helpful: https://www.cyberciti.biz/faq/linux-add-a-swap-file-howto/
The Oracle install already had a swap file, but I added a second, larger swap file. Note that these swap files are within the partition on which the OS is located, not on separate partitions as usually is the case for swap.
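The steps for adding a swap file can be sketched as a small shell function. This is only an illustration under stated assumptions: add_swapfile and the DRY_RUN switch are hypothetical names, the commands must run as root, and dd is used so the file is fully written rather than sparse.

```shell
#!/usr/bin/env bash
# Sketch only: add_swapfile and DRY_RUN are hypothetical names for illustration.
# Usage (as root): add_swapfile PATH SIZE_IN_MIB
add_swapfile() {
    local path="$1" size_mib="$2"
    local run=""
    # With DRY_RUN=1 the commands are printed instead of executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi

    # Write real zeros so the file has no holes.
    $run dd if=/dev/zero of="$path" bs=1M count="$size_mib"
    # Restrict access before the file is used as swap.
    $run chmod 0600 "$path"
    # Format the file as swap space, then enable it.
    $run mkswap "$path"
    $run swapon "$path"
}
```

Running DRY_RUN=1 add_swapfile /.swapfile1 2048 prints the four commands without touching the disk, which is a cheap way to review them first.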
Back in the old days, I always could set up machines with plenty of swap. Nowadays, on the clouds, often the VMs come without swap. On HN, I read that swap might be less popular these days since swap could present a security issue. Swap could contain private data which needs protection.
But, by adding a swap file, some tiny VPSes can do big jobs! I seem to remember compiling a ton of software out of pkgsrc on a different, but also tiny Oracle VPS running CentOS. Was it all just a matter of adding some swap and then eating dinner and sleeping while pkgsrc reliably finished its enormous workload on the tiny Oracle VPS?
Here's what the terminal output looked like during the addition of extra swap space followed by a successful dnf upgrade.
[opc@instance-20220717-1620 ~]$ swapon -s
Filename Type Size Used Priority
/.swapfile file 1392636 461880 -2
[opc@instance-20220717-1620 ~]$ ls -lh /.swapfile
-rw-------. 1 root root 1.4G Jul 17 2022 /.swapfile
[opc@instance-20220717-1620 ~]$ sudo su -
Last login: Sat May 13 04:36:56 GMT 2023 on pts/0
[root@instance-20220717-1620 ~]# cd /
[root@instance-20220717-1620 /]# df -h .
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/ocivolume-root 36G 11G 25G 31% /
[root@instance-20220717-1620 /]# dd if=/dev/zero of=/.swapfile1 bs=1024 count=2093152
2093152+0 records in
2093152+0 records out
2143387648 bytes (2.1 GB, 2.0 GiB) copied, 39.7653 s, 53.9 MB/s
[root@instance-20220717-1620 /]# ls -l .swapfile*
-rw-------. 1 root root 1426063360 Jul 17 2022 .swapfile
-rw-r--r--. 1 root root 2143387648 May 13 21:22 .swapfile1
[root@instance-20220717-1620 /]# chmod 0600 .swapfile1
[root@instance-20220717-1620 /]# ls -l .swapfile*
-rw-------. 1 root root 1426063360 Jul 17 2022 .swapfile
-rw-------. 1 root root 2143387648 May 13 21:22 .swapfile1
[root@instance-20220717-1620 /]# mkswap /.swapfile1
Setting up swapspace version 1, size = 2 GiB (2143383552 bytes)
[ . . . ]
[root@instance-20220717-1620 /]# swapon /.swapfile1
[root@instance-20220717-1620 /]# swapon -s
Filename Type Size Used Priority
/.swapfile file 1392636 461528 -2
/.swapfile1 file 2093148 0 -3
[root@instance-20220717-1620 /]# date -u
Sat May 13 21:30:53 UTC 2023
[root@instance-20220717-1620 /]# dnf upgrade
Oracle Linux 8 BaseOS Latest (x86_64) 4.9 kB/s | 3.6 kB 00:00
Oracle Linux 8 Application Stream (x86_64) 149 kB/s | 3.9 kB 00:00
Oracle Linux 8 Addons (x86_64) 57 kB/s | 3.0 kB 00:00
Latest Unbreakable Enterprise Kernel Release 6 for Oracle Linu 83 kB/s | 3.0 kB 00:00
Last metadata expiration check: 0:00:01 ago on Sat 13 May 2023 08:17:46 PM GMT.
Dependencies resolved.
===============================================================================================
Package Architecture Version Repository Size
===============================================================================================
Installing:
kernel-uek x86_64 5.4.17-2136.318.7.2.el8uek ol8_UEKR6 112 M
kernel-uek-devel x86_64 5.4.17-2136.318.7.2.el8uek ol8_UEKR6 19 M
Removing:
kernel-uek x86_64 5.4.17-2136.317.5.5.el8uek @ol8_UEKR6 136 M
kernel-uek-devel x86_64 5.4.17-2136.317.5.5.el8uek @ol8_UEKR6 75 M
Transaction Summary
===============================================================================================
Install 2 Packages
Remove 2 Packages
Total download size: 131 M
Is this ok [y/N]: y
Downloading Packages:
(1/2): kernel-uek-devel-5.4.17-2136.318.7.2.el8uek.x86_64.rpm 19 MB/s | 19 MB 00:01
(2/2): kernel-uek-5.4.17-2136.318.7.2.el8uek.x86_64.rpm 30 MB/s | 112 MB 00:03
-----------------------------------------------------------------------------------------------
Total 35 MB/s | 131 MB 00:03
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Installing : kernel-uek-devel-5.4.17-2136.318.7.2.el8uek.x86_64 1/4
Running scriptlet: kernel-uek-devel-5.4.17-2136.318.7.2.el8uek.x86_64 1/4
Running scriptlet: kernel-uek-5.4.17-2136.318.7.2.el8uek.x86_64 2/4
Installing : kernel-uek-5.4.17-2136.318.7.2.el8uek.x86_64 2/4
Running scriptlet: kernel-uek-5.4.17-2136.318.7.2.el8uek.x86_64 2/4
Erasing : kernel-uek-devel-5.4.17-2136.317.5.5.el8uek.x86_64 3/4
Running scriptlet: kernel-uek-5.4.17-2136.317.5.5.el8uek.x86_64 4/4
Erasing : kernel-uek-5.4.17-2136.317.5.5.el8uek.x86_64 4/4
Running scriptlet: kernel-uek-5.4.17-2136.317.5.5.el8uek.x86_64 4/4
Running scriptlet: kernel-uek-5.4.17-2136.318.7.2.el8uek.x86_64 4/4
Running scriptlet: kernel-uek-5.4.17-2136.317.5.5.el8uek.x86_64 4/4
Verifying : kernel-uek-5.4.17-2136.318.7.2.el8uek.x86_64 1/4
Verifying : kernel-uek-devel-5.4.17-2136.318.7.2.el8uek.x86_64 2/4
Verifying : kernel-uek-5.4.17-2136.317.5.5.el8uek.x86_64 3/4
Verifying : kernel-uek-devel-5.4.17-2136.317.5.5.el8uek.x86_64 4/4
Installed:
kernel-uek-5.4.17-2136.318.7.2.el8uek.x86_64
kernel-uek-devel-5.4.17-2136.318.7.2.el8uek.x86_64
Removed:
kernel-uek-5.4.17-2136.317.5.5.el8uek.x86_64
kernel-uek-devel-5.4.17-2136.317.5.5.el8uek.x86_64
Complete!
[root@instance-20220717-1620 /]# date
Sat May 13 21:44:31 GMT 2023
[root@instance-20220717-1620 /]# uname -r
5.4.17-2136.308.9.el8uek.x86_64
[root@instance-20220717-1620 /]# cat /etc/os-release
NAME="Oracle Linux Server"
VERSION="8.7"
ID="ol"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="8.7"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Oracle Linux Server 8.7"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:oracle:linux:8:7:server"
HOME_URL="https://linux.oracle.com/"
BUG_REPORT_URL="https://bugzilla.oracle.com/"
ORACLE_BUGZILLA_PRODUCT="Oracle Linux 8"
ORACLE_BUGZILLA_PRODUCT_VERSION=8.7
ORACLE_SUPPORT_PRODUCT="Oracle Linux"
ORACLE_SUPPORT_PRODUCT_VERSION=8.7
[root@instance-20220717-1620 /]#
There have been no more downtime alerts from Hetrix since the extra swap was added.
Additional Steps
At least a couple more things might need to be done.
First, to persist past reboots, the extra swap needs to be added to /etc/fstab and/or synced with whatever systemd might contribute.
Second, the running kernel (5.4.17-2136.308.9.el8uek.x86_64) doesn't seem to be quite the same as the most recently installed kernel (5.4.17-2136.318.7.2.el8uek.x86_64). So maybe a reboot is needed?
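One way to check whether a newer kernel is waiting for a reboot is to compare the running version against the installed ones. The sketch below assumes GNU sort (for version ordering with -V) and uses made-up helper names, newest_version and reboot_pending.

```shell
#!/usr/bin/env bash
# Sketch only: newest_version and reboot_pending are hypothetical helpers.

# Print the highest version string read from stdin (relies on GNU sort -V).
newest_version() {
    sort -V | tail -n 1
}

# reboot_pending RUNNING INSTALLED...
# Succeeds when some installed version sorts higher than the running one.
reboot_pending() {
    local running="$1"
    shift
    local newest
    newest="$(printf '%s\n' "$running" "$@" | newest_version)"
    [ "$newest" != "$running" ]
}

# Typical use on the server:
#   reboot_pending "$(uname -r)" \
#       $(rpm -q --qf '%{VERSION}-%{RELEASE}.%{ARCH}\n' kernel-uek) \
#       && echo "newer kernel installed; reboot to use it"
```

With the two versions quoted above, 5.4.17-2136.318.7.2.el8uek.x86_64 sorts higher than 5.4.17-2136.308.9.el8uek.x86_64, so the check would report a pending reboot.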
The Bigger Picture
I don't understand why this problem happened. Also, I don't understand why this problem happened at the time it happened, rather than sooner or later.
Could it be that the tiny but heroic VM specs are too small to expect dnf to work?
Did I misconfigure something or make some other mistake which caused dnf to run out of space?
Was the current dnf upgrade unusually big, such that it required more space than expected?
Was the problem caused by something other than dnf?
Was it correct to add a second swap file? Should there be only one? I could have replaced the existing swap file with another, larger swap file, thus keeping only one swap file.
Maybe /etc/fstab still is meaningful in systemd distros?
Maybe the VPS will survive a reboot?
How applicable is all this to other guys running tiny, but heroic Oracle Free Tier VPSes? Are they all fated to run out of memory after a while?
What about other tiny VPSes on providers other than Oracle and running OSes other than Oracle Linux?
Over the next days, maybe I will get to understand more. Thanks again to Oracle for the nifty, heroic free VPS!
Best wishes! I hope everyone gets the server he wants! 🤩
Comments
I have one machine which does that. I just use microdnf instead.
https://github.com/rpm-software-management/microdnf
https://unix.stackexchange.com/questions/649598/what-cant-microdnf-do-compared-to-dnf
MetalVPS
Yup, it's ... minimal ... but it works in tiny amounts of ram. YUM/DNF aren't designed for small amounts of ram in my experience. It's just not something they're concerned with.
Depending on how quickly you want to run a command, take a look at: watch from the procps rpm.
watch -n1 free -h
It will, once per second, clear the screen and re-run the command. Not useful if you're trying to log. Useful if you wanna watch something while doing something else.
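If a log is wanted instead, a small loop can append timestamped samples to a file. This is a sketch with a made-up helper name, log_sample, which takes a log file followed by the command to record.

```shell
#!/usr/bin/env bash
# Sketch only: log_sample is a hypothetical helper.
# Usage: log_sample LOGFILE COMMAND [ARGS...]
# Appends a UTC timestamp and then the command's output to LOGFILE.
log_sample() {
    local logfile="$1"
    shift
    {
        date -u '+%Y-%m-%dT%H:%M:%SZ'
        "$@"
    } >> "$logfile"
}

# Sample memory once per second until interrupted with Ctrl-C:
#   while true; do log_sample /tmp/mem.log free -h; sleep 1; done
```

Because each sample is appended right away, the log should still hold the last readings even if the ssh session drops while memory is exhausted.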
Definitely one option. Another is stopping OS services you may not be using. Swap FirewallD for your own iptables rules. Disable tuned. If you've got cockpit running, disable that (assuming you're not using it).
Adding it to /etc/fstab will be enough. Systemd uses the fstab file to generate the actual systemd units it uses to mount stuff, iirc.
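A sketch of what that fstab addition could look like is below. The helper name ensure_swap_in_fstab is made up, and the entry uses the common "none swap sw 0 0" form; the options on the existing Oracle-provided swap line may differ, so it is worth comparing against it first.

```shell
#!/usr/bin/env bash
# Sketch only: ensure_swap_in_fstab is a hypothetical helper.
# Usage (as root): ensure_swap_in_fstab SWAPFILE [FSTAB]
# Appends a swap entry for SWAPFILE unless one is already present.
ensure_swap_in_fstab() {
    local swapfile="$1" fstab="${2:-/etc/fstab}"
    # Exact match on the first field keeps repeated runs from adding duplicates.
    if awk -v p="$swapfile" '$1 == p { found = 1 } END { exit !found }' "$fstab"; then
        return 0
    fi
    printf '%s\tnone\tswap\tsw\t0\t0\n' "$swapfile" >> "$fstab"
}

# Typical use on the server:
#   ensure_swap_in_fstab /.swapfile1
#   swapon --show    # after the next reboot, both swap files should be listed
```

Keeping a copy of /etc/fstab before editing is cheap insurance, since a broken fstab can stop a machine from booting cleanly.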
Yeah, the kernel is the only thing you can't swap dynamically. (*1)
Kernel updates tend to be large. Plus, if you've added other services, this might just be the time when the rest of your daemons had grown enough that you ran out of space.
It Depends (tm).
Like I said above, kernel updates tend to be large and they spawn a bunch of other processes to rebuild stuff.
In the sense that other daemons were eating up enough memory that it ran out of memory, yes.
Having more than one is fine. I generally only have one but have used more than one on many occasions like this one.
It is as I mentioned above.
Hard to know. What else have you done to it?
Rats, forgot my footnote.
1 - You can *patch a kernel if you've got Ksplice/Kpatch/whatever Oracle calls it. In certain circumstances heavy wizards can do funky stuff including swap live kernels according to the ancient texts - I've never done it.
Swap out the Oracle Linux OS for Ubuntu 20.04 if possible. It runs better on systems with a small amount of RAM. Ubuntu 22.04 also has a minimum RAM of 1 GB, but I did manage to run it along with an nginx webserver on 512 MB of RAM.
Somik.org - Server admins cheat codes
If swapping OS is an option, go Debian. Those are my two. If it's greater than ~512M then it's Alma/Rocky. If it's 512M or less then it's Debian. Ubuntu's getting too into that whole snaps thing.
I've still got a couple 256M NAT boxes ( and maybe 128M ) running C7. It's painful but doable. ;-)
The server minimal version of ubuntu does not come with any preinstalled snaps that i know of, so it is almost like debian. Honestly I cannot remember why I did not use debian, but I do remember something was different about debian and I could not configure some app/plugin, which is why I switched to ubuntu permanently.
I used to use CentOS but stopped using it due to reliability issues when upgrading the OS. I had multiple failed updates and server crashes when upgrading CentOS 5 to 6 or from 6 to 7. So I switched to using Ubuntu.
Somik.org - Server admins cheat codes
Why does this only show 680Mi? It's supposed to be a lot closer to 1Gi than that. Do you, by any chance, have a chunk of memory reserved for kdump?
BTW, I am using a couple of 1 GB VPSes with Ubuntu 22.04 and 1 GB is plenty of memory for those.
And any reason why you don't use the Ampere A1 instances (where you can have up to 24 GB of RAM for free)?
@cmeerw Thanks for your always helpful comments!
I didn't do anything (of which I am consciously aware) that would reserve a bunch of memory.
Nothing against Ubuntu. I just used Oracle Linux because I was on Oracle's platform, and it seemed a good idea to enjoy a little exercise with the non-Debian dialect.
I do have a couple of Ampere instances. They are great! One isn't running anything special but has the Oracle Linux developer flavor so I can play with whatever. The other is running Ubuntu with Bootstrap for an article I wrote last year, Updating A Free Udemy Bootstrap Course On Oracle Cloud Free Tier
So I gotta look into why total memory is only 680 MB. Thanks again for the tip! Much appreciated!
MetalVPS
Big chunk of memory missing. Look for something strange on /var/log/dmesg. For comparison, this is my 1G ubuntu 22.04 free tier:
This machine is only running NFS server -- using the free disk space of the two AMD instances on the Ampere server
Cheers,
Eliphas
TL;DR It survived reboot! No big increase in total memory. Dunno why not.
Specs
Shape configuration
Shape: VM.Standard.E2.1.Micro
OCPU count: 1
Network bandwidth (Gbps): 0.48
Memory (GB): 1
Local disk: Block storage only
Before reboot
After reboot
Post-reboot OS Release
Thanks again to Oracle for the nice VPS on their nice cloud platform!
MetalVPS
True, though the Ubuntu images on Oracle cloud, in particular, come with a oracle-cloud-agent snap pre-installed.
Oh, right, all OS images come with oracle-cloud-agent as it makes it easier to manage the OS from their OCI console.
Somik.org - Server admins cheat codes
I wasn't sure if it was a snap on el variants too, but it might be.
I don't think so, but I can't recall exactly either...
Somik.org - Server admins cheat codes
People were mentioning that Oracle Cloud VMs running Ubuntu seemed to show more total memory than appeared to be present in my awesome, tiny VM.Standard.E2.1.Micro instance which is serving metalvps.com.
I only have the single VM.Standard.E2.1.Micro instance, but I do also have two identically configured, larger VM.Standard.A1.Flex instances, one of which happens to be running Oracle Linux and the other happens to be running Ubuntu.
So here is a comparison of the total memory reported by free in two identically configured 12 GB RAM Oracle Arm instances, one running Oracle Linux and the other running Ubuntu. As you can see, the results here do seem to show that Ubuntu's free reports more total memory than Oracle's free in identically configured instances.
Any ideas what causes the possible difference?
Thanks to Oracle for the great, free VMs!
Instance-20220614-1523 -- Oracle Linux
From Oracle Cloud Web GUI
Shape configuration
Shape: VM.Standard.A1.Flex
OCPU count: 2
Network bandwidth (Gbps): 2
Memory (GB): 12
Local disk: Block storage only
Instance-20220616-1631 -- Ubuntu
From Oracle Cloud Web GUI
Shape configuration
Shape: VM.Standard.A1.Flex
OCPU count: 2
Network bandwidth (Gbps): 2
Memory (GB): 12
Local disk: Block storage only
MetalVPS
My bet would still be on kdump being enabled by Oracle Linux. Do you see something like "crashkernel" in the Linux command line (/proc/cmdline)?
@cmeerw Looks like you might be right! Congrats plus double bandwidth!
MetalVPS
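For anyone wanting to run the same check, here is a minimal sketch of looking for a crashkernel reservation on the kernel command line. The helper name crashkernel_param is made up, and the exact value seen (for example crashkernel=auto) will depend on the image.

```shell
#!/usr/bin/env bash
# Sketch only: crashkernel_param is a hypothetical helper.
# Reads a kernel command line on stdin and prints any crashkernel=... argument.
# Prints nothing when no such argument is present.
crashkernel_param() {
    tr ' ' '\n' | grep -E '^crashkernel=' || true
}

# Typical use on the server:
#   crashkernel_param < /proc/cmdline
#   systemctl is-enabled kdump    # reports whether the kdump service is enabled
```

If a crashkernel argument shows up, memory set aside for the crash kernel is not counted in the total that free reports, which could account for part of the gap.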