@vpsgeek said:
I keep reading that it is not production ready yet, but what does it lack that it is considered unfit to be offered as a replacement for OpenVZ?
1) There's no easy way to handle storage. Users have to use either ZFS (which is fine but doesn't work for user quotas inside of the VM) or LVM (which doesn't easily shrink). There was talk about making XFS work for storage, but who knows.
2) You have to use a bridged interface because there's no easy networking system like vznet. While a bridge is nice, it becomes an issue on hosts that require some sort of MAC whitelisting. You also have the issue where you need to use ebtables to stop a user from spoofing packets (rough sketch of those rules below the list).
3) LXC just isn't built with multi-tenant occupancy in mind. LXC does have unprivileged containers, but that isn't the #1 thing they're always keeping in mind.
4) A simple fuckup in configuration can end with containers being able to read/write to devices (read: your RAID drives themselves) they shouldn't be allowed to (this expands on #3).
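For what it's worth, here's a rough sketch (Python shelling out to ebtables; the veth name, MAC, and IP are made up) of the kind of per-container anti-spoofing rules point 2 is talking about:

```python
import subprocess

# Hypothetical values for one container's host-side veth on the bridge.
VETH = "veth101i0"           # host-side interface name (made up)
MAC = "aa:bb:cc:dd:ee:01"    # MAC assigned to the container (made up)
IPV4 = "203.0.113.101"       # IPv4 assigned to the container (made up)

def ebtables(*rule):
    """Append one rule to the bridge-level FORWARD chain."""
    subprocess.run(["ebtables", "-A", "FORWARD", *rule], check=True)

# Drop frames coming in from the container's veth with a spoofed source MAC.
ebtables("-i", VETH, "-s", "!", MAC, "-j", "DROP")

# Drop IPv4 packets from the container with a spoofed source address.
ebtables("-i", VETH, "-p", "IPv4", "--ip-src", "!", IPV4, "-j", "DROP")
```

With OpenVZ's routed venet setup none of that is needed, since the host routes traffic instead of putting the container straight on a bridge.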
LXC is nice, but on your own Proxmox. It seems to me that the general consensus is that it's not a valid alternative to OpenVZ, since it's an unprivileged container running mostly on generic kernels and designed mostly to run untrusted apps or otherwise deletable ad nutum.
Fran nailed it. LXC is awesome and I use it as much as possible within my own Proxmox nodes, but never for multi-tenant situations. There just isn't the isolation that VZ has.
The biggest example is running htop inside an LXC container reveals the host specs, not those of the container.
@dahartigan said:
The biggest example is running htop inside an LXC container reveals the host specs, not those of the container.
Is there a difference between containers in an LXD installation and LXC on Proxmox? On my LXD installation htop is showing the container specs.
Francisco said: 1) There's no easy way to handle storage. Users have to use either ZFS (which is fine but doesn't work for user quotas inside of the VM) or LVM (which doesn't easily shrink). There was talk about making XFS work for storage, but who knows.
I'm using a ZFS based storage pool and once I set a disk size on the container profile that's the disk size that shows inside the container. Is that what you mean by "user quotas inside of the VM"?
I've just started playing with LXD and I'm trying to understand its limitations.
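For anyone else experimenting, a minimal sketch of how that disk cap gets applied (assuming the default profile's root disk lives on the ZFS pool; the image alias and size are just examples, and the lxc client syntax can differ between LXD releases):

```python
import subprocess

def lxc(*args):
    """Thin wrapper around the lxc client; returns stdout."""
    return subprocess.run(["lxc", *args], check=True,
                          capture_output=True, text=True).stdout

# Cap the root disk that containers inherit from the default profile.
lxc("profile", "device", "set", "default", "root", "size", "10GB")

# A new container then sees a 10GB root filesystem, not the host's whole pool.
lxc("launch", "ubuntu:18.04", "c1")
print(lxc("exec", "c1", "--", "df", "-h", "/"))
```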
LXC is still somewhat inferior to both other options. OVZ is easier to overload (for those $1/mo warriors), and KVM gives you a complete virtual machine to put whatever you want on it. LXC is a "BSD Jail" for Linux. It works if you set it up correctly, but otherwise it really isn't an answer most people know exists, let alone design for.
A related question, I swear -- are there any file-based storage backends for KVM? I.e., give each VM a folder instead of an LV? Something that works well for that would let me move from OpenVZ to KVM for my mass storage nodes.
Simply put, those nodes offer too much storage to just give people 1:1 mapped block devices, and LVM thinpool isn't quite as efficient for disk space and feels more than a little risky.
If there were a viable way to oversubscribe disk space, by giving people filesystem-based VM storage instead of block-device VM storage, I could just do KVM for those users.
OpenVZ is mostly abandonware at this point, and LXC isn't ready for multi-tenancy, so it doesn't leave you in a great position.
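To make the file-backed idea concrete: a sparse qcow2 image on a plain filesystem only consumes space as the guest writes to it, which is the property that makes oversubscription possible. A minimal sketch (the path and size are made up):

```python
import os
import subprocess

IMAGE = "/var/lib/vz/images/vm101.qcow2"  # hypothetical path on a plain filesystem
VIRTUAL_SIZE = "1T"                       # advertised to the guest, not allocated up front

# Create a sparse, file-backed disk image.
subprocess.run(["qemu-img", "create", "-f", "qcow2", IMAGE, VIRTUAL_SIZE], check=True)

# Compare the advertised size with what the file actually occupies on the host.
subprocess.run(["qemu-img", "info", IMAGE], check=True)
print("bytes actually on disk:", os.stat(IMAGE).st_blocks * 512)
```

The trade-off, as discussed further down, is what happens when the underlying filesystem fills up while guests still think they have free space.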
@funkywizard said:
OpenVZ is mostly abandonware at this point, and LXC isn't ready for multi-tenancy, so it doesn't leave you in a great position.
The simfs layout exposes all of your stuff to whatever administrative tooling runs on the host, fwiw. If they're running a scanner and it sees your mount and scans it, your shit gets deleted even if nobody has ever looked at it.
KVM with dedicated space is a lot safer than any of the compression/anti-dupe systems out there, and I would not trust it to work as a filesystem mount, since it has to emulate the entire drive. It's plausible, but not really, without taking over some OpenVZ ideals like the shell-hell-script init and other goodness along those lines. By abstracting the entire filesystem to a drive image (or an LV, which is a lot better but takes a bit more admin attention), it makes it a lot easier to ensure it's yours, and only yours.
@funkywizard said:
A related question I swear -- are there any file-based storage backends for KVM?
ZFS Zvol is sorta file-based.
Simply, those nodes offer too much storage to just give people 1:1 mapped block devices, and lvm thinpool isn't quite as efficient for disk space and feels more than a little risky.
If there were a viable way to oversubscribe disk space
Build out giant raidz2 ZFS pools on dual E5s with tons of ECC RAM (1GB per TB of usable pool). Throw in NVMe caching (ZIL) for fast writes, and hand out big zvols per KVM.
ZFS is way more robust against bitrot than LVM thinpool.
Zeroes and other data easily compressible by lz4 don't incur (full) writes to the pool.
Caveat: have processes in place to keep total pool utilization under 85% at ALL times.
The key is scale. Big ass nodes to make it work.
ULTRAVPS does something like this I imagine. (But based on SmartOS by Illumos.)
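A minimal sketch of that layout (pool name, disk paths, and sizes are placeholders; assumes ZFS on Linux is installed and this runs as root):

```python
import subprocess

def sh(*args):
    subprocess.run(args, check=True)

DISKS = ["/dev/sda", "/dev/sdb", "/dev/sdc", "/dev/sdd", "/dev/sde", "/dev/sdf"]  # placeholders

# raidz2 pool with lz4 compression inherited by every dataset under it.
sh("zpool", "create", "-O", "compression=lz4", "tank", "raidz2", *DISKS)

# NVMe log device (the "ZIL" device, i.e. a SLOG) so sync writes land on flash first.
sh("zpool", "add", "tank", "log", "/dev/nvme0n1")

# Sparse (thin) zvol handed to one KVM guest as its virtual disk.
sh("zfs", "create", "-s", "-V", "500G", "tank/vm101")
```

The -s makes the zvol thin, which is where the 85% caveat comes from: the pool can be promised more space than it actually holds.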
Let's assume I build storage on Threadripper.
10TB × 24 = 240TB, so 240GB of RAM assigned just to maintain ZFS. It ate all the RAM the platform supports and left no RAM to sell to customers.
We use 4TB drives in RAID 10 currently, as we find we run out of disk I/O before running out of disk space with that amount of space per drive. So I'm not sure that 24x 10TB drives with no RAID (or at least no overhead from ZFS parity) are going to be a good ratio of disk I/O to disk space, especially as ZFS is not well known for high performance.
So on that basis, 12x 4TB in RAID 10 is more like 24TB of usable space per node, or 1/10th of what you've budgeted. We're putting 256GB of RAM in those nodes just for the hell of it, as RAM is so cheap. We could easily get by on half that or less.
So I think in terms of RAM usage, ZFS would be OK. In my example here, I'd need maybe 24GB of RAM on a server that's got at least 128GB more RAM than it needs.
Another way to look at it: a 10TB drive is at least $200 -- new is more like $300 for a 7200rpm 10TB drive of "datacenter" quality. It sounds like you need 10GB of RAM for every 10TB of usable disk space. Let's ignore for the moment using disk space for parity / mirroring / or even just leaving some space empty. 10GB of RAM is worth something on the order of $20 (really between $15 and $30 depending on the RAM you want), so the RAM use adds about 10% to the cost of the disk space in this scenario. A notable cost, but not a deal breaker by any means.
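Putting rough numbers on that comparison (the 1GB-per-TB rule of thumb and the prices are the figures quoted above, not measurements):

```python
# Figures quoted above, not measurements.
ram_gb_per_usable_tb = 1             # ~1GB of RAM per TB of usable pool
ram_cost_per_gb = 2.0                # ~$20 per 10GB of RAM
drive_prices_per_10tb = (200, 300)   # rough price range for a 10TB drive

# 12 x 4TB in RAID 10 -> ~24TB usable -> ~24GB of RAM for ZFS.
usable_tb = 12 * 4 / 2
print(f"usable: {usable_tb:.0f} TB -> ZFS RAM budget: {usable_tb * ram_gb_per_usable_tb:.0f} GB")

# RAM overhead per 10TB of usable space, relative to the drive price.
ram_cost_per_10tb = 10 * ram_gb_per_usable_tb * ram_cost_per_gb
for drive_price in drive_prices_per_10tb:
    print(f"${drive_price} drive: RAM adds about {ram_cost_per_10tb / drive_price:.0%}")
```

That works out to roughly 7-10% on top of the raw drive cost, which matches the "notable but not a deal breaker" take.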
Those Sandy Bridge and Ivy Bridge E5s are probably the sweet spot if you have lots of ECC DDR3 lying about.
As I understand it, ZFS performance is only as snappy as its slowest component (usually the ZIL).
Now that even FreeBSD has decided to adopt ZFS on Linux as its upstream, it's a good bet for longevity.
vimalware said: Build out giant raidz2 ZFS pools on dual E5s with tons of ECC RAM (1GB per TB of usable pool). Throw in NVMe caching (ZIL) for fast writes, and hand out big zvols per KVM.
Won't do anything in his case.
The ZVOLs will grow but never really shrink. You could try using virtio-scsi and see if ZFS will handle the unmap/TRIM requests, but you might need to use QCOW2 images instead.
Still, it doesn't give him the ability to easily fix things (search for massive log files to clear out if he runs out of space, etc.). Once he's out of space, he's out and his VMs are paused.
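For reference, the virtio-scsi route looks roughly like this as a bare QEMU invocation (paths and sizes are made up; in practice a panel or libvirt would generate the equivalent config):

```python
import subprocess

ZVOL = "/dev/zvol/tank/vm101"   # hypothetical zvol backing the guest disk

qemu_cmd = [
    "qemu-system-x86_64", "-enable-kvm", "-m", "2048",
    # virtio-scsi controller, so the guest disk can receive TRIM/UNMAP.
    "-device", "virtio-scsi-pci,id=scsi0",
    # discard=unmap passes the guest's discards through to the backing device.
    "-drive", f"file={ZVOL},if=none,id=hd0,format=raw,discard=unmap",
    "-device", "scsi-hd,drive=hd0,bus=scsi0.0",
]
subprocess.run(qemu_cmd, check=True)
```

Even then, the guest has to actually issue the discards (fstrim, or mounting with the discard option) before ZFS can give space back on the zvol.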
Comments
Simple summary: not ready for production.
@dahartigan said:
The biggest example is running htop inside an LXC container reveals the host specs, not those of the container.
Was fixed earlier this year. I don't remember if it requires a specific option to be enabled or not, but it's there.
You're right, my bad. I was going off an old experience apparently.
Francisco said:
The ZVOLs will grow but never really shrink. You could try using virtio-scsi and see if ZFS will handle the unmap/TRIM requests, but you might need to use QCOW2 images instead.
In my experience ZFS 0.7 (Ubuntu 18.04) does support that.