@willie said:
How does an IP address get stolen from a VPS, except through someone getting to the control panel somehow?
@packer255 said:
Has a round of maintenance for TYOC040 ended? 72 hours have passed, but my vps is still offline.
Comments
Plot twist! IP Protection is turned on and @AlwaysSkint typoed his DNS during the re-IP and he's actually hitting someone else's server. :-D
SolusVM bug in this case. There are a lot of different bugs, and this instance happened to have more than one. Basically, at any given time it's possible for it to skip one of the rules or assignments for whatever reason, and that creates a chain reaction over time. There's another similar bug with disks that they never fixed; they just gave us a workaround, which we haven't used because it's very I/O intensive and could kill the disk more quickly.
Some IP stealer also simultaneously took advantage of the situation, whether on purpose or not.
So there were essentially three people competing over one IP. IP stealing with the protection turned on usually happens when one of the bugs above is present. Sometimes the protection also fails entirely, but that's rarer. It definitely didn't help that we also had to turn it off for some time when the NICs were burning up from the other issue we faced before splitting VLANs. The short version of that: the integrated NIC isn't amazing, so it struggles to keep up with so many ARP packets flying around, and once we turned on the protection it added another layer of things to keep track of, which made it want to melt.
Fun stuff.
(edit) I suppose it's not necessarily a bug; it was just never coded to handle all the weird scenarios that may occur, and it doesn't clean up automatically. After all, SolusVM in its current state is what you get when someone used PHP where PHP probably shouldn't have been used and a bunch of low end hosts such as myself decided to use it anyway. Back in the 1800s when it was coded, though, maybe it was the only option.
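To make the "three people competing over one IP" situation concrete, here is a minimal sketch of how such a conflict could be spotted from the outside. It is not how SolusVM or its IP protection works. The function name `find_ip_conflicts`, the list of observed (IP, MAC) pairs, and all addresses are made up for illustration.

```python
from collections import defaultdict


def find_ip_conflicts(observations):
    """Return {ip: sorted MACs} for every IP claimed by more than one MAC.

    `observations` is a hypothetical list of (ip, mac) pairs, for example
    parsed from ARP replies seen on a bridge. Nothing here talks to SolusVM.
    """
    claims = defaultdict(set)
    for ip, mac in observations:
        claims[ip].add(mac.lower())  # normalise case so duplicates collapse
    return {ip: sorted(macs) for ip, macs in claims.items() if len(macs) > 1}


# Made-up data using the 192.0.2.0/24 documentation range.
observed = [
    ("192.0.2.10", "52:54:00:aa:00:01"),  # the VM the IP is assigned to
    ("192.0.2.10", "52:54:00:aa:00:02"),  # a stale assignment left behind
    ("192.0.2.10", "52:54:00:AA:00:03"),  # someone else using the address
    ("192.0.2.11", "52:54:00:aa:00:04"),  # no conflict
]

conflicts = find_ip_conflicts(observed)
for ip, macs in conflicts.items():
    print(f"{ip} is claimed by {len(macs)} MACs: {', '.join(macs)}")
```

A check like this only reports the symptom; it says nothing about which claimant is the legitimate one.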
Mentally strong people put each KVM in a separate VLAN.
Each IPv4 /32 or IPv6 /56 is individually routed to the VLAN.
Gateway address is a link-local address, for both IPv4 and IPv6.
IP stealing is impossible and no IP is wasted for gateway.
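A rough sketch of that layout, assuming one VLAN per VM with a routed prefix and a link-local gateway. The helper `plan_vm_network`, the VLAN number, and the prefixes (taken from documentation ranges) are hypothetical; only Python's standard `ipaddress` module is used.

```python
import ipaddress


def plan_vm_network(vlan_id, vm_prefix, gateway):
    """Describe one VM's network under a one-VLAN-per-VM layout.

    Raises ValueError if the gateway is not link-local, since the point of
    the design is that no routable address is spent on the gateway.
    """
    net = ipaddress.ip_network(vm_prefix)
    gw = ipaddress.ip_address(gateway)
    if not gw.is_link_local:
        raise ValueError(f"{gw} is not a link-local address")
    if gw.version != net.version:
        raise ValueError("gateway and prefix must be the same IP version")
    return {"vlan": vlan_id, "routed_prefix": str(net), "gateway": str(gw)}


# Hypothetical values: VLAN 101, documentation prefixes, link-local gateways.
v4 = plan_vm_network(101, "192.0.2.10/32", "169.254.0.1")
v6 = plan_vm_network(101, "2001:db8:0:100::/56", "fe80::1")
print(v4)
print(v6)
```

Because each VM sits alone in its VLAN and its prefix is routed there, no other VM shares the segment to answer ARP or neighbour discovery for that address.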
Webhosting24 aff best VPS; ServerFactory aff best VDS; Cloudie best ASN; Huel aff best brotein.
LOL. You know me too well!
Thanks for sorting it out Virmach. Now where's that big hammer?
It wisnae me! A big boy done it and ran away.
NVMe2G for life! until death (the end is nigh)
TYOC040 is still getting announcement updates, very good. TYOC035 has been offline for more than a month.
Status: Offline
Node Name: TYOC035
TYOC035 (Outage) - Critical
Affecting Server: SolusVM
09/13/2022 16:39 (Last Updated 09/13/2022 16:39)
This issue affects a server that may impact your services.
TYOC035 is facing a partial outage. We are evaluating a potential maintenance window.
Yes. TYOC035 has been abandoned.
You know I've thought long and hard about it and decided you were right, I was deceiving myself. I want you to know I appreciate you pointing this out.
I think VirMach has to fix the MJJ Tokyo unorthodox migration influx in parallel with getting the nodes back up, to have better control of the situation.
I bench YABS 24/7/365 unless it's a leap year.
Do you have any service in TYOC035?
The migration killed all maintenance progress.
No, and for that you have my sympathy. Truly unlucky for you.
So this is why you cannot understand.
I understand fine. I understand that you're unhappy/angry/upset. I understand you've paid for a service you aren't receiving. I understand this causes you great stress. I understand that you want to make all of these things known. The unresolved question is whether you understand that you're using words incorrectly. It's not abandoned. It's not ignored. It just hasn't gotten fixed yet because other things are being done first. Is that fair? (shrug)
Currently 22.5% of TYOC035 is potentially impacted.
Currently 40% of TYOC040 is potentially impacted.
That means 77.5% of TYOC035 isn't and 60% of TYOC040 isn't. Even so, we've still been getting a lot of tickets about them when there's already a network status page posted. You can also see it in this thread from how often they get mentioned, when in reality they aren't impacting as many people as you'd assume compared to other topics that have been discussed.
Right now SJCZ004 outage is impacting more people than both of those nodes combined. Tell me how many times that has been mentioned versus TYOC035 and TYOC040.
SEAZ009 also went down and it's impacting more people as well. How many mentions for that? We received infinite times more tickets about TYOC035 and TYOC040 (because it's 0, since people looked at the network status page for it.)
If we're already seeing a high support load with just 35% of the people affected, then taking a node offline for further maintenance essentially guarantees we'll receive about 3x as many reports, since 100% of the people would be affected. Last time for TYOC040 we sent an email, posted a network status, and even asked people to consider not contacting us so we could focus on the issue, and we still got many requests. So while I understand it's clearly a big problem, and I apologize for the continued outage for those people, understand that being simultaneously bombarded with requests is counterproductive. We also have other issues we're tending to, so it's a better choice to prioritize those at the moment, when the only thing essentially guaranteed to come from another maintenance or migration attempt is a lot of tickets. Even if we migrate people off, we're going to end up getting a flood of tickets for that, so we have to keep that in mind; we definitely do not want to continue having ticket backlogs.
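The arithmetic behind the "3x" estimate can be sketched as follows, under the assumption (mine, not stated in the post) that ticket volume scales in proportion to the share of customers affected. The 22.5%, 40% and 35% figures are the ones quoted above; how the 35% was derived is not stated. The function name `ticket_multiplier` is made up for illustration.

```python
def ticket_multiplier(affected_now, affected_after=1.0):
    """Scale factor for ticket volume.

    Assumes tickets are proportional to the share of customers affected,
    which is a simplification made for this sketch.
    """
    return affected_after / affected_now


# Shares of each node reported as potentially impacted.
impacted = {"TYOC035": 0.225, "TYOC040": 0.40}
for node, share in impacted.items():
    print(f"{node}: {share:.1%} impacted, {1 - share:.1%} not impacted")

# Going from 35% affected to 100% affected (node taken fully offline).
multiplier = ticket_multiplier(0.35)
print(f"35% -> 100% affected: about {multiplier:.2f}x the tickets")
```

Under that assumption the factor is 1 / 0.35, a little under 3, which matches the rough "3x" in the post.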
Sounds fair to me but then again I'm not an impacted party.
E-mails sent? I've received nothing. Did I win the jackpot and none of my VPSes will change IP?
Haven't bought a single service in VirMach Great Ryzen 2022 - 2023 Flash Sale.
https://lowendspirit.com/uploads/editor/gi/ippw0lcmqowk.png
I've been in a bit of a predicament with VirMach for a while... For a start, I understand that VirMach has been having a lot of trouble lately with ColoCrossing - speaking of which, what happened to them? I have (or had) a dedicated server, which I haven't had access to for nearly 6 months because of a networking issue. Then last month, overnight, I got an email saying it was suddenly being "migrated" elsewhere, and that I should back up my data. Well this is the problem, I couldn't back it up, since I hadn't had access to it for months on end. Now I'm being told that the data is lost and there's little chance of it being recovered. So what happens now?
I was affected too. Follow the instructions in the email (also copied in this comment: https://lowendspirit.com/discussion/comment/94512/#Comment_94512, additional info here: https://lowendspirit.com/discussion/comment/94592/#Comment_94592) to create a ticket to (hopefully) get a new service deployed quickly. A ticket should have been generated for you automatically, but maybe it got missed/buried and put on the back burner with everything else going on, so best to open that priority support ticket. Best of luck!
Ohh and yeah -- getting the data back isn't going to be an option at this point.... ColonCrossing made sure of that.
Head Janitor @ LES • About • Rules • Support • Donate
Disk is full on my Seafile instance.
The only machine with more disk is VirMach.
I spent all night moving the files.
Hopefully MIAZ011 doesn't lose data going forward.
Hourly backup is in place.
When will the sentencing of Tokyo's stowaways take place?
For dedicated servers too? Yikes.
Can someone tell me what MJJ is? Malaysian Jiu-Jitsu? I only know about the Brazilian kind.
If I understand the Tokyo stowaway thing, it sounds crazy that they haven't been booted, but it is also a serious software bug that it could happen at all.
I gotta add, both of my SJC VPSes are up now, have been up for the past week or two, and are performing extremely well, as if the host nodes are nearly unloaded. Both have had IP changes and needed reinstalls, but with that out of the way things are running smoothly. There may have been network issues that I haven't noticed, since I've been idling both after migrating a service away from one during the outage. I plan to bring that service back and see what happens.
@Virmach I know that you've said in the past that Transfers can be time consuming, but..
If you fancy a change from the "mundane" usual Tickets, could you please process the (merged) open Ticket for one.
Poor @vyas already paid me at the beginning of the month and it doesn't feel right holding onto his bucks, for nothing.
Go on, you know you want to.
@Virmach I was glad to see IPv6 working in Germany, thanks.
In other news, a Ryzen Migration from ATL to SEA went flawlessly.
@AlwaysSkint said:
West coast!?! Have you gone mad? ;-)
Aye! A wee experiment, as I have a client in that neck of the woods.
Did the same from LAX to AMS. Now happy with xTom.
Does VirMach plan to supply IPv6 in that location?
I'm thinking that I could do with a second one there.
It's trademark MJJ, even more so for unorthodox migrants.
What's the issue with node 40? Will you be extending service for those affected instead of giving credits? Credits can't help me get back 6TB of premium bandwidth watching educational videos in Japan.