BuyVM downtime

heyhey OG
edited April 2022 in Outages

Did anyone notice BuyVM Las Vegas was down for about 4 hours today?

I don't have anything important there; I'm just curious about what happened.

edit location

Comments

  • When did BuyVM add LA?

  • @Xenos said:
    When did BuyVM add LA?

    Oops, it's Las Vegas.

  • armandorg Services Provider

    That is completely true, Las Vegas

    Web Design Agency - Custom Web Designs
    WHMCS.design - WHMCS Themes | Blesta.shop - Blesta Themes

  • It's up; likely just your node.

  • Was it ever possible to buy something from BuyVM? Every time I went to their website, it said: sold out!

  • @lapua said:
    Was it ever possible to buy something from BuyVM? Every time I went to their website, it said: sold out!

    Yeah, it is possible. You have to get an email about the availability of their services. It's on their website; just put your email in there.

    Dentistry is my passion

  • Gotta get in a line and be quick about it to buy anything from them.

    ♻ Amitz day is October 21.
    ♻ Join Nigh sect by adopting my avatar. Let us spread the joys of the end.

  • Francisco Hosting Provider OG
    edited April 2022

    Sorry about that.

    We had a single storage node in Vegas burp, causing the whole block storage cluster to hang. When it hangs, users' VMs will usually "pause" and then resume once things are moving again.

    Vegas runs a fairly old version of things, with NY/LU/MIA being newer builds. Our current setup doesn't give us any sort of node-level redundancy, though, so if a node locks up/reboots/whatever, it's going to crash whatever VMs are feeding from it.

    In the next few days we'll begin live trials on our Ceph cluster. Our own tests look pretty solid and will give us the option to offer Object Storage (S3) if we wanted to.

    It'll take quite some time to migrate users into Ceph, but I'm fairly sure I can do the entire thing while users are still running and without a single disruption or byte lost.

    Francisco

  • edited April 2022

    Ah, that explains why I couldn't connect over ssh, but my box still retained its uptime when I came back later.

  • willie OG
    edited April 2022

    Mine doesn't seem to have been down at all. I had an ssh session open last night and it is still connected. Storage is still mounted too. I'm used to having to remount it when anything happens.

  • @Francisco said: I can do the entire thing while users are still running and without a single disruption or byte lost.

    Just copying for the comp claim later.

  • Ohh, Fran's gonna get sued.

  • Francisco Hosting Provider OG

    @willie said:
    Mine doesn't seem to have been down at all. I had an ssh session open last night and it is still connected. Storage is still mounted too. I'm used to having to remount it when anything happens.

    That's if a storage node you're attached to reboots; that actually kills active connections, so you'll go read-only.

    I like the setup we have, it's pretty easy to maintain, but the lack of wider redundancy is annoying.

    @Lee said:

    @Francisco said: I can do the entire thing while users are still running and without a single disruption or byte lost.

    Just copying for the comp claim later.

    :D Thankfully it just uses libvirt's live migrations. We literally rebuilt all of the LUX slabs... twice... last year due to XFS chewing its face off. Users were unaware it was happening, minus the lack of stock.

    Francisco

    Thanked by (1)yoursunny
  • @Francisco said:
    In the next few days we'll begin live trials on our Ceph cluster. Our own tests look pretty solid and will give us the option to offer Object Storage (S3) if we wanted to.

    It'll take quite some time to migrate users into Ceph, but I'm fairly sure I can do the entire thing while users are still running and without a single disruption or byte lost.

    Francisco

    That's exciting! I assume you have enough nodes that you can lose a few without going HEALTH_WARN; the stress of heavy scrubbing on a live cluster can quickly cause cascading issues. Some of us still remember ZXHost....

  • Francisco Hosting Provider OG

    @seanho said:

    @Francisco said:
    In the next few days we'll begin live trials on our Ceph cluster. Our own tests look pretty solid and will give us the option to offer Object Storage (S3) if we wanted to.

    It'll take quite some time to migrate users into Ceph, but I'm fairly sure I can do the entire thing while users are still running and without a single disruption or byte lost.

    Francisco

    That's exciting! I assume you have enough nodes that you can lose a few without going HEALTH_WARN; the stress of heavy scrubbing on a live cluster can quickly cause cascading issues. Some of us still remember ZXHost....

    Shouldn't be a problem :) I suspect ZX was flying by the seat of his pants and barely had enough capacity to cover what he was offering, never mind spare. There's a real chance he had 'min_size' == 1, basically R0.

    Francisco

    Thanked by (1)seanho
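
For readers unfamiliar with the 'min_size' remark above: in a replicated Ceph pool, `size` is the number of copies the pool aims to keep and `min_size` is the number of copies that must be available for I/O to continue. The sketch below is only a toy model of that rule, not Ceph code; the class name, method names, and the example settings are all hypothetical and are not taken from either provider's configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicatedPool:
    """Toy model of a replicated pool; not Ceph's actual implementation."""
    size: int      # copies the pool tries to keep of each object
    min_size: int  # copies that must be available for I/O to continue

    def accepts_io(self, replicas_up: int) -> bool:
        # I/O continues only while at least min_size copies are available.
        return replicas_up >= self.min_size

    def spare_copies_at_worst(self) -> int:
        # Extra copies guaranteed for a write accepted with the fewest
        # replicas allowed. Zero means one more failure can lose that write.
        return self.min_size - 1


# Hypothetical settings, chosen only to illustrate the difference.
risky = ReplicatedPool(size=2, min_size=1)
safer = ReplicatedPool(size=3, min_size=2)

for name, pool in (("risky", risky), ("safer", safer)):
    print(name, pool.accepts_io(replicas_up=1), pool.spare_copies_at_worst())
```

With `min_size` of 1 the pool keeps accepting writes on a single surviving copy, which is the situation the post compares to R0: there is no spare copy if that last replica fails.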
  • @Francisco said:

    @seanho said:

    @Francisco said:
    In the next few days we'll begin live trials on our Ceph cluster. Our own tests look pretty solid and will give us the option to offer Object Storage (S3) if we wanted to.

    It'll take quite some time to migrate users into Ceph, but I'm fairly sure I can do the entire thing while users are still running and without a single disruption or byte lost.

    Francisco

    That's exciting! I assume you have enough nodes that you can lose a few without going HEALTH_WARN; the stress of heavy scrubbing on a live cluster can quickly cause cascading issues. Some of us still remember ZXHost....

    Shouldn't be a problem :) I suspect ZX was flying by the seat of his pants and barely had enough capacity to cover what he was offering, never mind spare. There's a real chance he had 'min_size' == 1, basically R0.

    Francisco

    For what it’s worth, I/ZX was using erasure coding, 6-2 if I remember correctly. I was in the process of adding a new storage node and hit a nasty bug that caused some extents in the OSD journal to be mis-set during the rebalance.

    It was fine until an OSD needed to restart and play back the journal. I worked with the Ceph devs to fix the issue at the time.

    However, by that point enough OSD/PG shards were corrupt that pretty much every RBD was impacted, hence the toasted filesystems.

    Thanked by (3)caracal mikho fluttershy
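
As a rough illustration of the erasure-coding figures mentioned above, the sketch below assumes "6-2" means 6 data shards plus 2 coding shards (k=6, m=2) and assumes a code where any k of the k+m shards can rebuild the data, as with Reed-Solomon. The function names are hypothetical, and this is plain arithmetic rather than anything from Ceph itself.

```python
def ec_profile(k: int, m: int) -> dict:
    """Basic properties of a k+m erasure-coded layout (toy arithmetic only)."""
    return {
        "shards_per_object": k + m,
        "max_lost_shards": m,           # any k of the k+m shards can rebuild the data
        "raw_per_usable": (k + m) / k,  # raw bytes stored per usable byte
    }


def recoverable(k: int, m: int, bad_shards: int) -> bool:
    # Data can be rebuilt only while at least k good shards remain.
    return (k + m) - bad_shards >= k


profile = ec_profile(k=6, m=2)  # assumed reading of "6-2"
print(profile)
print(recoverable(6, 2, bad_shards=2), recoverable(6, 2, bad_shards=3))
```

Under these assumptions, a placement group survives two bad shards but not three, which fits the account above: once enough shards of the same PG were corrupt, the data in it could not be rebuilt.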
  • Hey you're here, Ash! I did enjoy ZX while it lasted, and I knew you tried your best to recover.

    Thanked by (1)AshUk
  • edited April 2022

    @AshUk said:

    I/ZX

    oh, hi there! good to see you alive and standing... miss my storage boxes still, however it all could have ended better I guess ;-)

    How's everything going? any plans to come back to hosting? probably people quickly get their forks out, so be careful ... all the best!

  • Hosting ain't worth it. Stay away.

  • edited April 2022

    @Falzo said:

    @AshUk said:

    I/ZX

    oh, hi there! good to see you alive and standing... miss my storage boxes still, however it all could have ended better I guess ;-)

    How's everything going? any plans to come back to hosting? probably people quickly get their forks out, so be careful ... all the best!

    Hey!

    A couple of months after, I actually had a few people reach out to me asking if I was going to restart or offer something, as they had a need for XXTB and couldn’t find it anywhere else.

    So for the past year or so I have run a small operation, providing service by word of mouth only.

    Zero advertising or anything. After that I realised that with 100s of clients paying $.$$, when something goes wrong it’s a huggeee headache, and I don’t have much more hair to lose…

    Feel free to drop me a message if you ever need any advice or anything; I don’t want to prop up this post any more.

    But yeah, I don’t think I’ll be doing anything similar again anytime soon. Maybe I'll launch something on the mid/higher end of $, not aiming for the bottom.

    Thanked by (1)usr123
This discussion has been closed.