An interesting tech test, looking for sponsors or pledges from interested people. [Industry Secrets]


Comments

  • Not_Oles (Hosting Provider, Content Writer)
    edited June 2020

    @AnthonySmith said: Any questions, thoughts, words of support?

    Please wait a moment. This is a really super fun idea! But maybe there are as yet insufficiently considered questions of experimental data and statistical analysis?

    First, experimental data: how do the control and experimental groups report their perceptions of server performance? In other words, what data will be collected and analyzed? For example, do the users fill out a questionnaire several times during the experiment and again at the end? Or something else?

    Second, statistical analysis: what size does the study need to be in order for the results to be statistically significant? I remember that the famous lady tasting tea experiment is subject to criticism that there weren't enough tests.
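
    For example, a back-of-envelope power calculation (with completely made-up proportions for how many users in each group rate performance "poor") might look like this in Python:

        import math

        # Rough sample size per group for comparing two proportions,
        # e.g. the share of each group who rate performance "poor".
        # The 0.5 vs 0.7 rates below are made-up assumptions, not data.
        def n_per_group(p1, p2, z_alpha=1.96, z_beta=0.84):
            # Normal-approximation formula: two-sided 5% alpha, 80% power.
            p_bar = (p1 + p2) / 2
            num = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                   + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
            return math.ceil(num / (p1 - p2) ** 2)

        print(n_per_group(0.5, 0.7))  # -> 93 responses per group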

    I know very little about all this, but I am guessing it might be a great idea, before starting, to think carefully about experimental data and statistical analysis.

    Hope this helps!

    Thanked by (3): imok, dahartigan, uptime
  • InceptionHosting (Hosting Provider, OG)

    Feel free to propose some questions; I was thinking of a questionnaire at 50% capacity and then again at the end.

    There are 3 bits of knowledge I want to gain and share from this.

    1. Is the method viable, and what is the breaking point?
    2. Is the end-user experience comparable to an under-provisioned node?
    3. Even if the overselling is noticeable in comparison, is the service still viable and worth paying for if it is cheap?

    https://inceptionhosting.com
    Please do not use the PM system here for Inception Hosting support issues.

  • bikegremlin (Moderator, OG)

    Overselling is not the same as overloading.

    It would be interesting to have 4 test servers.
    One not oversold.
    Second "conservatively" oversold.
    Third "normally" oversold.
    Fourth "optimistically" oversold.

    Detailed info about providers whose services I've used:
    BikeGremlin web-hosting reviews

  • InceptionHosting (Hosting Provider, OG)

    @bikegremlin said:
    Overselling is not the same as overloading.

    It would be interesting to have 4 test servers.
    One not oversold.
    Second "conservatively" oversold.
    Third "normally" oversold.
    Fourth "optimistically" oversold.

    Well, that's an expensive experiment :)

    Also, if you just do a vanilla install and throw 256GB of VPS at a server with 64GB RAM... you are going to have a bad day.

    If you tweak the bejesus out of it, manage the RAM compression and swap, and create a KSM profile to fit the use case, then you might have a hard day, but you might get through it :)
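
    Roughly the sort of thing I mean, as a host-side sketch - the zram size, KSM scan rate, and swappiness below are illustrative guesses, not a tested recipe:

        import subprocess

        # Sketch: compressed RAM as swap (zram), kernel samepage merging
        # (KSM) across similar guests, and aggressive swappiness.
        # Run as root on the host node; all values are guesses.
        def write(path, value):
            with open(path, "w") as f:
                f.write(str(value))

        # 1. zram: a compressed block device in RAM, used as high-priority swap.
        #    (Secondary swap on NVMe would be added with a lower swapon priority.)
        subprocess.run(["modprobe", "zram"], check=True)
        write("/sys/block/zram0/comp_algorithm", "lz4")
        write("/sys/block/zram0/disksize", 64 * 1024**3)  # e.g. 64G on a 64G box
        subprocess.run(["mkswap", "/dev/zram0"], check=True)
        subprocess.run(["swapon", "-p", "100", "/dev/zram0"], check=True)

        # 2. KSM: merge identical guest pages (QEMU marks its memory mergeable).
        write("/sys/kernel/mm/ksm/run", 1)
        write("/sys/kernel/mm/ksm/pages_to_scan", 1000)  # scan rate: a guess
        write("/sys/kernel/mm/ksm/sleep_millisecs", 20)

        # 3. Swap to compressed RAM early rather than stalling on reclaim.
        write("/proc/sys/vm/swappiness", 100)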

    Thanked by (1): wdmg

    https://inceptionhosting.com
    Please do not use the PM system here for Inception Hosting support issues.

  • @AnthonySmith said:
    Looks like the IP space is covered - really appreciated; now we just need to secure the hardware, ideally for 3 months.

    There are some 12-core Opterons available pretty cheap right now, if you maybe want to buy two servers outright and just do whatever you want with them later on.

    I'm the 85%. Also Elon likes memes hence he's an idiot.

  • This is so cool and @AntGoldFish is best animator 2020 confirmed, but my nerdy side agrees with @poisson and @Not_Oles - this cannot claim to prove anything, as any result you get may just have happened to be so.

    Do you wish to show some people it's OK to oversell at a sane ratio? Will they learn, though? Do you want to find out how much is OK? The answer is known: "it depends".

    That being said, I would purchase a 3-month participation for anything less than $7, if only for the lulz.

  • InceptionHosting (Hosting Provider, OG)
    edited June 2020

    @comi I understand what you are saying. I am not trying to prove an ironclad point; this is a technical exercise, with observers, that shows the art of the possible, will perhaps surprise some people who believe that 10% overcommit = turd service, and pushes the boundaries to find a level.

    edit: Also, this is to illustrate that the services that have been selling 8GB RAM servers at stupidly low prices (e.g. 3.99 p/m) for the past 5 years or so have quite possibly been simply selling SSD swap space as RAM, and to see whether, properly tuned, that is possible without hurting the user experience much (if cheap enough).

    With NVMe hitting insane speeds now, that is likely even more possible.

    Thanked by (1): comi

    https://inceptionhosting.com
    Please do not use the PM system here for Inception Hosting support issues.

  • edited June 2020

    I always thought a provider could only get away with this sort of setup with very high installed RAM on the host: 256G-1024G (let's say 128G as a minimum).

    Hetzner server bidding keeps having those 12T, 256GB boxes for under 70EUR/mo. Subnets would be extra.

    Also, 2GB RAM is sort of the new 1GB. I think a lot better oversell planning could be done with 4 and 8GB RAM guests with lots of idle RAM. Of course, don't hand out vCPU like candy.

    App examples:
    I know that a fresh GitLab install idles close to 3GB (not including disk page cache for repos).
    Zulip recommends 4GB.
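
    Back-of-envelope on why the idle RAM matters - every number here is a made-up assumption:

        # Toy density math: host RAM, plan size, and typical idle usage
        # are all invented figures for illustration.
        host_ram_gb = 256        # e.g. one of those Hetzner auction boxes
        guest_plan_gb = 4        # sold size per guest
        typical_idle_gb = 0.8    # guess: most guests idle well under 1GB

        sold_flat = host_ram_gb // guest_plan_gb      # 64 guests at 1:1
        covered = int(host_ram_gb / typical_idle_gb)  # ~320 idle guests
        print(f"1:1 fits {sold_flat} guests; real RAM covers ~{covered} "
              f"idle guests (~{covered / sold_flat:.1f}x headroom)")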

  • bikegremlin (Moderator, OG)
    edited June 2020

    My 2c - I'm not (claiming to be) an expert, keep that in mind - but if it's of any use:

    Overselling is a rather reasonable way to sell resources, because many websites (not sure about other online applications) are more or less idle much of the time. As long as overselling is well balanced, it allows for a much lower price (and better use of the resources), with very few performance "penalties". In my opinion, it makes little sense not to oversell.

    Overloading is a different thing: overdo the overselling, and there are problems. I suppose it boils down to experience and good monitoring by the hosting provider (and allowing some margin of error for resource-use spikes) to make sure that overselling doesn't turn into overloading. Good (especially low-end) providers get this right most of the time, allowing them to offer affordable prices with no performance problems (cutting down on overselling means prices must go up, while overdoing it means performance suffers).

    For testing and comparison, I use a free, primitive method (my testing method). TL;DR - two identical WordPress websites, one on each server, then Octoperf to simulate 50 simultaneous visitors browsing them, plus manual GTmetrix testing. The method is primitive, precision varies by up to 14%, but it's the best I can do (on my budget and knowledge level). Running the tests 3 times improves precision to 5% or better.

    This could give a more-or-less objective performance gauge and comparison.
    A control could be done by first running both websites (and tests) on one server, to confirm the test precision.
    The downside is I have no idea how much RAM and CPU load these tests really put on a server. Probably ridiculously little, so the (oversold) server must be put to work by the rest of the stuff that's running on it.
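
    (If Octoperf isn't handy, a crude home-grown stand-in for the same idea - hypothetical URL, same 50 simultaneous visitors - could look like this:)

        import concurrent.futures, statistics, time, urllib.request

        # Crude load test: N "visitors" fetch a page at once, and we look
        # at the response-time spread. The URL is a placeholder; 50 is the
        # visitor count from the Octoperf runs described above.
        URL = "https://example.com/"
        VISITORS = 50

        def visit(_):
            start = time.perf_counter()
            with urllib.request.urlopen(URL, timeout=30) as response:
                response.read()
            return time.perf_counter() - start

        with concurrent.futures.ThreadPoolExecutor(max_workers=VISITORS) as pool:
            times = sorted(pool.map(visit, range(VISITORS)))

        print(f"median: {statistics.median(times):.3f}s  "
              f"~p95: {times[int(0.95 * (len(times) - 1))]:.3f}s")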

    Even more drivel, again - not an expert:
    Running out of RAM should create problems, while a (temporary) lack of CPU resources shouldn't hurt nearly as much. I think that's also something to be considered. At least with HDD and SSD used as swap - I'm not sure how NVMe performs and whether it can change that, so that overloading the CPU becomes the primary thing to worry about. Surely experienced providers know the answers, but testing this takes time.

    Detailed info about providers whose services I've used:
    BikeGremlin web-hosting reviews

  • @AnthonySmith said:

    @dahartigan said: You'd pretty much have to use alphawootracks as the benchmark for horrible performance imo lol

    I think that was caused by them simply overselling rather than applying some intelligent design to it. :)

    I agree with that 100%. One thing I just considered is that they also used HDDs, which caused their main issues that I can recall (horrible IO), plus a lot of abusive clients they attracted/appealed to/allowed, who thrashed the CPU (admittedly, I'm guilty of that), the network, and the disk (torrents etc.).

    I'm starting to get curious now about how far one could REALLY push a decent server: cram absolutely everyone onto it, and oversell the absolute shit out of the CPU and RAM by basically just disregarding them, lol. The storage would be interesting if oversold by no more than 2x (200% of physical storage), to see how much cramming would occur and how long it would take; if monitored, it could even be scaled by adding another drive or two. I think I'm running off on a tangent from your idea - apologies for that, just brain-farting because I have a spare 10 mins :)

    Get the best deal on your next VPS or Shared/Reseller hosting from RacknerdTracker.com - The original aff garden.

  • Ok, let me bring some research and statistical expertise to the table.

    First up, there are too many confounders (think of them as alternative explanations) that can influence perceptions of performance (which seems to be what we are measuring). I don't think it is possible to make any causal claims. The most we can say is that there is/isn't a statistically significant difference (meaning the difference is not due to chance) between the two groups in terms of perception of server performance. That in itself is probably a sufficiently valuable conclusion.

    To answer the question of sample size, well, that is a little complicated, but I would say at least 30 responses in each group would be a minimum.
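
    Once the questionnaire responses are in, the comparison itself is the easy part. For instance, a non-parametric test on (made-up) 1-5 ratings - I'd reach for Mann-Whitney U rather than a t-test, since small samples of ordinal data shouldn't be assumed normal:

        from scipy.stats import mannwhitneyu  # needs scipy installed

        # Hypothetical 1-5 "how does the server feel?" ratings; both lists
        # below are invented purely to show the mechanics of the test.
        control = [4, 5, 4, 3, 5, 4, 4, 5, 3, 4]
        oversold = [4, 3, 4, 3, 4, 5, 3, 3, 4, 4]

        stat, p = mannwhitneyu(control, oversold, alternative="two-sided")
        print(f"U = {stat}, p = {p:.3f}")  # small p => unlikely to be chance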

    Deals and Reviews: LowEndBoxes Review | Avoid dodgy providers with The LEBRE Whitelist | Free hosting (with conditions): Evolution-Host, NanoKVM, FreeMach, ServedEZ | Get expert copyediting and copywriting help at The Write Flow

    We can sit here and discuss the scientific faults or shortfalls of the setup all day long. Frankly, this is also possible with even the best scientific research. But this looks like a fun technical test and something uniquely possible on a forum like this. I'm sure enough members can find actual real-world applications to run, versus just testing, to make this a worthwhile test (blogs, server monitoring, DNS, Discord, etc.).

    Let's do it and see what the results are. I have a feeling that a clean NVMe RAID 1 or RAID 10 will actually be fairly acceptable in use, with very little performance impact. I'm personally curious what the typical number of users is for a low-end provider. 100 users? 1000? We should maybe discuss that, but for the rest, let's just see what happens! :D

    Thanked by (2): Abdullah, PHP_Backend
  • InceptionHosting (Hosting Provider, OG)

    Yep, the thing is, with an artificial load, no way to tell in advance how it will be used, and people knowing it is a test, there is no way to get concrete results.

    It will, however, have a control group that will not know they are a control group. So some insight can be gained.

    Additionally, it will test an extreme resource-density model that I have had in my head for a while - one that I am certain some hosts use, but that I have never had the balls, or lack of morals, to try in practice.

    Also, it’s fun for the whole community.

    https://inceptionhosting.com
    Please do not use the PM system here for Inception Hosting support issues.

  • @AnthonySmith said:
    Yep, the thing is, with an artificial load, no way to tell in advance how it will be used, and people knowing it is a test, there is no way to get concrete results.

    It will, however, have a control group that will not know they are a control group. So some insight can be gained.

    Additionally, it will test an extreme resource-density model that I have had in my head for a while - one that I am certain some hosts use, but that I have never had the balls, or lack of morals, to try in practice.

    Also, it’s fun for the whole community.

    Frankly, there is no way to get generalizable results because in the first place, you will not get a probability sample for your participants. So, yes, this is a fun test and the results will be indicative but not predictive.

    Deals and Reviews: LowEndBoxes Review | Avoid dodgy providers with The LEBRE Whitelist | Free hosting (with conditions): Evolution-Host, NanoKVM, FreeMach, ServedEZ | Get expert copyediting and copywriting help at The Write Flow

  • bikegremlin (Moderator, OG)

    @poisson said: Frankly, there is no way to get generalizable results because in the first place, you will not get a probability sample for your participants. So, yes, this is a fun test and the results will be indicative but not predictive.

    Would it make sense to run a speed/stress test - was that intended?
    For comparison of how performance changes depending on whether a server is oversold or not.

    Detailed info about providers whose services I've used:
    BikeGremlin web-hosting reviews

  • InceptionHosting (Hosting Provider, OG)
    edited June 2020

    @bikegremlin said: Would it make sense to run a speed/stress test - was that intended? For comparison of how performance changes depending on whether a server is oversold or not.

    Not really, as that is fully synthetic.

    The real point here is: can 250 experienced users, coming to this without a destructive mindset,

    A) tell that they are on a node which is running compressed RAM as swap, with secondary swap on NVMe;
    B) consider it usable, regardless of whether they can tell the difference.
    C) And, somewhat outside the end-user experience, it shows an interesting technical method of gaining significant density over standard, and how well the model/method works.

    That is all. No one is trying to get a paper peer-reviewed or published in a recognised journal, and no one is saying there will not be areas of weakness in the method due to the scale. It is what it is, it is very interesting, and maybe it poses more questions than it answers, but that is not a bad thing either.

    Thanked by (2): Abdullah, dahartigan

    https://inceptionhosting.com
    Please do not use the PM system here for Inception Hosting support issues.

  • bikegremlin (Moderator, OG)

    I still don't see how a comparative performance test could harm anything. It would only give more info, which could be useful in judging the performance-reducing effect of overselling more objectively.

    Without it, it's like refusing to use a thermometer when comparing whether a cellar is cooler than an air-conditioned room. Sure, it's great to know the subjective impression, but why deliberately disregard any measurement?

    Thanked by (1): PHP_Backend

    Detailed info about providers whose services I've used:
    BikeGremlin web-hosting reviews

  • InceptionHosting (Hosting Provider, OG)

    @bikegremlin said: I still don't see how a comparative performance test could harm anything. It would only give more info, which could be useful in judging the performance-reducing effect of overselling more objectively.

    Without it, it's like refusing to use a thermometer when comparing whether a cellar is cooler than an air-conditioned room. Sure, it's great to know the subjective impression, but why deliberately disregard any measurement?

    I have no issue with it in general, but there is no doubt that the sold-to-spec server will perform better.

    The test is a perception test at its core.

    For example, we all know a V6 Ford engine with a turbo is going to do better than one without a turbo. But if you want to find out whether people think the one without the turbo performs acceptably, based on their past experience as regular drivers of multiple vehicles, then you don't put it on a rolling road and show performance metrics first, as that will skew perception - and perception is reality.

    This is why I said from the get-go: if this is done and you buy it, and your only reason for buying is to find out which server is which, then you have missed the point.

    Again though, I would expect to do the reveal at about 60 days into a 90-day test, so at that point comparative benchmarks could be a great addition.

    This reminds me of the tickets I get during Black Friday, like "I know it's a Black Friday special but I expect better performance". Upon checking the host node, it becomes obvious that 30 customers are benchmarking at the same time with zero awareness of each other.

    If 30 users benchmark on the undersold node, it will get worse results than if no one is benchmarking on the oversold one :)

    As such, I personally put zero store by generic synthetic benchmarks on a VPS, because you are doing them blind. So yes, it does no harm if done at the appropriate time - which is not at the start of the experiment - and the benchmarks would be better done by me at a time of known average load, after 60 days has established that baseline.
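
    Establishing that baseline could be as simple as logging load averages across the 60 days - a sketch, with an arbitrary interval and log path:

        import time

        # Sample the host's 15-minute load average periodically to build
        # the "known average load" baseline before any reveal-day benchmarks.
        # The log path and 5-minute interval are arbitrary choices.
        LOG = "/var/log/oversell-baseline.log"

        while True:
            with open("/proc/loadavg") as f:
                load15 = float(f.read().split()[2])
            with open(LOG, "a") as log:
                log.write(f"{time.time():.0f} {load15}\n")
            time.sleep(300)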

    If you want, and are really interested, I would be happy to have you keep that part honest by overseeing the method and execution.

    https://inceptionhosting.com
    Please do not use the PM system here for Inception Hosting support issues.

  • InceptionHosting (Hosting Provider, OG)

    @ReadyDedis said:
    I'll be cool with sponsoring 2 units of 2 x Dual E5-2620v2 - 32GB RAM - 4x240GB SSD (or 2x500GB SSD)

    I am sorry; as it was the last comment on the page, I missed it.

    Much appreciated. I would like to wait a few more days to see if we can get any NVMe-based servers though, as those PCIe 3/4 lanes make a huge difference.

    https://inceptionhosting.com
    Please do not use the PM system here for Inception Hosting support issues.

  • @bikegremlin said: Would it make sense to run a speed/stress test - was that intended? For comparison of how performance changes depending on whether a server is oversold or not.

    The point is not to stress test. That's easy. The goal is to simulate a "natural" experiment (in a true natural experiment, the participants don't know they are part of the experiment, so this proposal lies midway between a proper natural experiment and a controlled experiment) to see if participants notice a difference that leads them to suspect a server has been oversold.

    When you stress test, you are basically forcing maximum performance out of the machine for the entire duration, but on oversold servers you probably won't see stress-test levels all the time (well, you can reach that state if it's horribly oversold, but most providers seek to oversell only a bit - understandably, for more profit). The key question is: what is the distribution of different users' usage? Does it follow a normal distribution, or is it skewed to the left or to the right?

    If it is a left skew, you can oversell more and not worry. If it is a right skew, better not. We don't know, and that's what the experiment is about.
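
    A toy simulation shows why the skew matters - the distribution shapes and the 2x oversell ratio below are invented purely for illustration:

        import random

        # Toy model: a 64GB host sells 2x its RAM as 128 x 1GB guests.
        # "left" = most guests idle low (usage bunched at the left end);
        # "right" = a long tail of heavy users. Parameters are invented.
        HOST_GB, GUESTS = 64, 128

        def over_ram(skew):
            a, b = (2, 6) if skew == "left" else (6, 2)
            usage = [random.betavariate(a, b) for _ in range(GUESTS)]  # GB each
            return sum(usage) > HOST_GB

        for skew in ("left", "right"):
            rate = sum(over_ram(skew) for _ in range(10_000)) / 10_000
            print(f"{skew}-bunched usage: over physical RAM in {rate:.0%} of trials")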

    Deals and Reviews: LowEndBoxes Review | Avoid dodgy providers with The LEBRE Whitelist | Free hosting (with conditions): Evolution-Host, NanoKVM, FreeMach, ServedEZ | Get expert copyediting and copywriting help at The Write Flow

  • @AnthonySmith said:
    That is all. No one is trying to get a paper peer-reviewed or published in a recognised journal, and no one is saying there will not be areas of weakness in the method due to the scale. It is what it is, it is very interesting, and maybe it poses more questions than it answers, but that is not a bad thing either.

    Not trying to get published in a peer-reviewed journal doesn't mean not treating the study design seriously. You don't want to waste time or money. Just saying.

    Deals and Reviews: LowEndBoxes Review | Avoid dodgy providers with The LEBRE Whitelist | Free hosting (with conditions): Evolution-Host, NanoKVM, FreeMach, ServedEZ | Get expert copyediting and copywriting help at The Write Flow

  • bikegremlin (Moderator, OG)
    edited June 2020

    @AnthonySmith said: I have no issue with it in general, but there is no doubt that the sold-to-spec server will perform better.

    The test is a perception test at its core. [...]

    If you want, and are really interested, I would be happy to have you keep that part honest by overseeing the method and execution.

    I understand your point - but. :)

    Not arguing - just trying to explain my point of view - if it's of any practical use for your test.

    The tests I had in mind measure a sort of non-critical load (for somewhat decent hosting with caching set properly). If there are hiccups with those, then I conclude that the server is not acceptably good. If there are large performance differences - same conclusion about which server is worse (not good enough).

    Because, from my experience, when relying on "subjective testing", even Bluehost shared hosting is good - until it isn't - which sometimes takes months to become apparent (just discussing the performance, not tech support or security). I was a happy EIG user for years, until I noticed problems - I've been measuring ever since, because it's the fastest way to compare and see if a host is (any) good.

    Of course, I have almost no experience with VPSes, so this could all be wrong. Maybe with VPSes the differences are more noticeable and more gradual.

    EDIT: the idea is for you (or one designated user) to do the tests - not for everyone to be doing stress tests - then it would not be good.

    Detailed info about providers whose services I've used:
    BikeGremlin web-hosting reviews

  • Soooooo... where are we?

  • InceptionHosting (Hosting Provider, OG)

    @MrPsycho said:
    Soooooo... where are we?

    Never got any suitable hardware offers, so it was not possible. It's one that may happen in the future.

    https://inceptionhosting.com
    Please do not use the PM system here for Inception Hosting support issues.

  • @AnthonySmith said:

    @MrPsycho said:
    Soooooo... where are we?

    Never got any suitable hardware offers, so it was not possible. It's one that may happen in the future.

    We're still happy to provide hardware to whatever specs are necessary.
