MongoDB vs MariaDB vs PostgreSQL for Strapi
I am trying to deploy Strapi for my API server, and I'm currently looking for the best database for it. I tried googling it, but couldn't find any relevant info, and people just seem to use whatever they normally prefer. Has anybody tried Strapi with multiple database backends? Which DB should perform the best?
Any suggestion would be awesome, as well as suggestions for other headless CMSes (preferably with an admin panel).
Thanks 
 
                             
                            
Comments
I think that's your answer.
In IT, everything has trade-offs. So people tend to use the tool that they prefer (or know).
I use MongoDB in production and locally. I never get a headache about relations or anything else to do with the database.
Avoid MongoDB. It's a mediocre database at best, and you're quite likely to end up with some form of data corruption over time, given that it doesn't validate data integrity like an RDBMS would.
I've not kept track of MySQL/MariaDB lately (it's much less pleasant to work with than PostgreSQL), but where performance is concerned, PostgreSQL will easily win out over MongoDB, no matter how many misleading benchmarks MongoDB (the company) tries to publish...
Everything has tradeoffs, sure. But sometimes there are technologies that are just bad, that provide literally no redeeming features over already-existing options. MongoDB is one of those cases; it basically only exists because the company behind it needed a product to sell, it doesn't have any actual redeeming features.
(For other, more serious databases, there are generally pros and cons to each option. MongoDB is a special case here.)
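To make the "validate data integrity" point concrete, here is a minimal sketch of what a relational database does when a row points at something that doesn't exist. It uses Python's built-in sqlite3 module only because it needs no server; the `authors` and `posts` tables are made up for the illustration, and PostgreSQL or MariaDB would reject the same insert with their own error types.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# SQLite only enforces foreign keys when this pragma is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: every post must reference an existing author.
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE posts ("
    " id INTEGER PRIMARY KEY,"
    " author_id INTEGER NOT NULL REFERENCES authors(id),"
    " title TEXT NOT NULL)"
)

conn.execute("INSERT INTO authors (id, name) VALUES (1, 'alice')")
conn.execute("INSERT INTO posts (author_id, title) VALUES (1, 'hello')")

# Author 999 does not exist, so the database refuses the row
# instead of silently storing a dangling reference.
try:
    conn.execute("INSERT INTO posts (author_id, title) VALUES (999, 'orphan')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

post_count = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
print("orphan rejected:", rejected, "| posts stored:", post_count)
```

In a schemaless document store, that kind of check has to live in application code, which is where the risk of inconsistent data over time comes from.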
Completely agree, also an absolute memory hog.
Constant breakdowns. It's good as a small DB, but when you need anything more than a few TBs, just stay away from it.
A nicely sharded MySQL or Postgres works much better.
If you have the cash, stick with Oracle or MSSQL (I have seen database sizes that make most big DBs look puny).
Cassandra could be the go-to DB; it works if you have the know-how and a team to maintain it.
Is Strapi something that uses TBs of data? Mongo is easy to get started with (document DB instead of relational) and has an easy replication setup (as does Cassandra), plus it's very fast in unsafe mode, but I'd agree to stay away from it for serious purposes. Cassandra isn't especially hard to use either (I've played with it). ScyllaDB is an interesting alternative to Cassandra, though I haven't tried it yet.
@PHP_Backend @lightblade @joepie91 @evnix @willie thank you all for the replies, and sorry for responding so late. TBH I was really busy with other personal stuff and didn't really have time to look into it.
It seems like for testing purposes, MongoDB should work fine and fast (especially with unsafe mode), but I should stick with MariaDB in production, since I am used to it and it is more stable.
Once again, thank you everyone!
Here's the problem, though: "getting started with" something is something you only do once, whereas "keeping it going" is something you will be doing effectively forever. MongoDB makes it "easy to get started", in exchange for making everything after that significantly harder and less reliable, forever.
That's great for their ability to market a subpar database product (and in fact, this seems to be quite literally their marketing strategy), but for the end user it means that the rare case is being optimized at the cost of the common case - ie. a very bad deal.
As for "easy replication setup" - it may be easy to get it running in something it claims is a replicated setup... but pretty much every single person I've spoken to who has actually maintained a serious production MongoDB cluster has called it a nightmare to operate and maintain, with constant inexplicable failures.
There's a reason there's a billion "this is how easy it is to get started with MongoDB" tutorials around the web, and virtually none that tell you how to maintain a MongoDB cluster in the long run. Most anyone running a serious deployment has migrated away to a serious database by that point.
(This is actually a great indicator for whether a new-ish technology is just hype, or a serious improvement; is the web full of "getting started" posts, or are there also in-depth articles about long-term use? If it's just the former and almost none of the latter, it's probably just hype.)
Could you please elaborate a bit? I'm new to Mongo, but I have some experience with SQL. Mongo seems to look better than MySQL, but then good marketing makes shit look gold. I would love to hear your experiences/opinions
I've done it, and there are definitely serious deployments around, though I agree with you that Mongo was always overhyped and people have caught onto it more by now. I still use it for a few things. SQL users now tend to do things like put JSON text into columns (MySQL and PostgreSQL have both acquired features to support this) to implement Mongo-like soft-schema approaches. I agree that for big permanent production applications you are better off jumping through all the SQL hoops.
That's a difficult question to answer unless I know what impressions you've gotten of MongoDB, given how much different nonsense they've marketed over the years.
So yeah, in what way(s) does it seem to look better to you than MySQL? Then I can address those points specifically.
It's "Alternative".
This!
Not much, to be honest. I have been skeptical towards Mongo (probably because their marketing made it sound too good), but I've heard that it apparently scales better than MySQL.
I haven't tested it myself, but Mongo claims to be easier to configure HA on. Their marketing seems to indicate that it's much better than MySQL.
Okay, so what that is really referring to, is that MongoDB uses sharding (basically, distributing records across multiple servers and using a deterministic algorithm to determine what server to ask for what record), which makes it "easy" to scale up in the sense that it doesn't require you to architect your data storage around a particular distribution model across servers, it just throws all the records into a big content-addressable bucket.
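As a rough illustration of "a deterministic algorithm to determine what server to ask for what record", here is a minimal hash-based sketch. This is not MongoDB's actual routing logic; the function name `shard_for` and the host names are invented for the example.

```python
import hashlib
from typing import Sequence

# Hypothetical shard hosts, purely for illustration.
SHARDS = ["db-0.example", "db-1.example", "db-2.example"]

def shard_for(key: str, shards: Sequence[str]) -> str:
    """Pick the shard responsible for a record key.

    A stable hash is used (not Python's built-in hash(), which is
    randomized per process for strings), so every client computes
    the same answer for the same key.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]

for key in ["user:1", "user:2", "user:3"]:
    print(key, "->", shard_for(key, SHARDS))
```

Note that with plain modulo hashing, adding or removing a server changes the result for most keys, so real systems tend to use something more elaborate (consistent hashing, range-based chunks and so on).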
This is not a technique that's unique to MongoDB, and in fact there are quite a few databases that can be sharded (or are sharded by default).
What their marketing copy doesn't mention, however, is that sharding comes with severe tradeoffs; you can't have relational integrity, because a sharded system cannot assume that other servers are available to check the validity of certain references against, and there can be significant overhead associated with lookups when different servers have a different idea of which servers are currently online and "healthy", as well as a lot of opportunities for servers to serve up outdated versions of records.
All the while, this functionality is not necessary for the vast majority of projects (you can scale very far on a single server, enough for 99.9% of use cases), and so sharding is a really bad default as a replication strategy, because you will be trading in data integrity guarantees that you do need, for scalability features that you don't need. There's a reason why RDBMSes don't shard by default.
If you still really want to do sharding for some reason, then there's an implementation of that for PostgreSQL and, from a quick search, it seems for MySQL as well (though I don't bother with MySQL personally, PostgreSQL is better and nicer to work with in almost every way).
So yeah, not quite the unique selling point that MongoDB are presenting it to be.
How about replication?
https://docs.mongodb.com/manual/replication/
It's like a master<=>slave cluster. I'm not using it myself; I've only installed MongoDB in Docker for development so far.
Another source I found on the internet:
https://dba.stackexchange.com/questions/52632/difference-between-sharding-and-replication-on-mongodb
In a sharded model, you can have redundancy by simply having >1 nodes responsible for the same record, and modifying your algorithm to produce >1 results. That seems to be more or less what MongoDB does, with its "each shard can be a replica set" approach.
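A minimal sketch of "modifying your algorithm to produce >1 results", assuming a simple hash-then-walk placement. The name `replicas_for` and the node names are made up, and this is not how MongoDB assigns replica sets; it only shows the general idea.

```python
import hashlib
from typing import List, Sequence

def replicas_for(key: str, nodes: Sequence[str], copies: int = 2) -> List[str]:
    """Return `copies` distinct nodes responsible for the same record key."""
    if not 1 <= copies <= len(nodes):
        raise ValueError("copies must be between 1 and the number of nodes")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    # Take the hashed node plus the next ones in order, wrapping around,
    # so the result is deterministic and never repeats a node.
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]

NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical node names
print(replicas_for("user:1", NODES, copies=2))
```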
If you want just replication, then that's the standard mode of operation of most every RDBMS that supports a cluster of more than one instance. But HA replication is not trivial; and it's something that MongoDB definitely doesn't offer, despite its marketing (because for true HA, you need a guarantee that each node in the cluster will always produce non-stale data, which MongoDB doesn't provide).
Basically, MongoDB still doesn't do anything special here, and if you just run a replica set, you still don't get the relational integrity that an RDBMS would provide.
Edit: Also, you usually want none of the above things. The complexity of highly-redundant setups tends to cause more problems than it solves.
Sharding (splitting a single big dataset across multiple servers) and replication (maintaining multiple copies of a dataset for HA) are completely separate things. MySQL and Mongo both had better replication features than PostgreSQL (PG) did, as of a few years ago. Many people considered PG to be generally better than MySQL, but still chose to run MySQL because they needed replication. It's possible that PG has better replication by now: does anyone know?
There was a time when Mongo had sharding and PG didn't. Later there was a proprietary PG fork called CitusDB that had it, and PG itself has managed to catch up (at least partially) with Citus since then. So the most we can say is that PG has (possibly) caught up to Mongo for sharding by now. At one point Mongo had an advantage in this area and that was an attraction for some users. We can't blame the users for that. I don't know if MySQL has any sharding to this day.
Maintaining consistency with replication active is not complicated if you can tolerate slow operations: just wait for acks from multiple (or all) slave servers before declaring any operation fully committed. Mongo has modes that do that, they are slow, but they are there if you need them. Consistency across sharding is more complicated and there are various "eventual consistency" schemes etc. I think Cassandra was more sophisticated than Mongo about that. Mongo despite appearances was not really that concerned with enormous sharded datasets. It was more about easy setup and prototyping. I remember hearing that Riak was the most solid of the sharded db's though I never used it.
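The "wait for acks before declaring the write committed" idea can be sketched like this. It is a toy model, not any database's real protocol: each replica is just a function that returns True on success, and `replicated_write`, `healthy` and `down` are names invented for the example.

```python
from typing import Callable, Sequence

def healthy(value: str) -> bool:
    """Stand-in for a replica that stores the value and acknowledges it."""
    return True

def down(value: str) -> bool:
    """Stand-in for a replica that cannot be reached."""
    raise ConnectionError("replica unreachable")

def replicated_write(
    value: str,
    replicas: Sequence[Callable[[str], bool]],
    required_acks: int,
) -> bool:
    """Report the write as committed only if enough replicas acknowledged it."""
    acks = 0
    for send in replicas:
        try:
            if send(value):
                acks += 1
        except ConnectionError:
            # An unreachable replica simply does not count as an ack.
            continue
    return acks >= required_acks

print(replicated_write("x", [healthy, healthy, down], required_acks=2))
print(replicated_write("x", [healthy, down, down], required_acks=2))
```

The slowness comes from the fact that the client is only told "committed" after the slowest replica it depends on has answered; requiring fewer acks is faster but weakens the guarantee.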
Mongo in unsafe mode was really faster than the SQL databases, if you could accept the un-safety. It took a bad rap over that because maybe people expected unsafe mode to be safe and got upset when they found out that it wasn't. That seems silly to me. Redis is unsafe and fast and people use it exactly when that combination suits their purposes. There is no confusion, so Redis is popular.
These days I'm trying to get better at SQL, so I prototype with SQLite, but Mongo did make some things easy.
Hm, not intending to derail this thread, but if there are any available and easily understandable guidelines as to when (up to what size or other measurement) SQLite is preferable to PostgreSQL/MariaDB, that would be nice.
(And also, is MySQL/MariaDB better than PostgreSQL for low-memory VPSes?)
Just suck it up and learn Pg. Then spend 10 years mastering it.
Your career will thank you.
Pg: A real Swiss army knife for 95% of your data manipulation needs if you count the plugin ecosystem.
AFAIK it is still the case that MySQL has better replication for a particular replication model than PostgreSQL; I forgot which model that was, though.
I strongly doubt that anyone who initially picked MongoDB, actually picked it because they needed sharding. There were a bunch of other databases that implemented some variant of sharding before Mongo came around, and people who actually had a sharding requirement were very likely using those. The hype around MongoDB has always been focused around nebulous unsubstantiated "performance" claims, and hype around the "MEAN stack" (which MongoDB coined themselves), neither of which really had anything to do with sharding.
(Even today, almost no one is actually using the sharding features in PostgreSQL, because it just isn't a common requirement.)
That still doesn't give you relational integrity, though.
I've seen that claim thrown around, but mysteriously people fell silent every time I asked for a source. I've never seen a single benchmark that backed up this claim and was also verified by a third party as being a representative benchmark. Every plausible benchmark I've seen showed MongoDB to be slower than PostgreSQL, even when running MongoDB in unsafe mode.
Far as I can tell, the performance claim has never been more than that; a claim. MongoDB's recent benchmarking bullshit just further reinforces that conclusion for me.
It's certainly less of an issue than with MongoDB, because the Redis documentation is fairly clear that it's not a reliable data store, and not designed to be. But even then, I have to convince someone every once in a while that no, Redis is not meant to be used as a primary database...
If you need an embedded database, ie. not a separate server process but part of your application: use SQLite, I guess. In every other case: use PostgreSQL (or MySQL/MariaDB if you already have that running).
SQLite doesn't perform very well, and it's pretty awful to work with (it's missing fairly basic features such as "renaming columns" and it has a very questionable storage/typing model), but it's pretty much your only option for a vaguely-relational database without a separate process.
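For anyone wondering what the typing model remark refers to: by default SQLite treats a column type as a preference ("type affinity") rather than a constraint. A small sketch using Python's built-in sqlite3 module, with a made-up `items` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The column is declared INTEGER, but in an ordinary (non-STRICT) table
# that is only a type affinity, not an enforced constraint.
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, quantity INTEGER)")

# Text that cannot be converted to a number is stored as text anyway.
conn.execute("INSERT INTO items (quantity) VALUES (?)", ("not a number",))
# Text that looks like an integer is converted to an integer.
conn.execute("INSERT INTO items (quantity) VALUES (?)", ("42",))

rows = conn.execute(
    "SELECT quantity, typeof(quantity) FROM items ORDER BY id"
).fetchall()
print(rows)
```

PostgreSQL would reject the first insert outright, which is the kind of difference that bites when you prototype on one database and deploy on the other.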
It's not really worth using SQLite "for prototyping" or "for testing", IMO -- after a few months you'll discover that it's just costing you time and effort, because you constantly have to replicate database features that SQLite doesn't have natively, just to be able to use them in production where you're running PostgreSQL/MySQL/MariaDB. Just installing a database server on your development system will be much simpler in the long run.
I did a quick test at idle a while ago, and PostgreSQL idled below MySQL in memory usage. I don't have good data on memory usage under load, but I've seen no reason to believe that PostgreSQL is somehow unsuitable for low-memory systems; or at least, not any more unsuitable than MySQL would be.
I'd even say that it feels like PostgreSQL is more efficient, though that's of course not a very reliable statement without the data to back it up.
Thanks, great answers. I haven't played with PostgreSQL for a long time, but I remember I used to like it better than MySQL. (Then I ended up working with a lot of systems that were built with MySQL.) So I'd really like to give PG another go for my next project.