yammdb - just another .mmdb
Hey,
Today I want to present yammdb.
It is another, different geo database, based on real measurements.
You can find it here: https://github.com/Ne00n/yammdb
The database is built weekly, on Fridays,
as long as the build server doesn't blow up.
The primary use case for me is comparing existing geo data; maybe some of you will find it useful.
If you have any ideas or feedback, please lemme know.
Enjoy.
Comments
Latency is now included from the closest location.
Free NAT KVM | Free NAT LXC
Umm I'm stupid and new to all this. What is this???
Put in an IP, and it'll tell you the physical location of the server/host behind that IP.
Somik.org - Server admins cheat codes
Oh cool!
I'm trying this out with my gDNSd servers. The db seems much smaller than the ip-to-city-lite db I have been using, 16MB compared to 99MB. If this works as well it is going to save me a bunch of RAM.
Thank you for this !!
LES • About • Donate • Rules • Support
It has way less "useless" data in it, hence it's so smol.
However, it should work with auto_dc maps using geo coords, but I have no idea how accurate it is.
Thank you! I hope there is an acl version for bind9
I wrote a smol, 30-line benchmark script, using a dump of the global routing table.
65% hit rate: 638k out of 975k entries in the routing table, which is good, I expected less.
The GitHub page said 600k.
Meanwhile, the other .mmdb's have a 99% hit rate.
TLDR: Yes, use it, but not as your primary database. Build your own, as that was my primary purpose.
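The benchmark script itself is not shown in the thread. A minimal sketch of how such a hit-rate check could look, assuming IPv4 only, non-overlapping database networks, and one lookup per routing-table prefix (its first address); the names `build_index`, `is_hit` and `hit_rate` are made up for illustration:

```python
import ipaddress
from bisect import bisect_right


def build_index(covered):
    """Sort the (assumed non-overlapping) database networks by start address."""
    nets = sorted(
        (ipaddress.ip_network(n) for n in covered),
        key=lambda n: int(n.network_address),
    )
    starts = [int(n.network_address) for n in nets]
    return starts, nets


def is_hit(index, address):
    """Return True if the address falls inside any network of the index."""
    starts, nets = index
    addr = ipaddress.ip_address(address)
    # The only candidate is the last network starting at or before the address.
    pos = bisect_right(starts, int(addr)) - 1
    return pos >= 0 and addr in nets[pos]


def hit_rate(index, prefixes):
    """Look up the first address of every prefix and count hits and misses."""
    success = fail = 0
    for prefix in prefixes:
        first = ipaddress.ip_network(prefix).network_address
        if is_hit(index, first):
            success += 1
        else:
            fail += 1
    total = success + fail
    percentage = 100 * success / total if total else 0.0
    return {"fail": fail, "success": success, "percentage": percentage}


if __name__ == "__main__":
    # Hypothetical data: two database networks, three routing-table prefixes.
    index = build_index(["192.0.2.0/24", "198.51.100.0/25"])
    print(hit_rate(index, ["192.0.2.0/25", "198.51.100.128/25", "203.0.113.0/24"]))
```

A "hit" here only means that some record exists for the address; it says nothing about whether the location in that record is correct.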
Fair enough.
If you build your own, you can prob drop the memory usage even further.
Throw everything out and turn it into a flying gas can.
Speaking of hit rate, the benchmark I ran yesterday gave me a 65% hit rate against a routing table dump.
Which is more than I expected, roughly 648k from 975k. However, the second benchmark I ran hit only 35% with 8.5 million IP addresses.
This was because bigger subnets are split into smaller ones for more accurate data, but the ones which didn't respond were not filled. That has been fixed.
After fixing these bugs, the hit rate is at 74%.
The next step is to see where I can further improve the data I use for all of this, so I end up with higher hit rates.
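How the unanswered subnets get filled is not stated in the thread. A minimal sketch of the idea, assuming a bigger subnet is split into /24s and any /24 without its own measurement inherits the result of the enclosing prefix; `split_and_fill` and the sample data are hypothetical:

```python
import ipaddress


def split_and_fill(parent, measured, new_prefix=24):
    """Split `parent` into smaller subnets and fill the ones without a result.

    `measured` maps prefix strings to a location. Subnets that did not
    respond fall back to the parent's location (an assumption, not
    necessarily what yammdb does).
    """
    parent_net = ipaddress.ip_network(parent)
    fallback = measured.get(parent)
    filled = {}
    for subnet in parent_net.subnets(new_prefix=new_prefix):
        key = str(subnet)
        value = measured.get(key, fallback)
        if value is not None:
            filled[key] = value
    return filled


if __name__ == "__main__":
    measured = {"10.0.0.0/22": "Frankfurt", "10.0.1.0/24": "Amsterdam"}
    for prefix, location in split_and_fill("10.0.0.0/22", measured).items():
        print(prefix, location)
```

Without the fallback, three of the four /24s in this example would be missing from the result, which is the kind of gap that drags the hit rate down on a per-address benchmark.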
Thanks to https://virtury.com/ we got a new probe in Pakistan!
It took a bit longer than expected, however the software is now mostly optimized for more probes.
Expect more probes in the coming weeks.
Daily test builds (not guaranteed) will be available at https://yammdb.serv.app/test.mmdb
The weekly build will happen as usual.
Also, thanks to https://ginernet.com for a new probe in Madrid.
It was supposed to be done already, however I did a fuck up.
One function had the build hanging for hours on end.
This is fixed, thanks to GPT-4 once again.
It should finish in a bit. Once this is done, I will make a second test build, including a bunch of new locations.
I noticed it's now smaller than the previous version I tested. Did you reduce the coverage or change the format?
However, cool and useful idea to crosscheck other geolocators.
No, but I noticed it too.
The only way I can explain it is how the writer builds the database.
Basically I tried to aggregate the prefixes to make the database even smaller, however it seems like the writer already does this.
So my aggregation did not change the size after all. A while ago, the database had a lot of gaps, because of the way it pings bigger subnets.
These gaps have been closed, so I assume the writer can now optimize / aggregate the database even further, hence it's smaller. The code definitely does not remove data, and has not removed any.
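To illustrate why closing gaps can make the file smaller, here is a minimal sketch of prefix aggregation using Python's `ipaddress.collapse_addresses`. This is not the .mmdb writer's actual algorithm; `aggregate` and the sample records are made up:

```python
import ipaddress
from collections import defaultdict


def aggregate(records):
    """Merge adjacent prefixes that carry identical data into larger prefixes.

    `records` maps prefix strings to a hashable value (here a country code).
    """
    groups = defaultdict(list)
    for prefix, data in records.items():
        groups[data].append(ipaddress.ip_network(prefix))
    merged = {}
    for data, nets in groups.items():
        # collapse_addresses joins neighbouring networks where possible.
        for net in ipaddress.collapse_addresses(nets):
            merged[str(net)] = data
    return merged


if __name__ == "__main__":
    with_gap = {"10.0.0.0/24": "DE", "10.0.1.0/24": "DE", "10.0.3.0/24": "DE"}
    print(aggregate(with_gap))  # the missing 10.0.2.0/24 blocks a full merge

    gap_closed = dict(with_gap, **{"10.0.2.0/24": "DE"})
    print(aggregate(gap_closed))  # all four /24s collapse into a single /22
```

With the gap, the three records shrink to two; once the gap is filled with the same data, four records become one.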
I added a few more locations for this test run.
Gonna be the biggest Friday run yet.
Some people who follow my GitHub repo gave me the idea to make an mtr-only geo database.
I coded it in less than 24 hours, however the hit rates were too low and my brain had not yet managed to figure out where the fuck up was.
However, today I found the mapping error.
db/mtr.mmdb {'fail': 126502, 'success': 849072, 'percentage': 87.03306976200678}
From 64% to 74%, and now an 87% hit rate, not bad.
As usual, I put the .mmdb on https://yammdb.serv.app/mtr.mmdb
This database is only 4.2MB in size and only contains geo coordinates right now.
I will add the usual info, such as country, continent etc., in a later build.
Plus, I will add a combined build later, with geo.mmdb and mtr.mmdb, which first uses latency, then mtr for better accuracy.
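The combined build is only described in one sentence. A minimal sketch of one way to read it, assuming both result sets are keyed by prefix and that a latency result always wins over an mtr result for the same prefix; `combine` and the sample data are hypothetical:

```python
def combine(latency_results, mtr_results):
    """Prefer latency-based results; fall back to mtr where latency has none."""
    combined = {}
    for prefix, location in mtr_results.items():
        combined[prefix] = {"location": location, "source": "mtr"}
    # Latency results are written last so they override mtr for shared prefixes.
    for prefix, location in latency_results.items():
        combined[prefix] = {"location": location, "source": "latency"}
    return combined


if __name__ == "__main__":
    latency = {"192.0.2.0/24": "Madrid"}
    mtr = {"192.0.2.0/24": "Lisbon", "198.51.100.0/24": "Karachi"}
    for prefix, record in sorted(combine(latency, mtr).items()):
        print(prefix, record)
```

Keeping a `source` field is an addition for the example, so that the two measurement types stay distinguishable after merging.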
Any plans to release a CSV version of the mmdb?
I updated the mtr.mmdb; it now includes continent, country and latency, same as the geo.mmdb.
@somik Sure, I added the csv file: https://yammdb.serv.app/mtr.csv
Currently they are smaller than the geo.mmdb due to fewer measurements per subnet; this will change once I run them again.
I also added geo.mmdb as csv: https://yammdb.serv.app/geo.csv
There is no compression or anything, hence the file is so big.
Usually the .mmdb writer does the compression.
Best to have it without compression for maximum compatibility. I'm visiting our neighbouring country for some good foods now, so I'll test it out once I go back to Singapore.
On that note, seems like a lot of shops closed down over the last pandemic... Sad days.
Well, I guess an mtr-only database with more tests per subnet won't be happening.
It takes too long, roughly 1-2 days to finish a build with roughly 8+ million targets,
even with 20 probes running at the same time.
Instead I am going to run another test build next week, which runs mtr on subnets that don't respond to ping and combines them with the latency results, as mentioned before.
@Neoon I think your CSV headers (table titles) are missing for both geo.csv and mtr.csv
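Until the files ship with a header row, a headerless CSV can still be read by supplying the column names yourself. A minimal sketch; the column names and their order in `FIELDNAMES` are pure assumptions based on the fields mentioned in this thread, not the actual layout of geo.csv or mtr.csv:

```python
import csv
import io

# Hypothetical column layout: check it against the real file before use.
FIELDNAMES = ["network", "latitude", "longitude", "continent", "country", "latency"]


def read_headerless(stream, fieldnames=FIELDNAMES):
    """Read CSV rows that have no header line into dictionaries."""
    return list(csv.DictReader(stream, fieldnames=fieldnames))


if __name__ == "__main__":
    # Made-up sample row standing in for a line of the downloaded file.
    sample = io.StringIO("192.0.2.0/24,40.42,-3.70,EU,ES,1.3\n")
    for row in read_headerless(sample):
        print(row["network"], row["country"], row["latency"])
```

For a real file, pass `open("mtr.csv", newline="")` instead of the `StringIO` sample.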
I will change that before the next build tomorrow.
It seems that a recent masscan is mandatory; I was still using one that was 2 months old.
The build just finished 3 hours earlier and with a +7% higher hit rate, so 80% without mtr.
Lesson learned: masscan will be updated at least once per week, gg.
As soon as I get the mtr integration working, it should easily get a 90%+ hit rate.
I thought port scanning was frowned upon by most data-centers/hosts?
Btw, what's MRT?
90% hit rate as in for IPs or returning correct geo-loc/country?
I never said I was port scanning.
Hit rate means you get a result.
Accuracy depends on the number of locations.
Isn't masscan used for port scanning? Are you using it to scan for something else?