2023 and Ruby is still ridiculously slow for some things
I am doing some machine learning work for recommendations, and needed to preprocess some training data before feeding it into the recommender. In Ruby, processing half a million items was taking up to 20 seconds despite a ton of optimizations. I ported the same processing to a Postgres function (it's not just queries: I need to build a temporary dataset and then derive another one from it with some processing that requires looping; I could in theory do it with CTEs, but it would be ridiculously complex in comparison). This way it happens in the database directly, and it now takes just 50 milliseconds for the exact same processing. That is 400 times faster (20,000 ms vs. 50 ms) than the exact same thing done in Ruby. I still cannot believe the difference. I was planning to port the Ruby code to Crystal because of the speed difference, but I am going to leave it in Postgres. It's 2023 and for some things Ruby is still ridiculously slow compared to the alternatives 😦
Comments
What is Ruby used for? Maybe it's not the right tool for your use case!
Isn't everyone and their dog moving to polars for this?
Polars or Polaris? Not familiar with it and Google doesn’t help
Rust implementation of dataframes that seems to be taking over from Pandas slowly but surely
https://github.com/pola-rs/polars
Gotcha, thanks
BTW I am super excited because Google is organizing a workshop on machine learning exclusively for our company in their Helsinki office. I can't wait!