most production ml pipelines have the same architecture: train a model in python, pickle it, spin up a serving layer, then shuttle data back and forth between your database and the inference service
this works, but it's annoying: you add latency with every network hop, you maintain separate infrastructure, and you're solving a problem that may not need to exist
how gradient boosting works
gradient boosting builds an ensemble of decision trees sequentially: each tree tries to correct the errors of all previous trees
you start with a simple prediction, then add a tree that learns where you were wrong, then add another tree that fixes those mistakes, and repeat. the final prediction is the sum of all the trees' outputs
mathematically it's sophisticated, but conceptually it's just iterative error correction. 100 trees might sound complex, but each tree is just a series of if-then rules
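the loop above can be sketched in a few lines. this is a toy, not real lightgbm: each "tree" is a depth-1 stump fit to the current residuals, and predictions are the running sum of stump outputs

```python
# toy illustration of boosting as iterative error correction:
# each "tree" is a one-split stump fit to the current residuals

def fit_stump(x, residual):
    # brute-force the split on x that best reduces squared error
    best = None
    for threshold in x:
        left = [r for xi, r in zip(x, residual) if xi <= threshold]
        right = [r for xi, r in zip(x, residual) if xi > threshold]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, threshold, lm, rm)
    _, t, lm, rm = best
    return lambda v: lm if v <= t else rm

x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 1.5, 3.5, 4.0]

prediction = [sum(y) / len(y)] * len(y)   # start with a constant guess
trees = []
for _ in range(10):
    residual = [yi - pi for yi, pi in zip(y, prediction)]
    stump = fit_stump(x, residual)
    trees.append(stump)
    # final prediction = base + sum of every stump's correction
    prediction = [pi + stump(xi) for pi, xi in zip(prediction, x)]

print(prediction)  # converges toward y
```

each iteration shrinks the residuals, and the model you end up with is literally just a list of if-then rules added together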
the insight
tree-based models are just nested if statements. a gradient boosting model with 100 trees is really 100 decision paths whose outputs get summed together. there's nothing inherently complex about that logic
sql has case expressions, and databases are already optimized for executing queries over tabular data. so why not generate the sql equivalent of your model and run predictions where the data lives
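to make the mapping concrete, here's a hand-rolled sketch (not sqlgbm's actual code generation) that turns a toy tree into a CASE expression. the tree structure and feature names are made up for illustration

```python
# a decision tree is nested if/else, and nested if/else maps directly
# onto sql CASE WHEN; leaves become raw scores, internal nodes become
# CASE WHEN <feature> <= <threshold> THEN ... ELSE ... END

def tree_to_sql(node):
    if "value" in node:                      # leaf: emit the score
        return str(node["value"])
    return (
        f"CASE WHEN {node['feature']} <= {node['threshold']} "
        f"THEN {tree_to_sql(node['left'])} "
        f"ELSE {tree_to_sql(node['right'])} END"
    )

tree = {
    "feature": "age", "threshold": 30,
    "left": {"value": 0.8},
    "right": {
        "feature": "income", "threshold": 50000,
        "left": {"value": 0.3},
        "right": {"value": 0.1},
    },
}

print(tree_to_sql(tree))
```

an ensemble is then just these expressions summed: `tree1_expr + tree2_expr + ...`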
what sqlgbm does
sqlgbm takes a trained lightgbm or xgboost model and spits out a sql query that reproduces its predictions
you train your model normally
import lightgbm as lgb

clf = lgb.LGBMClassifier(n_estimators=100, max_depth=5)
clf.fit(X, y)
then convert it to sql
from sqlgbm import SQLGBM
sqlgbm = SQLGBM(clf)
sql = sqlgbm.generate_query('my_table', 'probability')
now you have a query you can run directly in postgres, mysql, or duckdb. no python runtime, no model server, no serialization overhead
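here's what running such a query looks like end to end. the CASE expression below is a hand-written stand-in for sqlgbm's generated output, and the table schema is made up; sqlite3 is used only because it's in the standard library, the same query would run in postgres or duckdb

```python
# score rows in place: the "model" is just a sql expression over columns
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, age REAL, income REAL)")
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)",
                 [(1, 25, 40000), (2, 45, 90000)])

# stand-in for the expression sqlgbm would generate
score_expr = """
    CASE WHEN age <= 30 THEN 0.8
         ELSE CASE WHEN income <= 50000 THEN 0.3 ELSE 0.1 END
    END
"""
rows = conn.execute(
    f"SELECT id, {score_expr} AS probability FROM my_table"
).fetchall()
print(rows)  # [(1, 0.8), (2, 0.1)]
```

the data never leaves the database, and the "model server" is just the query planner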
why this is useful
if your features already live in a database and you need to score them, this eliminates a ton of complexity
batch scoring thousands of rows becomes a single query instead of exporting data, calling an api, waiting for results, and writing back
for real time predictions you can materialize the query as a view or use it in a stored procedure. the database handles caching and query optimization automatically
latency drops because you're not serializing data or making network calls. deployment becomes simpler because there's no separate service to manage
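the view approach is a one-liner. sketch below, again with a made-up schema and a hand-written scoring expression in place of sqlgbm's output, using sqlite3 so it's self-contained

```python
# wrap the scoring query in a view so consumers just SELECT from it
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, age REAL)")
conn.execute("INSERT INTO my_table VALUES (1, 25), (2, 45)")
conn.execute("""
    CREATE VIEW predictions AS
    SELECT id, CASE WHEN age <= 30 THEN 0.8 ELSE 0.2 END AS probability
    FROM my_table
""")
print(conn.execute("SELECT * FROM predictions ORDER BY id").fetchall())
```

downstream consumers never see the model at all, just a `predictions` relation that is always in sync with the underlying table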
tradeoffs
this only makes sense for certain use cases
tree models translate cleanly to sql, but neural networks don't. very deep models generate huge queries that some databases struggle with, and the generated sql isn't optimized yet, so there's room for improvement
multiclass classification isn't supported yet and the api is rough around the edges. it's early-stage software
but for binary classification with gradient boosting on data that lives in sql, it works well. i use it for simple scoring tasks where spinning up model serving infrastructure feels like overkill
summary
the core idea is that ml inference doesn't always need special infrastructure. sometimes you can just generate sql and let your database do the work
sqlgbm converts lightgbm and xgboost models to queries that run anywhere sql does: no runtime dependencies, no separate services, no data movement
useful when you want simple deployment and your data already lives in a database
check it out at github.com/mattismegevand/sqlgbm