```python
import numpy as np


def xgb_quantile_eval(preds, dmatrix, quantile=0.2):
    """
    Customized evaluation metric that equals quantile
    regression loss (also known as pinball loss).

    Quantile regression is regression that estimates
    a specified quantile of the target's distribution
    conditional on given features.

    @type preds: numpy.ndarray
    @type dmatrix: xgboost.DMatrix
    @type quantile: float
    @rtype: tuple(str, float)
    """
    labels = dmatrix.get_label()
    return ('q{}_loss'.format(quantile),
            np.nanmean((preds >= labels) * (1 - quantile) * (preds - labels) +
                       (preds < labels) * quantile * (labels - preds)))


def xgb_quantile_obj(preds, dmatrix, quantile=0.2):
    """
    Computes the first-order derivative of quantile
    regression loss and a non-degenerate substitute
    for the second-order derivative.

    The substitute is returned instead of zeros, because
    XGBoost requires non-zero second-order derivatives.
    See this page:
    https://github.com/dmlc/xgboost/issues/1825
    to see why this trick is possible. However, be sure
    that the hyperparameter named `max_delta_step` is
    small enough to satisfy:
    `0.5 * max_delta_step <= min(quantile, 1 - quantile)`.

    @type preds: numpy.ndarray
    @type dmatrix: xgboost.DMatrix
    @type quantile: float
    @rtype: tuple(numpy.ndarray, numpy.ndarray)
    """
    if not 0 <= quantile <= 1:
        raise ValueError("Quantile value must be a float between 0 and 1.")
    labels = dmatrix.get_label()
    errors = preds - labels
    left_mask = errors < 0
    right_mask = errors > 0
    # Gradient of the pinball loss; the Boolean masks are cast to 0/1.
    grad = -quantile * left_mask + (1 - quantile) * right_mask
    # Constant non-zero substitute for the (almost everywhere zero) Hessian.
    hess = np.ones_like(preds)
    return grad, hess


# Example of usage:
# bst = xgb.train(hyperparams, train, num_rounds,
#                 obj=xgb_quantile_obj, feval=xgb_quantile_eval)
```
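The trailing usage comment can be expanded into a runnable sketch, assuming the two functions above are in scope. The toy data and hyperparameter values below are illustrative, not part of the gist; note that `max_delta_step` is chosen to satisfy the constraint from the docstring:

```python
import numpy as np
import xgboost as xgb

# Toy data for illustration only.
rng = np.random.RandomState(0)
X = rng.normal(size=(500, 3))
y = X.sum(axis=1) + rng.normal(size=500)
train = xgb.DMatrix(X[:400], label=y[:400])
valid = xgb.DMatrix(X[400:], label=y[400:])

# 0.5 * max_delta_step = 0.1 <= min(0.2, 0.8), so the constraint
# holds for the default quantile of 0.2.
hyperparams = {'max_depth': 3, 'eta': 0.1, 'max_delta_step': 0.2}

bst = xgb.train(hyperparams, train, num_boost_round=50,
                obj=xgb_quantile_obj, feval=xgb_quantile_eval,
                evals=[(valid, 'valid')])
preds = bst.predict(valid)  # estimates of the conditional 0.2 quantile
```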
Thanks for the prompt response! I have checked both LightGBM and CatBoost. There is no doubt that their interval levels are very stable. However, I could not get an improved forecast. In fact, I get a much better forecast with H2O's XGBoost. Yet, H2O does not provide support for quantile regression. I tried to build prediction intervals using the functions from this link (https://towardsdatascience.com/regression-prediction-intervals-with-xgboost-428e0a018b), but the interval range gets very narrow, and when the interval is widened, the upper limits go flat while the lower limits do not move at all. I am wondering if I can get a better interval using your functions and then wrap it around the predictions of H2O's XGBoost. I hope this can be done.
There have been some questions about the license. This gist is released under the MIT License, so you can use it in your projects.
@Shafi2016, this can be done like this:
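For instance, train one model per interval bound, using `functools.partial` to set the quantile. This is a sketch only: `X_train`, `y_train`, and `X_new` stand for your own data, and the hyperparameter values are placeholders.

```python
import xgboost as xgb
from functools import partial

train = xgb.DMatrix(X_train, label=y_train)  # your training data
new = xgb.DMatrix(X_new)                     # data to predict on

# For quantiles 0.05 and 0.95, the docstring's constraint
# 0.5 * max_delta_step <= min(quantile, 1 - quantile)
# requires max_delta_step <= 0.1.
hyperparams = {'max_depth': 4, 'eta': 0.1, 'max_delta_step': 0.1}

lower_model = xgb.train(hyperparams, train, num_boost_round=100,
                        obj=partial(xgb_quantile_obj, quantile=0.05),
                        feval=partial(xgb_quantile_eval, quantile=0.05))
upper_model = xgb.train(hyperparams, train, num_boost_round=100,
                        obj=partial(xgb_quantile_obj, quantile=0.95),
                        feval=partial(xgb_quantile_eval, quantile=0.95))

# A 90% prediction interval; the point forecast placed inside it can
# come from any other model, e.g. your H2O XGBoost.
lower_bound = lower_model.predict(new)
upper_bound = upper_model.predict(new)
```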
However, this gist is quite old, and there are better solutions now. I recommend looking at CatBoost or LightGBM, because these tools have native support for quantile regression as well as performance comparable to that of XGBoost.
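To illustrate the native support mentioned above, here is a minimal LightGBM sketch (the toy data is for illustration only):

```python
import numpy as np
import lightgbm as lgb

# Toy data for illustration only.
rng = np.random.RandomState(0)
X = rng.normal(size=(500, 3))
y = X.sum(axis=1) + rng.normal(size=500)

# Native pinball loss: `objective='quantile'` with `alpha` set to the
# desired quantile; no custom objective or Hessian trick is needed.
model = lgb.LGBMRegressor(objective='quantile', alpha=0.9, n_estimators=100)
model.fit(X, y)
preds = model.predict(X)  # estimates of the conditional 0.9 quantile
```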