regression
from ai4water.datasets import busan_beach
from skopt.plots import plot_objective
from autotab import OptimizePipeline
data = busan_beach()
pl = OptimizePipeline(
    inputs_to_transform=data.columns.tolist()[0:-1],
    outputs_to_transform=data.columns.tolist()[-1:],
    parent_iterations=30,
    child_iterations=0,  # skip hyperparameter optimization; this is only a demonstration
    parent_algorithm='bayes',
    child_algorithm='random',
    eval_metric='mse',
    monitor=['r2', 'r2_score'],
    models=[
        "LinearRegression",
        "LassoLars",
        "Lasso",
        "RandomForestRegressor",
        "HistGradientBoostingRegressor",
        "CatBoostRegressor",
        "XGBRegressor",
        "LGBMRegressor",
        "GradientBoostingRegressor",
        "ExtraTreeRegressor",
        "ExtraTreesRegressor",
    ],
    input_features=data.columns.tolist()[0:-1],
    output_features=data.columns.tolist()[-1:],
    split_random=True,
)
results = pl.fit(data=data, process_results=False)
Out:
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/experimental/enable_hist_gradient_boosting.py:17: UserWarning: Since version 1.0, it is not needed to import enable_hist_gradient_boosting anymore. HistGradientBoostingClassifier and HistGradientBoostingRegressor are now stable and can be normally imported from sklearn.ensemble.
"Since version 1.0, "
Iter mse r2 r2_score mse
WARNING:tensorflow:From /home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/ai4water/utils/utils.py:1685: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
0 2.73e+15 0.2340123 0.02512779
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
1
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
2
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
3 0.4546905
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
4
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/linear_model/_base.py:138: FutureWarning: The default of 'normalize' will be set to False in version 1.2 and deprecated in version 1.4.
If you wish to scale the data, use Pipeline with a StandardScaler in a preprocessing stage. To reproduce the previous behavior:
from sklearn.pipeline import make_pipeline
model = make_pipeline(StandardScaler(with_mean=False), LassoLars())
If you wish to pass a sample_weight parameter, you need to pass it as a fit parameter to each step of the pipeline as follows:
kwargs = {s[0] + '__sample_weight': sample_weight for s in model.steps}
model.fit(X, y, **kwargs)
Set parameter alpha to: original_alpha * np.sqrt(n_samples).
FutureWarning,
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
5
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
6
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/linear_model/_coordinate_descent.py:648: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 8.223e+14, tolerance: 2.710e+11
coef_, l1_reg, l2_reg, X, y, max_iter, tol, rng, random, positive
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
7
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:244: RuntimeWarning: overflow encountered in reduce
ret = umr_sum(x, axis, dtype, out, keepdims=keepdims, where=where)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
8
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
9 2.71e+15 0.03148921
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
10 2.69e+15 0.03712889
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
11
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/linear_model/_base.py:138: FutureWarning: The default of 'normalize' will be set to False in version 1.2 and deprecated in version 1.4.
If you wish to scale the data, use Pipeline with a StandardScaler in a preprocessing stage. To reproduce the previous behavior:
from sklearn.pipeline import make_pipeline
model = make_pipeline(StandardScaler(with_mean=False), LassoLars())
If you wish to pass a sample_weight parameter, you need to pass it as a fit parameter to each step of the pipeline as follows:
kwargs = {s[0] + '__sample_weight': sample_weight for s in model.steps}
model.fit(X, y, **kwargs)
Set parameter alpha to: original_alpha * np.sqrt(n_samples).
FutureWarning,
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
12
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:244: RuntimeWarning: overflow encountered in reduce
ret = umr_sum(x, axis, dtype, out, keepdims=keepdims, where=where)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
13
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
14 2.53e+15 0.09695015
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:244: RuntimeWarning: overflow encountered in reduce
ret = umr_sum(x, axis, dtype, out, keepdims=keepdims, where=where)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
15
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
16
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
17 0.5653205
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
18
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/preprocessing/_data.py:3218: RuntimeWarning: overflow encountered in power
out[pos] = (np.power(x[pos] + 1, lmbda) - 1) / lmbda
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:244: RuntimeWarning: overflow encountered in reduce
ret = umr_sum(x, axis, dtype, out, keepdims=keepdims, where=where)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
19
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/base.py:444: UserWarning: X has feature names, but PowerTransformer was fitted without feature names
f"X has feature names, but {self.__class__.__name__} was fitted without"
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/base.py:444: UserWarning: X has feature names, but PowerTransformer was fitted without feature names
f"X has feature names, but {self.__class__.__name__} was fitted without"
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
20
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/base.py:444: UserWarning: X has feature names, but QuantileTransformer was fitted without feature names
f"X has feature names, but {self.__class__.__name__} was fitted without"
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/base.py:444: UserWarning: X has feature names, but QuantileTransformer was fitted without feature names
f"X has feature names, but {self.__class__.__name__} was fitted without"
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
21 5.4e+13 0.6369561 0.4878222 5.40484e+13
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
x = um.multiply(x, x, out=x)
22
23
24 0.7440143
25
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/preprocessing/_data.py:3218: RuntimeWarning: overflow encountered in power
out[pos] = (np.power(x[pos] + 1, lmbda) - 1) / lmbda
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:244: RuntimeWarning: overflow encountered in reduce
ret = umr_sum(x, axis, dtype, out, keepdims=keepdims, where=where)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:244: RuntimeWarning: overflow encountered in reduce
ret = umr_sum(x, axis, dtype, out, keepdims=keepdims, where=where)
26 0.8295779
27
28
29
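The table above reports ``mse`` together with the monitored ``r2`` and ``r2_score``. The sketch below, in plain Python, shows how such metrics are commonly defined. It assumes that ``r2`` is the squared Pearson correlation and that ``r2_score`` is the coefficient of determination; check this against the metrics library in use. The function bodies and the toy data are illustrative and are not taken from autotab.

```python
from math import fsum


def mse(true, pred):
    """Mean squared error: average of the squared residuals."""
    return fsum((t - p) ** 2 for t, p in zip(true, pred)) / len(true)


def r2_score(true, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = fsum(true) / len(true)
    ss_res = fsum((t - p) ** 2 for t, p in zip(true, pred))
    ss_tot = fsum((t - mean_t) ** 2 for t in true)
    return 1.0 - ss_res / ss_tot


def r2(true, pred):
    """Squared Pearson correlation between observations and predictions."""
    mean_t = fsum(true) / len(true)
    mean_p = fsum(pred) / len(pred)
    cov = fsum((t - mean_t) * (p - mean_p) for t, p in zip(true, pred))
    var_t = fsum((t - mean_t) ** 2 for t in true)
    var_p = fsum((p - mean_p) ** 2 for p in pred)
    return cov ** 2 / (var_t * var_p)


# toy data: predictions are perfectly correlated with the truth but biased
true = [1.0, 2.0, 3.0, 4.0]
pred = [2.0, 4.0, 6.0, 8.0]
print(mse(true, pred), r2(true, pred), r2_score(true, pred))
```

Under these definitions a biased model can have a high ``r2`` and a low, even negative, ``r2_score``, which would explain why the two monitored columns can differ for the same iteration.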
pl.optimizer_._plot_convergence(save=False)

pl.optimizer_._plot_parallel_coords(figsize=(16, 8), save=False)

pl.optimizer_._plot_distributions(save=False)

Out:
<Figure size 2100x2100 with 16 Axes>
pl.optimizer_.plot_importance(save=False)

_ = plot_objective(results)

pl.optimizer_._plot_evaluations(save=False)

pl.optimizer_._plot_edf(save=False)
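Assuming ``edf`` here stands for the usual empirical distribution function of the objective values collected over the iterations, the quantity being plotted can be sketched in plain Python as follows. The helper name and the sample values are hypothetical and are not part of autotab.

```python
def empirical_distribution(values):
    """Return the sorted values and the cumulative fraction at each sorted position."""
    ordered = sorted(values)
    n = len(ordered)
    fractions = [(i + 1) / n for i in range(n)]
    return ordered, fractions


# hypothetical objective values, not taken from the run above
xs, fs = empirical_distribution([0.4, 0.1, 0.3, 0.2])
print(list(zip(xs, fs)))
```

A curve that rises steeply at low objective values means that most of the tried pipelines performed well on the minimized metric.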

pl.bfe_all_best_models(data=data)

Out:
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/linear_model/_base.py:138: FutureWarning: The default of 'normalize' will be set to False in version 1.2 and deprecated in version 1.4.
If you wish to scale the data, use Pipeline with a StandardScaler in a preprocessing stage. To reproduce the previous behavior:
from sklearn.pipeline import make_pipeline
model = make_pipeline(StandardScaler(with_mean=False), LassoLars())
If you wish to pass a sample_weight parameter, you need to pass it as a fit parameter to each step of the pipeline as follows:
kwargs = {s[0] + '__sample_weight': sample_weight for s in model.steps}
model.fit(X, y, **kwargs)
Set parameter alpha to: original_alpha * np.sqrt(n_samples).
FutureWarning,
pl.dumbbell_plot(data=data, save=False)

Out:
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/linear_model/_base.py:138: FutureWarning: The default of 'normalize' will be set to False in version 1.2 and deprecated in version 1.4.
If you wish to scale the data, use Pipeline with a StandardScaler in a preprocessing stage. To reproduce the previous behavior:
from sklearn.pipeline import make_pipeline
model = make_pipeline(StandardScaler(with_mean=False), LassoLars())
If you wish to pass a sample_weight parameter, you need to pass it as a fit parameter to each step of the pipeline as follows:
kwargs = {s[0] + '__sample_weight': sample_weight for s in model.steps}
model.fit(X, y, **kwargs)
Set parameter alpha to: original_alpha * np.sqrt(n_samples).
FutureWarning,
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/linear_model/_coordinate_descent.py:648: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 4.075e+16, tolerance: 9.458e+12
coef_, l1_reg, l2_reg, X, y, max_iter, tol, rng, random, positive
<AxesSubplot:xlabel='mse', ylabel='Models'>
pl.dumbbell_plot(data=data, metric_name='r2', save=False)

Out:
<AxesSubplot:xlabel='r2', ylabel='Models'>
pl.taylor_plot(data=data, save=False)

Out:
<Figure size 640x480 with 1 Axes>
pl.compare_models()

Out:
<PolarAxesSubplot:>
pl.compare_models(plot_type="bar_chart")

Out:
<AxesSubplot:xlabel='mse'>
pl.compare_models("r2", plot_type="bar_chart")

Out:
<AxesSubplot:xlabel='r2'>
print(f"all results are saved in {pl.path} folder")
Out:
all results are saved in /home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/examples/results/pipeline_opt_20220504_162449 folder
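To see what was written to that folder, it can be walked with the standard library. The sketch below is a minimal example: ``list_result_files`` is a hypothetical helper, not an autotab function, and the temporary directory only stands in for ``pl.path`` so that the snippet runs on its own.

```python
import tempfile
from pathlib import Path


def list_result_files(results_dir):
    """Return all files below results_dir as sorted, POSIX-style relative paths."""
    root = Path(results_dir)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


# stand-in folder for demonstration; with a fitted pipeline one would pass pl.path
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "sub").mkdir()
    (Path(tmp) / "sub" / "a.txt").write_text("x")
    (Path(tmp) / "b.txt").write_text("y")
    print(list_result_files(tmp))
```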
pl.cleanup()
Total running time of the script: (3 minutes 7.167 seconds)