regression

from ai4water.datasets import busan_beach
from skopt.plots import plot_objective
from autotab import OptimizePipeline

data = busan_beach()

pl = OptimizePipeline(
    inputs_to_transform=data.columns.tolist()[0:-1],
    outputs_to_transform=data.columns.tolist()[-1:],
    parent_iterations=30,
    child_iterations=0,  # skip hyperparameter optimization; this example is for demonstration only
    parent_algorithm='bayes',
    child_algorithm='random',
    eval_metric='mse',
    monitor=['r2', 'r2_score'],
    models=[
        "LinearRegression",
        "LassoLars",
        "Lasso",
        "RandomForestRegressor",
        "HistGradientBoostingRegressor",
        "CatBoostRegressor",
        "XGBRegressor",
        "LGBMRegressor",
        "GradientBoostingRegressor",
        "ExtraTreeRegressor",
        "ExtraTreesRegressor",
    ],

    input_features=data.columns.tolist()[0:-1],
    output_features=data.columns.tolist()[-1:],
    split_random=True,
)

results = pl.fit(data=data, process_results=False)

Out:

/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/experimental/enable_hist_gradient_boosting.py:17: UserWarning: Since version 1.0, it is not needed to import enable_hist_gradient_boosting anymore. HistGradientBoostingClassifier and HistGradientBoostingRegressor are now stable and can be normally imported from sklearn.ensemble.
  "Since version 1.0, "
Iter  mse                r2              r2_score        mse
WARNING:tensorflow:From /home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/ai4water/utils/utils.py:1685: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
0     2.73e+15           0.2340123       0.02512779
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
  x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
1
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
2
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
3                        0.4546905
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
4
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
  x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/linear_model/_base.py:138: FutureWarning: The default of 'normalize' will be set to False in version 1.2 and deprecated in version 1.4.
If you wish to scale the data, use Pipeline with a StandardScaler in a preprocessing stage. To reproduce the previous behavior:

from sklearn.pipeline import make_pipeline

model = make_pipeline(StandardScaler(with_mean=False), LassoLars())

If you wish to pass a sample_weight parameter, you need to pass it as a fit parameter to each step of the pipeline as follows:

kwargs = {s[0] + '__sample_weight': sample_weight for s in model.steps}
model.fit(X, y, **kwargs)

Set parameter alpha to: original_alpha * np.sqrt(n_samples).
  FutureWarning,
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
5
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
6
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/linear_model/_coordinate_descent.py:648: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 8.223e+14, tolerance: 2.710e+11
  coef_, l1_reg, l2_reg, X, y, max_iter, tol, rng, random, positive
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
7
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
  x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:244: RuntimeWarning: overflow encountered in reduce
  ret = umr_sum(x, axis, dtype, out, keepdims=keepdims, where=where)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
  x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
8
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
9     2.71e+15                           0.03148921
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
10    2.69e+15                           0.03712889
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
11
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/linear_model/_base.py:138: FutureWarning: The default of 'normalize' will be set to False in version 1.2 and deprecated in version 1.4.
If you wish to scale the data, use Pipeline with a StandardScaler in a preprocessing stage. To reproduce the previous behavior:

from sklearn.pipeline import make_pipeline

model = make_pipeline(StandardScaler(with_mean=False), LassoLars())

If you wish to pass a sample_weight parameter, you need to pass it as a fit parameter to each step of the pipeline as follows:

kwargs = {s[0] + '__sample_weight': sample_weight for s in model.steps}
model.fit(X, y, **kwargs)

Set parameter alpha to: original_alpha * np.sqrt(n_samples).
  FutureWarning,
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
12
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
  x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:244: RuntimeWarning: overflow encountered in reduce
  ret = umr_sum(x, axis, dtype, out, keepdims=keepdims, where=where)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
  x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
13
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
14    2.53e+15                           0.09695015
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
  x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:244: RuntimeWarning: overflow encountered in reduce
  ret = umr_sum(x, axis, dtype, out, keepdims=keepdims, where=where)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
15
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
16
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
17                       0.5653205
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
18
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/preprocessing/_data.py:3218: RuntimeWarning: overflow encountered in power
  out[pos] = (np.power(x[pos] + 1, lmbda) - 1) / lmbda
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
  x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:244: RuntimeWarning: overflow encountered in reduce
  ret = umr_sum(x, axis, dtype, out, keepdims=keepdims, where=where)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
19
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/base.py:444: UserWarning: X has feature names, but PowerTransformer was fitted without feature names
  f"X has feature names, but {self.__class__.__name__} was fitted without"
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/base.py:444: UserWarning: X has feature names, but PowerTransformer was fitted without feature names
  f"X has feature names, but {self.__class__.__name__} was fitted without"
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
20
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/base.py:444: UserWarning: X has feature names, but QuantileTransformer was fitted without feature names
  f"X has feature names, but {self.__class__.__name__} was fitted without"
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/base.py:444: UserWarning: X has feature names, but QuantileTransformer was fitted without feature names
  f"X has feature names, but {self.__class__.__name__} was fitted without"
/home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/autotab/_main.py:2472: RuntimeWarning: All-NaN axis encountered
  best_so_far = func(self.metrics_best_.loc[:self.parent_iter_, _metric])
21    5.4e+13            0.6369561       0.4878222       5.40484e+13
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
  x = um.multiply(x, x, out=x)
22
23
24                       0.7440143
25
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/preprocessing/_data.py:3218: RuntimeWarning: overflow encountered in power
  out[pos] = (np.power(x[pos] + 1, lmbda) - 1) / lmbda
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
  x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:244: RuntimeWarning: overflow encountered in reduce
  ret = umr_sum(x, axis, dtype, out, keepdims=keepdims, where=where)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
  x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:244: RuntimeWarning: overflow encountered in reduce
  ret = umr_sum(x, axis, dtype, out, keepdims=keepdims, where=where)
26                       0.8295779
27
28
29
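
Most of the output above is warning text from NumPy and scikit-learn interleaved with the iteration table. The following is a minimal sketch, using only Python's standard `warnings` module, of how such messages could be silenced around a noisy call so that the table is easier to read. `noisy_step` is a hypothetical stand-in for the fit call and is not part of autotab. The TensorFlow deprecation line appears to be emitted through `logging` rather than `warnings`, so this filter would probably not affect it.

```python
import warnings


def noisy_step(x):
    """Hypothetical stand-in for a call that emits a RuntimeWarning."""
    warnings.warn("All-NaN axis encountered", RuntimeWarning)
    return x * 2


# Ignore only the chosen categories, and only inside this block, so that
# warnings raised elsewhere in the program are still shown.
# record=True is used here purely to demonstrate that nothing gets through;
# it can be dropped in real use.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("ignore", category=RuntimeWarning)
    warnings.simplefilter("ignore", category=FutureWarning)
    result = noisy_step(21)

print(result)       # value returned by the wrapped call
print(len(caught))  # number of warnings that were not filtered out
```

Silencing is cosmetic. The overflow and convergence warnings in the log point at real numerical issues, so it may be worth reading them once before filtering them out.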

pl.optimizer_._plot_convergence(save=False)

pl.optimizer_._plot_parallel_coords(figsize=(16, 8), save=False)

pl.optimizer_._plot_distributions(save=False)

Out:

<Figure size 2100x2100 with 16 Axes>

pl.optimizer_.plot_importance(save=False)

_ = plot_objective(results)

pl.optimizer_._plot_evaluations(save=False)

pl.optimizer_._plot_edf(save=False)

pl.bfe_all_best_models(data=data)

Out:

/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/numpy/core/_methods.py:233: RuntimeWarning: overflow encountered in multiply
  x = um.multiply(x, x, out=x)
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/linear_model/_base.py:138: FutureWarning: The default of 'normalize' will be set to False in version 1.2 and deprecated in version 1.4.
If you wish to scale the data, use Pipeline with a StandardScaler in a preprocessing stage. To reproduce the previous behavior:

from sklearn.pipeline import make_pipeline

model = make_pipeline(StandardScaler(with_mean=False), LassoLars())

If you wish to pass a sample_weight parameter, you need to pass it as a fit parameter to each step of the pipeline as follows:

kwargs = {s[0] + '__sample_weight': sample_weight for s in model.steps}
model.fit(X, y, **kwargs)

Set parameter alpha to: original_alpha * np.sqrt(n_samples).
  FutureWarning,

pl.dumbbell_plot(data=data, save=False)

Out:

/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/linear_model/_base.py:138: FutureWarning: The default of 'normalize' will be set to False in version 1.2 and deprecated in version 1.4.
If you wish to scale the data, use Pipeline with a StandardScaler in a preprocessing stage. To reproduce the previous behavior:

from sklearn.pipeline import make_pipeline

model = make_pipeline(StandardScaler(with_mean=False), LassoLars())

If you wish to pass a sample_weight parameter, you need to pass it as a fit parameter to each step of the pipeline as follows:

kwargs = {s[0] + '__sample_weight': sample_weight for s in model.steps}
model.fit(X, y, **kwargs)

Set parameter alpha to: original_alpha * np.sqrt(n_samples).
  FutureWarning,
/home/docs/checkouts/readthedocs.org/user_builds/autotab/envs/latest/lib/python3.7/site-packages/sklearn/linear_model/_coordinate_descent.py:648: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 4.075e+16, tolerance: 9.458e+12
  coef_, l1_reg, l2_reg, X, y, max_iter, tol, rng, random, positive

<AxesSubplot:xlabel='mse', ylabel='Models'>

pl.dumbbell_plot(data=data, metric_name='r2', save=False)

Out:

<AxesSubplot:xlabel='r2', ylabel='Models'>
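
The two dumbbell plots compare models on `mse` and `r2`, and the iteration table earlier reports `r2` and `r2_score` as separate columns with different values. The following is a minimal sketch in plain Python of the conventional definitions, under the assumption (not verified against the library) that `r2` denotes the squared Pearson correlation and `r2_score` the coefficient of determination. It is not autotab's or ai4water's implementation, and the numbers are made up.

```python
from statistics import mean


def mse(true, pred):
    """Mean squared error."""
    return mean((t - p) ** 2 for t, p in zip(true, pred))


def r2_score(true, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    t_bar = mean(true)
    ss_res = sum((t - p) ** 2 for t, p in zip(true, pred))
    ss_tot = sum((t - t_bar) ** 2 for t in true)
    return 1.0 - ss_res / ss_tot


def pearson_r2(true, pred):
    """Squared Pearson correlation (assumed meaning of 'r2' here)."""
    t_bar, p_bar = mean(true), mean(pred)
    cov = sum((t - t_bar) * (p - p_bar) for t, p in zip(true, pred))
    var_t = sum((t - t_bar) ** 2 for t in true)
    var_p = sum((p - p_bar) ** 2 for p in pred)
    return cov ** 2 / (var_t * var_p)


# Hypothetical data: predictions are perfectly correlated with the
# observations but biased (pred = 2 * true + 1).
true = [1.0, 2.0, 3.0, 4.0]
pred = [3.0, 5.0, 7.0, 9.0]

print(mse(true, pred))
print(r2_score(true, pred))
print(pearson_r2(true, pred))
```

On this toy data the squared correlation is 1 while the coefficient of determination is negative, which illustrates how the two columns can disagree for the same model.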

pl.taylor_plot(data=data, save=False)

Out:

<Figure size 640x480 with 1 Axes>

pl.compare_models()

Out:

<PolarAxesSubplot:>

pl.compare_models(plot_type="bar_chart")

Out:

<AxesSubplot:xlabel='mse'>

pl.compare_models("r2", plot_type="bar_chart")

Out:

<AxesSubplot:xlabel='r2'>

print(f"all results are saved in {pl.path} folder")

Out:

all results are saved in /home/docs/checkouts/readthedocs.org/user_builds/autotab/checkouts/latest/examples/results/pipeline_opt_20220504_162449 folder

pl.cleanup()

Total running time of the script: ( 3 minutes 7.167 seconds)
