{ "cells": [ { "cell_type": "markdown", "id": "ec713391", "metadata": {}, "source": [ "# Comparing models visually\n", "\n", "_Martin Vonk and DavĂ­d Brakenhoff, Artesia 2022_\n", "\n", "---\n", "\n", "In this notebook we introduce the `CompareModels` class in Pastas that can be used to compare models (visually), and construct custom model comparison plots." ] }, { "cell_type": "code", "execution_count": null, "id": "8d8db2ae", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "\n", "import pastas as ps\n", "\n", "ps.set_log_level(\"ERROR\")\n", "ps.show_versions()" ] }, { "cell_type": "markdown", "id": "dd6ccda8", "metadata": {}, "source": [ "## Load Time Series\n", "\n", "First load some data to construct models that we can compare with one another." ] }, { "cell_type": "code", "execution_count": null, "id": "75bb2bd6", "metadata": {}, "outputs": [], "source": [ "rain = pd.read_csv(\"./data/rain_nb1.csv\", index_col=0, parse_dates=True).squeeze()\n", "evap = pd.read_csv(\"./data/evap_nb1.csv\", index_col=0, parse_dates=True).squeeze()\n", "obs1 = pd.read_csv(\"./data/head_nb1.csv\", index_col=0, parse_dates=True).squeeze()\n", "obs2 = pd.read_csv(\"./data/nb18_head.csv\", index_col=0, parse_dates=True).squeeze()" ] }, { "cell_type": "markdown", "id": "e912466f", "metadata": {}, "source": [ "# Create models\n", "\n", "* Model1a: observations series 1 with linear RechargeModel and Exponential response function\n", "* Model1b: observations series 1 with linear RechargeModel and Gamma response function\n", "* Model1c: observation series 1 with precipitation and evaporation as separate stresses\n", "* Model2: has observation series 2 with linear RechargeModel and Exponential response function" ] }, { "cell_type": "code", "execution_count": null, "id": "a6d43a52", "metadata": { "tags": [] }, "outputs": [], "source": [ "ml1a = ps.Model(obs1, name=\"1a_exp\")\n", "ml1a.add_noisemodel(ps.ArNoiseModel())\n", "sm1a = ps.RechargeModel(rain, evap, rfunc=ps.Exponential(), name=\"recharge\")\n", "ml1a.add_stressmodel(sm1a)\n", "ml1a.solve(report=False)\n", "\n", "ml1b = ps.Model(obs1, name=\"1b_gamma\")\n", "ml1b.add_noisemodel(ps.ArNoiseModel())\n", "sm1b = ps.RechargeModel(rain, evap, rfunc=ps.Gamma(), name=\"recharge\")\n", "ml1b.add_stressmodel(sm1b)\n", "ml1b.solve(report=False)\n", "\n", "ml1c = ps.Model(obs1, name=\"1c_separate\")\n", "ml1c.add_noisemodel(ps.ArNoiseModel())\n", "sm2_1 = ps.StressModel(rain, rfunc=ps.Gamma(), name=\"Prec\", settings=\"prec\")\n", "sm2_2 = ps.StressModel(evap, rfunc=ps.Gamma(), name=\"Evap\", settings=\"evap\", up=False)\n", "ml1c.add_stressmodel([sm2_1, sm2_2])\n", "ml1c.solve(report=False)\n", "\n", "ml2 = ps.Model(obs2, name=\"model_2\")\n", "ml2.add_noisemodel(ps.ArNoiseModel())\n", "ml2.add_stressmodel(sm1a)\n", "ml2.solve(report=False)" ] }, { "cell_type": "markdown", "id": "5c59bf63", "metadata": {}, "source": [ "## CompareModels\n", "\n", "To compare models, just pass a list of models to `ps.CompareModels`. To plot\n", "the default comparison plot use the `plot()` method. \n", "\n", "The class itself is linked to a figure and a set of axes, so for each\n", "comparison a new CompareModels class should be created." ] }, { "cell_type": "code", "execution_count": null, "id": "4fdb24b8", "metadata": {}, "outputs": [], "source": [ "mc = ps.CompareModels(models=[ml1b, ml1a])\n", "mc.plot()" ] }, { "cell_type": "markdown", "id": "c35034bf", "metadata": {}, "source": [ "The layout of the plot is controlled by a so-called mosaic, which is\n", "essentially a 2D array with labels that define the positions of the axes. The\n", "mosaic for the plot above can be accessed through the `mc.mosaic` attribute.\n", "The oseries and model simulations are plotted in the \"sim\" axes which covers a\n", "2x2 region at the top left of the figure." ] }, { "cell_type": "code", "execution_count": null, "id": "841e5490", "metadata": {}, "outputs": [], "source": [ "mc.mosaic" ] }, { "cell_type": "markdown", "id": "3e9af07a", "metadata": {}, "source": [ "Access to the axes or the figure is available through `mc.axes` dictionary (e.g. for modifying axes labels, limits, or ticks) or `mc.figure`\n", "(e.g. for saving the figure)." ] }, { "cell_type": "code", "execution_count": null, "id": "0af6d42b", "metadata": {}, "outputs": [], "source": [ "# access the axes dictionary\n", "mc.axes" ] }, { "cell_type": "markdown", "id": "cb657ac7", "metadata": {}, "source": [ "## Customizing the comparison\n", "\n", "Perhaps you want to view all contributions on the same subplot (and the step\n", "responses as well). For this we need to customize the default plot layout and\n", "tell the plotting method we want several stresses to be plotted on the same\n", "axis.\n", "\n", "Customizing the layout (mosaic) can either be done manually, by providing a\n", "list of lists with axes labels, or we can modify the default mosaic slightly\n", "with `mc.get_default_mosaic`. By setting the number of stressmodels to 1 in\n", "this method there will be only one row for the contributions and response\n", "functions.\n", "\n", "We are now comparing models 1a and 1c (which had \"prec\" and \"evap\" as separate\n", "stresses)." ] }, { "cell_type": "code", "execution_count": null, "id": "975df6d4", "metadata": {}, "outputs": [], "source": [ "# initialize the comparison\n", "mc = ps.CompareModels(models=[ml1a, ml1c])" ] }, { "cell_type": "code", "execution_count": null, "id": "8501e845", "metadata": {}, "outputs": [], "source": [ "# get a custom mosaic by modifying the default mosaic slightly\n", "mosaic = mc.get_default_mosaic(n_stressmodels=1)\n", "mosaic" ] }, { "cell_type": "markdown", "id": "c6ddfdc1", "metadata": {}, "source": [ "The default behavior (when no custom mosaic is provided) is shown below. Note\n", "the difference, with 3 rows showing up for plotting stress models." ] }, { "cell_type": "code", "execution_count": null, "id": "645dade3", "metadata": {}, "outputs": [], "source": [ "# default mosaic when no customization is applied\n", "mc.get_default_mosaic()" ] }, { "cell_type": "markdown", "id": "385b3ef5", "metadata": {}, "source": [ "In order to force the `plot()` method to plot all stressmodels on the same axes\n", "we have to pass it some extra information. This extra information is given as\n", "the `smdict` and is a dictionary that contains an integer index as a key (i.e,\n", "0, 1, ...) and a list of stress model names as its value. The following\n", "dictionary tells `CompareModels` to combine any stress models with names\n", "\"recharge\", \"Prec\" or \"Evap\" from any model in the comparison list on the first\n", "row (with index 0)." ] }, { "cell_type": "code", "execution_count": null, "id": "014f6bdf", "metadata": {}, "outputs": [], "source": [ "smdict = {0: [\"recharge\", \"Prec\", \"Evap\"]}" ] }, { "cell_type": "code", "execution_count": null, "id": "9c610408", "metadata": {}, "outputs": [], "source": [ "# initialize the figure with our custom mosaic\n", "mc.initialize_figure(mosaic=mosaic)\n", "\n", "# now plot the model comparison\n", "mc.plot(smdict=smdict)" ] }, { "cell_type": "markdown", "id": "d4e1eb88", "metadata": {}, "source": [ "## Using individual plotting methods\n", "\n", "Each component (i.e. time series or table) in the plots above is controlled by a separate method, making it easy to plot certain components separately. Check out all the methods starting with `plot_*` to see which options are available. When one of these methods is called separately after creating a `CompareModels` object, a single axis object is created on which the time series for each model are shown. " ] }, { "cell_type": "code", "execution_count": null, "id": "be7e4cc3", "metadata": {}, "outputs": [], "source": [ "# compare model simulations\n", "mc = ps.CompareModels(models=[ml1b, ml1a])\n", "ax = mc.plot_simulation()\n", "_ = ax.legend(loc=(0, 1), frameon=False, ncol=2)" ] }, { "cell_type": "code", "execution_count": null, "id": "a2220992", "metadata": {}, "outputs": [], "source": [ "# compare model optimal parameters\n", "mc = ps.CompareModels(models=[ml1a, ml1b, ml1c])\n", "ax = mc.plot_table_params()" ] }, { "cell_type": "code", "execution_count": null, "id": "6e304300", "metadata": {}, "outputs": [], "source": [ "# compare ACF plots\n", "mc = ps.CompareModels(models=[ml1a, ml1c])\n", "ax = mc.plot_acf()\n", "ax.grid(True)" ] }, { "cell_type": "markdown", "id": "95b7bfa4", "metadata": {}, "source": [ "## Some helper functions\n", "\n", "The `ps.CompareModels` class contains some helper methods to obtain\n", "information from the models passed to the class. Using these can be especially\n", "useful to customize the tables you wish to show on your comparison figure." ] }, { "cell_type": "code", "execution_count": null, "id": "98d26526", "metadata": {}, "outputs": [], "source": [ "# get minimum tmin and maximum tmax\n", "mc.get_tmin_tmax()" ] }, { "cell_type": "code", "execution_count": null, "id": "db6e8b74", "metadata": {}, "outputs": [], "source": [ "# get table with all parameters\n", "mc.get_parameters()" ] }, { "cell_type": "code", "execution_count": null, "id": "b2e9b56c", "metadata": {}, "outputs": [], "source": [ "# get table with parameters selected by substring\n", "mc.get_parameters(param_selection=[\"_A\"])" ] }, { "cell_type": "code", "execution_count": null, "id": "895f4967", "metadata": {}, "outputs": [], "source": [ "# get table with all p-values of statistical tests\n", "mc.get_diagnostics()" ] }, { "cell_type": "code", "execution_count": null, "id": "a63062af", "metadata": {}, "outputs": [], "source": [ "# get table with fit metrics\n", "mc.get_metrics()" ] }, { "cell_type": "markdown", "id": "b7e4983c", "metadata": {}, "source": [ "## Equal vertical scaling between subplots\n", "\n", "It is possible set the vertical scale equal for all the subplots. Just\n", "initialize the figure with `initialize_adjust_height_figure()` instead of\n", "`initialize_figure()`. Note that this does require the default naming\n", "convention for the mosaic to be used (i.e. axes labels must include `\"sim\"`,\n", "`\"res\"` and `\"con*\"`).\n", "\n", "*__Note:__ the scaling is not perfect, probably because space taken up by the\n", "xticklabels, the legend and perhaps some other unknown quantities are not taken\n", "into consideration in the calculations, causing some small differences in the\n", "y-scales per subplot.*" ] }, { "cell_type": "code", "execution_count": null, "id": "a5d64d9d", "metadata": {}, "outputs": [], "source": [ "mc = ps.CompareModels(models=[ml1a, ml1c])\n", "mc.plot(adjust_height=True)" ] }, { "cell_type": "markdown", "id": "2d255026", "metadata": {}, "source": [ "If you want to customize the figure yourself and use the adjusted height functionality, make sure that you provide the `smdict` to the `initialize_adjust_height_figure()` method. Keep in mind that only the first column of the mosaic is used for scaling." ] }, { "cell_type": "code", "execution_count": null, "id": "7e83462a", "metadata": {}, "outputs": [], "source": [ "mosaic = [\n", " [\"sim\", \"sim\", \"met\"],\n", " [\"sim\", \"sim\", \"tab\"],\n", " [\"res\", \"res\", \"tab\"],\n", " [\"con0\", \"con0\", \"rf0\"],\n", " [\"con1\", \"con1\", \"rf1\"],\n", "]\n", "\n", "smdict = {0: [\"Prec\"], 1: [\"recharge\", \"Evap\"]}\n", "\n", "mc = ps.CompareModels([ml1a, ml1c])\n", "mc.initialize_adjust_height_figure(mosaic=mosaic, smdict=smdict)\n", "mc.plot(legend=True)" ] }, { "cell_type": "markdown", "id": "9707f37a", "metadata": {}, "source": [ "## Going a bit overboard\n", "\n", "Just to show you what is possible, here is an extreme example in which we do\n", "the following:\n", "\n", "- compare 2 models that are related (ml1a and ml1c with the same oseries), and \n", " one that isn't (ml2)\n", "- create a custom mosaic by manually providing one\n", "- plot just about every comparison we can think of\n", "- combine all the contributions of the different stresses on the same subplot\n", "- manually share the x-axes between certain plots\n", "- choose a different qualitative colormap\n", "\n", "Note that this comparison doesn't make all that much sense, but it does show\n", "you how easy it is to create custom comparison plots." ] }, { "cell_type": "code", "execution_count": null, "id": "6b348641", "metadata": {}, "outputs": [], "source": [ "mosaic = [\n", " [\"ose\", \"ose\", \"met\"],\n", " [\"sim\", \"sim\", \"tab\"],\n", " [\"res\", \"res\", \"tab\"],\n", " [\"con0\", \"con0\", \"dia\"],\n", " [\"acf\", \"acf\", \"dia\"],\n", "]\n", "\n", "mc = ps.CompareModels(models=[ml1a, ml1c, ml2])\n", "mc.initialize_figure(mosaic, figsize=(16, 10), cmap=\"Dark2\")\n", "\n", "# plot oseries on \"ose\" axis\n", "mc.plot_oseries(axn=\"ose\")\n", "\n", "# plot simulation on \"sim\" axis\n", "mc.plot_simulation()\n", "\n", "# plot metrics\n", "mc.plot_table_metrics()\n", "\n", "# table of optimal parameters but only those containing the gain \"_A\"\n", "mc.plot_table_params(param_selection=[\"_A\"])\n", "\n", "# plot residuals\n", "mc.plot_residuals()\n", "\n", "# plot all contributions on the same axis\n", "mc.plot_contribution(smdict={0: [\"Prec\", \"Evap\", \"Rech\", \"recharge\"]}, axn=\"con{i}\")\n", "\n", "# plot p-value for diagnostic tests\n", "mc.plot_table_diagnostics(axn=\"dia\", diag_col=r\"Reject H0 ($\\alpha$=0.05)\")\n", "\n", "# plot ACF\n", "mc.plot_acf(axn=\"acf\")\n", "\n", "# turn grid on\n", "for axlbl in mc.axes:\n", " mc.axes[axlbl].grid(True)\n", "\n", "# share x-axes between plots\n", "mc.share_xaxes([mc.axes[\"ose\"], mc.axes[\"sim\"], mc.axes[\"res\"], mc.axes[\"con0\"]])\n", "\n", "# set tight layout\n", "mc.figure.tight_layout(pad=0.0)" ] }, { "cell_type": "code", "execution_count": null, "id": "969cb499", "metadata": {}, "outputs": [], "source": [ "mc = ps.CompareModels(models=[ml1a, ml1c, ml2])\n", "\n", "mosaic = [\n", " [\"ose\", \"ose\", \"met\"],\n", " [\"sim\", \"sim\", \"tab\"],\n", " [\"res\", \"res\", \"tab\"],\n", " [\"con0\", \"con0\", \"tab\"],\n", " [\"con1\", \"con1\", \"tab\"],\n", " [\"stress\", \"stress\", \"dia\"],\n", " [\"acf\", \"acf\", \"dia\"],\n", "]\n", "\n", "smdict = {0: [\"recharge\", \"Prec\"], 1: [\"Evap\"]}\n", "\n", "mc.initialize_adjust_height_figure(\n", " mosaic, figsize=(16, 10), cmap=\"Dark2\", smdict=smdict\n", ")\n", "mc.plot_oseries(axn=\"ose\")\n", "mc.plot_simulation()\n", "mc.plot_table_metrics(metric_selection=[\"evp\", \"rsq\"])\n", "mc.plot_table_params(param_selection=[\"_A\"], param_col=\"stderr\")\n", "mc.plot_residuals()\n", "mc.plot_stress()\n", "mc.plot_contribution(axn=\"con{i}\")\n", "mc.plot_table_diagnostics(axn=\"dia\", diag_col=\"Statistic\")\n", "mc.plot_acf(axn=\"acf\")\n", "for axlbl in mc.axes:\n", " mc.axes[axlbl].grid(True)\n", "ps.plots.share_xaxes(\n", " [\n", " mc.axes[\"ose\"],\n", " mc.axes[\"sim\"],\n", " mc.axes[\"res\"],\n", " mc.axes[\"con0\"],\n", " mc.axes[\"con1\"],\n", " mc.axes[\"stress\"],\n", " ]\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "a10bb361", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" }, "vscode": { "interpreter": { "hash": "dace5e1b41a98a8e52d2a8eebc3b981caf2c12e7a76736ebfb89a489e3b62e79" } } }, "nbformat": 4, "nbformat_minor": 5 }