.. _host_error_monitoring:

Error monitoring
=================

When you run a FlexMeasures server, you want to stay on top of things going wrong. We added two ways of doing that:

- You can connect to Sentry, so that all errors will be sent to your Sentry account. Add the token you got from Sentry in the config setting :ref:`sentry_access_token` and you're up and running!
- Another source of crucial errors is things that did not happen at all! For instance, a (bot) user who is supposed to send data regularly fails to connect with FlexMeasures. Or a task to import prices from a day-ahead market, which you depend on later for scheduling, fails silently.

Let's look in more detail at how to monitor for things not happening:

Monitoring the time users were last seen
-----------------------------------------

The CLI task ``flexmeasures monitor last-seen`` alerts you if a user last contacted your FlexMeasures instance longer ago than you expect. This is most useful for bot users (a.k.a. scripts).

Here is an example for illustration:

.. code-block:: bash

    $ flexmeasures monitor last-seen --account-role SubscriberToServiceXYZ --user-role bot --maximum-minutes-since-last-seen 100

As you see, users are filtered by roles. You might need to add roles before this works as you want.

.. todo:: Adding roles and assigning them to users and/or accounts is not supported by the CLI or UI yet (besides ``flexmeasures add account-role``). This is `work in progress <https://github.com/FlexMeasures/flexmeasures/projects/18>`_. Right now, it requires you to add roles on the database level.

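
To make the idea behind this check concrete, here is a minimal sketch of the filtering logic. It is not the FlexMeasures implementation: the ``User`` class and the ``users_not_seen`` function are hypothetical stand-ins, and treating users who were never seen as overdue is an assumption of this sketch.

.. code-block:: python

    from dataclasses import dataclass
    from datetime import datetime, timedelta, timezone
    from typing import Optional


    @dataclass
    class User:
        # Hypothetical stand-in for a user record, not the FlexMeasures model.
        name: str
        roles: set
        last_seen_at: Optional[datetime]


    def users_not_seen(users, role, maximum_minutes_since_last_seen, now=None):
        """Return users with the given role who were last seen too long ago."""
        now = now or datetime.now(timezone.utc)
        cutoff = now - timedelta(minutes=maximum_minutes_since_last_seen)
        return [
            user
            for user in users
            # Assumption: a user who was never seen also counts as overdue.
            if role in user.roles
            and (user.last_seen_at is None or user.last_seen_at < cutoff)
        ]

Calling ``users_not_seen(users, "bot", 100)`` would then return the bot users who deserve an alert, mirroring the role filter and the time limit in the command above.
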
Monitoring task runs
---------------------

The CLI task ``flexmeasures monitor latest-run`` alerts you when tasks have not run successfully within a given number of minutes.
The alerts will come in via Sentry, but you can also send them to email addresses with the config setting :ref:`monitoring_mail_recipients`.

For illustration, here is one example of how we monitor the latest run times of tasks on a server. The command below is run in a cron script every hour and checks whether each listed task ran successfully within the last 60, 6 or 1440 minutes, respectively:

.. code-block:: bash

    $ flexmeasures monitor latest-run --task get_weather_forecasts 60 --task get_recent_meter_data 6 --task import_epex_prices 1440

The first task (``get_weather_forecasts``) is actually supported within FlexMeasures, while the other two sit in plugins we wrote.

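
The check itself can be pictured roughly as follows. This is only a sketch under stated assumptions, not the code behind the CLI command: the function ``overdue_tasks`` and the shape of its two dictionaries are hypothetical, and counting a task that never ran as overdue is an assumption as well.

.. code-block:: python

    from datetime import datetime, timedelta, timezone


    def overdue_tasks(latest_runs, max_minutes, now=None):
        """Return names of tasks without a successful run in their time window.

        latest_runs maps a task name to (time of latest run, whether it succeeded).
        max_minutes maps a task name to the maximum allowed minutes since that run.
        """
        now = now or datetime.now(timezone.utc)
        overdue = []
        for task, minutes in max_minutes.items():
            run = latest_runs.get(task)
            if run is None:
                # Assumption: a task with no recorded run counts as overdue.
                overdue.append(task)
                continue
            ran_at, succeeded = run
            if not succeeded or now - ran_at > timedelta(minutes=minutes):
                overdue.append(task)
        return overdue

With the limits from the command above, ``max_minutes`` would be ``{"get_weather_forecasts": 60, "get_recent_meter_data": 6, "import_epex_prices": 1440}``.
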
This task status monitoring is enabled by decorating the functions behind these tasks with:

.. code-block:: python

    @task_with_status_report
    def my_function():
        ...

Then, FlexMeasures will log if this task ran, and if it succeeded or failed. The result is stored in the table ``latest_task_runs``, and that's where ``flexmeasures monitor latest-run`` will look.

.. note:: The decorator should be placed right before the function (after all other decorators).

By default, the function name is used as the task name. If the number of tasks grows (e.g. by using multiple plugins that each define a task or two), it is useful to come up with more dedicated names. You can add a custom name as an argument to the decorator:

.. code-block:: python

    @task_with_status_report("pluginA_myFunction")
    def my_function():
        ...

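
For readers wondering how one decorator can be used both bare and with a name, here is a simplified sketch of the pattern. It is not the FlexMeasures implementation: ``status_report_sketch`` is a hypothetical name, and the in-memory dictionary ``LATEST_TASK_RUNS`` merely stands in for the ``latest_task_runs`` table.

.. code-block:: python

    import functools
    from datetime import datetime, timezone

    # Hypothetical in-memory stand-in for the latest_task_runs table.
    LATEST_TASK_RUNS = {}


    def status_report_sketch(arg=None):
        """Record when a task last ran and whether it succeeded."""

        def decorate(func, task_name=None):
            name = task_name or func.__name__  # default: the function name

            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                succeeded = False
                try:
                    result = func(*args, **kwargs)
                    succeeded = True
                    return result
                finally:
                    # Runs on success and on failure; exceptions still propagate.
                    LATEST_TASK_RUNS[name] = (datetime.now(timezone.utc), succeeded)

            return wrapper

        if callable(arg):
            # Bare use: @status_report_sketch
            return decorate(arg)
        # Use with a custom name: @status_report_sketch("some_name")
        return lambda func: decorate(func, task_name=arg)


    @status_report_sketch
    def my_function():
        return 42


    @status_report_sketch("pluginA_myFunction")
    def my_other_function():
        raise RuntimeError("simulated failure")

After calling both functions, ``LATEST_TASK_RUNS`` holds one entry under ``my_function`` marked as succeeded and one under ``pluginA_myFunction`` marked as failed.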