52 Commits

Author SHA1 Message Date
9f07e49cdd Alerting: Add extended definition to prometheus alert rules api (#103320)
* Alerting: Add extended definition to prometheus alert rules api

This adds `isPaused` and `notificationSettings` to the paginated rules api to enable the paginated view of GMA rules.

refactor: make alert rule status and state retrieval extensible

This lets us get status from other sources than the local ruler.

* update swagger spec

* add safety checks in test
2025-04-23 21:14:09 +01:00
a913c5426d Alerting: Fix flaky TestIntegrationPrometheusRules test (#103886) 2025-04-11 19:50:46 +02:00
695ac91290 Alerting: Add backend support for keep_firing_for (#100750)
What is this feature?

This PR introduces a new alert rule configuration option, keep_firing_for (Prometheus documentation).

keep_firing_for prevents alerts from resolving immediately after the alert condition returns to normal. Instead, they transition into a "Recovering" state and are not considered resolved by the Alertmanager. Once the recovery period ends (or after the next evaluation if it is bigger than keep_firing_for), the alert transitions to "Normal" if it doesn't start alerting again:

Before                                          

+----------+     +----------+                    
| Alerting |---->|  Normal  |                    
+----------+     +----------+                    

-----
After

+----------+      +------------+     +----------+
| Alerting |----->| Recovering |---->|  Normal  |
+----------+      +------------+     +----------+                                                 

Why do we need this feature?

This feature prevents flapping alerts by adding a recovery period. This helps avoid false resolutions caused by brief alert
2025-03-18 11:24:48 +01:00
5aeaccadff Alerting: Add read-only GMA rules to the new list view (#98116)
* Reuse prom groups generator between GMA, external DS and list view

* Improve generators, add initial support for GMA in grouped view components

* Improve handling of GMA rules

* Split componentes into files

* Improve error handling, simplify groups grouping

* Extract grafana rules component

* Reset yarn.lock

* Reset yarn.lock 2

* Update filters, adjust file names, add folder display name to GMA rules

* Re-enable filtering for cloud rules

* Rename AlertRuleLoader

* Add missing translations, fix lint errors

* Remove unused imports, update translations

* Fix responses in BE tests

* Update backend tests

* Update integration test

* Tidy up group page size constants

* Add error throwing to getGroups endpoint to prevent grafana usage

* Refactor FilterView to remove exhaustive check

* Refactor common props for grafana rule rendering

* Unify identifiers' discriminators, add comments, minor refactor

* Update translations

* Remove unnecessary prev page condition, add a few explanations

---------

Co-authored-by: fayzal-g <fayzal.ghantiwala@grafana.com>
Co-authored-by: Tom Ratcliffe <tom.ratcliffe@grafana.com>
2025-01-15 11:36:32 +01:00
64c93217ff Alerting: Fix incorrect 500 code on missing alert rule dashboardUID / panelID (#96491) 2024-11-14 21:24:48 +02:00
910553df20 Actionsets: Add cfg option for only writing actionsets (#88367)
* test

* test

* missed test

* fix review comments
2024-05-28 16:32:23 +01:00
105313f5c2 RBAC: Adding action set resolver for RBAC evaluation (#86801)
* add action set resolver

* rename variables

* some fixes and some tests

* more tests

* more tests, and put action set storing behind a feature toggle

* undo change from cfg to feature mgmt - will cover it in a separate PR due to the amount of test changes

* fix dependency cycle, update some tests

* add one more test

* fix for feature toggle check not being set on test configs

* linting fixes

* check that action set name can be split nicely

* clean up tests by turning GetActionSetNames into a function

* undo accidental change

* test fix

* more test fixes
2024-05-09 10:18:03 +01:00
6ea97e41fb Alerting: Consistently return Prometheus-style responses from rules APIs. (#86600)
* Alerting: Consistently return Prometheus-style responses from rules APIs.

This commit is part refactor and part fix. The /rules API occasionally returns
error responses which are inconsistent with other error responses. This fixes
that, and adds a function to map from Prometheus error type and HTTP code.

* Fix integration tests

* Linter happiness

* Make linter more happy

* Fix up one more place returning non-Prometheus responses
2024-04-19 21:03:20 +02:00
ddabef9895 RBAC: Add actionsets struct and write path (#86108)
* Add actionsets struct and failing test

* update from review

* review comments

* review comments update

* refactor: create interface

* actionset service

* fix tests

* move from wireoss to wire

* Apply suggestions from code review

remove unnecessary comments

Co-authored-by: Ieva <ieva.vasiljeva@grafana.com>

* nil for the actionsetservice

* Revert "nil for the actionsetservice"

This reverts commit e3d3cc817180c1927dea430191f1e940f5c01272.

---------

Co-authored-by: Ieva <ieva.vasiljeva@grafana.com>
2024-04-19 15:38:14 +01:00
faa1244518 Chore: Replace sqlstore with db interface (#85366)
* replace sqlstore with db interface in a few packages

* remove from stats

* remove sqlstore in admin test

* remove sqlstore from api plugin tests

* fix another createUser

* remove sqlstore in publicdashboards

* remove sqlstore from orgs

* clean up orguser test

* more clean up in sso

* clean up service accounts

* further cleanup

* more cleanup in accesscontrol

* last cleanup in accesscontrol

* clean up teams

* more removals

* split cfg from db in testenv

* few remaining fixes

* fix test with bus

* pass cfg for testing inside db as an option

* set query retries when no opts provided

* revert golden test data

* rebase and rollback
2024-04-04 15:04:47 +02:00
2497db4bd6 Alerting: Add UID of rules to response that were affected by update group request (#75985)
* update storage's method InstertRules to return ids of added rules as slice to keep the same order as rules in the argument
* schematize response of update rule group endpoint, add created, updated, deleted fields that contain UID of affected rules.
* update integration tests to use the new fields
2023-10-07 01:11:24 +03:00
025b2f3011 Chore: use any rather than interface{} (#74066) 2023-08-30 18:46:47 +03:00
cfa1a2c55f RBAC: Split non-empty scopes into kind, attribute and identifier fields for better search performance (#71933)
* add a feature toggle

* add the fields for attribute, kind and identifier to permission

Co-authored-by: Kalle Persson <kalle.persson@grafana.com>

* set the new fields when new permissions are stored

* add migrations

Co-authored-by: Kalle Persson <kalle.persson@grafana.com>

* remove comments

* Update pkg/services/accesscontrol/migrator/migrator.go

Co-authored-by: Gabriel MABILLE <gamab@users.noreply.github.com>

* feedback: put column migrations behind the feature toggle, added an index, changed how wildcard scopes are split

* PR feedback: add a comment and revert an accidentally changed file

* PR feedback: handle the case with : in resource identifier

* switch from checking feature toggle through cfg to checking it through featuremgmt

* don't put the column migrations behind a feature toggle after all - this breaks permission queries from db

---------

Co-authored-by: Kalle Persson <kalle.persson@grafana.com>
Co-authored-by: Gabriel MABILLE <gamab@users.noreply.github.com>
2023-07-21 15:23:01 +01:00
19ebb079ba Alerting: Add limits and filters to Prometheus Rules API (#66627)
This commit adds support for limits and filters to the Prometheus Rules
API.

Limits:

It adds a number of limits to the Grafana flavour of the Prometheus Rules
API:

- `limit` limits the maximum number of Rule Groups returned
- `limit_rules` limits the maximum number of rules per Rule Group
- `limit_alerts` limits the maximum number of alerts per rule

It sorts Rule Groups and rules within Rule Groups such that data in the
response is stable across requests. It also returns summaries (totals)
for all Rule Groups, individual Rule Groups and rules.

Filters:

Alerts can be filtered by state with the `state` query string. An example
of an HTTP request asking for just firing alerts might be
`/api/prometheus/grafana/api/v1/rules?state=alerting`.

A request can filter by two or more states by adding additional `state`
query strings to the URL. For example `?state=alerting&state=normal`.

Like the alert list panel, the `firing`, `pending` and `normal` state are
first compared against the state of each alert rule. All other states are
ignored. If the alert rule matches then its alert instances are filtered
against states once more.

Alerts can also be filtered by labels using the `matcher` query string.
Like `state`, multiple matchers can be provided by adding additional
`matcher` query strings to the URL.

The match expression should be parsed using existing regular expression
and sent to the API as URL-encoded JSON in the format:

{
    "name": "test",
    "value": "value1",
    "isRegex": false,
    "isEqual": true
}

The `isRegex` and `isEqual` options work as follows:

| IsEqual | IsRegex  | Operator |
| ------- | -------- | -------- |
| true    | false    |    =     |
| true    | true     |    =~    |
| false   | true     |    !~    |
| false   | false    |    !=    |
2023-04-17 17:45:06 +01:00
bd29071a0d Revert "Alerting: Add limits to the Prometheus Rules API" (#65842) 2023-04-03 15:20:37 +00:00
d96b0a71d3 Alerting: Add limits to the Prometheus Rules API (#65169)
This commit adds a number of limits to the Grafana flavor of the
Prometheus Rules API:

1. `limit` limits the maximum number of Rule Groups returned
2. `limit_rules` limits the maximum number of rules per Rule Group
3. `limit_alerts` limits the maximum number of alerts per rule

It sorts Rule Groups and rules within Rule Groups such that data in the
response is stable across requests. It also returns summaries (totals) for
all Rule Groups, individual Rule Groups and rules.
2023-04-03 10:17:02 +01:00
52a0f59706 Alerting: introduce AlertQuery in definitions package (#63825)
* copy AlertQuery from ngmodels to the definition package
* replaces usages of ngmodels.AlertQuery in API models
* create a converter between models of AlertQuery
---------

Co-authored-by: Alex Moreno <alexander.moreno@grafana.com>
2023-03-27 11:55:13 -04:00
91221bc436 Expressions: Fixes the issue showing expressions editor (#62510)
* Use suggested value for uid

* update the snapshot

* use __expr__

* replace all -100 with __expr__

* update snapshot

* more changes

* revert redundant change

* Use expr.DatasourceUID where it's possible

* generate files
2023-01-31 18:50:10 +01:00
2db8ed9441 Chore: All tests under pkg/tests should be integration tests (#59521)
* Chore: All tests under pkg/tests should be integrationtests

* run alerting integration tests only for sqlite
2022-12-09 08:11:56 +01:00
1b933ff3ed RBAC: Move resource permissions store to service package (#53815)
* Rename file to store

* Move resource permission specific database functions to
resourcepermissions package

* Wire: Remove interface bind

* RBAC: Remove injection of resourcepermission Store

* RBAC: Export store constructor

* Tests: Use resource permission package to initiate store used in tests

* RBAC: Remove internal types package and move to resourcepermissions
package

* RBAC: Run database tests as itegration tests
2022-08-18 09:43:45 +02:00
Jo
062d255124 Handle ioutil deprecations (#53526)
* replace ioutil.ReadFile -> os.ReadFile

* replace ioutil.ReadAll -> io.ReadAll

* replace ioutil.TempFile -> os.CreateTemp

* replace ioutil.NopCloser -> io.NopCloser

* replace ioutil.WriteFile -> os.WriteFile

* replace ioutil.TempDir -> os.MkdirTemp

* replace ioutil.Discard -> io.Discard
2022-08-10 15:37:51 +02:00
6afad51761 Move SignedInUser to user service and RoleType and Roles to org (#53445)
* Move SignedInUser to user service and RoleType and Roles to org

* Use go naming convention for roles

* Fix some imports and leftovers

* Fix ldap debug test

* Fix lint

* Fix lint 2

* Fix lint 3

* Fix type and not needed conversion

* Clean up messages in api tests

* Clean up api tests 2
2022-08-10 11:56:48 +02:00
8b3b667a47 Alerting: Fix rule API to accept 0 duration of field For (#50992)
* make 'for' pointer to distinguish between missing field and 0
* set 'for' to -1 if the value is missing but not allow negative in the request + path -1 with the value from original rule
* update store validation to not allow negative 'for'
* update usages to use pointer
2022-06-30 11:46:26 -04:00
6c43eb0b4d Split Create User (#50502)
* Split Create User

* Use new create user and User from package user

* Add service to wire

* Making create user work

* Replace user from user pkg

* One more

* Move Insert to orguser Service/Store

* Remove unnecessary conversion

* Cleaunp

* Fix Get User and add fakes

* Fixing get org id for user logic, adding fakes and other adjustments

* Add some tests for ourguser service and store

* Fix insert org logic

* Add comment about deprecation

* Fix after merge with main

* Move orguser service/store to org service/store

* Remove orguser from wire

* Unimplement new Create user and use User from pkg user

* Fix wire generation

* Fix lint

* Fix lint - use only User and CrateUserCommand from user pkg

* Remove User and CreateUserCommand from models

* Fix lint 2
2022-06-28 14:32:25 +02:00
4a9872d108 reload permissions after create folder (#51288) 2022-06-27 09:31:49 -04:00
53d03aec78 Alerting: Add api client to integration tests (#50970) 2022-06-21 11:39:22 -04:00
ae9491c3a7 Chore: Make test tracer noop and return no errors (#50797) 2022-06-15 12:40:41 +02:00
81e368dbb5 Fix flaky test. Sort records and only test the important fields (#49120) 2022-05-18 10:17:08 -05:00
f256f625d8 AccessControl: Enable RBAC by default (#48813)
* Add RBAC section to settings

* Default to RBAC enabled settings to true

* Update tests to respect RBAC

Co-authored-by: Karl Persson <kalle.persson@grafana.com>
2022-05-16 12:45:41 +02:00
41012af997 Tracing: Use common traceID context value for opentracing and opentelemetry (#46411)
* use common traceID context value for opentracing and opentelemetry

* support sampled trace IDs as well

* inject traceID into NormalResponse on errors

* Finally the test passed

* fix the test

* fix linter

* change the function parameter

Co-authored-by: Ying WANG <ying.wang@grafana.com>
2022-04-14 17:54:49 +02:00
e86b6662a1 Chore: Remove bus.Bus field (#47695)
* Chore: Remove bus.Bus field

* fix integration test
2022-04-13 15:24:13 +02:00
a338c78ca8 Alerting: Remove internal labels from prometheus compatible API responses (#46548)
* Alerting: Remove internal labels from prometheus compatible API responses

* Appease the linter

* Fix integration tests

* Fix API documentation & linter

* move removal of internal labels to the models
2022-03-16 16:04:19 +00:00
f75bea481d Alerting: validate rules and calculate changes in API controller (#45072)
* Update API controller
   - add validation of rules API model
   - add function to calculate changes between the submitted alerts and existing alerts
   - update RoutePostNameRulesConfig to validate input models, calculate changes and apply in a transaction

* Update DBStore
   - delete unused storage method. All the logic is moved upstream.
   - upsert to not modify fields of new by values from the existing alert
   - if rule has UID do not try to pull it from db. (it is done upstream)

* Add rule generator
2022-02-23 11:30:04 -05:00
d5b98772ed Dashboards: Refactor service to make it injectable by wire (#44588)
* Add providers to folder and dashboard services

* Refactor folder and dashboard services

* Move store implementation to its own file due wire cannot allow us to cast to SQLStore

* Add store in some places and more missing dependencies

* Bad merge fix

* Remove old functions from tests and few fixes

* Fix provisioning

* Remove store from http server and some test fixes

* Test fixes

* Fix dashboard and folder tests

* Fix library tests

* Fix provisioning tests

* Fix plugins manager tests

* Fix alert and org users tests

* Refactor service package and more test fixes

* Fix dashboard_test tets

* Fix api tests

* Some lint fixes

* Fix lint

* More lint :/

* Move dashboard integration tests to dashboards service and fix dependencies

* Lint + tests

* More integration tests fixes

* Lint

* Lint again

* Fix tests again and again anda again

* Update searchstore_test

* Fix goimports

* More go imports

* More imports fixes

* Fix lint

* Move UnprovisionDashboard function into dashboard service and remove bus

* Use search service instead of bus

* Fix test

* Fix go imports

* Use nil in tests
2022-02-16 14:15:44 +01:00
4fef791c7c Alerting: enable e2e tests to run in production mode (#45073)
* Alerting: run e2e tests in production mode

* adapt expected messages

* switch expected and actual to have the right order
2022-02-09 10:26:06 +01:00
8ee3f59cd4 Alerting: recognize Cortex datasources correctly in the frontend (#44316)
* Alerting: always use msg field for user facing errors

* fix: revert front-end Cortex detection

Co-authored-by: gillesdemey <gilles.de.mey@gmail.com>
2022-01-21 15:44:11 +01:00
30aa24a183 Chore: Implement OpenTelemtry in Grafana (#42674)
* Separate Tracer interface to TracerService and Tracer

* Fix lint

* Fix:Make it possible to start spans for both opentracing and opentelemetry in ds proxy

* Add span methods, use span interface for rest of tracing

* Fix logs in tracing

* Fix tests that are related to tracing

* Fix resourcepermissions test

* Fix some tests

* Fix more tests

* Add TracingService to wire cli runner

* Remove GlobalTracer from bus

* Renaming test function

* Remove GlobalTracer from TSDB

* Replace GlobalTracer in api

* Adjust tests to the InitializeForTests func

* Remove GlobalTracer from services

* Remove GlobalTracer

* Remove bus.NewTest

* Remove Tracer interface

* Add InitializeForBus

* Simplify tests

* Clean up tests

* Rename TracerService to Tracer

* Update pkg/middleware/request_tracing.go

Co-authored-by: Marcus Efraimsson <marcus.efraimsson@gmail.com>

* Initialize tracer before passing it to SQLStore initialization in commands

* Remove tests for opentracing

* Set span attributes correctly, remove unnecessary trace initiliazation form test

* Add tracer instance to newSQLStore

* Fix changes due to rebase

* Add modified tracing middleware test

* Fix opentracing implementation tags

Co-authored-by: Marcus Efraimsson <marcus.efraimsson@gmail.com>
2022-01-20 11:10:12 +01:00
b8852ef6a3 Chore: Remove context.TODO() (#43409)
* Remove context.TODO() from services

* Fix live test

* Remove context.TODO
2021-12-22 11:02:42 +01:00
b605340668 Alerting: log errors happening in the API on server side (#43192)
* Alerting: log errors happening in the API on server side

* adapt tests to reflect changed payload
2021-12-16 13:33:10 +01:00
6220872633 Alerting: fix bug where user is able to access rules from namespaces user is not part of (#41403)
* Add fix
* Add tests
Co-authored-by: Yuriy Tseretyan <yuriy.tseretyan@grafana.com>
Co-authored-by: Armand Grillet <2117580+armandgrillet@users.noreply.github.com>
Co-authored-by: Jean-Philippe Quéméner <JohnnyQQQQ@users.noreply.github.com>
Co-authored-by: George Robinson <george.robinson@grafana.com>
2021-11-08 14:26:08 +01:00
935bd34a30 Panel ID annotation cannot be set without Dashboard UID (#40019) 2021-10-06 11:34:11 +01:00
2a4c1b1aa6 You can now get alert rules for a dashboard or a panel using /api/v1/rules endpoints. (#39476)
Get alert rules for a dashboard and panel in /api/v1/rules
2021-10-04 16:33:55 +01:00
012d4f0905 Alerting: Remove ngalert feature toggle and introduce two new settings for enabling Grafana 8 alerts and disabling them for specific organisations (#38746)
* Remove `ngalert` feature toggle

* Update frontend

Remove all references of ngalert feature toggle

* Update docs

* Disable unified alerting for specific orgs

* Add backend tests

* Apply suggestions from code review

Co-authored-by: achatterjee-grafana <70489351+achatterjee-grafana@users.noreply.github.com>

* Disabled unified alerting by default

* Ensure backward compatibility with old ngalert feature toggle

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>
2021-09-29 16:16:40 +02:00
78596a6756 Migrate to Wire for dependency injection (#32289)
Fixes #30144

Co-authored-by: dsotirakis <sotirakis.dim@gmail.com>
Co-authored-by: Marcus Efraimsson <marcus.efraimsson@gmail.com>
Co-authored-by: Ida Furjesova <ida.furjesova@grafana.com>
Co-authored-by: Jack Westbrook <jack.westbrook@gmail.com>
Co-authored-by: Will Browne <wbrowne@users.noreply.github.com>
Co-authored-by: Leon Sorokin <leeoniya@gmail.com>
Co-authored-by: Andrej Ocenas <mr.ocenas@gmail.com>
Co-authored-by: spinillos <selenepinillos@gmail.com>
Co-authored-by: Karl Persson <kalle.persson@grafana.com>
Co-authored-by: Leonard Gram <leo@xlson.com>
2021-08-25 15:11:22 +02:00
04d5dcb7c8 Alerting: modify DB table, accessors and migration to restrict org access (#37414)
* Alerting: modify table and accessors to limit org access appropriately

* Update migration to create multiple Alertmanager configs

* Apply suggestions from code review

Co-authored-by: gotjosh <josue@grafana.com>

* replace mg.ClearMigrationEntry()

mg.ClearMigrationEntry() would create a new session.
This commit introduces a new migration for clearing an entry from migration log for replacing  mg.ClearMigrationEntry() so that all dashboard alert migration operations will run inside the same transaction.
It adds also `SkipMigrationLog()` in Migrator interface for skipping adding an entry in the migration_log.

Co-authored-by: gotjosh <josue@grafana.com>
2021-08-12 16:04:09 +03:00
8a3edf280e Alerting: Fix prometheus API to check folder permissions (#36301) 2021-07-05 10:49:14 +03:00
23939eab10 [Alerting]: namespace fixes (#34470)
* [Alerting]: forbid viewers for updating rules if viewers can edit

check for CanSave instead of CanEdit

* Clear ngalert tables when deleting the folder

* Apply suggestions from code review

* Log failure to check save permission

Co-authored-by: gotjosh <josue@grafana.com>
2021-05-20 15:49:33 +03:00
bbb7bbf891 Alerting: Remove back end logic for supporting KeepLastState (#34242)
* Removed back end logic for supporting KeepLastState

* Map keep_state correctly in migrations
2021-05-18 10:55:43 -07:00
331991ca10 UAlerting: Increase default max datapoints (#34223)
Change const value from 100 to 43200 (12 hours at 1sec interval)
2021-05-17 18:46:52 +02:00
540f110220 [Alerting]: Extend quota service to optionally set limits on alerts (#33283)
* Quota: Extend service to set limit on alerts

* Add test for applying quota to alert rules

* Apply suggestions from code review

Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>

* Get used alert quota only if naglert is enabled

* Set alert limit to zero if nglalert is not enabled
Co-authored-by: Diana Payton <52059945+oddlittlebird@users.noreply.github.com>
2021-05-04 19:16:28 +03:00