65 Commits

Author SHA1 Message Date
3b45f7daa5 Alerting: Fix flaky test (#107090) 2025-06-24 11:25:51 +02:00
c58ac15031 Alerting: Remove grafanaManagedRecordingRules feature flag (#105569) 2025-05-19 12:15:49 +02:00
9fe523b9e6 Alerting: API to pause all alert rules in a folder (#104674) 2025-05-13 17:04:01 +02:00
163a8d5f0c NGAlert: Fix flaky test (#104070)
Fix flaky test

Co-authored-by: Marco de Abreu <18629099+marcoabreu@users.noreply.github.com>
2025-04-15 20:56:42 +01:00
54d4b7842b Alerting: Fix flaky tests (#104017)
Alerting: fix flaky tests

Some test conditions introduced in #103403 are flaky because they rely on random behavior of the generator.

Sometimes rules are generated with an updated by (which warrants the lookup of the users). This makes it so those tests which are checking the user lookup always have rules with updated by.
2025-04-15 08:15:29 +03:00
032299011a Alerting: Relax permissions for access a rule (#103664)
This makes it so that it is:
- No longer required to have datasource permissions to delete a rule.
- No longer required to have datasource permissions to update non-query related fields of a rule.
2025-04-11 00:58:37 +01:00
9bc4fbe900 alerting: batch user service calls (#103403)
* update alerting endpoints to batch requests to the user service via ListByIdOrUID
2025-04-07 07:33:15 -04:00
d8c5c2d3b8 K8s: Folders: Modify GetChildren to return only Folder References (#103072)
* Return FolderReference instead of Folder on GetChildren

Signed-off-by: Maicon Costa <maiconscosta@gmail.com>

---------

Signed-off-by: Maicon Costa <maiconscosta@gmail.com>
2025-04-02 01:30:17 -03:00
695ac91290 Alerting: Add backend support for keep_firing_for (#100750)
What is this feature?

This PR introduces a new alert rule configuration option, keep_firing_for (Prometheus documentation).

keep_firing_for prevents alerts from resolving immediately after the alert condition returns to normal. Instead, they transition into a "Recovering" state and are not considered resolved by the Alertmanager. Once the recovery period ends (or after the next evaluation if it is bigger than keep_firing_for), the alert transitions to "Normal" if it doesn't start alerting again:

Before                                          

+----------+     +----------+                    
| Alerting |---->|  Normal  |                    
+----------+     +----------+                    

-----
After

+----------+      +------------+     +----------+
| Alerting |----->| Recovering |---->|  Normal  |
+----------+      +------------+     +----------+                                                 

Why do we need this feature?

This feature prevents flapping alerts by adding a recovery period. This helps avoid false resolutions caused by brief alert
2025-03-18 11:24:48 +01:00
309a2eb4e9 Alerting: Allow administrators delete rules permanently via UI (#101974)
* add query parameter to existing APIs to control the permanent deletion of rules
* add GUID to gettable rule
* add new endpoint /ruler/grafana/api/v1/trash/rule/guid/{RuleGUID} to delete rules from trash permanently

---------

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2025-03-14 22:14:06 +02:00
374380d1f6 Alerting: Keep the latest version of deleted rule in version table (#101481)
* add feature toggle alertRuleRestore
* Update delete rule to require UserUID, remove all versions and create "delete" version that holds information about who and when deleted the rule
2025-03-05 09:15:26 -05:00
1d54850a68 Alerting: Get alert rule versions by GUID (#101469)
* get alert rule versions by GUID

* protect guid field from accidental update
2025-02-28 11:27:46 -05:00
7ae8058c8b Alerting: Return 404 when /api/ruler/grafana/api/v1/rules/{Namespace}/{Groupname} does not exist (#100264)
* Return a 404 when rule group doesn't exist

* Update tests

* Update swagger doc and tests
2025-02-07 16:24:28 +00:00
68f1730461 Alerting: set updated_by for system owned operations (#100068) 2025-02-04 14:23:15 -05:00
ac41c19350 Alerting: Rule version history API (#99041)
* implement store method to read rule versions

* implement request handler

* declare a new endpoint

* fix fake to return correct response

* add tests

* add integration tests

* rename history to versions

* apply diff from swagger CI step

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>

---------

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2025-02-03 13:26:18 -05:00
d71904cb27 Alerting: Expose updated_by in rules GET APIs (#99525)
---------

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2025-01-27 14:31:40 -05:00
cbb688e910 Zanzana: Remove usage from legacy access control (#98883)
* Zanzana: Remove usage from legacy access control

* remove unused

* remove zanzana client from services where it's not used

* remove unused metrics

* fix linter
2025-01-14 10:26:15 +01:00
324503ee8b Alerting: Add simplified_notifications_section field to the alert rule metadata (#95988) 2024-11-14 12:55:54 +01:00
9f5b05f936 Alerting: Add metadata field with editor_settings to alert rule (#93245) 2024-09-19 16:43:41 +02:00
dfbddd8262 Alerting: Fix recording rule export (#91405)
* Fix HCL export

* Update rule export struct to support new optional fields

* Omit `for` field in export API if empty
2024-08-22 09:04:21 -04:00
87d86e81ce Zanzana: Evaluate permissions alongside with RBAC engine (#90064)
* Zanzana: Evaluate permissions if feature flag enabled

* Fix tests

* adjust logs

* fix spelling

* remove unused

* only evaluate implemented resources

* refactor
2024-07-05 11:31:23 +02:00
006d0021e3 Alerting: Remove requirement for datasource query on rule read (#87349)
* Remove requirement for datasource query for rule read

* Address PR comments
2024-05-23 12:44:30 -04:00
167151b211 Chore: Remove use of deprecated method in AC code (#87541)
* switch from using cfg to using featuremgmt for checking a feature toggle in AC code

* merge test fixes
2024-05-10 11:56:52 +01:00
df25e9197e Alerting: Get grafana-managed alert rule by UID (#86845)
* Add auth checks and test

* Check user is authorized to view rule and add tests

* Change naming

* Update Swagger params

* Update auth test and swagger gen

* Update swagger gen

* Change response to GettableExtendedRuleNode

* openapi3-gen

* Update tests with refactors models pkg
2024-05-02 15:24:59 +01:00
052082a927 Alerting: Refactor Alert Rule Generators (#86813) 2024-04-29 21:52:15 -04:00
5687243d0b Feature Flags: use FeatureToggles interface where possible (#85131)
* Feature Flags: use FeatureToggles interface where possible

Signed-off-by: Dave Henderson <dave.henderson@grafana.com>

* Replace TestFeatureToggles with existing WithFeatures

Signed-off-by: Dave Henderson <dave.henderson@grafana.com>

---------

Signed-off-by: Dave Henderson <dave.henderson@grafana.com>
2024-04-04 12:22:31 -04:00
e593d36ed8 Alerting: Update rule access control to explicitly check for permissions "alert.rules:read" and "folders:read" (#78289)
* require "folders:read" and "alert.rules:read"  in all rules API requests (write and read). 

* add check for permissions "folders:read" and "alert.rules:read" to AuthorizeAccessToRuleGroup and HasAccessToRuleGroup

* check only access to datasource in rule testing API

---------

Co-authored-by: William Wernert <william.wernert@grafana.com>
2024-03-19 22:20:30 -04:00
b7bbc5058f Alerting: Don't validate rules on group update if they've only been reordered (#81841)
---------

Co-authored-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2024-02-15 12:03:28 -05:00
1eebd2a4de Alerting: Support for simplified notification settings in rule API (#81011)
* Add notification settings to storage\domain and API models. Settings are a slice to workaround XORM mapping
* Support validation of notification settings when rules are updated

* Implement route generator for Alertmanager configuration. That fetches all notification settings.
* Update multi-tenant Alertmanager to run the generator before applying the configuration.

* Add notification settings labels to state calculation
* update the Multi-tenant Alertmanager to provide validation for notification settings

* update GET API so only admins can see auto-gen
2024-02-15 09:45:10 -05:00
47546a4c72 Alerting: Update API to use folders' full paths (#81214)
* update GetUserVisibleNamespaces to use FolderSeriver
* update GetNamespaceByUID to use FolderService.GetFolders
* update GetAlertRulesForScheduling to use FolderService.GetFolders 

* Update API and GetAlertRulesForScheduling to use the folder's full path
* get full path of folder in RouteTestGrafanaRuleConfig

* fix escaping of titles for MySQL
2024-02-06 17:12:13 -05:00
2203bc2a3d Alerting: Refactor provisioning tests/fakes (#81205)
* Fix up test Alertmanager config JSON

* Move fake AM config and provisioning stores to fakes package
2024-01-24 17:15:55 -05:00
d1dab5828d Alerting: Update rule API to address folders by UID (#74600)
* Change ruler API to expect the folder UID as namespace

* Update example requests

* Fix tests

* Update swagger

* Modify FIle field in /api/prometheus/grafana/api/v1/rules

* Fix ruler export

* Modify folder in responses to be formatted as <parent UID>/<title>

* Add alerting test with nested folders

* Apply suggestion from code review

* Alerting: use folder UID instead of title in rule API (#77166)

Co-authored-by: Sonia Aguilar <soniaaguilarpeiron@gmail.com>

* Drop a few more latent uses of namespace_id

* move getNamespaceKey to models package

* switch GetAlertRulesForScheduling to use folder table

* update GetAlertRulesForScheduling to return folder titles in format `parent_uid/title`.

* fi tests

* add tests for GetAlertRulesForScheduling when parent uid

* fix integration tests after merge

* fix test after merge

* change format of the namespace to JSON array

this is needed for forward compatibility, when we migrate to full paths

* update EF code to decode nested folder

---------

Co-authored-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
Co-authored-by: Virginia Cepeda <virginia.cepeda@grafana.com>
Co-authored-by: Sonia Aguilar <soniaaguilarpeiron@gmail.com>
Co-authored-by: Alex Weaver <weaver.alex.d@gmail.com>
Co-authored-by: Gilles De Mey <gilles.de.mey@gmail.com>
2024-01-17 11:07:39 +02:00
2be7605794 Alerting: Fix fine-grained rule access control to use 403 for authorization error (#79239)
* use 403 for authorization error
* update silences API
* add ForbiddenError to rule API responses
2023-12-07 13:43:58 -05:00
7cec741bae Alerting: Extract alerting rules authorization logic to a service (#77006)
* extract alerting authorization logic to separate package
* convert authorization logic to service
2023-11-15 18:54:54 +02:00
027bd9356f Alerting: Rule Modify Export APIs (#75322)
* extend RuleStore interface to get namespace by UID
* add new export API endpoints
* implement request handlers
* update authorization and wire handlers to paths
* add folder error matchers to errorToResponse
* add tests for export methods
2023-10-02 11:47:59 -04:00
025b2f3011 Chore: use any rather than interface{} (#74066) 2023-08-30 18:46:47 +03:00
b963defa44 Alerting: update rules POST API to validate query and condition only for rules that changed. (#68667)
* replace condition validation with just structural validation
* validate conditions of only new and updated rules
* add integration tests for rule update\delete API

Co-authored-by: George Robinson <george.robinson@grafana.com>
2023-06-15 13:33:42 -04:00
d98813796c RBAC: Remove legacy AC from HasAccess permission check (#68995)
* remove unused HasAdmin and HasEdit permission methods

* remove legacy AC from HasAccess method

* remove unused function

* update alerting tests to work with RBAC
2023-05-30 14:39:09 +01:00
85a954cd81 Alerting: Update scheduler to get updates only from database (#64635)
* stop using the scheduler's Update and Delete methods all communication must be via the database
* update scheduler's registry to calculate diff before re-setting the cache
* update fetcher to return the diff generated by registry
* update processTick to update rule eval routine if the rule was updated and it is not going to be evaluated at this tick.
* remove references to the scheduler from api package
* remove unused methods in the scheduler
2023-03-14 18:02:51 -04:00
f561e71de8 Alerting: decouple api models from domain\dto models: separate Provenance status + converters (#63594)
* move conversions of domain models to api models and reverse from definition package to api package
2023-02-27 17:57:15 -05:00
6c5a573772 Chore: Move ReqContext to contexthandler service (#62102)
* Chore: Move ReqContext to contexthandler service

* Rename package to contextmodel

* Generate ngalert files

* Remove unused imports
2023-01-27 08:50:36 +01:00
080ea88af7 Nested Folders: Support getting of nested folder in folder service wh… (#58597)
* Nested Folders: Support getting of nested folder in folder service when feature flag is set

* Fix lint

* Fix some tests

* Fix ngalert test

* ngalert fix

* Fix API tests

* Fix some tests and lint

* Fix lint 2

* Fix library elements and panels

* Add access control to get folder

* Cleanup and minor test change
2022-11-11 14:28:24 +01:00
c16317e5b8 Alerting: Move fake rule store to the test utilities package (#56062)
* Move fakeRuleStore to tests/fakes package

* Break stub dependencies on store

* Update existing tests to point to new location

* Remove unused stub of TimeNow

* Rename fake to take advantage of package name
2022-09-30 14:36:51 -05:00
2d38664fe6 Alerting: Improve validation of query and expressions on rule submit (#53258)
* Improve error messages of server-side expression 
* move validation of alert queries and a condition to eval package
2022-09-21 15:14:11 -04:00
41bd36eb97 Alerting: Update rules delete endpoint to handle rules in group (#53790)
* update RouteDeleteAlertRules rules to update as a group
* remove expecter from scheduler mock to support variadic function
* create function to check for provisioning status + tests

Co-authored-by: Alexander Weaver <weaver.alex.d@gmail.com>
2022-08-24 15:33:33 -04:00
a14621fff6 Chore: Add user service method SetUsingOrg and GetSignedInUserWithCacheCtx (#53343)
* Chore: Add user service method SetUsingOrg

* Chore: Add user service method GetSignedInUserWithCacheCtx

* Use method GetSignedInUserWithCacheCtx from user service

* Fix lint after rebase

* Fix lint

* Fix lint error

* roll back some changes

* Roll back changes in api and middleware

* Add xorm tags to SignedInUser ID fields
2022-08-11 13:28:55 +02:00
6afad51761 Move SignedInUser to user service and RoleType and Roles to org (#53445)
* Move SignedInUser to user service and RoleType and Roles to org

* Use go naming convention for roles

* Fix some imports and leftovers

* Fix ldap debug test

* Fix lint

* Fix lint 2

* Fix lint 3

* Fix type and not needed conversion

* Clean up messages in api tests

* Clean up api tests 2
2022-08-10 11:56:48 +02:00
c50cbea0bb Alerting: Extract alert rule diff logic into separate file with exported API (#53083)
* Refactor diff logic into separate file with exported API

* Fix linter complaint
2022-08-01 23:41:23 -05:00
0d9389e1f4 Alerting: Code-gen parsing of URL parameters and fix related bugs (#50731)
* Extend template and generate

* Generate and fix up alertmanager endpoints

* Prometheus routes

* fix up Testing endpoints

* touch up ruler API

* Update provisioning and fix 500

* Drop dead code

* Remove more dead code

* Resolve merge conflicts
2022-06-23 15:13:39 -05:00
4d02f73e5f Alerting: Persist rule position in the group (#50051)
Migrations:
* add a new column alert_group_idx to alert_rule table
* add a new column alert_group_idx to alert_rule_version table
* re-index existing rules during migration

API:
* set group index on update. Use the natural order of items in  the array as group index
* sort rules in the group on GET
* update the version of all rules of all affected groups. This will make optimistic lock work in the case of multiple concurrent request touching the same groups.

UI:
* update UI to keep the order of alerts in a group
2022-06-22 10:52:46 -04:00