132 Commits

Author SHA1 Message Date
9fe523b9e6 Alerting: API to pause all alert rules in a folder (#104674) 2025-05-13 17:04:01 +02:00
0ceea29787 Alerting: Remove alertingSimplifiedRouting feature toggle (#104980)
Co-authored-by: Gilles De Mey <gilles.de.mey@gmail.com>
2025-05-09 16:30:56 +03:00
032299011a Alerting: Relax permissions for access a rule (#103664)
This makes it so that it is:
- No longer required to have datasource permissions to delete a rule.
- No longer required to have datasource permissions to update non-query related fields of a rule.
2025-04-11 00:58:37 +01:00
757be6365a CI: Bump golangci-lint to 2.0.2 (#103572) 2025-04-10 14:42:23 +02:00
9bc4fbe900 alerting: batch user service calls (#103403)
* update alerting endpoints to batch requests to the user service via ListByIdOrUID
2025-04-07 07:33:15 -04:00
f49a88ab72 Alerting: Add MissingSeriesEvalsToResolve to the APIs (#102150)
What is this feature?

A follow-up for #101184, adds AlertRule.MissingSeriesEvalsToResolve to the APIs.

missing_series_evals_to_resolve must be specified too and it must be > 0.

POST /api/ruler/grafana/api/v1/rules/{folderUID} works in the following way:

    If missing_series_evals_to_resolve is not sent or null, the rule keeps its existing value
    If missing_series_evals_to_resolve > 0: updates to that value
    If missing_series_evals_to_resolve = 0: resets to default (nil).
    AlertRule.MissingSeriesEvalsToResolve can't be 0, so I used it to reset

In the Provisioning API, the value is just set if present and > 0. Otherwise it's reset:

PUT to /api/v1/provisioning/alert-rules/{UID}:

    If missing_series_evals_to_resolve is nil, it's reset to the default value
    If missing_series_evals_to_resolve > 0, it's updated
2025-03-26 13:34:53 +01:00
695ac91290 Alerting: Add backend support for keep_firing_for (#100750)
What is this feature?

This PR introduces a new alert rule configuration option, keep_firing_for (Prometheus documentation).

keep_firing_for prevents alerts from resolving immediately after the alert condition returns to normal. Instead, they transition into a "Recovering" state and are not considered resolved by the Alertmanager. Once the recovery period ends (or after the next evaluation if it is bigger than keep_firing_for), the alert transitions to "Normal" if it doesn't start alerting again:

Before                                          

+----------+     +----------+                    
| Alerting |---->|  Normal  |                    
+----------+     +----------+                    

-----
After

+----------+      +------------+     +----------+
| Alerting |----->| Recovering |---->|  Normal  |
+----------+      +------------+     +----------+                                                 

Why do we need this feature?

This feature prevents flapping alerts by adding a recovery period. This helps avoid false resolutions caused by brief alert
2025-03-18 11:24:48 +01:00
309a2eb4e9 Alerting: Allow administrators delete rules permanently via UI (#101974)
* add query parameter to existing APIs to control the permanent deletion of rules
* add GUID to gettable rule
* add new endpoint /ruler/grafana/api/v1/trash/rule/guid/{RuleGUID} to delete rules from trash permanently

---------

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2025-03-14 22:14:06 +02:00
87638c0170 Alerting: Start splitting apart ngalert/api package. (#102075) 2025-03-13 09:28:35 +01:00
7e4beb2074 Alerting: API to return deleted rules (#101429) 2025-03-11 12:40:44 -04:00
374380d1f6 Alerting: Keep the latest version of deleted rule in version table (#101481)
* add feature toggle alertRuleRestore
* Update delete rule to require UserUID, remove all versions and create "delete" version that holds information about who and when deleted the rule
2025-03-05 09:15:26 -05:00
1d54850a68 Alerting: Get alert rule versions by GUID (#101469)
* get alert rule versions by GUID

* protect guid field from accidental update
2025-02-28 11:27:46 -05:00
7ae8058c8b Alerting: Return 404 when /api/ruler/grafana/api/v1/rules/{Namespace}/{Groupname} does not exist (#100264)
* Return a 404 when rule group doesn't exist

* Update tests

* Update swagger doc and tests
2025-02-07 16:24:28 +00:00
f7d476e408 Alerting: Remove id and org_id from grafana alert rule API model (#100139) 2025-02-05 23:13:22 +02:00
68f1730461 Alerting: set updated_by for system owned operations (#100068) 2025-02-04 14:23:15 -05:00
ac41c19350 Alerting: Rule version history API (#99041)
* implement store method to read rule versions

* implement request handler

* declare a new endpoint

* fix fake to return correct response

* add tests

* add integration tests

* rename history to versions

* apply diff from swagger CI step

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>

---------

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2025-02-03 13:26:18 -05:00
d71904cb27 Alerting: Expose updated_by in rules GET APIs (#99525)
---------

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2025-01-27 14:31:40 -05:00
92d6762a3a Alerting: Store information about user that created\updated alert rule (#99395)
* introduce new fields created_by in rule tables
* update domain model and compat layer to support UpdatedBy
* add alert rule generator mutators for UpdatedBy
* ignore UpdatedBy in diff and hash calculation
* Add user context to alert rule insert/update operations
  Updated InsertAlertRules and UpdateAlertRules methods to accept a user context parameter. This change ensures auditability and better tracking of user actions when creating or updating alert rules. Adjusted all relevant calls and interfaces to pass the user context accordingly.

* set UpdatedBy in PreSave because this is where Updated is set
* Use nil userID for system-initiated updates
This ensures differentiation between system and user-initiated changes for better traceability and clarity in update origins.

---------

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2025-01-24 12:09:17 -05:00
9f5b05f936 Alerting: Add metadata field with editor_settings to alert rule (#93245) 2024-09-19 16:43:41 +02:00
8bcd9c2594 Identity: Remove typed id (#91801)
* Refactor identity struct to store type in separate field

* Update ResolveIdentity to take string representation of typedID

* Add IsIdentityType to requester interface

* Use IsIdentityType from interface

* Remove usage of TypedID

* Remote typedID struct

* fix GetInternalID
2024-08-13 10:18:28 +02:00
b67bcdb9b8 Alerting: Handle namespace and group query string params in Ruler API (#91533)
* Handle namespace and group query string params in Ruler API

* Use the new namespace and group query params when slashes in names

* Add validation, add group handling in GMA Api

* Move constants

* Use checkForPathSeparator function

* Fix linter issue
2024-08-13 08:31:07 +02:00
bcfb66b416 Identity: remove GetTypedID (#91745) 2024-08-09 18:20:24 +03:00
9db3bc926e Identity: Rename "namespace" to "type" in the requester interface (#90567) 2024-07-25 12:52:14 +03:00
99d8025829 Chore: Move identity and errutil to apimachinery module (#89116) 2024-06-13 07:11:35 +03:00
543f0ae37e Alerting: Update ListAlertRulesQuery to take a slice of RuleGroups (#88385)
* Change ListAlertRulesQuery to take RuleGroup slice instead

* Change func name

* Change func name

* Fix fakes

* Fix function arg
2024-05-29 11:50:33 +01:00
8418aca823 Alerting: Add single rule checks to alert rule access control (#88307)
* Alerting: Add single rule checks to alert rule access control

Modifies ruler api single rule read to no longer fetch entire groups and instead
 use the new single rule ac check.
Simplifies provisioning api getAlertRuleAuthorized logic to always load a single
 rule instead of conditionally loading the entire group when provisioning
 permissions are not present.

* Swap out Has/AuthorizeAccessToRule for Has/AuthorizeAccessInFolder
2024-05-28 10:49:24 -04:00
006d0021e3 Alerting: Remove requirement for datasource query on rule read (#87349)
* Remove requirement for datasource query for rule read

* Address PR comments
2024-05-23 12:44:30 -04:00
49c8deb1ea Alerting: Add recording rules to ruler API and validation (#87779)
* Read path, main API

* Define record field for incoming requests

* Refactor several alerting specific validators into two paths

* Refactor validateCondition actually contain all the condition validation logic

* Move condition validation inside rule path

* Validators for recording rules

* Wire feature flag through to validators

* Test for accepting a valid recording rule

* Tests for negative case, no UID

* Test for ignoring alerting fields

* Build conditions based on recording rules as well

* Regenerate swagger docs

* Fix CRUD test to cover the right thing

* Re-generate swagger docs with backdated v0.30.2 version

* Regenerate base spec

* Regenerate ngalert specs

* Regenerate top level specs

* Comment and rename

* Return struct instead of modifying ref
2024-05-21 14:39:28 -05:00
e39658097f Alerting: Wire recording rules feature toggle into limits struct (#87778)
Wire toggle into limits
2024-05-14 07:44:14 -05:00
df25e9197e Alerting: Get grafana-managed alert rule by UID (#86845)
* Add auth checks and test

* Check user is authorized to view rule and add tests

* Change naming

* Update Swagger params

* Update auth test and swagger gen

* Update swagger gen

* Change response to GettableExtendedRuleNode

* openapi3-gen

* Update tests with refactors models pkg
2024-05-02 15:24:59 +01:00
a6ad2380bf Alerting: Refactor api_prometheus.go request handlers. (#86639)
This splits the request handlers into two functions, one which is the actual
handler and one which is independent from the Grafana `ReqContext` object. This
is to make it easier to reuse the implementation in other code.

Part of the refactoring changes the functions which get query parameters from
the request to operate on a `url.Values` instead of the request object.

The change also makes the code consistently use `req.Form` instead of a
combination of `req.URL.Query()` and `req.Form`, though I have left
`api_ruler` as-is to avoid this PR growing too large.
2024-04-23 14:50:26 +02:00
a862a4264d Alerting: Export rule validation logic and make it portable (#83555)
* ValidateInterval doesn't need the entire config

* Validation no longer depends on entire folder now that we've dropped foldertitle from api

* Don't depend on entire config struct

* Export validate group
2024-02-28 14:40:13 -06:00
b7bbc5058f Alerting: Don't validate rules on group update if they've only been reordered (#81841)
---------

Co-authored-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2024-02-15 12:03:28 -05:00
1eebd2a4de Alerting: Support for simplified notification settings in rule API (#81011)
* Add notification settings to storage\domain and API models. Settings are a slice to workaround XORM mapping
* Support validation of notification settings when rules are updated

* Implement route generator for Alertmanager configuration. That fetches all notification settings.
* Update multi-tenant Alertmanager to run the generator before applying the configuration.

* Add notification settings labels to state calculation
* update the Multi-tenant Alertmanager to provide validation for notification settings

* update GET API so only admins can see auto-gen
2024-02-15 09:45:10 -05:00
99fa064576 Alerting: Emit warning when creating or updating unusually large groups (#82279)
* Add config for limit of rules per rule group

* Warn when editing big groups through normal API

* Warn on prov api writes for groups

* Wire up comp root, tests

* Also add warning to state manager warm

* Drop unnecessary conversion
2024-02-13 08:29:03 -06:00
47546a4c72 Alerting: Update API to use folders' full paths (#81214)
* update GetUserVisibleNamespaces to use FolderSeriver
* update GetNamespaceByUID to use FolderService.GetFolders
* update GetAlertRulesForScheduling to use FolderService.GetFolders 

* Update API and GetAlertRulesForScheduling to use the folder's full path
* get full path of folder in RouteTestGrafanaRuleConfig

* fix escaping of titles for MySQL
2024-02-06 17:12:13 -05:00
d1dab5828d Alerting: Update rule API to address folders by UID (#74600)
* Change ruler API to expect the folder UID as namespace

* Update example requests

* Fix tests

* Update swagger

* Modify FIle field in /api/prometheus/grafana/api/v1/rules

* Fix ruler export

* Modify folder in responses to be formatted as <parent UID>/<title>

* Add alerting test with nested folders

* Apply suggestion from code review

* Alerting: use folder UID instead of title in rule API (#77166)

Co-authored-by: Sonia Aguilar <soniaaguilarpeiron@gmail.com>

* Drop a few more latent uses of namespace_id

* move getNamespaceKey to models package

* switch GetAlertRulesForScheduling to use folder table

* update GetAlertRulesForScheduling to return folder titles in format `parent_uid/title`.

* fi tests

* add tests for GetAlertRulesForScheduling when parent uid

* fix integration tests after merge

* fix test after merge

* change format of the namespace to JSON array

this is needed for forward compatibility, when we migrate to full paths

* update EF code to decode nested folder

---------

Co-authored-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
Co-authored-by: Virginia Cepeda <virginia.cepeda@grafana.com>
Co-authored-by: Sonia Aguilar <soniaaguilarpeiron@gmail.com>
Co-authored-by: Alex Weaver <weaver.alex.d@gmail.com>
Co-authored-by: Gilles De Mey <gilles.de.mey@gmail.com>
2024-01-17 11:07:39 +02:00
cf8e8852c3 Alerting: Drop NamespaceID from responses on unstable ngalert API endpoints in favor of NamespaceUID (#79359)
* Drop from API response

* Drop from swagger docs

* Drop from integration tests

* regenerate public swagger docs

* Drop from frontend

* Drop asserts for namespaceID field
2023-12-15 11:06:53 -06:00
2be7605794 Alerting: Fix fine-grained rule access control to use 403 for authorization error (#79239)
* use 403 for authorization error
* update silences API
* add ForbiddenError to rule API responses
2023-12-07 13:43:58 -05:00
64feeddc23 Alerting: Update rule access control to return errutil errors (#78284)
* update rule access control to return errutil errors
* use alerting in msgID
2023-12-02 01:42:11 +02:00
2f2ce3edbb Chore: Deprecate ID from Folder (#78281)
* Chore: Deprecate ID from Folder

* chore: add more linter comments

* chore: add missing lint comment
2023-11-20 15:44:51 -05:00
7cec741bae Alerting: Extract alerting rules authorization logic to a service (#77006)
* extract alerting authorization logic to separate package
* convert authorization logic to service
2023-11-15 18:54:54 +02:00
Jo
580477bf8e NGAlerting: Use identity.Requester interface instead of SignedInUser (#76360)
* unfurl SignedInUserAttrs services

* replace signedInUser with Requester

replace signedInUser with requester

* fix tests

* linting

---------

Co-authored-by: Ieva <ieva.vasiljeva@grafana.com>
2023-11-14 14:47:34 +00:00
Jo
dcd0c6b11e Identity: Unfurl OrgID in pkg/services to allow using identity.Requester interface (#76113)
Unfurl OrgID in pkg/services to allow using identity.Requester interface
2023-10-09 10:40:19 +02:00
2497db4bd6 Alerting: Add UID of rules to response that were affected by update group request (#75985)
* update storage's method InstertRules to return ids of added rules as slice to keep the same order as rules in the argument
* schematize response of update rule group endpoint, add created, updated, deleted fields that contain UID of affected rules.
* update integration tests to use the new fields
2023-10-07 01:11:24 +03:00
027bd9356f Alerting: Rule Modify Export APIs (#75322)
* extend RuleStore interface to get namespace by UID
* add new export API endpoints
* implement request handlers
* update authorization and wire handlers to paths
* add folder error matchers to errorToResponse
* add tests for export methods
2023-10-02 11:47:59 -04:00
237ce5ea82 Alerting: Extract methods for fetching rule groups with authorization (#75375)
* extract methods for fetching rule groups with authorization and refactor the request handlers.
* add logging to delete handler
2023-09-26 12:45:22 -04:00
58f6648505 Chore: capitalise messages for alerting (#74335) 2023-09-04 18:46:34 +02:00
025b2f3011 Chore: use any rather than interface{} (#74066) 2023-08-30 18:46:47 +03:00
b963defa44 Alerting: update rules POST API to validate query and condition only for rules that changed. (#68667)
* replace condition validation with just structural validation
* validate conditions of only new and updated rules
* add integration tests for rule update\delete API

Co-authored-by: George Robinson <george.robinson@grafana.com>
2023-06-15 13:33:42 -04:00