This makes it so that it is:
- No longer required to have datasource permissions to delete a rule.
- No longer required to have datasource permissions to update non-query related fields of a rule.
What is this feature?
A follow-up for #101184, adds AlertRule.MissingSeriesEvalsToResolve to the APIs.
missing_series_evals_to_resolve must be specified too and it must be > 0.
POST /api/ruler/grafana/api/v1/rules/{folderUID} works in the following way:
If missing_series_evals_to_resolve is not sent or null, the rule keeps its existing value
If missing_series_evals_to_resolve > 0: updates to that value
If missing_series_evals_to_resolve = 0: resets to default (nil).
AlertRule.MissingSeriesEvalsToResolve can't be 0, so I used it to reset
In the Provisioning API, the value is just set if present and > 0. Otherwise it's reset:
PUT to /api/v1/provisioning/alert-rules/{UID}:
If missing_series_evals_to_resolve is nil, it's reset to the default value
If missing_series_evals_to_resolve > 0, it's updated
What is this feature?
This PR introduces a new alert rule configuration option, keep_firing_for (Prometheus documentation).
keep_firing_for prevents alerts from resolving immediately after the alert condition returns to normal. Instead, they transition into a "Recovering" state and are not considered resolved by the Alertmanager. Once the recovery period ends (or after the next evaluation if it is bigger than keep_firing_for), the alert transitions to "Normal" if it doesn't start alerting again:
Before
+----------+ +----------+
| Alerting |---->| Normal |
+----------+ +----------+
-----
After
+----------+ +------------+ +----------+
| Alerting |----->| Recovering |---->| Normal |
+----------+ +------------+ +----------+
Why do we need this feature?
This feature prevents flapping alerts by adding a recovery period. This helps avoid false resolutions caused by brief alert
* add query parameter to existing APIs to control the permanent deletion of rules
* add GUID to gettable rule
* add new endpoint /ruler/grafana/api/v1/trash/rule/guid/{RuleGUID} to delete rules from trash permanently
---------
Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
* add feature toggle alertRuleRestore
* Update delete rule to require UserUID, remove all versions and create "delete" version that holds information about who and when deleted the rule
* introduce new fields created_by in rule tables
* update domain model and compat layer to support UpdatedBy
* add alert rule generator mutators for UpdatedBy
* ignore UpdatedBy in diff and hash calculation
* Add user context to alert rule insert/update operations
Updated InsertAlertRules and UpdateAlertRules methods to accept a user context parameter. This change ensures auditability and better tracking of user actions when creating or updating alert rules. Adjusted all relevant calls and interfaces to pass the user context accordingly.
* set UpdatedBy in PreSave because this is where Updated is set
* Use nil userID for system-initiated updates
This ensures differentiation between system and user-initiated changes for better traceability and clarity in update origins.
---------
Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
* Refactor identity struct to store type in separate field
* Update ResolveIdentity to take string representation of typedID
* Add IsIdentityType to requester interface
* Use IsIdentityType from interface
* Remove usage of TypedID
* Remote typedID struct
* fix GetInternalID
* Handle namespace and group query string params in Ruler API
* Use the new namespace and group query params when slashes in names
* Add validation, add group handling in GMA Api
* Move constants
* Use checkForPathSeparator function
* Fix linter issue
* Alerting: Add single rule checks to alert rule access control
Modifies ruler api single rule read to no longer fetch entire groups and instead
use the new single rule ac check.
Simplifies provisioning api getAlertRuleAuthorized logic to always load a single
rule instead of conditionally loading the entire group when provisioning
permissions are not present.
* Swap out Has/AuthorizeAccessToRule for Has/AuthorizeAccessInFolder
* Read path, main API
* Define record field for incoming requests
* Refactor several alerting specific validators into two paths
* Refactor validateCondition actually contain all the condition validation logic
* Move condition validation inside rule path
* Validators for recording rules
* Wire feature flag through to validators
* Test for accepting a valid recording rule
* Tests for negative case, no UID
* Test for ignoring alerting fields
* Build conditions based on recording rules as well
* Regenerate swagger docs
* Fix CRUD test to cover the right thing
* Re-generate swagger docs with backdated v0.30.2 version
* Regenerate base spec
* Regenerate ngalert specs
* Regenerate top level specs
* Comment and rename
* Return struct instead of modifying ref
* Add auth checks and test
* Check user is authorized to view rule and add tests
* Change naming
* Update Swagger params
* Update auth test and swagger gen
* Update swagger gen
* Change response to GettableExtendedRuleNode
* openapi3-gen
* Update tests with refactors models pkg
This splits the request handlers into two functions, one which is the actual
handler and one which is independent from the Grafana `ReqContext` object. This
is to make it easier to reuse the implementation in other code.
Part of the refactoring changes the functions which get query parameters from
the request to operate on a `url.Values` instead of the request object.
The change also makes the code consistently use `req.Form` instead of a
combination of `req.URL.Query()` and `req.Form`, though I have left
`api_ruler` as-is to avoid this PR growing too large.
* ValidateInterval doesn't need the entire config
* Validation no longer depends on entire folder now that we've dropped foldertitle from api
* Don't depend on entire config struct
* Export validate group
* Add notification settings to storage\domain and API models. Settings are a slice to workaround XORM mapping
* Support validation of notification settings when rules are updated
* Implement route generator for Alertmanager configuration. That fetches all notification settings.
* Update multi-tenant Alertmanager to run the generator before applying the configuration.
* Add notification settings labels to state calculation
* update the Multi-tenant Alertmanager to provide validation for notification settings
* update GET API so only admins can see auto-gen
* Add config for limit of rules per rule group
* Warn when editing big groups through normal API
* Warn on prov api writes for groups
* Wire up comp root, tests
* Also add warning to state manager warm
* Drop unnecessary conversion
* update GetUserVisibleNamespaces to use FolderSeriver
* update GetNamespaceByUID to use FolderService.GetFolders
* update GetAlertRulesForScheduling to use FolderService.GetFolders
* Update API and GetAlertRulesForScheduling to use the folder's full path
* get full path of folder in RouteTestGrafanaRuleConfig
* fix escaping of titles for MySQL
* Change ruler API to expect the folder UID as namespace
* Update example requests
* Fix tests
* Update swagger
* Modify FIle field in /api/prometheus/grafana/api/v1/rules
* Fix ruler export
* Modify folder in responses to be formatted as <parent UID>/<title>
* Add alerting test with nested folders
* Apply suggestion from code review
* Alerting: use folder UID instead of title in rule API (#77166)
Co-authored-by: Sonia Aguilar <soniaaguilarpeiron@gmail.com>
* Drop a few more latent uses of namespace_id
* move getNamespaceKey to models package
* switch GetAlertRulesForScheduling to use folder table
* update GetAlertRulesForScheduling to return folder titles in format `parent_uid/title`.
* fi tests
* add tests for GetAlertRulesForScheduling when parent uid
* fix integration tests after merge
* fix test after merge
* change format of the namespace to JSON array
this is needed for forward compatibility, when we migrate to full paths
* update EF code to decode nested folder
---------
Co-authored-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
Co-authored-by: Virginia Cepeda <virginia.cepeda@grafana.com>
Co-authored-by: Sonia Aguilar <soniaaguilarpeiron@gmail.com>
Co-authored-by: Alex Weaver <weaver.alex.d@gmail.com>
Co-authored-by: Gilles De Mey <gilles.de.mey@gmail.com>
* Drop from API response
* Drop from swagger docs
* Drop from integration tests
* regenerate public swagger docs
* Drop from frontend
* Drop asserts for namespaceID field
* update storage's method InstertRules to return ids of added rules as slice to keep the same order as rules in the argument
* schematize response of update rule group endpoint, add created, updated, deleted fields that contain UID of affected rules.
* update integration tests to use the new fields
* extend RuleStore interface to get namespace by UID
* add new export API endpoints
* implement request handlers
* update authorization and wire handlers to paths
* add folder error matchers to errorToResponse
* add tests for export methods
* replace condition validation with just structural validation
* validate conditions of only new and updated rules
* add integration tests for rule update\delete API
Co-authored-by: George Robinson <george.robinson@grafana.com>