26 Commits

Author SHA1 Message Date
36c31ce841 NGAlert: Revert change to use math/rand/v2 in tests (#105661) 2025-05-20 14:47:57 +02:00
c58ac15031 Alerting: Remove grafanaManagedRecordingRules feature flag (#105569) 2025-05-19 12:15:49 +02:00
4b426238bd Dependencies: Bump github.com/openfga/openfga from v1.8.6 to v1.8.12 (#105193)
* Dependencies: Bump github.com/openfga/openfga from v1.8.6 to v1.8.12

* Linter: Replace x/exp/rand with math/rand/v2

* NGAlert: Fix test after linter fixes
2025-05-14 11:09:00 +03:00
757be6365a CI: Bump golangci-lint to 2.0.2 (#103572) 2025-04-10 14:42:23 +02:00
695ac91290 Alerting: Add backend support for keep_firing_for (#100750)
What is this feature?

This PR introduces a new alert rule configuration option, keep_firing_for (Prometheus documentation).

keep_firing_for prevents alerts from resolving immediately after the alert condition returns to normal. Instead, they transition into a "Recovering" state and are not considered resolved by the Alertmanager. Once the recovery period ends (or after the next evaluation if it is bigger than keep_firing_for), the alert transitions to "Normal" if it doesn't start alerting again:

Before                                          

+----------+     +----------+                    
| Alerting |---->|  Normal  |                    
+----------+     +----------+                    

-----
After

+----------+      +------------+     +----------+
| Alerting |----->| Recovering |---->|  Normal  |
+----------+      +------------+     +----------+                                                 

Why do we need this feature?

This feature prevents flapping alerts by adding a recovery period. This helps avoid false resolutions caused by brief alert
2025-03-18 11:24:48 +01:00
87638c0170 Alerting: Start splitting apart ngalert/api package. (#102075) 2025-03-13 09:28:35 +01:00
64c93217ff Alerting: Fix incorrect 500 code on missing alert rule dashboardUID / panelID (#96491) 2024-11-14 21:24:48 +02:00
324503ee8b Alerting: Add simplified_notifications_section field to the alert rule metadata (#95988) 2024-11-14 12:55:54 +01:00
49c8deb1ea Alerting: Add recording rules to ruler API and validation (#87779)
* Read path, main API

* Define record field for incoming requests

* Refactor several alerting specific validators into two paths

* Refactor validateCondition actually contain all the condition validation logic

* Move condition validation inside rule path

* Validators for recording rules

* Wire feature flag through to validators

* Test for accepting a valid recording rule

* Tests for negative case, no UID

* Test for ignoring alerting fields

* Build conditions based on recording rules as well

* Regenerate swagger docs

* Fix CRUD test to cover the right thing

* Re-generate swagger docs with backdated v0.30.2 version

* Regenerate base spec

* Regenerate ngalert specs

* Regenerate top level specs

* Comment and rename

* Return struct instead of modifying ref
2024-05-21 14:39:28 -05:00
e39658097f Alerting: Wire recording rules feature toggle into limits struct (#87778)
Wire toggle into limits
2024-05-14 07:44:14 -05:00
36ef611cf4 Alerting: Add database migration for recording rule fields (#87012)
* Create recording rule fields in model

* Add migration

* Write to database, support in version table

* extend fingerprint

* Force fields to be empty on validate

* Another storage spot, tests for fingerprint

* Explicitly set defaults in provisioning API

* Tests for main API validation

* Add diff tests even though fields are unpopulated for now

* Use struct tag approach instead of FromDB/ToDB hooks as it better handles nulls when deserializing

* test for deser

* Backout RecordTo for now since it's not decided in the doc

* back out of migration too

* Drop datasourceref for now

* address linter complaints

* Try a single outer struct with all fields embedded
2024-05-09 12:12:44 -05:00
533bed6d94 Alerting: Fix simplified routes '...' groupBy creating invalid routes (#86006)
* Alerting: Fix simplified routes '...' groupBy creating invalid routes

There were a few ways to go about this fix:
1. Modifying our copy of upstream validation to allow this
2. Modify our notification settings validation to prevent this
3. Normalize group by on save
4. Normalized group by on generate

Option 4. was chosen as the others have a mix of the following cons:
- Generated routes risk being incompatible with upstream/remote AM
- Awkward FE UX when using '...'
- Rule definition changing after save and potential pitfalls with TF

With option 4. generated routes stay compatible with external/remote AMs, FE
doesn't need to change as we allow mixed '...' and custom label groupBys, and
settings we save to db are the same ones requested.

In addition, it has the slight benefit of allowing us to hide the internal
implementation details of `alertname, grafana_folder` from the user in the
future, since we don't need to send them with every FE or TF request.

* Safer use of DefaultNotificationSettingsGroupBy

* Fix missed API tests
2024-04-16 12:14:39 -04:00
f2c7023fe6 fix(alerting): use uid and not rand() in tests for title (#85001) 2024-03-22 16:26:09 +02:00
a862a4264d Alerting: Export rule validation logic and make it portable (#83555)
* ValidateInterval doesn't need the entire config

* Validation no longer depends on entire folder now that we've dropped foldertitle from api

* Don't depend on entire config struct

* Export validate group
2024-02-28 14:40:13 -06:00
1eebd2a4de Alerting: Support for simplified notification settings in rule API (#81011)
* Add notification settings to storage\domain and API models. Settings are a slice to workaround XORM mapping
* Support validation of notification settings when rules are updated

* Implement route generator for Alertmanager configuration. That fetches all notification settings.
* Update multi-tenant Alertmanager to run the generator before applying the configuration.

* Add notification settings labels to state calculation
* update the Multi-tenant Alertmanager to provide validation for notification settings

* update GET API so only admins can see auto-gen
2024-02-15 09:45:10 -05:00
47546a4c72 Alerting: Update API to use folders' full paths (#81214)
* update GetUserVisibleNamespaces to use FolderSeriver
* update GetNamespaceByUID to use FolderService.GetFolders
* update GetAlertRulesForScheduling to use FolderService.GetFolders 

* Update API and GetAlertRulesForScheduling to use the folder's full path
* get full path of folder in RouteTestGrafanaRuleConfig

* fix escaping of titles for MySQL
2024-02-06 17:12:13 -05:00
cb419e799b Remove folderid service test (#80433)
* Remove FolderID from service tests

* Add models

* Add folderID pack to publicdashboard tests

* Remove folderID from dashboard tests

* Remove folderID from folders

* Remove folderID from ngalert tests

* Remove nolint comment

* Add back some tests after rebase
2024-01-12 16:43:39 +01:00
2f2ce3edbb Chore: Deprecate ID from Folder (#78281)
* Chore: Deprecate ID from Folder

* chore: add more linter comments

* chore: add missing lint comment
2023-11-20 15:44:51 -05:00
b963defa44 Alerting: update rules POST API to validate query and condition only for rules that changed. (#68667)
* replace condition validation with just structural validation
* validate conditions of only new and updated rules
* add integration tests for rule update\delete API

Co-authored-by: George Robinson <george.robinson@grafana.com>
2023-06-15 13:33:42 -04:00
52a0f59706 Alerting: introduce AlertQuery in definitions package (#63825)
* copy AlertQuery from ngmodels to the definition package
* replaces usages of ngmodels.AlertQuery in API models
* create a converter between models of AlertQuery
---------

Co-authored-by: Alex Moreno <alexander.moreno@grafana.com>
2023-03-27 11:55:13 -04:00
53945afedf Alerting: Allow alert rule pausing from API (#62326)
* Add is_paused attr to the POST alert rule group endpoint

* Add is_paused to alerting API POST alert rule group

* Fixed tests

* Add is_paused to alerting gettable endpoints

* Fix integration tests

* Alerting: allow to pause existing rules (#62401)

* Display Pause Rule switch in Editing Rule form

* add isPaused property to form interface and dto

* map isPaused prop with is_paused value from DTO

Also update test snapshots

* Append '(Paused)' text on alert list state column when appropriate

* Change Switch styles according to discussion with UX

Also adding a tooltip with info what this means

* Adjust styles

* Fix alignment and isPaused type definition

Co-authored-by: gillesdemey <gilles.de.mey@gmail.com>

* Fix test

* Fix test

* Fix RuleList test

---------

Co-authored-by: gillesdemey <gilles.de.mey@gmail.com>

* wip

* Fix tests and add comments to clarify AlertRuleWithOptionals

* Fix one more test

* Fix tests

* Fix typo in comment

* Fix alert rule(s) cannot be paused via API

* Add integration tests for alerting api pausing flow

* Remove duplicated integration test

---------

Co-authored-by: Virginia Cepeda <virginia.cepeda@grafana.com>
Co-authored-by: gillesdemey <gilles.de.mey@gmail.com>
Co-authored-by: George Robinson <george.robinson@grafana.com>
2023-02-01 13:15:03 +01:00
e5920c211e Chore: Fix random indices for slices in test files (#61884)
* Fix random indices for slices in test files

* Empty commit
2023-01-24 15:07:37 -03:00
080ea88af7 Nested Folders: Support getting of nested folder in folder service wh… (#58597)
* Nested Folders: Support getting of nested folder in folder service when feature flag is set

* Fix lint

* Fix some tests

* Fix ngalert test

* ngalert fix

* Fix API tests

* Fix some tests and lint

* Fix lint 2

* Fix library elements and panels

* Add access control to get folder

* Cleanup and minor test change
2022-11-11 14:28:24 +01:00
f5cace8bbd Rename Acl to ACL (#52342)
* Rename Acl to ACL

* Fix yaml files

* Add xorm tags and fix test
2022-07-18 15:14:58 +02:00
8b3b667a47 Alerting: Fix rule API to accept 0 duration of field For (#50992)
* make 'for' pointer to distinguish between missing field and 0
* set 'for' to -1 if the value is missing but not allow negative in the request + path -1 with the value from original rule
* update store validation to not allow negative 'for'
* update usages to use pointer
2022-06-30 11:46:26 -04:00
f75bea481d Alerting: validate rules and calculate changes in API controller (#45072)
* Update API controller
   - add validation of rules API model
   - add function to calculate changes between the submitted alerts and existing alerts
   - update RoutePostNameRulesConfig to validate input models, calculate changes and apply in a transaction

* Update DBStore
   - delete unused storage method. All the logic is moved upstream.
   - upsert to not modify fields of new by values from the existing alert
   - if rule has UID do not try to pull it from db. (it is done upstream)

* Add rule generator
2022-02-23 11:30:04 -05:00