166 Commits

Author SHA1 Message Date
f66a693438 Chore: Rename integration tests to follow the common convention (#105987)
* automatically rename integration tests to follow the common convention

* name tests differently

* alter column type to bigint

* update another column to bigint

* add another alter

* fix subquery for mysql
2025-06-29 16:56:24 +02:00
589046bcdc Alerting: Persist alert instance FiredAt field (#105927)
* Persist alert instance fired at

* Update protos and tests
2025-05-27 10:04:26 +01:00
694b9dfe50 Chore: Replace xorm.io/xorm imports (#104458)
* replace xorm.io/xorm imports

* replace xorm from other go.mod files

* clean up workspace

* nolint does not make sense anymore as it is not a module

* try if nolint directive helps

* use nolint:all for xorm

* add more nolints

* try to skip xorm in linter config

* exclude xorm differently

* retrigger ci
2025-05-02 17:13:01 +02:00
e1e1d3fd9f Fix: Prints should always include new lines (#102795)
* CI: Allow Bench conversion to fail

We shouldn't mark PRs and commits as X if they fail to convert logs with Bench.

* Fix: Prints should always include new lines

* fix: remove unused import
2025-03-27 12:27:53 +01:00
24ebacb10b Alerting: Add migration to clean up rule versions table (#102484)
* add migration to clean up rule versions
* drop index right before creating a new one.
* fetch only rules which version greater than toKeep
2025-03-20 12:34:36 -04:00
e39b17d701 Alerting: Remove constraints for uniqueness of rule title (#102067)
* fix having duplicated names in same group in the UI

---------

Co-authored-by: Sonia Aguilar <soniaaguilarpeiron@gmail.com>
2025-03-18 13:27:44 -04:00
04f20127a2 Revert "Alerting: Add an index to alert_rule_version table on (rule_org_id, rule_uid) (#102347)" (#102368)
This reverts commit 9491fa18957cff5ae7ca986d82d567d66fc402cd.
2025-03-18 14:54:45 +01:00
695ac91290 Alerting: Add backend support for keep_firing_for (#100750)
What is this feature?

This PR introduces a new alert rule configuration option, keep_firing_for (Prometheus documentation).

keep_firing_for prevents alerts from resolving immediately after the alert condition returns to normal. Instead, they transition into a "Recovering" state and are not considered resolved by the Alertmanager. Once the recovery period ends (or after the next evaluation if it is bigger than keep_firing_for), the alert transitions to "Normal" if it doesn't start alerting again:

Before                                          

+----------+     +----------+                    
| Alerting |---->|  Normal  |                    
+----------+     +----------+                    

-----
After

+----------+      +------------+     +----------+
| Alerting |----->| Recovering |---->|  Normal  |
+----------+      +------------+     +----------+                                                 

Why do we need this feature?

This feature prevents flapping alerts by adding a recovery period. This helps avoid false resolutions caused by brief alert
2025-03-18 11:24:48 +01:00
9491fa1895 Alerting: Add an index to alert_rule_version table on (rule_org_id, rule_uid) (#102347) 2025-03-18 11:15:55 +01:00
7dd6f52630 Alerting: Add MissingSeriesEvalsToResolve option to the AlertRule (#101184) 2025-03-11 22:12:06 +01:00
dc75b454f5 Alerting: Improve performance of the setting GUID during migration (#101800) 2025-03-07 12:28:07 -05:00
879b121136 Alerting: Add GUID to alert rule tables (#101321)
* add column guid to alert rule table and rule_guid to rule version table
+ populate the new field with UUID
* update storage and domain models
* patch GUID
* ignore GUID in fingerprint tests
2025-02-28 09:47:25 -05:00
cb43f4b696 Alerting: Add compressed protobuf-based alert state storage (#99193) 2025-01-27 18:47:33 +01:00
92d6762a3a Alerting: Store information about user that created\updated alert rule (#99395)
* introduce new fields created_by in rule tables
* update domain model and compat layer to support UpdatedBy
* add alert rule generator mutators for UpdatedBy
* ignore UpdatedBy in diff and hash calculation
* Add user context to alert rule insert/update operations
  Updated InsertAlertRules and UpdateAlertRules methods to accept a user context parameter. This change ensures auditability and better tracking of user actions when creating or updating alert rules. Adjusted all relevant calls and interfaces to pass the user context accordingly.

* set UpdatedBy in PreSave because this is where Updated is set
* Use nil userID for system-initiated updates
This ensures differentiation between system and user-initiated changes for better traceability and clarity in update origins.

---------

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
2025-01-24 12:09:17 -05:00
9f5b05f936 Alerting: Add metadata field with editor_settings to alert rule (#93245) 2024-09-19 16:43:41 +02:00
32f06c6d9c Alerting: Receiver API complete core implementation (#91738)
* Replace global authz abstraction with one compatible with uid scope

* Replace GettableApiReceiver with models.Receiver in receiver_svc

* GrafanaIntegrationConfig -> models.Integration

* Implement Create/Update methods

* Add optimistic concurrency to receiver API

* Add scope to ReceiversRead & ReceiversReadSecrets

migrates existing permissions to include implicit global scope

* Add receiver create, update, delete actions

* Check if receiver is used by rules before delete

* On receiver name change update in routes and notification settings

* Improve errors

* Linting

* Include read permissions are requirements for create/update/delete

* Alias ngalert/models to ngmodels to differentiate from v0alpha1 model

* Ensure integration UIDs are valid, unique, and generated if empty

* Validate integration settings on create/update

* Leverage UidToName to GetReceiver instead of GetReceivers

* Remove some unnecessary uses of simplejson

* alerting.notifications.receiver -> alerting.notifications.receivers

* validator -> provenanceValidator

* Only validate the modified receiver

stops existing invalid receivers from preventing modification of a valid
receiver.

* Improve error in Integration.Encrypt

* Remove scope from alert.notifications.receivers:create

* Add todos for receiver renaming

* Use receiverAC precondition checks in k8s api

* Linting

* Optional optimistic concurrency for delete

* make update-workspace

* More specific auth checks in k8s authorize.go

* Add debug log when delete optimistic concurrency is skipped

* Improve error message on authorizer.DecisionDeny

* Keep error for non-forbidden errutil errors
2024-08-26 10:47:53 -04:00
ba800692c6 Alerting: Persist AlertInstance ResolvedAt & LastSentAt (#89135)
* Alerting: Persist AlertInstance ResolvedAt & LastSentAt

* Fix test

* Modify existing tests

* Fix merge conflicts from nullable LastSentAt & ResolvedAt
2024-07-12 12:26:58 -04:00
36ef611cf4 Alerting: Add database migration for recording rule fields (#87012)
* Create recording rule fields in model

* Add migration

* Write to database, support in version table

* extend fingerprint

* Force fields to be empty on validate

* Another storage spot, tests for fingerprint

* Explicitly set defaults in provisioning API

* Tests for main API validation

* Add diff tests even though fields are unpopulated for now

* Use struct tag approach instead of FromDB/ToDB hooks as it better handles nulls when deserializing

* test for deser

* Backout RecordTo for now since it's not decided in the doc

* back out of migration too

* Drop datasourceref for now

* address linter complaints

* Try a single outer struct with all fields embedded
2024-05-09 12:12:44 -05:00
1eebd2a4de Alerting: Support for simplified notification settings in rule API (#81011)
* Add notification settings to storage\domain and API models. Settings are a slice to workaround XORM mapping
* Support validation of notification settings when rules are updated

* Implement route generator for Alertmanager configuration. That fetches all notification settings.
* Update multi-tenant Alertmanager to run the generator before applying the configuration.

* Add notification settings labels to state calculation
* update the Multi-tenant Alertmanager to provide validation for notification settings

* update GET API so only admins can see auto-gen
2024-02-15 09:45:10 -05:00
f6a46744a6 Alerting: Support hysteresis command expression (#75189)
Backend: 

* Update the Grafana Alerting engine to provide feedback to HysteresisCommand. The feedback information is stored in state.Manager as a fingerprint of each state. The fingerprint is persisted to the database. Only fingerprints that belong to Pending and Alerting states are considered as "loaded" and provided back to the command.
   - add ResultFingerprint to state.State. It's different from other fingerprints we store in the state because it is calculated from the result labels.
  -  add rule_fingerprint column to alert_instance
   - update alerting evaluator to accept AlertingResultsReader via context, and update scheduler to provide it.
   - add AlertingResultsFromRuleState that implements the new interface in eval package
   - update getExprRequest to patch the hysteresis command.

* Only one "Recovery Threshold" query is allowed to be used in the alert rule and it must be the Condition.


Frontend:

* Add hysteresis option to Threshold in UI. It's called "Recovery Threshold"
* Add test for getUnloadEvaluatorTypeFromCondition
* Hide hysteresis in panel expressions

* Refactor isInvalid and add test for it
* Remove unnecesary React.memo
* Add tests for updateEvaluatorConditions

---------

Co-authored-by: Sonia Aguilar <soniaaguilarpeiron@gmail.com>
2024-01-04 11:47:13 -05:00
6644e5e676 Alerting: Fix migration that is brittle to version downgrades (#78976) 2023-12-01 15:18:41 -05:00
cdad712547 Alerting: Keep track of individual org migration status (#78369)
* Alerting: Keep track of individual org migration status

Save migration status per migrated org.

Change the meaning (and key/value) of the org_id=0 entry 
to store the current (previous) config value used by alerting. 
This is so we can know when to upgrade/downgrade by 
comparing with the new config value in 
UnifiedAlerting.IsEnabled.
2023-11-30 10:25:59 -05:00
82f3127e23 Alerting: Move legacy alert migration from sqlstore migration to service (#72702) 2023-10-12 13:43:10 +01:00
f6649d7a97 Revert "Alerting: Remove vendored models in migration service" (#76387)
Revert "Alerting: Remove vendored models in migration service (#74503)"

This reverts commit 6a8649d54420bb3e649d956350922a067564ec8b.
2023-10-11 14:21:21 -05:00
6a8649d544 Alerting: Remove vendored models in migration service (#74503)
This PR replaces the vendored models in the migration with their equivalent ngalert models. It also replaces the raw SQL selects and inserts with service calls.

It also fills in some gaps in the testing suite around:

    - Migration of alert rules: verifying that the actual data model (queries, conditions) are correct 9a7cfa9
    - Secure settings migration: verifying that secure fields remain encrypted for all available notifiers and certain fields migrate from plain text to encrypted secure settings correctly e7d3993

Replacing the checks for custom dashboard ACLs will be replaced in a separate targeted PR as it will be complex enough alone.
2023-10-11 17:22:09 +01:00
ed7d29f2b9 Alerting: Migrate old alerting templates to Go templates (#62911)
* Migrate old alerting templates to use $labels

* Fix imports

* Add test coverage and separate rewriting to Go templates

* Fix lint

* Check for additional closing braces

* Add logging of invalid message templates

* Fix tests

* Small fixes

* Update comments

* Panic on empty token

* Use logtest.Fake

* Fix lint

* Allow for spaces in variable names by not tokenizing spaces

* Add template function to deduplicate Labels in a Value map

* Fix behavior of mapLookupString

* Reference deduplicated labels in migrated message template

* Fix behavior of deduplicateLabelsFunc

* Don't create variable for parent logger

* Add more tests for deduplicateLabelsFunc

* Remove unused function

* Apply suggestions from code review

Co-authored by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>

* Give label val merge function better name

* Extract template migration and escape literal tokens

* Consolidate + simplify template migration

---------

Co-authored-by: William Wernert <william.wernert@grafana.com>
2023-10-02 11:25:33 -04:00
58f6648505 Chore: capitalise messages for alerting (#74335) 2023-09-04 18:46:34 +02:00
025b2f3011 Chore: use any rather than interface{} (#74066) 2023-08-30 18:46:47 +03:00
143683bd05 Alerting: Add more clear error to migration when rule cannot be parsed (#72374) 2023-07-27 09:47:04 -04:00
c7eb7fb58a Alerting: Keep legacy alert rule maxDataPoints and intervalMs during migration (#71989)
Keep legacy alert rule max-data-points and interval during migration
2023-07-24 13:36:34 -04:00
8c6cdf51fc Alerting: No longer silence paused alerts during legacy migration (#71596)
* Alerting: No longer silence paused alerts during legacy migration

Now that we migrate paused legacy alerts to paused UA alert rules, we no longer need to silence them.
2023-07-17 09:37:53 -04:00
Jo
d6c468c1c2 Auth: Add empty role definition (#64694)
* Allow setting role as None

Co-authored-by: gamab <gabi.mabs@gmail.com>

Seeking for places where role.None would be used

Co-authored-by: Jguer <joao.guerreiro@grafana.com>

Adding None role to the frontend

Co-authored-by: Jguer <joao.guerreiro@grafana.com>

unify org role declaration and remove from add permission

fix backend test

fix backend lint

* remove role none from frontend

* Simplify checks

Co-authored-by: Kalle Persson <kalle.persson@grafana.com>

* nits

---------

Co-authored-by: Kalle Persson <kalle.persson@grafana.com>
2023-07-06 15:40:06 +02:00
00d5f7fed7 Alerting: Convert 'Both' type Prometheus queries to 'Range' in migration (#70781)
* Alerting: Convert 'Both' type Prometheus queries to 'Range' in migration
2023-06-28 14:02:57 -04:00
ff9eff49bd Alerting: Bump grafana/alerting and refactor the ImageStore/Provider to provide image URL/bytes (#70182)
* implement alerting.images.Provider interface in our ImageStore

* add URLExists() method to fakeConfigStore

* make linter happy

* update integration tests
2023-06-21 20:53:30 -03:00
a6279b2d62 Database: Make dialects independent of xorm Engine (#69955)
* make dialects independent of xorm Engine

* goimports
2023-06-14 16:13:36 -04:00
0ed5d3bdf2 Revert "Alerting: Refactor the ImageStore/Provider to provide image URL/bytes" (#69265)
Revert "Alerting: Refactor the ImageStore/Provider to provide image URL/bytes (#67693)"

This reverts commit 72a187b0be1222b8fecb63792fe3ea67d70d31ad.
2023-05-30 11:33:33 -05:00
72a187b0be Alerting: Refactor the ImageStore/Provider to provide image URL/bytes (#67693)
* (WIP) Refactor the ImageStore interface to work with our latest alerting repository

* update alerting package

* refactor, new URLExists method in ImageProvider

* tests for the new methods

* fix linter warnings

* use alertingImages as an alias for grafana/alerting/images

* logs about image uris and not found images

* nerf image not found logs

* extract duplicated code to getImageFromURI() method

* refactor getImageFromURI()

* add index on url

* add comment about migration log

* sync generated files
2023-05-30 11:25:55 -03:00
3af95bebe1 Alerting: Migrate unknown NoData\Error settings to the default (#68403)
* use default execution if legacy is not known

* update docs

* Update docs/sources/alerting/migrating-alerts/migrating-legacy-alerts.md

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>

* Update docs/sources/alerting/migrating-alerts/migrating-legacy-alerts.md

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>

---------

Co-authored-by: brendamuir <100768211+brendamuir@users.noreply.github.com>
2023-05-24 13:09:17 -04:00
0ce7f7eaf4 Alerting: Migration to not fail if alert_configuration table is not empty (#67924) 2023-05-05 12:03:53 -04:00
61484fa826 Alerting: Mention title of alert rule that caused migration to fail (#67451)
* add debug log for migration of alert rules
* add alert rule name and some information to conversion error
2023-05-01 10:32:16 -04:00
a8b4a4bb45 Alerting: Update alerting module to 20230418161049-5f374e58cb32 + refactoring (#66622)
* update to alerting 20230418161049-5f374e58cb32
* rename renamed structs in https://github.com/grafana/alerting/pull/73
* update ValidateContactPoint to use BuildReceiverConfiguration
* update logger factory according to changes
* rewrite integration builder
Co-authored-by: Santiago <santiagohernandez.1997@gmail.com>
2023-04-25 13:39:46 -04:00
035bf29146 RBAC: Remove the option to disable RBAC and add automated permission migrations for instances that had RBAC disabled (#66652)
* RBAC: Stop reading enabeld from ini file and always set to true

* Migrations: Add a migration for rbac to reset data migrations if rbac
was disabled

* If rbac was disabled we reset the data and data migrations that rbac
  has to perform to get it to a correct state

* Migrator: Store migration logs on migrator and add function to clear it from the
in-memory stored logs

* update tests

---------

Co-authored-by: Karl Persson <kalle.persson@grafana.com>
2023-04-19 16:34:19 +01:00
afd52d0866 Alerting: use alerting GrafanaReceiver and BuildReceiverConfiguration in Grafana (#65224)
* replace receiver errors with one from alerting
* add the converter to alerting models
* update buildReceiverIntegration to accept GrafanaReceiver
---------

Co-authored-by: George Robinson <george.robinson@grafana.com>
2023-04-13 12:25:32 -04:00
7b2f44762e Alerting: Update migration to put alerts to the default folder if dashboard folder is missing (#65577)
* extract function

* use context logger

* put alert to general folder if folder is missing

* move folderHelper init

* add test

* Update pkg/services/sqlstore/migrations/ualert/ualert.go

Co-authored-by: Matthew Jacobson <matthew.jacobson@grafana.com>

---------

Co-authored-by: Matthew Jacobson <matthew.jacobson@grafana.com>
2023-03-31 18:40:21 +02:00
3cfb7ac0dd Chore: Remove expr imports (#64543)
* Remove expr imports

* Add comment
2023-03-23 21:55:54 +01:00
030f6c948f Alerting: Fix migration pauses all alert rules on PostgreSQL (#63951)
This commit fixes a serious bug in Grafana 9.4.1 where on upgrade
a migration would pause all existing alert rules and change the
default value of the column to true.
2023-03-01 17:32:29 +00:00
a05bf41ff9 Alerting: Fix boolean default in migration from false to 0 (#63952)
Fix boolean default in migration from false to 0
2023-03-01 16:58:30 +00:00
b1e58eb47e Chore: Replace short UID generation with more standard UUIDs (#62731) 2023-02-06 20:44:37 -05:00
f066e8cdcd Alerting: Update to alerting 20230203015918-0e4e2675d7aa (after refactoring) (#62823)
* add alerting prefix to some packages from alerting that have similar names in prometheus alertmanager
2023-02-03 11:36:49 -05:00
00d954320f Chore: rename Id to ID in alert notification models (#62868) 2023-02-03 15:46:55 +01:00