* placeholder commit
* Complete function in api_convert_prometheus.go
* MVP before extensive testing
* Cleanup
* Updated tests
* cleanup
* Fix random logs and lint
* Remove comment
* Fix errors after rebase
* Update test
* Update swagger
* swagger
* Refactor to accept groups in body
* Fix auth tests and some cleanup
* Some cleanup before refactoring
* Remove unnecessary fields
* Also refactor RouteConvertPrometheusPostRuleGroup
* Remove unused code
* Rebase + cleanup
* Update authorization_test
* address comments
* Regen swagger files
* Remove namespace and group filters
* Final comments
* add query parameter to existing APIs to control the permanent deletion of rules
* add GUID to gettable rule
* add new endpoint /ruler/grafana/api/v1/trash/rule/guid/{RuleGUID} to delete rules from trash permanently
---------
Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
* Add showPolicies prop
* Add manage permissions component for easier reuse within alerting
* Add method for checking whether to show access control within alerting
* Remove accidental console.log from main
* Tweak styling for contact point width and add manage permissions drawer
* Improve typing for access control type response
* Add basic test for manage permissions on contact points list
* Only show manage permissions if grafana AM and alertingApiServer is enabled
* Update i18n
* Add test utils for turning features on and back off
* Add access control handlers
* Update tests with new util
* Pass AM in and add tests
* Receiver OSS resource permissions
There is a complication that is not fully addressed: Viewer defaults to read:*
and Editor defaults to read+write+delete:*
This is different to other resource permissions where non-admin are not granted
any global permissions and instead access is handled solely by resource-specific
permissions that are populated on create and removed on delete.
This allows them to easily remove permission to view or edit a single resource
from basic roles.
The reason this is tricky here is that we have multiple APIs that can
create/delete receivers: config api, provisioning api, and k8s receivers api.
Config api in particular is not well-equipped to determine when creates/deletes
are happening and thus ensuring that the proper resource-specific permissions
are created/deleted is finicky.
We would also have to create a migration to populate resource-specific
permissions for all current receivers. This migration would need to be reset so
it can run again if the flag is disabled.
* Add access control permissions
* Pass in contact point ID to receivers form
* Temporarily remove access control check for contact points
* Include access control metadata in k8s receiver List & Get
GET: Always included.
LIST: Included by adding a label selector with value `grafana.com/accessControl`
* Include new permissions for contact points navbar
* Fix receiver creator fixed role to not give global read
* Include in-use metadata in k8s receiver List & Get
GET: Always included.
LIST: Included by adding a label selector with value `grafana.com/inUse`
* Add receiver creator permission to receiver writer
* Add receiver creator permission to navbar
* Always allow listing receivers, don't return 403
* Remove receiver read precondition from receiver create
Otherwise, Creator role will not be able to create their first receiver
* Update routes permissions
* Add further support for RBAC in contact points
* Update routes permissions
* Update contact points header logic
* Back out test feature toggle refactor
Not working atm, not sure why
* Tidy up imports
* Update mock permissions
* Revert more test changes
* Update i18n
* Sync inuse metadata pr
* Add back canAdmin permissions after main merge
* Split out check for policies navtree item
* Tidy up utils and imports and fix rules in use
* Fix contact point tests and act warnings
* Add missing ReceiverPermissionAdmin after merge conflict
* Move contact points permissions
* Only show contact points filter when permissions are correct
* Move to constants
* Fallback to empty array and remove labelSelectors (not needed)
* Allow `toAbility` to take multiple actions
* Show builtin alertmanager if contact points permission
* Add empty state and hide templates if missing permissions
* Translations
* Tidy up mock data
* Fix tests and templates permission
* Update message for unused contact points
* Don't return 403 when user lists receivers and has access to none
* Fix receiver create not adding empty uid permissions
* Move SetDefaultPermissions to ReceiverPermissionService
* Have SetDefaultPermissions use uid from string
Fixes circular dependency
* Add FakeReceiverPermissionsService and fix test wiring
* Implement resource permission handling in provisioning API and renames
Create: Sets to default permissions
Delete: Removes permissions
Update: If receiver name is modified and the new name doesn't exist, it copies
the permissions from the old receiver to the newly created one. If old receiver
is now empty, it removes the old permissions as well.
* Split contact point permissions checks for read/modify
* Generalise getting annotation values from k8s entities
* Proxy RouteDeleteAlertingConfig through MultiOrgAlertmanager
* Cleanup permissions on config api reset and restore
* Cleanup permissions on config api POST
note this is still not available with feature flag enabled
* Gate the permission manager behind FF until initial migration is added
* Sync changes from config api PR
* Switch to named export
* Revert unnecessary changes
* Revert Filter auth change and implement in k8s api only
* Don't allow new scoped permissions to give access without FF
Prevents complications around mixed support for the scoped permissions causing
oddities in the UI.
* Fix integration tests to account for list permission change
* Move to `permissions` file
* Add additional tests for contact points
* Fix redirect for viewer on edit page
* Combine alerting test utils and move to new file location
* Allow new permissions to access provisioning export paths with FF
* Always allow exporting if its grafana flavoured
* Fix logic for showing auto generated policies
* Fix delete logic for contact point only referenced by a rule
* Suppress warning message when renaming a contact point
* Clear team and role perm cache on receiver rename
Prevents temporarily broken UI permissions after rename when a user's source of
elevated permissions comes from a cached team or basic role permission.
* Debug log failed cache clear on CopyPermissions
---------
Co-authored-by: Matt Jacobson <matthew.jacobson@grafana.com>
* Add auth checks and test
* Check user is authorized to view rule and add tests
* Change naming
* Update Swagger params
* Update auth test and swagger gen
* Update swagger gen
* Change response to GettableExtendedRuleNode
* openapi3-gen
* Update tests with refactors models pkg
Removes legacy alerting, so long and thanks for all the fish! 🐟
---------
Co-authored-by: Matthew Jacobson <matthew.jacobson@grafana.com>
Co-authored-by: Sonia Aguilar <soniaAguilarPeiron@users.noreply.github.com>
Co-authored-by: Armand Grillet <armandgrillet@users.noreply.github.com>
Co-authored-by: William Wernert <rwwiv@users.noreply.github.com>
Co-authored-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>
* Add single receiver method
* Add receiver permissions
* Add single/multi GET endpoints for receivers
* Remove stable tag from time intervals
See end of PR description here: https://github.com/grafana/grafana/pull/81672
* declare new API and models GettableTimeIntervals, PostableTimeIntervals
* add new actions alert.notifications.time-intervals:read and alert.notifications.time-intervals:write.
* update existing alerting roles with the read action. Add to all alerting roles.
* add integration tests
* ngalert openapi: Use same `basePath` as rest of Grafana
Currently, there are two issues that prevent easily merging `ngalert` and grafana openapi specs:
- The basePath is different. `grafana` has `/api` and `ngalert` has `/api/v1`. I changed `ngalert` to use `/api`
- The `ngalert` endpoints have their basePath in the each operation path. The basePath should actually be omitted
---------
Co-authored-by: Yuriy Tseretyan <yuriy.tseretyan@grafana.com>
This PR has two steps that together create a functional dry-run capability for the migration.
By enabling the feature flag alertingPreviewUpgrade when on legacy alerting it will:
a. Allow all Grafana Alerting background services except for the scheduler to start (multiorg alertmanager, state manager, routes, …).
b. Allow the UI to show Grafana Alerting pages alongside legacy ones (with appropriate in-app warnings that UA is not actually running).
c. Show a new “Alerting Upgrade” page and register associated /api/v1/upgrade endpoints that will allow the user to upgrade their organization live without restart and present a summary of the upgrade in a table.
* extend RuleStore interface to get namespace by UID
* add new export API endpoints
* implement request handlers
* update authorization and wire handlers to paths
* add folder error matchers to errorToResponse
* add tests for export methods
* Alerting: Add endpoint to revert to a previous alertmanager configuration
This endpoint is meant to be used in conjunction with /api/alertmanager/grafana/config/history to
revert to a previously applied alertmanager configuration. This is done by ID instead of raw config
string in order to avoid secure field complications.
* WIP
* skip invalid historic configurations instead of erroring
* add warning log when bad historic config is found
* remove unused custom marshaller for GettableHistoricUserConfig
* add id to historic user config, move limit check to store, fix typo
* swagger spec
* Define endpoint and generate
* Wire up and register endpoint
* Cleanup, define authorization
* Forgot the leading slash
* Wire up query and SignedInUser
* Wire up timerange query params
* Add todo for label queries
* Drop comment
* Update path to rules subtree
* Use suggested value for uid
* update the snapshot
* use __expr__
* replace all -100 with __expr__
* update snapshot
* more changes
* revert redundant change
* Use expr.DatasourceUID where it's possible
* generate files
This adds provisioning endpoints for downloading alert rules and alert rule groups in a
format that is compatible with file provisioning. Each endpoint supports both json and
yaml response types via Accept header as well as a query parameter
download=true/false that will set Content-Disposition to recommend initiating a download
or inline display.
This also makes some package changes to keep structs with potential to drift closer
together. Eventually, other alerting file structs should also move into this new file
package, but the rest require some refactoring that is out of scope for this PR.
* Implement backtesting engine that can process regular rule specification (with queries to datasource) as well as special kind of rules that have data frame instead of query.
* declare a new API endpoint and model
* add feature toggle `alertingBacktesting`
* (WIP) switch to fork AM, first implementation of the API, generate spec
* get receivers avoiding race conditions
* use latest version of our forked AM, tests
* make linter happy, delete TODO comment
* update number of expected paths to += 2
* delete unused endpoint code, code review comments, tests
* Update pkg/services/ngalert/notifier/alertmanager.go
Co-authored-by: Matthew Jacobson <matthew.jacobson@grafana.com>
* remove call to fmt.Println
* clear naming for fields
* shorter variable names in GetReceivers
Co-authored-by: Matthew Jacobson <matthew.jacobson@grafana.com>
Migrations:
* add a new column alert_group_idx to alert_rule table
* add a new column alert_group_idx to alert_rule_version table
* re-index existing rules during migration
API:
* set group index on update. Use the natural order of items in the array as group index
* sort rules in the group on GET
* update the version of all rules of all affected groups. This will make optimistic lock work in the case of multiple concurrent request touching the same groups.
UI:
* update UI to keep the order of alerts in a group
* update authz to exclude entire group if user does not have access to rule
* change rule update authz to not return changes because if user does not have access to any rule in group, they do not have access to the rule
* a new query that returns alerts in group by UID of alert that belongs to that group
* collect all affected groups during calculate changes
* update authorize to check access to groups
* update tests for calculateChanges to assert new fields
* add authorization tests
* create AlertGroupKey structure
* update PrometheusSrv.
- extract creation of RuleGroup to a separate method. Use group key for grouping
* update RuleSrv
- update calculateChanges to use groupKey
- authorize to use groupkey
* Template service
* Add GET routes and implement them
* Generate mock for persist layer
* Unit tests for reading templates
* Set up composition root and get integration tests working
* Fix prealloc issue
* Extract setup boilerplate
* Update AuthorizationTest
* Rebase and resolve
* Fix linter error
* Base-line API for provisioning notification policies
* Wire API up, some simple tests
* Return provenance status through API
* Fix missing call
* Transactions
* Clarity in package dependencies
* Unify receivers in definitions
* Fix issue introduced by receiver change
* Drop unused internal test implementation
* FGAC hooks for provisioning routes
* Polish, swap names
* Asserting on number of exposed routes
* Don't bubble up updated object
* Integrate with new concurrency token feature in store
* Back out duplicated changes
* Remove redundant tests
* Regenerate and create unit tests for API layer
* Integration tests for auth
* Address linter errors
* Put route behind toggle
* Use alternative store API and fix feature toggle in tests
* Fixes, polish
* Fix whitespace
* Re-kick drone
* Rename services to provisioning
* verify that the user has access to all data sources used by the rule that needs to be deleted from the group
* if a user is not authorized to access the rule, the rule is removed from the list to delete
update method getEvaluatorForAlertRule to accept permissions evaluator and exit on the first negative result, which is more effective than returning an evaluator that in fact is a bunch of slices.