This PR introduces "admission lanes" for dispatching tasks to the runner (scheduler). The lanes are controlled by token buckets to avoid overwhelming the runner with a large number of tasks.
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
**What this PR does / why we need it**:
This PR introduces sharding to the new query engine so that the engine can be tested using the existing Loki architecture with query frontend, query scheduler, and queriers.
**Note, this is only an interim solution used for validating and testing.**
By design, the first phase of the query engine implementation only performs local execution of queries, without sharding or time-based splitting.
However, to test the results of this first phase, we can utilise dual-ingestion (writing both chunks/tsdb and data objects) in a way that sharding and splitting are performed in the query frontend using the existing middlewares.
On the queriers themselves, the sub-queries are parsed and planned using the new logical and physical planner. If the query can successfully be planned, it will be executed by the new engine, otherwise it falls back to the old engine.
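
The fallback behaviour described above can be sketched roughly as follows. All names here (`engine`, `errNotSupported`, `withFallback`) are hypothetical and only illustrate the control flow; the real planner and engine interfaces are different.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errNotSupported is a hypothetical sentinel returned when the new planner
// cannot plan a query.
var errNotSupported = errors.New("query not supported by new engine")

// engine is a hypothetical stand-in for a query engine.
type engine func(query string) (string, error)

// withFallback tries the new engine first and uses the old engine only if
// the query could not be planned.
func withFallback(newEngine, oldEngine engine) engine {
	return func(query string) (string, error) {
		res, err := newEngine(query)
		if errors.Is(err, errNotSupported) {
			return oldEngine(query)
		}
		return res, err
	}
}

func main() {
	// Assumption for the example: the new engine cannot plan parser stages.
	newEngine := func(q string) (string, error) {
		if strings.Contains(q, "| json") {
			return "", errNotSupported
		}
		return "executed by new engine", nil
	}
	oldEngine := func(q string) (string, error) {
		return "executed by old engine", nil
	}

	run := withFallback(newEngine, oldEngine)
	fmt.Println(run(`{app="foo"}`))
	fmt.Println(run(`{app="foo"} | json`))
}
```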
**Shard size considerations**
While time-splitting in the frontend works for data objects as well, sharding based on information from TSDB does not map directly to data objects. The default target shard size in TSDB is 600MB (decompressed), whereas the target size of a data object is 1GB compressed, or roughly 10-15GB uncompressed. However, the individual logs sections of a data object have a target size of 128MB compressed, which is roughly 0.9-1.2GB uncompressed. That is 1.5-2x larger than the TSDB target shard size. So when using the sharding calculation from TSDB, the frontend over-shards for data object sections, which is likely acceptable for testing and good enough to prove that local execution with the new engine works.
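
The 1.5-2x figure follows directly from the sizes above. A small sketch of the arithmetic, assuming decimal units (1GB = 1000MB) and using constant names invented for this example:

```go
package main

import "fmt"

const (
	tsdbShardMB  = 600.0  // TSDB target shard size (decompressed)
	sectionMinMB = 900.0  // lower estimate of an uncompressed logs section (0.9GB)
	sectionMaxMB = 1200.0 // upper estimate of an uncompressed logs section (1.2GB)
)

// ratio reports how many times larger a section is than a shard.
func ratio(sectionMB, shardMB float64) float64 {
	return sectionMB / shardMB
}

func main() {
	fmt.Printf("a logs section is %.1fx-%.1fx the TSDB target shard size\n",
		ratio(sectionMinMB, tsdbShardMB), ratio(sectionMaxMB, tsdbShardMB))
}
```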
**How does sharding with data objects in this PR work?**
The query frontend passes down the calculated shards as part of the query parameters of the serialised sub-request. The logical planner on the querier stores the Shard annotation on the `MAKETABLE` alongside the stream selector. The physical planner then uses this annotation to select only the relevant sections of the data objects resolved by the metastore lookup. During execution, only readers for the relevant sections are initialised when performing the `DataObjScan`.
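
As a rough sketch of the section filtering step: the types `shard` and `section` and the round-robin assignment by list position are assumptions made for this example. The description above does not specify how sections are mapped to shards.

```go
package main

import "fmt"

// shard identifies shard number index out of a total of `of` shards.
type shard struct{ index, of int }

// section is a hypothetical reference to a logs section of a data object.
type section struct {
	object string
	id     int
}

// filterSections keeps only the sections assigned to the given shard.
// Assumption: sections are assigned round-robin by their position in the
// list resolved from the metastore.
func filterSections(all []section, s shard) []section {
	if s.of <= 1 {
		return all // no sharding requested
	}
	var out []section
	for i, sec := range all {
		if i%s.of == s.index {
			out = append(out, sec)
		}
	}
	return out
}

func main() {
	sections := []section{
		{"obj-a", 0}, {"obj-a", 1}, {"obj-a", 2}, {"obj-b", 0}, {"obj-b", 1},
	}
	// Only readers for these sections would be initialised during the scan.
	fmt.Println(filterSections(sections, shard{index: 1, of: 2}))
}
```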
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Co-authored-by: Ashwanth <iamashwanth@gmail.com>
This PR connects the already existing components of the new query engine and wires them up in the querier API.
The feature flag `-querier.engine.enable-v2-engine` controls whether supported queries should be executed using the new engine.
Note that this implementation currently does not execute any queries.
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Adds a new `--include-common-labels` flag to logcli query, which causes the output to include all common labels.
This is mostly useful with `--quiet` and `--output=jsonl` as it allows users to get structured information from Loki that has all the information that Loki can provide.
---
Signed-off-by: Jonathan Lange <jml@mumak.net>
Co-authored-by: Christian Haudum <christian.haudum@gmail.com>
Passing `--compress` to logcli will enable (or more accurately not disable) compression on the `http.Transport`, allowing Loki to return gzip-compressed payloads.
This improves overall execution time and reduces data transfer by 10-15x.
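
For context, a minimal sketch of how such a flag can map onto `http.Transport`. The helper `newTransport` is a name made up for this example; `DisableCompression` is the standard library field.

```go
package main

import (
	"fmt"
	"net/http"
)

// newTransport returns a transport that requests gzip-compressed responses
// only when compress is true.
func newTransport(compress bool) *http.Transport {
	return &http.Transport{
		// With DisableCompression set to false, the transport sends
		// "Accept-Encoding: gzip" and transparently decompresses the body.
		DisableCompression: !compress,
	}
}

func main() {
	fmt.Println(newTransport(true).DisableCompression)  // false
	fmt.Println(newTransport(false).DisableCompression) // true
}
```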
Signed-off-by: Jason Tackaberry <tack@urandom.ca>
When tailing with logcli, the `loghttp.TailResponse` is created outside of the loop. This results in stale labels in the `LabelSet` of the `TailResponse`, because unmarshalling does not clear the map.
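
The underlying behaviour can be reproduced with `encoding/json` alone: unmarshalling into a non-nil map keeps its existing entries. The type `tailResponse` below is a simplified, hypothetical stand-in for `loghttp.TailResponse`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tailResponse is a simplified, hypothetical stand-in for loghttp.TailResponse.
type tailResponse struct {
	Labels map[string]string `json:"labels"`
}

// decodeReused decodes every message into one shared value (the buggy
// pattern) and returns the number of labels seen after each message.
func decodeReused(msgs []string) ([]int, error) {
	var counts []int
	resp := tailResponse{}
	for _, m := range msgs {
		if err := json.Unmarshal([]byte(m), &resp); err != nil {
			return nil, err
		}
		counts = append(counts, len(resp.Labels))
	}
	return counts, nil
}

// decodeFresh creates a new value per message, so no labels leak between
// iterations.
func decodeFresh(msgs []string) ([]int, error) {
	var counts []int
	for _, m := range msgs {
		resp := tailResponse{}
		if err := json.Unmarshal([]byte(m), &resp); err != nil {
			return nil, err
		}
		counts = append(counts, len(resp.Labels))
	}
	return counts, nil
}

func main() {
	msgs := []string{
		`{"labels":{"app":"a","pod":"p1"}}`,
		`{"labels":{"app":"b"}}`,
	}
	fmt.Println(decodeReused(msgs)) // [2 2] <nil>: "pod" from the first message is stale
	fmt.Println(decodeFresh(msgs))  // [2 1] <nil>
}
```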
Signed-off-by: Cyrill Troxler <cyrill@nine.ch>
Iterators of various types are widely used throughout the Loki code base. With the recent code additions of the bloom filters, a new set of utility functions for iterators emerged in `github.com/grafana/loki/v3/pkg/storage/bloom/v1`. The package defines interfaces for various common types, but also provides implementations to create and compose new iterators that implement these interfaces. This new package uses Go Generics.
- However, in the current state there are multiple iterator interfaces and implementations, which adds cognitive overhead in understanding how they work and which ones should be used. The idea is to unify them into a single iterator lib for Loki.
- As a first step towards a single iterator library for the Loki code base, this PR moves the utilities from the `pkg/storage/bloom/v1` package to the `pkg/iter/v2` package.
- Second, it changes the existing `EntryIterator` and `SampleIterator` iterators to "inherit" `v2.CloseIterator` in order to expose the same function names.
- And lastly, naming conventions of iterator interfaces and structs are unified.
The basic iterator interface (defined in `pkg/iter/v2/interface.go`) looks like so:
```go
// Usage:
//
//	for it.Next() {
//		curr := it.At()
//		// do something
//	}
//	if it.Err() != nil {
//		// do something
//	}
type Iterator[T any] interface {
	Next() bool
	Err() error
	At() T
}
```
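
As an illustration of how this interface is implemented and consumed, here is a small self-contained sketch. `sliceIter` and `collect` are hypothetical names for this example and are not necessarily the helpers provided by `pkg/iter/v2`.

```go
package main

import "fmt"

// Iterator mirrors the interface shown above.
type Iterator[T any] interface {
	Next() bool
	Err() error
	At() T
}

// sliceIter is a hypothetical in-memory iterator over a slice.
type sliceIter[T any] struct {
	items []T
	pos   int // number of items consumed so far; At() reads items[pos-1]
}

func newSliceIter[T any](items []T) *sliceIter[T] {
	return &sliceIter[T]{items: items}
}

// Next advances the iterator and reports whether an item is available.
func (it *sliceIter[T]) Next() bool {
	if it.pos >= len(it.items) {
		return false
	}
	it.pos++
	return true
}

// Err always returns nil because reading from a slice cannot fail.
func (it *sliceIter[T]) Err() error { return nil }

// At returns the current item. It must only be called after Next returned true.
func (it *sliceIter[T]) At() T { return it.items[it.pos-1] }

// collect drains any Iterator into a slice, following the usage pattern
// from the interface documentation.
func collect[T any](it Iterator[T]) ([]T, error) {
	var out []T
	for it.Next() {
		out = append(out, it.At())
	}
	return out, it.Err()
}

func main() {
	items, err := collect[int](newSliceIter([]int{1, 2, 3}))
	fmt.Println(items, err) // [1 2 3] <nil>
}
```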
---
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
**What this PR does / why we need it**:
Following https://github.com/grafana/loki/pull/11123 and in order to
enable https://github.com/grafana/loki/pull/10417 the query frontend
should send the serialized LogQL AST instead of the query string to the
queriers. This enables the frontend to change the AST and inject
expressions that are not expressible in LogQL.
**Checklist**
- [ ] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [x] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] If the change is worth mentioning in the release notes, add
`add-to-release-notes` label
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/setup/upgrade/_index.md`
- [ ] For Helm chart changes bump the Helm chart version in
`production/helm/loki/Chart.yaml` and update
`production/helm/loki/CHANGELOG.md` and
`production/helm/loki/README.md`. [Example
PR](d10549e3ec)
- [ ] If the change is deprecating or removing a configuration option,
update the `deprecated-config.yaml` and `deleted-config.yaml` files
respectively in the `tools/deprecated-config-checker` directory.
[Example
PR](0d4416a4b0)
---------
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: Callum Styan <callumstyan@gmail.com>
**What this PR does / why we need it**:
This just updates the Loki build image changed in #11114.
**Checklist**
- [ ] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] If the change is worth mentioning in the release notes, add
`add-to-release-notes` label
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/setup/upgrade/_index.md`
- [ ] For Helm chart changes bump the Helm chart version in
`production/helm/loki/Chart.yaml` and update
`production/helm/loki/CHANGELOG.md` and
`production/helm/loki/README.md`. [Example
PR](d10549e3ec)
- [ ] If the change is deprecating or removing a configuration option,
update the `deprecated-config.yaml` and `deleted-config.yaml` files
respectively in the `tools/deprecated-config-checker` directory.
[Example
PR](0d4416a4b0)
**What this PR does / why we need it**:
Use the metrics namespace setting instead of hardcoding to `cortex`.
This is a follow up to (and based on)
https://github.com/grafana/loki/pull/11014.
**Checklist**
- [x] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [X] Tests updated
- [x] `CHANGELOG.md` updated
- [ ] If the change is worth mentioning in the release notes, add
`add-to-release-notes` label
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/setup/upgrade/_index.md`
- [ ] For Helm chart changes bump the Helm chart version in
`production/helm/loki/Chart.yaml` and update
`production/helm/loki/CHANGELOG.md` and
`production/helm/loki/README.md`. [Example
PR](d10549e3ec)
- [ ] If the change is deprecating or removing a configuration option,
update the `deprecated-config.yaml` and `deleted-config.yaml` files
respectively in the `tools/deprecated-config-checker` directory. <!--
TODO(salvacorts): Add example PR -->
---------
Signed-off-by: Michel Hollands <michel.hollands@gmail.com>
Co-authored-by: Ashwanth <iamashwanth@gmail.com>
**What this PR does / why we need it**:
#### Removes `shared_store` and `shared_store_key_prefix` from index
shipper and compactor configs and their corresponding CLI flags.
- `-tsdb.shipper.shared-store`
- `-boltdb.shipper.shared-store`
- `-tsdb.shipper.shared-store.key-prefix`
- `-boltdb.shipper.shared-store.key-prefix`
- `-boltdb.shipper.compactor.shared-store`
- `-boltdb.shipper.compactor.shared-store.key-prefix`
`shared_store` has been a confusing option allowing users to easily
misconfigure Loki.
Going forward, the `object_store` setting in the
[period_config](https://grafana.com/docs/loki/latest/configure/#period_config)
(which already configures the store for chunks) will be used to
configure the store for the index.
And the newly added `path_prefix` option under the `index` key in
`period_config` will configure the path under which index tables are
stored.
This change enforces chunks and index files for a given period reside
together in the same storage bucket. More details in the upgrade guide.
---
`-compactor.delete-request-store` has to be **explicitly configured**
going forward. Without this setting, Loki does not know which object
store to use for storing delete requests. The path prefix for storing
deletes is determined by `-compactor.delete-request-store.key-prefix`,
which defaults to `index/`.
**Checklist**
- [X] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [X] Documentation added
- [X] Tests updated
- [x] `CHANGELOG.md` updated
- [ ] If the change is worth mentioning in the release notes, add
`add-to-release-notes` label
- [X] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/setup/upgrade/_index.md`
- [ ] For Helm chart changes bump the Helm chart version in
`production/helm/loki/Chart.yaml` and update
`production/helm/loki/CHANGELOG.md` and
`production/helm/loki/README.md`. [Example
PR](d10549e3ec)
**What this PR does / why we need it**:
This PR restructures the code for the shipper component of the Loki storage layer.
New package layout:
```console
$ tree -d pkg/storage/stores/shipper
pkg/storage/stores/shipper
└── indexshipper
    ├── boltdb
    │   └── compactor
    ├── compactor
    │   ├── client
    │   │   └── grpc
    │   ├── deletion
    │   ├── deletionmode
    │   ├── generationnumber
    │   └── retention
    ├── downloads
    ├── gatewayclient
    ├── index
    ├── indexgateway
    ├── storage
    ├── testutil
    ├── tsdb
    │   ├── index
    │   ├── testdata
    │   └── testutil
    ├── uploads
    └── util

23 directories
```
* TSDB and BoltDB specific code is under `./pkg/storage/stores/shipper/indexshipper/tsdb` and `./pkg/storage/stores/shipper/indexshipper/boltdb` respectively.
* Common code for both TSDB and BoltDB is directly under `./pkg/storage/stores/shipper/indexshipper` and subdirectories, such as `uploads/`, `downloads/`, `compactor/`, ...
**Special notes for your reviewer**:
This PR is identical to https://github.com/grafana/loki/pull/10724, except that instead of `pkg/storage/stores/indexshipper` the package is `pkg/storage/stores/shipper/indexshipper` (one level deeper).
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
**What this PR does / why we need it**:
This PR upgrades dskit and replaces use of packages from
weaveworks/common with their migrated equivalents in dskit. See
https://github.com/grafana/dskit/pull/342 for more details.
Note that Loki still uses some packages from weaveworks/common that I
haven't migrated (`aws` and `test`) - I'll migrate these separately.
If this PR needs to be rebuilt, I used `rewrite.sh`
([source](https://gist.github.com/charleskorn/48efe62a09d6d70f3de30327003df5c5#file-rewrite-sh))
to generate most of these changes.
**Which issue(s) this PR fixes**:
(none)
**Special notes for your reviewer**:
(none)
**Checklist**
- [x] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [n/a] Documentation added
- [n/a] Tests updated
- [n/a] `CHANGELOG.md` updated
- [n/a] If the change is worth mentioning in the release notes, add
`add-to-release-notes` label
- [n/a] Changes that require user attention or interaction to upgrade
are documented in `docs/sources/setup/upgrade/_index.md`
- [n/a] For Helm chart changes bump the Helm chart version in
`production/helm/loki/Chart.yaml` and update
`production/helm/loki/CHANGELOG.md` and
`production/helm/loki/README.md`. [Example
PR](d10549e3ec)
Co-authored-by: Kaviraj Kanagaraj <kavirajkanagaraj@gmail.com>
This PR fixes `logcli` to use the new endpoints for volume, which are
now `volume` and `volume_range` instead of `series_volume` and
`series_volume_range`. This PR also adds support for the additional
parameters `targetLabels` and `aggregateBy`.
Add `stats`, `volume`, and `volume_range` commands to `logcli`.
Does not implement the file client for now. I think it would be cool if
in the future the file client could read from a downloaded TSDB index
file for these commands.
**What this PR does / why we need it**:
Sets the TSDB shipper mode to ReadOnly and disables the indexGatewayClient
during local query runs. Also increases the index download timeout to 1m,
because users run queries from local machines that do not always have
more than 100Mbps of bandwidth.
**Which issue(s) this PR fixes**:
Fixes #9555
**Special notes for your reviewer**:
Unfortunately, we do not have similar tests and test architecture to
cover it with unit tests properly.
**Checklist**
- [x] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [ ] Tests updated
- [x] `CHANGELOG.md` updated
- [x] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
- [x] For Helm chart changes bump the Helm chart version in
`production/helm/loki/Chart.yaml` and update
`production/helm/loki/CHANGELOG.md` and
`production/helm/loki/README.md`. [Example
PR](d10549e3ec)
**What this PR does / why we need it**:
This PR improves the error messaging for Logcli when an object cannot be
downloaded from the store by printing the name of the object along with
the error.
**Which issue(s) this PR fixes**:
Internal support escalation
**Special notes for your reviewer**:
There are other places where GetObject is called:
- `indexStorageClient.GetFile`
- `indexStorageClient.GetUserFile`
Looks like both of them are used only by the compactor and the table
manager. IIUC, these functions are not used by LogCLI.
**Checklist**
- [ ] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
Signed-off-by: Edward Welch <edward.welch@grafana.com>
**What this PR does / why we need it**:
Loki does not currently split queries by time to a value smaller than
what's in the [range] of a range query.
Example
```
sum(rate({job="foo"}[2d]))
```
Imagine now this query being executed over a longer window of a few days
with a step of something like 30m.
Every step evaluation would query the last [2d] of data.
There are use cases where this is desired, specifically if you force the
step to match the value in the range. More commonly, however, someone
accidentally uses `[$__range]` instead of `[$__interval]` within Grafana
and then sets the query time selector to a large value like 7 days.
This PR adds a limit which will fail queries that set the [range] value
higher than the configured limit.
It's disabled by default.
In the future it may be possible for Loki to perform splits within the
[range] and remove the need for this limit, but until then this can be
an important safeguard in clusters with a lot of data.
**Which issue(s) this PR fixes**:
Fixes #8746
**Special notes for your reviewer**:
**Checklist**
- [ ] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
---------
Signed-off-by: Edward Welch <edward.welch@grafana.com>
Co-authored-by: Karsten Jeschkies <karsten.jeschkies@grafana.com>
Co-authored-by: Vladyslav Diachenko <82767850+vlad-diachenko@users.noreply.github.com>
**What this PR does / why we need it**:
Some end-users can impose a heavy workload on a cluster by selecting too
many streams in their queries. We should be able to limit them.
Therefore we introduce a new limit, `RequiredLabelMatchers`, which lists
label names that must be included in the stream selectors.
The implementation follows the same approach as the max query limit.
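
Conceptually, the validation compares the label names used in a stream selector against the required names. A simplified sketch operating on plain label names; `missingLabels` is a hypothetical helper, while the real implementation works on parsed matchers.

```go
package main

import "fmt"

// missingLabels returns the required label names that do not appear among
// the label names used in a stream selector.
func missingLabels(matcherNames, required []string) []string {
	present := make(map[string]struct{}, len(matcherNames))
	for _, name := range matcherNames {
		present[name] = struct{}{}
	}
	var missing []string
	for _, name := range required {
		if _, ok := present[name]; !ok {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	required := []string{"app", "namespace"}

	// Label names from a selector such as {app="foo"}.
	if missing := missingLabels([]string{"app"}, required); len(missing) > 0 {
		fmt.Printf("stream selector is missing required matchers %v\n", missing)
	}
}
```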
**Which issue(s) this PR fixes**:
Fixes #8745
**Checklist**
- [ ] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [x] Documentation added
- [x] Tests updated
- [x] `CHANGELOG.md` updated
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
In this PR we allow passing a `context.Context` via the Limits
interfaces (some of which are new, to clean up
hardcoding/embedding of `validation.Overrides`). This is based on
work/ideas by @jeschkies.
Fixes #8694
---------
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: Karsten Jeschkies <karsten.jeschkies@grafana.com>