**What this PR does / why we need it**:
This PR changes the `GeneratorURL` associated with alerts generated by
Loki. The new `GeneratorURL` uses a Grafana URL path.
**What this PR does / why we need it**:
This PR adds documentation on how to use the newly added hierarchical scheduler queues.
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
**What this PR does / why we need it**:
Add an HTTP timeout to the S3 client for operations such as GetObject
and PutObject.
Why?
The ingester flush queue has been growing without bound. This is a
difficult problem that we have been investigating for half a year. When
the querier runs a large query, the load on the S3 server becomes too
high and its connections are exhausted. The ingester flush queue then
grows continuously and never drains. When this happens, restarting the
S3 server makes the flush queue drop to 0 quickly.
We checked with our s3 (object storage compatible with the s3 protocol)
team and found that it was because we used the s3 v1 sdk. The v1 sdk did
not configure timeout by default, and it would wait indefinitely without
configuration. The s3 v2 sdk defaults to a timeout of 30s, which only
causes write failures and does not cause loki to get stuck.
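To illustrate the difference between waiting indefinitely and failing fast, here is a minimal Python sketch. It is not Loki or AWS SDK code: `start_silent_server` and `fetch` are hypothetical helpers, and the local server simply stands in for an overloaded object store that accepts connections but never answers.

```python
import socket


def start_silent_server():
    """Listen on a free local port but never accept or answer requests."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    return srv


def fetch(port, timeout):
    """Send a request and wait for a reply, giving up after `timeout` seconds.

    With timeout=None the recv() call would block forever, which mirrors
    the stuck behaviour described above for a client without a timeout.
    """
    with socket.create_connection(("127.0.0.1", port), timeout=timeout) as conn:
        conn.sendall(b"GET / HTTP/1.0\r\n\r\n")
        try:
            conn.recv(1024)
            return "response"
        except socket.timeout:
            # The caller gets an error it can retry instead of hanging.
            return "timed out"
```

Calling `fetch(srv.getsockname()[1], timeout=0.2)` against a server from `start_silent_server()` returns after roughly 0.2 seconds instead of hanging, so a flush worker in the same position could report a failure and move on.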
**Which issue(s) this PR fixes**:
At present, whenever there is a large LogQL query (for example, one
covering 30 days), I must restart the S3 server; otherwise our ingester
flush queue keeps growing. We hope to break out of the stuck goroutine
once the 5m timeout elapses.
**Special notes for your reviewer**:
- Ingester flush queue length going up 🔝
- A big query runs (for example, a 30-day LogQL query)
- pprof goroutine profile snapshot
**There are other alternatives**:
```yaml
ingester:
  flush_op_timeout: 10m
```
This `flush_op_timeout` setting is a Loki-level timeout, not an S3
client timeout. There are also other S3 operations, such as delete and
GetObject. We should add an independent timeout for S3, just as the
Cassandra client has its own configurable timeout.
In addition, in earlier versions of Loki `flush_op_timeout` was the
timeout for writing a batch of chunks to S3, not a timeout for a single
S3 request. Chunks are now written to S3 one by one, but a future change
could go back to writing a batch of chunks to S3 sequentially. In that
case, `flush_op_timeout: 5m` would be far from enough to write a whole
batch of chunks to S3.
```yaml
cassandra:
  timeout: 5m
```
**Checklist**
- [x] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [x] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
---------
Co-authored-by: Michel Hollands <42814411+MichelHollands@users.noreply.github.com>
**What this PR does / why we need it**:
This re-adds the configuration that was removed in
https://github.com/grafana/loki/pull/8515
**Checklist**
- [x] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [x] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
**What this PR does / why we need it**:
This updates the PrepareShutdown method so it supports GET and DELETE
methods as well. This makes it similar to Mimir:
https://github.com/grafana/mimir/pull/4718.
The status is now stored in a local file. A new config setting had to be
added for this file as there is no obvious place to store it.
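The file-backed status can be pictured with the following Python sketch. It only illustrates the idea of a marker file: the function name, the `"set"`/`"unset"` return values, and the assumption that POST is the method that sets the status are not taken from this PR.

```python
import os


def prepare_shutdown(method, marker_path):
    """Handle one request against a marker file (hypothetical semantics).

    POST creates the marker, DELETE removes it, and GET reports whether it
    exists. Because the state lives on disk, it survives a process restart.
    """
    if method == "GET":
        return "set" if os.path.exists(marker_path) else "unset"
    if method == "POST":
        # Create (or truncate) the marker file to record the status.
        with open(marker_path, "w"):
            pass
        return "set"
    if method == "DELETE":
        if os.path.exists(marker_path):
            os.remove(marker_path)
        return "unset"
    raise ValueError("unsupported method: " + method)
```

Keeping the status in a file rather than in memory is what makes a dedicated path setting necessary, as noted above.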
**Checklist**
- [X] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [X] Documentation added
- [X] Tests updated
- [x] `CHANGELOG.md` updated
- [x] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
---------
Signed-off-by: Michel Hollands <michel.hollands@grafana.com>
Co-authored-by: Dylan Guedes <djmgguedes@gmail.com>
**What this PR does / why we need it**:
In https://github.com/grafana/loki/pull/8972 we started caching all
index stats requests.
If the results cache gets overloaded, it can quickly take down the rest
of the loki cell due to all the increased work.
This PR adds a new flag so we can easily disable caching index stats
requests.
**Which issue(s) this PR fixes**:
This PR is a follow up for https://github.com/grafana/loki/pull/8972
**Special notes for your reviewer**:
**Checklist**
- [x] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [x] Documentation added
- [x] Tests updated
- [x] `CHANGELOG.md` updated
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
**What this PR does / why we need it**:
This PR bumps the GEL version to match the Loki version that was added
in version 5 of the helm chart. With these new versions we can now
default to 3 targets in scalable mode: `read`, `write`, and `backend`.
This PR also fixes 2 bugs when running in scalable mode:
* the gateway should forward requests to the admin api to the `backend`
target when running 3 targets
* the index gateway should default to using `ring` mode, which will
allow the queriers in the new `read` target to find the index gateway
address for their index gateway clients via the ring.
**What this PR does / why we need it**:
This PR aims to remove the manual process required to create release
notes during a release. It will allow contributors to add the label
`add-to-release-notes` to any PR. When that PR is merged, this action
will create another PR appending the original PR's # and title to the
release notes for the next release. This second PR will give the author
an opportunity to add a description to their addition, as well as give
maintainers an opportunity to discuss its relevance in the release
notes.
**What this PR does / why we need it**:
Updated the Scalability benefit to call out the decoupled read/write
paths. This is a huge competitive edge over tools like Splunk and
Elastic, so it makes sense to call it out explicitly under our
benefits.
**What this PR does / why we need it**:
Update a note to call out that the replication factor also impacts read
path behavior.
**Which issue(s) this PR fixes**:
N/A
**Special notes for your reviewer**:
**Checklist**
- [x] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
---------
Co-authored-by: J Stickler <julie.stickler@grafana.com>
**What this PR does / why we need it**:
Add two improvements to the `gcplog` target:
* the ability to ignore the `textPayload` of the log line, so that
important metadata is not truncated
* new source labels (severity and root labels of the log entry), for
more `relabel_config` possibilities
**Special notes for your reviewer**:
I'm not sure about the naming of the option `useFullLine`; feel free to
suggest another name. I guess this feature must be disabled by default
for compatibility?
**Checklist**
- [X] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [X] Documentation added
- [X] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
---------
Signed-off-by: Kevin Labesse <kevin@labesse.me>
Co-authored-by: J Stickler <julie.stickler@grafana.com>
**What this PR does / why we need it**:
- Add `eq` function
- Add example for nested if and `AND`/`OR` logic
**Which issue(s) this PR fixes**:
N/A
**Special notes for your reviewer**:
**Checklist**
- [ ] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
**What this PR does / why we need it**:
mild grammar changes
**Which issue(s) this PR fixes**:
None
**Special notes for your reviewer**:
**Checklist**
- [x] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [x] Documentation added
- [ ] ~~Tests updated~~
- [ ] ~~`CHANGELOG.md` updated~~
- [ ] ~~Changes that require user attention or interaction to upgrade
are documented in `docs/sources/upgrading/_index.md`~~
**What this PR does / why we need it**:
Since PR 6290 (commit a1e0298a5), Promtail can set the tenant ID from a
label.
However, while docs/sources/clients/promtail/stages/tenant.md was
updated accordingly, docs/sources/clients/promtail/configuration.md
was left untouched.
**Which issue(s) this PR fixes**:
**Special notes for your reviewer**:
**Checklist**
- [ ] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
**What this PR does / why we need it**:
The example config path for Loki's query scheduler uses
`-config.file=/mimir/config/mimir.yaml`, which is a little confusing in
Loki's documentation, even though there are multiple references to
Mimir.
7bec727c6d/docs/sources/operations/scalability.md (L23)
This PR replaces the Mimir config path with the one used in the
query-frontend documentation.
7bec727c6d/docs/sources/configuration/query-frontend.md (L115)
**Which issue(s) this PR fixes**:
N/A
**Special notes for your reviewer**:
I noticed this when checking configuration of 2.6, but this is still the
case in the [latest
version](https://grafana.com/docs/loki/v2.8.x/operations/scalability/).
**Checklist**
- [X] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
Changed "integrations and connections" to "connections" and removed the
lightning bolt icon reference.
---------
Co-authored-by: J Stickler <julie.stickler@grafana.com>
Removed the single binary from the scalability section, as it is
confusing there. It should also be called monolithic.
Signed-off-by: Ashwanth Goli <iamashwanth@gmail.com>
**What this PR does / why we need it**:
Currently loki initializes a single instance of index-shipper to [handle
all the table
ranges](ff7b462973/pkg/storage/factory.go (L188))
(across periods) for a given index type (`boltdb-shipper`, `tsdb`).
Since index-shipper only has the object client handle to the store
defined by `shared_store_type`, it limits the index uploads to a single
store. Setting `shared_store_type` to a different store at a later point
in time would mean losing access to the indexes stored in the previously
configured store.
With this PR, we initialize a separate index-shipper & table manager for
each period if `shared_store_type` is not explicitly configured. This
offers the flexibility to store the index in multiple stores (across
providers).
**Note**:
- usage of `shared_store_type` in this commit text refers to one of
these config options depending on the index in use:
`-boltdb.shipper.shared-store`, `-tsdb.shipper.shared-store`
- `shared_store_type` used to default to the `object_store` from the
latest `period_config` if not explicitly configured. This PR removes
these defaults in favor of supporting index uploads to multiple stores.
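The per-period routing can be pictured with a small Python sketch. This is not the Loki implementation: the `PERIODS` data, the store names, and the `shipper_store_for` function are hypothetical and only show how a table could be matched to the object store of the period it falls in.

```python
from bisect import bisect_right

# Hypothetical schema periods as (first table number, object store),
# sorted by start. Each period would get its own index-shipper.
PERIODS = [(0, "gcs"), (19000, "s3")]


def shipper_store_for(table_number, periods=PERIODS):
    """Return the object store of the period that contains the table."""
    starts = [start for start, _ in periods]
    # Find the last period whose start is <= table_number.
    idx = bisect_right(starts, table_number) - 1
    if idx < 0:
        raise ValueError("table predates the first period")
    return periods[idx][1]
```

Because each table belongs to exactly one period, a table written before the switch keeps resolving to the old store while newer tables resolve to the new one.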
**Which issue(s) this PR fixes**:
Fixes #7276
**Special notes for your reviewer**:
All the instances of downloads table manager operate on the same
cacheDir. But it shouldn't be a problem as the tableRanges do not
overlap across periods.
**Checklist**
- [X] Reviewed the `CONTRIBUTING.md` guide
- [ ] Documentation added
- [X] Tests updated
- [x] `CHANGELOG.md` updated
- [x] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
---------
Signed-off-by: Ashwanth Goli <iamashwanth@gmail.com>
Co-authored-by: J Stickler <julie.stickler@grafana.com>
**What this PR does / why we need it**:
Add a new helm value `singleBinary.extraContainers` to allow running
sidecar containers in a singleBinary loki deployment. Useful for a
container that keeps the data PV from filling up.
**Checklist**
- [ ] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
Currently, Promtail always polls files every 250ms. This PR allows the
polling period to be customized: polling starts at a minimum interval,
and the interval increases exponentially, up to a bounded maximum, while
the file has no changes.
When file changes are detected, the polling interval resets to the
minimum.
The default behavior is the existing behavior; both the minimum and
maximum polling intervals are set to 250ms.
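The backoff described above can be sketched in a few lines of Python. This is an illustration rather than the Promtail code: the function name and the choice to double the interval on each idle poll are assumptions, since the description only says the growth is exponential.

```python
def next_poll_interval(current, changed, min_interval=0.25, max_interval=0.25):
    """Return the next polling interval in seconds.

    A detected change resets the interval to the minimum. Otherwise the
    interval doubles (assumed growth factor), capped at the maximum. With
    the defaults (min == max == 0.25) the result is always 0.25, which
    matches the existing fixed 250ms behaviour.
    """
    if changed:
        return min_interval
    return min(current * 2, max_interval)
```

For example, with a minimum of 0.25s and a maximum of 2s, an idle file would be polled after 0.25s, 0.5s, 1s, and then every 2s until it changes again.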
I'm opening this in draft to get early feedback. There are a few open
questions:
* Where can I add validations to ensure that the min/max polling
intervals are configured properly?
* Is this the right approach for implementation? Do I need to make any
changes?
* Does this need tests?
I'll update the CHANGELOG and documentation once we've agreed on an
approach.
**What this PR does / why we need it**:
The step "Repeat step 1.3, which will use the new image" is the same as
calling `make drone`, so it is not required.
**Checklist**
- [ ] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
**What this PR does / why we need it**:
Adds a new `compression` configuration to Promtail to customize
decompression behavior.
As of now, two options are supported:
- `enabled`: false by default. If true, Promtail infers the compression
algorithm to use from the file extension.
- `initialDelay`: 0 by default. If greater than 0, the decompressor
sleeps for the given duration before trying to decompress. This avoids
scenarios where the decompressor starts working on a file that is still
being compressed or generated.
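As a rough picture of how the two options interact, here is a Python sketch. It is not Promtail code: the `read_lines` helper and the extension table are assumptions made for illustration, and the set of extensions Promtail actually recognizes may differ.

```python
import bz2
import gzip
import os
import time

# Assumed mapping from file extension to a decompressing opener.
OPENERS = {".gz": gzip.open, ".bz2": bz2.open}


def read_lines(path, enabled=False, initial_delay=0.0):
    """Read text lines, decompressing by extension when enabled.

    The initial delay only applies when a decompressor is selected; it
    gives a writer time to finish producing the compressed file.
    """
    opener = open
    if enabled:
        ext = os.path.splitext(path)[1]
        opener = OPENERS.get(ext, open)
        if opener is not open and initial_delay > 0:
            time.sleep(initial_delay)
    with opener(path, "rt") as fh:
        return [line.rstrip("\n") for line in fh]
```

Files with an unrecognized extension fall back to plain reading in this sketch, so enabling the option does not affect uncompressed logs.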
**Which issue(s) this PR fixes**:
Fixes https://github.com/grafana/loki/issues/8784
**What this PR does / why we need it**:
Mention python-logging-loki as an unofficial client, since it looks like a reasonably well-made client.
This PR is a continuation from https://github.com/grafana/loki/pull/8636. 😅 CLA has been signed!
**What this PR does / why we need it**:
This PR fixes the link to the configuration in the Azure Event Hubs
scraping doc.
Link to the discussion
https://github.com/grafana/loki/pull/8787#discussion_r1145298591
**Which issue(s) this PR fixes**:
Not available
**Checklist**
- [x] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [x] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
**What this PR does / why we need it**:
Reports used to be sent to a fixed URL, `usageStatsURL`.
This URL is now configurable, with the old URL as the default.
Updated the tests to use the config for custom URLs instead of changing
a package variable.
**Which issue(s) this PR fixes**:
Fixes #6502
**Special notes for your reviewer**:
**Checklist**
- [x] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [x] Documentation added
- [x] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
**What this PR does / why we need it**:
Fixes typos and grammar. Most of them were occurrences of `it's` that
should be `its`.
**Which issue(s) this PR fixes**:
None
**Special notes for your reviewer**:
**Checklist**
- [x] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
**What this PR does / why we need it**:
Describes exactly how the `start` and `end` timestamps behave at the
time range boundaries.
**Checklist**
- [x] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [x] Documentation added
---------
Co-authored-by: J Stickler <julie.stickler@grafana.com>
**What this PR does / why we need it**:
Members of our community often ask about our release cadence, so I have
documented it for easy future reference.
---------
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Co-authored-by: J Stickler <julie.stickler@grafana.com>
**What this PR does / why we need it**:
WHAT:
- Update the examples for `contains` and `hasSuffix`
- The correct order is `functionName arg1 arg2 argN`, e.g. `contains s src`
WHY:
- The current examples for `contains` and `hasPrefix` / `hasSuffix` use
the wrong argument order, which causes a `TemplateFormatErr`.
- The example I'm referring to: `{{ if .err hasSuffix "Timeout" }} timeout
{{end}}`
**Checklist**
- [ ] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
**What this PR does / why we need it**:
https://github.com/alibaba/ilogtail/pull/685
ilogtail is a very popular log agent in China, with tens of millions of
installations. The ilogtail open source community has completed its
Loki integration. This PR supplements the Loki docs accordingly.
Demo config, `Type: flusher_loki`:
```yaml
enable: true
inputs:
  - Type: file_log
    LogPath: .
    FilePattern: simple.log
flushers:
  - Type: flusher_loki
    URL: http://localhost:3100/loki/api/v1/push
    ExternalLabels:
      source: ilogtail
```
**Which issue(s) this PR fixes**:
**Special notes for your reviewer**:
flusher_loki PR snapshot

**Checklist**
- [x] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [x] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
Signed-off-by: Edward Welch <edward.welch@grafana.com>
**What this PR does / why we need it**:
Loki does not currently split queries by time to a value smaller than
what's in the [range] of a range query.
Example
```
sum(rate({job="foo"}[2d]))
```
Imagine now this query being executed over a longer window of a few days
with a step of something like 30m.
Every step evaluation would query the last [2d] of data.
There are use cases where this is desired, specifically if you force the
step to match the value in the range. However, it is more common for
someone to accidentally use `[$__range]` instead of `[$__interval]` in
Grafana and then set the query time selector to a large value like
7 days.
This PR adds a limit which will fail queries that set the [range] value
higher than the configured limit.
It's disabled by default.
In the future it may be possible for Loki to perform splits within the
[range] and remove the need for this limit, but until then this can be
an important safeguard in clusters with a lot of data.
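The check itself can be sketched as follows in Python. This is an illustration and not Loki's parser: the regular expression only understands single-unit ranges such as `[2d]` or `[30m]`, and the function names and the convention that a limit of 0 means "disabled" are assumptions.

```python
import re

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def range_seconds(query):
    """Return the largest [range] in the query, in seconds (0 if none)."""
    found = re.findall(r"\[(\d+)([smhdw])\]", query)
    return max((int(n) * UNIT_SECONDS[u] for n, u in found), default=0)


def check_range_limit(query, limit_seconds=0):
    """Return True if the query passes the limit; 0 disables the check."""
    if limit_seconds <= 0:
        return True
    return range_seconds(query) <= limit_seconds
```

With a limit of one day (86400 seconds), the example query `sum(rate({job="foo"}[2d]))` would be rejected in this sketch, while the same query with `[30m]` would pass.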
**Which issue(s) this PR fixes**:
Fixes #8746
**Special notes for your reviewer**:
**Checklist**
- [ ] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
---------
Signed-off-by: Edward Welch <edward.welch@grafana.com>
Co-authored-by: Karsten Jeschkies <karsten.jeschkies@grafana.com>
Co-authored-by: Vladyslav Diachenko <82767850+vlad-diachenko@users.noreply.github.com>
**What this PR does / why we need it**:
- In several places, inherit the span/spanlogger from the given
context instead of instantiating a new one from scratch, which fixes
spans being orphaned on a read/write operation.
- At different places, turn spans into events. Events are lighter than
spans and by having fewer spans in the trace, trace visualization will
be cleaner without losing any details.
- Adds new spans/events to places that might be a bottleneck for our
writes/reads.
**What this PR does / why we need it**:
Add trusted profile authentication in COS client
**Which issue(s) this PR fixes**:
Fixes NA
**Checklist**
- [x] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [x] Documentation added
- [x] Tests updated
- [x] `CHANGELOG.md` updated
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
---------
Co-authored-by: shahulsonhal <shahulsonhal@gmail.com>
Co-authored-by: tareqmamari <tariq.mamari@de.ibm.com>
**What this PR does / why we need it**:
Using the [query
blocker](https://grafana.com/docs/loki/next/operations/blocking-queries/)
can be unergonomic since queries can be long, require escaping, or be
hard to copy from logs. This change enables an operator to block queries by
their hash.
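The idea of matching on a hash instead of the full query text can be sketched in Python as follows. The hash function Loki uses is not stated here; 32-bit FNV-1a is used purely as a stand-in, and `is_blocked` is a hypothetical helper.

```python
def fnv1a_32(data: bytes) -> int:
    """Compute the 32-bit FNV-1a hash of the given bytes."""
    h = 0x811C9DC5  # FNV offset basis
    for b in data:
        h ^= b
        h = (h * 0x01000193) & 0xFFFFFFFF  # FNV prime, kept to 32 bits
    return h


def is_blocked(query: str, blocked_hashes) -> bool:
    """Return True if the query's hash is in the operator's block list."""
    return fnv1a_32(query.encode("utf-8")) in blocked_hashes
```

An operator would then only need to copy a short integer from the logs into the block list, rather than a long query that needs escaping.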