28 Commits

Author SHA1 Message Date
Salva Corts
a4de7ad732 refactor: Move batching to pipeline wrapper (#21123) 2026-03-11 16:28:34 +01:00
Robert Fratto
b8920125c5 chore(dataobj): add configurable prefetching (#20946)
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
2026-02-24 14:07:41 -05:00
Robert Fratto
00c8211fbf chore(engine): add concurrency to Pipeline.Open (#20839) 2026-02-18 08:19:18 -05:00
Ashwanth
73961f18a3 chore(xcap): leaner xcap API (#20771) 2026-02-17 19:42:33 +05:30
Robert Fratto
11f885af7a chore(engine): fix streamsView.Open semantics (#20802) 2026-02-13 08:37:16 -05:00
Robert Fratto
2d63582b15 chore: finish adding Open semantics in engine, metastore (#20798) 2026-02-13 07:34:24 -05:00
Ashwanth
8b7161a9fc chore(engine): enforce max_query_series (#20557) 2026-01-23 18:27:35 +05:30
Stas Spiridonov
bedfb78378 chore: Thor query engine memory improvements, part 2 (#20473) 2026-01-21 09:22:39 -05:00
Salva Corts
76a21de886 feat: Add Header Propagation and Stream Filtering to V2 Query Engine (#20449) 2026-01-21 09:21:09 +01:00
Ivan Kalita
08e3c4385f feat(metastore): shard sections queries over index files (#20134)
Running metastore queries in a distributed manner using query engine workers:

- Build a distributed metastore plan from GetIndexes and execute it via the v2 scheduler/worker pipeline
- Split the request into per-index PointersScan tasks, then fan-in and CollectSections to produce final section descriptors
- Add physical plan + protobuf support for Merge/PointersScan, improve tracing/cleanup, and add coverage for planner/workflow/proto roundtrips
2025-12-22 12:56:45 +01:00
Stas Spiridonov
b6b7459435 chore: Aggregation groupings for by() and without() (#19928) 2025-12-17 12:38:01 -05:00
Ashwanth
cbad9d4ccb chore(xcap): support events, status and improve coverage (#20096) 2025-12-04 05:38:50 +00:00
Ashwanth
5bb6b54052 chore: compat should also handle metadata collisions (#20005) 2025-11-25 13:54:29 +00:00
Ashwanth
11b1dcb2a3 chore(engine): introduce execution capture (#19821) 2025-11-19 10:34:25 +05:30
Stas Spiridonov
f2c874e6ab chore: scheduler tasks are aware of their time range (#19709) 2025-11-07 12:38:17 -05:00
Trevor Whitney
8b39d4d077 refactor: implement parse as a projection (#19579)
Signed-off-by: Trevor Whitney <trevorjwhitney@gmail.com>
Co-authored-by: Christian Haudum <christian.haudum@gmail.com>
2025-10-31 11:07:47 +00:00
Robert Fratto
3c99535b0a chore(engine): provide worker (#19588)
The new worker package connects to an instance of a scheduler (#19570) 
for task assignment and execution. A worker spawns a fixed number of 
threads, each of which executes one task at a time.
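
A minimal sketch of the threading model described above, assuming tasks arrive on a Go channel. `Task` and `runWorker` are hypothetical names for illustration; the actual worker package receives its tasks from a scheduler connection and may be structured differently.

```go
package main

import "sync"

// Task is a hypothetical unit of work. In the real worker, tasks are
// assigned by a scheduler rather than sent on a local channel.
type Task func()

// runWorker starts n goroutines. Each goroutine takes one task at a time
// from tasks and runs it to completion before taking the next one.
// runWorker returns once tasks is closed and every task has finished.
func runWorker(n int, tasks <-chan Task) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for t := range tasks {
				t()
			}
		}()
	}
	wg.Wait()
}
```

With `n` fixed, at most `n` tasks run concurrently, regardless of how many are queued.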

Signed-off-by: Robert Fratto <robertfratto@gmail.com>
2025-10-28 14:53:45 +00:00
Stas Spiridonov
160dc2c493 chore: removed arrow-go allocators/retain/release (#19569) 2025-10-22 15:41:12 -04:00
Trevor Whitney
13429aa1ea refactor: move allocator into execution Context (#19550) 2025-10-21 08:38:52 -06:00
Trevor Whitney
3ce6fa2946 feat: implement unwrap as a projection (#19409) 2025-10-17 14:27:23 -06:00
Christian Haudum
290d1b711f chore(engine): Add support for drop stage (drop projection) (#19533)
This PR adds support for dropping columns from the schema.
The drop projection only supports dropping columns by name, not by matcher.
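
As a rough illustration of dropping by name, the sketch below filters a schema represented as a plain slice. `Field` and `dropColumns` are hypothetical stand-ins; the engine itself operates on Arrow schemas, and its implementation is not shown in this commit message.

```go
package main

// Field is a hypothetical stand-in for a schema column.
type Field struct {
	Name string
}

// dropColumns returns a new schema without the columns whose names match
// exactly. Names that do not occur in the schema are ignored. Matchers
// (for example regular expressions) are deliberately not supported.
func dropColumns(schema []Field, names ...string) []Field {
	drop := make(map[string]struct{}, len(names))
	for _, n := range names {
		drop[n] = struct{}{}
	}
	out := make([]Field, 0, len(schema))
	for _, f := range schema {
		if _, ok := drop[f.Name]; !ok {
			out = append(out, f)
		}
	}
	return out
}
```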
2025-10-17 14:59:13 +02:00
Robert Fratto
bd3f3dabe1 chore(engine): introduce ScanSet node (#19524)
Signed-off-by: Robert Fratto <robertfratto@gmail.com>
2025-10-16 16:26:08 -04:00
Ashwanth
baf10907bd chore: introduces TopK node and replace the usage of SortMerge (#19520) 2025-10-16 14:23:45 +00:00
Robert Fratto
3a0480698c chore(engine): add Parallelize hint node (#19521) 2025-10-16 09:10:08 -04:00
Christian Haudum
c06eb636e5 chore(engine): Add "compatibility node" to physical plan to adhere with naming of "colliding labels" in v1 engine (#19470)
### Summary

The v1 engine has a mechanism to rename labels in case they have the same name but different origin, such as labels, structured metadata, or parsed fields.

1. If a log line has a structured metadata key with the same name as a label of the stream, then the metadata key is suffixed with `_extracted`. For example, it becomes `service_extracted` if `service` exists in both `labels` and `metadata`.
2. If a parser creates a parsed field with the same name as a label of the stream, then the parsed key is suffixed with `_extracted` in the same way as in case 1. However, if the field name also collides with a structured metadata key, then the extracted structured metadata is replaced with the extracted parsed field.

This PR only implements the first case. The second case needs to be implemented in a follow-up PR. Additionally, the newly introduced "compatibility node" should be made optional, with a feature flag and/or per request.
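
The first case can be sketched as follows, using plain maps instead of the engine's column representation. `renameCollidingMetadata` is a hypothetical helper, not the function added by this PR. It does not handle a metadata key that already ends in `_extracted`.

```go
package main

// renameCollidingMetadata returns a copy of metadata in which every key
// that also exists as a stream label is suffixed with "_extracted".
// Keys that do not collide are copied unchanged.
func renameCollidingMetadata(labels, metadata map[string]string) map[string]string {
	out := make(map[string]string, len(metadata))
	for k, v := range metadata {
		if _, collides := labels[k]; collides {
			k += "_extracted"
		}
		out[k] = v
	}
	return out
}
```

With labels `{service: "api"}` and metadata `{service: "checkout", trace_id: "abc"}`, the result holds `service_extracted` and `trace_id`, and the stream label `service` is left untouched.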

Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
2025-10-12 21:00:17 +02:00
Stas Spiridonov
16dab82593 chore: consistent expressionEvaluator.eval result memory ownership (#19438) 2025-10-09 11:43:19 -04:00
Christian Haudum
119275aab7 chore(engine): Column naming conventions (#19396)
In the new engine, we need fully qualified column names, since columns from different sources can have the same name.

Right now, the distinction between columns with the same name is implemented using the `Metadata` field on the `arrow.Field`. However, it is quite cumbersome to parse the column type and data type from this generic map.

This PR introduces a package with naming conventions for columns, defined by name, data type, and column type, so that this information can be encoded into the `Name` field of the `arrow.Field`. The convention is defined as

```
[DATA_TYPE].[COLUMN_TYPE].[COLUMN_NAME]
```

#### Examples:
* `utf8.label.service_name`
* `timestamp_ns.builtin.timestamp`
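
A minimal sketch of building and parsing names in this format. `ColumnIdent`, `FQN`, and `ParseFQN` are hypothetical names, not the API of the package introduced by this PR. Splitting on the first two dots only, so that a column name containing dots is kept whole, is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// ColumnIdent is a hypothetical holder for the three parts of a
// fully qualified column name.
type ColumnIdent struct {
	DataType   string // for example "utf8"
	ColumnType string // for example "label"
	Name       string // for example "service_name"
}

// FQN renders the identifier as [DATA_TYPE].[COLUMN_TYPE].[COLUMN_NAME].
func (c ColumnIdent) FQN() string {
	return c.DataType + "." + c.ColumnType + "." + c.Name
}

// ParseFQN splits s at the first two dots. Everything after the second
// dot is treated as the column name, including any further dots.
func ParseFQN(s string) (ColumnIdent, error) {
	parts := strings.SplitN(s, ".", 3)
	if len(parts) != 3 || parts[0] == "" || parts[1] == "" || parts[2] == "" {
		return ColumnIdent{}, fmt.Errorf("invalid qualified column name %q", s)
	}
	return ColumnIdent{DataType: parts[0], ColumnType: parts[1], Name: parts[2]}, nil
}
```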

The column type can easily be converted into a `Scope`, which is defined by an origin and type.

The mapping is as follows:

```
ColumnTypeBuiltin   -> Scope{Record, Attribute}
ColumnTypeMetadata  -> Scope{Record, Builtin}
ColumnTypeLabel     -> Scope{Resource, Attribute}
ColumnTypeParsed    -> Scope{Generated, Attribute}
ColumnTypeGenerated -> Scope{Generated, Builtin}
ColumnTypeAmbiguous -> Scope{Unscoped, Attribute}
```

---
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
2025-10-08 09:06:24 +02:00
Robert Fratto
2e418b13fd chore(engine): unexport subpackages (#19384)
This moves packages around to reduce the surface area of the public engine API:

* `pkg/engine/planner` moves to `pkg/engine/internal/planner` 
* `pkg/engine/executor` moves to `pkg/engine/internal/executor` 

These packages were only used from `pkg/engine` and did not need to be public.
We may make them public again in the future if we want to expose subcomponents
of the engine. 

This move means that `pkg/engine/planner/internal/tree` became
`pkg/engine/internal/planner/internal/tree`. To reduce the import path, I also
moved that package to `pkg/engine/internal/util/tree`. 

Other than moving files and updating import paths, no code changes are made. 

Signed-off-by: Robert Fratto <robertfratto@gmail.com>
2025-10-07 17:05:51 +00:00