43 Commits

Author SHA1 Message Date
6dbc138339 prune exit codes only when container doesn't exist
Make sure to prune container exit codes only when the associated
container does not exist anymore.  This is needed when checking if any
container in kube-play exited non-zero and a building block for the
below linked Jira card.

[NO NEW TESTS NEEDED] - there are no unit tests for exit code pruning.

Jira: https://issues.redhat.com/browse/RUN-1776
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-05-25 13:14:27 +02:00
1fb3cdf8a8 sqlite: disable WAL mode
As shown in #17831, WAL mode plays a role in causing `database is locked`
errors.  Those are errors, in theory, should not happen as the DB should
busy wait.  mattn/go-sqlite3/issues/274 has some comments indicating
that the busy handler behaves differently in WAL mode which may be an
explanation to the error.

For now, let's disable WAL mode and only re-enable it when we have
clearer understanding of what's going on.  The upstream issue along with
the SQLite documentation do not give me the clear guidance that I would
need.

[NO NEW TESTS NEEDED] - flake is only reproducible in CI.

Fixes: #18356
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-05-09 15:54:26 +02:00
bbe9d61c49 sqlite: move first read into a transaction
According to an old upstream issue [1]: "If the first statement after
BEGIN DEFERRED is a SELECT, then a read transaction is started.
Subsequent write statements will upgrade the transaction to a write
transaction if possible, or return SQLITE_BUSY."

So let's move the first SELECT under the same transaction as the table
initialization.

[NO NEW TESTS NEEDED] as it's a hard to cause race.

[1] https://github.com/mattn/go-sqlite3/issues/274#issuecomment-1429054597

Fixes: #17859
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-04-25 16:01:49 +02:00
cdb5b3e990 sqlite: do not Ping() after connecting
`Ping()` requires the DB lock, so we had to move it into a transaction
to fix #17859. Since we try to access the DB directly afterwards, I
prefer to let that fail instead of paying the cost of a transaction
which would lock the DB for _all_ processes.

[NO NEW TESTS NEEDED] as it's a hard to reproduce race.

Fixes: #17859
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-28 11:27:43 +02:00
8bd9109fb8 Merge pull request #17917 from mheon/fix_17905
Ensure that SQLite state handles name-ID collisions
2023-03-27 07:48:37 -04:00
7daab31f1f Ensure that SQLite state handles name-ID collisions
If a container with an ID starting with "db1" exists, and a
container named "db1" also exists, and they are different
containers - if I run `podman inspect db1` the container named
"db1" should be inspected, and there should not be an error that
multiple containers matched the name or id "db1". This was
already handled by BoltDB, and now is properly managed by SQLite.

Fixes #17905

Signed-off-by: Matt Heon <mheon@redhat.com>
2023-03-24 15:09:25 -04:00
e061cb968c Fix a race around SQLite DB config validation
The DB config is a single-row table, and the first Podman process
to run against the database creates it. However, there was a race
where multiple Podman processes, started simultaneously, could
try and write it. Only the first would succeed, with subsequent
processes failing once (and then running correctly once re-ran),
but it was happening often in CI and deserves fixing.

[NO NEW TESTS NEEDED] It's a CI flake fix.

Signed-off-by: Matt Heon <mheon@redhat.com>
2023-03-23 19:48:27 -04:00
b31d9e15f2 sqlite: do not use shared cache
SQLite developers consider it a misfeature [1], and after turning it on,
we saw a new set of flakes.  Let's turn it off and trust the developers
[1] that WAL mode is sufficient for our purposes.

Turning the shared cache off also makes the DB smaller and faster.

[NO NEW TESTS NEEDED]

[1] https://sqlite.org/forum/forumpost/1f291cdca4

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-22 15:44:38 +01:00
3925cd653b Drop SQLite max connections
The SQLite transaction lock Valentin found is (slightly) faster.
So let's go with that.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-03-21 14:20:34 -04:00
0fbc325156 sqlite: set connection attributes on open
The symptoms in #17859 indicate that setting the PRAGMAs in individual
EXECs outside of a transaction can lead to concurrency issues and
failures when the DB is locked.  Hence set all PRAGMAs when opening
the connection.  Move them into individual constants to improve
documentation and readability.

Further make transactions exclusive as #17859 also mentions an error
that the DB is locked during a transaction.

[NO NEW TESTS NEEDED] - existing tests cover the code.

Fixes: #17859
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>

<MH: Cherry-picked on top of my branch>

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-03-21 12:51:31 -04:00
9f0e0e8331 Fix database locked errors with SQLite
I was searching the SQLite docs for a fix, but apparently that
was the wrong place; it's a common enough error with the Go
frontend for SQLite that the fix is prominently listed in the API
docs for go-sqlite3. Setting cache mode to 'shared' and using a
maximum of 1 simultaneous open connection should fix.

Performance implications of this are unclear, but cache=shared
sounds like it will be a benefit, not a curse.

[NO NEW TESTS NEEDED] This fixes a flake with concurrent DB
access.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-03-21 09:57:56 -04:00
94f905a503 Fix SQLite DB schema migration code
It now can safely run on bare databases, before any tables are
created.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2023-03-17 13:24:53 -04:00
6142c16a9c Ensure SQLite places uses the runroot in transient mode
Transient mode means the DB should not persist, so instead of
using the GraphRoot we should use the RunRoot instead.

Signed-off-by: Matt Heon <mheon@redhat.com>
2023-03-15 14:45:28 -04:00
6e0f11da5d Improve handling of existing container names in SQLite
Return more sensible errors than SQLite's embedded constraint
failure ones. Should fix a number of integration tests.

Signed-off-by: Matt Heon <mheon@redhat.com>
2023-03-15 14:44:47 -04:00
38acab832d sqlite: remove dead code
Found by golangci-lint.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-01 16:09:51 +01:00
86d12520e9 sqlite: implement RewriteVolumeConfig
[NO NEW TESTS NEEDED]

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-01 16:09:51 +01:00
df88f546b6 sqlite: LookupVolume: fix partial name match
A partial name match is tricky as we want it to be fast but also make
sure there's only one partial match iff there's no full one.

[NO NEW TESTS NEEDED] as it fixes a system test.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-01 16:09:51 +01:00
01359457c4 sqlite: LookupVolume: wrap error
Wrap the error with the message expexted by the system tests.

[NO NEW TESTS NEEDED]

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-01 16:09:51 +01:00
e87014e444 sqlite: return correct error on pod-name conflict
I wasn't able to find a way to get error-checks working with the sqlite3
library with the time at hand.

[NO NEW TESTS NEEDED]

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-01 16:09:51 +01:00
84b5c6c713 sqlite: RewritePodConfig: update error message
Use the same error message as the boltdb backend.

[NO NEW TESTS NEEDED]

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-03-01 16:09:51 +01:00
5d2d609be4 sqlite: fix volume lookups with partial names
Requires the trailing `%` to work correctly, see
        https://www.sqlitetutorial.net/sqlite-like/

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-23 13:56:58 +01:00
495314a16a sqlite: fix container lookups with partial IDs
Requires the trailing `%` to work correctly, see
	https://www.sqlitetutorial.net/sqlite-like/

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-23 13:47:32 +01:00
efe7aeb1da sqlite: fix LookupPod
To return the error message expected by the system tests.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-23 13:42:41 +01:00
19c2f37ba5 sqlite: fix pod create/rm
A number of fixes for pod creation and removal.

The important part is that matching partial IDs requires a trailing `%`
for SQL to interpret it as a wildcard.  More information at
	https://www.sqlitetutorial.net/sqlite-like/

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-23 13:38:17 +01:00
e32bea9378 sqlite: LookupContainer: update error message
As expected by the system tests.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-23 11:36:47 +01:00
565bb56454 sqlite: AddContainerExitCode: allow to replace
Allow to replace existing exit codes.  A container may be started and
stopped multiple times etc.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-23 11:30:46 +01:00
1b1cdfa357 sqlite: fix AllContainers with state
The state has been unmarshalled into the config which surfaced in wrong
states.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-23 11:19:43 +01:00
21fcc9070f sqlite: fix "UPDATE TABLE" typos
"TABLE" should refer to the actual table.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-23 10:48:11 +01:00
3f96b0ef28 sqlite: SaveVolume: fix syntax error updating the volumes table
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-23 10:35:48 +01:00
7c11f7e174 sqlite: exit code: allow -1
The value of -1 is used when we do not _yet_ know the exit code of the
container.  Otherwise, the DB checks would error.  There's probably a
smarter than allowing -1 but for now, that will do the trick and let the
tests progress.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-23 10:35:48 +01:00
e74f7bcaf3 sqlite: fix typo when removing exec sessions
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-23 10:35:48 +01:00
560805ac4c sqlite: AllContainers: fix inner join
The base table was missing, so we caused a syntax error.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-23 10:35:48 +01:00
8c64c4370f sqlite: move migration after table creation
Otherwise we'll fail immediately as the schema version is returned as 0.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-23 10:35:48 +01:00
eeabe975ea sqlite: implement pod methods
[NO NEW TESTS NEEDED] - the sqlite backend is still in development and
is not enabled by default.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-02-23 10:35:41 +01:00
03aaa8d350 Fix an incorrect comment on NewSqliteState
Signed-off-by: Matt Heon <mheon@redhat.com>
2023-02-22 19:24:36 -05:00
59a54f32dc Add support for volume operations to SQLite state
Signed-off-by: Matt Heon <mheon@redhat.com>
2023-02-22 11:00:50 -05:00
c0b92bdbc7 Implement exec session handling in SQL database
Signed-off-by: Matt Heon <mheon@redhat.com>
2023-02-22 11:00:50 -05:00
627a5b73bf Various fixes from code review
Signed-off-by: Matt Heon <mheon@redhat.com>
2023-02-22 11:00:50 -05:00
97499a70aa Implement network disconnect for SQLite state
Signed-off-by: Matt Heon <mheon@redhat.com>
2023-02-22 11:00:50 -05:00
939a4ccef4 Implement Network Connect/Modify for SQLite state
Signed-off-by: Matt Heon <mheon@redhat.com>
2023-02-22 11:00:50 -05:00
8ab18d8482 Fix various lint issues
Signed-off-by: Matt Heon <mheon@redhat.com>
2023-02-22 11:00:50 -05:00
b4c4f9c93d Some further work on SQLite state
- Added a mechanism to check schema version and migrate
  (no migrations yet since schema hasn't changed yet).
- Added pod support to AddContainer, and unified AddContainer and
  RemoveContainer between containers and pods.
- Fixed newly-added GetPodName and GetCtrName in BoltDB so they
  only return pod/container names.

Signed-off-by: Matt Heon <mheon@redhat.com>
2023-02-22 11:00:50 -05:00
1b968c6074 Add initial SQLite-backed state implementation
This contains the implementation of (most) container functions,
with stubs for all pod and volume functions. Presently accessed
via environment variable only for testing purposes.

Signed-off-by: Matt Heon <mheon@redhat.com>
2023-02-22 11:00:50 -05:00