Make sure to prune container exit codes only when the associated
container does not exist anymore. This is needed when checking whether
any container in kube-play exited non-zero, and it is a building block
for the Jira card linked below.
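A minimal sketch of the pruning rule, written with Python's built-in
sqlite3 module rather than Podman's Go code. The table and column names
(`containers`, `exit_codes`, `id`, `exit_code`) are hypothetical and
chosen only for illustration.

```python
import sqlite3


def prune_exit_codes(conn: sqlite3.Connection) -> int:
    """Delete exit codes whose container no longer exists; return rows removed."""
    cur = conn.execute(
        "DELETE FROM exit_codes WHERE id NOT IN (SELECT id FROM containers)"
    )
    conn.commit()
    return cur.rowcount


def demo() -> list:
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema, not Podman's real one.
    conn.execute("CREATE TABLE containers (id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE exit_codes (id TEXT PRIMARY KEY, exit_code INTEGER)")
    conn.execute("INSERT INTO containers VALUES ('aaa')")
    conn.executemany(
        "INSERT INTO exit_codes VALUES (?, ?)", [("aaa", 1), ("bbb", 0)]
    )
    prune_exit_codes(conn)  # 'bbb' has no container, so only its row goes away
    rows = conn.execute("SELECT id, exit_code FROM exit_codes").fetchall()
    conn.close()
    return rows
```

The exit code of the still-existing container `aaa` survives the prune,
which is what a later "did anything exit non-zero" check relies on.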
[NO NEW TESTS NEEDED] - there are no unit tests for exit code pruning.
Jira: https://issues.redhat.com/browse/RUN-1776
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
As shown in #17831, WAL mode plays a role in causing `database is locked`
errors. Those errors should not, in theory, happen as the DB should
busy wait. mattn/go-sqlite3/issues/274 has some comments indicating
that the busy handler behaves differently in WAL mode, which may be an
explanation for the error.
For now, let's disable WAL mode and only re-enable it when we have a
clearer understanding of what's going on. Neither the upstream issue
nor the SQLite documentation gives me the clear guidance that I would
need.
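For reference, a small sketch of what switching the journal mode looks
like at the SQLite level, using Python's built-in sqlite3 module. It
only illustrates the `journal_mode` pragma on a scratch database; how
Podman passes this setting to go-sqlite3 is not shown here, and the file
name is made up.

```python
import os
import sqlite3
import tempfile


def journal_modes() -> tuple:
    """Switch a scratch DB to WAL and back to the rollback journal."""
    with tempfile.TemporaryDirectory() as tmp:
        conn = sqlite3.connect(os.path.join(tmp, "scratch.db"))
        try:
            # The pragma returns the journal mode that is in effect afterwards.
            wal = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
            rollback = conn.execute("PRAGMA journal_mode=DELETE").fetchone()[0]
        finally:
            conn.close()
    return wal, rollback
```

`DELETE` is SQLite's default rollback-journal mode, so "disabling WAL"
amounts to going back to that default.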
[NO NEW TESTS NEEDED] - flake is only reproducible in CI.
Fixes: #18356
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
According to an old upstream issue [1]: "If the first statement after
BEGIN DEFERRED is a SELECT, then a read transaction is started.
Subsequent write statements will upgrade the transaction to a write
transaction if possible, or return SQLITE_BUSY."
So let's move the first SELECT under the same transaction as the table
initialization.
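The quoted behaviour can be reproduced with a short sketch using
Python's built-in sqlite3 module (not Podman's Go code). The table `t`
and the connection names are hypothetical. A deferred transaction that
starts with a SELECT holds only a read lock, so its later write fails
once another connection has taken the write lock.

```python
import os
import sqlite3
import tempfile


def deferred_upgrade_error() -> str:
    """Return the error raised when a read transaction fails to upgrade."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "scratch.db")
        # isolation_level=None lets us issue BEGIN ourselves;
        # timeout=0 disables busy waiting so the failure is immediate.
        reader = sqlite3.connect(path, timeout=0, isolation_level=None)
        writer = sqlite3.connect(path, timeout=0, isolation_level=None)
        try:
            reader.execute("CREATE TABLE t (x INTEGER)")
            reader.execute("BEGIN DEFERRED")
            reader.execute("SELECT count(*) FROM t").fetchall()  # read lock
            writer.execute("BEGIN IMMEDIATE")  # other connection: write lock
            try:
                reader.execute("INSERT INTO t VALUES (1)")  # upgrade attempt
            except sqlite3.OperationalError as err:
                return str(err)
            return ""
        finally:
            # Closing rolls back whatever is still open.
            reader.close()
            writer.close()
```

Issuing the write before the first read inside one transaction avoids
the read-to-write upgrade altogether.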
[NO NEW TESTS NEEDED] as it's a hard-to-cause race.
[1] https://github.com/mattn/go-sqlite3/issues/274#issuecomment-1429054597
Fixes: #17859
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
`Ping()` requires the DB lock, so we had to move it into a transaction
to fix #17859. Since we try to access the DB directly afterwards, I
prefer to let that fail instead of paying the cost of a transaction
which would lock the DB for _all_ processes.
[NO NEW TESTS NEEDED] as it's a hard-to-reproduce race.
Fixes: #17859
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
If a container with an ID starting with "db1" exists, and a
container named "db1" also exists, and they are different
containers - if I run `podman inspect db1` the container named
"db1" should be inspected, and there should not be an error that
multiple containers matched the name or id "db1". This was
already handled by BoltDB, and now is properly managed by SQLite.
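The intended precedence can be sketched as follows, using Python's
built-in sqlite3 module instead of Podman's Go code. The `containers`
table, its columns, and the `lookup` helper are hypothetical. An exact
name match is tried first, and an ID prefix is only considered when no
container carries that name.

```python
import sqlite3


def lookup(conn: sqlite3.Connection, name_or_id: str) -> str:
    """Resolve a container: an exact name wins over any ID-prefix match."""
    row = conn.execute(
        "SELECT id FROM containers WHERE name = ?", (name_or_id,)
    ).fetchone()
    if row is not None:
        return row[0]
    # No exact name: fall back to an ID prefix, which must be unambiguous.
    rows = conn.execute(
        "SELECT id FROM containers WHERE id LIKE ? LIMIT 2",
        (name_or_id + "%",),
    ).fetchall()
    if not rows:
        raise LookupError(f"no container matches {name_or_id!r}")
    if len(rows) > 1:
        raise LookupError(f"multiple containers match {name_or_id!r}")
    return rows[0][0]


def demo() -> tuple:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE containers (id TEXT PRIMARY KEY, name TEXT UNIQUE)")
    conn.executemany(
        "INSERT INTO containers VALUES (?, ?)",
        [("db1aaa", "web"), ("ffe222", "db1")],
    )
    result = (lookup(conn, "db1"), lookup(conn, "db1a"))
    conn.close()
    return result
```

Here `db1` resolves to the container named "db1" even though another
container's ID starts with the same characters.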
Fixes: #17905
Signed-off-by: Matt Heon <mheon@redhat.com>
The DB config is a single-row table, and the first Podman process
to run against the database creates it. However, there was a race
where multiple Podman processes, started simultaneously, could
try to write it. Only the first would succeed, with subsequent
processes failing once (and then running correctly once re-run),
but it was happening often in CI and deserves fixing.
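One possible way to make creation of a single-row table idempotent is
sketched below with Python's built-in sqlite3 module. This is an
assumption for illustration and not necessarily the approach this commit
takes; the `db_config` table, the `graph_root` column, and
`ensure_config` are hypothetical names.

```python
import sqlite3


def ensure_config(conn: sqlite3.Connection, graph_root: str) -> str:
    """Create the config row if missing and return the stored value."""
    # Taking the write lock up front means check and insert happen atomically.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute("SELECT graph_root FROM db_config").fetchone()
        if row is None:
            conn.execute(
                "INSERT INTO db_config (graph_root) VALUES (?)", (graph_root,)
            )
            row = (graph_root,)
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return row[0]


def demo() -> tuple:
    # isolation_level=None so that BEGIN/COMMIT are issued explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE db_config (graph_root TEXT NOT NULL)")
    first = ensure_config(conn, "/first")
    second = ensure_config(conn, "/second")  # row exists, value is kept
    conn.close()
    return first, second
```

A later caller sees the row written by the first one instead of failing
on its own write.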
[NO NEW TESTS NEEDED] It's a CI flake fix.
Signed-off-by: Matt Heon <mheon@redhat.com>
SQLite developers consider the shared cache a misfeature [1], and after
turning it on, we saw a new set of flakes. Let's turn it off and trust
the developers [1] that WAL mode is sufficient for our purposes.
Turning the shared cache off also makes the DB smaller and faster.
[NO NEW TESTS NEEDED]
[1] https://sqlite.org/forum/forumpost/1f291cdca4
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
The symptoms in #17859 indicate that setting the PRAGMAs in individual
EXECs outside of a transaction can lead to concurrency issues and
failures when the DB is locked. Hence set all PRAGMAs when opening
the connection. Move them into individual constants to improve
documentation and readability.
Furthermore, make transactions exclusive, as #17859 also mentions an
error that the DB is locked during a transaction.
[NO NEW TESTS NEEDED] - existing tests cover the code.
Fixes: #17859
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
<MH: Cherry-picked on top of my branch>
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
I was searching the SQLite docs for a fix, but apparently that
was the wrong place; it's a common enough error with the Go
frontend for SQLite that the fix is prominently listed in the API
docs for go-sqlite3. Setting cache mode to 'shared' and using a
maximum of 1 simultaneous open connection should fix it.
Performance implications of this are unclear, but cache=shared
sounds like it will be a benefit, not a curse.
[NO NEW TESTS NEEDED] This fixes a flake with concurrent DB
access.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
Transient mode means the DB should not persist, so we should use
the RunRoot instead of the GraphRoot.
Signed-off-by: Matt Heon <mheon@redhat.com>
Return more sensible errors than SQLite's embedded constraint
failure ones. Should fix a number of integration tests.
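The idea can be sketched as follows with Python's built-in sqlite3
module (Podman itself does this in Go). The `ContainerExistsError` type,
the `containers` table, and the message text are hypothetical. The raw
constraint failure is caught and re-raised as an error that names the
conflicting container.

```python
import sqlite3


class ContainerExistsError(Exception):
    """Hypothetical domain error for a duplicate container name or ID."""


def add_container(conn: sqlite3.Connection, ctr_id: str, name: str) -> None:
    try:
        conn.execute(
            "INSERT INTO containers (id, name) VALUES (?, ?)", (ctr_id, name)
        )
        conn.commit()
    except sqlite3.IntegrityError as err:
        conn.rollback()
        # Replace "UNIQUE constraint failed: ..." with something actionable.
        raise ContainerExistsError(
            f"container name or ID already in use: {name} ({ctr_id})"
        ) from err


def demo() -> str:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE containers (id TEXT PRIMARY KEY, name TEXT UNIQUE NOT NULL)"
    )
    add_container(conn, "aaa", "web")
    try:
        add_container(conn, "bbb", "web")  # same name, different ID
    except ContainerExistsError as err:
        return str(err)
    finally:
        conn.close()
    return ""
```

Chaining with `from err` keeps the original SQLite error available for
debugging.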
Signed-off-by: Matt Heon <mheon@redhat.com>
A partial name match is tricky as we want it to be fast but also make
sure there's only one partial match iff there's no full one.
[NO NEW TESTS NEEDED] as it fixes a system test.
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
I wasn't able to find a way to get error-checks working with the sqlite3
library with the time at hand.
[NO NEW TESTS NEEDED]
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
A number of fixes for pod creation and removal.
The important part is that matching partial IDs requires a trailing `%`
for SQL to interpret it as a wildcard. More information at
https://www.sqlitetutorial.net/sqlite-like/
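A short illustration of the wildcard point, using Python's built-in
sqlite3 module. The `pods` table and the IDs are made up. Without the
trailing `%`, `LIKE` only matches the whole string.

```python
import sqlite3


def match_ids(conn: sqlite3.Connection, pattern: str) -> list:
    """Return all pod IDs matching the given LIKE pattern."""
    rows = conn.execute(
        "SELECT id FROM pods WHERE id LIKE ? ORDER BY id", (pattern,)
    )
    return [row[0] for row in rows]


def demo() -> tuple:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pods (id TEXT PRIMARY KEY)")
    conn.executemany("INSERT INTO pods VALUES (?)", [("abc123",), ("abd456",)])
    without_wildcard = match_ids(conn, "abc")   # whole-string comparison
    with_wildcard = match_ids(conn, "abc" + "%")  # prefix match
    conn.close()
    return without_wildcard, with_wildcard
```

Only the pattern with the trailing `%` finds the pod whose ID starts
with `abc`.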
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Allow replacing existing exit codes. A container may be started and
stopped multiple times, etc.
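A minimal sketch of replacing an existing exit code, using Python's
built-in sqlite3 module. The `exit_codes` table and `record_exit_code`
are hypothetical, and `INSERT OR REPLACE` is one way to express this in
SQLite, not necessarily the statement Podman uses.

```python
import sqlite3


def record_exit_code(conn: sqlite3.Connection, ctr_id: str, code: int) -> None:
    """Store the exit code, overwriting any earlier value for the container."""
    conn.execute(
        "INSERT OR REPLACE INTO exit_codes (id, exit_code) VALUES (?, ?)",
        (ctr_id, code),
    )
    conn.commit()


def demo() -> list:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE exit_codes (id TEXT PRIMARY KEY, exit_code INTEGER NOT NULL)"
    )
    record_exit_code(conn, "aaa", 1)  # first run
    record_exit_code(conn, "aaa", 0)  # restarted and stopped again
    rows = conn.execute("SELECT id, exit_code FROM exit_codes").fetchall()
    conn.close()
    return rows
```

The second call does not fail on the primary key; it leaves a single row
holding the latest exit code.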
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
The value of -1 is used when we do not _yet_ know the exit code of the
container. Otherwise, the DB checks would error. There's probably a
smarter approach than allowing -1, but for now, that will do the trick
and let the tests progress.
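To illustrate what such a DB check looks like, here is a sketch with
Python's built-in sqlite3 module. The table and the exact `CHECK`
expression are assumptions; Podman's real schema may constrain the
column differently.

```python
import sqlite3


def accepts(code: int) -> bool:
    """Report whether the hypothetical CHECK constraint accepts the exit code."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE exit_codes ("
        " id TEXT PRIMARY KEY,"
        " exit_code INTEGER NOT NULL CHECK (exit_code >= -1))"
    )
    try:
        conn.execute("INSERT INTO exit_codes VALUES ('aaa', ?)", (code,))
        return True
    except sqlite3.IntegrityError:
        # SQLite reports a violated CHECK constraint as an integrity error.
        return False
    finally:
        conn.close()
```

With this constraint, `accepts(-1)` succeeds for the "not yet known"
case while anything below -1 is still rejected.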
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
[NO NEW TESTS NEEDED] - the sqlite backend is still in development and
is not enabled by default.
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
- Added a mechanism to check schema version and migrate
(no migrations yet since schema hasn't changed yet).
- Added pod support to AddContainer, and unified AddContainer and
RemoveContainer between containers and pods.
- Fixed newly-added GetPodName and GetCtrName in BoltDB so they
only return pod/container names.
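The schema-version check from the first item could look roughly like the
following sketch, written with Python's built-in sqlite3 module. The
`schema_version` table, `SCHEMA_VERSION`, and `MIGRATIONS` are
hypothetical names, and the migration table is empty, mirroring the fact
that no migrations exist yet.

```python
import sqlite3

SCHEMA_VERSION = 1  # hypothetical current version
MIGRATIONS = {}     # old version -> SQL upgrading it by one step; empty for now


def check_schema(conn: sqlite3.Connection) -> int:
    """Record the schema version on first use and migrate older databases."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT version FROM schema_version").fetchone()
    if row is None:
        # Fresh database: stamp it with the current version.
        conn.execute("INSERT INTO schema_version VALUES (?)", (SCHEMA_VERSION,))
        conn.commit()
        return SCHEMA_VERSION
    version = row[0]
    if version > SCHEMA_VERSION:
        raise RuntimeError(f"database schema {version} is newer than supported")
    while version < SCHEMA_VERSION:
        conn.executescript(MIGRATIONS[version])  # apply one upgrade step
        version += 1
        conn.execute("UPDATE schema_version SET version = ?", (version,))
    conn.commit()
    return version


def demo() -> tuple:
    conn = sqlite3.connect(":memory:")
    result = (check_schema(conn), check_schema(conn))
    conn.close()
    return result
```

Refusing to open a database with a newer schema is a design choice of
this sketch, not something the commit message states.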
Signed-off-by: Matt Heon <mheon@redhat.com>
This contains the implementation of (most) container functions,
with stubs for all pod and volume functions. Presently accessed
via environment variable only for testing purposes.
Signed-off-by: Matt Heon <mheon@redhat.com>