56 Commits

Author SHA1 Message Date
Lynne
215e22d1f1 ffv1enc_vulkan: fix typo
Fixes a segfault when host mapping is unsupported.
2026-03-10 19:31:00 +01:00
Lynne
b19707103e ffv1enc_vulkan: allocate a device-only output buffer if possible
This avoids needing to map HUGE 4GiB chunks of memory.
2026-02-19 19:42:34 +01:00
Lynne
4b6396a49b ffv1enc_vulkan: allocate all results memory upfront
Suballocation is the Vulkan way.
2026-02-19 19:42:34 +01:00
Lynne
f32e70ecc9 vulkan/ffv1: unify all constants buffer into a single buffer
Fewer allocations are always better.
2026-02-19 19:42:34 +01:00
Lynne
5c1b2947a4 ffv1enc_vulkan: only return the encoded size, not its offset
The encoded offset is simply the slice index multiplied by the maximum slice size.
2026-02-19 19:42:33 +01:00
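A minimal sketch of the offset arithmetic described in the commit above, assuming the results buffer gives every slice a fixed-size region. The names `slice_offset` and `gather_slices` are hypothetical and illustrative only; they are not taken from the FFmpeg source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: byte offset of slice `index` in a results buffer
 * where each slice owns a fixed region of `max_slice_size` bytes. */
static size_t slice_offset(uint32_t index, size_t max_slice_size)
{
    return (size_t)index * max_slice_size;
}

/* Hypothetical helper: pack the encoded slices back to back into `dst`.
 * Only the per-slice encoded sizes are needed as input, because each
 * slice's offset is recomputed from its index. Returns bytes written. */
static size_t gather_slices(uint8_t *dst, const uint8_t *results,
                            const uint32_t *sizes, uint32_t nb_slices,
                            size_t max_slice_size)
{
    size_t written = 0;

    for (uint32_t i = 0; i < nb_slices; i++) {
        /* A slice can never exceed the region reserved for it. */
        assert(sizes[i] <= max_slice_size);
        memcpy(dst + written,
               results + slice_offset(i, max_slice_size),
               sizes[i]);
        written += sizes[i];
    }
    return written;
}
```

Under this assumption the shader only has to report one size per slice, since the offset carries no extra information.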
Lynne
bc968bc8b4 vulkan/ffv1_enc: cache state probabilities
4x speedup on AMD.
2026-02-19 19:42:33 +01:00
Lynne
826b72d12f vulkan/ffv1: mark buffers as uniform/readonly when needed
Should be a speedup in most cases.
2026-02-19 19:42:32 +01:00
Lynne
06eb98bc97 ffv1enc_vulkan: remove dead code 2026-02-19 19:42:31 +01:00
Lynne
3ba81f2af4 vulkan: drop support for descriptor buffers
Descriptor buffers were a neat attempt at organizing descriptors.
Simple, robust, reliable.

Unfortunately, driver support never caught on, and neither did validation
layer support.

Now they're being replaced by descriptor heaps, which promise to be
the future. We'll see how it goes.
2026-02-19 19:42:31 +01:00
Lynne
b230ba4db9 ffv1enc_vulkan: use regular descriptors for slice state 2026-02-19 19:42:30 +01:00
Lynne
da99d3f209 vulkan_ffv1: implement parallel probability adaptation 2026-02-19 19:42:30 +01:00
Lynne
b3a388e36e ffv1enc_vulkan: overhaul the synchronization
Allows for the setup and reset shaders to run in parallel.
2026-02-19 19:42:30 +01:00
Lynne
3bc265d484 ffv1enc_vulkan: make reset shader independent from the setup shader
Allows them to run in parallel.
2026-02-19 19:42:29 +01:00
Lynne
b736d1c73e ffv1enc_vulkan: convert encode shader to compile-time SPIR-V generation 2026-02-19 19:42:29 +01:00
Lynne
4038af3da8 ffv1enc_vulkan: convert setup shader to compile-time SPIR-V generation 2026-02-19 19:42:29 +01:00
Lynne
6f4cef26df ffv1enc_vulkan: convert reset shader to compile-time SPIR-V generation 2026-02-19 19:42:28 +01:00
Lynne
fdee87d06d ffv1enc_vulkan: convert RCT search shader to compile-time SPIR-V generation 2026-02-19 19:42:28 +01:00
Lynne
f956e9817c ffv1enc_vulkan: use ff_vk_buf_barrier() 2025-12-31 15:00:46 +01:00
Lynne
9f3a04d2f6 vulkan: use HOST_CACHED memory flag only if such a heap exists
NVK does not offer such, so our code failed to allocate memory.
2025-12-31 15:00:46 +01:00
Lynne
a0d0b5cf73 ffv1enc_vulkan: only use native image representation
The non-native representation was used for an unknown reason, and for
whatever reason it broke non-RGB 8+ bit formats.
2025-11-12 00:37:24 +01:00
Lynne
38df9ba71b Revert "hwcontext_vulkan: fix planar 10 and 12-bit RGB formats using the new MSB formats"
This reverts commit 98ee3f6718.
2025-11-06 21:44:13 +01:00
Lynne
15e82dc452 Revert "hwcontext_vulkan: remove unsupported/broken pixel formats"
This reverts commit 5b388f2838.
2025-11-06 21:44:13 +01:00
Lynne
e85947576c ffv1enc_vulkan: limit probability caching to RADV only
Nvidia's drivers recently broke this.
2025-10-28 20:46:25 +01:00
Lynne
5b388f2838 hwcontext_vulkan: remove unsupported/broken pixel formats
We have no use for 14-bit pixel formats for now, so remove support for gray14,
which was broken due to the LSB padding issue.

Similarly YUVA at 10/12 bit was broken for the same reason.
2025-10-27 22:59:41 -03:00
Lynne
98ee3f6718 hwcontext_vulkan: fix planar 10 and 12-bit RGB formats using the new MSB formats 2025-10-27 22:59:41 -03:00
Koushik Dutta via ffmpeg-devel
fd136a4d82 ffv1enc_vulkan: fix empty struct build error on msvc
Signed-off-by: Koushik Dutta <koushd@gmail.com>
2025-09-30 19:36:56 +09:00
Lynne
ab37c7e49f ffv1enc: do not hardcode 1024 slices
Instead use a field where possible and the defined constant when not.

Tested by using 4096 slices.
2025-05-30 01:45:58 +09:00
Lynne
bf6d3dc339 ffv1enc_vulkan: allow slicecrc=2
For parity with the software encoder.
2025-05-27 21:50:35 +09:00
Maxime Gervais
cbdb5e2477 ffv1enc_vulkan: fix array overflow 2025-05-24 02:28:13 +09:00
Lynne
7576410af7 ffv1enc_vulkan: implement RCT search for level >= 4 2025-05-20 19:53:01 +09:00
Lynne
0156680f09 ffv1enc_vulkan: implement the cached EC writer from the decoder
This gives a 35% speedup on AMD and 50% on Nvidia.
2025-05-20 19:53:01 +09:00
Lynne
f69db914ce ffv1enc_vulkan: use ff_get_encode_buffer
We used to create our own buffer, but still used the DR1 flag,
which is not how it's supposed to work.

Instead, use ff_get_encode_buffer, and either host-map the buffer
before copying each slice via GPU transfers, or just copy each
slice manually if that fails or is unavailable.
2025-05-20 19:53:01 +09:00
Lynne
bd41838b60 ffv1enc_vulkan: switch to 2-line cache, unify prediction code 2025-05-20 19:53:01 +09:00
Lynne
7c0a8c07ce ffv1enc_vulkan: unify EC code between setup and encode 2025-05-20 19:53:00 +09:00
Lynne
69f83bafd1 ffv1enc_vulkan: get rid of temporary data for the setup shader 2025-05-20 19:53:00 +09:00
Lynne
ebbc7ff650 ffv1enc_vulkan: merge all encoder variants into one file
Makes it easier to work with, despite the heavy ifdeffery.
2025-05-20 19:52:55 +09:00
Lynne
707c04fe06 ffv1enc_vulkan: support 8 and 16-bit 2-plane YUV formats
This adds support for all 8-bit and 16-bit 2-plane formats.
P010 and others require more work as the data is LSB-padded.
2025-05-01 09:34:44 +02:00
Lynne
77f777d925 ffv1/vulkan: redo context count tracking and quant_table_idx management
This commit also makes it possible for the encoder to choose a different
quantization table on a per-slice basis, as well as adding this capability
to the decoder.

Also, this commit fully fixes decoding of context=1 encoded files.
2025-04-14 06:10:42 +02:00
Lynne
f2a0bdd6b1 vulkan: unify handling of BGR and simplify ffv1_rct 2025-03-17 08:49:15 +01:00
Lynne
dd7cc557af ffv1enc_vulkan: clip micro_version to 3 for level 4
This unbreaks level 4 encoding.
2025-03-17 08:49:14 +01:00
Lynne
b2ebe9884e ffv1enc_vulkan: refactor code to support sharing with decoder
The shaders were written to support sharing, but needed slight
tweaking.
2025-03-17 08:49:14 +01:00
Andreas Rheinhardt
0971fcf0a0 avcodec/codec_internal, all: Use macros to set deprecated AVCodec fields
The aim of this is twofold: a) Clang warns when setting a deprecated
field in a definition and because several of the widely set
AVCodec fields are deprecated, one gets several hundred warnings
from Clang for an ordinary build. Yet fortunately Clang (unlike GCC)
allows disabling deprecation warnings inside a definition, so
that one can create simple macros to set these fields that also suppress
deprecation warnings for Clang. This has already been done in
fdff1b9cbf for AVCodec.channel_layouts.
b) Using macros will make it easy to migrate these fields to internal ones.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-03-10 00:57:23 +01:00
Lynne
bb87d19cd9 ffv1enc_vulkan: disable autodetection of async_depth
The issue is that this could consume gigabytes of VRAM at higher
resolutions for not that much of a speedup.
Automatic detection was not a good idea as we can't know how much
VRAM is actually free.
Just remove it.
2025-02-27 19:08:42 +01:00
Lynne
542a567d50 ffv1enc_vulkan: support default range coder tables
This adds support for default range coder tables, rather than
only custom ones. It's two lines, as the same code can be used
for both thanks to ffv1enc.c setting f->state_transition properly.
2025-02-21 03:19:19 +01:00
James Almer
19045957af avcodec/ffv1enc_vulkan: add missing argument to ff_ffv1_common_init()
Missed in 3d3ce9647f.

Found-by: kasper93
Signed-off-by: James Almer <jamrial@gmail.com>
2025-02-06 17:03:25 -03:00
Lynne
e7b474783c ffv1enc_vulkan: allow setting the number of slices via -slices
Falls back to the exact same code the software encoder uses.
2025-01-03 14:53:41 +09:00
Lynne
2e06b84e27 vulkan: do not reinvent a queue context struct
We recently introduced a public field which was a superset
of the queue context we used to have.

Switch to using it entirely.

This also allows us to get rid of the NIH function which was
valid only for video queues.
2024-12-23 04:25:09 +09:00
Michael Niedermayer
559d435fa3 avcodec/ffv1enc: Add enum for qtable
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-12-04 04:23:48 +01:00
Lynne
d4966f0a74 ffv1enc_vulkan: limit parallelism based on VRAM, fallback to host memory 2024-11-26 14:14:16 +01:00
Lynne
5effac3b02 ffv1enc: expose ff_ffv1_encode_buffer_size
The function is important for ensuring that the output buffer
is always large enough, and it can change from version to
version, so exposing it makes sense.
2024-11-26 14:14:15 +01:00