Commit Graph

72 Commits

Author SHA1 Message Date
Miguel Pozo 4dc1c23384 Fix #114742: Draw: Buffers never shrink
The buffers from the new Draw Manager increase their size as needed,
but they never shrink.

Add `StorageArrayBuffer::trim_to_next_power_of_2` function that can
downsize the buffer following the same heuristic as `get_or_resize`.
Add `StorageVectorBuffer::trim_and_clear`, which calls
`trim_to_next_power_of_2` automatically.

Pull Request: https://projects.blender.org/blender/blender/pulls/114857
2023-11-20 12:23:12 +01:00
Jason Fielder 1b0ddfa6cb GPU: Add explicit API to sync storage buffer back to host
PR Introduces GPU_storagebuf_sync_to_host as an explicit routine to
flush GPU-resident storage buffer memory back to the host within the
GPU command stream.

The previous implmentation relied on implicit synchronization of
resources using OpenGL barriers which does not match the
paradigm of explicit APIs, where indiviaul resources may need
to be tracked.

This patch ensures GPU_storagebuf_read can be called without
stalling the GPU pipeline while work finishes executing. There are
two possible use cases:

1) If GPU_storagebuf_read is called AFTER an explicit call to
GPU_storagebuf_sync_to_host, the read will be synchronized.
If the dependent work is still executing on the GPU, the host
will stall until GPU work has completed and results are available.

2) If GPU_storagebuf_read is called WITHOUT an explicit call to
GPU_storagebuf_sync_to_host, the read will be asynchronous
and whatever memory is visible to the host at that time will be used.
(This is the same as assuming a sync event has already been signalled.)

This patch also addresses a gap in the Metal implementation where
there was missing read support for GPU-only storage buffers.
This routine now uses a staging buffer to copy results if no
host-visible buffer was available.

Reading from a GPU-only storage buffer will always stall
the host, as it is not possible to pre-flush results, as no
host-resident buffer is available.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/113456
2023-10-20 17:04:36 +02:00
Campbell Barton e7e4e63313 Cleanup: spelling in comments, white-space in comments 2023-10-19 18:53:16 +11:00
Miguel Pozo 6f125661e6 GPU: Add Texture::debug_clear
Clear uninitialized textures to NaN/debug values.
Enabled for `--debug-gpu` only.

Pull Request: https://projects.blender.org/blender/blender/pulls/113781
2023-10-17 15:54:09 +02:00
Jeroen Bakker 61b463d5e4 EEVEE-Next: Planar Probe Pipeline
This PR is contains the initial capture pipeline for planar probes.

It requires work to generate the correct view to capture and to include
the result during ray tracing. These will be developed in a separate PR.

This PR detects if a planar probe is active in the scene. If this is
the case the planar probe pipeline will be activated. During rendering
this is done by querying the depsgraph, during viewport drawing this
is done during sync. If an planar probe is detected and the pipeline
wasn't activated. The pipeline will be activated and the sampling
will be reset to ensure the pipeline is filled with all objects.

Per object the user can set the visibility of the object in planar
reflections.
![image](/attachments/fcfb40f9-f174-491c-bfba-e7f00f49aa1c)

For a reflection plane the resolution and clipping offset can be set.
EDIT: Resolution option was removed because too complex to
implement with the little time we have at the moment.
![image](/attachments/e42ad9ce-8af8-45d1-aa3a-630db1901ad3)

Related to #112966

Co-authored-by: Clément Foucault <foucault.clem@gmail.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/113203
2023-10-08 19:49:58 +02:00
Clément Foucault 672d25b02d EEVEE-Next: Shadow Rendering Refactor
Split shadow rendering per LOD per tilemap and improve
fragment shader invocation rate by using multi-viewport.

Also changes the layout of the atlas to be 4 x 4 x Layers.
This allow to grow the atlas while keeping the content
and page indirection correct, but this isn't implemented
in this patch.

# First attempt

Shadow rendering using atomic proved to be less than ideal
and performance were not quite to an acceptable level.

The previous method had issue with atomic contention when
a lot of triangle would overlap and too many fragment shader
invocations with quite complex indirection rules and biases
which made the technique costly.

The new implementation leverage multi viewport and
layered rendeing to effectively replace the need for atomic
and render directly to the shadow atlas. Using the well
supported extension these are free on modern hardware and
do not need a geometry shader.

One view per tile is needed since we use the viewport index
and the layer index as a way to index a specific tile in the
array.

# Geometric Complexity Problem

The counterpart of this is that we need to draw one geometry
instance per tile which is 32x32 time more instances (at most)
than with the previous method.

This means that we will have to find a way to mitigate this
geometry cost by either reducing the number of tiles per
tilemaps (in other words, making the system less memory efficient)
or splitting complex objects' geometry into smaller, more
cull friendly chunks (for example, like the sculpt PBVH nodes).
The later seems to be a longer term solution as it requires
way too much engineering time we have right now.

# Update Lag Problem

This also mean we can only update up to 64 tile per redraw
which is not enough even in the most basic cases. This leads
to missing or over shadowing when a light updates until there
is no updates and the shadow rendering can catch up.

One possible solution is to update a lower LODs first waiting
until there is no update to render. This would allow no artifact
during the transforms (unless there is too many light updates
even for lowest LOD, but that was an issue also for the
previous implementation). This could also help with the
geometric complexity.

# Solution

In the end, we decided to have one view per lod. This limits
the complexity of the fragment shader (improve speed),
reduces the number of views per tilemap (fix update lag),
and reduces the number of instances.
This also mean we cannot render directly to the atlas anymore
and reverted to the atomic solution. Using the smallest
possible viewport, we assure that there isn't that much fragment
shader invocations which was one of the bottleneck. And also
reduces the amount of geometry instances that pass the
clipping test.

Pull Request: https://projects.blender.org/blender/blender/pulls/110979
2023-08-17 17:35:19 +02:00
Campbell Barton e955c94ed3 License Headers: Set copyright to "Blender Authors", add AUTHORS
Listing the "Blender Foundation" as copyright holder implied the Blender
Foundation holds copyright to files which may include work from many
developers.

While keeping copyright on headers makes sense for isolated libraries,
Blender's own code may be refactored or moved between files in a way
that makes the per file copyright holders less meaningful.

Copyright references to the "Blender Foundation" have been replaced with
"Blender Authors", with the exception of `./extern/` since these this
contains libraries which are more isolated, any changed to license
headers there can be handled on a case-by-case basis.

Some directories in `./intern/` have also been excluded:

- `./intern/cycles/` it's own `AUTHORS` file is planned.
- `./intern/opensubdiv/`.

An "AUTHORS" file has been added, using the chromium projects authors
file as a template.

Design task: #110784

Ref !110783.
2023-08-16 00:20:26 +10:00
Clément Foucault 17db856686 EEVEE-Next: Ray-tracing Denoise Pipeline
This is a full rewrite of the raytracing denoise pipeline. It uses the
same principle as before but now uses compute shaders for every stages
and a tile base approach. More aggressive filtering is needed since we
are moving towards having no prefiltered screen radiance buffer. Thus
we introduce a temporal denoise and a bilateral denoise stage to the
denoising. These are optionnal and can be disabled.

Note that this patch does not include any tracing part and only samples
the reflection probes. It is focused on denoising only. Tracing will
come in another PR.

The motivation for this is that having hardware raytracing support
means we can't prefilter the radiance in screen space so we have to
have better denoising. Also this means we can have better surface
appearance with support for other BxDF model than GGX. Also GGX support
is improved.

Technically, the new denoising fixes some implementation mistake the
old pipeline did. It separates all 3 stages (spatial, temporal,
bilateral) and use random sampling for all stages hoping to create
a noisy enough (but still stable) output so that the TAA soaks the
remaining noise. However that's not always the case. Depending on the
nature of the scene, the input can be very high frequency and might
create lots of flickering. That why another solution needs to be found
for the higher roughness material as denoising them becomes expensive
and low quality.

Pull Request: https://projects.blender.org/blender/blender/pulls/110117
2023-08-03 15:32:06 +02:00
Jeroen Bakker a9d501ade2 Fix: Unable to create Cubemap Arrays using GPU Wrapper
When using `Texture.ensure_cube_array` the resulting texture wasn't actually
layered (array) and when used resulted into incorrect behavior.

Until now this function isn't used, but will be in when Eevee-next
world reflective light PR lands #108149 .

Pull Request: https://projects.blender.org/blender/blender/pulls/109497
2023-06-29 15:12:36 +02:00
Clément Foucault ddd88c00b4 EEVEE-Next: Irradiance Cache: Initial Implementation
This is a full rewrite of the irradiance volume baking.
The baking is much faster and doesn't scale linearly with the number
of irradiance samples in the volumes.

Ref #105643

Pull Request: https://projects.blender.org/blender/blender/pulls/108639
2023-06-23 08:39:46 +02:00
Campbell Barton 65f99397ec License headers: use SPDX-FileCopyrightText in all sources 2023-06-15 13:35:34 +10:00
Miguel Pozo 4808856dc2 Draw: Fix: Clear layer_views_ on Texture::free() 2023-05-31 17:54:29 +02:00
Sergey Sharybin c1bc70b711 Cleanup: Add a copyright notice to files and use SPDX format
A lot of files were missing copyright field in the header and
the Blender Foundation contributed to them in a sense of bug
fixing and general maintenance.

This change makes it explicit that those files are at least
partially copyrighted by the Blender Foundation.

Note that this does not make it so the Blender Foundation is
the only holder of the copyright in those files, and developers
who do not have a signed contract with the foundation still
hold the copyright as well.

Another aspect of this change is using SPDX format for the
header. We already used it for the license specification,
and now we state it for the copyright as well, following the
FAQ:

    https://reuse.software/faq/
2023-05-31 16:19:06 +02:00
Jason Fielder ae405639e7 Metal: Stencil texture view support
Adds stencil texture view support for Metal, allowing reading of
stencil component during texture sample/read.

Stencil view creation refactored to use additional parameter in
textureview creation function, due to deferred stencil parameter
causing double texture view creation in Metal, when this should
ideally be provided upfront.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/107971
2023-05-22 20:40:38 +02:00
Campbell Barton 6859bb6e67 Cleanup: format (with BraceWrapping::AfterControlStatement "MultiLine") 2023-05-02 09:37:49 +10:00
Clément Foucault 7e764ec692 GPU: Texture: Expose depth dimension extent
This function was not exposed outside of internal GPU module.

Renaming `draw::Texture::depth()` to `is_depth` for consistency
and removing the ambiguity.
2023-04-13 14:06:53 +02:00
Sergey Sharybin d32d787f5f Clang-Format: Allow empty functions to be single-line
For example

```
OIIOOutputDriver::~OIIOOutputDriver()
{
}
```

becomes

```
OIIOOutputDriver::~OIIOOutputDriver() {}
```

Saves quite some vertical space, which is especially handy for
constructors.

Pull Request: https://projects.blender.org/blender/blender/pulls/105594
2023-03-29 16:50:54 +02:00
Clément Foucault 9fb1f32f06 Cleanup: GPUTexture: Remove _ex suffix from texture creation
It isn't relevant anymore now that usage flags are mandatory.

Pull Request #105197
2023-02-25 11:39:54 +01:00
Clément Foucault e01b140fb2 GPUTexture: Remove data_format from 3D texture creation function
For every other texture types this is expected to be implicitly
`GPU_DATA_FLOAT`. There is only one case where this is not the case.

I believe this was previously needed because the data type was
conditionning the texture creation. This is not the case anymore.
2023-02-25 11:39:53 +01:00
Clément Foucault 73da5ee90d Cleanup: GPUTexture: Rename some functions with more descriptive names
List of renames:
GPU_texture_generate_mipmap > GPU_texture_update_mipmap_chain
GPU_texture_orig_width > GPU_texture_original_width
GPU_texture_orig_height > GPU_texture_original_height
GPU_texture_orig_size_set > GPU_texture_original_size_set
GPU_texture_format_description > GPU_texture_format_name
GPU_texture_array > GPU_texture_is_array
GPU_texture_cube > GPU_texture_is_cube
GPU_texture_depth > GPU_texture_has_depth_format
GPU_texture_stencil > GPU_texture_has_stencil_format
GPU_texture_integer > GPU_texture_has_integer_format
2023-02-25 11:39:53 +01:00
Clément Foucault a0f5240089 EEVEE-Next: Virtual Shadow Map initial implementation
Implements virtual shadow mapping for EEVEE-Next primary shadow solution.
This technique aims to deliver really high precision shadowing for many
lights while keeping a relatively low cost.

The technique works by splitting each shadows in tiles that are only
allocated & updated on demand by visible surfaces and volumes.
Local lights use cubemap projection with mipmap level of detail to adapt
the resolution to the receiver distance.
Sun lights use clipmap distribution or cascade distribution (depending on
which is better) for selecting the level of detail with the distance to
the camera.

Current maximum shadow precision for local light is about 1 pixel per 0.01
degrees.
For sun light, the maximum resolution is based on the camera far clip
distance which sets the most coarse clipmap.

## Limitation:
Alpha Blended surfaces might not get correct shadowing in some corner
casses. This is to be fixed in another commit.
While resolution is greatly increase, it is still finite. It is virtually
equivalent to one 8K shadow per shadow cube face and per clipmap level.
There is no filtering present for now.

## Parameters:
Shadow Pool Size: In bytes, amount of GPU memory to dedicate to the
shadow pool (is allocated per viewport).
Shadow Scaling: Scale the shadow resolution. Base resolution should
target subpixel accuracy (within the limitation of the technique).

Related to #93220
Related to #104472
2023-02-08 21:18:44 +01:00
Miguel Pozo e744673268 Draw: Improve Texture assignment operator
Differential Revision: https://developer.blender.org/D17119
2023-01-25 17:12:25 +01:00
Miguel Pozo ba982119cd Workbench Next
Rewrite of the Workbench engine using C++ and the new Draw Manager API.

The new engine can be enabled in Blender `Preferences > Experimental > Workbench Next`.
After that, the engine can be selected in `Properties > Scene > Render Engine`.
When `Workbench Next` is the active engine, it also handles the `Solid` viewport mode rendering.

The rewrite aims to be functionally equivalent to the current Workbench engine, but it also includes some small fixes/tweaks:
- `In Front` rendered objects now work correctly with DoF and Shadows.
- The `Sampling > Viewport` setting is actually used when the viewport is in `Render Mode`.
- In `Texture` mode, textured materials also use the material properties. (Previously, only non textured materials would)

To do:
- Sculpt PBVH.
- Volume rendering.
- Hair rendering.
- Use the "no_geom" shader versions for shadow rendering.
- Decide the final API for custom visibility culling (Needed for shadows).
- Profile/optimize.

Known Issues:
- Matcaps are not loaded until they’re shown elsewhere. (e.g. when opening the `Viewort Shading` UI)
- Outlines are drawn between different materials of the same object. (Each material submesh has its own object handle)

Reviewed By: fclem

Maniphest Tasks: T101619

Differential Revision: https://developer.blender.org/D16826
2023-01-23 17:59:07 +01:00
Clément Foucault e0c8fa4ab9 DRW: Fix Texture.ensure() function always recreating the texture
This was caused by recent change of the `size()` method which now return
1 for missing dimensions.
2023-01-23 11:05:04 +01:00
Chris Blackbourn bbeb37696d Cleanup: format 2023-01-20 11:43:28 +13:00
Clément Foucault 21b3689fb9 DRW: GPU Wrappers: Add swap to storage buffers, empty framebuffer and fixes
Also add an assert to mip_view to avoid incorrect usage.
2023-01-18 15:36:46 +01:00
Clément Foucault 8f44c37f5c Cleanup: Rename BLI_math_vec_types* files to BLI_math_vector_types
This is for the sake of consistency and clarity.
2023-01-06 20:09:51 +01:00
Jason Fielder 2e61c446ac GPU: Explicit Texture Usage Flags for enabling GPU Backend optimizations.
Texture usage flags can now be provided during texture creation specifying
the ways in which a texture can be used. This allows the GPU backends to
perform contextual optimizations which were not previously possible. This
includes enablement of hardware lossless compression which can result in
a 15%+ performance uplift for bandwidth-limited scenes on hardware such
as Apple-Silicon using Metal.

GPU_TEXTURE_USAGE_GENERAL can be used by default if usage is not known
ahead of time. Patch will also be relevant for the Vulkan backend.

Authored by Apple: Michael Parkin-White

Ref T96261

Reviewed By: fclem
Differential Revision: https://developer.blender.org/D15967
2022-12-08 23:31:05 +01:00
Campbell Barton 2a41cd46ba Cleanup: format 2022-11-15 16:43:18 +11:00
Clément Foucault 187bce103b DRW: Fix compilation issues in inline functions 2022-11-14 14:01:23 +01:00
Clément Foucault f1466ce9a8 DRW: Wrappers: Allow taking reference of the framebuffer object
This is in order to make it work with the new `framebuffer_set` command
which requires a `GPUFrameBuffer **`.
2022-11-13 16:02:57 +01:00
Clément Foucault 0e4bdd428c DRW: Wrappers: Allow trivial types inside `draw::SwapChain`
This allows to use pointers and such other trivial types which cannot
implement the `swap` mehod.
2022-11-13 16:00:58 +01:00
Clément Foucault 67dfb61700 DRW: Wrappers: Avoid default vector length of 0 if sizeof(T) is large
This increases the default size to some reasonable value (>512bytes) and
allocate at least 1 element.
2022-11-13 15:59:23 +01:00
Clément Foucault 7a9a83f4a0 DRW: Wrappers: Add TextureRef to wrap around GPUTexture pointers
This adds the possibility to use the C++ API for other GPUTexture.
2022-10-12 17:39:23 +02:00
Campbell Barton 4baa6e57bd Cleanup: prefer 'arg' over 'params' for sphinx documentation
While both are supported, 'arg' is in more common use so prefer it.
2022-09-19 14:24:31 +10:00
Clément Foucault a1aafddcbe DRW: GPU wrapper: Add new StorageVectorBuffer
Same as `StorageArrayBuffer` but has a length counter and act like a
`blender::Vector` you can clear and append to.
2022-09-17 12:27:00 +02:00
Brecht Van Lommel 44619eaa32 Cleanup: make format 2022-09-05 17:25:05 +02:00
Clément Foucault 65ad36f5fd DRWManager: New implementation.
This is a new implementation of the draw manager using modern
rendering practices and GPU driven culling.

This only ports features that are not considered deprecated or to be
removed.

The old DRW API is kept working along side this new one, and does not
interfeer with it. However this needed some more hacking inside the
draw_view_lib.glsl. At least the create info are well separated.

The reviewer might start by looking at `draw_pass_test.cc` to see the
API in usage.

Important files are `draw_pass.hh`, `draw_command.hh`,
`draw_command_shared.hh`.

In a nutshell (for a developper used to old DRW API):
- `DRWShadingGroups` are replaced by `Pass<T>::Sub`.
- Contrary to DRWShadingGroups, all commands recorded inside a pass or
   sub-pass (even binds / push_constant / uniforms) will be executed in order.
- All memory is managed per object (except for Sub-Pass which are managed
   by their parent pass) and not from draw manager pools. So passes "can"
   potentially be recorded once and submitted multiple time (but this is
   not really encouraged for now). The only implicit link is between resource
   lifetime and `ResourceHandles`
- Sub passes can be any level deep.
- IMPORTANT: All state propagate from sub pass to subpass. There is no
   state stack concept anymore. Ensure the correct render state is set before
   drawing anything using `Pass::state_set()`.
- The drawcalls now needs a `ResourceHandle` instead of an `Object *`.
   This is to remove any implicit dependency between `Pass` and `Manager`.
   This was a huge problem in old implementation since the manager did not
   know what to pull from the object. Now it is explicitly requested by the
   engine.
- The pases need to be submitted to a `draw::Manager` instance which can
   be retrieved using `DRW_manager_get()` (for now).

Internally:
- All object data are stored in contiguous storage buffers. Removing a lot
   of complexity in the pass submission.
- Draw calls are sorted and visibility tested on GPU. Making more modern
   culling and better instancing usage possible in the future.
- Unit Tests have been added for regression testing and avoid most API
   breakage.
- `draw::View` now contains culling data for all objects in the scene
   allowing caching for multiple views.
- Bounding box and sphere final setup is moved to GPU.
- Some global resources locations have been hardcoded to reduce complexity.

What is missing:
- ~~Workaround for lack of gl_BaseInstanceARB.~~ Done
- ~~Object Uniform Attributes.~~ Done (Not in this patch)
- Workaround for hardware supporting a maximum of 8 SSBO.

Reviewed By: jbakker

Differential Revision: https://developer.blender.org/D15817
2022-09-02 18:45:14 +02:00
Brecht Van Lommel 78e0c936c1 Merge branch 'blender-v3.3-release' 2022-08-19 17:32:55 +02:00
Brecht Van Lommel 0c8749788c Fix build error on mips64el architecture
Same as D12194, name "mips" conflicts on such systems.
2022-08-19 17:28:51 +02:00
Campbell Barton 1f2a5fea87 Cleanup: strip blank lines around comment blocks 2022-08-17 12:51:07 +10:00
Clément Foucault b43b62191c EEVEE-Next: HiZ Buffer: New implementation
This new implementation does all downsampling in a single compute shader
dispatch, removing a lot of complexity from the previous recursive
downsampling.

This is heavilly inspired by the Single-Pass-Downsampler from GPUOpen:
https://github.com/GPUOpen-Effects/FidelityFX-SPD
However I do not implement all the optimization bits as they require
vulkan (GL_KHR_shader_subgroup) and is not as versatile (it is only
for HiZ).

Timers inside renderdoc report ~0.4ms of saving on a 2048*1024 render for
the whole downsampling. Note that the previous implementation only
processed 6 mips where the new one processes 8 mips.
```
EEVEE ~1.0ms
EEVEE-Next ~0.6ms
```

Padding has been bumped to be of 128px for processing 8 mips.

A new debug option has been added (debug value 2) to validate the HiZ.
2022-08-15 18:36:19 +02:00
Clément Foucault c5526dc6f4 DRW: GPU Wrapper: add possibility to swap Texture and TextureFromPool
Ownership is transfered from the pool to the `Texture` and vice versa.
This allows to have history buffers with only 1 persistent texture.
2022-08-05 14:45:09 +02:00
Clément Foucault 1ae767be9f Cleanup: DRW: Remove void function argument 2022-08-05 14:45:09 +02:00
Clément Foucault 710609a2e0 DRW: GPU Wrapper: Fix invalid cached texture view when ensure() reallocs 2022-08-02 21:53:17 +02:00
Clément Foucault 22143b351f DRW: GPU wrapper: Make SwapChain renference work
This make using texture reference easier. But now, it makes it mandatory
for the wrapped type to implement the `swap()` static method.
2022-08-02 21:53:17 +02:00
Clément Foucault 04160ffd12 DRW: GPU wrappers: Expose more ease of use functions and cleanup style 2022-08-02 21:53:17 +02:00
Clément Foucault 82327ce01d DRW: TextureFromPool: Change API to use acquire / release
This removes the quirk of having to call the sync function for each new
render loop.

# Conflicts:
#	source/blender/draw/engines/eevee_next/eevee_view.cc
2022-07-28 17:00:46 +02:00
Clément Foucault fde7d39051 Cleanup: DRW: Fix misnamed argument and add more info in a function doc 2022-06-28 18:48:38 +02:00
Clément Foucault ae2d2c9361 DRW: GPU wrappers: Fix resize routines for StorageArrayBuffer
Resizing was not resizing the `data_` buffer. Also use `power_of_2_max_u`.
2022-05-19 00:35:36 +02:00