tornavis

Commit Graph

Author	SHA1	Message	Date
Miguel Pozo	4dc1c23384	Fix #114742 : Draw: Buffers never shrink The buffers from the new Draw Manager increase their size as needed, but they never shrink. Add `StorageArrayBuffer::trim_to_next_power_of_2` function that can downsize the buffer following the same heuristic as `get_or_resize`. Add `StorageVectorBuffer::trim_and_clear`, which calls `trim_to_next_power_of_2` automatically. Pull Request: https://projects.blender.org/blender/blender/pulls/114857	2023-11-20 12:23:12 +01:00
Jason Fielder	1b0ddfa6cb	GPU: Add explicit API to sync storage buffer back to host PR Introduces GPU_storagebuf_sync_to_host as an explicit routine to flush GPU-resident storage buffer memory back to the host within the GPU command stream. The previous implementation relied on implicit synchronization of resources using OpenGL barriers which does not match the paradigm of explicit APIs, where individual resources may need to be tracked. This patch ensures GPU_storagebuf_read can be called without stalling the GPU pipeline while work finishes executing. There are two possible use cases: 1) If GPU_storagebuf_read is called AFTER an explicit call to GPU_storagebuf_sync_to_host, the read will be synchronized. If the dependent work is still executing on the GPU, the host will stall until GPU work has completed and results are available. 2) If GPU_storagebuf_read is called WITHOUT an explicit call to GPU_storagebuf_sync_to_host, the read will be asynchronous and whatever memory is visible to the host at that time will be used. (This is the same as assuming a sync event has already been signalled.) This patch also addresses a gap in the Metal implementation where there was missing read support for GPU-only storage buffers. This routine now uses a staging buffer to copy results if no host-visible buffer was available. Reading from a GPU-only storage buffer will always stall the host, as it is not possible to pre-flush results, as no host-resident buffer is available. Authored by Apple: Michael Parkin-White Pull Request: https://projects.blender.org/blender/blender/pulls/113456	2023-10-20 17:04:36 +02:00
Campbell Barton	e7e4e63313	Cleanup: spelling in comments, white-space in comments	2023-10-19 18:53:16 +11:00
Miguel Pozo	6f125661e6	GPU: Add Texture::debug_clear Clear uninitialized textures to NaN/debug values. Enabled for `--debug-gpu` only. Pull Request: https://projects.blender.org/blender/blender/pulls/113781	2023-10-17 15:54:09 +02:00
Jeroen Bakker	61b463d5e4	EEVEE-Next: Planar Probe Pipeline This PR contains the initial capture pipeline for planar probes. It requires work to generate the correct view to capture and to include the result during ray tracing. These will be developed in a separate PR. This PR detects if a planar probe is active in the scene. If this is the case the planar probe pipeline will be activated. During rendering this is done by querying the depsgraph, during viewport drawing this is done during sync. If a planar probe is detected and the pipeline wasn't activated, the pipeline will be activated and the sampling will be reset to ensure the pipeline is filled with all objects. Per object the user can set the visibility of the object in planar reflections. ![image](/attachments/fcfb40f9-f174-491c-bfba-e7f00f49aa1c) For a reflection plane the resolution and clipping offset can be set. EDIT: Resolution option was removed because it was too complex to implement with the little time we have at the moment. ![image](/attachments/e42ad9ce-8af8-45d1-aa3a-630db1901ad3) Related to #112966 Co-authored-by: Clément Foucault <foucault.clem@gmail.com> Pull Request: https://projects.blender.org/blender/blender/pulls/113203	2023-10-08 19:49:58 +02:00
Clément Foucault	672d25b02d	EEVEE-Next: Shadow Rendering Refactor Split shadow rendering per LOD per tilemap and improve fragment shader invocation rate by using multi-viewport. Also changes the layout of the atlas to be 4 x 4 x Layers. This allows growing the atlas while keeping the content and page indirection correct, but this isn't implemented in this patch. # First attempt Shadow rendering using atomics proved to be less than ideal and performance was not quite at an acceptable level. The previous method had issues with atomic contention when a lot of triangles would overlap and too many fragment shader invocations with quite complex indirection rules and biases which made the technique costly. The new implementation leverages multi viewport and layered rendering to effectively replace the need for atomics and render directly to the shadow atlas. Using the well supported extension these are free on modern hardware and do not need a geometry shader. One view per tile is needed since we use the viewport index and the layer index as a way to index a specific tile in the array. # Geometric Complexity Problem The counterpart of this is that we need to draw one geometry instance per tile which is 32x32 times more instances (at most) than with the previous method. This means that we will have to find a way to mitigate this geometry cost by either reducing the number of tiles per tilemaps (in other words, making the system less memory efficient) or splitting complex objects' geometry into smaller, more cull friendly chunks (for example, like the sculpt PBVH nodes). The latter seems to be a longer term solution as it requires far more engineering time than we have right now. # Update Lag Problem This also means we can only update up to 64 tiles per redraw which is not enough even in the most basic cases. This leads to missing or over shadowing when a light updates until there are no more updates and the shadow rendering can catch up.
One possible solution is to update lower LODs first, waiting until there is no update to render. This would allow no artifact during the transforms (unless there are too many light updates even for the lowest LOD, but that was an issue also for the previous implementation). This could also help with the geometric complexity. # Solution In the end, we decided to have one view per LOD. This limits the complexity of the fragment shader (improves speed), reduces the number of views per tilemap (fixes update lag), and reduces the number of instances. This also means we cannot render directly to the atlas anymore and reverted to the atomic solution. Using the smallest possible viewport, we ensure that there are not that many fragment shader invocations, which was one of the bottlenecks. It also reduces the amount of geometry instances that pass the clipping test. Pull Request: https://projects.blender.org/blender/blender/pulls/110979	2023-08-17 17:35:19 +02:00
Campbell Barton	e955c94ed3	License Headers: Set copyright to "Blender Authors", add AUTHORS Listing the "Blender Foundation" as copyright holder implied the Blender Foundation holds copyright to files which may include work from many developers. While keeping copyright on headers makes sense for isolated libraries, Blender's own code may be refactored or moved between files in a way that makes the per file copyright holders less meaningful. Copyright references to the "Blender Foundation" have been replaced with "Blender Authors", with the exception of `./extern/` since this contains libraries which are more isolated; any changes to license headers there can be handled on a case-by-case basis. Some directories in `./intern/` have also been excluded: - `./intern/cycles/` its own `AUTHORS` file is planned. - `./intern/opensubdiv/`. An "AUTHORS" file has been added, using the chromium projects authors file as a template. Design task: #110784 Ref !110783.	2023-08-16 00:20:26 +10:00
Clément Foucault	17db856686	EEVEE-Next: Ray-tracing Denoise Pipeline This is a full rewrite of the raytracing denoise pipeline. It uses the same principle as before but now uses compute shaders for every stage and a tile-based approach. More aggressive filtering is needed since we are moving towards having no prefiltered screen radiance buffer. Thus we introduce a temporal denoise and a bilateral denoise stage to the denoising. These are optional and can be disabled. Note that this patch does not include any tracing part and only samples the reflection probes. It is focused on denoising only. Tracing will come in another PR. The motivation for this is that having hardware raytracing support means we can't prefilter the radiance in screen space so we have to have better denoising. Also this means we can have better surface appearance with support for other BxDF models than GGX. Also GGX support is improved. Technically, the new denoising fixes some implementation mistakes the old pipeline made. It separates all 3 stages (spatial, temporal, bilateral) and uses random sampling for all stages hoping to create a noisy enough (but still stable) output so that the TAA soaks the remaining noise. However that's not always the case. Depending on the nature of the scene, the input can be very high frequency and might create lots of flickering. That is why another solution needs to be found for the higher roughness materials as denoising them becomes expensive and low quality. Pull Request: https://projects.blender.org/blender/blender/pulls/110117	2023-08-03 15:32:06 +02:00
Jeroen Bakker	a9d501ade2	Fix: Unable to create Cubemap Arrays using GPU Wrapper When using `Texture.ensure_cube_array` the resulting texture wasn't actually layered (array) and when used resulted in incorrect behavior. Until now this function isn't used, but it will be when the Eevee-next world reflective light PR lands #108149 . Pull Request: https://projects.blender.org/blender/blender/pulls/109497	2023-06-29 15:12:36 +02:00
Clément Foucault	ddd88c00b4	EEVEE-Next: Irradiance Cache: Initial Implementation This is a full rewrite of the irradiance volume baking. The baking is much faster and doesn't scale linearly with the number of irradiance samples in the volumes. Ref #105643 Pull Request: https://projects.blender.org/blender/blender/pulls/108639	2023-06-23 08:39:46 +02:00
Campbell Barton	65f99397ec	License headers: use SPDX-FileCopyrightText in all sources	2023-06-15 13:35:34 +10:00
Miguel Pozo	4808856dc2	Draw: Fix: Clear layer_views_ on Texture::free()	2023-05-31 17:54:29 +02:00
Sergey Sharybin	c1bc70b711	Cleanup: Add a copyright notice to files and use SPDX format A lot of files were missing copyright field in the header and the Blender Foundation contributed to them in a sense of bug fixing and general maintenance. This change makes it explicit that those files are at least partially copyrighted by the Blender Foundation. Note that this does not make it so the Blender Foundation is the only holder of the copyright in those files, and developers who do not have a signed contract with the foundation still hold the copyright as well. Another aspect of this change is using SPDX format for the header. We already used it for the license specification, and now we state it for the copyright as well, following the FAQ: https://reuse.software/faq/	2023-05-31 16:19:06 +02:00
Jason Fielder	ae405639e7	Metal: Stencil texture view support Adds stencil texture view support for Metal, allowing reading of stencil component during texture sample/read. Stencil view creation refactored to use additional parameter in textureview creation function, due to deferred stencil parameter causing double texture view creation in Metal, when this should ideally be provided upfront. Authored by Apple: Michael Parkin-White Pull Request: https://projects.blender.org/blender/blender/pulls/107971	2023-05-22 20:40:38 +02:00
Campbell Barton	6859bb6e67	Cleanup: format (with BraceWrapping::AfterControlStatement "MultiLine")	2023-05-02 09:37:49 +10:00
Clément Foucault	7e764ec692	GPU: Texture: Expose depth dimension extent This function was not exposed outside of internal GPU module. Renaming `draw::Texture::depth()` to `is_depth` for consistency and removing the ambiguity.	2023-04-13 14:06:53 +02:00
Sergey Sharybin	d32d787f5f	Clang-Format: Allow empty functions to be single-line For example ``` OIIOOutputDriver::~OIIOOutputDriver() { } ``` becomes ``` OIIOOutputDriver::~OIIOOutputDriver() {} ``` Saves quite some vertical space, which is especially handy for constructors. Pull Request: https://projects.blender.org/blender/blender/pulls/105594	2023-03-29 16:50:54 +02:00
Clément Foucault	9fb1f32f06	Cleanup: GPUTexture: Remove _ex suffix from texture creation It isn't relevant anymore now that usage flags are mandatory. Pull Request #105197	2023-02-25 11:39:54 +01:00
Clément Foucault	e01b140fb2	GPUTexture: Remove data_format from 3D texture creation function For every other texture type this is expected to be implicitly `GPU_DATA_FLOAT`. There is only one case where this is not the case. I believe this was previously needed because the data type was conditioning the texture creation. This is not the case anymore.	2023-02-25 11:39:53 +01:00
Clément Foucault	73da5ee90d	Cleanup: GPUTexture: Rename some functions with more descriptive names List of renames: GPU_texture_generate_mipmap > GPU_texture_update_mipmap_chain GPU_texture_orig_width > GPU_texture_original_width GPU_texture_orig_height > GPU_texture_original_height GPU_texture_orig_size_set > GPU_texture_original_size_set GPU_texture_format_description > GPU_texture_format_name GPU_texture_array > GPU_texture_is_array GPU_texture_cube > GPU_texture_is_cube GPU_texture_depth > GPU_texture_has_depth_format GPU_texture_stencil > GPU_texture_has_stencil_format GPU_texture_integer > GPU_texture_has_integer_format	2023-02-25 11:39:53 +01:00
Clément Foucault	a0f5240089	EEVEE-Next: Virtual Shadow Map initial implementation Implements virtual shadow mapping for EEVEE-Next primary shadow solution. This technique aims to deliver really high precision shadowing for many lights while keeping a relatively low cost. The technique works by splitting each shadow in tiles that are only allocated & updated on demand by visible surfaces and volumes. Local lights use cubemap projection with mipmap level of detail to adapt the resolution to the receiver distance. Sun lights use clipmap distribution or cascade distribution (depending on which is better) for selecting the level of detail with the distance to the camera. Current maximum shadow precision for local light is about 1 pixel per 0.01 degrees. For sun light, the maximum resolution is based on the camera far clip distance which sets the most coarse clipmap. ## Limitation: Alpha Blended surfaces might not get correct shadowing in some corner cases. This is to be fixed in another commit. While resolution is greatly increased, it is still finite. It is virtually equivalent to one 8K shadow per shadow cube face and per clipmap level. There is no filtering present for now. ## Parameters: Shadow Pool Size: In bytes, amount of GPU memory to dedicate to the shadow pool (is allocated per viewport). Shadow Scaling: Scale the shadow resolution. Base resolution should target subpixel accuracy (within the limitation of the technique). Related to #93220 Related to #104472	2023-02-08 21:18:44 +01:00
Miguel Pozo	e744673268	Draw: Improve Texture assignment operator Differential Revision: https://developer.blender.org/D17119	2023-01-25 17:12:25 +01:00
Miguel Pozo	ba982119cd	Workbench Next Rewrite of the Workbench engine using C++ and the new Draw Manager API. The new engine can be enabled in Blender `Preferences > Experimental > Workbench Next`. After that, the engine can be selected in `Properties > Scene > Render Engine`. When `Workbench Next` is the active engine, it also handles the `Solid` viewport mode rendering. The rewrite aims to be functionally equivalent to the current Workbench engine, but it also includes some small fixes/tweaks: - `In Front` rendered objects now work correctly with DoF and Shadows. - The `Sampling > Viewport` setting is actually used when the viewport is in `Render Mode`. - In `Texture` mode, textured materials also use the material properties. (Previously, only non textured materials would) To do: - Sculpt PBVH. - Volume rendering. - Hair rendering. - Use the "no_geom" shader versions for shadow rendering. - Decide the final API for custom visibility culling (Needed for shadows). - Profile/optimize. Known Issues: - Matcaps are not loaded until they’re shown elsewhere. (e.g. when opening the `Viewport Shading` UI) - Outlines are drawn between different materials of the same object. (Each material submesh has its own object handle) Reviewed By: fclem Maniphest Tasks: T101619 Differential Revision: https://developer.blender.org/D16826	2023-01-23 17:59:07 +01:00
Clément Foucault	e0c8fa4ab9	DRW: Fix Texture.ensure() function always recreating the texture This was caused by a recent change of the `size()` method which now returns 1 for missing dimensions.	2023-01-23 11:05:04 +01:00
Chris Blackbourn	bbeb37696d	Cleanup: format	2023-01-20 11:43:28 +13:00
Clément Foucault	21b3689fb9	DRW: GPU Wrappers: Add swap to storage buffers, empty framebuffer and fixes Also add an assert to mip_view to avoid incorrect usage.	2023-01-18 15:36:46 +01:00
Clément Foucault	8f44c37f5c	Cleanup: Rename BLI_math_vec_types* files to BLI_math_vector_types This is for the sake of consistency and clarity.	2023-01-06 20:09:51 +01:00
Jason Fielder	2e61c446ac	GPU: Explicit Texture Usage Flags for enabling GPU Backend optimizations. Texture usage flags can now be provided during texture creation specifying the ways in which a texture can be used. This allows the GPU backends to perform contextual optimizations which were not previously possible. This includes enablement of hardware lossless compression which can result in a 15%+ performance uplift for bandwidth-limited scenes on hardware such as Apple-Silicon using Metal. GPU_TEXTURE_USAGE_GENERAL can be used by default if usage is not known ahead of time. Patch will also be relevant for the Vulkan backend. Authored by Apple: Michael Parkin-White Ref T96261 Reviewed By: fclem Differential Revision: https://developer.blender.org/D15967	2022-12-08 23:31:05 +01:00
Campbell Barton	2a41cd46ba	Cleanup: format	2022-11-15 16:43:18 +11:00
Clément Foucault	187bce103b	DRW: Fix compilation issues in inline functions	2022-11-14 14:01:23 +01:00
Clément Foucault	f1466ce9a8	DRW: Wrappers: Allow taking reference of the framebuffer object This is in order to make it work with the new `framebuffer_set` command which requires a `GPUFrameBuffer **`.	2022-11-13 16:02:57 +01:00
Clément Foucault	0e4bdd428c	DRW: Wrappers: Allow trivial types inside `draw::SwapChain` This allows using pointers and other such trivial types which cannot implement the `swap` method.	2022-11-13 16:00:58 +01:00
Clément Foucault	67dfb61700	DRW: Wrappers: Avoid default vector length of 0 if sizeof(T) is large This increases the default size to some reasonable value (>512bytes) and allocates at least 1 element.	2022-11-13 15:59:23 +01:00
Clément Foucault	7a9a83f4a0	DRW: Wrappers: Add TextureRef to wrap around GPUTexture pointers This adds the possibility to use the C++ API for other GPUTexture.	2022-10-12 17:39:23 +02:00
Campbell Barton	4baa6e57bd	Cleanup: prefer 'arg' over 'params' for sphinx documentation While both are supported, 'arg' is in more common use so prefer it.	2022-09-19 14:24:31 +10:00
Clément Foucault	a1aafddcbe	DRW: GPU wrapper: Add new StorageVectorBuffer Same as `StorageArrayBuffer` but has a length counter and acts like a `blender::Vector` you can clear and append to.	2022-09-17 12:27:00 +02:00
Brecht Van Lommel	44619eaa32	Cleanup: make format	2022-09-05 17:25:05 +02:00
Clément Foucault	65ad36f5fd	DRWManager: New implementation. This is a new implementation of the draw manager using modern rendering practices and GPU driven culling. This only ports features that are not considered deprecated or to be removed. The old DRW API is kept working alongside this new one, and does not interfere with it. However this needed some more hacking inside the draw_view_lib.glsl. At least the create info are well separated. The reviewer might start by looking at `draw_pass_test.cc` to see the API in usage. Important files are `draw_pass.hh`, `draw_command.hh`, `draw_command_shared.hh`. In a nutshell (for a developer used to old DRW API): - `DRWShadingGroups` are replaced by `Pass<T>::Sub`. - Contrary to DRWShadingGroups, all commands recorded inside a pass or sub-pass (even binds / push_constant / uniforms) will be executed in order. - All memory is managed per object (except for Sub-Pass which are managed by their parent pass) and not from draw manager pools. So passes "can" potentially be recorded once and submitted multiple times (but this is not really encouraged for now). The only implicit link is between resource lifetime and `ResourceHandles` - Sub passes can be any level deep. - IMPORTANT: All state propagates from sub pass to subpass. There is no state stack concept anymore. Ensure the correct render state is set before drawing anything using `Pass::state_set()`. - The drawcalls now need a `ResourceHandle` instead of an `Object *`. This is to remove any implicit dependency between `Pass` and `Manager`. This was a huge problem in old implementation since the manager did not know what to pull from the object. Now it is explicitly requested by the engine. - The passes need to be submitted to a `draw::Manager` instance which can be retrieved using `DRW_manager_get()` (for now). Internally: - All object data are stored in contiguous storage buffers. Removing a lot of complexity in the pass submission.
- Draw calls are sorted and visibility tested on GPU. Making more modern culling and better instancing usage possible in the future. - Unit Tests have been added for regression testing and avoid most API breakage. - `draw::View` now contains culling data for all objects in the scene allowing caching for multiple views. - Bounding box and sphere final setup is moved to GPU. - Some global resources locations have been hardcoded to reduce complexity. What is missing: - ~~Workaround for lack of gl_BaseInstanceARB.~~ Done - ~~Object Uniform Attributes.~~ Done (Not in this patch) - Workaround for hardware supporting a maximum of 8 SSBO. Reviewed By: jbakker Differential Revision: https://developer.blender.org/D15817	2022-09-02 18:45:14 +02:00
Brecht Van Lommel	78e0c936c1	Merge branch 'blender-v3.3-release'	2022-08-19 17:32:55 +02:00
Brecht Van Lommel	0c8749788c	Fix build error on mips64el architecture Same as D12194, name "mips" conflicts on such systems.	2022-08-19 17:28:51 +02:00
Campbell Barton	1f2a5fea87	Cleanup: strip blank lines around comment blocks	2022-08-17 12:51:07 +10:00
Clément Foucault	b43b62191c	EEVEE-Next: HiZ Buffer: New implementation This new implementation does all downsampling in a single compute shader dispatch, removing a lot of complexity from the previous recursive downsampling. This is heavily inspired by the Single-Pass-Downsampler from GPUOpen: https://github.com/GPUOpen-Effects/FidelityFX-SPD However I do not implement all the optimization bits as they require vulkan (GL_KHR_shader_subgroup) and is not as versatile (it is only for HiZ). Timers inside renderdoc report ~0.4ms of saving on a 2048*1024 render for the whole downsampling. Note that the previous implementation only processed 6 mips where the new one processes 8 mips. ``` EEVEE ~1.0ms EEVEE-Next ~0.6ms ``` Padding has been bumped to be of 128px for processing 8 mips. A new debug option has been added (debug value 2) to validate the HiZ.	2022-08-15 18:36:19 +02:00
Clément Foucault	c5526dc6f4	DRW: GPU Wrapper: add possibility to swap Texture and TextureFromPool Ownership is transferred from the pool to the `Texture` and vice versa. This allows having history buffers with only 1 persistent texture.	2022-08-05 14:45:09 +02:00
Clément Foucault	1ae767be9f	Cleanup: DRW: Remove void function argument	2022-08-05 14:45:09 +02:00
Clément Foucault	710609a2e0	DRW: GPU Wrapper: Fix invalid cached texture view when ensure() reallocs	2022-08-02 21:53:17 +02:00
Clément Foucault	22143b351f	DRW: GPU wrapper: Make SwapChain reference work This makes using texture references easier. But now, it makes it mandatory for the wrapped type to implement the `swap()` static method.	2022-08-02 21:53:17 +02:00
Clément Foucault	04160ffd12	DRW: GPU wrappers: Expose more ease of use functions and cleanup style	2022-08-02 21:53:17 +02:00
Clément Foucault	82327ce01d	DRW: TextureFromPool: Change API to use acquire / release This removes the quirk of having to call the sync function for each new render loop. # Conflicts: # source/blender/draw/engines/eevee_next/eevee_view.cc	2022-07-28 17:00:46 +02:00
Clément Foucault	fde7d39051	Cleanup: DRW: Fix misnamed argument and add more info in a function doc	2022-06-28 18:48:38 +02:00
Clément Foucault	ae2d2c9361	DRW: GPU wrappers: Fix resize routines for StorageArrayBuffer Resizing was not resizing the `data_` buffer. Also use `power_of_2_max_u`.	2022-05-19 00:35:36 +02:00
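
Several entries above (4dc1c23384, ae2d2c9361) describe buffers that grow to a power-of-two size on demand and can later be trimmed back using the same rounding. The following is only a minimal CPU-side sketch of that heuristic, not Blender's implementation: `GrowableBuffer` and `next_power_of_2` are hypothetical names, a `std::vector` stands in for the GPU storage buffer, and the method names `get_or_resize` and `trim_to_next_power_of_2` are borrowed from the commit messages without any claim about their real signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: smallest power of two that is >= x (returns 1 for x == 0).
inline std::size_t next_power_of_2(std::size_t x)
{
  // Guard against overflow: beyond this value no larger power of two fits in size_t.
  assert(x <= std::numeric_limits<std::size_t>::max() / 2 + 1);
  std::size_t result = 1;
  while (result < x) {
    result <<= 1;
  }
  return result;
}

// Hypothetical CPU-side stand-in for a growable GPU storage buffer.
template<typename T> class GrowableBuffer {
  std::vector<T> data_;

 public:
  explicit GrowableBuffer(std::size_t capacity = 1) : data_(next_power_of_2(capacity)) {}

  std::size_t capacity() const
  {
    return data_.size();
  }

  // Return element `index`, growing to the next power of two if it is out of range.
  T &get_or_resize(std::size_t index)
  {
    if (index >= data_.size()) {
      data_.resize(next_power_of_2(index + 1));
    }
    return data_[index];
  }

  // Shrink so that `required` elements still fit, rounded up to a power of two.
  // Never grows the buffer.
  void trim_to_next_power_of_2(std::size_t required)
  {
    const std::size_t target = next_power_of_2(required);
    if (target < data_.size()) {
      data_.resize(target);
      data_.shrink_to_fit();
    }
  }
};
```

Under these assumptions, a buffer created with capacity 4 grows to 128 elements after `get_or_resize(100)` and returns to 16 elements after `trim_to_next_power_of_2(10)`. Using the same rounding for growing and trimming keeps the set of possible sizes small, so a buffer whose usage hovers around one value does not reallocate on every change.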


72 Commits