There exist a bunch of "give me a (filtered) image pixel at this location"
functions, some with duplicated functionality, some with almost the same but
not quite, some that look similar but behave slightly differently, etc.
Some of them were in BLI, some were in ImBuf.
This commit tries to improve the situation by:
* Adding low level interpolation functions to `BLI_math_interp.hh`
- With documentation on their behavior,
- And with more unit tests.
* At `ImBuf` level, there are only convenience inline wrappers to the above BLI
functions (split off into a separate header `IMB_interp.hh`). However, since
these wrappers are inline, some things get a tiny bit faster as a side
effect. E.g. VSE image strip, scaling to 4K resolution (Windows/Ryzen5950X):
- Nearest filter: 2.33 -> 1.94ms
- Bilinear filter: 5.83 -> 5.69ms
- Subsampled3x3 filter: 28.6 -> 22.4ms
Details on the functions:
- All of them have `_byte` and `_fl` suffixes.
- They exist in 4-channel byte (uchar4) and float (float4), as well as
explicitly passed amount of channels for other float images.
- New functions in BLI `blender::math` namespace:
- `interpolate_nearest`
- `interpolate_bilinear`
- `interpolate_bilinear_wrap`. Note that unlike previous "wrap" function,
this one no longer requires the caller to do their own wrapping.
- `interpolate_cubic_bspline`. Previous similar function was called just
"bicubic" which could mean many different things.
- Same functions exist in `IMB_interp.hh`, they are just convenience that takes
ImBuf and uses data pointer, width, height from that.
Other bits:
- Renamed `mod_f_positive` to `floored_fmod` (better matches `safe_floored_modf`
and `floored_modulo` that exist elsewhere), made it branchless and added more
unit tests.
- `interpolate_bilinear_wrap_fl` no longer clamps result to 0..1 range. Instead,
moved the clamp to be outside of the call in `paint_image_proj.cc` and
`paint_utils.cc`. Though the need for clamping in there is also questionable.
Pull Request: https://projects.blender.org/blender/blender/pulls/117387
Fix#115983: Crash when saving empty Composite output
Blender crashes when the compositor tries to save en empty composite
output through the render pipeline. This happens when no Composite node
exist and no Render Layers node is used, essentially producing an empty
render result.
We fix this by only saving the composite output if a Composite node or a
Render Layers node exist.
Pull Request: https://projects.blender.org/blender/blender/pulls/117129
The term `PIL` stands for "platform independent library." It exists since the `Initial Revision`
commit from 2002. Nowadays, we generally just use the `BLI` (blenlib) prefix for such code
and the `PIL` prefix feels more confusing then useful. Therefore, this patch renames the
`PIL` to `BLI`.
Pull Request: https://projects.blender.org/blender/blender/pulls/117325
An assert that ensures the GPU compositor executes in a non main thread
wrongly fires in background mode, that's because in background mode,
rendering happens in the main thread. So add a condition for background
mode.
This allows code outside of the render pipeline to make proper
decisions about how the imbuf of render passes are to be handled.
For example, IMB_create_gpu_texture() will now properly select
single channel grayscale texture format for depth pass coming from
multilayer EXR, additionally solving assert in the GPU compositor
code which verifies expected and actual imbuf texture format.
Pull Request: https://projects.blender.org/blender/blender/pulls/117184
Set the number of planes based on the number of pass channels.
If the pass contains 2 passes or more than 4 passes set the number
of planes to the previously used value of 32.
This is needed because quite some areas check for the number of
planes for various optimizations. For example, this is one of the
factors which make IMB_create_gpu_texture() to choose the texture
format. If the number of planes for the depth pass is set to the
previously used this function will never consider using single
channel GPU texture.
Unfortunately, this change is not enough to make the GPU texture
to use single channel format as the color space of the image
buffer is also checked, and that is nullptr which means scene linear.
Part of overall "improve filtering situation" (#116980) task:
* Add Bicubic filtering option to strip Transform "Filter" setting.
Previously this option only existed in Transform Effect "Interpolation"
setting.
- With this addition, it feels like the transform effect could
possibly be marked as legacy/deprecated, since the regular Transform
that is on all strips can do everything that Transform Effect did?
* Speed up bicubic filtering (used now in VSE, but also in CPU Compositor,
image paint, etc.) by slightly simplifying the code and using some SIMD.
Upscaling 96x54 image to 3840x2160 resolution, using Bicubic filtering:
- Windows (VS2022, Ryzen 5950X): 35.5ms -> 15.1ms
- Mac (clang 15, M1 Max): 29.6ms -> 24.4ms
* Add gtest coverage for bicubic functionality.
Pull Request: https://projects.blender.org/blender/blender/pulls/117100
The previous commit introduced a new `RPT_()` macro to translate
strings which are not tooltips or regular interface elements, but
longer reports or statuses.
This commit uses the new macro to translate many strings all over the
UI.
Most of it is a simple replace from `TIP_()` or `IFACE_()` to
`RPT_()`, but there are some additional changes:
- A few translations inside `BKE_report()` are removed altogether
because they are already handled by the translation system.
- Messages inside `UI_but_disable()` are no longer translated
manually, but they are handled by a new regex in the translation
system.
Pull Request: https://projects.blender.org/blender/blender/pulls/116804
Pull Request: https://projects.blender.org/blender/blender/pulls/116804
Along with the 4.1 libraries upgrade, we are bumping the clang-format
version from 8-12 to 17. This affects quite a few files.
If not already the case, you may consider pointing your IDE to the
clang-format binary bundled with the Blender precompiled libraries.
Some common headers were including this. Separating the includes
will ideally lead to better conceptual separation between CustomData
and the attribute API too. Mostly the change is adding the file to
places where it was included indirectly before. But some code is
shuffled around to hopefully better places as well.
Except for vertex groups and a few older color types, these
are generally replaced by newer generic attribute types.
Also remove some includes of DNA_mesh_types.h, since it's
included indirectly by BKE_mesh.hh currently.
Each value is now out of the global namespace, so they can be shorter
and easier to read. Most of this commit just adds the necessary casting
and namespace specification. `enum class` can be forward declared since
it has a specified size. We will make use of that in the next commit.
Use the standard "elements_num" naming, and use the "corner" name rather
than the old "loop" name: `verts_num`, `edges_num`, and `corners_num`.
This matches the existing `faces_num` field which was already renamed.
Pull Request: https://projects.blender.org/blender/blender/pulls/116350
Make the naming consistent with the recent change from "loop" to
"corner". Avoid the need for a special type for these triangles by
conveying the semantics in the naming instead.
- `looptris` -> `corner_tris`
- `lt` -> `tri` (or `corner_tri` when there is less context)
- `looptri_index` -> `tri_index` (or `corner_tri_index`)
- `lt->tri[0]` -> `tri[0]`
- `Span<MLoopTri>` -> `Span<int3>`
- `looptri_faces` -> `tri_faces` (or `corner_tri_faces`)
If we followed the naming pattern of "corner_verts" and "edge_verts"
exactly, we'd probably use "tri_corners" instead. But that sounds much
worse and less intuitive to me.
I've found that by using standard vector types for this sort of data,
the commonalities with other areas become much clearer, and code ends
up being naturally more data oriented. Besides that, the consistency
is nice, and we get to mostly remove use of `DNA_meshdata_types.h`.
Pull Request: https://projects.blender.org/blender/blender/pulls/116238
The term `looptri` was used ambiguously for both single & arrays.
The term `tri` was also used, causing `tri->tri`.
Use terms:
- `looptris` for an array or when dealing with multiple items.
- `looptri` is used when dealing with a single item.
- `lt` for a single MLoopTri variables & arguments.
This was already a convention but not followed closely.
This patches refactors the compositor File Output mechanism and
implements the file output node for the Realtime Compositor. The
refactor was done for the following reasons:
1. The existing file output mechanism relied on a global EXR image
resource where the result of each compositor execution for each
view was accumulated and stored in the global resource, until the
last view is executed, when the EXR is finally saved. Aside from
relying on global resources, this can cause effective memory leaks
since the compositor can be interrupted before the EXR is written and
closed.
2. We need common code to share between all compositors since we now
have multiple compositor implementations.
3. We needed to take the opportunity to fix some of the issues with the
existing implementation, like lossy compression of data passes,
and inability to save single values passes.
The refactor first introduced a new structure called the Compositor
Render Context. This context stores compositor information related to
the render pipeline and is persistent across all compositor executions
of all views. Its extended lifetime relative to a single compositor
execution lends itself well to store data that is accumulated across
views. The context currently has a map of File Output objects. Those
objects wrap a Render Result structure and can be used to construct
multi-view images which can then be saved after all views are executed
using the existing BKE_image_render_write function.
Minor adjustments were made to the BKE and RE modules to allow saving
using the BKE_image_render_write function. Namely, the function now
allows the use of a source image format for saving as well as the
ability to not save the render result as a render by introducing two new
default arguments. Further, for multi-layer EXR saving, the existent of
a single unnamed render layer will omit the layer name from the EXR
channel full name, and only the pass, view, and channel ID will remain.
Finally, the Render Result to Image Buffer conversion now take he number
of channels into account, instead of always assuming color channels.
The patch implements the File Output node in the Realtime Compositor
using the aforementioned mechanisms, replaces the implementation of the
CPU compositor using the same Realtime Compositor implementation, and
setup the necessary logic in the render pipeline code.
Pull Request: https://projects.blender.org/blender/blender/pulls/113982
The Realtime compositor currently relies on the GPU cache in image IDs.
That cache only supports single layer images, so multi-layer images will
be acquired without a cache, introducing significant IO bottlenecks for
the GPU compositor.
This patch ignores the image GPU cache and stores the images in the
static cache manager of the compositor. Draw data was introduced to the
image ID for proper cache invalidation, like other IDs such as masks.
The downside is that the cache will no longer be shared between EEVEE
and the compositor. But realistically, images are not typically shared
between materials and compositors.
This is just a temporary solution until we have proper GPU storage
support for image buffers.
Pull Request: https://projects.blender.org/blender/blender/pulls/115511
This gives better asserts in debug builds through use of Span, more
safety when name convention attributes happen to have different types
or domains, and simpler code in some cases. But the main reasoning is to
avoid relying on the specifics of CustomData more to allow us to replace
it in the future.
"mesh" reads much better than "me" since "me" is a different word.
There's no reason to avoid using two more characters here. Replacing
all of these at once is better than encountering it repeatedly and
doing the same change bit by bit.
Due to changes in the build environment shader_builder wasn't able to
compile on macOs. This patch reverts several recent changes to CMake files.
* dbb2844ed9
* 94817f64b9
* 1b6cd937ff
The idea is that in the near future shader_builder will run on the buildbot as
part of any regular build to ensure that changes to the CMake doesn't break
shader_builder and we only detect it after a few days.
Pull Request: https://projects.blender.org/blender/blender/pulls/115929
The experimental GPU compositor crashes when the render size is huge.
This is just due to GPU texture allocation failing. The patch fixes that
by downscaling the render result when reading, then upscaling it again
when writing. Additionally, the render size was adapted to the
downscaled size since it is used by other input nodes. This is not an
ideal solution, but it a good temporary solution to prevent crashes
until we have proper support for huge textures.
Pull Request: https://projects.blender.org/blender/blender/pulls/115299
Implement the next phases of bounds improvement design #96968.
Mainly the following changes:
Don't use `Object.runtime.bb` for performance caching volume bounds.
This is redundant with the cache in most geometry data-block types.
Instead, this becomes `Object.runtime.bounds_eval`, and is only used
where it's actually needed: syncing the bounds from the evaluated
geometry in the active depsgraph to the original object.
Remove all redundant functions to access geometry bounds with an
Object argument. These make the whole design confusing, since they
access geometry bounds at an object level.
Use `std::optional<Bounds<float3>>` to pass and store bounds instead
of an allocated `BoundBox` struct. This uses less space, avoids
small heap allocations, and generally simplifies code, since we
usually only want the min and max anyway.
After this, to avoid performance regressions, we should also cache
bounds in volumes, and maybe the legacy curve and GP data types
(though it might not be worth the effort for those legacy types).
Pull Request: https://projects.blender.org/blender/blender/pulls/114933
This patch adds support for full precision compositing for the Realtime
Compositor. A new precision option was added to the compositor to change
between half and full precision compositing, where the Auto option uses
half for the viewport compositor and the interactive render compositor,
while full is used for final renders.
The compositor context now need to implement the get_precision() method
to indicate its preferred precision. Intermediate results will be stored
using the context's precision, with a number of exceptions that can use
a different precision regardless of the context's precision. For
instance, summed area tables are always stored in full float results
even if the context specified half float. Conversely, jump flooding
tables are always stored in half integer results even if the context
specified full. The former requires full float while the latter has no
use for it.
Since shaders are created for a specific precision, we need two variants
of each compositor shader to account for the context's possible
precision. However, to avoid doubling the shader info count and reduce
boilerplate code and development time, an automated mechanism was
employed. A single shader info of whatever precision needs to be added,
then, at runtime, the shader info can be adjusted to change the
precision of the outputs. That shader variant is then cached in the
static cache manager for future processing-free shader retrieval.
Therefore, the shader manager was removed in favor of a cached shader
container in the static cache manager.
A number of utilities were added to make the creation of results as well as
the retrieval of shader with the target precision easier. Further, a
number of precision-specific shaders were removed in favor of more
generic ones that utilizes the aforementioned shader retrieval
mechanism.
Pull Request: https://projects.blender.org/blender/blender/pulls/113476
Design task: #93551
This PR replaces the auto smooth option with a geometry nodes modifier
that sets the sharp edge attribute. This solves a fair number of long-
standing problems related to auto smooth, simplifies the process of
normal computation, and allows Blender to automatically choose between
face, vertex, and face corner normals based on the sharp edge and face
attributes.
Versioning adds a geometry node group to objects with meshes that had
auto-smooth enabled. The modifier can be applied, which also improves
performance.
Auto smooth is now unnecessary to get a combination of sharp and smooth
edges. In general workflows are changed a bit. Separate procedural and
destructive workflows are available. Custom normals can be used
immediately without turning on the removed auto smooth option.
**Procedural**
The node group asset "Smooth by Angle" is the main way to set sharp
normals based on the edge angle. It can be accessed directly in the add
modifier menu. Of course the modifier can be reordered, muted, or
applied like any other, or changed internally like any geometry nodes
modifier.
**Destructive**
Often the sharp edges don't need to be dynamic. This can give better
performance since edge angles don't need to be recalculated. In edit
mode the two operators "Select Sharp Edges" and "Mark Sharp" can be
used. In other modes, the "Shade Smooth by Angle" controls the edge
sharpness directly.
### Breaking API Changes
- `use_auto_smooth` is removed. Face corner normals are now used
automatically if there are mixed smooth vs. not smooth tags. Meshes
now always use custom normals if they exist.
- In Cycles, the lack of the separate auto smooth state makes normals look
triangulated when all faces are shaded smooth.
- `auto_smooth_angle` is removed. Replaced by a modifier (or operator)
controlling the sharp edge attribute. This means the mesh itself
(without an object) doesn't know anything about automatically smoothing
by angle anymore.
- `create_normals_split`, `calc_normals_split`, and `free_normals_split`
are removed, and are replaced by the simpler `Mesh.corner_normals`
collection property. Since it gives access to the normals cache, it
is automatically updated when relevant data changes.
Addons are updated here: https://projects.blender.org/blender/blender-addons/pulls/104609
### Tests
- `geo_node_curves_test_deform_curves_on_surface` has slightly different
results because face corner normals are used instead of interpolated
vertex normals.
- `bf_wavefront_obj_tests` has different export results for one file
which mixed sharp and smooth faces without turning on auto smooth.
- `cycles_mesh_cpu` has one object which is completely flat shaded.
Previously every edge was split before rendering, now it looks triangulated.
Pull Request: https://projects.blender.org/blender/blender/pulls/108014
Currently object bounds (`object.runtime.bb`) are lazily initialized
when accessed. This access happens from arbitrary threads, and
is unprotected by a mutex. This can cause access to stale data at
best, and crashes at worst. Eager calculation is meant to keep this
working, but it's fragile.
Since e8f4010611, geometry bounds are cached in the geometry
itself, which makes this object-level cache redundant. So, it's clearer
to build the `BoundBox` from those cached bounds and return it by
value, without interacting with the object's cached bounding box.
The code change is is mostly a move from `const BoundBox *` to
`std::optional<BoundBox>`. This is only one step of a larger change
described in #96968. Followup steps would include switching to
a simpler and smaller `Bounds` type, removing redundant object-
level access, and eventually removing `object.runtime.bb`.
Access of bounds from the object for mesh, curves, and point cloud
objects should now be thread-safe. Other object types still lazily
initialize the object `BoundBox` cache since they don't have
a data-level cache.
Pull Request: https://projects.blender.org/blender/blender/pulls/113465
The experimental GPU compositor is leaking memory in any setup.
This is because the current implementation of the render texture pool
always created a new texture and only freed the textures upon deletion.
This was a temporary implementation until a proper implementation that
uses the DRW textures pool was used.
This patch implements a small texture pool as a temporary fix until the
aforementioned DRW texture pool implementation is done.
The GPU compositor crops the viewed images to the render resolution.
While the original size and content of the input to the viewer should be
retained as is.
This patch fixes that by specializing compositors that can use composite
outputs to be able to view images of any arbitrary size. This is still
missing the translation offset of the viewer, but this shall be tackled
separately.
The experimental GPU compositor always returned the first view in
multi-view rendering. This patch fixes that by also checking for the
view name of the context when searching for the appropriate pass.
Don't assume existence of GPU backend in (background) preview rendering.
Also add null pointer checks and rely on assert instead to detect
invalid usage of GPU_render_begin/end, so that potential future mistakes
don't cause crashes.
Pull Request: https://projects.blender.org/blender/blender/pulls/112971
This adds initial support for rendering Cycles and EEVEE shaders in Hydra
render engines that support MaterialX. Not all nodes are currently
supported, see the detailed compatibility list in #112864.
Co-authored-by: Georgiy Markelov <georgiy.m.markelov@gmail.com>
Co-authored-by: Vasyl Pidhirskyi <vpidhirskyi@gmail.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/111765
This simplifies running built-in IO tests with:
ctest -R bf_io_
Also use "bf_io_" prefix for the libraries since it was already used
by some and it's a useful hint the libraries are used for IO.
A mistake in some of the previous refactor which was aimed to make the
byte buffer to be stored as uint8_t. One of the array size calculation
was missing multiplication by 4 channels.
Pull Request: https://projects.blender.org/blender/blender/pulls/112508
Listing the "Blender Foundation" as copyright holder implied the Blender
Foundation holds copyright to files which may include work from many
developers.
While keeping copyright on headers makes sense for isolated libraries,
Blender's own code may be refactored or moved between files in a way
that makes the per file copyright holders less meaningful.
Copyright references to the "Blender Foundation" have been replaced with
"Blender Authors", with the exception of `./extern/` since these this
contains libraries which are more isolated, any changed to license
headers there can be handled on a case-by-case basis.
Some directories in `./intern/` have also been excluded:
- `./intern/cycles/` it's own `AUTHORS` file is planned.
- `./intern/opensubdiv/`.
An "AUTHORS" file has been added, using the chromium projects authors
file as a template.
Design task: #110784
Ref !110783.
Also semantically separate draw_lock and draw_unlock, as it
is more clear than a single method with a boolean argument.
Should be no functional changes.
Use appropriate function for converting paths. This adds an ugly
dependency on a function that is private to python/, similar to how it
is done for Freestyle. Solving that will be for a separate change.
Ref #110765
Using ClangBuildAnalyzer on the whole Blender build, it was pointing
out that BLI_math.h is the heaviest "header hub" (i.e. non tiny file
that is included a lot).
However, there's very little (actually zero) source files in Blender
that need "all the math" (base, colors, vectors, matrices,
quaternions, intersection, interpolation, statistics, solvers and
time). A common use case is source files needing just vectors, or
just vectors & matrices, or just colors etc. Actually, 181 files
were including the whole math thing without needing it at all.
This change removes BLI_math.h completely, and instead in all the
places that need it, includes BLI_math_vector.h or BLI_math_color.h
and so on.
Change from that:
- BLI_math_color.h was included 1399 times -> now 408 (took 114.0sec
to parse -> now 36.3sec)
- BLI_simd.h 1403 -> 418 (109.7sec -> 34.9sec).
Full rebuild of Blender (Apple M1, Xcode, RelWithDebInfo) is not
affected much (342sec -> 334sec). Most of benefit would be when
someone's changing BLI_simd.h or BLI_math_color.h or similar files,
that now there's 3x fewer files result in a recompile.
Pull Request #110944
Add a High Dynamic Range option in the Color Management > Display panel.
This enables display of extended color ranges above 1.0 for the 3D
viewport, image editor and render previews.
This requires a monitor that can display HDR colors, and a view
transform designed for HDR output. The Standard view transform works,
but Filmic does not as it was designed to bring values into the 0..1
range for SDR displays.
This patch is limited to allowing the display to visualize extended
colors, but does not include future looking work to better integrate HDR
into the full workflow.
It is implemented by rendering to high bit-depth texture formats for
the user interface, and uncapping the color range in color management.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/105662