tornavis/source/blender/editors/sculpt_paint/curves_sculpt_puff.cc

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

359 lines
14 KiB
C++
Raw Normal View History

/* SPDX-FileCopyrightText: 2023 Blender Authors
*
* SPDX-License-Identifier: GPL-2.0-or-later */
#include "BKE_attribute_math.hh"
#include "BKE_brush.hh"
#include "BKE_bvhutils.hh"
#include "BKE_context.hh"
#include "BKE_crazyspace.hh"
#include "BKE_mesh.hh"
#include "BKE_mesh_runtime.hh"
#include "ED_screen.hh"
#include "ED_view3d.hh"
#include "DEG_depsgraph.hh"
#include "DNA_brush_types.h"
#include "DNA_mesh_types.h"
#include "DNA_meshdata_types.h"
#include "WM_api.hh"
#include "BLI_length_parameterize.hh"
#include "BLI_math_geom.h"
#include "BLI_math_matrix.hh"
#include "BLI_task.hh"
#include "GEO_add_curves_on_mesh.hh"
#include "curves_sculpt_intern.hh"
namespace blender::ed::sculpt_paint {
class PuffOperation : public CurvesSculptStrokeOperation {
private:
/** Only used when a 3D brush is used. */
CurvesBrush3D brush_3d_;
/** Solver for length and collision constraints. */
CurvesConstraintSolver constraint_solver_;
friend struct PuffOperationExecutor;
public:
void on_stroke_extended(const bContext &C, const StrokeExtension &stroke_extension) override;
};
/**
* Utility class that actually executes the update when the stroke is updated. That's useful
* because it avoids passing a very large number of parameters between functions.
*/
struct PuffOperationExecutor {
PuffOperation *self_ = nullptr;
CurvesSculptCommonContext ctx_;
Object *object_ = nullptr;
Curves *curves_id_ = nullptr;
CurvesGeometry *curves_ = nullptr;
VArray<float> point_factors_;
BLI: refactor IndexMask for better performance and memory usage Goals of this refactor: * Reduce memory consumption of `IndexMask`. The old `IndexMask` uses an `int64_t` for each index which is more than necessary in pretty much all practical cases currently. Using `int32_t` might still become limiting in the future in case we use this to index e.g. byte buffers larger than a few gigabytes. We also don't want to template `IndexMask`, because that would cause a split in the "ecosystem", or everything would have to be implemented twice or templated. * Allow for more multi-threading. The old `IndexMask` contains a single array. This is generally good but has the problem that it is hard to fill from multiple-threads when the final size is not known from the beginning. This is commonly the case when e.g. converting an array of bool to an index mask. Currently, this kind of code only runs on a single thread. * Allow for efficient set operations like join, intersect and difference. It should be possible to multi-thread those operations. * It should be possible to iterate over an `IndexMask` very efficiently. The most important part of that is to avoid all memory access when iterating over continuous ranges. For some core nodes (e.g. math nodes), we generate optimized code for the cases of irregular index masks and simple index ranges. To achieve these goals, a few compromises had to made: * Slicing of the mask (at specific indices) and random element access is `O(log #indices)` now, but with a low constant factor. It should be possible to split a mask into n approximately equally sized parts in `O(n)` though, making the time per split `O(1)`. * Using range-based for loops does not work well when iterating over a nested data structure like the new `IndexMask`. Therefor, `foreach_*` functions with callbacks have to be used. To avoid extra code complexity at the call site, the `foreach_*` methods support multi-threading out of the box. The new data structure splits an `IndexMask` into an arbitrary number of ordered `IndexMaskSegment`. Each segment can contain at most `2^14 = 16384` indices. The indices within a segment are stored as `int16_t`. Each segment has an additional `int64_t` offset which allows storing arbitrary `int64_t` indices. This approach has the main benefits that segments can be processed/constructed individually on multiple threads without a serial bottleneck. Also it reduces the memory requirements significantly. For more details see comments in `BLI_index_mask.hh`. I did a few tests to verify that the data structure generally improves performance and does not cause regressions: * Our field evaluation benchmarks take about as much as before. This is to be expected because we already made sure that e.g. add node evaluation is vectorized. The important thing here is to check that changes to the way we iterate over the indices still allows for auto-vectorization. * Memory usage by a mask is about 1/4 of what it was before in the average case. That's mainly caused by the switch from `int64_t` to `int16_t` for indices. In the worst case, the memory requirements can be larger when there are many indices that are very far away. However, when they are far away from each other, that indicates that there aren't many indices in total. In common cases, memory usage can be way lower than 1/4 of before, because sub-ranges use static memory. * For some more specific numbers I benchmarked `IndexMask::from_bools` in `index_mask_from_selection` on 10.000.000 elements at various probabilities for `true` at every index: ``` Probability Old New 0 4.6 ms 0.8 ms 0.001 5.1 ms 1.3 ms 0.2 8.4 ms 1.8 ms 0.5 15.3 ms 3.0 ms 0.8 20.1 ms 3.0 ms 0.999 25.1 ms 1.7 ms 1 13.5 ms 1.1 ms ``` Pull Request: https://projects.blender.org/blender/blender/pulls/104629
2023-05-24 18:11:41 +02:00
IndexMaskMemory selected_curve_memory_;
IndexMask curve_selection_;
const CurvesSculpt *curves_sculpt_ = nullptr;
const Brush *brush_ = nullptr;
float brush_radius_base_re_;
float brush_radius_factor_;
float brush_strength_;
float2 brush_pos_re_;
eBrushFalloffShape falloff_shape_;
CurvesSurfaceTransforms transforms_;
Mesh: Replace auto smooth with node group Design task: #93551 This PR replaces the auto smooth option with a geometry nodes modifier that sets the sharp edge attribute. This solves a fair number of long- standing problems related to auto smooth, simplifies the process of normal computation, and allows Blender to automatically choose between face, vertex, and face corner normals based on the sharp edge and face attributes. Versioning adds a geometry node group to objects with meshes that had auto-smooth enabled. The modifier can be applied, which also improves performance. Auto smooth is now unnecessary to get a combination of sharp and smooth edges. In general workflows are changed a bit. Separate procedural and destructive workflows are available. Custom normals can be used immediately without turning on the removed auto smooth option. **Procedural** The node group asset "Smooth by Angle" is the main way to set sharp normals based on the edge angle. It can be accessed directly in the add modifier menu. Of course the modifier can be reordered, muted, or applied like any other, or changed internally like any geometry nodes modifier. **Destructive** Often the sharp edges don't need to be dynamic. This can give better performance since edge angles don't need to be recalculated. In edit mode the two operators "Select Sharp Edges" and "Mark Sharp" can be used. In other modes, the "Shade Smooth by Angle" controls the edge sharpness directly. ### Breaking API Changes - `use_auto_smooth` is removed. Face corner normals are now used automatically if there are mixed smooth vs. not smooth tags. Meshes now always use custom normals if they exist. - In Cycles, the lack of the separate auto smooth state makes normals look triangulated when all faces are shaded smooth. - `auto_smooth_angle` is removed. Replaced by a modifier (or operator) controlling the sharp edge attribute. This means the mesh itself (without an object) doesn't know anything about automatically smoothing by angle anymore. - `create_normals_split`, `calc_normals_split`, and `free_normals_split` are removed, and are replaced by the simpler `Mesh.corner_normals` collection property. Since it gives access to the normals cache, it is automatically updated when relevant data changes. Addons are updated here: https://projects.blender.org/blender/blender-addons/pulls/104609 ### Tests - `geo_node_curves_test_deform_curves_on_surface` has slightly different results because face corner normals are used instead of interpolated vertex normals. - `bf_wavefront_obj_tests` has different export results for one file which mixed sharp and smooth faces without turning on auto smooth. - `cycles_mesh_cpu` has one object which is completely flat shaded. Previously every edge was split before rendering, now it looks triangulated. Pull Request: https://projects.blender.org/blender/blender/pulls/108014
2023-10-20 16:54:08 +02:00
const Object *surface_ob_ = nullptr;
const Mesh *surface_ = nullptr;
Mesh: Move positions to a generic attribute **Changes** As described in T93602, this patch removes all use of the `MVert` struct, replacing it with a generic named attribute with the name `"position"`, consistent with other geometry types. Variable names have been changed from `verts` to `positions`, to align with the attribute name and the more generic design (positions are not vertices, they are just an attribute stored on the point domain). This change is made possible by previous commits that moved all other data out of `MVert` to runtime data or other generic attributes. What remains is mostly a simple type change. Though, the type still shows up 859 times, so the patch is quite large. One compromise is that now `CD_MASK_BAREMESH` now contains `CD_PROP_FLOAT3`. With the general move towards generic attributes over custom data types, we are removing use of these type masks anyway. **Benefits** The most obvious benefit is reduced memory usage and the benefits that brings in memory-bound situations. `float3` is only 3 bytes, in comparison to `MVert` which was 4. When there are millions of vertices this starts to matter more. The other benefits come from using a more generic type. Instead of writing algorithms specifically for `MVert`, code can just use arrays of vectors. This will allow eliminating many temporary arrays or wrappers used to extract positions. Many possible improvements aren't implemented in this patch, though I did switch simplify or remove the process of creating temporary position arrays in a few places. The design clarity that "positions are just another attribute" brings allows removing explicit copying of vertices in some procedural operations-- they are just processed like most other attributes. **Performance** This touches so many areas that it's hard to benchmark exhaustively, but I observed some areas as examples. * The mesh line node with 4 million count was 1.5x (8ms to 12ms) faster. * The Spring splash screen went from ~4.3 to ~4.5 fps. * The subdivision surface modifier/node was slightly faster RNA access through Python may be slightly slower, since now we need a name lookup instead of just a custom data type lookup for each index. **Future Improvements** * Remove uses of "vert_coords" functions: * `BKE_mesh_vert_coords_alloc` * `BKE_mesh_vert_coords_get` * `BKE_mesh_vert_coords_apply{_with_mat4}` * Remove more hidden copying of positions * General simplification now possible in many areas * Convert more code to C++ to use `float3` instead of `float[3]` * Currently `reinterpret_cast` is used for those C-API functions Differential Revision: https://developer.blender.org/D15982
2023-01-10 06:10:43 +01:00
Span<float3> surface_positions_;
Mesh: Replace MLoop struct with generic attributes Implements #102359. Split the `MLoop` struct into two separate integer arrays called `corner_verts` and `corner_edges`, referring to the vertex each corner is attached to and the next edge around the face at each corner. These arrays can be sliced to give access to the edges or vertices in a face. Then they are often referred to as "poly_verts" or "poly_edges". The main benefits are halving the necessary memory bandwidth when only one array is used and simplifications from using regular integer indices instead of a special-purpose struct. The commit also starts a renaming from "loop" to "corner" in mesh code. Like the other mesh struct of array refactors, forward compatibility is kept by writing files with the older format. This will be done until 4.0 to ease the transition process. Looking at a small portion of the patch should give a good impression for the rest of the changes. I tried to make the changes as small as possible so it's easy to tell the correctness from the diff. Though I found Blender developers have been very inventive over the last decade when finding different ways to loop over the corners in a face. For performance, nearly every piece of code that deals with `Mesh` is slightly impacted. Any algorithm that is memory bottle-necked should see an improvement. For example, here is a comparison of interpolating a vertex float attribute to face corners (Ryzen 3700x): **Before** (Average: 3.7 ms, Min: 3.4 ms) ``` threading::parallel_for(loops.index_range(), 4096, [&](IndexRange range) { for (const int64_t i : range) { dst[i] = src[loops[i].v]; } }); ``` **After** (Average: 2.9 ms, Min: 2.6 ms) ``` array_utils::gather(src, corner_verts, dst); ``` That's an improvement of 28% to the average timings, and it's also a simplification, since an index-based routine can be used instead. For more examples using the new arrays, see the design task. Pull Request: https://projects.blender.org/blender/blender/pulls/104424
2023-03-20 15:55:13 +01:00
Span<int> surface_corner_verts_;
Span<MLoopTri> surface_looptris_;
Span<float3> corner_normals_su_;
BVHTreeFromMesh surface_bvh_;
PuffOperationExecutor(const bContext &C) : ctx_(C) {}
void execute(PuffOperation &self, const bContext &C, const StrokeExtension &stroke_extension)
{
UNUSED_VARS(C, stroke_extension);
self_ = &self;
object_ = CTX_data_active_object(&C);
curves_id_ = static_cast<Curves *>(object_->data);
curves_ = &curves_id_->geometry.wrap();
if (curves_->curves_num() == 0) {
return;
}
if (curves_id_->surface == nullptr || curves_id_->surface->type != OB_MESH) {
report_missing_surface(stroke_extension.reports);
return;
}
curves_sculpt_ = ctx_.scene->toolsettings->curves_sculpt;
brush_ = BKE_paint_brush_for_read(&curves_sculpt_->paint);
brush_radius_base_re_ = BKE_brush_size_get(ctx_.scene, brush_);
brush_radius_factor_ = brush_radius_factor(*brush_, stroke_extension);
brush_strength_ = brush_strength_get(*ctx_.scene, *brush_, stroke_extension);
brush_pos_re_ = stroke_extension.mouse_position;
point_factors_ = *curves_->attributes().lookup_or_default<float>(
".selection", ATTR_DOMAIN_POINT, 1.0f);
BLI: refactor IndexMask for better performance and memory usage Goals of this refactor: * Reduce memory consumption of `IndexMask`. The old `IndexMask` uses an `int64_t` for each index which is more than necessary in pretty much all practical cases currently. Using `int32_t` might still become limiting in the future in case we use this to index e.g. byte buffers larger than a few gigabytes. We also don't want to template `IndexMask`, because that would cause a split in the "ecosystem", or everything would have to be implemented twice or templated. * Allow for more multi-threading. The old `IndexMask` contains a single array. This is generally good but has the problem that it is hard to fill from multiple-threads when the final size is not known from the beginning. This is commonly the case when e.g. converting an array of bool to an index mask. Currently, this kind of code only runs on a single thread. * Allow for efficient set operations like join, intersect and difference. It should be possible to multi-thread those operations. * It should be possible to iterate over an `IndexMask` very efficiently. The most important part of that is to avoid all memory access when iterating over continuous ranges. For some core nodes (e.g. math nodes), we generate optimized code for the cases of irregular index masks and simple index ranges. To achieve these goals, a few compromises had to made: * Slicing of the mask (at specific indices) and random element access is `O(log #indices)` now, but with a low constant factor. It should be possible to split a mask into n approximately equally sized parts in `O(n)` though, making the time per split `O(1)`. * Using range-based for loops does not work well when iterating over a nested data structure like the new `IndexMask`. Therefor, `foreach_*` functions with callbacks have to be used. To avoid extra code complexity at the call site, the `foreach_*` methods support multi-threading out of the box. The new data structure splits an `IndexMask` into an arbitrary number of ordered `IndexMaskSegment`. Each segment can contain at most `2^14 = 16384` indices. The indices within a segment are stored as `int16_t`. Each segment has an additional `int64_t` offset which allows storing arbitrary `int64_t` indices. This approach has the main benefits that segments can be processed/constructed individually on multiple threads without a serial bottleneck. Also it reduces the memory requirements significantly. For more details see comments in `BLI_index_mask.hh`. I did a few tests to verify that the data structure generally improves performance and does not cause regressions: * Our field evaluation benchmarks take about as much as before. This is to be expected because we already made sure that e.g. add node evaluation is vectorized. The important thing here is to check that changes to the way we iterate over the indices still allows for auto-vectorization. * Memory usage by a mask is about 1/4 of what it was before in the average case. That's mainly caused by the switch from `int64_t` to `int16_t` for indices. In the worst case, the memory requirements can be larger when there are many indices that are very far away. However, when they are far away from each other, that indicates that there aren't many indices in total. In common cases, memory usage can be way lower than 1/4 of before, because sub-ranges use static memory. * For some more specific numbers I benchmarked `IndexMask::from_bools` in `index_mask_from_selection` on 10.000.000 elements at various probabilities for `true` at every index: ``` Probability Old New 0 4.6 ms 0.8 ms 0.001 5.1 ms 1.3 ms 0.2 8.4 ms 1.8 ms 0.5 15.3 ms 3.0 ms 0.8 20.1 ms 3.0 ms 0.999 25.1 ms 1.7 ms 1 13.5 ms 1.1 ms ``` Pull Request: https://projects.blender.org/blender/blender/pulls/104629
2023-05-24 18:11:41 +02:00
curve_selection_ = curves::retrieve_selected_curves(*curves_id_, selected_curve_memory_);
falloff_shape_ = static_cast<eBrushFalloffShape>(brush_->falloff_shape);
surface_ob_ = curves_id_->surface;
Mesh: Replace auto smooth with node group Design task: #93551 This PR replaces the auto smooth option with a geometry nodes modifier that sets the sharp edge attribute. This solves a fair number of long- standing problems related to auto smooth, simplifies the process of normal computation, and allows Blender to automatically choose between face, vertex, and face corner normals based on the sharp edge and face attributes. Versioning adds a geometry node group to objects with meshes that had auto-smooth enabled. The modifier can be applied, which also improves performance. Auto smooth is now unnecessary to get a combination of sharp and smooth edges. In general workflows are changed a bit. Separate procedural and destructive workflows are available. Custom normals can be used immediately without turning on the removed auto smooth option. **Procedural** The node group asset "Smooth by Angle" is the main way to set sharp normals based on the edge angle. It can be accessed directly in the add modifier menu. Of course the modifier can be reordered, muted, or applied like any other, or changed internally like any geometry nodes modifier. **Destructive** Often the sharp edges don't need to be dynamic. This can give better performance since edge angles don't need to be recalculated. In edit mode the two operators "Select Sharp Edges" and "Mark Sharp" can be used. In other modes, the "Shade Smooth by Angle" controls the edge sharpness directly. ### Breaking API Changes - `use_auto_smooth` is removed. Face corner normals are now used automatically if there are mixed smooth vs. not smooth tags. Meshes now always use custom normals if they exist. - In Cycles, the lack of the separate auto smooth state makes normals look triangulated when all faces are shaded smooth. - `auto_smooth_angle` is removed. Replaced by a modifier (or operator) controlling the sharp edge attribute. This means the mesh itself (without an object) doesn't know anything about automatically smoothing by angle anymore. - `create_normals_split`, `calc_normals_split`, and `free_normals_split` are removed, and are replaced by the simpler `Mesh.corner_normals` collection property. Since it gives access to the normals cache, it is automatically updated when relevant data changes. Addons are updated here: https://projects.blender.org/blender/blender-addons/pulls/104609 ### Tests - `geo_node_curves_test_deform_curves_on_surface` has slightly different results because face corner normals are used instead of interpolated vertex normals. - `bf_wavefront_obj_tests` has different export results for one file which mixed sharp and smooth faces without turning on auto smooth. - `cycles_mesh_cpu` has one object which is completely flat shaded. Previously every edge was split before rendering, now it looks triangulated. Pull Request: https://projects.blender.org/blender/blender/pulls/108014
2023-10-20 16:54:08 +02:00
surface_ = static_cast<const Mesh *>(surface_ob_->data);
transforms_ = CurvesSurfaceTransforms(*object_, surface_ob_);
Mesh: Move positions to a generic attribute **Changes** As described in T93602, this patch removes all use of the `MVert` struct, replacing it with a generic named attribute with the name `"position"`, consistent with other geometry types. Variable names have been changed from `verts` to `positions`, to align with the attribute name and the more generic design (positions are not vertices, they are just an attribute stored on the point domain). This change is made possible by previous commits that moved all other data out of `MVert` to runtime data or other generic attributes. What remains is mostly a simple type change. Though, the type still shows up 859 times, so the patch is quite large. One compromise is that now `CD_MASK_BAREMESH` now contains `CD_PROP_FLOAT3`. With the general move towards generic attributes over custom data types, we are removing use of these type masks anyway. **Benefits** The most obvious benefit is reduced memory usage and the benefits that brings in memory-bound situations. `float3` is only 3 bytes, in comparison to `MVert` which was 4. When there are millions of vertices this starts to matter more. The other benefits come from using a more generic type. Instead of writing algorithms specifically for `MVert`, code can just use arrays of vectors. This will allow eliminating many temporary arrays or wrappers used to extract positions. Many possible improvements aren't implemented in this patch, though I did switch simplify or remove the process of creating temporary position arrays in a few places. The design clarity that "positions are just another attribute" brings allows removing explicit copying of vertices in some procedural operations-- they are just processed like most other attributes. **Performance** This touches so many areas that it's hard to benchmark exhaustively, but I observed some areas as examples. * The mesh line node with 4 million count was 1.5x (8ms to 12ms) faster. * The Spring splash screen went from ~4.3 to ~4.5 fps. * The subdivision surface modifier/node was slightly faster RNA access through Python may be slightly slower, since now we need a name lookup instead of just a custom data type lookup for each index. **Future Improvements** * Remove uses of "vert_coords" functions: * `BKE_mesh_vert_coords_alloc` * `BKE_mesh_vert_coords_get` * `BKE_mesh_vert_coords_apply{_with_mat4}` * Remove more hidden copying of positions * General simplification now possible in many areas * Convert more code to C++ to use `float3` instead of `float[3]` * Currently `reinterpret_cast` is used for those C-API functions Differential Revision: https://developer.blender.org/D15982
2023-01-10 06:10:43 +01:00
surface_positions_ = surface_->vert_positions();
Mesh: Replace MLoop struct with generic attributes Implements #102359. Split the `MLoop` struct into two separate integer arrays called `corner_verts` and `corner_edges`, referring to the vertex each corner is attached to and the next edge around the face at each corner. These arrays can be sliced to give access to the edges or vertices in a face. Then they are often referred to as "poly_verts" or "poly_edges". The main benefits are halving the necessary memory bandwidth when only one array is used and simplifications from using regular integer indices instead of a special-purpose struct. The commit also starts a renaming from "loop" to "corner" in mesh code. Like the other mesh struct of array refactors, forward compatibility is kept by writing files with the older format. This will be done until 4.0 to ease the transition process. Looking at a small portion of the patch should give a good impression for the rest of the changes. I tried to make the changes as small as possible so it's easy to tell the correctness from the diff. Though I found Blender developers have been very inventive over the last decade when finding different ways to loop over the corners in a face. For performance, nearly every piece of code that deals with `Mesh` is slightly impacted. Any algorithm that is memory bottle-necked should see an improvement. For example, here is a comparison of interpolating a vertex float attribute to face corners (Ryzen 3700x): **Before** (Average: 3.7 ms, Min: 3.4 ms) ``` threading::parallel_for(loops.index_range(), 4096, [&](IndexRange range) { for (const int64_t i : range) { dst[i] = src[loops[i].v]; } }); ``` **After** (Average: 2.9 ms, Min: 2.6 ms) ``` array_utils::gather(src, corner_verts, dst); ``` That's an improvement of 28% to the average timings, and it's also a simplification, since an index-based routine can be used instead. For more examples using the new arrays, see the design task. Pull Request: https://projects.blender.org/blender/blender/pulls/104424
2023-03-20 15:55:13 +01:00
surface_corner_verts_ = surface_->corner_verts();
surface_looptris_ = surface_->looptris();
Mesh: Replace auto smooth with node group Design task: #93551 This PR replaces the auto smooth option with a geometry nodes modifier that sets the sharp edge attribute. This solves a fair number of long- standing problems related to auto smooth, simplifies the process of normal computation, and allows Blender to automatically choose between face, vertex, and face corner normals based on the sharp edge and face attributes. Versioning adds a geometry node group to objects with meshes that had auto-smooth enabled. The modifier can be applied, which also improves performance. Auto smooth is now unnecessary to get a combination of sharp and smooth edges. In general workflows are changed a bit. Separate procedural and destructive workflows are available. Custom normals can be used immediately without turning on the removed auto smooth option. **Procedural** The node group asset "Smooth by Angle" is the main way to set sharp normals based on the edge angle. It can be accessed directly in the add modifier menu. Of course the modifier can be reordered, muted, or applied like any other, or changed internally like any geometry nodes modifier. **Destructive** Often the sharp edges don't need to be dynamic. This can give better performance since edge angles don't need to be recalculated. In edit mode the two operators "Select Sharp Edges" and "Mark Sharp" can be used. In other modes, the "Shade Smooth by Angle" controls the edge sharpness directly. ### Breaking API Changes - `use_auto_smooth` is removed. Face corner normals are now used automatically if there are mixed smooth vs. not smooth tags. Meshes now always use custom normals if they exist. - In Cycles, the lack of the separate auto smooth state makes normals look triangulated when all faces are shaded smooth. - `auto_smooth_angle` is removed. Replaced by a modifier (or operator) controlling the sharp edge attribute. This means the mesh itself (without an object) doesn't know anything about automatically smoothing by angle anymore. - `create_normals_split`, `calc_normals_split`, and `free_normals_split` are removed, and are replaced by the simpler `Mesh.corner_normals` collection property. Since it gives access to the normals cache, it is automatically updated when relevant data changes. Addons are updated here: https://projects.blender.org/blender/blender-addons/pulls/104609 ### Tests - `geo_node_curves_test_deform_curves_on_surface` has slightly different results because face corner normals are used instead of interpolated vertex normals. - `bf_wavefront_obj_tests` has different export results for one file which mixed sharp and smooth faces without turning on auto smooth. - `cycles_mesh_cpu` has one object which is completely flat shaded. Previously every edge was split before rendering, now it looks triangulated. Pull Request: https://projects.blender.org/blender/blender/pulls/108014
2023-10-20 16:54:08 +02:00
corner_normals_su_ = surface_->corner_normals();
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 18:56:34 +02:00
BKE_bvhtree_from_mesh_get(&surface_bvh_, surface_, BVHTREE_FROM_LOOPTRI, 2);
BLI_SCOPED_DEFER([&]() { free_bvhtree_from_mesh(&surface_bvh_); });
if (stroke_extension.is_first) {
if (falloff_shape_ == PAINT_FALLOFF_SHAPE_SPHERE) {
self.brush_3d_ = *sample_curves_3d_brush(*ctx_.depsgraph,
*ctx_.region,
*ctx_.v3d,
*ctx_.rv3d,
*object_,
brush_pos_re_,
brush_radius_base_re_);
}
self_->constraint_solver_.initialize(
*curves_, curve_selection_, curves_id_->flag & CV_SCULPT_COLLISION_ENABLED);
}
Array<float> curve_weights(curves_->curves_num());
if (falloff_shape_ == PAINT_FALLOFF_SHAPE_TUBE) {
this->find_curve_weights_projected_with_symmetry(curve_weights);
}
else if (falloff_shape_ == PAINT_FALLOFF_SHAPE_SPHERE) {
this->find_curves_weights_spherical_with_symmetry(curve_weights);
}
else {
BLI_assert_unreachable();
}
BLI: refactor IndexMask for better performance and memory usage Goals of this refactor: * Reduce memory consumption of `IndexMask`. The old `IndexMask` uses an `int64_t` for each index which is more than necessary in pretty much all practical cases currently. Using `int32_t` might still become limiting in the future in case we use this to index e.g. byte buffers larger than a few gigabytes. We also don't want to template `IndexMask`, because that would cause a split in the "ecosystem", or everything would have to be implemented twice or templated. * Allow for more multi-threading. The old `IndexMask` contains a single array. This is generally good but has the problem that it is hard to fill from multiple-threads when the final size is not known from the beginning. This is commonly the case when e.g. converting an array of bool to an index mask. Currently, this kind of code only runs on a single thread. * Allow for efficient set operations like join, intersect and difference. It should be possible to multi-thread those operations. * It should be possible to iterate over an `IndexMask` very efficiently. The most important part of that is to avoid all memory access when iterating over continuous ranges. For some core nodes (e.g. math nodes), we generate optimized code for the cases of irregular index masks and simple index ranges. To achieve these goals, a few compromises had to made: * Slicing of the mask (at specific indices) and random element access is `O(log #indices)` now, but with a low constant factor. It should be possible to split a mask into n approximately equally sized parts in `O(n)` though, making the time per split `O(1)`. * Using range-based for loops does not work well when iterating over a nested data structure like the new `IndexMask`. Therefor, `foreach_*` functions with callbacks have to be used. To avoid extra code complexity at the call site, the `foreach_*` methods support multi-threading out of the box. The new data structure splits an `IndexMask` into an arbitrary number of ordered `IndexMaskSegment`. Each segment can contain at most `2^14 = 16384` indices. The indices within a segment are stored as `int16_t`. Each segment has an additional `int64_t` offset which allows storing arbitrary `int64_t` indices. This approach has the main benefits that segments can be processed/constructed individually on multiple threads without a serial bottleneck. Also it reduces the memory requirements significantly. For more details see comments in `BLI_index_mask.hh`. I did a few tests to verify that the data structure generally improves performance and does not cause regressions: * Our field evaluation benchmarks take about as much as before. This is to be expected because we already made sure that e.g. add node evaluation is vectorized. The important thing here is to check that changes to the way we iterate over the indices still allows for auto-vectorization. * Memory usage by a mask is about 1/4 of what it was before in the average case. That's mainly caused by the switch from `int64_t` to `int16_t` for indices. In the worst case, the memory requirements can be larger when there are many indices that are very far away. However, when they are far away from each other, that indicates that there aren't many indices in total. In common cases, memory usage can be way lower than 1/4 of before, because sub-ranges use static memory. * For some more specific numbers I benchmarked `IndexMask::from_bools` in `index_mask_from_selection` on 10.000.000 elements at various probabilities for `true` at every index: ``` Probability Old New 0 4.6 ms 0.8 ms 0.001 5.1 ms 1.3 ms 0.2 8.4 ms 1.8 ms 0.5 15.3 ms 3.0 ms 0.8 20.1 ms 3.0 ms 0.999 25.1 ms 1.7 ms 1 13.5 ms 1.1 ms ``` Pull Request: https://projects.blender.org/blender/blender/pulls/104629
2023-05-24 18:11:41 +02:00
IndexMaskMemory memory;
const IndexMask curves_mask = IndexMask::from_predicate(
curve_selection_, GrainSize(4096), memory, [&](const int64_t curve_i) {
return curve_weights[curve_i] > 0.0f;
});
this->puff(curves_mask, curve_weights);
self_->constraint_solver_.solve_step(*curves_, curves_mask, surface_, transforms_);
curves_->tag_positions_changed();
DEG_id_tag_update(&curves_id_->id, ID_RECALC_GEOMETRY);
WM_main_add_notifier(NC_GEOM | ND_DATA, &curves_id_->id);
ED_region_tag_redraw(ctx_.region);
}
void find_curve_weights_projected_with_symmetry(MutableSpan<float> r_curve_weights)
{
const Vector<float4x4> symmetry_brush_transforms = get_symmetry_brush_transforms(
eCurvesSymmetryType(curves_id_->symmetry));
for (const float4x4 &brush_transform : symmetry_brush_transforms) {
this->find_curve_weights_projected(brush_transform, r_curve_weights);
}
}
void find_curve_weights_projected(const float4x4 &brush_transform,
MutableSpan<float> r_curve_weights)
{
const float4x4 brush_transform_inv = math::invert(brush_transform);
float4x4 projection;
ED_view3d_ob_project_mat_get(ctx_.rv3d, object_, projection.ptr());
const float brush_radius_re = brush_radius_base_re_ * brush_radius_factor_;
const float brush_radius_sq_re = pow2f(brush_radius_re);
const bke::crazyspace::GeometryDeformation deformation =
bke::crazyspace::get_evaluated_curves_deformation(*ctx_.depsgraph, *object_);
const OffsetIndices points_by_curve = curves_->points_by_curve();
curve_selection_.foreach_index(GrainSize(256), [&](const int64_t curve_i) {
const IndexRange points = points_by_curve[curve_i];
const float3 first_pos_cu = math::transform_point(brush_transform_inv,
deformation.positions[points[0]]);
float2 prev_pos_re;
ED_view3d_project_float_v2_m4(ctx_.region, first_pos_cu, prev_pos_re, projection.ptr());
float max_weight = 0.0f;
for (const int point_i : points.drop_front(1)) {
const float3 pos_cu = math::transform_point(brush_transform_inv,
deformation.positions[point_i]);
float2 pos_re;
ED_view3d_project_float_v2_m4(ctx_.region, pos_cu, pos_re, projection.ptr());
BLI_SCOPED_DEFER([&]() { prev_pos_re = pos_re; });
const float dist_to_brush_sq_re = dist_squared_to_line_segment_v2(
brush_pos_re_, prev_pos_re, pos_re);
if (dist_to_brush_sq_re > brush_radius_sq_re) {
continue;
}
const float dist_to_brush_re = std::sqrt(dist_to_brush_sq_re);
const float radius_falloff = BKE_brush_curve_strength(
brush_, dist_to_brush_re, brush_radius_re);
math::max_inplace(max_weight, radius_falloff);
}
r_curve_weights[curve_i] = max_weight;
});
}
void find_curves_weights_spherical_with_symmetry(MutableSpan<float> r_curve_weights)
{
float4x4 projection;
ED_view3d_ob_project_mat_get(ctx_.rv3d, object_, projection.ptr());
float3 brush_pos_wo;
ED_view3d_win_to_3d(
ctx_.v3d,
ctx_.region,
math::transform_point(transforms_.curves_to_world, self_->brush_3d_.position_cu),
brush_pos_re_,
brush_pos_wo);
const float3 brush_pos_cu = math::transform_point(transforms_.world_to_curves, brush_pos_wo);
const float brush_radius_cu = self_->brush_3d_.radius_cu * brush_radius_factor_;
const Vector<float4x4> symmetry_brush_transforms = get_symmetry_brush_transforms(
eCurvesSymmetryType(curves_id_->symmetry));
for (const float4x4 &brush_transform : symmetry_brush_transforms) {
this->find_curves_weights_spherical(
math::transform_point(brush_transform, brush_pos_cu), brush_radius_cu, r_curve_weights);
}
}
void find_curves_weights_spherical(const float3 &brush_pos_cu,
const float brush_radius_cu,
MutableSpan<float> r_curve_weights)
{
const float brush_radius_sq_cu = pow2f(brush_radius_cu);
const bke::crazyspace::GeometryDeformation deformation =
bke::crazyspace::get_evaluated_curves_deformation(*ctx_.depsgraph, *object_);
const OffsetIndices points_by_curve = curves_->points_by_curve();
curve_selection_.foreach_index(GrainSize(256), [&](const int64_t curve_i) {
const IndexRange points = points_by_curve[curve_i];
float max_weight = 0.0f;
for (const int point_i : points.drop_front(1)) {
const float3 &prev_pos_cu = deformation.positions[point_i - 1];
const float3 &pos_cu = deformation.positions[point_i];
const float dist_to_brush_sq_cu = dist_squared_to_line_segment_v3(
brush_pos_cu, prev_pos_cu, pos_cu);
if (dist_to_brush_sq_cu > brush_radius_sq_cu) {
continue;
}
const float dist_to_brush_cu = std::sqrt(dist_to_brush_sq_cu);
const float radius_falloff = BKE_brush_curve_strength(
brush_, dist_to_brush_cu, brush_radius_cu);
math::max_inplace(max_weight, radius_falloff);
}
r_curve_weights[curve_i] = max_weight;
});
}
void puff(const IndexMask &selection, const Span<float> curve_weights)
{
const OffsetIndices points_by_curve = curves_->points_by_curve();
MutableSpan<float3> positions_cu = curves_->positions_for_write();
selection.foreach_segment(GrainSize(256), [&](IndexMaskSegment segment) {
Vector<float> accumulated_lengths_cu;
for (const int curve_i : segment) {
const IndexRange points = points_by_curve[curve_i];
const int first_point_i = points[0];
const float3 first_pos_cu = positions_cu[first_point_i];
const float3 first_pos_su = math::transform_point(transforms_.curves_to_surface,
first_pos_cu);
/* Find the nearest position on the surface. The curve will be aligned to the normal of
* that point. */
BVHTreeNearest nearest;
nearest.dist_sq = FLT_MAX;
BLI_bvhtree_find_nearest(surface_bvh_.tree,
first_pos_su,
&nearest,
surface_bvh_.nearest_callback,
&surface_bvh_);
const MLoopTri &looptri = surface_looptris_[nearest.index];
const float3 closest_pos_su = nearest.co;
Mesh: Replace MLoop struct with generic attributes Implements #102359. Split the `MLoop` struct into two separate integer arrays called `corner_verts` and `corner_edges`, referring to the vertex each corner is attached to and the next edge around the face at each corner. These arrays can be sliced to give access to the edges or vertices in a face. Then they are often referred to as "poly_verts" or "poly_edges". The main benefits are halving the necessary memory bandwidth when only one array is used and simplifications from using regular integer indices instead of a special-purpose struct. The commit also starts a renaming from "loop" to "corner" in mesh code. Like the other mesh struct of array refactors, forward compatibility is kept by writing files with the older format. This will be done until 4.0 to ease the transition process. Looking at a small portion of the patch should give a good impression for the rest of the changes. I tried to make the changes as small as possible so it's easy to tell the correctness from the diff. Though I found Blender developers have been very inventive over the last decade when finding different ways to loop over the corners in a face. For performance, nearly every piece of code that deals with `Mesh` is slightly impacted. Any algorithm that is memory bottle-necked should see an improvement. For example, here is a comparison of interpolating a vertex float attribute to face corners (Ryzen 3700x): **Before** (Average: 3.7 ms, Min: 3.4 ms) ``` threading::parallel_for(loops.index_range(), 4096, [&](IndexRange range) { for (const int64_t i : range) { dst[i] = src[loops[i].v]; } }); ``` **After** (Average: 2.9 ms, Min: 2.6 ms) ``` array_utils::gather(src, corner_verts, dst); ``` That's an improvement of 28% to the average timings, and it's also a simplification, since an index-based routine can be used instead. For more examples using the new arrays, see the design task. Pull Request: https://projects.blender.org/blender/blender/pulls/104424
2023-03-20 15:55:13 +01:00
const float3 &v0_su = surface_positions_[surface_corner_verts_[looptri.tri[0]]];
const float3 &v1_su = surface_positions_[surface_corner_verts_[looptri.tri[1]]];
const float3 &v2_su = surface_positions_[surface_corner_verts_[looptri.tri[2]]];
float3 bary_coords;
interp_weights_tri_v3(bary_coords, v0_su, v1_su, v2_su, closest_pos_su);
const float3 normal_su = geometry::compute_surface_point_normal(
looptri, bary_coords, corner_normals_su_);
const float3 normal_cu = math::normalize(
math::transform_direction(transforms_.surface_to_curves_normal, normal_su));
accumulated_lengths_cu.reinitialize(points.size() - 1);
length_parameterize::accumulate_lengths<float3>(
positions_cu.slice(points), false, accumulated_lengths_cu);
/* Align curve to the surface normal while making sure that the curve does not fold up much
* in the process (e.g. when the curve was pointing in the opposite direction before). */
for (const int i : IndexRange(points.size()).drop_front(1)) {
const int point_i = points[i];
const float3 old_pos_cu = positions_cu[point_i];
/* Compute final position of the point. */
const float length_param_cu = accumulated_lengths_cu[i - 1];
const float3 goal_pos_cu = first_pos_cu + length_param_cu * normal_cu;
const float weight = 0.01f * brush_strength_ * point_factors_[point_i] *
curve_weights[curve_i];
float3 new_pos_cu = math::interpolate(old_pos_cu, goal_pos_cu, weight);
/* Make sure the point does not move closer to the root point than it was initially. This
* makes the curve kind of "rotate up". */
const float old_dist_to_root_cu = math::distance(old_pos_cu, first_pos_cu);
const float new_dist_to_root_cu = math::distance(new_pos_cu, first_pos_cu);
if (new_dist_to_root_cu < old_dist_to_root_cu) {
const float3 offset = math::normalize(new_pos_cu - first_pos_cu);
new_pos_cu += (old_dist_to_root_cu - new_dist_to_root_cu) * offset;
}
positions_cu[point_i] = new_pos_cu;
}
}
});
}
};
void PuffOperation::on_stroke_extended(const bContext &C, const StrokeExtension &stroke_extension)
{
PuffOperationExecutor executor{C};
executor.execute(*this, C, stroke_extension);
}
std::unique_ptr<CurvesSculptStrokeOperation> new_puff_operation()
{
return std::make_unique<PuffOperation>();
}
} // namespace blender::ed::sculpt_paint