tornavis/source/blender/compositor/COM_compositor.hh

/* SPDX-FileCopyrightText: 2011 Blender Authors
*
* SPDX-License-Identifier: GPL-2.0-or-later */
#pragma once
#include "DNA_color_types.h"
#include "DNA_node_types.h"
namespace blender::realtime_compositor {
class RenderContext;
}
struct Render;
/* Keep ascii art. */
/* clang-format off */
/**
* \defgroup Model The data model of the compositor
* \ingroup compositor
* \defgroup Memory The memory management stuff
* \ingroup compositor
* \defgroup Execution The execution logic
* \ingroup compositor
* \defgroup Conversion Conversion logic
* \ingroup compositor
* \defgroup Node All nodes of the compositor
* \ingroup compositor
* \defgroup Operation All operations of the compositor
* \ingroup compositor
*
* \page Introduction of the Blender Compositor
*
* \section bcomp Blender compositor
* This project redesigned the internals of Blender's compositor.
* The project was carried out in 2011 by At Mind,
* a technology company located in Amsterdam, The Netherlands.
* The project was crowd-funded. The code has been released under GPL2 to be used in Blender.
*
* \section goals The goals of the project
* The new compositor has two goals:
* - Make a faster compositor (speed of calculation)
* - Make the compositor work faster for you (workflow)
*
* \section speed Faster compositor
* The speedup comes from making better use of the hardware Blender is running on.
* The previous compositor used a single thread to calculate a node;
* the only exception was the Defocus node.
* A second thread was used only when two full nodes could be calculated in parallel.
* Current workstations have 8-16 threads available, and most of the time these are idle.
*
* The new compositor aims to use as many threads as possible.
* OpenCL-capable GPU hardware can also be used for calculation.
*
* \section workflow Work faster
* The previous compositor only showed the final image.
* The user could wait a long time before seeing the result of their work.
* The new compositor focuses on getting information back to the user:
* it prioritizes its work to give feedback as early as possible.
*
* \page memory Memory model
* The main issue is the type of memory model to use.
* Blender is used by consumers and professionals,
* on machines ranging from low-end to very high-end.
* The system should work on both.
*
* \page executing Executing
* \section prepare Prepare execution
*
* During the preparation of the execution, every ReadBufferOperation receives an offset.
* This offset is used during execution as an optimization.
* Next, all operations are initialized for execution \see NodeOperation.init_execution
* Then all ExecutionGroups are initialized for execution \see ExecutionGroup.init_execution
* All of this is controlled from \see ExecutionSystem.execute
*
* \section priority Render priority
* Render priority is the priority of an output node.
* During rendering a user needs different priorities for output nodes
* than during editing.
* For example, the active ViewerNode has top priority during editing,
* but during rendering a CompositeNode has.
* Every NodeOperation has a render-priority setting,
* but it only has an effect for output NodeOperations.
* In ExecutionSystem.execute all priorities are checked.
* For every priority, each ExecutionGroup is checked to see whether its
* priority matches.
* On a match the ExecutionGroup is executed (this happens serially).
*
* \see ExecutionSystem.execute control of the Render priority
* \see NodeOperation.get_render_priority receive the render priority
* \see ExecutionGroup.execute the main loop to execute a whole ExecutionGroup
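*
* The priority loop described in this section can be sketched as follows. This is only an
* illustration under stated assumptions, not the code of ExecutionSystem.execute: the names
* `Priority`, `OutputGroup` and `execute_by_priority` are hypothetical, and "executing" a group
* is reduced to recording its name.
*
```cpp
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical priority levels, listed from most to least urgent.
enum class Priority { High, Medium, Low };

// Hypothetical stand-in for an ExecutionGroup that ends in an output operation.
struct OutputGroup {
  std::string name;
  Priority priority;
};

// Visit every priority level in turn and run the groups that match it, one after another.
// Returns the group names in the order in which they were executed.
inline std::vector<std::string> execute_by_priority(const std::vector<OutputGroup> &groups)
{
  std::vector<std::string> executed;
  for (const Priority priority : {Priority::High, Priority::Medium, Priority::Low}) {
    for (const OutputGroup &group : groups) {
      if (group.priority == priority) {
        executed.push_back(group.name);  // Stands in for ExecutionGroup.execute.
      }
    }
  }
  return executed;
}
```
*
* With this loop a group with `Priority::High` always runs before the others, regardless of its
* position in the list.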
*
* \section order Chunk order
*
* When an ExecutionGroup is executed, first the order of the chunks is determined.
* The settings are stored in the ViewerNode inside the ExecutionGroup.
* ExecutionGroups that have no viewer-node
* will use a default one.
* There are several possible chunk orders:
* - [@ref ChunkOrdering.CenterOut]:
* Start calculating from a configurable point and order by nearest chunk.
* - [@ref ChunkOrdering.Random]:
* Randomize all chunks.
* - [@ref ChunkOrdering.TopDown]:
* Start calculation from the bottom to the top of the image.
* - [@ref ChunkOrdering.RuleOfThirds]:
* Experimental order based on 9 hot-spots in the image.
*
* Once the chunk order is determined, the first few chunks are checked to see whether they can be scheduled.
* Chunks can have three states:
* - [@ref eWorkPackageState.NotScheduled]:
* Chunk is not yet scheduled, or dependencies are not met.
* - [@ref eWorkPackageState.Scheduled]:
* All dependencies are met, chunk is scheduled, but not finished.
* - [@ref eWorkPackageState.Executed]:
* Chunk is finished.
*
* \see ExecutionGroup.execute
* \see ViewerOperation.get_chunk_order
* \see ChunkOrdering
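*
* As an illustration of the CenterOut order, the sketch below sorts chunk indices by their distance
* to a configurable point. It is a simplified example and not the code used by ExecutionGroup:
* `ChunkCenter` and `center_out_order` are hypothetical names, and each chunk is assumed to be
* described by the center of its rectangle.
*
```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical description of a chunk: the center of its rectangle, in pixels.
struct ChunkCenter {
  float x;
  float y;
};

// Return the chunk indices sorted by squared distance to (center_x, center_y), nearest first.
// std::stable_sort keeps the original order for chunks that are equally far away.
inline std::vector<int> center_out_order(const std::vector<ChunkCenter> &chunks,
                                         const float center_x,
                                         const float center_y)
{
  std::vector<int> order(chunks.size());
  std::iota(order.begin(), order.end(), 0);  // Start from the default order 0, 1, 2, ...
  const auto distance_squared = [&](const int index) {
    const float dx = chunks[index].x - center_x;
    const float dy = chunks[index].y - center_y;
    return dx * dx + dy * dy;
  };
  std::stable_sort(order.begin(), order.end(), [&](const int a, const int b) {
    return distance_squared(a) < distance_squared(b);
  });
  return order;
}
```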
*
* \section interest Area of interest
* An ExecutionGroup can depend on other ExecutionGroups.
* Data passed from one ExecutionGroup to another is stored in 'chunks'.
* If not all input chunks are available, the chunk will not be scheduled for execution.
* <pre>
* +-------------------------------------+ +--------------------------------------+
* | ExecutionGroup A | | ExecutionGroup B |
* | +----------------+ +-------------+ | | +------------+ +-----------------+ |
* | | NodeOperation a| | WriteBuffer | | | | ReadBuffer | | ViewerOperation | |
* | | *==* Operation | | | | Operation *===* | |
* | | | | | | | | | | | |
* | +----------------+ +-------------+ | | +------------+ +-----------------+ |
* | | | | | |
* +--------------------------------|----+ +---|----------------------------------+
* | |
* | |
* +---------------------------+
* | MemoryProxy |
* | +----------+ +---------+ |
* | | Chunk a | | Chunk b | |
* | | | | | |
* | +----------+ +---------+ |
* | |
* +---------------------------+
* </pre>
*
* In the above example ExecutionGroup B has an output operation (ViewerOperation)
* and is being executed.
* The first chunk is evaluated [@ref ExecutionGroup.schedule_chunk_when_possible],
* but not all input chunks are available.
* The relevant ExecutionGroup (the one that can calculate the missing chunks; ExecutionGroup A)
* is asked to calculate the area ExecutionGroup B is missing
* [@ref ExecutionGroup.schedule_area_when_possible].
* ExecutionGroup A checks which chunks the area spans, and tries to schedule these chunks.
* If all input data is available, these chunks are scheduled [@ref ExecutionGroup.schedule_chunk].
*
* <pre>
*
* +-------------------------+ +----------------+ +----------------+
* | ExecutionSystem.execute | | ExecutionGroup | | ExecutionGroup |
* +-------------------------+ | (B) | | (A) |
* O +----------------+ +----------------+
* O | |
* O ExecutionGroup.execute | |
* O------------------------------->O |
* . O |
* . O-------\ |
* . . | ExecutionGroup.schedule_chunk_when_possible
* . . O----/ (*) |
* . . O |
* . . O |
* . . O ExecutionGroup.schedule_area_when_possible|
* . . O---------------------------------------->O
* . . . O----------\ ExecutionGroup.schedule_chunk_when_possible
* . . . . | (*)
* . . . . O-------/
* . . . . O
* . . . . O
* . . . . O-------\ ExecutionGroup.schedule_chunk
* . . . . . |
* . . . . . O----/
* . . . . O<=O
* . . . O<=O
* . . . O
* . . O<========================================O
* . . O |
* . O<=O |
* . O |
* . O |
* </pre>
*
* This happens until all chunks of ExecutionGroup B have finished executing or the user interrupts the process.
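*
* The call sequence in the diagram above can be sketched as a small recursion. This is a heavily
* simplified illustration, not the real ExecutionGroup: the `Group` struct is hypothetical, chunk i
* is assumed to need only chunk i of the input group, and `schedule_chunk` runs the chunk
* immediately instead of handing it to the WorkScheduler (so the Scheduled state is never observed).
*
```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Chunk states as listed in the chunk order section.
enum class ChunkState { NotScheduled, Scheduled, Executed };

// Hypothetical, simplified ExecutionGroup: a row of chunks and at most one input group.
struct Group {
  std::vector<ChunkState> chunks;
  Group *input = nullptr;
  std::vector<std::size_t> execution_log;  // Chunk indices in the order they were executed.
};

// Stands in for ExecutionGroup.schedule_chunk. A real scheduler would queue the chunk and mark
// it Scheduled; this sketch runs it right away and marks it Executed.
inline void schedule_chunk(Group &group, const std::size_t chunk)
{
  group.execution_log.push_back(chunk);
  group.chunks[chunk] = ChunkState::Executed;
}

inline bool schedule_chunk_when_possible(Group &group, std::size_t chunk);

// Ask `group` to compute the chunks first..last (inclusive). Returns true when all are done.
inline bool schedule_area_when_possible(Group &group, const std::size_t first, const std::size_t last)
{
  bool all_done = true;
  for (std::size_t chunk = first; chunk <= last; chunk++) {
    all_done = schedule_chunk_when_possible(group, chunk) && all_done;
  }
  return all_done;
}

// Schedule a chunk once its input is available, asking the input group for missing data first.
inline bool schedule_chunk_when_possible(Group &group, const std::size_t chunk)
{
  assert(chunk < group.chunks.size());
  if (group.chunks[chunk] == ChunkState::Executed) {
    return true;
  }
  if (group.input && !schedule_area_when_possible(*group.input, chunk, chunk)) {
    return false;  // Input data is still missing; the caller has to try again later.
  }
  schedule_chunk(group, chunk);
  return true;
}
```
*
* Requesting a chunk of group B first triggers the matching chunk of group A and leaves the other
* chunks of A untouched.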
*
* A NodeOperation such as the ScaleOperation can influence the area of interest by reimplementing the
* [@ref NodeOperation.determine_area_of_interest] method.
*
* <pre>
*
* +--------------------------+ +---------------------------------+
* | ExecutionGroup A | | ExecutionGroup B |
* | | | |
* +--------------------------+ +---------------------------------+
* Needed chunks from ExecutionGroup A | Chunk of ExecutionGroup B (to be evaluated)
* +-------+ +-------+ | +--------+
* |Chunk 1| |Chunk 2| +----------------+ |Chunk 1 |
* | | | | | ScaleOperation | | |
* +-------+ +-------+ +----------------+ +--------+
*
* +-------+ +-------+
* |Chunk 3| |Chunk 4|
* | | | |
* +-------+ +-------+
*
* </pre>
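*
* The figure above, where one chunk of ExecutionGroup B needs four chunks of ExecutionGroup A, can
* be reproduced with a small calculation. The sketch below is an assumption-laden illustration and
* not the ScaleOperation code: `Rect`, `scale_area_of_interest` and `count_spanned_chunks` are
* hypothetical names, scaling is assumed to happen around the origin, and the extra margin a
* filtering sampler would need is ignored.
*
```cpp
#include <cassert>
#include <cmath>

// Hypothetical half-open pixel rectangle: xmin <= x < xmax and ymin <= y < ymax.
struct Rect {
  int xmin, xmax, ymin, ymax;
};

// Input area needed to compute `output` when the image is scaled by `factor` around the origin.
// An output pixel at x reads the input at x / factor, so the rectangle is divided by the factor.
inline Rect scale_area_of_interest(const Rect &output, const float factor)
{
  assert(factor > 0.0f);
  return Rect{int(std::floor(output.xmin / factor)),
              int(std::ceil(output.xmax / factor)),
              int(std::floor(output.ymin / factor)),
              int(std::ceil(output.ymax / factor))};
}

// Number of square chunks with edge length `chunk_size` that `area` touches.
inline int count_spanned_chunks(const Rect &area, const int chunk_size)
{
  assert(chunk_size > 0 && area.xmin >= 0 && area.ymin >= 0);
  assert(area.xmax > area.xmin && area.ymax > area.ymin);
  const int columns = (area.xmax - 1) / chunk_size - area.xmin / chunk_size + 1;
  const int rows = (area.ymax - 1) / chunk_size - area.ymin / chunk_size + 1;
  return columns * rows;
}
```
*
* For a 256 pixel chunk and a scale factor of 0.5, the needed input area is 512 by 512 pixels,
* which spans four chunks of the input group.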
*
* \see ExecutionGroup.execute Execute a complete ExecutionGroup.
* Blocks until finished or interrupted by the user
* \see ExecutionGroup.schedule_chunk_when_possible Tries to schedule a single chunk,
* checks if all input data is available. Can trigger dependent chunks to be calculated
* \see ExecutionGroup.schedule_area_when_possible
* Tries to schedule an area. This can be multiple chunks
* (is called from [@ref ExecutionGroup.schedule_chunk_when_possible])
* \see ExecutionGroup.schedule_chunk Schedule a chunk on the WorkScheduler
* \see NodeOperation.determine_depending_area_of_interest Influence the area of interest of a chunk.
* \see WriteBufferOperation Operation to write to a MemoryProxy/MemoryBuffer
* \see ReadBufferOperation Operation to read from a MemoryProxy/MemoryBuffer
* \see MemoryProxy Proxy for information about an image in memory
* (an image consists of multiple chunks)
* \see MemoryBuffer Allocated memory for a single chunk
*
* \section workscheduler WorkScheduler
* The WorkScheduler is implemented as a static class. The responsibility of the WorkScheduler
* is to distribute WorkPackages over the available and free devices.
* The work-scheduler can work in two modes.
* To switch between these modes you need to recompile Blender.
*
* \subsection multithread Multi threaded
* By default the work-scheduler places all work in a queue as WorkPackages.
* For every CPU core a worker thread is created.
* These worker threads ask the WorkScheduler whether there is work
* for a specific Device.
* The work-scheduler finds work for the device, and the device
* is asked to execute the WorkPackage.
*
* \subsection singlethread Single threaded
* For debugging purposes multi-threading can be disabled.
* This is done by changing `COM_threading_model`
* to `ThreadingModel::SingleThreaded`. When compiled this way, the work-scheduler
* does not use threading and runs everything on the CPU.
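*
* A compile-time switch of this kind could look roughly like the sketch below. Only the names
* `COM_threading_model` and `ThreadingModel::SingleThreaded` come from the text above; the
* `ThreadingModel::Queue` value, the `WorkPackage` fields and the `Scheduler` struct are
* assumptions made for this illustration, and no real threads are started.
*
```cpp
#include <deque>
#include <vector>

// `Queue` is an assumed name for the default, multi-threaded model.
enum class ThreadingModel { SingleThreaded, Queue };

// Changing this constant requires a recompile, as described above.
constexpr ThreadingModel COM_threading_model = ThreadingModel::SingleThreaded;

// Hypothetical work package: just the number of the chunk to calculate.
struct WorkPackage {
  int chunk_number;
};

// Hypothetical scheduler that either runs work directly or queues it for worker threads.
struct Scheduler {
  std::deque<WorkPackage> queue;  // Packages waiting for a worker thread (Queue model only).
  std::vector<int> executed;      // Chunk numbers that were run on the calling thread.

  void schedule(const WorkPackage &package)
  {
    if (COM_threading_model == ThreadingModel::SingleThreaded) {
      executed.push_back(package.chunk_number);  // Run immediately, no threads involved.
    }
    else {
      queue.push_back(package);  // Worker threads would take packages from this queue.
    }
  }
};
```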
*
* \section devices Devices
* A Device within the compositor context is a hardware component that can be used to calculate chunks.
* Each chunk is encapsulated in a WorkPackage.
* The WorkScheduler controls the devices and selects the device on which a
* WorkPackage will be calculated.
*
* \subsection WS_Devices Work-scheduler
* The WorkScheduler controls all Devices.
* When initializing the compositor the WorkScheduler selects all
* devices that will be used during compositing.
* There are two types of Devices, CPUDevice and OpenCLDevice.
* When an ExecutionGroup schedules a chunk, the schedule method of the WorkScheduler is called.
* The WorkScheduler determines whether the chunk can be run on an OpenCLDevice
* (and whether an OpenCLDevice is available).
* If this is the case the chunk is added to the work-list for OpenCLDevices,
* otherwise the chunk is added to the work-list for CPUDevices.
*
* A thread reads the work-list and sends a work-package to its device.
*
* \see WorkScheduler.schedule method that is called to schedule a chunk
* \see Device.execute method called to execute a chunk
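*
* The choice between the two work-lists can be sketched as follows. This is an illustration only:
* the `opencl_capable` flag, the `WorkLists` struct and the free function `schedule` are
* hypothetical and do not mirror the signature of WorkScheduler.schedule.
*
```cpp
#include <cassert>
#include <vector>

// Hypothetical work package for one chunk.
struct WorkPackage {
  int chunk_number;
  bool opencl_capable;  // Assumed flag: the operations of this chunk can run on OpenCL.
};

// One work-list per device type.
struct WorkLists {
  std::vector<WorkPackage> cpu;
  std::vector<WorkPackage> opencl;
};

// Put the package on the OpenCL list only when it supports OpenCL and a device is available;
// everything else goes to the CPU list.
inline void schedule(WorkLists &lists, const WorkPackage &package, const int opencl_device_count)
{
  assert(opencl_device_count >= 0);
  if (package.opencl_capable && opencl_device_count > 0) {
    lists.opencl.push_back(package);
  }
  else {
    lists.cpu.push_back(package);
  }
}
```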
*
* \subsection CPUDevice CPUDevice
* When a CPUDevice gets a WorkPackage, the Device gets the input-buffer that is needed to
* calculate the chunk. Allocation is already done by the ExecutionGroup.
* The output-buffer of the chunk is created.
* The OutputOperation of the ExecutionGroup is called to execute the area of the output-buffer.
*
* \see ExecutionGroup
* \see NodeOperation.execute_region executes a single chunk of a NodeOperation
* \see CPUDevice.execute
*
* \subsection GPUDevice OpenCLDevice
*
* To be completed!
* \see NodeOperation.execute_opencl_region
* \see OpenCLDevice.execute
*
* \section execute_pixel Executing a pixel
* Finally the last step, the node functionality :)
*/
/**
* \brief The main method that is used to execute the compositor tree.
* It can be executed during editing (`blenkernel/node.cc`) or rendering
* (`renderer/pipeline.cc`).
*
* \param render: Render instance for GPU context.
*
* \param render_data: Render data for this composite, this won't always belong to a scene.
*
* \param node_tree: Reference to the compositor editing tree
*
* \param rendering: This parameter determines whether the function is called from rendering
* (true) or editing (false).
* Based on this setting the system will work differently:
* - during rendering only the Composite and File Output nodes will be calculated
* \see NodeOperation.is_output_program(bool rendering) of the specific operations
*
* - during editing all output nodes will be calculated
* \see NodeOperation.is_output_program(bool rendering) of the specific operations
*
* - a different quality setting can be used.
* The quality is determined by the bNodeTree fields
* and can be modified by the user from within the node panels.
* \see bNodeTree.edit_quality
* \see bNodeTree.render_quality
*
* - output nodes can have different priorities in the WorkScheduler.
* This is implemented in the COM_execute function.
*
* OCIO_TODO: these options are only used in rare cases, namely in the File Output node,
* so these settings could probably be passed in a nicer way.
* This should be checked further; they will probably also be needed for preview
* generation in display space.
*/
/* clang-format on */
void COM_execute(Render *render,
                 RenderData *render_data,
                 Scene *scene,
                 bNodeTree *node_tree,
                 bool rendering,
                 const char *view_name,
                 blender::realtime_compositor::RenderContext *render_context);
/**
* \brief Deinitialize the compositor caches and allocated memory.
* Use COM_clear_caches to only free the caches.
*/
void COM_deinitialize(void);
/**
* \brief Clear all compositor caches. (Compositor system will still remain available).
* To deinitialize the compositor use the COM_deinitialize method.
*/
// void COM_clear_caches(void); // NOT YET WRITTEN