Symbolic Logic:Programming:Demand Driven Evaluation
The purpose of Demand Driven Evaluation is to guide the evaluation so that only values that are required for the result are calculated. This speeds up the calculation, and, in some situations, makes the calculation possible.
When a function returns an infinite list it is not possible to create the whole list. But not all the values in the list may be needed. Demand Drive Evaluation allows only the values that are required to be calculated.
Demand Driven Evaluation is a form of Lazy Evaluation. However with Demand Driven Evaluation the value calculated does not have to be the result of the function. If the return value is known, but some input value is required the result will be calculated using an appropriate inverse function.
Contents |
[edit] Scenarios
[edit] Variables
Start with variables that the user wants to see or that are required in the output. Mark these variables as requested. Then look at what function calls have those variables as a parameter. Try to execute those function calls. Each function call may have other variables as parameters. Functions that directly access a variables value need to request the variable.
Normally only primitive functions (==, !=, +, -, *, /, and, or, if) need to request variables. High level functions take parameters but do not directly need to access the values. Only those methods that actually need to access the value directly need request the variable.
When a function is called look at each function call within that function. If the call has a requested variable, and has whatever inputs it requires, execute it straight away. If not then request the input variables. Then create an object representing the function call and put it on the heap.
The heap is a place where calls are put when they are not ready to be executed. It will have a structure that organises calls according to when they are needed. Of course if the calls are already in the order they are needed then they will be executed immediately. So the heap is a fall back mechanism.
When a primitive function is invoked the action performed by the input function will depend on which variable is requested. For example if the function is,
and is requested the functions action will be,
A call that is never requested will go on the heap but never be invoked. These call objects must be cleaned up by a memory management process. For this purpose calls are associated with transactions. When a call associated with a transaction produces all its requested output it is complete. All the objects created within the transaction that are not required outside the transaction are then deleted.
[edit] Attributes
An object has attributes (member variables). The object is passed to a function as an input parameter. Do all of the attributes of the object need to be known for the object to be known?
If yes then the object and its attributes must be known before the function may be executed. If no the function may be executed and the attributes populated as needed.
If the attribute is another object then a large tree of objects may need to be known before the function may be executed.
An ownership modifier on each member variable allows control of this. An owned member must be known for the object to be known. An unowned object need not be.
There is a similarity between concepts in CLP and C++.
C++ | CLP |
---|---|
const | input |
non const reference | output |
mutable | not owned |
In CLP all objects are constant. Values may only change from unknown to known. However there is a similarity between the modifiers of C++ and CLP,
- A const object in C++ must have its value provided before the function is called, like input.
- A non const reference may have its value set by the function, as output.
- A mutable value is not an intrinsic part of an objects value. It may be changed even though the object is const.
[edit] Memory Management
Automated memory management is good for the developer. But it has a cost for the execution of the program.
In many cases a program will perform a series of isolated individual tasks. For example a server may execute a series of tasks with no record of the state between calls. This suggest a simpler approach.
- Delete all memory allocated in a task at the end of the task.
The memory is then immediately freed up in a simple manner with no need for complex analysis.
This approach would be satisfactory for many purposes. However then no information may be saved between tasks.
To allow for more flexibility memory management may allow "promotion". If an object is assigned to an output parameter of the task, or to a member variable of an object from outside the task, the object is promoted so that it belongs outside the task. It will not be deleted when the task is completed.
Promotion needs to be recursive so that when an object is promoted, all objects it has access to are also promoted. This is so that the object will not end up pointing to a deleted object when the task ends.
The promoted object becomes attached to a task with a longer scope than the task in which it was created. It wont be deleted until that task ends.
In CLP an object cannot be changed so an object will only become inaccessible when a variable goes out of scope. There is then no point in garbage collection until the end of the task. A long running server task may need to be bounced (stopped and restarted) each night in order to forget information that no longer needs to be kept in memory.
[edit] Implementation
Each function call in a task is attached to a task. The heap records function calls against tasks, and each function call knows what task it belong to.
A task is completed when the call for the task exits and there are no more requested function calls that may be run for that task. The task will ask the heap to process any requested calls for the task at that time. If a function call has an output to a promoted object the the function call is promoted. If there are requested calls that cannot be completed an exception is raised.
Each object created in a task is placed in a list. When the task completes all objects in the list that have not been promoted are deleted.
An object is promoted when it is assigned to an output parameter of the object, or to a member variable of an object from another task. Promotion is implemented by setting the objects task reference and adding the object to the new tasks list.
[edit] Language Syntax
Language syntax is relatively unimportant. Does it matter if we declare variables as,
Language | Declaration |
---|---|
Pascal | var a: integer |
C++ | long a; |
Java | long a; |
VB | dim a as integer |
In each case we are representing the same thing. A program is not a piece of text. A program is an object structure that may be represented as text.
A programs should be stored as an object structure independent of language. Then based on your preference you could ask for your program to be displayed with your own preference of syntax.
It would be better if we didnt need to interract with the program as text. Waiting for compilation and retrieving syntax error messages is a waste of time. Instead there should be a GUI that allows us to interract with the program.
It would then be much easier to write checking and refactoring programs.
However in this document I will write code fragments in a Java style, with some extensions as required.
[edit] Evaluating Requested Variables from Known
Because the return value is a parameter in the Result Equivalent Function, a function does not need to calculate the result from the inputs. It could calculate any parameter or the result.
The value to be calculated is the requested parameter in the function call. Based on the requested value other values are requested from which the requested value may be calculated.
How the requested result is calculated is determined by the characteristics.
[edit] Characteristics
The characteristics input("in"), output ("out"), and unique may be used to create characteristic conditions on a call.
- in - The parameter value will already have been calculated before the function is called.
- out - The parameter value will not have been calculated.
- unique - There is only one value (no need for a ValueSet).
- multi - There may be multiple values.
For example,
- unique out long Random(unique in double seed, unique out double newSeed);
is a function that calculates its return value from its input value. The characteristic conditions may also be written,
- long Random(long n)
- {
- characteristic
- {
- seed.Known();
- !newSeed.Known();
- !result.Known();
- }
- }
Testing to see if a value is already calculated (Known) is not strictly OK in CLP. For this reason input and output characteristics are not part of the precondition. The characteristics are tested separately in the characteristic condition.
When there are multiple implementations with the same name, signature and precondition, the characteristic conditions determine which function implementation is used, and when. It is up to the developer to insure all implementations with the same pre-conditions but different characteristics are logically equivalent. The characteristics only choose which implementation is used.
Characteristics may also delay the execution of the implementation of a function. If none of the characteristic conditions are met the call is placed on a queue to be run later when the characteristic conditions are met.
Characteristics should be used sparingly. The developer must insure that each implementation is logically consistent.
Characteristics should only be used,
- At interface methods with data coming from the outside world.
- To define primitive functions.
As a general principle it is better to leave out information that Meta Phase 1 will calculate.
- Types that may be deduced from context.
- Characteristics.
- Parameters that may be deduced from roles.
This results in code that easier to re-use and modify.
Note: Characteristics are similar to modes in [Mercury].
[edit] Binary Operators
The standard binary operators may be defined using the characteristics in and out.
The reserved words "in" and "out" add pre-conditions on the parameter to a function.
- in means that the value needs to be known (as in already calculated).
- out means that the value must be unknown.
The definition will calculate the unknown values from the unknown. This definition would be provided in a standard library, but is described here as an example.
- //Got the inputs so multiply.
- out long operator *(in long x, in long y)in
- {
- return x * y;
- }
- //Need to calculate y so divide.
- in long operator *(in long x, out long y)
- {
- y = result / x;
- }
- //Need to calculate x so divide.
- in long operator *(out long x, in long y)
- {
- x = result / y;
- }
The "result" keyword provides a more flexible way of specifying the return value. Result is the value returned by the function. Here "return" is equivalent to "result =".
[edit] Equivalent C++ code
The equivalent C++ would need to pass the return value as a function parameter. This is because C++ wants to calculate the return value, whereas in CLP the return value is regarded just like another parameter, whose value may or may not be known.
An expression like,
C = A * B
would be implemented as,
BinaryCall <long, MultiplyOp<long>, DivideOp<long>, DivideOp<long> >::Call(A, B, C);
where BinaryCall is defined as,
template <class BIN_OP, class INV1_OP, INV2_OP, class TX, class TY, class TRESULT> class BinaryCall : public virtual Call { private TX m_x; TY m_y; TRESULT m_result; public static void Call(TX &x, TY &y, TRESULT &result); BinaryCall(TX &x, TY &y, TRESULT &result); bool Requested(); long Card(); void Run(); }; template <class BIN_OP, class INV1_OP, INV2_OP, class TX, class TY, class TRESULT> /*static*/ BinaryCall::Call(TX &x, TY &y, TRESULT &result) { // First if a result is Requested, calculate it from the known values, if we have enough input. if (result.Requested()) { if (x.Known() && y.Known()) { Cartesian<long, BIN_OP<long> >::Call(x, y, result); } else { x.Request(); y.Request(); Class BinaryCall c = new BinaryCall(x, y, result); c.ScheduleCall(); } } else if (x.Requested()) { if (y.Known() && result.Known()) { // x.value = result.value / y.value; Cartesian<long, INV1_OP<long> >::Call(result, y, x); } else { y.Request(); result.Request(); Class BinaryCall c = new BinaryCall(x, y, result); c.ScheduleCall(); } } else if (y.Requested()) { if (x.Known() && result.Known()) { Cartesian<long, INV2_OP<long> >::Call(result, x, y); } else { x.Request(); result.Request(); Class BinaryCall c = new BinaryCall(x, y, result); c.ScheduleCall(); } } else { Class BinaryCall c = new MultiplyCall(x, y, result); c.ScheduleCall(); } } template <class BIN_OP, class INV1_OP, INV2_OP, class TX, class TY, class TRESULT> BinaryCall::BinaryCall(TX &x, TY &y, TRESULT &result) m_x(x), m_y(y), m_result(result) { } template <class BIN_OP, class INV1_OP, INV2_OP, class TX, class TY, class TRESULT> void BinaryCall::Run() { Call(m_x, m_y, m_result); } template <class BIN_OP, class INV1_OP, INV2_OP, class TX, class TY, class TRESULT> bool BinaryCall::Requested() { return m_x.Requested() || m_y.Requested() || m_result.Requested(); } // Cardinality is the number of different values we expect to calculate. // It is a measure of how much work we must do if we do the calculation now. // See [[Minimizing Value Set Size|Minimizing Value Set Size]]. template <class BIN_OP, class INV1_OP, INV2_OP, class TX, class TY, class TRESULT> long BinaryCall::Card() { return Min( m_x.Card()*m_y.Card(), m_result.Card()*m_y.Card(), m_x.Card()*m_result.Card()); };
BinaryCall is a class that allows us to execute the call later. AddToHeap stores this call for later execution.
It also searches for calls that are already on the queue whose values are requested (Requested) cardinality Cardinality() is 1.
[edit] Delayed Calls
The following is a design for classes that implement the delayed calls.
Note that this code has not been checked on a computer yet. It is design only at this stage.
[edit] The Call Class
The Call Class is a base class for implementing calls to methods that need to be delayed until there is sufficient input data.
class Call { public: Call() { }; public: void CheckRequest() = 0; // If this value is requested, request any inputs. bool Requested() = 0; // Does the call calculate a value that is requested. long Card() = 0; // How many values do we expect to calculate in the result. bool IsReady() = 0; // Are the inputs known. bool Run() = 0; public: bool ScheduleCall() { CheckRequest(); CallManager::ScheduleCall(this); } }
[edit] Variable Class
class Variable { private: bool m_Requested; public: Variable() : m_Requested(false) bool Known() = 0; long Card() = 0; bool Requested() const { return m_Requested; } void Request() { m_Requested = true; } }
[edit] The Call Manager
The Call Manager records calls for execution later when their inputs are known. This is a simple implementation of the Call Manager. This version lets each Task manage the calls for itself. This has the advantage of processing groups of related calls together.
[edit] Header
class CallManager { private: static m_CallManager; Task m_CurrentTask; Task *m_TaskList; long m_NumTasks; long m_MaxTasks; public: static void ScheduleCall(this); static Task *CreateTask(); static void FinishTask(Task *task); public: Task *InsertTask(Task *task); void DeleteTask(Task *task); void Execute(); private: void ResizeTaskList(long numTasks); }
[edit] Construction and Static
CallManager::m_CallManager = new CallManager; CallManager::CallManager() : m_MaxTasks = 20 , m_NumTasks = 0 { m_TaskList = new Tasks[m_MaxTasks] } Task *CallManager::CreateTask() { return m_CallManager->InsertTask(new Task); } void CallManager::FinishTask(Task *task) { m_CallManager->DeleteTask(task); } void CallManager::ScheduleCall(Call *call) { GetCurrentTask()->ScheduleCall(call); m_CallManager->Execute(); }
[edit] Task List
Task *CallManager::InsertTask(Task *task) { m_CurrentTask = task; if (m_NumTasks == m_MaxTasks) { ResizeTaskList(m_MaxTasks * 2); } m_TaskList[m_NumTasks++] = m_CurrentTask; return t; } void CallManager::DeleteTask(Task *task) { task->CheckRequestedCallsCompleted(); delete task; for (long j = 0; j < m_NumTasks; j++) { if (k < j) { m_TaskList[k] = m_TaskList[j]; } if (m_TaskList[j] != task) { ++k; } } m_NumTasks = k; } void CallManager::ResizeTaskList(long numTasks) { assert(m_NumTasks <= numTasks); Task *old = m_TaskList; m_TaskList = new Task[numTasks]; for (long j=0; j<m_NumTasks; j++) { m_TaskList[j] = old[j]; } m_MaxTasks = numTasks; }
[edit] Execute Scheduled tasks
void Execute() { Task *saveTask = m_CurrentTask; for (long j = 0; j < m_NumTasks; j++) { m_CurrentTask = m_TaskList[j]; m_CurrentTask->Execute(); } m_CurrentTask = saveTask }
[edit] Task class
A task represents a single call usually performed in response to an outside request. A task manages the queueing of calls that cannot be executed immediately. A task manages memory and deletes objects on completion.
[edit] Header
class Task { private: RingBuffer m_Requested; RingBuffer m_NotRequested; RingBuffer m_Multi; Object **m_ObjectList; long m_MaxObjects; long m_NumObject; long m_DoRequested; // How many calls to process, in one go. long m_DoNotRequested; long m_DoMulti; public: Task(); ~Task(); void AddObject(); bool AllRequestedCallsCompleted(); void Execute(); private: bool ExecuteRingBuffer(RingBuffer &r, long numCalls); bool CheckNotRequested(long numCalls); void FreeMemory(); }
[edit] Construct/Destruct
Task::Task() : m_Requested(64) , m_NotRequested(64) , m_Multi(32) , m_MaxObjects(64) , m_NumObject(0) , m_DoRequested(16) , m_DoNotRequested(4) , m_DoMulti(8) { m_ObjectList = new ObjectList(m_MaxObjects); } Task::~Task() { FreeMemory(); delete m_ObjectList; }
[edit] Object List
void Task::AddObject(Object *object) { if (m_NumObjects == m_MaxObjects) { ResizeObjectList(m_MaxObjects * 2); } object->SetTask(this); m_ObjectList[m_NumObjects++] = object; } void ResizeObjectList(long maxObjects) { assert(m_NumObjects <= maxObjects); Task **old = m_ObjectList; m_ObjectList = new Task[maxObjects]; for (long j=0; j<m_NumTasks; j++) { m_ObjectList[j] = old[j]; } m_MaxObjects = maxObjects; }
[edit] Calls
void Task::ScheduleCall(Call *call) { if (!call->Requested()) { m_NotRequested->Put(call); } else if (call->Card() > 1) { m_Multi->Put(call); } else { m_Requested->Put(call); } Execute(); } bool Task::Execute() { bool action = true; while (action) { action = ExecuteRingBuffer(m_Requested, m_DoRequested); if (!action) { action = ExecuteRingBuffer(m_Multi, m_DoMulti); } action = action || CheckNotRequested(m_DoNotRequested) } return action; } bool ExecuteRingBuffer(RingBuffer &r, long numCalls) { bool action = false; long entries = m_NotRequested.Entries(); for (long j=0; j<numCalls && (entries || action); j++) { Call *call = r.Take(); if (call->IsReady()) { action = true; call->Run(); } else { Requested.Put() } } return action; } bool CheckNotRequested(long numCalls) { bool action = false; long maxCalls = m_NotRequested.Entries(); if (maxCalls > numCalls) { maxCalls = numCalls; } for (long j=0; j<numCalls; j++) { Call *call = m_NotRequested.Take(); if (call->Requested()) { action = true; if (call->Cardinality() == 1) { m_Requested.Put(call) } else { m_Multi.Put(call); } } else { Requested.Put() } } return action; } void Task::FreeMemory() { for (long j=0; j<m_NumObjects; j++) { Object *object = m_ObjectList[j]; if (object->GetTask() == this) { delete object; } } }
[edit] RingBuffer class
A simple RingBuffer class for design purposes. Warning: Code not checked. Design purposes only.
template <class T> class RingBuffer { private: T *m_Buffer; T *m_TakePointer; T *m_PutPointer; T *m_EndPointer; long m_BufferSize; public: RingBuffer(long bufferSize) long Entries(); T *Take(); void Put(T *t); private: void ResizeBuffer(long bufferSize); } template <class T> RingBuffer::RingBuffer(long bufferSize) { m_BufferSize = bufferSize; m_Buffer = new T[bufferSize]; m_TakePointer = m_Buffer; m_PutPointer = m_Buffer; m_EndPointer = m_Buffer + bufferSize; } template <class T> long RingBuffer::Entries() { long result = (m_PutPointer - m_TakePointer); if (result < 0) { result = result + m_BufferSize; } } template <class T> T *RingBuffer::Take() { T *result = 0 if (m_TakePointer != m_PutPointer) { result = *m_TakePointer++;; if (m_TakePointer == m_EndPointer) { m_TakePointer = m_Buffer; } } return result; } template <class T> void RingBuffer::Put(T *t) { *m_PutPointer++ = t; if (m_PutPointer == m_EndPointer) { m_PutPointer = m_Buffer; } if (m_PutPointer == m_TakePointer) { ResizeBuffer(m_BufferSize * 2); } } template <class T> void RingBuffer::ResizeBuffer(long bufferSize) { assert(Entries() < bufferSize); T **newBuffer = new Task[maxObjects]; T **newPutPointer = newBuffer; T **endPointer = m_PutPointer; if (m_PutPointer < m_TakePointer) { endPointer = m_EndPointer; } while (m_TakePointer < endPointer) { *newPutPointer++ = *m_TakePointer++; } if (m_TakePointer == m_EndPointer) { m_TakePointer = m_Buffer; while (m_TakePointer < m_PutPointer) { *newPutPointer++ = *m_TakePointer++; } } m_Buffer = newBuffer; m_TakePointer = m_Buffer; m_PutPointer = newPutPointer; }
[edit] Links
- Symbolic Logic:Programming:Value Set Programming
- Symbolic Logic:Programming
- Intelligence and Reasoning