Incremental computation is an optimization technique that takes advantage of the reusability of successive, iterative computations that vary only slightly from one another. For instance, if you wanted to compute the sum of each m-element window in an n-element array (with m < n), you might use the following code:
    for (int i = 0; i <= n-m; i++) {
        for (int j = i; j < i+m; j++) {
            sum[i] += ary[j];
        }
    }

which is an O(nm) algorithm. However, using incremental computation, we can optimize the code in the following way:
    for (int j = 0; j < m; j++) {
        sum[0] += ary[j];
    }
    for (int i = 1; i <= n-m; i++) {
        sum[i] = sum[i-1] - ary[i-1] + ary[i+m-1];
    }

taking advantage of the fact that we've already computed the previous window, which differs from the current one in only two values. The new algorithm is O(n).
I'd like to explore optimization algorithms that identify opportunities for incremental computation, as well as some of its more interesting applications.
Two notes
There is an entire area of study devoted to creating incremental algorithms;
that is, algorithms that are explicitly designed to efficiently accommodate incremental
changes to their inputs, or that incrementally compute a solution so that if
computation is terminated part-way, a useful result can still be obtained. I
do not deal with this area; instead, the papers I cite below deal with the automated
incrementalization of existing code at compile-time. (...although at
least one of the papers bridges the gap between the two areas by proposing an
automated system that can introduce incrementalization on a function-call level.)
Also, considerable research has been done on developing an incremental model of computation, in which code is made incremental at run-time via function caching (storing the results of previous calls to the same function) and other methods. Because this approach relies on a runtime mechanism, it never explicitly constructs incremental code that can be run by conventional means.
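As a rough sketch of what function caching means (my own toy example, not taken from any of the papers below), here is a memoized Fibonacci in C; each call first consults a table of previous results, so every subproblem is computed at most once:

```c
#include <string.h>

#define MAXN 64

/* Cache of previously computed results; -1 marks "not yet computed". */
static long cache[MAXN];
static int cache_ready = 0;

long memo_fib(int n) {
    if (!cache_ready) {                    /* one-time cache initialization */
        memset(cache, -1, sizeof cache);
        cache_ready = 1;
    }
    if (n <= 1)
        return n;
    if (cache[n] < 0)                      /* cache miss: compute and store */
        cache[n] = memo_fib(n - 1) + memo_fib(n - 2);
    return cache[n];                       /* cache hit: reuse the old result */
}
```

Note that the caching lives entirely in the runtime behavior of memo_fib; the code for fib itself was never transformed into an explicitly incremental form, which is exactly the distinction drawn above.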
Papers
J. Earley, "High Level Iterators and a Method of Automatically Designing Data Structure Representation," ERL-M416, Computer Science Division, University of California, Berkeley, February 1974.

Robert Paige, J. T. Schwartz, "Expression Continuity and the Formal Differentiation of Algorithms," Proceedings of the 4th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, p. 58-71, January 17-19, 1977, Los Angeles, California.

Robert Paige, Shaye Koenig, "Finite Differencing of Computable Expressions," ACM Transactions on Programming Languages and Systems (TOPLAS), v. 4 n. 3, p. 402-454, July 1982.

Yanhong Liu, Tim Teitelbaum, "Deriving Incremental Programs," Technical Report TR 93-1384, Department of Computer Science, Cornell University, Ithaca, New York, September 1993.

Yanhong Liu, Tim Teitelbaum, "Systematic Derivation of Incremental Programs," Science of Computer Programming, v. 24 n. 1, p. 1-39, February 1995.

Yanhong Liu, Tim Teitelbaum, "Caching Intermediate Results for Program Improvement," Proceedings of the Symposium on Partial Evaluation and Semantics-Based Program Manipulation, June 21-23, 1995, La Jolla, California.

Yanhong Liu, Scott Stoller, Tim Teitelbaum, "Discovering Auxiliary Information for Incremental Computation," Proceedings of POPL'96, January 21-24, 1996, St. Petersburg Beach, Florida.

Yanhong Liu, "Efficient Computation via Incremental Computation," Pacific-Asia Conference on Knowledge Discovery and Data Mining, December 1999.

Yanhong Liu, Scott Stoller, "From Recursion to Iteration: What Are the Optimizations?," Proceedings of PEPM'00, January 22-23, 2000, Boston, Massachusetts.

Yanhong Liu, Scott Stoller, Ning Li, Tom Rothamel, "Optimizing Aggregate Array Computations in Loops," Technical Report TR 02-xxxx, Computer Science Department, State University of New York at Stony Brook, Stony Brook, New York, July 2002.
The original paper that introduces the idea; couldn't find it online.
(14 pages) This paper explores the idea of incremental computation at an
extremely high level, with respect to sets. It introduces some preliminary heuristics
to determine if an expression can be optimized. It also discusses the feasibility
of a system that would automatically employ those heuristics and the
symbolic transformations presented in the paper -- and reaches the conclusion
that while full automation may be too difficult, even semi-automation would
be beneficial.
(53 pages) A more comprehensive look at the ideas presented in the previous
paper. It goes through several step-by-step examples of pseudo-mechanical transformation
to incremental code; however, the paper still only deals with a very high-level
language.
(41 pages) The first systematic approach to incrementalizing programs written
in common functional languages. This paper defines many of the terms used in
incremental computation, formalizes the notion of partial and complete incremental
functions, and provides a step-by-step automated approach to deriving incremental
functions. It also introduces a time model by which one can determine when to
use a cached result, and when it might be faster to recompute the whole function.
Finally, it provides a proof of correctness for the derivations it suggests,
as well as some potential future improvements.
(30 pages) The published version of the above technical report. More formal,
less readable =). I skimmed it, so I might be missing something...
(12 pages) Extends the basic "cached-result" framework with a second
phase, one which exploits intermediate results computed in the original computation
of f(x). Describes the "cache-and-prune" algorithm for identifying
which intermediate results are useful, and employing them in the incrementalized
code.
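As a toy illustration of the idea (mine, not the paper's): suppose f computes the mean of an array. The running total is an intermediate result of f; if we cache it alongside the final answer, the "one element appended" update runs in O(1) instead of re-summing the whole array.

```c
/* f(x): mean of x[0..n-1], computed from scratch.  The running total s is an
 * intermediate result; we cache it in *sum_out for later incremental updates. */
double mean(const double *x, int n, double *sum_out) {
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += x[i];
    *sum_out = s;
    return s / n;
}

/* Incremental update: given the cached sum for the first n-1 elements, extend
 * the result to cover one appended element in O(1) and refresh the cached sum. */
double mean_after_append(double cached_sum, double new_elem, int n, double *sum_out) {
    *sum_out = cached_sum + new_elem;
    return *sum_out / n;
}
```

The "prune" half of cache-and-prune is the part this sketch glosses over: deciding automatically which intermediate results (here, the sum but not, say, the loop counter) are worth keeping.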
(14 pages) This paper describes phase three of the overall incremental
computation problem: discovering, computing, maintaining, and exploiting auxiliary
information about the input value. (Note that this information is by definition
not computed by f(x) and therefore is not considered an intermediate result.)
The usual step-by-step analysis follows.
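To make "auxiliary information" concrete (my example, not drawn from the paper): unlike the window sums above, the minimum of each window cannot be updated from the previous window's minimum alone. An incremental version has to maintain extra data -- here, a deque of candidate indices -- that the original computation never produces:

```c
/* Sliding-window minimum over ary[0..n-1], window size m (this sketch assumes
 * n <= 64).  The deque of indices, kept in increasing order of value, is
 * auxiliary information: it appears in no window's result, yet it is what
 * makes an O(n) incremental update possible. */
void window_mins(const int *ary, int n, int m, int *out) {
    int deque[64];                /* candidate indices; front holds current min */
    int head = 0, tail = 0;       /* live contents are deque[head..tail-1] */
    for (int i = 0; i < n; i++) {
        if (head < tail && deque[head] <= i - m)
            head++;                               /* min slid out of the window */
        while (head < tail && ary[deque[tail-1]] >= ary[i])
            tail--;                               /* dominated by new element */
        deque[tail++] = i;
        if (i >= m - 1)
            out[i - m + 1] = ary[deque[head]];    /* record this window's min */
    }
}
```

Discovering that such a structure is needed, and maintaining it automatically, is the hard problem the paper addresses.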
(24 pages) This summary paper provides an excellent overview of the basic ideas
behind incremental computation, and also covers some of the additional research
done to date, including the discovery and use of intermediate results and auxiliary
information. It also steps through several simple examples and describes the
author's first go at an actual implementation, CACHET. This paper is a good
place to start.
(19 pages) Liu again expands the scope of incremental computation to tackle
recursive functions. Describes an algorithm to convert recursion to iteration that
handles multiple base cases and multiple recursive cases. Also provides some
interesting stats on how tail recursion can actually result in slower
code than normal recursion; the proposed algorithm avoids these slowdowns.
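The flavor of the transformation, on the simplest possible case (my illustration; the paper's algorithm handles far more general recursion patterns, including multiple base and recursive cases):

```c
/* Recursive definition: the pending multiplications pile up on the call stack. */
long fact_rec(int n) {
    return n <= 1 ? 1 : n * fact_rec(n - 1);
}

/* Iterative version: an explicit accumulator replaces the call stack, turning
 * the recursion into a loop that computes the same product bottom-up. */
long fact_iter(int n) {
    long acc = 1;
    for (int i = 2; i <= n; i++)
        acc *= i;
    return acc;
}
```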
(34 pages) I only had time to skim this paper. It tackles the specific task
of applying incremental computation techniques to aggregate array computations
in loops. Unlike previous papers, it includes comprehensive performance measurements
(running time, cache misses, etc.) taken from a variety of real-world algorithms.