Originally published in Rebol Forces.

Reb-IT!
Author: Joel Neely
Date: [
04-Jan-2002 02-Feb-2002 23-Feb-2002 09-Mar-2002
]
Version: 1.3
Part: 2

Introduction

This series of articles explores some techniques for incrementally refactoring Rebol scripts, and illustrates the power of the refactoring approach to program development.

First steps

The first article presented a strategy for transforming recursive functions into iterative functions:

  1. Create a helper function inside the original function.
  2. Push state into the helper's arguments.
  3. Replace helper arguments with locals of the original function.
  4. Replace the helper function with a loop.
  5. Simplify the loop.

It also introduced three sub-strategies for moving state into the argument list, step two above, by using:

  1. Accumulator arguments (to do work early).
  2. To-do-list argument (to hold pending work).
  3. Argument rewriting (to maintain only the significant parts of the pending work).

Trade-offs

The first article mentioned some performance trade-offs between the to-do-list sub-strategy and the argument-rewriting sub-strategy. The to-do-list version used an extra variable for each block of pending work, took slightly more code to manage the extra variables, and occasionally created empty to-do-lists.  In contrast, the argument-rewriting version, using COMPOSE to un-nest leading sub-blocks, would create new blocks by copying portions of its arguments.  This copying is potentially time- and space-consuming, depending on the structure of those arguments.
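As a reminder of how that rewriting works, here's a quick console sketch of our own:

>> b: [[1 2] 3 4]
== [[1 2] 3 4]
>> compose [(first b) (next b)]
== [1 2 3 4]

Both parenthesized expressions produce blocks, and COMPOSE splices their contents into a newly-allocated block; that new allocation is the copying cost mentioned above.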

Our focus is on refactoring instead of performance or benchmarking, so we'll primarily use the argument-rewriting sub-strategy in our examples, solely because it takes less typing.  But don't forget to evaluate the trade-off issues in any application of these strategies.

Next steps

The previous article touched briefly on the important question: "Why would we want to transform recursion into iteration?" Based on what we've covered so far, we can recall one very bad reason and two good ones.

  1. "Recursion is slow and/or confusing" - This is the bad reason. Recursion simply has an image problem that's mainly the result of bad teaching.
  2. "Sometimes recursion imposes limits" - Sometimes the simplest algorithm IS slower than one that's tuned for speed (whether recursion is involved or not).  Also, recursion is implemented in Rebol (and many other languages) in a way that limits the size of the task that can be handled recursively.
  3. "Free your mind" - Being able to approach a problem from multiple perspectives often helps us understand it better, and allows us to think about which of many approaches may be more suitable in a specific context.

When we're just trying to hack out a quick-and-dirty solution to a small task, the first solution we think of may be entirely adequate.  But when we're trying to expand our mental toolkit, it can be helpful to think of as many solutions to a problem (or a family of related problems) as possible. We haven't yet explored all of the possibilities of refactoring, nor of the little class of problems we've been playing with, so let's continue to apply refactoring and stretch our thinking a little more.

Nested Block Product

After deriving functions to sum single-level blocks and nested blocks, the first article concluded with two questions.  The first asked, "What difference would it make if we had been trying to calculate the product of all the numbers in a block, rather than the sum?"

The easiest solution

Calculating the product of a collection of numbers is similar to calculating their sum, with two changes:

  1. combine the values with * instead of +, and
  2. initialize the result with 1 instead of 0.

Instead of deriving a function from recursive form, we can make those changes to NBT from the previous article, creating this:

nb-product: func [
    b [block!]
    /local result front
][
    result: 1  ;; initialize
    while [not empty? b][
        either block? front: first b [
            b: compose [(front) (next b)]
        ][
            result: result * front  ;; combine
            b: next b
        ]
    ]
    result  ;; completed
]

which behaves as expected:

>> nb-product [[[[1] 2 [1]] 3 [1]] 4 [[1] 2 [1]]]
== 48

We can create other variations on this theme just as easily.  For example, we could write a function to determine whether the numbers in the nested block structure are all even:

nb-all-even: func [
    b [block!]
    /local result front
][
    result: true  ;; initialize
    while [not empty? b][
        either block? front: first b [
            b: compose [(front) (next b)]
        ][
            result: result and even? front  ;; combine
            b: next b
        ]
    ]
    result  ;; completed
]

and so on.  As long as we have:

  1. an expression or function for combining a new value with the accumulated result, and
  2. an initial value for that accumulator (the mathematical identity of the combining function).

we can continue to play this game.
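For instance, a count of the numbers in a nested block (a sketch of our own; NB-COUNT is not from the original article) uses + 1 as the combiner and 0 as the initial value:

nb-count: func [
    b [block!]
    /local result front
][
    result: 0  ;; initialize
    while [not empty? b][
        either block? front: first b [
            b: compose [(front) (next b)]
        ][
            result: result + 1  ;; combine: count the value
            b: next b
        ]
    ]
    result  ;; completed
]

>> nb-count [[[[1] 2 [1]] 3 [1]] 4 [[1] 2 [1]]]
== 9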

DRY (Don't Repeat Yourself)

Let's apply the DRY principle from extreme programming.  Instead of cutting and pasting to create more variations, let's notice that both NB-PRODUCT and NB-ALL-EVEN combine two distinct issues:

  1. navigating through the nested block argument, and
  2. computing the desired result, using each number encountered.

We can refactor the code to separate these issues in a variety of ways.  The first, and probably the simplest, is to make a generic nested-block-navigation function that takes the initial value and combining function as arguments.

nb-summarize: func [
    b [block!]
    i [any-type!]      ;; initializer
    f [any-function!]  ;; combiner
    /local result front
][
    result: i  ;; initialize
    while [not empty? b][
        either block? front: first b [
            b: compose [(front) (next b)]
        ][
            result: f result front  ;; combine
            b: next b
        ]
    ]
    result  ;; completed
]

We can now perform our total, product, and all-even calculations on nested blocks this way:

>> testblock: [[[[1] 2 [1]] 3 [1]] 4 [[1] 2 [1]]]
== [[[[1] 2 [1]] 3 [1]] 4 [[1] 2 [1]]]
>> nb-summarize testblock 0 :+
== 16
>> nb-summarize testblock 1 :*
== 48
>> nb-summarize testblock true func [a b] [a and even? b]
== false

and, with a different set of test data:

>> testblock: [[[[10] 2 []] 12 [0]] 4 [[6] 2 [8]]]
== [[[[10] 2 []] 12 [0]] 4 [[6] 2 [8]]]
>> nb-summarize testblock 0 :+
== 44
>> nb-summarize testblock 1 :*
== 0
>> nb-summarize testblock true func [a b] [a and even? b]
== true

Note that we haven't completely separated navigation and computation in this first attempt.  Although NB-SUMMARIZE is responsible for navigation, its local word RESULT is still used to accumulate the answer.  For one reason why this is a problem, think about using NB-SUMMARIZE to calculate an average of the numbers in a nested block. That would require both a total and a count, but NB-SUMMARIZE only has a single result accumulator!
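We could cram both numbers into a single block-valued accumulator (a workaround sketch of our own, not from the article), but the awkwardness makes the point:

>> r: nb-summarize [[1 2] 3 [4 [5]]] [0 0] func [acc x] [reduce [acc/1 + x acc/2 + 1]]
== [15 5]
>> r/1 / r/2
== 3

It works, but packing unrelated pieces of state into one block is exactly the kind of clutter that the object-based approach below avoids.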

Refactoring into objects

If we make NB-SUMMARIZE into the most generic possible tour guide, it will have no involvement with the workings of its passenger.  All it will do is notify the passenger when the tour is beginning, and point out each attraction along the way.  The tour guide is solely in charge of the state of the tour, and the passenger is solely in charge of the state of the ongoing computation.

Since we represent state with data, enforcing this separation of responsibilities requires us to define specific collections of data and the distinct functions that manipulate those data.  In fact, we only use the data through those functions.  Fortunately, Rebol provides a way to implement this idea through the use of objects.

We can create a passenger object with three methods that will BEGIN a new computation, PROCESS an additional value, or END the computation and return the result.  The tour guide function takes a passenger object as an argument, and delivers data values to the passenger during the tour (without "knowing" what the passenger does with the data).  It's completely up to the passenger to maintain all state needed for its work.  Of course, to complete the separation of concerns, the passenger has no involvement in how the tourguide navigates through the data structure.

The tour guide is created with only a small change from the previous version, as follows:

nb-tourguide: func [
    b [block!]
    passenger [object!]
    /local front  ;; no result variable!
][
    passenger/begin  ;; initialize
    while [not empty? b][
        either block? front: first b [
            b: compose [(front) (next b)]
        ][
            passenger/process front  ;; combine
            b: next b
        ]
    ]
    passenger/end  ;; completed
]

We can write a totalling passenger as follows:

tot-passenger: make object! [
    sum: 0
    begin: func [] [sum: 0]
    process: func [num [number!]] [sum: sum + num]
    end: func [] [sum]
]

and get the expected result:

>> nb-tourguide testblock tot-passenger
== 36

Now that accumulation and navigation have been separated, we can easily have a passenger (this time an anonymous one) which knows how to compute averages:

>> nb-tourguide testblock make object! [
[ t: n: 0.0
[ process: func [x] [t: t + x n: n + 1]
[ begin: func [] [t: 0.0 n: 0.0 ]
[ end: func [] [t / n]
[ ]
== 5.14285714285714

This design completes the separation of navigation and computation. We can now create new passengers to perform different computations without copying all of the navigation code (and possibly introducing a bug by mis-typing a copy of already-working code).  In addition, we can create a new navigator to traverse a different kind of data structure, and immediately re-use all of the passengers written up to this point.
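For instance, here is a sketch of a second navigator (FB-TOURGUIDE is our own name and example, not from the article) that tours a plain flat block but speaks the same passenger protocol:

fb-tourguide: func [b [block!] passenger [object!]] [
    passenger/begin                            ;; initialize
    foreach value b [passenger/process value]  ;; combine each value
    passenger/end                              ;; completed
]

>> fb-tourguide [1 2 3] tot-passenger
== 6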

In this case, the navigation is active and the computation is passive. This decision to let the navigator also serve as the driver is not the only option, as we'll see shortly.

Same Block Contents

The second question at the end of the first article was, "How can we check whether two blocks contain the same data (in the same order), regardless of the nesting structures of those blocks?"

For example, all of these blocks:

[[1 2 1] [3 1 4] [1 2 1]]
[[[[1] 2 [1]] 3 [1]] 4 [[1] 2 [1]]]
[1 2 1 3 1 4 1 2 1]

contain the same numbers in the same order (ignoring the nesting), but the block

[1 2 1 3 1 4 2 1 1]

contains a different collection of numbers which fails to match any of the previous three.

Recursive definition

According to our original strategy, we can begin with a recursively-oriented specification for this test on two blocks.

  • If both blocks are empty, the result is TRUE.
  • If either block is empty and the other begins with a number, the result is FALSE.
  • If both blocks begin with numbers, the result is TRUE exactly when those numbers are equal and the remainders of the blocks (recursively) have the same content.
  • If either (or both) of the blocks begin with a nested sub-block, the nesting must be recursively removed until one of the first three cases occurs.

Turning this verbal description into Rebol is more tedious than for the previous problems.  Three states for each of two blocks combine to make nine possibilities to consider.  The combinations can be expressed as a table, with the appropriate action at each intersection:

                        | Right empty?        | Right sub-block? ... | Right number? ...
  ----------------------+---------------------+----------------------+---------------------
  Left empty?           | true                | unwrap right, recur  | false
  Left sub-block? ...   | unwrap left, recur  | unwrap both, recur   | unwrap left, recur
  Left number? ...      | false               | unwrap right, recur  | numbers equal? AND
                        |                     |                      | recur on remainders

After writing out explicit Rebol expressions for all possibilities (left as an exercise ;-) we can simplify things just a little and get our first version:

same-data1: func [lb [block!] rb [block!] /local lfront rfront][
    either empty? lb [
        either empty? rb [
            true
        ][
            either block? rfront: first rb [
                same-data1 lb compose [(rfront) (next rb)]
            ][
                false
            ]
        ]
    ][
        either block? lfront: first lb [
            same-data1 compose [(lfront) (next lb)] rb
        ][
            either empty? rb [
                false
            ][
                either block? rfront: first rb [
                    same-data1 lb compose [(rfront) (next rb)]
                ][
                    to-logic all [
                        lfront = rfront
                        same-data1 next lb next rb
                    ]
                ]
            ]
        ]
    ]
]

Using our test cases from the earlier explanation:

>> a: [[1 2 1] [3 1 4] [1 2 1]]
== [[1 2 1] [3 1 4] [1 2 1]]
>> b: [[[[1] 2 [1]] 3 [1]] 4 [[1] 2 [1]]]
== [[[[1] 2 [1]] 3 [1]] 4 [[1] 2 [1]]]
>> c: [1 2 1 3 1 4 1 2 1]
== [1 2 1 3 1 4 1 2 1]
>> d: [1 2 1 3 1 4 2 1 1]
== [1 2 1 3 1 4 2 1 1]

this version does what we had intended:

>> same-data1 a b
== true
>> same-data1 a c
== true
>> same-data1 a d
== false

but was entirely too tedious to write! It would have been worse to debug, even for a simple typing error.  Let's consider another strategy that avoids all that tedium.

Stacking

The complexity of that last version resulted from entangling the states of three component concepts:

  • navigating through the left argument block,
  • navigating through the right argument block, and
  • comparing the values encountered along the way.

Separating those concepts will give us a much simpler solution.

In addition to grouping the code into components, we can separate the activity of those components in time.  We can first perform the navigation through the nested argument blocks (prior to any comparisons) by "flattening" them to remove any nested structure. The resulting data-only blocks then can be compared trivially.

How do we extract the flattened content of a nested block structure? Per the strategy from the previous article, we can begin with a simple recursive definition of... WAIT!

If we go down that path, we will be forgetting the DRY principle! We already have a navigator (NB-TOURGUIDE) that can take a passenger on a complete tour of the structure.  All we need is a passenger which collects the values into a block.

Beginning a new collection means starting with an empty block. Processing an additional datum means appending it to the block under construction.  Completing the process simply means returning the block already collected.

flat-data: func [nestedblock [block!]] [
    nb-tourguide nestedblock make object! [
        result: copy []
        process: func [x] [insert tail result x]
        begin: func [] [result: copy []]
        end: func [] [result]
    ]
]

Because NB-TOURGUIDE returns the final result from the passenger after completing the tour, it gives us:

>> flat-data a
== [1 2 1 3 1 4 1 2 1]
>> flat-data b
== [1 2 1 3 1 4 1 2 1]
>> flat-data c
== [1 2 1 3 1 4 1 2 1]
>> flat-data d
== [1 2 1 3 1 4 2 1 1]

Using the built-in test for equality of blocks, we can write a new version of SAME-DATA as follows:

same-data2: func [leftblock [block!] rightblock [block!]] [
    equal? flat-data leftblock flat-data rightblock
]

and try it out on the same test cases:

>> same-data2 a b
== true
>> same-data2 a c
== true
>> same-data2 a d
== false

So why is the name of this section "Stacking"? We're building a stack of components, taking the result of one function as an argument to another function, just as we would stack up toy building blocks or build a protocol stack for networking.  This approach allows us to design, develop, and test each layer or component independently of the others, which improves the speed and accuracy of the entire development process.

To create a different pair-wise test on nested blocks, we can apply any other test on the two flattened blocks in place of EQUAL? to create the new function.  No navigation coding will be required.
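For example, here is a sketch (PREFIX-DATA? is our own name and example, not from the article) that tests whether the left block's data begins the right block's data, re-using FLAT-DATA unchanged:

prefix-data?: func [leftblock [block!] rightblock [block!] /local lf] [
    lf: flat-data leftblock
    equal? lf copy/part flat-data rightblock length? lf
]

>> prefix-data? [[1 2] 1] [1 [2 [1 3]] 1 4]
== true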

Free lunches and other mythical creatures

We might conclude that we've exhausted this line of thinking (or at least exhausted ourselves! ;-) but we've been ignoring questions of performance.  Consider the following transcript:

>> leftblock: array/initial 10000 1
== [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1...
>> head change rightblock: copy leftblock 0
== [0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1...
 
>> same-data1 leftblock rightblock
== false
>> same-data2 leftblock rightblock
== false

We've already verified these two functions independently, so the results are not surprising.  But there's more to the tale than we can see by reading the transcript.  Let's add a little more information:

>> print [
[ now/time/precise
[ same-data1 leftblock rightblock
[ now/time/precise
[ same-data2 leftblock rightblock
[ now/time/precise
[ ]
8:03:38.45 false 8:03:38.45 false 8:03:42.3

SAME-DATA1 provides an answer almost immediately, while SAME-DATA2 takes almost four seconds to do the same job.  A similar time penalty occurs when we pass RIGHTBLOCK to NB-PRODUCT, NB-ALL-EVEN, or the equivalent computations built with NB-SUMMARIZE or NB-TOURGUIDE.

The reason for these delays is simple.  The final result never changes after:

  • a product computation encounters a zero,
  • an all-even test encounters an odd number, or
  • a same-data test encounters two different values.

However, only the passenger knows that fact; the navigator continues to grovel through the data structure even when there's no need for further data.  In fact, using the stacking strategy, the navigator completes all navigation before the passenger even gets to see the first data value.  The first design, by contrast, mixed the comparison and navigation together, so it was able to return a result at the earliest possible time.

Do we conclude that separation of concerns is an "inefficient" concept? You can already guess where we're headed; refactoring comes to the rescue again!

"Introverted" objects

We could certainly change the protocol between navigator and passenger to include a signal for whether the passenger needs to see any more data.  For example, the /PROCESS method could return TRUE to say "Keep going; I'll take additional data" or FALSE to say "My result won't change; you might as well stop now".  But that seems wrong for at least two reasons:

  • We were working to make the navigator and passenger as independent as possible, but this seems to make them more involved with each other.
  • There are plenty of possible passengers (such as the ones that calculate totals or averages) that will never signal an early completion; for them that mechanism is pure overhead with no benefit.

Instead of adding clutter, let's apply some additional refactoring to eliminate the need for the clutter.  Section 2.3 closed with the observation that the navigator didn't have to be the driver, so let's rearrange the seating.  Remember that commercial airlines have no problem separating the roles of pilot and navigator!

Passive navigation

The total of a flat block could be computed this way:

block-total: func [arg-block [block!] /local my-block result] [
    result: 0
    my-block: arg-block                  ;; start tour
    while [not empty? my-block] [        ;; tour over?
        result: result + first my-block  ;; current value
        my-block: next my-block          ;; on to next attraction
    ]
    result
]

We probably wouldn't write all that for a simple block total, but the point is to be explicit about every step in the process.  A "passive" navigator would be able to start the tour with a supplied data structure, and subsequently would respond to these requests:

  • to tell whether the tour is over,
  • to supply the current front value, and
  • to advance to the next value and wait for further requests.

In effect, we're going to turn the previous navigator/driver inside out.  This "introverted" navigator will only be concerned with its own state, and not where the requests for its services are coming from. A component that can provide values from a data structure on demand is conventionally called an "iterator".  We begin designing our nested block iterator as an object.

nb-iterator: make object! [
    original-data: []
    current-data: []
    restart: func [] [current-data: original-data]
    start: func [nested-block [block!]] [
        original-data: nested-block
        restart
    ]
    ;; more to come!
]

Keeping a reference to the original data structure allows the iterator to be restarted from the beginning, which lets us perform multiple computations over the same data.  But how do we construct the navigator to stop at each value and then resume later?

Inside-out loops

Remember that the body of the active tourguide contained this compound expression:

while [not empty? b][
either block? front: first b [
b: compose [(front) (next b)] ;; rewrite step
][
passenger/process front ;; process step
b: next b
]
]

That loop actually evaluates the rewrite step repeatedly, as long as NOT EMPTY? B and BLOCK? FRONT: FIRST B are both true.  That repetition takes an arbitrary state (of the nested data) and "stabilizes" it — transforms it into a stable state where either

  • the entire (current) block is empty, or
  • there's a usable value at the beginning of the (current) block.

We can use this as an informal specification for stabilizing the value of CURRENT-DATA, and get this:

stabilize: func [/local front] [
    while [all [
        not empty? current-data
        block? front: first current-data
    ]][
        current-data: compose [(front) (next current-data)]
    ]
]
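We can watch STABILIZE work at the console (a quick experiment of our own, treating CURRENT-DATA as a global word):

>> current-data: [[[] [1 2]] 3]
== [[[] [1 2]] 3]
>> stabilize
== [1 2 3]
>> current-data
== [1 2 3]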

Notice that a "stabilized" block structure has a number of very nice properties:

  • it is only empty when there are no more values to be found,
  • the current value (if there is one) is always at the front of the block with sub-block wrapping removed,
  • stabilizing an already-stabilized block structure is safe; nothing changes, so there is no chance of data loss.

If the repetition stops because B has become empty, the tour is over. If it stops because BLOCK? FRONT: FIRST B has become false, the tour guide evaluates the process step, then resumes the quest to eliminate leading sub-blocks.

Delegating the process step

Since we're separating the processing from navigation, the iterator doesn't evaluate the process step itself; instead it must supply the current value as a result for some other evaluation to use.  To be consistent with other kinds of data streams, we'll design our iterator so that it can always indicate whether there are any more data, and supply the first unused value if so.  That means it must stabilize CURRENT-DATA:

  • in the method that (re)starts the iterator, and
  • in the method that provides each value.

What happens when a caller requests another value from an empty collection? A polite caller will first ask whether there are more data, but we should prepare for poor manners also.  For simplicity, we'll return NONE if CURRENT-DATA is empty.  (A good alternative would be to return an ERROR! value indicating that the caller went past the end of the data.)

A complete iterator

Based on these ideas, we can now add the remaining methods to our iterator object:

nb-iterator: make object! [
    original-data: []
    current-data: []
    stabilize: func [/local front] [
        while [all [
            not empty? current-data
            block? front: first current-data
        ]][
            current-data: compose [(front) (next current-data)]
        ]
    ]
    restart: func [] [
        current-data: original-data
        stabilize
    ]
    start: func [nested-block [block!]] [
        original-data: nested-block
        restart
    ]
    at-end?: func [] [empty? current-data]
    more?: func [] [not empty? current-data]
    get-next: func [/local result] [
        either empty? current-data [
            none
        ][
            result: first current-data
            current-data: next current-data
            stabilize
            result
        ]
    ]
]

(Providing both AT-END? and MORE? methods is simply a notational convenience.)

We can test our iterator with a simple printing function to show that all of the values are presented in the proper order:

print-em: func [nb [block!] /local my-iterator] [
    prin " ( "
    my-iterator: make nb-iterator []
    my-iterator/start nb
    while [my-iterator/more?] [
        prin [my-iterator/get-next ""]
    ]
    print ")"
]

and perform a couple of tests, as follows:

>> print-em [[[[1 2] 3] 4] 5]
( 1 2 3 4 5 )
>> print-em [[[[]]]]
( )
>> print-em [1 [2 [3] [[[[4 5]]]]]]
( 1 2 3 4 5 )

Using the iterator

Remember our earlier objective: we want functions to produce their results at the earliest possible time.  In the case of the product of numbers in a nested block, that means we can quit when either:

  • we've run out of numbers to multiply, or
  • our product has become zero (and therefore can't change).

The combination of an active "multiplier" function and a passive navigator object gives us this version:

nb-product2: func [b [block!] /local iterator product] [
    iterator: make nb-iterator []
    iterator/start b
    product: 1
    while [iterator/more?] [
        if 0 = product: product * iterator/get-next [
            return 0
        ]
    ]
    product
]

(MAKEing a new iterator object lets us use more than one at a time.  Hold on to that thought!)

We can cheat and look under the hood to verify that we're getting our answer with no wasted effort.

>>  nb-product2 [1 [2 [3] [[[[4 5]]]]]]
== 120
>> mold get first second :nb-product2
== {
make object! [
original-data: [1 [2 [3] [[[[4 5]]]]]]
current-data: []
stabilize: func [/local front][
whi...

In that test case, CURRENT-DATA was completely consumed because all values were needed for the result.  Let's try another one that should bail out early:

>> nb-product2 [1 [[[[0]]]] [2 [3] [[[[4 5]]]]]]
== 0
>> mold get first second :nb-product2
== {
make object! [
original-data: [1 [[[[0]]]] [2 [3] [[[[4 5]]]]]]
current-data: [2 [3] [[[[4 5]]]]]
stabilize: func ...

In this case, the data following the zero were never reached by the iterator.  That's the behavior we were looking for!

"Same data" test revisited

We are now in a position to write a new version of the test which determines whether two nested blocks contain the same data.  It's as easy as writing the equivalent of the built-in EQUAL? test on flat blocks, which we'll do as a warm-up exercise.

Our function should return:

  • TRUE if all values are (pair-wise) equal and both blocks run out of data at the same time; but
  • FALSE if a pair of unequal values is found or one block runs out of data before the other.

We can step through the two blocks as long as neither is at end and their first values are equal.  When we stop, the result is true exactly when both are empty.

fb-same-data: func [left [block!] right [block!]] [
    while [
        all [
            not tail? left
            not tail? right
            left/1 = right/1
        ]
    ][
        left: next left
        right: next right
    ]
    to-logic all [tail? left tail? right]
]
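A few quick checks at the console (our own test cases, not from the article):

>> fb-same-data [1 2 3] [1 2 3]
== true
>> fb-same-data [1 2 3] [1 2]
== false
>> fb-same-data [1 2] [1 3]
== false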

Our test function using nested block iterators is even simpler, since getting a value (for the comparison) automatically advances us to the next value.  A direct equivalent of the above would be:

nb-same-data3: func [
    left [block!] right [block!]
    /local li ri
][
    li: make nb-iterator []  li/start left
    ri: make nb-iterator []  ri/start right
    while [
        all [
            li/more?
            ri/more?
            li/get-next = ri/get-next
        ]
    ][]
    to-logic all [li/at-end? ri/at-end?]
]

Since the body of the WHILE is empty, we can use De Morgan's law to replace the WHILE with an UNTIL.
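In other words, looping while all of several conditions hold is the same as looping until any of their negations holds. Schematically (COND1, COND2, and COND3 are placeholders of our own):

while [all [cond1 cond2 cond3]] []           ;; empty body, as above
until [any [not cond1 not cond2 not cond3]]  ;; the same loop, inverted

In the rewrite below, NOT MORE? becomes AT-END? and NOT (A = B) becomes A <> B.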

nb-same-data4: func [
    left [block!] right [block!]
    /local li ri
][
    li: make nb-iterator []  li/start left
    ri: make nb-iterator []  ri/start right
    until [
        any [
            li/at-end?
            ri/at-end?
            li/get-next <> ri/get-next
        ]
    ]
    to-logic all [li/at-end? ri/at-end?]
]

Finally, we can expand our earlier timing test to include this last version:

>> print [ now/time/precise
[ same-data1 leftblock rightblock now/time/precise
[ same-data2 leftblock rightblock now/time/precise
[ nb-same-data4 leftblock rightblock now/time/precise
[ ]
7:52:23.97 false 7:52:23.97 false 7:52:27.65 false 7:52:27.65

The iterator-based version is essentially as fast as the combinatorial first attempt, but far easier to write and read, since each concept (navigating a nested block, testing the sequences for equality, and quitting as soon as we know the answer) now has a specific "place" to live in our code.

Summary and next challenge

Let's step back from the details and see where this has taken us, and where else we can go with it.

Benefits

We can use refactoring simply to understand better what our programs are doing and how we might approach their design.  We can also use refactoring to find and make good trade-off decisions among the options available to us (for speed, memory footprint, simplicity, code reuse, etc.).

There's a "beneficial cycle" (the opposite of a "vicious cycle") at work here; sometimes an approach opens up new ways to think about our programs, which then expands the range of problems we can handle quickly and easily, which makes us think differently about how to program... and so on!

Strategies and accomplishments

In the first article, we worked through some strategies for refactoring functions from recursive to iterative form.  Those strategies used local variables to represent the state of an evaluation explicitly, rather than leaving the state implicit in arguments and "current position" during evaluation. Making state explicit allows us to see precisely what changes with each action of the code, sometimes using even less time or memory than the original.

In this second article, we've taken iterative functions and refactored them into objects that can be started and stopped "in the middle" of performing the original function's evaluations.  This strategy divides the overall state of the computation into separate objects, with each representing the state of one distinct concept. This division allows us to think about each concept independently of the others, and to write expressions that incrementally change state "just enough".

Looking ahead

In that vein, here's a little challenge to prepare for the next article.  Consider these two iterative functions:

f-dee: func [] [repeat i 5 [print ["Tweedledee says" i]]]
f-dum: func [] [repeat i 5 [print ["Tweedledum says" i]]]

Evaluating them consecutively gives us:

>> f-dee f-dum
Tweedledee says 1
Tweedledee says 2
Tweedledee says 3
Tweedledee says 4
Tweedledee says 5
Tweedledum says 1
Tweedledum says 2
Tweedledum says 3
Tweedledum says 4
Tweedledum says 5

Now, imagine being able to evaluate them at the same time.  Their output would be interleaved in some way, depending on how they actually took turns printing to the console.

With that in mind, consider a simple iterator:

i-count: make object! [
    n: 1
    limit: 5
    f: func [a [number!]] []
    start: func [] [n: 1]
    more?: func [] [n <= limit]
    do-next: func [/local nn] [nn: n  n: n + 1  f nn]
]

which counts from 1 to LIMIT (default of 5), applying F (default of doing nothing) to each counted value.  We can instantiate it twice:

i-dee: make i-count [f: func [a] [print ["Tweedledee says" a]]]
i-dum: make i-count [f: func [a] [print ["Tweedledum says" a]]]

and do this:

>> i-dee/start   while [i-dee/more?] [i-dee/do-next]
Tweedledee says 1
Tweedledee says 2
Tweedledee says 3
Tweedledee says 4
Tweedledee says 5
>> i-dum/start while [i-dum/more?] [i-dum/do-next]
Tweedledum says 1
Tweedledum says 2
Tweedledum says 3
Tweedledum says 4
Tweedledum says 5

So far we've only reproduced the effect of evaluating F-DEE and F-DUM consecutively (with more typing!).  But look what else we can do with these iterators:

round-robin: func [tasks [block!] /local scan] [
    tasks: reduce tasks
    foreach task tasks [task/start]
    while [not empty? scan: tasks] [
        forall scan [
            either scan/1/more? [
                scan/1/do-next
            ][
                remove scan
            ]
        ]
    ]
]

This last function gives us the following behavior:

>> round-robin [i-dee i-dum]
Tweedledee says 1
Tweedledum says 1
Tweedledee says 2
Tweedledum says 2
Tweedledee says 3
Tweedledum says 3
Tweedledee says 4
Tweedledum says 4
Tweedledee says 5
Tweedledum says 5

ROUND-ROBIN causes the argument objects to take turns fairly; it serves as a scheduler to control the activity of a block of objects, but without involvement in their actual work.  All it "knows" about those objects is that each one can be told to START, can be asked if it has MORE? to do, and can be told to DO-NEXT (perform the next thing on its own agenda).  That resembles the relationship we explored earlier between an active navigator and its passenger object.

By refactoring iterative functions into iterators, and passing those iterators to a scheduler, we've implemented a simple scheme for multitasking Rebol expressions.

Challenge

Here are a couple of questions to consider:

  • How could we write a scheduler to conduct a race between its argument objects? (Think about board games for inspiration.)
  • How can we create task objects that do something more interesting than simply counting?

We'll continue this line of thinking in the next article.  Happy refactoring!