s c h e m a t i c s : c o o k b o o k

/ Cookbook.ListFlatten

This Web


WebHome 
WebChanges 
TOC (with recipes)
NewRecipe 
WebTopicList 
WebStatistics 

Other Webs


Chicken
Cookbook
Erlang
Know
Main
Plugins
Sandbox
Scm
TWiki  

Schematics


Schematics Home
Sourceforge Page
SchemeWiki.org
Original Cookbook
RSS

Scheme Links


Schemers.org
Scheme FAQ
R5RS
SRFIs
Scheme Cross Reference
PLT Scheme SISC
Scheme48 SCM
MIT Scheme scsh
JScheme Kawa
Chicken Guile
Bigloo Tiny
Gambit LispMe
GaucheChez

Lambda the Ultimate
TWiki.org

How to flatten a list.

Problem

To flatten a list means to destructure a list into a flat list. For example:

(flatten '(a (b (c d) e) f)) ;=> (a b c d e f)

The solution presented is a non-destructive function that operates very quickly. For instance, it out-performs this popular implementation of flatten:

(define (flatten x:xs)
  (define (f x:xs result)
    (cond
      [(null? x:xs) result]
      [(pair? (car x:xs)) (f (cdr x:xs) (f (car x:xs) result))]
      [else (f (cdr x:xs) (cons (car x:xs) result))]))
  (reverse! (f x:xs ())))

Notice that this version of flatten utilizes the destructive function `reverse!'. However, it is being used on the newly created list, and not the input list. Thus, this function is not classified as destructive, since its argument remains unchanged.

Solution

The fastest way to flatten a list non-destructively, that the author is aware of, happens to be the algorithm shown above, without reversing the final result. Thus, if the order of elements of the list to be flattened is of no significance to your particular requirement to flatten a list, this author recommends that you use the function as implemented above.

However, if the order of elements is important to you, then you will find the following function to be very fast, and robust in its ability to handle various structures of lists and size, without degrading its performance by more than a small factor. The algorithm implemented below operates in linear time with respect to the number of pair?'s in the list, in other words it has complexity O(n).

(define (flatten x:xs)
  (let* ([result (cons '() '())][last-elt result])
    (define (f x:xs)
      (cond
        [(null? x:xs) result]
        [(pair? (car x:xs)) (f (car x:xs)) (f (cdr x:xs))]
        [else (set-cdr! last-elt (cons (car x:xs) '()))
              (set! last-elt (cdr last-elt)) 
              (f (cdr x:xs))]))
    (f x:xs)
    (cdr result)))

Discussion

You'll notice this function also employs a destructive function, this time, the primitive `set-cdr!'. However, like the implementation above, the destructive primitive is used in the building of the newly created list, and not on its argument, so this function is non-destructive. You may also notice that it employs an odd initial value for the list being created; e.g., the pair? (cons '() '()). This is done to avoid an expensive special case for the very first element added to the list being built, that otherwise would be need to be included within the internal function f, where all the recursion takes place.

The overview of this function's algorithm is to avoid the need to reverse the result by maintaining a lexically bound variable, that always remains bound to the value of the last pair? seen, and in producing exactly one new cons pair for every pair in its argument, plus the single initial cons pair described above. The author has benchmarked this implementation of flatten with a variety of list structures, including a deeply nested list, a list having the shape of a binary search tree and a completely flat list. The flat list is of particular interest, because the author found other implementations to be fragile when they were handed the answer as their argument.

The benchmarks were performed using DrScheme in no debug mode, as well as being repeated with the Chicken Scheme to C compiler. The ranking of the various algorithms changed somewhat between the slower implementations, but did not affect that of the three fastest algorithms. Lists lengths of 2,000,000 were possible with DrScheme. Note, the length merely represents the size of the cdr spine of a list. The actual lists contained substantially more elements than reported by the `length' function, due to nesting within the lists. For technical reasons, I was not able to test beyond lists of length 1,000,000 using the Chicken compiler. This, no doubt, is the result of my inexperience with the compiler.

The author would be very appreciative of the results of other benchmarks, especially on platforms other than DrScheme. Please add your comments and results if you are interested in comparing these functions and results on any platform, and especially if you know of an implementation of flatten which you believe to be faster.


Comments about this recipe

Contributors

The author is grateful to Robby Findler, who provided a critique of my code, which led to writing the implementation of flatten presented in this recipe.

-- KyleSmith - 03 Jun 2007

CookbookForm
TopicType: Recipe
ParentTopic: ListRecipes
TopicOrder: 999

 
 
Copyright © 2004 by the contributing authors. All material on the Schematics Cookbook web site is the property of the contributing authors.
The copyright for certain compilations of material taken from this website is held by the SchematicsEditorsGroup - see ContributorAgreement & LGPL.
Other than such compilations, this material can be redistributed and/or modified under the terms of the GNU Lesser General Public License (LGPL), version 2.1, as published by the Free Software Foundation.
Ideas, requests, problems regarding Schematics Cookbook? Send feedback.
/ You are Main.guest