Friday, July 18, 2008

Generating random sequences with LINQ

A concise random number generator

Inspired by The LINQ Enumerable Class article from the lastest MSDN Magazine issue, I started to play a little bit with the random sequence generator:

/* generate the sequence */
Random rnd = new Random();
var sequence = Enumerable.Range( 1, 10 ).OrderBy( n => rnd.Next() );
 
/* write down the sequence */
sequence.ToList().ForEach( n => Console.WriteLine( n ) );

As you can see, the sequence is generated with two statements, the first one initializes a random number generator and the second one sorts the list using the results from the generator.


My immediate thought was: "Is there a way to genearte such random sequence using a single statement?"


The obvious approach



var sequence = Enumerable.Range( 1, 10 ).OrderBy( n => (new Random()).Next() );

does not work because a new random number generator is created for each value from the list and the same random value is returned from all these newly created generators.




Random number generator testing

Let's write a test of the random number generator - we treat consecutive values from the array as coordinates and draw an image of the generator:



static void RandomSequenceGeneratorTest( IEnumerable<int> Sequence )
{
    int Max = Sequence.Max();
 
    /* create image */
    using ( Bitmap bitmap = new Bitmap( Max, Max ) )
    using ( Graphics graphics = Graphics.FromImage( bitmap ) )
    {
        graphics.Clear( Color.Black );
 
        int prev = Sequence.First();
        foreach ( var current in Sequence.Skip(1) )
        {
            graphics.DrawRectangle( 
                Pens.White, 
                new Rectangle( 
                    /* draw a point 
                       using current and previous values
                       as coordinates
                    */
                    new Point( prev, current ), 
                    new Size( 1, 1 ) ) );
 
            prev = current;
        }
 
        bitmap.Save( "test.png", ImageFormat.Png );
    }
}

Testing the generator (version 1)

Let's also test our generator:



var sequence = Enumerable.Range( 1, 500 ).OrderBy( n => ( new Random() ).Next() );
RandomSequenceGeneratorTest( sequence );


An improved generator (version 2)

However, I can modify the generator slightly to get much better results!



var sequence = Enumerable.Range( 1, 500 ).OrderBy( n => n * ( new Random() ).Next() );
RandomSequenceGeneratorTest( sequence );


Hey! It seems that the multiplication makes the generator much less predictive, however as the clear pattern reveals, the resulting sequence is still not random!


What exacly happens is the arithmetic overflow caused by



n * (new Random()).Next()

which "distrubutes" multiplied values in a more "random" way.


Do we really need (new Random())?

After I've realized that the "randomness" of the improved generator is caused by arithmetic overflows, I immediately thought of getting rid of the (new Random()). Maybe the overflow itself is enough to get random sequence?



var sequence = Enumerable.Range( 1, 500 ).OrderBy( n => n * 1234567890 );
RandomSequenceGeneratorTest( sequence );


Well, it is not. Altough the sequence printed on the console looks quite random at first sight, the image reveals that it's not random at all. It seems that the value produced by the (new Random()).Next() is then important, even though the first test (version 1) revealed that the value itself is not enough!


Even more improved generator (version 3)

Let's go back to our improved generator and try to improve it even more:



var sequence = Enumerable.Range( 1, 500 ).OrderBy( n => n * n * ( new Random() ).Next() );
RandomSequenceGeneratorTest( sequence );


I guess, I am satisfied. However, I do not think I am curious enough to dig for a exhaustive explanation. Does really the arithmetic overflow causes this generator to produce random sequence? What's the exact role of (new Random()).Next() here? It seems that it's not important itself (version 1), however removing it completely also does not work.


I belive that the deferred nature of LINQ enumeration is the one of keys to explain these obervations. I would also like to see some deeper and more throughout tests of my "yet-improved" generator.


On the other hand, could it be possible that incidentally, by making a square function (n*n), I've built a chaotic function with random distribution over its domain?

3 comments:

Anonymous said...

I believe this is not random, more like modulo by range-max.
It it not equal to:

int nConst = (new Random()).Next()

var sequence = Enumerable.Range( 1, 500 ).OrderBy( n => n * n * nConst );

?

Wiktor Zychla said...

there's an interesting and quite obvious proposal on:

http://www.linqexchange.com/post/2008/07/How-to-Generate-a-Random-Sequence-with-LINQ.aspx

where Guid.NewGuid() is used to order sequences.

nice trick to remember!

Anonymous said...

order by guid.newguid.tostring