Monday, January 15, 2024

An interesting edge case in C# yield state machines

Among the many useful LINQ methods, there's no functional ForEach. People often implement it on their own; as an extension method, it takes just a few lines of code.
Someone did that in one of our projects, too:
public static void ForEach1<T>( this IEnumerable<T> enumeration, Action<T> action )
{
	foreach ( T item in enumeration )
	{
		action( item );
	}
}
This works great. Someone else, however, noticed that it doesn't allow further chaining, since the method returns void.
This someone else modified the code slightly:
public static IEnumerable<T> ForEach2<T>( this IEnumerable<T> enumeration, Action<T> action )
{
	foreach ( T item in enumeration )
	{
		action( item );
		yield return item;
	}
}
This doesn't work great. Frankly, it doesn't work at all. Surprising but true. Take this:
using System;
using System.Collections.Generic;
using System.Linq;

internal class Program
{
	static void Main( string[] args )
	{
		Test1();
		Test2();

		Console.ReadLine();
	}

	static void Test1()
	{
		var array = new int[] { 1,2,3 };

		var list = new List<int>();

		array.ForEach1( e => list.Add( e ) );

		Console.WriteLine( list.Count() );
	}
	static void Test2()
	{
		var array = new int[] { 1,2,3 };

		var list = new List<int>();

		array.ForEach2( e => list.Add( e ) );

		Console.WriteLine( list.Count() );
	}
}

public static class EnumerableExtensions
{
	public static void ForEach1<T>( this IEnumerable<T> enumeration, Action<T> action )
	{
		foreach ( T item in enumeration )
		{
			action( item );
		}
	}

	public static IEnumerable<T> ForEach2<T>( this IEnumerable<T> enumeration, Action<T> action )
	{
		foreach ( T item in enumeration )
		{
			action( item );
			yield return item;
		}
	}
}
While this is expected to produce 3 for both tests, it produces 3 and 0. The upgraded version doesn't work as expected.
The reason for this is a limitation of sorts in the state machine the compiler generates for the yield sugar. Calling an iterator method only creates the state machine; none of the method body executes until the result is actually enumerated. This means that changing
array.ForEach2( e => list.Add( e ) );
to
array.ForEach2( e => list.Add( e ) ).ToList();
would "fix it". What a poor foreach, though, that requires an extra terminator (the ToList) and doesn't work otherwise.
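To see the deferral directly, here is a minimal sketch that checks the list before and after the result is enumerated. The names DeferredDemo, ForEachLazy and Run are made up for this illustration; ForEachLazy has the same body as ForEach2.

```csharp
using System;
using System.Collections.Generic;

public static class DeferredDemo
{
	// Hypothetical name; same body as ForEach2. Because of yield return,
	// the compiler turns this method into a state machine.
	public static IEnumerable<T> ForEachLazy<T>( this IEnumerable<T> enumeration, Action<T> action )
	{
		foreach ( T item in enumeration )
		{
			action( item );
			yield return item;
		}
	}

	// Returns the list size before and after the result is enumerated.
	public static (int before, int after) Run()
	{
		var list = new List<int>();

		// This call only creates the iterator object; the loop body has not run yet.
		var query = new int[] { 1, 2, 3 }.ForEachLazy( e => list.Add( e ) );
		int before = list.Count;

		// Enumerating calls MoveNext, and only then does the body execute.
		foreach ( var item in query ) { }
		int after = list.Count;

		return ( before, after );
	}

	public static void Main()
	{
		var result = Run();
		Console.WriteLine( result.before ); // 0
		Console.WriteLine( result.after );  // 3
	}
}
```

Any consumer that walks the sequence (a foreach, ToList, Count) has the same effect as the empty loop here.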
Luckily, a simpler fix exists. Just forget the state machine altogether for this specific case:
public static IEnumerable<T> ForEach2<T>( this IEnumerable<T> enumeration, Action<T> action )
{
	foreach ( T item in enumeration )
	{
		action( item );
	}

	return enumeration;
}
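A short sketch of how the eager version behaves when chained, assuming an in-memory source such as an array. EagerDemo, ForEachEager and Run are hypothetical names used only here; ForEachEager has the same body as the fixed ForEach2.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EagerDemo
{
	// Hypothetical name; same body as the fixed ForEach2. No yield, so this is
	// an ordinary method and the loop runs as soon as it is called.
	public static IEnumerable<T> ForEachEager<T>( this IEnumerable<T> enumeration, Action<T> action )
	{
		foreach ( T item in enumeration )
		{
			action( item );
		}

		return enumeration;
	}

	public static (int seen, int[] doubled) Run()
	{
		var seen = new List<int>();

		// The action runs immediately; Select and ToArray then walk the source again.
		int[] doubled = new int[] { 1, 2, 3 }
			.ForEachEager( e => seen.Add( e ) )
			.Select( e => e * 2 )
			.ToArray();

		return ( seen.Count, doubled );
	}

	public static void Main()
	{
		var result = Run();
		Console.WriteLine( result.seen );                        // 3
		Console.WriteLine( string.Join( ",", result.doubled ) ); // 2,4,6
	}
}
```

One thing to keep in mind with this approach: whatever is chained afterwards enumerates the source a second time. That is harmless for arrays and lists, but it can be costly or surprising for sources that are expensive to enumerate or can only be consumed once.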
