Sunday, October 16, 2016

C# List-Sampling Extension Method

While messing around with some for-fun code in C# today, I found myself looking for something along the lines of Ruby's Array#sample method, but for .NET's IList. Strangely (or perhaps not-so-strangely, because when you search for something like "C# IList sample extension", you get tons of noise), I couldn't find anything. So I wrote a simple one to share here:


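using System;
using System.Collections.Generic;
using System.Linq;

public static class ListExtensions
{
    // Shared generator for the overloads that don't take a Random.
    // System.Random isn't thread-safe, so access is guarded by lockObj.
    static readonly Random myRand = new Random();
    static readonly object lockObj = new object();

    // Returns one random element, or default(T) for a null or empty list.
    public static T Sample<T>(this IList<T> list)
    {
        lock (lockObj)
        {
            return Sample(list, myRand);
        }
    }

    public static T Sample<T>(this IList<T> list, Random rand)
    {
        if (list == null || list.Count == 0)
        {
            return default(T);
        }
        return list[rand.Next(list.Count)];
    }

    // Returns up to `size` random elements, each source item used at most once.
    public static IList<T> Sample<T>(this IList<T> list, int size)
    {
        lock (lockObj)
        {
            return Sample(list, size, myRand);
        }
    }

    public static IList<T> Sample<T>(this IList<T> list, int size, Random rand)
    {
        if (list == null)
        {
            return new List<T>();
        }
        // Tag each item with a random key, sort by it, and keep the first `size`.
        return list.Select(x => new { Entropy = rand.Next(), Item = x })
            .OrderBy(x => x.Entropy)
            .Select(x => x.Item)
            .Take(size)
            .ToList();
    }
}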

Notes:

  • I put this on IList<T> instead of, say, IEnumerable<T>, because indexed access lets the single-item overloads pick an element directly (constant time on a List<T> or an array) instead of walking the sequence. (That also happens to be the specific case I needed today.)
  • You can pass your own Random instance or not. If you don't, the call goes through a shared static Random behind a lock, because System.Random isn't thread-safe, so you'll pay a locking penalty.
  • Samples of size > 1 are implemented with a LINQ-style shuffle: tag each item with a random key, sort by that key, and take the first size items.
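
The sort-based shuffle in the last note orders the whole list even when only a few items are wanted. One possible alternative, sketched here on the assumption that copying the list once is acceptable, is a partial Fisher-Yates shuffle that stops after size swaps. The class and method names below are made up for illustration and are not part of the code above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, separate from ListExtensions above.
public static class ListSamplingSketch
{
    // Returns up to `size` items, each source position used at most once.
    public static IList<T> SampleByPartialShuffle<T>(this IList<T> list, int size, Random rand)
    {
        if (list == null || size <= 0)
        {
            return new List<T>();
        }

        // Work on a copy so the caller's list is left untouched.
        var pool = new List<T>(list);
        int take = Math.Min(size, pool.Count);

        // Partial Fisher-Yates: fix one position per iteration by swapping in
        // a uniformly chosen element from the not-yet-fixed tail.
        for (int i = 0; i < take; i++)
        {
            int j = rand.Next(i, pool.Count); // i <= j < pool.Count
            T tmp = pool[i];
            pool[i] = pool[j];
            pool[j] = tmp;
        }

        return pool.GetRange(0, take);
    }
}
```

Called as something like new List<int> { 1, 2, 3, 4, 5 }.SampleByPartialShuffle(3, new Random()), this should hand back three distinct items in random order. The copy costs O(n), but the sort goes away, which may matter for long lists and small sample sizes.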
