Easily Create DataTables For Unit Tests

Ideally, you have a data access layer that only returns domain objects or data transfer objects. The key is real, strongly-typed objects. One of the many reasons for this is to make unit testing easier. Unfortunately, sometimes you are stuck working with code that uses datatables. I’ve recently written a function that takes most of the pain out of creating datatables for unit testing. This function and the code that calls it take advantage of several new C# features: lambda expressions, object initializers, list initializers, and auto-implemented properties. It also uses generics and reflection.

The first thing that stinks about creating datatables from scratch is that you have to manually define the columns, which is tedious. The second thing that stinks is that you can only add rows to the datatable using an object array, which means you have to count commas to keep track of which column you are populating. Or you have to do something awkward like obj[datatable.Columns["FieldName"].Ordinal] = "somevalue". To get around this, the first step is to define the structure of our rows by creating a simple class, e.g.:

public class Person
{
	public string LastName { get; set; }
	public string FirstName { get; set; }
	public DateTime DateOfBirth { get; set; }
	public decimal Salary { get; set; }
}

Now we can call the method with a very convenient syntax that capitalizes on object and list initializers:

public DataTable GetPeople()
{
	return ListToTable(new List<Person> {
		new Person {
			LastName = "Opincar",
			FirstName = "John",
			DateOfBirth = new DateTime(1901, 1, 1),
			Salary = 250000.00M
		},
		new Person {
			LastName = "Lincoln",
			FirstName = "Abe",
			DateOfBirth = new DateTime(1801, 1, 15),
			Salary = 1000.00M
		}
	});
}

If you’ve ever manually populated datatables you can really appreciate what a huge improvement this is.

Finally, we discuss the method itself. Using reflection, we can leverage the type information stored in Person to create our datatable columns in a generic method that takes a List<T> as input and returns a datatable:

public DataTable ListToTable<T>(List<T> rows)
{
	var dt = new DataTable();
	// One column per public property, named and typed after the property.
	var props = typeof(T).GetProperties();
	Array.ForEach(props, p => dt.Columns.Add(p.Name, p.PropertyType));
	foreach (var r in rows)
	{
		// Read the property values in the same order as the columns.
		object[] vals = new object[props.Length];
		for (int idx = 0; idx < vals.Length; idx++)
		{
			vals[idx] = props[idx].GetValue(r, null);
		}
		dt.Rows.Add(vals);
	}
	return dt;
}

I hope you find this method as useful as I have.
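As a rough sketch of how a unit test might lean on this, the block below repeats the method as a static helper and checks the resulting table by column name. The TableBuilder and TableBuilderDemo class names are made up for the illustration; Person is the class from earlier in the post.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public class Person
{
    public string LastName { get; set; }
    public string FirstName { get; set; }
    public DateTime DateOfBirth { get; set; }
    public decimal Salary { get; set; }
}

// Hypothetical holder class so the method can be called without an instance.
public static class TableBuilder
{
    public static DataTable ListToTable<T>(List<T> rows)
    {
        var dt = new DataTable();
        var props = typeof(T).GetProperties();
        Array.ForEach(props, p => dt.Columns.Add(p.Name, p.PropertyType));
        foreach (var r in rows)
        {
            var vals = new object[props.Length];
            for (int idx = 0; idx < vals.Length; idx++)
            {
                vals[idx] = props[idx].GetValue(r, null);
            }
            dt.Rows.Add(vals);
        }
        return dt;
    }
}

public static class TableBuilderDemo
{
    public static void Main()
    {
        DataTable dt = TableBuilder.ListToTable(new List<Person> {
            new Person { LastName = "Lincoln", FirstName = "Abe",
                DateOfBirth = new DateTime(1801, 1, 15), Salary = 1000.00M }
        });

        // Columns are addressed by name, so there are no commas to count.
        Console.WriteLine(dt.Columns["Salary"].DataType);   // System.Decimal
        Console.WriteLine(dt.Rows[0]["LastName"]);           // Lincoln
    }
}
```

One caveat worth knowing: DataTable columns cannot be declared with a nullable value type such as decimal?, so a row class with nullable properties would need extra handling that this sketch does not attempt.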

Generic Methods in Non-Generic Classes in C#

I’ve been using generic classes quite a bit over the last few years. However, I had never used a generic method in a non-generic class until today. I had a web service class that was using declarative attributes to generate the WSDL. I already had a generic class, WebMethod<R>, that I was using as the base class of all my web methods to encapsulate common functionality such as authentication, logging, etc. There were some things that I needed to do in the web service class that I did not want to move to the generic web method class. However, I did not want to duplicate that code in each web method of the web service. Here’s the solution, a generic method in a non-generic class. This code has been simplified for demo purposes:

[WebService(Namespace="http://blah.com/SomeWebServices/",
  Description="Some web service.")]
public class SomeWebService : System.Web.Services.WebService {
	public R Run<R>(WebMethod<R> method) where R : IWebServiceResponse, new() {
		method.UserHostAddress = this.Context.Request.UserHostAddress;
		if (AppConfig.GetSetting("systemFactoryType").ToLower() == "local") {
			method.SystemFactory = new LocalSystemFactory();
		}
		return method.Run();
	}

	[WebMethod(Description = "Take Action.")]
	public ActionResponse Action(UserToken user, ActionRequest request) {
		return Run(new ActionWebMethod(user, request));
	}

	[WebMethod(Description = "Fiddle.")]
	public FiddleResponse Fiddle(UserToken user, FiddleRequest request) {
		return Run(new FiddleWebMethod(user, request));
	}
}

public abstract class WebMethod<R> where R : IWebServiceResponse, new() {
	public SystemFactory SystemFactory {
		set { _systemFactory = value; }
	}

	protected abstract void RunMethod();

	public R Run() {
		_response = new R();
		if (Login()) {
			RunMethod();
		}
		return _response;
	}
}

WebMethod initializes SystemFactory to a remoted implementation in the constructor. I wanted to be able to change that to a non-remoted implementation via the web.config. Similarly, I did not want to add Host Address to the WebMethod constructor, but when I am actually calling the web methods from the web service, I do want to record the Host Address. Without the generic method, I would end up duplicating this code in the body of each web method.

Note that I do not have to explicitly provide a type when I am calling Run; the compiler infers R from the argument, because each concrete web method class derives from WebMethod<R> with a specific R.
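Stripped of the web service plumbing, the same shape can be tried in a console project. This is only a sketch: IResponse, PingResponse, Operation<R>, PingOperation, and Service are invented stand-ins for the real classes, and the hard-coded address stands in for the request context.

```csharp
using System;

public interface IResponse { string Status { get; set; } }

public class PingResponse : IResponse
{
    public string Status { get; set; }
}

// Generic base class, in the spirit of WebMethod<R>.
public abstract class Operation<R> where R : IResponse, new()
{
    public string CallerAddress { get; set; }

    protected abstract void Execute(R response);

    public R Run()
    {
        var response = new R();
        Execute(response);
        return response;
    }
}

public class PingOperation : Operation<PingResponse>
{
    protected override void Execute(PingResponse response)
    {
        response.Status = "pong from " + CallerAddress;
    }
}

// Non-generic class; the one generic method holds the shared setup code.
public class Service
{
    public R Run<R>(Operation<R> op) where R : IResponse, new()
    {
        op.CallerAddress = "127.0.0.1";   // stand-in for per-request context
        return op.Run();
    }

    public PingResponse Ping()
    {
        // No type argument needed: R is inferred as PingResponse
        // because PingOperation derives from Operation<PingResponse>.
        return Run(new PingOperation());
    }
}

public static class ServiceDemo
{
    public static void Main()
    {
        Console.WriteLine(new Service().Ping().Status);   // pong from 127.0.0.1
    }
}
```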

C# Generic Covariance

Intuitively, an immutable, generic collection of a subclass should be covariant with a collection of the superclass. Here’s an example:

interface IWidget { }
class Widget : IWidget { }
interface IWidgetCollection : IEnumerable<IWidget> { }
class WidgetCollection : List<Widget> { }

As I think most experienced developers will recognize, this is not a contrived example (aside from the “Widget” part) and is something that would be extremely useful. I am repeatedly surprised when I receive the compiler error that you cannot convert WidgetCollection to IEnumerable<IWidget>. List implements IEnumerable so WidgetCollection implements IEnumerable<Widget>. Thus, the crux of the matter is: should IEnumerable<Widget> be covariant with IEnumerable<IWidget> given that Widget : IWidget? I say yes and I’ve yet to see a good reason as to why not.

There’s actually a very good reason why List<Widget> is not covariant with List<IWidget>. In a nutshell, covariance of mutable collections would allow an instance of one subclass to be inserted into a collection of a different subclass through upcasting. See this for full-blown details. However, this scenario does not apply to IEnumerable or to immutable collections.
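The insertion hazard is easy to see with arrays, because C# arrays of reference types are covariant. The sketch below uses a hypothetical second implementation, Gadget, to trigger it, and then shows the read-only case that cannot go wrong. The last call relies on IEnumerable<out T>, which was declared covariant in C# 4.0; with the compiler this post was written against, that line is exactly the conversion error described above.

```csharp
using System;
using System.Collections.Generic;

public interface IWidget { }
public class Widget : IWidget { }
public class Gadget : IWidget { }   // hypothetical second implementation

public static class CovarianceDemo
{
    // Writing through an upcast array reference compiles but fails at run time.
    public static bool ArrayWriteFails()
    {
        IWidget[] widgets = new Widget[1];   // allowed: array covariance
        try
        {
            widgets[0] = new Gadget();       // the array really holds Widgets
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;                     // the runtime rejects the write
        }
    }

    // Enumeration only reads, so treating Widgets as IWidgets is always safe.
    public static int CountWidgets(IEnumerable<IWidget> widgets)
    {
        int n = 0;
        foreach (IWidget w in widgets) n++;
        return n;
    }

    public static void Main()
    {
        Console.WriteLine(ArrayWriteFails());   // True
        // Compiles on C# 4.0 and later only.
        Console.WriteLine(CountWidgets(new List<Widget> { new Widget(), new Widget() }));   // 2
    }
}
```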

I’ve encountered this very sticky problem twice in the real world. Most recently, I was trying to use generic collections of an interface in combination with generic collections of classes that implemented the interface to implement the strategy pattern in a way that didn’t require downcasting.

List<A> aic = _retreiver.Get(max);
foreach (A ai in aic)
{
    _updater.UpdateRemote(ai);
    _updater.UpdateLocal(ai);
}
_updateCompletionRecorder.RecordCompletion(aic);

_retreiver, _updater, and _updateCompletionRecorder are strategies and A is a generic type variable. In terms of the intro example, I needed _updateCompletionRecorder to be able to work with Widgets instead of IWidgets without downcasting in the implementation.

In addition, the project I was working on uses remoting. So I needed to declare non-generic types (which I feel is good practice anyway) for my collections since you cannot serialize a directly generic collection, e.g., WidgetCollection (serializable) instead of List<Widget> (not serializable). I was ultimately able to leave List<> in to solve the problem since I ended up not needing to cross a remoting boundary in this case. However, I wasted quite a bit of time and ended up with an inferior design, IMO.

Everything would have been much better if IEnumerable<Widget> had been covariant with IEnumerable<IWidget>.

Implementing ITypedList for Virtual Properties

ITypedList allows you to create “views” of objects for databinding purposes without actually having to modify the object(s) underlying the view. You can prevent public properties from showing up by using attributes. However, attributes won’t allow you to present methods as properties or crosstab an array of values. Prior to developing this technique, I would implement a method that returned a DataTable that I used for databinding. This works OK for read-only displays but you still end up duplicating data in memory, at least for a short time. Once you allow updating, you must keep both the view (datatable) and the model (object collection) in memory and synchronized. At this point, implementing ITypedList becomes a very elegant, efficient solution. (see a later post for more on this topic).

In the .NET 2.0 world, you should be using a generic collection class as the base for your underlying data. So the first question is, since List<X> is a strongly typed list, will databinding still recognize the ITypedList implementation? Fortunately, the answer is yes. The databinding system in .NET 2.0 Windows Forms will display a class defined as

public class PersonCollection : List<Person>, ITypedList {...}

using the ITypedList implementation instead of the default view consisting of all of the public properties of Person.

On the surface, implementing ITypedList looks easy. You only have to write two methods:

PropertyDescriptorCollection GetItemProperties(PropertyDescriptor[] listAccessors);

string GetListName(PropertyDescriptor[] listAccessors);

In fact, you only have to write one method: GetItemProperties. GetListName isn’t used by the .NET 2.0 DataGridView. Wow, just one method. How hard can that be? Unfortunately, the documentation is less than stellar and the steps required are less than obvious. First, you can’t just create an empty PropertyDescriptorCollection and start adding items to it. No, that would make too much sense. Instead, you have to create a PropertyDescriptor array and pass that to the PropertyDescriptorCollection constructor.
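To make the array-then-constructor step concrete, here is a small sketch that builds a collection from descriptors that already exist. It assumes you only want to filter real properties, in which case TypeDescriptor.GetProperties supplies ready-made descriptors. Contact and ViewHelper are hypothetical names, and this shortcut does nothing for “virtual” properties, which still need a custom descriptor class.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;

public class Contact
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string InternalNotes { get; set; }
}

public static class ViewHelper
{
    // Returns a collection holding only the named properties of T, in the order given.
    public static PropertyDescriptorCollection Select<T>(params string[] names)
    {
        PropertyDescriptorCollection all = TypeDescriptor.GetProperties(typeof(T));
        var picked = new List<PropertyDescriptor>();
        foreach (string name in names)
        {
            PropertyDescriptor pd = all.Find(name, false);   // null when not found
            if (pd != null) picked.Add(pd);
        }
        // Gather the descriptors into an array, then hand it to the constructor.
        return new PropertyDescriptorCollection(picked.ToArray());
    }
}
```

Calling ViewHelper.Select<Contact>("LastName", "FirstName") from GetItemProperties would yield a two-column view that leaves InternalNotes out.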

Now that you’ve figured out how to create a PropertyDescriptorCollection, you must find a way to create a PropertyDescriptor. PropertyDescriptor is an abstract class. If you search the documentation, you won’t find any obviously useful concrete subclasses, either. Since PropertyDescriptor is not sealed, you can subclass it. When you subclass it, you find that there is really only one useful base constructor, which takes a property name and an attribute array as parameters. However, you still have 8 members that you must implement. I was hoping for a constructor that would take some of these as parameters but you really do have to implement them all yourself.

In the paragraphs that follow, I will step you through implementing a List<Person> that implements ITypedList. I will define an interface for building a Person view. I will implement that interface and pass an instance of that implementation in to the PersonCollection class using the Strategy design pattern. Finally, in order to create the “virtual” properties, I will also implement a concrete subclass of PropertyDescriptor.

Below is the code for a Person class. There’s nothing special about it but we need something to work with to demonstrate ITypedList.

public class Person {
 protected string _firstName;
 protected string _lastName;
 protected string _midName;
 protected DateTime _dob;
 
 public Person(string firstName, string lastName, string midName, DateTime dob) {
  _firstName = firstName;
  _lastName = lastName;
  _midName = midName;
  _dob = dob;
 }
 
 public string FirstName {
  get { return _firstName; }
 }

 public string LastName {
  get { return _lastName; }
 }

 public string MiddleName {
  get { return _midName; }
 }

 public DateTime DateOfBirth {
  get { return _dob; }
 }
}

Here’s the complete PersonCollection class definition — all it does is implement ITypedList by deferring all of the work to another class (note that using System.ComponentModel is assumed in all of the following classes):

public class PersonCollection : List<Person>, ITypedList {
 protected IPersonViewBuilder _viewBuilder;

 public PersonCollection(IPersonViewBuilder viewBuilder) {
  _viewBuilder = viewBuilder;
 }

 #region ITypedList Members

 protected PropertyDescriptorCollection _props;

 public PropertyDescriptorCollection GetItemProperties(PropertyDescriptor[] listAccessors) {
  if (_props == null) {
   _props = _viewBuilder.GetView();
  }
  return _props;
 }

 public string GetListName(PropertyDescriptor[] listAccessors) {
  return ""; // was used by 1.1 datagrid
 }

 #endregion
}

I am using the Strategy pattern to allow specification of the view builder. If you needed to provide multiple, simultaneous views of the same underlying collection, you would need to use a different design. I have defined an interface to facilitate the Strategy pattern.

public interface IPersonViewBuilder {
 PropertyDescriptorCollection GetView();
}

The next two classes are where the real work happens with respect to creating the view. In this case we know how many properties we are exposing up front. In other cases, the number of “virtual” properties exposed could be dynamic, and the code below allows for that by using a List<> instead of a predefined array. Yes, I know that Age isn’t accurate (it ignores whether the birthday has passed this year) but I kept it simple for demonstration purposes.

public class PersonFullNameAgeView : IPersonViewBuilder {
 public PropertyDescriptorCollection GetView() {
  List<PropertyDescriptor> props = new List<PropertyDescriptor>();
  PersonMethodDelegate del = delegate(Person p) 
   { return p.FirstName + " " + p.MiddleName + " " + p.LastName; };
  props.Add(new PersonMethodDescriptor("FullName", del, typeof(string)));
  del = delegate(Person p) { return DateTime.Today.Year - p.DateOfBirth.Year; };
  props.Add(new PersonMethodDescriptor("Age", del, typeof(int)));
  PropertyDescriptor[] propArray = new PropertyDescriptor[props.Count];
  props.CopyTo(propArray);
  return new PropertyDescriptorCollection(propArray);
 }
}

public delegate object PersonMethodDelegate(Person person);

public class PersonMethodDescriptor : PropertyDescriptor {
 protected PersonMethodDelegate _method;
 protected Type _methodReturnType;

 public PersonMethodDescriptor(string name, PersonMethodDelegate method,
  Type methodReturnType)
  : base(name, null) {
  _method = method;
  _methodReturnType = methodReturnType;
 }

 public override object GetValue(object component) {
  Person p = (Person)component;
  return _method(p);
 }

 public override Type ComponentType {
  get { return typeof(Person); }
 }

 public override Type PropertyType {
  get { return _methodReturnType; }
 }

 public override bool CanResetValue(object component) {
  return false;
 }
 
 public override void ResetValue(object component) { }
 
 public override bool IsReadOnly {
  get { return true; }
 }

 public override void SetValue(object component, object value) { }

 public override bool ShouldSerializeValue(object component) {
  return false;
 }
}

Finally, we have the actual code to create the collection, set the view, and bind it to a datagridview.

 PersonCollection pc = new PersonCollection(new PersonFullNameAgeView());
 pc.Add(new Person("John", "Opincar, Jr", "Thomas", new DateTime(1968, 8, 12)));
 pc.Add(new Person("Abraham", "Lincoln", "X", new DateTime(1825, 1, 1)));
 pc.Add(new Person("John", "Smith", "David", new DateTime(1985, 2, 15)));
 this.dataGridView1.DataSource = pc;

Here’s a screenshot of what is displayed in the grid:

[Screenshot: itypedlistdemo.jpg]

Some final thoughts. There are many different ways to implement subclasses of PropertyDescriptor. I have demonstrated one and it only provides read-only virtual properties based on methods. You can provide read-write virtual properties that are based on properties or methods. You could write a very specific subclass that takes the component and an index to provide read-write access to an indexed property. You could write a very generic subclass that takes a member name, component type, and getter and setter delegates that would serve all your needs (I ended up doing this myself) at the cost of being a little harder to follow and probably less efficient.

Bear in mind that this sample code is for demonstration purposes only. Obviously, you could have just implemented FullName and Age as public properties on Person. From a purist standpoint, I would argue that making changes to the Person class solely for display purposes is not necessarily a good thing. However, if you have compound or nested data that you wish to display as a single row in a crosstab, this technique is invaluable.
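For the “very generic subclass” option, something along these lines could work. It is a sketch under my own assumptions, not the class described above: the DelegatePropertyDescriptor name and its shape are invented, and it uses Func and Action, which arrived with .NET 3.5 (on 2.0 you would declare your own delegate types as PersonMethodDelegate does).

```csharp
using System;
using System.ComponentModel;

// Hypothetical general-purpose descriptor: the caller supplies the name and both accessors.
public class DelegatePropertyDescriptor<TComponent, TValue> : PropertyDescriptor
{
    private readonly Func<TComponent, TValue> _getter;
    private readonly Action<TComponent, TValue> _setter;   // null means read-only

    public DelegatePropertyDescriptor(string name,
        Func<TComponent, TValue> getter, Action<TComponent, TValue> setter)
        : base(name, null)
    {
        _getter = getter;
        _setter = setter;
    }

    public override Type ComponentType { get { return typeof(TComponent); } }
    public override Type PropertyType { get { return typeof(TValue); } }
    public override bool IsReadOnly { get { return _setter == null; } }

    public override object GetValue(object component)
    {
        return _getter((TComponent)component);
    }

    public override void SetValue(object component, object value)
    {
        if (_setter == null) throw new NotSupportedException(Name + " is read-only.");
        _setter((TComponent)component, (TValue)value);
        OnValueChanged(component, EventArgs.Empty);   // notify any bound listeners
    }

    public override bool CanResetValue(object component) { return false; }
    public override void ResetValue(object component) { }
    public override bool ShouldSerializeValue(object component) { return false; }
}

public static class DescriptorDemo
{
    public static void Main()
    {
        // Exposes element 2 of an array as a read-write property named "H03".
        var hour3 = new DelegatePropertyDescriptor<double[], double>(
            "H03", a => a[2], (a, v) => a[2] = v);
        var data = new double[] { 1.0, 2.0, 3.0 };
        hour3.SetValue(data, 42.0);
        Console.WriteLine(hour3.GetValue(data));   // 42
    }
}
```

The indexed-property case from the paragraph above falls out of the same class: each lambda simply captures the index it should read or write.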

Generics, Invalid Cast Exceptions, and Non-Generic Base Classes

It’s hard even for me to believe that I’ve been coding for 25 years now. I started writing code when I was 15 and I’ve been doing it pretty much non-stop ever since. I took an upper division survey of programming languages class back in my college days at UT where I learned two things: one, I never wanted to write COBOL again, and, two, Ada was really cool because it had generics.

I was really excited when C# 2.0 was announced and I heard that it would include generics. Over the last couple of years I’ve had the opportunity to use generics in C# in the real world and I’ve generally been pleased with the results. Most of what I’ve done has been simple, collection-related coding that made straightforward use of System.Collections.Generic classes. I have also developed some relatively simple classes that either derived from built-in generic types or actually were generic themselves.

Given my quarter-century of development experience combined with my familiarity and practical use of generics in production-deployed code, I was surprised to find myself spinning my wheels for over a day on what seemed like a relatively simple task at the outset (How many times have I thought that :)).  So without further ado, let me share some valuable nuggets that I gathered over a couple of long days and late nights that didn’t produce what I expected.

I had an existing class, Series, that stored data for a single calendar day.  This is a very useful class in the power scheduling business domain that I work in because it provides:

  • varying the period of the data from 1 to 60 minutes
  • merging and splitting functionality to change the period of the data
  • mathematical operations with overloaded operators
  • daylight savings time transition support
  • loading and saving to and from datasets

If you’ve never worked with time-series data in a 24×7 sub-hourly scheduling environment, then you may not appreciate the complexities involved, particularly with respect to the daylight savings time transition days. It’s incredible for me to think how much time we spend coding for those 2 hours out of 8,760 hours in a year. After years of experience, I’ve concluded that using GMT behind the scenes, and then computing the GMT start and end times for a given calendar day, is definitely the cleanest way to handle this. One day has 23 hours, another has 25, and the rest have 24, but time remains continuous in GMT. The previous hour is always one hour less and the next hour is always one hour more. That sounds obvious, but it’s only true if you always stick with GMT.
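Here is a minimal sketch of the “GMT start and end of a calendar day” computation. It is not the Series code; it uses TimeZoneInfo, which only arrived in .NET 3.5, and a made-up zone (UTC-6 with US-style 2007 rules) so that the result does not depend on the machine’s time zone database. CalendarDay and DemoZone are names invented for the example.

```csharp
using System;

public static class CalendarDay
{
    // Length of a local calendar day, measured between its two midnights in UTC.
    public static TimeSpan Length(DateTime day, TimeZoneInfo zone)
    {
        DateTime localStart = DateTime.SpecifyKind(day.Date, DateTimeKind.Unspecified);
        DateTime utcStart = TimeZoneInfo.ConvertTimeToUtc(localStart, zone);
        DateTime utcEnd = TimeZoneInfo.ConvertTimeToUtc(localStart.AddDays(1), zone);
        return utcEnd - utcStart;
    }

    // Hypothetical zone: UTC-6, clocks change at 02:00 on the second Sunday
    // of March and the first Sunday of November.
    public static TimeZoneInfo DemoZone()
    {
        var twoAm = new DateTime(1, 1, 1, 2, 0, 0);
        var start = TimeZoneInfo.TransitionTime.CreateFloatingDateRule(twoAm, 3, 2, DayOfWeek.Sunday);
        var end = TimeZoneInfo.TransitionTime.CreateFloatingDateRule(twoAm, 11, 1, DayOfWeek.Sunday);
        var rule = TimeZoneInfo.AdjustmentRule.CreateAdjustmentRule(
            DateTime.MinValue.Date, DateTime.MaxValue.Date, TimeSpan.FromHours(1), start, end);
        return TimeZoneInfo.CreateCustomTimeZone("Demo", TimeSpan.FromHours(-6),
            "Demo", "Demo Standard", "Demo Daylight", new[] { rule });
    }

    public static void Main()
    {
        TimeZoneInfo zone = DemoZone();
        Console.WriteLine(Length(new DateTime(2007, 3, 11), zone).TotalHours);   // 23
        Console.WriteLine(Length(new DateTime(2007, 11, 4), zone).TotalHours);   // 25
        Console.WriteLine(Length(new DateTime(2007, 6, 1), zone).TotalHours);    // 24
    }
}
```

Once the two UTC instants are known, every interval inside the day is plain arithmetic on a continuous timeline, which is the point made above.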

I digress.  The Series class is great.  In fact, I also had a SeriesCollection class.  Series has a DateOf field and a Description field so stuffing a bunch of Series in a collection was very useful for displaying different types of data across time and even for displaying cross-tabbed data to the user in a grid.

There were several sub-classes but they weren’t sub-classes that added any real functionality. They simply allowed you to store a different set of fields/keys/tags with the series of numerical data for that day. So when I needed to add yet another pair of Series, SeriesCollection sub-classes I decided enough was enough, time to make these generic.

My seemingly simple idea was to convert Series and SeriesCollection to Series<I> and SeriesCollection<I>, where I would be an interface that provided the “fields” that described the associated Series. I could really be any class you wanted as long as it implemented the ISeriesDescriptor interface. Here’s a simplified, high-level overview of the interface and classes:

public interface ISeriesDescriptor {
 void GetColDefs(DataColumn[] cols);
 void GetValues(object[] vals);
 int Count { get; }
}

public class Series<I> where I : ISeriesDescriptor {
 double this[int idx] { get {...} set {...} }
 void AddInPlace(Series<I> other) {...}
 public static Series<I> operator +(Series<I> d1, double val) {...}
}

public class SeriesCollection<I> : IList<Series<I>>, ITypedList  where I : ISeriesDescriptor {...}

Those of you that have also been lured down this seemingly inviting path may recognize the dreaded nested <<>>. I am leaving out several other classes that help provide the ITypedList implementation and some cool “virtual property” functionality for databinding. The key here is that I blindly started replacing every occurrence of Series with Series<I> and every occurrence of what used to be “string Description” with “I Description.” The <I> started to ripple outwards in a seemingly never-ending spiral. Soon, I had 8 classes that were based on <I> in varying fashions.

I (myself) was also using reflection in my SeriesCollection<I> class to add new instances of Series<I> to the collection.  So when I finally finished several hours of making changes, trying to compile, making more changes, ad infinitum, it was with great pleasure that I finally ran my newly compiling, super-generic code.  I then began modifying the code that would use this wonderful new construct.  I created subclasses of Series<I> and SeriesCollection<I> that didn’t add anything other than specific load from database methods.  Really all they did was invoke different methods on a data-access object and then populate the collection using a reflected constructor on Series<I>.  As I was doing this I thought shouldn’t I be using a factory method on a passed in object here?  Naw, the reflection keeps it more “generic.”

public class StockDescriptor : ISeriesDescriptor {...}
public class StockSeries : Series<StockDescriptor> {...}
public class StockSeriesCollection : SeriesCollection<StockDescriptor> {...}

That was quick. I run the code. I am binding a StockSeriesCollection to a DataGridView. Wow, that implementation of ITypedList is working beautifully. I want to add some validation code so I need to take grd.Rows[e.RowIndex].DataBoundItem and cast it to a StockSeries. At run-time I get an invalid cast exception. What? You’re telling me I can’t cast a Series<StockDescriptor> to a StockSeries when StockSeries : Series<StockDescriptor>? When I write this now it seems obvious: the collection had created plain Series<StockDescriptor> instances through reflection, and an object created as the base type can’t be cast down the inheritance tree, only up.
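The failure reproduces in a few lines. The classes below are cut-down stand-ins (Descriptor replaces StockDescriptor and the interface constraint is dropped); the only thing that matters is which type was actually constructed.

```csharp
using System;

public class Descriptor { }

public class Series<I> where I : new()
{
    public I Description = new I();
}

public class StockSeries : Series<Descriptor> { }

public static class CastDemo
{
    // Mirrors a collection that builds its items from the generic base type via reflection.
    public static Series<Descriptor> CreateViaBase()
    {
        return (Series<Descriptor>)Activator.CreateInstance(typeof(Series<Descriptor>));
    }

    public static void Main()
    {
        Series<Descriptor> fromBase = CreateViaBase();
        Series<Descriptor> fromDerived = new StockSeries();

        Console.WriteLine(fromDerived is StockSeries);   // True: it really is a StockSeries
        Console.WriteLine(fromBase is StockSeries);      // False: it was built as the base type

        try
        {
            StockSeries s = (StockSeries)fromBase;       // compiles, fails at run time
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("InvalidCastException");
        }
    }
}
```

Generics play no real part in the failure; the same thing happens with any base class whose instances are created as the base type and later cast to a subclass.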

But I was so caught up in the generics aspect of it that I dug myself in even deeper. I read the C# spec sections on generics and the type casting rules. I try to solve the problem by adding a second type variable. That takes a long time and yields exactly the same result. At 4am, I am really frustrated and dejected. I go to bed. The next morning I finally realize my simple mistake with the cast. Then I started to look at what I’d done to my Series operators. Before I changed to the generics, I could do:

VolumeSeries vs = new VolumeSeries();
// load it up
PriceSeries ps = new PriceSeries();
// load it up
PriceSeries result = (PriceSeries)(ps * vs);

That wouldn’t be possible with my generic implementation because I hadn’t created a non-generic base or interface. Something I hadn’t even thought about up front. Unfortunately, this same limitation would cripple my SeriesCollection<I> class. I could no longer put all of the different sub-classes of Series into a single SeriesCollection. The bad news wasn’t over. I also realized that my implementation of ITypedList, which brought the ISeriesDescriptor properties “up one level” for data binding and turned the interval numeric values normally accessed via indexers into “virtual properties” for cross-tabbed display using System.ComponentModel.PropertyDescriptor subclasses, wouldn’t handle a mixed collection anyway.

I hate that feeling that you just wasted a day or two of your life on what initially looked so simple and promising and turned out so ugly and inflexible. Unfortunately, I didn’t have the time to properly refactor any of it. I finally solved my immediate problem by replacing StockSeries with Series<StockDescriptor>. I completed the UI component that depended on the Series and SeriesCollection. I could have just used the original set of classes and added just one more pair of subclasses and probably finished in 1/4th the time or even less.

The only silver lining of the whole affair is that I gained some valuable insight into non-trivial generic class design: generic type equivalence can be tricky and you will more than likely want a non-generic base or interface so that you can mingle different generic derivations in one collection or define operators on the base class. I am still not sure whether using generics to achieve composition the way I tried is a good idea. I also realized that my ITypedList implementation would need to be much more flexible to handle binding a mixed collection. I relearned for the nth time the hard way that you really should stop and think hard about what you’re doing when you start changing the second or third class you weren’t expecting to touch.
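One possible shape for that non-generic base is sketched below. It is an illustration of the lesson rather than the refactoring I never got to do: SeriesBase, PlainSeries, PriceInfo, and VolumeInfo are invented names, and the operator simply assumes both operands have the same number of intervals.

```csharp
using System;
using System.Collections.Generic;

// Non-generic base: owns the numeric data and defines the operators.
public abstract class SeriesBase
{
    protected readonly double[] Values;
    protected SeriesBase(int count) { Values = new double[count]; }

    public int Count { get { return Values.Length; } }
    public double this[int idx]
    {
        get { return Values[idx]; }
        set { Values[idx] = value; }
    }

    // Element-wise product; assumes equal lengths. Returns a plain result
    // because the two operands may carry different descriptor types.
    public static SeriesBase operator *(SeriesBase a, SeriesBase b)
    {
        var result = new PlainSeries(a.Count);
        for (int i = 0; i < a.Count; i++) result[i] = a[i] * b[i];
        return result;
    }
}

public class PlainSeries : SeriesBase
{
    public PlainSeries(int count) : base(count) { }
}

// Generic layer: only adds the typed descriptor.
public class Series<TDescriptor> : SeriesBase where TDescriptor : new()
{
    public TDescriptor Description = new TDescriptor();
    public Series(int count) : base(count) { }
}

public class PriceInfo { }
public class VolumeInfo { }

public static class MixedDemo
{
    public static void Main()
    {
        var price = new Series<PriceInfo>(2);
        price[0] = 10; price[1] = 20;
        var volume = new Series<VolumeInfo>(2);
        volume[0] = 3; volume[1] = 4;

        // Different generic instantiations share one collection through the base type.
        var all = new List<SeriesBase> { price, volume };
        SeriesBase revenue = price * volume;
        Console.WriteLine(all.Count + " " + revenue[0] + " " + revenue[1]);   // 2 30 80
    }
}
```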

—————————————–

P.S. (5/12/2007)

The classes described above have been working pretty well with some small refinements. Just because you have the <T> around doesn’t mean you are always dealing directly with a C<T>. You may still need to reflect the actual type of the instance you are working on, or of one of its properties, so that you can return a properly typed object from operators.

For example:

public class SeriesCollection<I> : List<Series<I>>, ITypedList where I : ISeriesDescriptor {

public Series<I> GetTotal() {
    // Look up the indexer on the run-time type; its PropertyType is the series type to construct.
    PropertyInfo pi = this.GetType().GetProperty("Item", new Type[] {typeof(int)});
    ConstructorInfo ci = pi.PropertyType.GetConstructor(new Type[] {typeof(DateTime), typeof(int)});
    DateTime dateOf = DateTime.Today;
    object obj = ci.Invoke(new object[] {dateOf, this._displayPeriod});
    Series<I> ret = (Series<I>) obj;
    foreach ( Series<I> ds in this ) {
       // …total things up
    }
    return ret;
}
}
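The same idea applied to an operator might look like the following. This is a standalone sketch, not the production code: it assumes every subclass exposes a public constructor taking the interval count, and that both operands have the same length.

```csharp
using System;

public class Series
{
    public readonly double[] Values;
    public Series(int count) { Values = new double[count]; }

    public static Series operator *(Series a, Series b)
    {
        // Construct the result as the left operand's run-time type rather than plain Series,
        // so that callers can cast it back to the subclass they started with.
        Series result = (Series)Activator.CreateInstance(
            a.GetType(), new object[] { a.Values.Length });
        for (int i = 0; i < a.Values.Length; i++)
        {
            result.Values[i] = a.Values[i] * b.Values[i];
        }
        return result;
    }
}

public class PriceSeries : Series { public PriceSeries(int count) : base(count) { } }
public class VolumeSeries : Series { public VolumeSeries(int count) : base(count) { } }

public static class OperatorDemo
{
    public static void Main()
    {
        var ps = new PriceSeries(1);
        ps.Values[0] = 2.5;
        var vs = new VolumeSeries(1);
        vs.Values[0] = 4;

        // The cast succeeds because the operator built a PriceSeries.
        PriceSeries result = (PriceSeries)(ps * vs);
        Console.WriteLine(result.Values[0]);   // 10
    }
}
```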