CSV Exports

This summer I worked on a CSV export system. The goal was to take all of the data in the application/database and export it for analysis in Excel. An additional requirement was that the exported data could be imported back into the application (how we made the imports work is a topic for a future blog post).

The application has a deeply nested data structure, and we wanted the output to reflect that structure. This meant that if a list needed to be output, the CSV would have a separate row for each entity in the list, and each row would also include the fields of the parent entities.

Example:
public class A { public B b; public string dataA; }
public class B { public C[] cArray; public string dataB; }
public class C { public string dataC; }

CSV:
dataA, dataB, cArray[0].dataC
dataA, dataB, cArray[1].dataC
dataA, dataB, cArray[2].dataC

Parent data is duplicated in the output, once for every element of each nested list. In the real application each class also had multiple data fields rather than the single field shown here.
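To make the row layout concrete, here is a minimal sketch of flattening the example classes into one row per element of cArray. The FlattenSketch class and its ToRows method are hypothetical names used only for illustration, and the sketch skips CSV quoting and escaping entirely; it is not the export code from the project.

```csharp
using System;
using System.Collections.Generic;

public class C { public string dataC; }
public class B { public C[] cArray; public string dataB; }
public class A { public B b; public string dataA; }

// Hypothetical helper, for illustration only.
public static class FlattenSketch
{
    // Emits one row per element of the nested array,
    // repeating the parent fields (dataA, dataB) on every row.
    public static List<string> ToRows(A a)
    {
        var rows = new List<string>();
        foreach (C c in a.b.cArray)
        {
            rows.Add(string.Join(", ", a.dataA, a.b.dataB, c.dataC));
        }
        return rows;
    }
}
```

An A whose B holds three C elements yields three rows, each starting with the same dataA and dataB values.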

A naive solution would be to use a string buffer (a StringBuilder in C#) and append the data with a comma delimiter. While this would work for the majority of the use cases, a string buffer makes it very challenging to insert a runtime-determined number of commas between specific fields. I needed a system where every row would have exactly the same number of commas so the columns would line up correctly down the page. If I only appended the data as I needed it, rows with fewer fields would be missing their trailing commas.
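The trailing-comma problem can be seen in a small sketch. NaiveRow and PaddedRow are hypothetical helpers, and padding to a known column count is just one assumed way to keep the comma count constant, not necessarily what the project did.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helpers, for illustration only.
public static class PaddingSketch
{
    // Appends values as they arrive: a short row ends early,
    // so it has fewer commas than a full row.
    public static string NaiveRow(IEnumerable<string> values)
    {
        var sb = new StringBuilder();
        foreach (string v in values)
        {
            if (sb.Length > 0) sb.Append(',');
            sb.Append(v);
        }
        return sb.ToString();
    }

    // Pads the row with empty fields up to columnCount, so every row
    // has exactly columnCount - 1 commas.
    public static string PaddedRow(IReadOnlyList<string> values, int columnCount)
    {
        var cells = new string[columnCount];
        for (int i = 0; i < columnCount; i++)
        {
            cells[i] = i < values.Count ? values[i] : "";
        }
        return string.Join(",", cells);
    }
}
```

With the two values "a" and "b" and four columns, NaiveRow gives "a,b" while PaddedRow gives "a,b,," and so lines up with full rows.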

The solution I developed grouped fields from the same entity and then used a format string to create the final output line. Creating the format strings dynamically was an interesting problem.

My format string for making a format string looked something like "{{{0}}}{1}". Each doubled curly brace ({{ or }}) is an escape that produces a single literal curly brace in a C# format string. The {0} embeds the correct index and the {1} embeds the delimiter (usually a single comma). Formatting it with the index 3 and a comma, for instance, produces {3},
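As a rough sketch of how such a two-level format might be used, the code below builds a row format with a runtime-determined number of placeholders and then applies it. BuildRowFormat and FormatRow are hypothetical names, and dropping the delimiter after the last field is an assumption of this sketch.

```csharp
using System;
using System.Text;

// Hypothetical helpers, for illustration only.
public static class FormatBuilderSketch
{
    // Builds a format string such as "{0},{1},{2}" for fieldCount fields.
    // The delimiter is copied literally into the result, so it must not
    // contain curly braces.
    public static string BuildRowFormat(int fieldCount, string delimiter)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < fieldCount; i++)
        {
            // Assumption: no delimiter after the final placeholder.
            string d = i < fieldCount - 1 ? delimiter : "";
            // "{{" and "}}" emit literal braces around the index.
            sb.Append(string.Format("{{{0}}}{1}", i, d));
        }
        return sb.ToString();
    }

    // Applies the generated format to one row of values.
    public static string FormatRow(string rowFormat, string[] values)
    {
        return string.Format(rowFormat, (object[])values);
    }
}
```

Because empty fields still occupy a placeholder, a row such as { "a", "", "c" } keeps both of its commas and comes out as "a,,c".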

A variable (and runtime-determined) number of inputs for a string output provides a unique opportunity to see the power of C# string.Format. Not only can a format string be used for standard string outputs with static inputs, but fixed-width output can also be achieved with minimal additional work, for example by adding an alignment width to each placeholder, as in {0,10}.
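One way the fixed-width variant might look, assuming it relies on the alignment component of a format item ({index,width}, where a negative width left-aligns): the same escaped-brace trick generates the placeholders. BuildFixedWidthFormat is a hypothetical name used only for this sketch.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper, for illustration only.
public static class FixedWidthSketch
{
    // Builds a format such as "{0,-10}{1,-10}" so that each field is
    // padded with spaces to the given width (negative = left-aligned).
    public static string BuildFixedWidthFormat(int fieldCount, int width)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < fieldCount; i++)
        {
            // Invariant culture keeps the minus sign a plain ASCII '-'.
            sb.Append(string.Format(CultureInfo.InvariantCulture,
                                    "{{{0},{1}}}", i, width));
        }
        return sb.ToString();
    }
}
```

Formatting "ab" and "cd" with BuildFixedWidthFormat(2, -10) pads each value to ten characters, giving a twenty-character line. Values longer than the width are not truncated, so the width has to be chosen from the data.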
