Site Loader

Recently I stumbled over the task to read either CSV or Excel files and transfer data in an independent format to a server.

The existing software part implemented to similar approaches to

  1. Read the file  
  2. Transfer the file into an intermediate format (a DataTable)
  3. Transfer the file up to the server

The server took the file, read all data, translate number according to a given number format sent by the client, …

Reviewing the code resulted into a couple of drawbacks: redundant code, performance, use of language depending information on the server side, etc.

As a result I looked for an approach to

  1. combine CSV and excel load within one flow and  
  2. have data translated on the client side into a format which can be read by the server without knowledge of the client user language (or the date format used for loading files)

Based on a code example for IDataReader (http://blogs.msdn.com/b/anthonybloesch/archive/2013/01/23/bulk-loading-data-with-idatareader-and-sqlbulkcopy.aspx) I had the idea to create a library which streams data independent of the input format.

The result is now on GitHub. A library which results into a simple interface to read data:

IStreamImporter streamImporter = new CsvStreamImporter(streamdata);
((CsvStreamImporter)streamImporter).SetDelimiter(';');
// or
ExcelPackage package = new ExcelPackage();
package.Load(excelFile);
ExcelStreamImporter streamImporter = 
    new ExcelStreamImporter(package.Workbook.Worksheets.First(), true);
streamImporter.SetHeader(true);
streamImporter.AddColumnDefinition("Field1", typeof(string));
streamImporter.AddColumnDefinition("Field2", typeof(int));
streamImporter.AddColumnDefinition("Field3", typeof(string));

while (streamImporter.Read())
{
  string column1 = streamImporter.GetValue(0) as string;
  int column2 = streamImporter.GetValue(2) as int;
  string column3 = streamImporter.Getvalue(3) as string;
  // ... do work
}

The current status of the project is some kind of work in progress. Anyhow, NuGet packages are available from https://www.myget.org/F/stream-importer/ (prerelease versions only).

The following is still to be done:

  • Documentation
  • Examples
  • Replace EPPlus Excel library for xlsx files by XmlReader to reduce memory footprint

Anyhow, the library seems to be stable and is used in a production environment.

Advertisements

Post Author: Stephen

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.