Site Loader

At Tideum and for TimeJack we are developing line of business applications which usually handle numbers. As we are developing our software for international markets we usually face the problem of users entering numbers in a format that is not the same as the current locale (e.g. Germans are entering numbers in English locale and vice versa). This happens very often if data is copied from 3rd party applications (e.g. SAP or Excel).

Parsing those numbers with the standard .NET functions (aka. Decimal.Parse()) is parsing only a fixed language, e.g.

string source = "123,45";
decimal value = Decimal.Parse(source);

depends on the current thread culture settings. In case the current settings are “de-DE” (or “de-AT”), the conversion will return value == 123.45, for “en-US” value == 12345 will be returned and for “de-CH”/”fr-CH” a System.FormatException is thrown.

Examples of input numbers

To explain the complexity, the following list contains a couple of valid examples of numbers that will pop in as input:

  • 100 –> 100
  • 100.00 –> 100
  • 1.000,00 –> 1000
  • 1000,00 –> 1000
  • 1,00000 –> 1
  • 1’000’000.00 –> 1000000 (Swiss)
  • 1,000,000.00 –> 1000000
  • 1.000.000,00 –> 1000000
  • 1.000 –> ambiguous

With the exception of Swiss numbers (as we have several customers in Switzerland there’s this special handling implemented in all our software products), where we simply remove all single quotation (‘) characters.

Solution

Okay, how to solve this issue: As you can see from the examples above, there are three kinds of numbers

  1. Numbers without delimiters (e.g. 100)
  2. Numbers with one delimiter (e.g. 100.00; 1,0000)
  3. Numbers with two delimiters (e.g. 1.000,00; 1,000.00)

If a number does not fit into this schema, it is considered as invalid. What we did is to simply count the number of unique delimiters in the number string. Based on the result (0, 1 or 2), we decide what to do with the number. The only case were a decision has to be made is case 2. In this case we check the first delimiter. If this delimiter is a dot (.), we simply swap all dots and commas:

public static decimal ToDecimal(string value)
{
  string tempValue = RemoveSwissChars(value);

  var delimiters = tempValue.Where(x => x == '.' || x == ',').Distinct().ToList();
  int count = delimiters.Count();

  switch (count)
  {
    case 0: // nothing to do
      break;
    case 1: // only comma or dot
      if (tempValue.Count(x => x == '.' || x == ',') == 1)
      {
        var pos = tempValue.LastIndexOfAny(new[] { ',', '.' });
        if (pos == tempValue.Length - 4)
        {
          throw new FormatException("Ambiguous value provided: " + tempValue);
        }
      }
      if (tempValue.IndexOf(',') != -1)
      {
        tempValue = value.Replace(",", ".");
      }
      break;
    case 2:
      if (delimiters.ElementAt(0) == '.')
      { // 123.456,12 --> 123,456.12
        tempValue = SwapTwoCharacters(tempValue, '.', ',');
      }
      break;
  }
  NumberFormatInfo format = CultureInfo.InvariantCulture.NumberFormat;
  decimal number = decimal.Parse(tempValue, format);
  return number;
}

P.S. The functions RemoveSwissCharacters() and SwapTwoCharacters() are left as an exercise to the reader.

EDIT: One of my business colleagues gave a comment on this: What would be if somebody enters 1,000 . This can be the number 1 (German) or 1000 (English). As this is not ambiguous, an exception will be thrown.

Advertisements

Post Author: Stephen

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.