Gal Ratner
Gal Ratner is a Techie who lives and works in Los Angeles, CA and Austin, TX.
Ten ways to remove non-numeric characters from a string

Yesterday I was reviewing a function designed to remove non-numeric characters from a string. It seemed a little slow, so I rewrote it and sent the modified version around the office. I guess I piqued some interest, because the number of versions I got back was staggering. Programmers love this sort of thing, and it turned into a mini competition to see who could write the best version. I decided to post most of the versions and see if any readers can come up with more.

 


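// Using directives required by the functions in this post.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

/// <summary>
/// Using a StringBuilder, looping over the text with a for loop
/// </summary>
/// <param name="Word">The text to filter.</param>
/// <returns>The digits of Word, in their original order.</returns>
private static string RemoveNonNumeric1(string Word)
{
    StringBuilder sb = new StringBuilder();
    char c;

    for (int i = 0; i < Word.Length; i++)
    {
        c = Word[i];
        if (Char.IsDigit(c))
            sb.Append(c);
    }

    return sb.ToString();
}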

/// <summary>
/// Using a StringBuilder and a foreach loop
/// </summary>
/// <param name="Word">The text to filter.</param>
/// <returns>The digits of Word, in their original order.</returns>
private static string RemoveNonNumeric2(string Word)
{
    StringBuilder sb = new StringBuilder();

    foreach (char c in Word.ToCharArray())
    {
        if (Char.IsDigit(c))
            sb.Append(c);
    }

    return sb.ToString();
}

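/// <summary>
/// Using a Regex. This is the slowest by far
/// </summary>
/// <param name="Word">The text to filter.</param>
/// <returns>The digits of Word, in their original order.</returns>
private static string RemoveNonNumeric3(string Word)
{
    // Replace every character that is not a digit with the empty string.
    return Regex.Replace(Word, @"[^\d]", String.Empty);
}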

/// <summary>
/// With LINQ. Looks great. Works well.
/// </summary>
/// <param name="Word">The text to filter.</param>
/// <returns>The digits of Word, in their original order.</returns>
private static string RemoveNonNumeric4(string Word)
{
    return new String(Word.ToCharArray().Where(c => Char.IsDigit(c)).ToArray());
}

/// <summary>
/// Using a generic list and a for loop
/// </summary>
/// <param name="Word">The text to filter.</param>
/// <returns>The digits of Word, in their original order.</returns>
private static string RemoveNonNumeric5(string Word)
{
    List<char> myCharList = new List<char>();

    for (int i = 0; i < Word.Length; i++)
    {
        if (Char.IsDigit(Word[i]))
            myCharList.Add(Word[i]);
    }

    return new string(myCharList.ToArray());
}

/// <summary>
/// Using an array and a for loop. Fastest in .NET 3.5
/// </summary>
/// <param name="Word">The text to filter.</param>
/// <returns>The digits of Word, in their original order.</returns>
private static string RemoveNonNumeric6(string Word)
{
    int x = 0; // number of digits copied so far
    char c;
    char[] chars = new char[Word.Length];
    for (int i = 0; i < Word.Length; i++)
    {
        c = Word[i];
        if (Char.IsDigit(c))
        {
            chars[x] = c;
            x++;
        }
    }

    // Only the first x slots of the buffer hold digits.
    return new string(chars, 0, x);
}

/// <summary>
/// Using an array and a foreach loop. Fastest in .NET 4.0
/// </summary>
/// <param name="Word">The text to filter.</param>
/// <returns>The digits of Word, in their original order.</returns>
private static string RemoveNonNumeric7(string Word)
{
    int x = 0; // number of digits copied so far
    char[] chars = new char[Word.Length];
    foreach (char c in Word.ToCharArray())
    {
        if (Char.IsDigit(c))
        {
            chars[x] = c;
            x++;
        }
    }

    // Only the first x slots of the buffer hold digits.
    return new string(chars, 0, x);
}

/// <summary>
/// Using an IEnumerator. Slow.
/// </summary>
/// <param name="Word">The text to filter.</param>
/// <returns>The digits of Word, in their original order.</returns>
private static string RemoveNonNumeric8(string Word)
{
    int x = 0;
    char[] chars = new char[Word.Length];
    // The array's non-generic enumerator returns each char as an object, hence the casts.
    var enm = Word.ToCharArray().GetEnumerator();
    while (enm.MoveNext())
    {
        if (Char.IsDigit((char)enm.Current))
        {
            chars[x] = (char)enm.Current;
            x++;
        }
    }

    return new string(chars, 0, x);
}

/// <summary>
/// Using String.Join and LINQ. Looks good, but slow
/// </summary>
/// <param name="Word">The text to filter.</param>
/// <returns>The digits of Word, in their original order.</returns>
private static string RemoveNonNumeric9(string Word)
{
    // A null separator is treated as an empty string.
    return string.Join<char>(null, Word.Where(c => Char.IsDigit(c)));
}

/// <summary>
/// Using a CharEnumerator. Slow
/// </summary>
/// <param name="Word">The text to filter.</param>
/// <returns>The digits of Word, in their original order.</returns>
private static string RemoveNonNumeric10(string Word)
{
    int x = 0;
    char[] chars = new char[Word.Length];
    using (var enm = Word.GetEnumerator())
    {
        while (enm.MoveNext())
        {
            if (Char.IsDigit((char)enm.Current))
            {
                chars[x] = (char)enm.Current;
                x++;
            }
        }
    }
    return new string(chars, 0, x);
}
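
One thing to keep in mind with all ten versions: Char.IsDigit (and \d in a .NET Regex) accepts any Unicode decimal digit, not just 0 through 9. If you only want ASCII digits, a plain range check is one option. The sketch below is my own variation on function 6, and the class and method names are made up for illustration.

```csharp
using System;

static class AsciiDigitsOnly
{
    // Hypothetical variant of function 6 that keeps only the ASCII digits '0' through '9'.
    public static string RemoveNonAsciiDigits(string word)
    {
        int count = 0; // number of digits copied so far
        char[] chars = new char[word.Length];
        for (int i = 0; i < word.Length; i++)
        {
            char c = word[i];
            if (c >= '0' && c <= '9')
            {
                chars[count++] = c;
            }
        }
        return new string(chars, 0, count);
    }

    static void Main()
    {
        // '\u0663' is ARABIC-INDIC DIGIT THREE: Char.IsDigit accepts it, the range check does not.
        Console.WriteLine(Char.IsDigit('\u0663'));             // True
        Console.WriteLine(RemoveNonAsciiDigits("a1\u0663b2")); // 12
    }
}
```

Whether this matters depends on where your input comes from. For phone numbers typed into a web form, it probably should.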


 

I would like to thank Stephen Wright for co-writing functions 6 and 7, and for reminding me that going back to basics is the best way to get speed. They are by far the fastest functions here.
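
If you want to compare the versions yourself, a rough timing harness along these lines should do. It is only a sketch: the Time helper, the sample input and the iteration count are my own assumptions, not the setup we used in the office, and only the array-and-for-loop approach from function 6 is wired in.

```csharp
using System;
using System.Diagnostics;

static class RemoveNonNumericTiming
{
    // Same approach as function 6: copy digits into a preallocated array.
    public static string ArrayLoop(string word)
    {
        int count = 0;
        char[] chars = new char[word.Length];
        for (int i = 0; i < word.Length; i++)
        {
            char c = word[i];
            if (Char.IsDigit(c))
            {
                chars[count++] = c;
            }
        }
        return new string(chars, 0, count);
    }

    // Hypothetical helper: runs the candidate repeatedly and returns elapsed milliseconds.
    public static long Time(Func<string, string> candidate, string input, int iterations)
    {
        candidate(input); // warm-up call so JIT compilation is not measured
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            candidate(input);
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    static void Main()
    {
        string input = "Phone: (310) 555-0100 ext. 42"; // made-up sample input
        Console.WriteLine("ArrayLoop: {0} ms", Time(ArrayLoop, input, 1000000));
    }
}
```

Run it as a Release build, and pass each of the other versions to Time the same way to see how they rank on your machine and framework version.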

 

 


Posted 16 Dec 2010 11:08 PM by Gal Ratner