C# - .NET: StreamReader does not recognize ° characters


I am trying to run a regex to locate degree characters (\u00b0 or \u00ba for degrees, in addition to locating another form of ', \u00b4). I am reading latitude and longitude DMS coordinates like this one: 12º30'23.256547"s
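For reference, a minimal sketch of the kind of regex described above. The pattern, the class name `DmsRegexSketch` and the method `Parse` are illustrative assumptions, not the exact expression I am using:

```csharp
using System;
using System.Text.RegularExpressions;

public static class DmsRegexSketch
{
    // Hypothetical pattern: degrees, then U+00B0 or U+00BA, minutes, then ' or U+00B4,
    // seconds with an optional fraction, then ", then a hemisphere letter.
    private static readonly Regex Dms = new Regex(
        "(\\d{1,3})[\\u00b0\\u00ba](\\d{1,2})['\\u00b4](\\d{1,2}(?:\\.\\d+)?)\"([NSEWnsew])");

    public static Match Parse(string coordinate)
    {
        return Dms.Match(coordinate);
    }

    public static void Main()
    {
        // \u00ba is the º character, written as an escape so the source file's
        // own encoding cannot affect the test string.
        Match m = Parse("12\u00ba30'23.256547\"s");
        Console.WriteLine(m.Success);
        Console.WriteLine(m.Groups[1].Value); // degrees
        Console.WriteLine(m.Groups[2].Value); // minutes
        Console.WriteLine(m.Groups[3].Value); // seconds
        Console.WriteLine(m.Groups[4].Value); // hemisphere
    }
}
```

If the º has already been turned into � by the time the regex runs, the character class no longer matches, which is the symptom described next.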

The problem is the way I am reading the file, because I can manually inject a string like the one below (the format is latitude, longitude, description):

const string myteststring = @"12º30'23.256547""s, 12º30'23.256547""w, somewhere";

With this string the regex matches as expected, and I can see the º values. When using StreamReader, however, I see � unrecognized characters in their place (the º symbol being one of the unrecognized characters).

I've tried:

    var sr = new StreamReader(dlg.File.OpenRead(), Encoding.UTF8);
    var sr = new StreamReader(dlg.File.OpenRead(), Encoding.Unicode);
    var sr = new StreamReader(dlg.File.OpenRead(), Encoding.BigEndianUnicode);

in addition to the default and ASCII.

Whichever way I read the file, I end up with these special characters. Any advice is appreciated!

You need to identify which encoding the file was saved in, and use that encoding when you read it with StreamReader.

If it was created using a regular text editor, I'm guessing the default encoding is either Windows-1252 or ISO-8859-1.

The º symbol used in your sample is 0xBA in ISO-8859-1 (the degree sign ° proper is 0xB0), and both fall outside the 7-bit ASCII table. I don't know how Encoding.ASCII interprets it.
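As a rough sketch of what that looks like, assuming the file really is ISO-8859-1: the `MemoryStream` below stands in for `dlg.File.OpenRead()`, and `Latin1ReadSketch` and `ReadFirstLine` are made-up names for illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class Latin1ReadSketch
{
    // Reads the first line of the stream using the encoding the caller specifies.
    public static string ReadFirstLine(Stream stream, Encoding encoding)
    {
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadLine();
        }
    }

    public static void Main()
    {
        // The bytes an ISO-8859-1 editor would save for 12º30' (º is the single byte 0xBA).
        byte[] fileBytes = { 0x31, 0x32, 0xBA, 0x33, 0x30, 0x27 };

        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        string asLatin1 = ReadFirstLine(new MemoryStream(fileBytes), latin1);
        string asUtf8 = ReadFirstLine(new MemoryStream(fileBytes), Encoding.UTF8);

        // Print code points rather than the characters, so the console's own
        // encoding does not muddy the result.
        Console.WriteLine("ISO-8859-1: U+{0:X4}", (int)asLatin1[2]);
        Console.WriteLine("UTF-8:      U+{0:X4}", (int)asUtf8[2]);
    }
}
```

Read as ISO-8859-1, the third character comes back as U+00BA. Read as UTF-8, the lone 0xBA byte is not a valid sequence, so it is replaced with U+FFFD, which is the � you are seeing.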

Otherwise, it might be easier to make sure you save the file as UTF-8, if you have that possibility.
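If re-saving from the editor is not an option, a one-off conversion could look roughly like this. It assumes the source encoding is ISO-8859-1; `ReencodeSketch`, `ConvertToUtf8` and the temp files are placeholders for illustration only.

```csharp
using System;
using System.IO;
using System.Text;

public static class ReencodeSketch
{
    // Decodes the source file with the encoding it was saved in,
    // then writes the same text back out as UTF-8.
    public static void ConvertToUtf8(string sourcePath, Encoding sourceEncoding, string targetPath)
    {
        string text = File.ReadAllText(sourcePath, sourceEncoding);
        File.WriteAllText(targetPath, text, Encoding.UTF8);
    }

    public static void Main()
    {
        string source = Path.GetTempFileName();
        string target = Path.GetTempFileName();
        try
        {
            // Stand-in for the real file: "12º" saved as ISO-8859-1.
            File.WriteAllBytes(source, new byte[] { 0x31, 0x32, 0xBA });

            ConvertToUtf8(source, Encoding.GetEncoding("iso-8859-1"), target);

            string roundTripped = File.ReadAllText(target, Encoding.UTF8);
            Console.WriteLine(roundTripped == "12\u00ba");
        }
        finally
        {
            File.Delete(source);
            File.Delete(target);
        }
    }
}
```

After the conversion, the `Encoding.UTF8` reader from your first attempt should see the º as expected.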

The reason it works when you define the string in code is that .NET works with strings in its internal encoding (UTF-16). StreamReader converts the bytes it reads from the file to that internal encoding, using the encoding you specify when you create the StreamReader.
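To make that concrete, here is a small sketch that goes the other way and shows the bytes one in-memory º turns into under a few encodings. `EncodingBytesSketch` and `Hex` are illustrative names only.

```csharp
using System;
using System.Text;

public static class EncodingBytesSketch
{
    // Encodes the text with the given encoding and formats the bytes as hex.
    public static string Hex(string text, Encoding encoding)
    {
        return BitConverter.ToString(encoding.GetBytes(text));
    }

    public static void Main()
    {
        string s = "\u00ba"; // the º character as a .NET string

        Console.WriteLine((int)s[0]);                                // 186, i.e. 0xBA
        Console.WriteLine(Hex(s, Encoding.GetEncoding("iso-8859-1"))); // BA
        Console.WriteLine(Hex(s, Encoding.UTF8));                    // C2-BA
        Console.WriteLine(Hex(s, Encoding.Unicode));                 // BA-00
    }
}
```

The same character is one byte in ISO-8859-1 and two different byte pairs in UTF-8 and UTF-16, so a reader told the wrong encoding has no way to map the file's bytes back to º.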

