C# - .NET: StreamReader does not recognize ° characters
I am trying to run a regex to locate degree characters (\u00b0|\u00ba for degrees, in addition to locating the other form of ' --> \u00b4). I am reading latitude and longitude DMS coordinates like this one: 12º30'23.256547"s
The problem is the way I am reading the file. I can manually inject a string like the one below (the format is latitude, longitude, description):
const string myteststring = @"12º30'23.256547""s, 12º30'23.256547""w, somewhere";
and the regex matches as expected: I can see the º values. When using StreamReader, however, I see � unrecognized characters (the º symbol being one of the unrecognized characters).
I've tried:
var sr = new StreamReader(dlg.File.OpenRead(), Encoding.UTF8);
var sr = new StreamReader(dlg.File.OpenRead(), Encoding.Unicode);
var sr = new StreamReader(dlg.File.OpenRead(), Encoding.BigEndianUnicode);
in addition to the default ASCII.
Whichever way I read the file, I end up with these special characters. Any advice is appreciated!
You need to identify which encoding the file was saved in, and use that encoding when you read it with StreamReader.
If the file was created using a regular text editor, I'm guessing the default encoding is either Windows-1252 or ISO-8859-1.
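As a minimal sketch of that advice, assuming the file really is ISO-8859-1 (a guess, as noted above), the example below builds the bytes such an editor would write and reads them back twice. The class and the `ReadFirstLine` helper are hypothetical names used only for illustration, and a `MemoryStream` stands in for the stream returned by `dlg.File.OpenRead()`.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

public static class DmsReaderSketch
{
    // Hypothetical helper: decodes the stream with the given encoding
    // and returns its first line.
    public static string ReadFirstLine(Stream stream, Encoding encoding)
    {
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadLine();
        }
    }

    public static void Main()
    {
        // Assumption: the file was saved as ISO-8859-1, so º is the single byte 0xBA.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        byte[] fileBytes = latin1.GetBytes("12\u00ba30'23.256547\"s");

        // Same bytes, decoded with the wrong and then the assumed-correct encoding.
        string wrong = ReadFirstLine(new MemoryStream(fileBytes), Encoding.UTF8);
        string right = ReadFirstLine(new MemoryStream(fileBytes), latin1);

        // Matches either the degree sign (U+00B0) or the º symbol (U+00BA).
        var degree = new Regex(@"[\u00b0\u00ba]");
        Console.WriteLine(degree.IsMatch(wrong)); // False
        Console.WriteLine(degree.IsMatch(right)); // True
    }
}
```

If the file turns out to be Windows-1252 instead, the º byte is the same (0xBA), so for this particular symbol either guess decodes it correctly.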
The º symbol is 0xBA in ISO-8859-1 (the degree sign ° proper is 0xB0), which falls outside the 7-bit ASCII table. I don't know how Encoding.ASCII interprets it.
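To see what each decoder does with that byte, here is a small sketch that decodes the five bytes of "12º30" (as a single-byte encoding would store them) and prints the code point produced for the third byte. The class and the `ThirdCharHex` helper are hypothetical names for illustration. In practice, `Encoding.ASCII` substitutes a question mark and `Encoding.UTF8` substitutes the replacement character U+FFFD, which is the � seen in the question.

```csharp
using System;
using System.Text;

public static class DecodeByteSketch
{
    // "12º30" as a single-byte (ISO-8859-1) editor would store it: º is the lone byte 0xBA.
    public static readonly byte[] Sample = { 0x31, 0x32, 0xBA, 0x33, 0x30 };

    // Hypothetical helper: returns the code point the third byte decodes to,
    // formatted as four hex digits.
    public static string ThirdCharHex(Encoding encoding)
    {
        string text = encoding.GetString(Sample);
        return ((int)text[2]).ToString("X4");
    }

    public static void Main()
    {
        Console.WriteLine(ThirdCharHex(Encoding.ASCII));                      // 003F, a question mark
        Console.WriteLine(ThirdCharHex(Encoding.UTF8));                       // FFFD, the replacement character
        Console.WriteLine(ThirdCharHex(Encoding.GetEncoding("iso-8859-1"))); // 00BA, the º symbol
    }
}
```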
Otherwise, it might be easier to just make sure you save the file as UTF-8, if you have that possibility.
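If re-saving from the editor is not an option, the conversion can also be done once in code. The following is only a sketch: `ConvertToUtf8` is a hypothetical helper, the source encoding is an assumption you have to supply, and temporary files stand in for the real coordinate file.

```csharp
using System;
using System.IO;
using System.Text;

public static class Utf8ConversionSketch
{
    // Hypothetical helper: decodes the source file with the encoding it was
    // (assumed to be) saved in, then rewrites the text as UTF-8.
    public static void ConvertToUtf8(string sourcePath, string targetPath, Encoding sourceEncoding)
    {
        string text = File.ReadAllText(sourcePath, sourceEncoding);
        File.WriteAllText(targetPath, text, Encoding.UTF8);
    }

    public static void Main()
    {
        string source = Path.GetTempFileName();
        string target = Path.GetTempFileName();
        try
        {
            // Simulate a file saved as ISO-8859-1: º is written as the single byte 0xBA.
            Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
            File.WriteAllBytes(source, latin1.GetBytes("12\u00ba30'23.256547\"s"));

            ConvertToUtf8(source, target, latin1);

            // After conversion, a UTF-8 StreamReader sees the º symbol intact.
            using (var reader = new StreamReader(target, Encoding.UTF8))
            {
                string line = reader.ReadLine();
                Console.WriteLine(line.IndexOf('\u00ba') >= 0); // True
            }
        }
        finally
        {
            File.Delete(source);
            File.Delete(target);
        }
    }
}
```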
The reason it works when you define the string in code is that .NET works with strings in its internal encoding (UTF-16). StreamReader converts the bytes it reads from the file into that internal encoding, using the encoding you specify when you create the StreamReader.