Detecting double-byte characters with CString.GetAt(i)

ralln 2015-03-20 02:15:34
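I need to separate the Chinese characters, special characters, and English letters in a string, CString X.
Chinese characters and other double-byte characters (five-pointed stars, full-width punctuation, and the like) should go into CString A.
English letters and other single-byte characters should go into CString B.
This is the approach I am using: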

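CString X = _T("一二三abc四五六XDZ九十49");
CString A, B;
// Assumes an MBCS (non-Unicode) build, where GetAt returns a single byte (char).
for (int i = 0; i < X.GetLength(); i++)
{
    if (128 < (BYTE)X.GetAt(i) && i + 1 < X.GetLength())
    {
        // A byte greater than 128 is treated as the first byte of a double-byte character; store both bytes in A
        A = A + X.GetAt(i) + X.GetAt(i + 1);
        i++; // skip the second byte so it is not examined again
    }
    else
    {
        B = B + X.GetAt(i);
    }
}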

I am not sure whether this approach is sound, or whether it could cause errors such as a crash. I would appreciate advice from anyone who knows.
panglly 2015-10-08
When I use this method on the character "琯", the first byte I get back is a negative number. Why?
赵4老师 2015-03-20
Strings: Unicode and Multibyte Character Set (MBCS) Support

Some international markets use languages, such as Japanese and Chinese, with large character sets. To support programming for these markets, the Microsoft Foundation Class Library (MFC) is enabled for two different approaches to handling large character sets:

  • Unicode
  • Multibyte Character Sets (MBCS)

MFC Support for Unicode Strings

The entire class library is conditionally enabled for Unicode characters and strings. In particular, class CString is Unicode-enabled.

Note: The Unicode versions of the MFC libraries are not copied to your hard drive unless you select them during a Custom installation. They are not copied during other types of installation. If you attempt to build or run an MFC Unicode application without the MFC Unicode files, you may get errors.

To copy the files to your hard drive, rerun Setup, choose Custom installation, clear the check boxes for all other components except "Microsoft Foundation Class Libraries," click the Details button, and select both "Static Library for Unicode" and "Shared Library for Unicode." This will copy the following files to your hard drive:

    UAFXCW.LIB    UAFXCW.PDB    UAFXCWD.LIB   UAFXCWD.PDB
    MFCxxU.LIB    MFCxxU.DBG    MFCxxU.DLL
    MFCxxUD.LIB   MFCxxUD.PDB   MFCxxUD.DLL
    MFCDxxUD.LIB  MFCDxxUD.PDB  MFCDxxUD.DLL
    MFCNxxUD.LIB  MFCNxxUD.PDB  MFCNxxUD.DLL
    MFCOxxUD.LIB  MFCOxxUD.PDB  MFCOxxUD.DLL

Where xx represents the version number of the file; for example, ‘42’ represents version 4.2.

CString is based on the TCHAR data type. If the symbol _UNICODE is defined for a build of your program, TCHAR is defined as type wchar_t, a 16-bit character encoding type; otherwise, it is defined as char, the normal 8-bit character encoding. Under Unicode, then, CStrings are composed of 16-bit characters. Without Unicode, they are composed of characters of type char.
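In a Unicode build each element returned by GetAt is a 16-bit TCHAR, so the byte test from the question no longer applies: casting to BYTE keeps only the low 8 bits, and for 一 (U+4E00) that is 0x00. A minimal sketch of the split for that case follows. It is portable C++ with std::u16string standing in for a Unicode CString, not MFC code; the names WideSplit and splitWide are hypothetical, and treating every code unit above 0x7F as "non-ASCII" is an assumption about what the original poster wants.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type: nonAscii plays the role of CString A, ascii of CString B.
struct WideSplit {
    std::u16string nonAscii;
    std::u16string ascii;
};

// Splits a UTF-16 string by code unit value. No lead/trail byte handling is
// needed, because every 16-bit unit is classified on its own.
inline WideSplit splitWide(const std::u16string& text) {
    WideSplit out;
    for (char16_t unit : text) {
        if (unit > 0x7F) {
            out.nonAscii.push_back(unit);
        } else {
            out.ascii.push_back(unit);
        }
    }
    // Every input unit lands in exactly one of the two outputs.
    assert(out.nonAscii.size() + out.ascii.size() == text.size());
    return out;
}
```

Characters outside the Basic Multilingual Plane occupy two code units (a surrogate pair); both units are above 0x7F, so they end up next to each other in nonAscii.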
To complete Unicode programming of your application, you must also:

  • Use the _T macro to conditionally code literal strings to be portable to Unicode.
  • When you pass strings, pay attention to whether function arguments require a length in characters or a length in bytes. The difference is important if you’re using Unicode strings.
  • Use portable versions of the C run-time string-handling functions.
  • Use the following data types for characters and character pointers:

        TCHAR     Where you would use char.
        LPTSTR    Where you would use char*.
        LPCTSTR   Where you would use const char*.

    CString provides the operator LPCTSTR to convert between CString and LPCTSTR.

CString also supplies Unicode-aware constructors, assignment operators, and comparison operators.

For related information on Unicode programming, see Unicode and MBCS and Unicode Topics. The Run-Time Library Reference defines portable versions of all of its string-handling functions. See the category Internationalization.

MFC Support for MBCS Strings

The class library is also enabled for multibyte character sets, specifically for double-byte character sets (DBCS). Under this scheme, a character can be either one or two bytes wide. If it is two bytes wide, its first byte is a special “lead byte,” chosen from a particular range depending on which code page is in use. Taken together, the lead and “trail bytes” specify a unique character encoding.

If the symbol _MBCS is defined for a build of your program, type TCHAR, on which CString is based, maps to char. It’s up to you to determine which bytes in a CString are lead bytes and which are trail bytes. The C run-time library supplies functions to help you determine this.

Under DBCS, a given string can contain all single-byte ANSI characters, all double-byte characters, or a combination of the two. These possibilities require special care in parsing strings, including CString objects.
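The lead-byte walk described above can be sketched as follows. This is portable C++ with std::string standing in for an MBCS CString, not MFC code. isGbkLeadByte is a hypothetical stand-in that assumes code page 936 (GBK), where lead bytes run from 0x81 to 0xFE; on Windows the CRT function _ismbblead or the API IsDBCSLeadByte would answer that question for the active code page. SplitResult and splitDbcs are likewise illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// ASSUMPTION: code page 936 (GBK), whose lead bytes are 0x81..0xFE.
inline bool isGbkLeadByte(unsigned char b) {
    return b >= 0x81 && b <= 0xFE;
}

// Hypothetical result type: doubleByte plays the role of CString A,
// singleByte the role of CString B.
struct SplitResult {
    std::string doubleByte;
    std::string singleByte;
};

// Walks the bytes once. A lead byte consumes the byte after it as its trail
// byte; a lead byte with nothing after it is kept as a single byte.
inline SplitResult splitDbcs(const std::string& text) {
    SplitResult out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(text[i]);
        if (isGbkLeadByte(b) && i + 1 < text.size()) {
            out.doubleByte.append(text, i, 2);
            ++i;  // skip the trail byte
        } else {
            out.singleByte.push_back(text[i]);
        }
    }
    // Every input byte lands in exactly one of the two outputs.
    assert(out.doubleByte.size() + out.singleByte.size() == text.size());
    return out;
}
```

Skipping the trail byte is the important step: a GBK trail byte can fall in the range 0x40 to 0x7E, so testing it on its own would misclassify it as a single-byte character.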
Note: Unicode string serialization in MFC can read both Unicode and MBCS strings regardless of which version of the application you are running. Because of this, your data files are portable between Unicode and MBCS versions of your program.

CString member functions use special “generic text” versions of the C run-time functions they call, or they use Unicode-aware functions such as lstrlen or lstrcpy. Thus, for example, if a CString function would normally call strcmp, it calls the corresponding generic-text function _tcscmp instead. Depending on how the symbols _MBCS and _UNICODE are defined, _tcscmp maps as follows:

    _MBCS defined             _mbscmp
    _UNICODE defined          wcscmp
    Neither symbol defined    strcmp

Note: The symbols _MBCS and _UNICODE are mutually exclusive.

Generic-text function mappings for all of the run-time string-handling routines are detailed in the Run-Time Library Reference. See the category Internationalization.

Similarly, CString member functions are implemented using “generic” data type mappings. To enable both MBCS and Unicode, MFC uses TCHAR for char, LPTSTR for char*, and LPCTSTR for const char*. These result in the correct mappings for either MBCS or Unicode.
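The generic-text idea can be illustrated with a small stand-alone sketch. The names DEMO_UNICODE, demo_tchar, DEMO_T, demo_tcscmp, and demoCompare are all hypothetical and only mimic the pattern; the real definitions live in <tchar.h>. The sketch has two branches rather than three, because _mbscmp is specific to the Microsoft CRT.

```cpp
#include <cassert>
#include <cstring>
#include <cwchar>

// Hypothetical switch standing in for _UNICODE.
#ifdef DEMO_UNICODE
typedef wchar_t demo_tchar;          // wide build: wchar_t characters
#define DEMO_T(s) L##s               // turns "abc" into L"abc"
#define demo_tcscmp std::wcscmp
#else
typedef char demo_tchar;             // narrow build: char characters
#define DEMO_T(s) s
#define demo_tcscmp std::strcmp
#endif

// Code written against the generic names compiles unchanged in both builds.
// Both pointers must be non-null.
inline int demoCompare(const demo_tchar* a, const demo_tchar* b) {
    assert(a != nullptr && b != nullptr);
    return demo_tcscmp(a, b);
}
```

A call such as demoCompare(DEMO_T("abc"), DEMO_T("abd")) resolves to strcmp in the default build and to wcscmp when DEMO_UNICODE is defined.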
赵4老师 2015-03-20
Unicode and MBCS Provide Portability

With MFC version 3.0 and later, MFC, including CString, is enabled for both Unicode and Multibyte Character Sets (MBCS). This support makes it easier for you to write portable applications that you can build for either Unicode or ANSI characters. To enable this portability, each character in a CString object is of type TCHAR, which is defined as wchar_t if you define the symbol _UNICODE when you build your application, or as char if not. A wchar_t character is 16 bits wide. (Unicode is available only under Windows NT.) MBCS is enabled if you build with the symbol _MBCS defined. MFC itself is built with either the _MBCS symbol (for the NAFX libraries) or the _UNICODE symbol (for the UAFX libraries) defined.

Note: The CString examples in this and the accompanying articles on strings show literal strings properly formatted for Unicode portability, using the _T macro, which translates the literal string to the form L"literal string", which the compiler treats as a Unicode string. For example, the following code:

    CString strName = _T("Name");

is translated as a Unicode string if _UNICODE is defined or as an ANSI string if not. For more information, see the article Strings: Unicode and Multibyte Character Set (MBCS) Support.

A CString object can store up to INT_MAX (2,147,483,647) characters. The TCHAR data type is used to get or set individual characters inside a CString object. Unlike character arrays, the CString class has a built-in memory allocation capability. This allows CString objects to automatically grow as needed (that is, you don’t have to worry about growing a CString object to fit longer strings).
赵4老师 2015-03-20
GetLength

Returns the number of characters in a CString object. For multibyte characters, counts each 8-bit character; that is, a lead and trail byte in one multibyte character are counted as two characters.
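The difference between the byte count and the number of logical characters can be sketched as follows. This is portable C++ with std::string standing in for an MBCS CString; isGbkLeadByte and countDbcsChars are hypothetical names, and the lead-byte range assumes code page 936 (GBK). On Windows, the CRT function _mbslen performs a comparable count for the active multibyte code page.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// ASSUMPTION: code page 936 (GBK), whose lead bytes are 0x81..0xFE.
inline bool isGbkLeadByte(unsigned char b) {
    return b >= 0x81 && b <= 0xFE;
}

// Counts logical characters: a lead byte plus its trail byte count as one.
// text.size() corresponds to what GetLength reports in an MBCS build.
inline std::size_t countDbcsChars(const std::string& text) {
    std::size_t chars = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(text[i]);
        if (isGbkLeadByte(b) && i + 1 < text.size()) {
            ++i;  // the trail byte belongs to the same character
        }
        ++chars;
    }
    // There can never be more characters than bytes.
    assert(chars <= text.size());
    return chars;
}
```

For a string made of two double-byte characters followed by "ab", the byte length is 6 while countDbcsChars returns 4.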
笨笨仔 2015-03-20
Why not just run it and see? The result depends on the encoding, so your program probably won't work as written.
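On the negative value reported earlier in the thread for "琯": a likely explanation is that in an MBCS build GetAt returns a char, and plain char is signed by default in MSVC, so any byte with the high bit set reads as a negative number until it is cast to an unsigned type, as the (BYTE) cast in the question does. The sketch below shows the effect in portable C++; isHighByte and valueAsSignedChar are hypothetical helper names, and 0xD6 is used only as an example of a high byte.

```cpp
#include <cassert>

// True when the byte has its high bit set, regardless of whether plain char
// is signed or unsigned on the current compiler.
inline bool isHighByte(char c) {
    return static_cast<unsigned char>(c) >= 0x80;
}

// The value a raw byte takes when it is held in a signed 8-bit char.
inline int valueAsSignedChar(unsigned char raw) {
    const int v = static_cast<signed char>(raw);
    assert(v >= -128 && v <= 127);
    return v;
}
```

For example, the byte 0xD6 (214 unsigned) is -42 when viewed as a signed char, while isHighByte still reports it correctly because the comparison is done on the unsigned value.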