How to recognize HTML escape characters in code

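Data occasionally contains sequences such as &#39;. They take one of two forms:

  • &# followed by a string of digits and a closing ;
  • & followed by a name and a closing ;

The most common example is &nbsp;, or its numeric equivalent &#160;. A browser converts these escapes back into the characters they stand for, but how can code recognize them? org.apache.commons.lang.StringEscapeUtils.unescapeHtml shows one good way to do it. In the first case, where the body is numeric, the number (a Unicode code point) is converted directly to a char. In the second case, where the body is a name, the only option is a lookup table: find the number that corresponds to the name in the table, then convert that number to a char. A look at the code makes this clear.

First, how HTML40 is defined:

    static {
        HTML40 = new Entities();
        fillWithHtml40Entities(HTML40);
    }

    static void fillWithHtml40Entities(Entities entities) {
        entities.addEntities(BASIC_ARRAY);
        entities.addEntities(ISO8859_1_ARRAY);
        entities.addEntities(HTML40_ARRAY);
    }

HTML40 is therefore the union of three tables, BASIC_ARRAY, ISO8859_1_ARRAY and HTML40_ARRAY, each listed in turn.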
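The two decoding cases described above can be sketched with the JDK alone. This is a minimal illustration, not the Commons Lang implementation: the class name EntityDecoder, the regular expression, and the five-entry sample table are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the Commons Lang code.
final class EntityDecoder {
    // Group 1: digits after "&#". Group 2: a name after "&". Both end with ";".
    private static final Pattern ENTITY =
            Pattern.compile("&(?:#([0-9]+)|([A-Za-z][A-Za-z0-9]*));");

    // A tiny sample table; the real HTML40 table is far larger.
    private static final Map<String, Integer> NAMES = new HashMap<>();
    static {
        NAMES.put("quot", 34);
        NAMES.put("amp", 38);
        NAMES.put("lt", 60);
        NAMES.put("gt", 62);
        NAMES.put("nbsp", 160);
    }

    static String unescape(String input) {
        Matcher m = ENTITY.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(input, last, m.start());
            Integer codePoint = null;
            if (m.group(1) != null) {
                try {
                    // Case 1: the body is a number, use it directly.
                    codePoint = Integer.valueOf(m.group(1));
                } catch (NumberFormatException e) {
                    // Too many digits for an int: treated as unknown below.
                }
            } else {
                // Case 2: the body is a name, look it up in the table.
                codePoint = NAMES.get(m.group(2));
            }
            if (codePoint != null && Character.isValidCodePoint(codePoint)) {
                out.appendCodePoint(codePoint);
            } else {
                out.append(m.group()); // unknown entity: keep the text as-is
            }
            last = m.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }
}
```

Calling EntityDecoder.unescape("a&#39;b") takes the numeric path, while EntityDecoder.unescape("x&nbsp;y") takes the table path. Unknown names are left untouched rather than dropped, which is one possible design choice among several.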

BASIC_ARRAY:

    private static final String[][] BASIC_ARRAY = {
        {"quot", "34"}, // " - double-quote
        {"amp", "38"},  // & - ampersand
        {"lt", "60"},   // < - less-than
        {"gt", "62"},   // > - greater-than
    };
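To illustrate how a table of {name, number} string pairs like BASIC_ARRAY might be turned into something searchable, here is a rough sketch. The class EntityTable and its fields are hypothetical stand-ins written for this example; the method names loosely mirror the addEntities call seen earlier, but the bodies are assumptions, not the library's code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the lookup side of an entity table.
final class EntityTable {
    // Same four pairs as BASIC_ARRAY above.
    static final String[][] BASIC = {
        {"quot", "34"}, {"amp", "38"}, {"lt", "60"}, {"gt", "62"},
    };

    private final Map<String, Integer> nameToValue = new HashMap<>();
    private final Map<Integer, String> valueToName = new HashMap<>();

    // Registers every {name, number} pair in both directions.
    void addEntities(String[][] table) {
        for (String[] pair : table) {
            int value = Integer.parseInt(pair[1]);
            nameToValue.put(pair[0], value);
            valueToName.put(value, pair[0]);
        }
    }

    // Returns the number for a name, or -1 when the name is unknown.
    int entityValue(String name) {
        Integer value = nameToValue.get(name);
        return value == null ? -1 : value;
    }

    // Returns the name for a number, or null when there is none.
    String entityName(int value) {
        return valueToName.get(value);
    }
}
```

After new EntityTable().addEntities(EntityTable.BASIC), a lookup of "lt" yields 60, and casting that to char gives '<'. Storing both directions is optional for unescaping; it is included only because the same table could also serve escaping.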

ISO8859_1_ARRAY:

    static final String[][] ISO8859_1_ARRAY = {
        {"nbsp", "160"},   // non-breaking space
        {"iexcl", "161"},  // inverted exclamation mark
        {"cent", "162"},   // cent sign
        {"pound", "163"},  // pound sign
        {"curren", "164"}, // currency sign
        {"yen", "165"},    // yen sign = yuan sign
        {"brvbar", "166"}, // broken bar = broken vertical bar
        {"sect", "167"},   // section sign
        {"uml", "168"},    // diaeresis = spacing diaeresis
        {"copy", "169"},   // © - copyright sign
        {"ordf", "170"},   // feminine ordinal indicator
        {"laquo", "171"},  // left-pointing double angle quotation mark = left pointing guillemet
        {"not", "172"},    // not sign
        {"shy", "173"},    // soft hyphen = discretionary hyphen
        {"reg", "174"},    // ® - registered trademark sign
        {"macr", "175"},   // macron = spacing macron = overline = APL overbar
        {"deg", "176"},    // degree sign
        {"plusmn", "177"}, // plus-minus sign = plus-or-minus sign
        {"sup2", "178"},   // superscript two = superscript digit two = squared
        {"sup3", "179"},   // superscript three = superscript digit three = cubed
        {"acute", "180"},  // acute accent = spacing acute
        {"micro", "181"},  // micro sign
        {"para", "182"},   // pilcrow sign = paragraph sign
        {"middot", "183"}, // middle dot = Georgian comma = Greek middle dot
        {"cedil", "184"},  // cedilla = spacing cedilla
        {"sup1", "185"},   // superscript one = superscript digit one
        {"ordm", "186"},   // masculine ordinal indicator
        {"raquo", "187"},  // right-pointing double angle quotation mark = right pointing guillemet
        {"frac14", "188"}, // vulgar fraction one quarter = fraction one quarter
        {"frac12", "189"}, // vulgar fraction one half = fraction one half
        {"frac34", "190"}, // vulgar fraction three quarters = fraction three quarters
        {"iquest", "191"}, // inverted question mark = turned question mark
        {"Agrave", "192"}, // À - uppercase A, grave accent
        {"Aacute", "193"}, // Á - uppercase A, acute accent
        {"Acirc", "194"},  // Â - uppercase A, circumflex accent
        {"Atilde", "195"}, // Ã - uppercase A, tilde
        {"Auml", "196"},   // Ä - uppercase A, umlaut
        {"Aring", "197"},  // Å - uppercase A, ring
        {"AElig", "198"},  // Æ - uppercase AE
        {"Ccedil", "199"}, // Ç - uppercase C, cedilla
        {"Egrave", "200"}, // È - uppercase E, grave accent
        {"Eacute", "201"}, // É - uppercase E, acute accent
        {"Ecirc", "202"},  // Ê - uppercase E, circumflex accent
        {"Euml", "203"},   // Ë - uppercase E, umlaut
        {"Igrave", "204"}, // Ì - uppercase I, grave accent
        {"Iacute", "205"}, // Í - uppercase I, acute accent
        {"Icirc", "206"},  // Î - uppercase I, circumflex accent
        {"Iuml", "207"},   // Ï - uppercase I, umlaut
        {"ETH", "208"},    // Ð - uppercase Eth, Icelandic
        {"Ntilde", "209"}, // Ñ - uppercase N, tilde
        {"Ograve", "210"}, // Ò - uppercase O, grave accent
        {"Oacute", "211"}, // Ó - uppercase O, acute accent
        {"Ocirc", "212"},  // Ô - uppercase O, circumflex accent
        {"Otilde", "213"}, // Õ - uppercase O, tilde
        {"Ouml", "214"},   // Ö - uppercase O, umlaut
        {"times", "215"},  // multiplication sign
        {"Oslash", "216"}, // Ø - uppercase O, slash
        {"Ugrave", "217"}, // Ù - uppercase U, grave accent
        {"Uacute", "218"}, // Ú - uppercase U, acute accent
        {"Ucirc", "219"},  // Û - uppercase U, circumflex accent
        {"Uuml", "220"},   // Ü - uppercase U, umlaut
        {"Yacute", "221"}, // Ý - uppercase Y, acute accent
        {"THORN", "222"},  // Þ - uppercase THORN, Icelandic
        {"szlig", "223"},  // ß - lowercase sharps, German
        {"agrave", "224"}, // à - lowercase a, grave accent
        {"aacute", "225"}, // á - lowercase a, acute accent
        {"acirc", "226"},  // â - lowercase a, circumflex accent
        {"atilde", "227"}, // ã - lowercase a, tilde
        {"auml", "228"},   // ä - lowercase a, umlaut
        {"aring", "229"},  // å - lowercase a, ring
        {"aelig", "230"},  // æ - lowercase ae
        {"ccedil", "231"}, // ç - lowercase c, cedilla
        {"egrave", "232"}, // è - lowercase e, grave accent
        {"eacute", "233"}, // é - lowercase e, acute accent
        {"ecirc", "234"},  // ê - lowercase e, circumflex accent
        {"euml", "235"},   // ë - lowercase e, umlaut
        {"igrave", "236"}, // ì - lowercase i, grave accent
        {"iacute", "237"}, // í - lowercase i, acute accent
        {"icirc", "238"},  // î - lowercase i, circumflex accent
        {"iuml", "239"},   // ï - lowercase i, umlaut
        {"eth", "240"},    // ð - lowercase eth, Icelandic
        {"ntilde", "241"}, // ñ - lowercase n, tilde
        {"ograve", "242"}, // ò - lowercase o, grave accent
        {"oacute", "243"}, // ó - lowercase o, acute accent
        {"ocirc", "244"},  // ô - lowercase o, circumflex accent
        {"otilde", "245"}, // õ - lowercase o, tilde
        {"ouml", "246"},   // ö - lowercase o, umlaut
        {"divide", "247"}, // division sign
        {"oslash", "248"}, // ø - lowercase o, slash
        {"ugrave", "249"}, // ù - lowercase u, grave accent
        {"uacute", "250"}, // ú - lowercase u, acute accent
        {"ucirc", "251"},  // û - lowercase u, circumflex accent
        {"uuml", "252"},   // ü - lowercase u, umlaut
        {"yacute", "253"}, // ý - lowercase y, acute accent
        {"thorn", "254"},  // þ - lowercase thorn, Icelandic
        {"yuml", "255"},   // ÿ - lowercase y, umlaut
    };

HTML40_ARRAY:

    static final String[][] HTML40_ARRAY = {
        {"fnof", "402"},     // latin small f with hook = function = florin, U+0192 ISOtech
        {"Alpha", "913"},    // greek capital letter alpha, U+0391
        {"Beta", "914"},     // greek capital letter beta, U+0392
        {"Gamma", "915"},    // greek capital letter gamma, U+0393 ISOgrk3
        {"Delta", "916"},    // greek capital letter delta, U+0394 ISOgrk3
        {"Epsilon", "917"},  // greek capital letter epsilon, U+0395
        {"Zeta", "918"},     // greek capital letter zeta, U+0396
        {"Eta", "919"},      // greek capital letter eta, U+0397
        {"Theta", "920"},    // greek capital letter theta, U+0398 ISOgrk3
        {"Iota", "921"},     // greek capital letter iota, U+0399
        {"Kappa", "922"},    // greek capital letter kappa, U+039A
        {"Lambda", "923"},   // greek capital letter lambda, U+039B ISOgrk3
        {"Mu", "924"},       // greek capital letter mu, U+039C
        {"Nu", "925"},       // greek capital letter nu, U+039D
        {"Xi", "926"},       // greek capital letter xi, U+039E ISOgrk3
        {"Omicron", "927"},  // greek capital letter omicron, U+039F
        {"Pi", "928"},       // greek capital letter pi, U+03A0 ISOgrk3
        {"Rho", "929"},      // greek capital letter rho, U+03A1
        {"Sigma", "931"},    // greek capital letter sigma, U+03A3 ISOgrk3
        {"Tau", "932"},      // greek capital letter tau, U+03A4
        {"Upsilon", "933"},  // greek capital letter upsilon, U+03A5 ISOgrk3
        {"Phi", "934"},      // greek capital letter phi, U+03A6 ISOgrk3
        {"Chi", "935"},      // greek capital letter chi, U+03A7
        {"Psi", "936"},      // greek capital letter psi, U+03A8 ISOgrk3
        {"Omega", "937"},    // greek capital letter omega, U+03A9 ISOgrk3
        {"alpha", "945"},    // greek small letter alpha, U+03B1 ISOgrk3
        {"beta", "946"},     // greek small letter beta, U+03B2 ISOgrk3
        {"gamma", "947"},    // greek small letter gamma, U+03B3 ISOgrk3
        {"delta", "948"},    // greek small letter delta, U+03B4 ISOgrk3
        {"epsilon", "949"},  // greek small letter epsilon, U+03B5 ISOgrk3
        {"zeta", "950"},     // greek small letter zeta, U+03B6 ISOgrk3
        {"eta", "951"},      // greek small letter eta, U+03B7 ISOgrk3
        {"theta", "952"},    // greek small letter theta, U+03B8 ISOgrk3
        {"iota", "953"},     // greek small letter iota, U+03B9 ISOgrk3
        {"kappa", "954"},    // greek small letter kappa, U+03BA ISOgrk3
        {"lambda", "955"},   // greek small letter lambda, U+03BB ISOgrk3
        {"mu", "956"},       // greek small letter mu, U+03BC ISOgrk3
        {"nu", "957"},       // greek small letter nu, U+03BD ISOgrk3
        {"xi", "958"},       // greek small letter xi, U+03BE ISOgrk3
        {"omicron", "959"},  // greek small letter omicron, U+03BF NEW
        {"pi", "960"},       // greek small letter pi, U+03C0 ISOgrk3
        {"rho", "961"},      // greek small letter rho, U+03C1 ISOgrk3
        {"sigmaf", "962"},   // greek small letter final sigma, U+03C2 ISOgrk3
        {"sigma", "963"},    // greek small letter sigma, U+03C3 ISOgrk3
        {"tau", "964"},      // greek small letter tau, U+03C4 ISOgrk3
        {"upsilon", "965"},  // greek small letter upsilon, U+03C5 ISOgrk3
        {"phi", "966"},      // greek small letter phi, U+03C6 ISOgrk3
        {"chi", "967"},      // greek small letter chi, U+03C7 ISOgrk3
        {"psi", "968"},      // greek small letter psi, U+03C8 ISOgrk3
        {"omega", "969"},    // greek small letter omega, U+03C9 ISOgrk3
        {"thetasym", "977"}, // greek small letter theta symbol, U+03D1 NEW
        {"upsih", "978"},    // greek upsilon with hook symbol, U+03D2 NEW
        {"piv", "982"},      // greek pi symbol, U+03D6 ISOgrk3
        {"bull", "8226"},    // bullet = black small circle, U+2022 ISOpub
        {"hellip", "8230"},  // horizontal ellipsis = three dot leader, U+2026 ISOpub
        {"prime", "8242"},   // prime = minutes = feet, U+2032 ISOtech
        {"Prime", "8243"},   // double prime = seconds = inches, U+2033 ISOtech
        {"oline", "8254"},   // overline = spacing overscore, U+203E NEW
        {"frasl", "8260"},   // fraction slash, U+2044 NEW
        {"weierp", "8472"},  // script capital P = power set = Weierstrass p, U+2118 ISOamso
        {"image", "8465"},   // blackletter capital I = imaginary part, U+2111 ISOamso
        {"real", "8476"},    // blackletter capital R = real part symbol, U+211C ISOamso
        {"trade", "8482"},   // trade mark sign, U+2122 ISOnum
        {"alefsym", "8501"}, // alef symbol = first transfinite cardinal, U+2135 NEW
        {"larr", "8592"},    // leftwards arrow, U+2190 ISOnum
        {"uarr", "8593"},    // upwards arrow, U+2191 ISOnum
        {"rarr", "8594"},    // rightwards arrow, U+2192 ISOnum
        {"darr", "8595"},    // downwards arrow, U+2193 ISOnum
        {"harr", "8596"},    // left right arrow, U+2194 ISOamsa
        {"crarr", "8629"},   // downwards arrow with corner leftwards = carriage return, U+21B5 NEW
        {"lArr", "8656"},    // leftwards double arrow, U+21D0 ISOtech
        {"uArr", "8657"},    // upwards double arrow, U+21D1 ISOamsa
        {"rArr", "8658"},    // rightwards double arrow, U+21D2 ISOtech
        {"dArr", "8659"},    // downwards double arrow, U+21D3 ISOamsa
        {"hArr", "8660"},    // left right double arrow, U+21D4 ISOamsa
        {"forall", "8704"},  // for all, U+2200 ISOtech
        {"part", "8706"},    // partial differential, U+2202 ISOtech
        {"exist", "8707"},   // there exists, U+2203 ISOtech
        {"empty", "8709"},   // empty set = null set = diameter, U+2205 ISOamso
        {"nabla", "8711"},   // nabla = backward difference, U+2207 ISOtech
        {"isin", "8712"},    // element of, U+2208 ISOtech
        {"notin", "8713"},   // not an element of, U+2209 ISOtech
        {"ni", "8715"},      // contains as member, U+220B ISOtech
        {"prod", "8719"},    // n-ary product = product sign, U+220F ISOamsb
        {"sum", "8721"},     // n-ary summation, U+2211 ISOamsb
        {"minus", "8722"},   // minus sign, U+2212 ISOtech
        {"lowast", "8727"},  // asterisk operator, U+2217 ISOtech
        {"radic", "8730"},   // square root = radical sign, U+221A ISOtech
        {"prop", "8733"},    // proportional to, U+221D ISOtech
        {"infin", "8734"},   // infinity, U+221E ISOtech
        {"ang", "8736"},     // angle, U+2220 ISOamso
        {"and", "8743"},     // logical and = wedge, U+2227 ISOtech
        {"or", "8744"},      // logical or = vee, U+2228 ISOtech
        {"cap", "8745"},     // intersection = cap, U+2229 ISOtech
        {"cup", "8746"},     // union = cup, U+222A ISOtech
        {"int", "8747"},     // integral, U+222B ISOtech
        {"there4", "8756"},  // therefore, U+2234 ISOtech
        {"sim", "8764"},     // tilde operator = varies with = similar to, U+223C ISOtech
        {"cong", "8773"},    // approximately equal to, U+2245 ISOtech
        {"asymp", "8776"},   // almost equal to = asymptotic to, U+2248 ISOamsr
        {"ne", "8800"},      // not equal to, U+2260 ISOtech
        {"equiv", "8801"},   // identical to, U+2261 ISOtech
        {"le", "8804"},      // less-than or equal to, U+2264 ISOtech
        {"ge", "8805"},      // greater-than or equal to, U+2265 ISOtech
        {"sub", "8834"},     // subset of, U+2282 ISOtech
        {"sup", "8835"},     // superset of, U+2283 ISOtech
        {"sube", "8838"},    // subset of or equal to, U+2286 ISOtech
        {"supe", "8839"},    // superset of or equal to, U+2287 ISOtech
        {"oplus", "8853"},   // circled plus = direct sum, U+2295 ISOamsb
        {"otimes", "8855"},  // circled times = vector product, U+2297 ISOamsb
        {"perp", "8869"},    // up tack = orthogonal to = perpendicular, U+22A5 ISOtech
        {"sdot", "8901"},    // dot operator, U+22C5 ISOamsb
        {"lceil", "8968"},   // left ceiling = apl upstile, U+2308 ISOamsc
        {"rceil", "8969"},   // right ceiling, U+2309 ISOamsc
        {"lfloor", "8970"},  // left floor = apl downstile, U+230A ISOamsc
        {"rfloor", "8971"},  // right floor, U+230B ISOamsc
        {"lang", "9001"},    // left-pointing angle bracket = bra, U+2329 ISOtech
        {"rang", "9002"},    // right-pointing angle bracket = ket, U+232A ISOtech
        {"loz", "9674"},     // lozenge, U+25CA ISOpub
        // ...
    };
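One detail of these tables is easy to miss: several names differ only by letter case (Agrave 192 and agrave 224, Prime 8243 and prime 8242, lArr 8656 and larr 8592), so any lookup has to match names exactly. The sketch below shows this with a handful of pairs; the class and method names are hypothetical and chosen only for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: class and method names are hypothetical.
final class CaseSensitiveLookup {
    // A few pairs from the tables whose names differ only by case.
    static final String[][] SAMPLE = {
        {"Agrave", "192"}, {"agrave", "224"},
        {"Prime", "8243"}, {"prime", "8242"},
        {"lArr", "8656"}, {"larr", "8592"},
    };

    private static final Map<String, Integer> VALUES = new HashMap<>();
    static {
        for (String[] pair : SAMPLE) {
            VALUES.put(pair[0], Integer.parseInt(pair[1]));
        }
    }

    // Returns the decoded character as a String, or null if the name is unknown.
    static String decode(String name) {
        Integer value = VALUES.get(name); // exact, case-sensitive match
        if (value == null) {
            return null;
        }
        // Every value in this sample is below 65536, so one char is enough.
        return String.valueOf((char) value.intValue());
    }
}
```

Here decode("Agrave") and decode("agrave") return different characters, and decode("AGRAVE") returns null. Lowercasing the input before the lookup would silently merge such pairs, so it is best avoided.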
