




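Background

A friend of mine has been building something that already works quite well. To polish it a little further, he suggested knocking out this kind of captcha, and so this captcha got K.O.'d. Recognizing a single image takes less than 200 ms, and the accuracy over 500 samples, counted by hand, is 95%. I had no experience in this area and was feeling my way forward one step at a time. In the spirit of sharing experience, here is the whole line of analysis. Apologies to the experts for the amateur showing.

Some of the recognition results

(Figure: a grid of sample captcha images with their recognition results in the file names, e.g. 013-PSKT-PSKT.jpg, 016-Q4PD-Q4PD.jpg.)

Look familiar?

Step 1: remove background noise and binarize

I considered several approaches here.

Method 1: compute the color distribution of the image and treat colors that cover only a small share of it as background noise. The background noise and the foreground color are not clearly separated, though, and none of the many sampling variants I tried removed the noise well, so I gave up on this approach.

Method 2: I looked around online afterwards, and the popular approach is to convert to grayscale and then binarize against a threshold. A grayscale value is just a weighted sum of the color channels, with weights chosen to match the sensitivity of the human eye, and those weights mean nothing to a computer. A moment's thought shows that the two steps can be merged, so I removed the background noise and binarized in a single pass. The threshold is on the sum of the three RGB components: a pixel counts as foreground when R + G + B is below 500 (the maximum possible sum is 765). The result is very satisfying.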
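To make the "merge the two steps" remark concrete, here is a minimal sketch that puts the plain channel-sum test next to a conventional weighted-grayscale test. Only the 500 threshold comes from the article. The class and method names are hypothetical, the grayscale weights are the common 0.299/0.587/0.114 luma weights (the article does not say which weights it had in mind), and the grayscale threshold of 500/3 is an assumption chosen purely so the two tests are comparable.

```csharp
using System;

// Hypothetical helper, not part of the article's source.
public static class Binarizer
{
    // From the article: foreground when R + G + B < 500.
    public const int SumThreshold = 500;

    // Assumption: the same cut expressed on a 0..255 gray scale.
    public const double LumaThreshold = SumThreshold / 3.0;

    // Single-pass test used in the article: no weights at all.
    public static bool IsForegroundBySum(int r, int g, int b)
    {
        return r + g + b < SumThreshold;
    }

    // Conventional two-step variant: weighted grayscale, then threshold.
    public static bool IsForegroundByLuma(int r, int g, int b)
    {
        double luma = 0.299 * r + 0.587 * g + 0.114 * b;
        return luma < LumaThreshold;
    }

    // Binarize a whole image given as packed 0xRRGGBB values
    // (a hypothetical input format that avoids any imaging library).
    public static bool[,] ToTable(int[,] pixels)
    {
        int width = pixels.GetLength(0);
        int height = pixels.GetLength(1);
        var table = new bool[width, height];
        for (int x = 0; x < width; x++)
        {
            for (int y = 0; y < height; y++)
            {
                int rgb = pixels[x, y];
                table[x, y] = IsForegroundBySum(
                    (rgb >> 16) & 0xFF, (rgb >> 8) & 0xFF, rgb & 0xFF);
            }
        }
        return table;
    }
}
```

The two tests agree on neutral grays and differ only on strongly colored pixels: magenta (255, 0, 255) has a channel sum of 510 and is treated as background by the sum test, while its weighted gray value is low enough to count as foreground.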
(Figure: the binarized versions of the sample images, saved as numbered .bmp files up to 044.bmp.)

Step 2: build the character samples

Samples matter a great deal to a computer. A computer can hardly reason logically, and even if it could, it would need long training before its results satisfied you. So recognition works by comparing against samples prepared in advance. If you look closely at these captchas you will notice a bug: almost all of them use the same font. So I built a set of samples for that font by hand. The previous step already produces noise-free images, and these can be used directly. Building the samples is simple and tedious, but it needs care: one careless slip can lower the recognition rate of a particular symbol. Only 31 distinct characters turned up in the 500 samples. Fortunately, someone at a certain department thought to leave out easily confused characters such as 1 and I, or 0 and O. Otherwise that department would have even more blame to carry.

Step 3: matching

A single match uses the simplest and most primitive binary comparison, but what is compared is a match rate rather than a match count. I defined the scoring rules for it. The general principle is: "add points when a pixel that should be set is set; deduct points when a pixel that should be set is missing; deduct moderately when a pixel that should not be set is set; ignore anything outside the reachable area." Because a partial match of one symbol can look like a complete match of another, the single match has to be optimized over an enlarged region. Within a certain range, find the best match, and that best match is the symbol at the current position. After a best match has been found, the matching position advances a big step to the right. If no suitable best match is found, it advances a small step.

Step 4: optimize and tune

Every algorithm needs optimizing and tuning. Finding the best parameter settings and the best code organization is often the step that takes the most time and effort.

Step 5: verify the results

This step is purely manual: check the results by hand and tally the accuracy.

Thoughts

The results are in, the code is short, and it works very well. People in this line of work often want a general solution, and whether something can be general depends largely on the level of abstraction. This method is plain matching, so it naturally does not generalize, but the approach and the ideas do. Each case needs its own analysis. Distorted or hollow text is far more complicated to handle. There are also methods online that use third-party image libraries, and those may be more general. When I have the time and the interest I will come back to this topic.

Source code

I hesitated for a while over whether to publish the source. Similar commercial services already exist online, the recognition itself is not difficult, and given the inherent bug in the system in question, this captcha is as good as absent. So I am publishing the code for study and discussion only.

    using System.Collections.Generic;
    using System.Drawing;
    using System.IO;
    using System.IO.Compression;

    namespace Crack12306Captcha
    {
        public class Cracker
        {
            // Character samples loaded from the embedded data below.
            List<CharInfo> words_ = new List<CharInfo>();

            public Cracker()
            {
                var bytes = new byte[]
                {
                    // Gzip-compressed character samples. The byte values are
                    // scrambled in this copy and are not reproduced here.
                    0x1f, 0x8b, 0x08, 0x00, /* ... */
                };
                using (var stream = new MemoryStream(bytes))
                using (var gzip = new GZipStream(stream, CompressionMode.Decompress))
                using (var reader = new BinaryReader(gzip))
                {
                    // Each record: character, width, height, then
                    // width * height booleans. A '\0' character ends the list.
                    while (true)
                    {
                        char ch = reader.ReadChar();
                        if (ch == '\0')
                            break;
                        int width = reader.ReadByte();
                        int height = reader.ReadByte();

                        bool[,] map = new bool[width, height];
                        for (int i = 0; i < width; i++)
                            for (int j = 0; j < height; j++)
                                map[i, j] = reader.ReadBoolean();
                        words_.Add(new CharInfo(ch, map));
                    }
                }
            }

            public string Read(Bitmap bmp)
            {
                var result = string.Empty;
                var width = bmp.Width;
                var height = bmp.Height;
                var table = ToTable(bmp);
                var next = SearchNext(table, -1);
                while (next < width - 7)
                {
                    var matched = Match(table, next);
                    if (matched.Rate > 0.6)
                    {
                        // Good match: keep the character and take a big step.
                        result += matched.Char;
                        next = matched.X + 10;
                    }
                    else
                    {
                        // No match here: take a small step.
                        next += 1;
                    }
                }
                return result;
            }

            // Step 1: a pixel is foreground when R + G + B < 500.
            private bool[,] ToTable(Bitmap bmp)
            {
                var table = new bool[bmp.Width, bmp.Height];
                for (int i = 0; i < bmp.Width; i++)
                    for (int j = 0; j < bmp.Height; j++)
                    {
                        var color = bmp.GetPixel(i, j);
                        table[i, j] = (color.R + color.G + color.B < 500);
                    }
                return table;
            }

            // Returns the first column after start that contains a foreground pixel.
            private int SearchNext(bool[,] table, int start)
            {
                var width = table.GetLength(0);
                var height = table.GetLength(1);
                for (start++; start < width; start++)
                    for (int j = 0; j < height; j++)
                        if (table[start, j])
                            return start;
                return start;
            }

            // Match rate of the sample target placed at (x0, y0) on source.
            private double FixedMatch(bool[,] source, bool[,] target, int x0, int y0)
            {
                double total = 0;
                double count = 0;
                int targetWidth = target.GetLength(0);
                int targetHeight = target.GetLength(1);
                int sourceWidth = source.GetLength(0);
                int sourceHeight = source.GetLength(1);
                int x, y;
                for (int i = 0; i < targetWidth; i++)
                {
                    x = i + x0;
                    if (x < 0 || x >= sourceWidth)
                        continue;
                    for (int j = 0; j < targetHeight; j++)
                    {
                        y = j + y0;
                        if (y < 0 || y >= sourceHeight)
                            continue;
                        if (target[i, j])
                        {
                            total++;
                            if (source[x, y])
                                count++;        // should be set, and is
                            else
                                count--;        // should be set, but is missing
                        }
                        else if (source[x, y])
                            count -= 0.55;      // should not be set, but is
                    }
                }
                return count / total;
            }

            private MatchedChar ScopeMatch(bool[,] source, int start
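The listing breaks off before the region search from step 3, so the following is only a rough sketch of how "pick the best match within an enlarged region, then take a big or a small step" could be written. It is not the author's ScopeMatch. The class name, the method names, the search window parameters, and the guard against an empty overlap are all assumptions. The 0.55 penalty, the 0.6 acceptance rate and the step sizes of 10 and 1 are taken from the listing above. The scoring function is repeated so that the block is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical sketch; names and window parameters are assumptions.
public static class ScanSketch
{
    // Match rate of `glyph` placed at (x0, y0) on `image` (same rule as FixedMatch).
    public static double Score(bool[,] image, bool[,] glyph, int x0, int y0)
    {
        double total = 0, count = 0;
        for (int i = 0; i < glyph.GetLength(0); i++)
        {
            int x = i + x0;
            if (x < 0 || x >= image.GetLength(0)) continue;     // outside: not scored
            for (int j = 0; j < glyph.GetLength(1); j++)
            {
                int y = j + y0;
                if (y < 0 || y >= image.GetLength(1)) continue; // outside: not scored
                if (glyph[i, j])
                {
                    total++;
                    count += image[x, y] ? 1 : -1;
                }
                else if (image[x, y])
                {
                    count -= 0.55;
                }
            }
        }
        // Assumption: treat "no overlap at all" as rate 0 instead of dividing by zero.
        return total == 0 ? 0 : count / total;
    }

    // Try every glyph at every offset inside the window and keep the highest rate.
    public static (char Ch, double Rate, int X) BestInWindow(
        bool[,] image, IReadOnlyDictionary<char, bool[,]> glyphs,
        int start, int windowX, int windowY)
    {
        var best = (Ch: '\0', Rate: double.NegativeInfinity, X: start);
        foreach (var pair in glyphs)
            for (int dx = 0; dx <= windowX; dx++)
                for (int dy = -windowY; dy <= windowY; dy++)
                {
                    double rate = Score(image, pair.Value, start + dx, dy);
                    if (rate > best.Rate)
                        best = (pair.Key, rate, start + dx);
                }
        return best;
    }

    // Scan left to right: big step after an accepted match, small step otherwise.
    public static string Read(
        bool[,] image, IReadOnlyDictionary<char, bool[,]> glyphs,
        int windowX, int windowY, int bigStep)
    {
        var result = new StringBuilder();
        int next = 0;
        while (next < image.GetLength(0))
        {
            var m = BestInWindow(image, glyphs, next, windowX, windowY);
            if (m.Rate > 0.6)
            {
                result.Append(m.Ch);
                next = m.X + bigStep;
            }
            else
            {
                next += 1;
            }
        }
        return result.ToString();
    }
}
```

Because pixels outside the image are not scored, a glyph that only partly overlaps the image can still reach a high rate. A real implementation would probably want to limit the window so that most of the glyph stays inside the image.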