
[1] Cao Jiuxin, Wang Tianfeng, Shi Lili, Mao Bo. Architecture and algorithm for web phishing detection [J]. Journal of Southeast University (English Edition), 2010, 26 (1): 43-47. [doi:10.3969/j.issn.1003-7985.2010.01009]

Architecture and algorithm for web phishing detection

Journal of Southeast University (English Edition)[ISSN:1003-7985/CN:32-1325/N]

Volume:
26
Issue:
2010 1
Page:
43-47
Research Field:
Computer Science and Engineering
Publishing date:
2010-03-30

Info

Title:
Architecture and algorithm for web phishing detection
Author(s):
Cao Jiuxin (1), Wang Tianfeng (1), Shi Lili (1), Mao Bo (2)
(1) School of Computer Science and Engineering, Southeast University, Nanjing 210096, China
(2) School of Architecture and Built Environment, Royal Institute of Technology, Stockholm SE-10044, Sweden
Keywords:
phishing detection; image similarity; attributed relational graph; inner EMD; outer EMD
PACS:
TP393
DOI:
10.3969/j.issn.1003-7985.2010.01009
Abstract:
A phishing detection system comprising three logical components (a client-side filtering plug-in, a back-end analysis center, and protected sites) is proposed. An image-based algorithm is designed to compute the similarity of two web pages. Each web page is first converted into an image, which is then divided into a set of sub-images by iterated dividing and shrinking. The attributes of the sub-images, including color histograms, gray histograms, and size parameters, are then computed to construct the attributed relational graph (ARG) of each page. To match two ARGs, the inner earth mover's distance (EMD) is first computed between every pair of nodes, one taken from each ARG; the outer EMD between the two ARGs is then computed to obtain the similarity of the web pages, which is used to detect phishing pages. The experimental results show that the proposed architecture and algorithm are robust and scalable and can detect phishing effectively.
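The nested matching step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each ARG node carries only a color histogram and a gray histogram (size parameters and edge attributes are left out), that the inner distance is the mean of two one-dimensional EMDs, and that both graphs have the same number of equally weighted nodes, in which case the outer EMD reduces to a one-to-one matching. All function names and the dictionary layout are hypothetical.

```python
from itertools import permutations


def normalize(hist):
    """Scale a histogram so that its bins sum to 1."""
    total = float(sum(hist))
    return [h / total for h in hist]


def emd_1d(hist_a, hist_b):
    """EMD between two equal-length 1-D histograms with unit bin spacing.

    For normalized 1-D histograms, the EMD equals the sum of absolute
    differences between the two cumulative distributions.
    """
    a, b = normalize(hist_a), normalize(hist_b)
    carried, cost = 0.0, 0.0
    for pa, pb in zip(a, b):
        carried += pa - pb  # surplus mass that must move to later bins
        cost += abs(carried)
    return cost


def inner_distance(node_a, node_b):
    """Assumed node distance: mean of the color and gray histogram EMDs."""
    return 0.5 * (emd_1d(node_a["color"], node_b["color"])
                  + emd_1d(node_a["gray"], node_b["gray"]))


def outer_emd(arg_a, arg_b):
    """Outer EMD between two ARGs, using inner distances as ground costs.

    Simplifying assumption: equal node counts and uniform node weights.
    The optimal flow is then a one-to-one matching, so a brute-force
    search over permutations is enough for small graphs.
    """
    n = len(arg_a)
    if n != len(arg_b):
        raise ValueError("this sketch assumes equal node counts")
    cost = [[inner_distance(a, b) for b in arg_b] for a in arg_a]
    best = min(sum(cost[i][p[i]] for i in range(n))
               for p in permutations(range(n)))
    return best / n


# Two toy "pages" with two sub-images each; the second lists the same
# sub-images in a different order, so the distance should be zero.
page = [{"color": [4, 0, 0, 0], "gray": [0, 4, 0, 0]},
        {"color": [0, 0, 2, 2], "gray": [1, 1, 1, 1]}]
reordered = list(reversed(page))
different = [{"color": [0, 0, 0, 4], "gray": [0, 0, 0, 4]},
             {"color": [4, 0, 0, 0], "gray": [4, 0, 0, 0]}]

print(outer_emd(page, reordered))
print(outer_emd(page, different))
```

A small outer EMD indicates visually similar pages. Handling graphs with unequal node counts or size-based node weights would require a general transportation solver in place of the permutation search.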


Memo

Memo:
Biography: Cao Jiuxin (born 1967), male, Ph.D., associate professor, jx.cao@seu.edu.cn.
Foundation items: The National Basic Research Program of China (973 Program) (No. 2010CB328104, 2009CB320501), the National Natural Science Foundation of China (No. 60773103, 90912002), the Specialized Research Fund for the Doctoral Program of Higher Education (No. 200802860031), the Key Laboratory of Computer Network and Information Integration of the Ministry of Education of China (No. 93K-9).
Citation: Cao Jiuxin, Wang Tianfeng, Shi Lili, et al. Architecture and algorithm for web phishing detection [J]. Journal of Southeast University (English Edition), 2010, 26 (1): 43-47.
Last Update: 2010-03-20