
[1] Cao Jiuxin, Dong Dan, Mao Bo, et al. Phishing detection method based on URL features [J]. Journal of Southeast University (English Edition), 2013, 29 (2): 134-138. [doi:10.3969/j.issn.1003-7985.2013.02.005]

Phishing detection method based on URL features

Journal of Southeast University (English Edition) [ISSN: 1003-7985 / CN: 32-1325/N]

Volume:
29
Issue:
2 (2013)
Page:
134-138
Research Field:
Computer Science and Engineering
Publishing date:
2013-06-20

Info

Title:
Phishing detection method based on URL features (Chinese title: 基于URL特征的Phishing检测方法)
Author(s):
Cao Jiuxin (曹玖新) 1,2; Dong Dan (董丹) 1,2; Mao Bo (毛波) 3; Wang Tianfeng (王田峰) 1,2
1. School of Computer Science and Engineering, Southeast University, Nanjing 211189, China
2. Key Laboratory of Computer Network and Information Integration of Ministry of Education, Southeast University, Nanjing 211189, China
3. Jiangsu Provincial Key Laboratory of E-Business, Nanjing University of Finance and Economics, Nanjing 210003, China
Keywords:
uniform resource locator (URL) features; phishing detection; support vector machine; incremental learning
CLC number:
TP393
DOI:
10.3969/j.issn.1003-7985.2013.02.005
Abstract:
To detect malicious phishing behavior effectively, a phishing detection method based on uniform resource locator (URL) features is proposed. First, phishing URLs are compared with legitimate ones to extract the distinguishing features of phishing URLs. A machine learning algorithm is then trained on the sample data set to obtain a URL classification model, which is used to classify the URLs under inspection. Because phishing URLs change over time, the classification model must be updated continually as new samples arrive; to this end, an incremental learning algorithm based on feedback from the original sample data set is designed. Experiments show that combining the extracted URL features with the support vector machine (SVM) classification algorithm achieves high phishing detection accuracy, and that the incremental learning algorithm is also effective.
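
The feature-extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not list the features the authors actually use, so every feature below (URL length, dot and hyphen counts in the host, IP-literal host, "@" in the authority, path depth, keyword hits), the `SUSPICIOUS_TOKENS` list, and the function names are assumptions chosen for illustration only, not the paper's feature set.

```python
import ipaddress
from typing import List
from urllib.parse import urlsplit

# Hypothetical keyword list for illustration; not taken from the paper.
SUSPICIOUS_TOKENS = ("login", "verify", "secure", "account", "update")


def host_is_ip(host: str) -> bool:
    """Return True when the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def url_features(url: str) -> List[float]:
    """Map a URL to a fixed-length numeric vector (assumed features only)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    path = parts.path or ""
    lowered = url.lower()
    return [
        float(len(url)),                                    # total URL length
        float(host.count(".")),                             # dots in the host
        float(host.count("-")),                             # hyphens in the host
        1.0 if host_is_ip(host) else 0.0,                   # host is a raw IP
        1.0 if "@" in parts.netloc else 0.0,                # "@" in authority
        float(len([seg for seg in path.split("/") if seg])),  # path depth
        float(sum(tok in lowered for tok in SUSPICIOUS_TOKENS)),  # keyword hits
    ]


if __name__ == "__main__":
    for example in (
        "http://192.168.0.1/secure/login.php",
        "http://paypal.com@evil-site.example.com/a",
    ):
        print(example, url_features(example))
```

Vectors of this kind, paired with phishing/legitimate labels, are the sort of input an SVM implementation such as LIBSVM (reference [9]) could be trained on; the incremental update step is not sketched here because the abstract does not specify how the feedback from the original samples is selected.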

References:

[1] Wikipedia. Phishing[EB/OL].(2013-04-20)[2013-04-27]. http://en.wikipedia.org/wiki/Phishing.
[2] Anti-Phishing Working Group. Phishing activity trends report[EB/OL].(2012-10-17)[2013-03-16]. http://docs.apwg.org/reports/apwg_trends_report_q2_2012.pdf.
[3] Chandrasekaran M, Narayanan K, Upadhyaya S. Phishing email detection based on structural properties[C]//NYS Cyber Security Conference. New York, USA, 2006: 2-8.
[4] Zhang Y, Hong J I, Cranor L F. Cantina: a content-based approach to detecting phishing web sites[C]//16th International World Wide Web Conference. Banff, Alberta, Canada, 2007: 639-648.
[5] Fu A Y, Liu W Y, Deng X T. Detecting phishing web pages with visual similarity assessment based on earth mover’s distance (EMD)[J]. IEEE Transactions on Dependable and Secure Computing, 2006, 3(4): 301-311.
[6] Cao Jiuxin, Mao Bo, Luo Junzhou, et al. A phishing web pages detection algorithm based on nested structure of earth mover’s distance (nested-EMD)[J]. Chinese Journal of Computers, 2009, 32(5): 922-929. (in Chinese)
[7] Garera S, Provos N, Chew M, et al. A framework for detection and measurement of phishing attacks[C]//Proceedings of the 2007 ACM Workshop on Recurring Malcode. Alexandria, VA, USA, 2007: 1-8.
[8] Smith T F, Waterman M S. Identification of common molecular subsequences[J]. Journal of Molecular Biology, 1981, 147(1): 195-197.
[9] Chang C C, Lin C J. LIBSVM: a library for support vector machines[J]. ACM Transactions on Intelligent Systems and Technology, 2011, 2(3): 1-27.
[10] Domeniconi C, Gunopulos D. Incremental support vector machine construction[C]//Proceedings of IEEE International Conference on Data Mining. San Jose, CA, USA, 2001: 589-592.
[11] Syed N A, Liu H, Sung K K. Incremental learning with support vector machines[C]//Proceedings of the Workshop on Support Vector Machines at the International Joint Conference on Artificial Intelligence. Stockholm, Sweden, 1999: 876-892.
[12] Wang W J. A redundant incremental learning algorithm for SVM[C]//Proceedings of the 7th International Conference on Machine Learning and Cybernetics. Kunming, China, 2008: 734-738.

Memo

Memo:
Biography: Cao Jiuxin (born 1967), male, Ph.D., professor, jx.cao@seu.edu.cn.
Foundation items: The National Basic Research Program of China (973 Program) (No. 2010CB328104, 2009CB320501), the National Natural Science Foundation of China (No. 61272531, 61070158, 61003257, 61060161, 61003311, 41201486), the National Key Technology R&D Program during the 11th Five-Year Plan Period (No. 2010BAI88B03), the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20110092130002), the National Science and Technology Major Project (No. 2009ZX03004-004-04), the Foundation of the Key Laboratory of Network and Information Security of Jiangsu Province (No. BM2003201), the Key Laboratory of Computer Network and Information Integration of the Ministry of Education of China (No. 93K-9).
Last Update: 2013-06-20