Investigation of the Dark Web Illegal Activities using Data Mining Approach

Authors

  • Agus Pamuji, Department of Islamic Counseling Guidance, Islamic State Religion Institute Sheikh Nurjati Cirebon, Indonesia

DOI:

https://doi.org/10.25008/bcsee.v4i1.1179

Keywords:

Dark web, Illegal activities, Web crawling, Data mining, Classification classifier, Support Vector Classifier

Abstract

Rapid advances in internet technology have opened many avenues for illicit activities that target users. These activities are carried out by anonymous individuals or groups, which makes identification and tracking difficult. Dark web content is updated periodically, and changes to concealed data often escape detection. The main challenges are therefore the design of the data mining framework and its effect on the accuracy with which illegal activities are classified. These challenges have become a significant concern in both academia and business. This paper analyzes a web crawler designed for the dark web. Using a data-driven approach, the crawler handles data collection, cleansing, and storage. Data mining enables thorough investigation through exhaustive exploration of the data and often reveals evidence of illicit activity. The collected web pages are then classified automatically into five distinct categories using two classifiers: the Linear Support Vector Classifier (SVC) and Naïve Bayes (NB). According to preliminary findings, SVC and NB achieved precision rates of 91% and 84%, respectively.
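
As an illustration of the kind of text classification the abstract describes, the following is a minimal sketch of a multinomial Naïve Bayes page classifier with a per-class precision measure, written in plain Python. It is not the author's implementation: the abstract does not state the features, the five category names, or the training data, so the tokenizer, the three labels (`drugs`, `counterfeit`, `hacking`), and the toy page texts below are all hypothetical. The Linear SVC used in the paper is not reproduced here.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class MultinomialNaiveBayes:
    """Multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        # Tokens never seen in training carry no information and are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / self.total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def precision(y_true, y_pred, label):
    """Fraction of pages predicted as `label` that truly belong to it."""
    hits = [t for t, p in zip(y_true, y_pred) if p == label]
    if not hits:
        return 0.0
    return sum(1 for t in hits if t == label) / len(hits)


# Hypothetical toy corpus standing in for cleaned page text from a crawler.
train_texts = [
    "buy pills and powder shipped discreetly",
    "pills powder wholesale vendor",
    "fake passport and forged documents",
    "forged banknotes and fake id cards",
    "exploit kit and stolen password dump",
    "malware exploit for sale password cracker",
]
train_labels = ["drugs", "drugs", "counterfeit", "counterfeit", "hacking", "hacking"]

model = MultinomialNaiveBayes().fit(train_texts, train_labels)

test_texts = ["pills and powder", "forged passport documents", "stolen password exploit"]
test_labels = ["drugs", "counterfeit", "hacking"]
predictions = [model.predict(t) for t in test_texts]
```

Calling `precision(test_labels, predictions, "counterfeit")` gives the share of pages labeled `counterfeit` that really belong to that class. Precision figures such as those reported in the abstract would come from a much larger held-out set of crawled pages, not from a toy corpus like this one.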

Published

2023-06-28

How to Cite

Investigation of the Dark Web Illegal Activities using Data Mining Approach. (2023). Bulletin of Computer Science and Electrical Engineering, 4(1), 37-48. https://doi.org/10.25008/bcsee.v4i1.1179