Impact of consumer data protection on the future of data science
Siddhant Pandey
Brevitaz Systems, India
J Comput Eng Inf Technol
Abstract
Statement of the Problem: With reported crime rising day by day, predicting and prioritizing incidents is becoming crucial, not only to slow the rise but also to stop crimes before they take place. A predictive policing solution that identifies crime-prone areas can help police allocate resources to those locations and prevent crimes from happening. Developing a system that predicts crime is complex, but it can make patrolling easier and more efficient, which can lower crime rates and contribute to a safer world. To this end, the data must be analyzed to estimate the probability of a particular crime, and the result can be embedded in a map that geolocates the coordinates so that crime-prone areas are easy to find. The output can be made more accessible through statistical information such as graphs and 24-hour forecasts. This allows emergency services to reach a likely crime location on or before time, lowering the chance that the crime occurs.

We used sample data provided by the Lucknow police and preprocessed it to extract the relevant features for training the model. The model should work with a dataset from any other city, provided it is recorded in the same format. We built the model in the Google Colab Jupyter environment using Python and libraries such as NumPy and pandas for data manipulation. After preprocessing, we selected and extracted the features of interest and divided the data into training and testing sets using pandas and scikit-learn. The data was run through several algorithms to measure accuracy, including Decision Tree, Random Forest, and KNN; the Random Forest classifier provided good accuracy, as shown in the figure.
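The workflow described above (select features, split into training and testing sets, compare Decision Tree, Random Forest, and KNN) can be sketched roughly as follows. This is a minimal illustration and not the authors' actual code: the Lucknow dataset and its schema are not given here, so the data is synthetic, and the column names (`latitude`, `longitude`, `hour`, `day_of_week`, `crime_type`), the labelling rule, and all hyperparameters are assumptions made only for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the police records; every column name is hypothetical.
rng = np.random.default_rng(0)
n = 600
df = pd.DataFrame({
    "latitude": rng.uniform(26.75, 26.95, n),
    "longitude": rng.uniform(80.85, 81.05, n),
    "hour": rng.integers(0, 24, n),
    "day_of_week": rng.integers(0, 7, n),
})

# Toy labelling rule (an assumption) so the models have a pattern to learn.
df["crime_type"] = np.where(
    (df["hour"] >= 20) | (df["hour"] < 4),
    "theft",
    np.where(df["latitude"] > 26.85, "assault", "vandalism"),
)

# Select the features of interest and the target.
X = df[["latitude", "longitude", "hour", "day_of_week"]]
y = df["crime_type"]

# Hold out 20% of the rows for testing, keeping class proportions similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# The three classifier families named in the abstract.
models = {
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}

# Fit each model on the training set and score it on the held-out set.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in accuracies.items():
    print(f"{name}: {acc:.3f}")
```

With real records, the synthetic frame would be replaced by the preprocessed dataset, and the accuracy figures would depend entirely on that data. KNN is distance-based, so scaling the features before fitting would likely matter in practice.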
Recent Publications
1. Shamsuddin, N. H. M., Ali, N. A., & Alwee, R. (2017). An overview on crime prediction methods. In 2017 6th ICT International Student Project Conference (ICT-ISPC), Skudai (pp. 1–5). IEEE.
2. Feng, M., Zheng, J., Ren, J., Hussain, A., Li, X., Xi, Y., & Liu, Q. (2019). Big data analytics and mining for effective visualization and trends forecasting of crime data. IEEE Access, 7, 106111–106123.
3. Vaidya, O., Mitra, S., Kumbhar, R., Chavan, S., & Patil, M. R. (2018). Crime rate prediction using data clustering algorithms. International Research Journal of Engineering and Technology (IRJET). e-ISSN, 2395–0056
Biography
Siddhant Pandey is a Software Scientist at Brevitaz Systems. He works on the OpenEPCIS project in collaboration with Benelog and GS1, where he is helping the team develop the EPCIS 2.0-compatible data repository.