The growth of crime on the internet obliges law enforcement bodies to keep an eye on online activities that involve huge volumes of data. This creates a need to detect suspicious activity on online discussion forums by making better use of data mining tools. This paper highlights the data mining techniques that have been prototyped and implemented for closely studying discussion forum data for suspicious activity in different domains. Numerous mining methods have been implemented to date for detecting suspicious discussions in forum datasets; with them, doubtful activities can be revealed by analyzing the interests of all users. The main obstacle researchers face is the lack of information retrieval and data analysis tools for real-time data from forum websites. The existing data is massive, so extracting the desired knowledge from such a large search space of social data requires an intelligent and interactive data mining algorithm. Moreover, the large number of parameters in the search space makes exhaustive large-scale search impractical, so efficient search approaches are essential. Discovering information in this way requires some knowledge of data mining. Data mining is defined as the process of discovering, extracting and analyzing meaningful patterns, structures, models, and rules from large quantities of data. It is emerging as one of the tools for crime detection, clustering of crime locations to find crime hot spots, criminal profiling, prediction of crime trends and many other related applications.

Many studies have examined the significance of crime data mining, and their results are reflected in new software applications for analyzing and detecting crime data.

Fabio Calefato, Filippo Lanubile and Nicole Novielli of the University of Bari “Aldo Moro” have developed a framework, EmoTxt, that can be used for emotion detection in online forums. EmoTxt identifies emotions in an input corpus provided as a comma-separated values (CSV) file, with one text per line, preceded by a unique identifier. The output is a CSV file containing the text id and the predicted label for each item of the input collection. Their model aims to recognize specific emotions such as joy, love and anger, whereas other proposed systems classify emotions only as positive, negative, or neutral.
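The input and output shape described above can be illustrated with a small Python sketch. This is not EmoTxt's code or its API: the keyword table `CUES`, the functions `predict_label` and `classify_csv`, and the fallback label `none` are all hypothetical stand-ins for a trained classifier, used only to show one text per line going in and one (id, label) pair per line coming out.

```python
import csv
import io

# Hypothetical keyword cues standing in for a trained emotion classifier.
CUES = {
    "joy": {"happy", "glad", "great"},
    "love": {"love", "adore"},
    "anger": {"hate", "angry", "furious"},
}


def predict_label(text):
    """Return the first emotion whose cue words appear in the text."""
    words = set(text.lower().split())
    for label, cues in CUES.items():
        if words & cues:
            return label
    return "none"  # assumed fallback label when no cue matches


def classify_csv(input_csv):
    """Read 'id,text' rows and return 'id,label' rows as CSV text."""
    reader = csv.reader(io.StringIO(input_csv))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        if not row:
            continue  # skip blank lines
        text_id, text = row[0], row[1]
        writer.writerow([text_id, predict_label(text)])
    return out.getvalue()


if __name__ == "__main__":
    sample = "t1,I love this library\nt2,I hate this bug\nt3,It compiles\n"
    print(classify_csv(sample))
```

A real system would replace `predict_label` with a model trained on labeled forum text; the CSV handling around it would stay much the same.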

According to the research by Fabio Calefato, the framework defines a tree-structured hierarchical classification of emotions, where each level refines the granularity of the previous one, thus giving more information about the nature of the emotion. At the top level, the framework includes six basic emotions: love, joy, anger, sadness, fear, and surprise.

A research paper published in the Imperial Journal of Interdisciplinary Research by M. Suruthi Murugesan, R. Pavitha Devi, S. Deepthi, V. Sri Lavanya and Dr. Annie Princy, “Automated Monitoring Suspicious Discussions on Online Forums Using Data Mining Statistical Corpus Based Approach”, suggests various techniques and algorithms that can be employed. The paper elaborates on stop-word selection, a stemming algorithm, a brute-force algorithm, a learning-based algorithm and a matching algorithm.
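Two of the steps named above, stop-word removal and stemming, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of the cited paper: the `STOP_WORDS` set is a tiny hand-picked sample, and `naive_stem` is a simple suffix stripper rather than a full stemming algorithm such as Porter's.

```python
import re

# Assumed, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "for"}

# Suffixes tried in order; a real stemmer applies far richer rules.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and keep runs of letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that carry little meaning on their own."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem what remains."""
    return [naive_stem(t) for t in remove_stop_words(tokenize(text))]


if __name__ == "__main__":
    print(preprocess("The users are posting threats in the forums"))
```

Reducing posts to stemmed content words in this way shrinks the vocabulary that later matching or learning steps have to handle.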

Another paper, “Surveillance of Suspicious Discussions on Online Forums Using Text Data Mining” by Harika Upganlawar and Nilesh Sambhe, published in the International Journal of Advances in Electronics and Computer Science, describes a system that analyzes online plain text sources from selected discussion forums, classifies the text into different groups and decides which posts are legal and which are illegal using the Levenshtein algorithm. The Levenshtein distance is a measure of similarity between two words: the minimum number of single-character insertions, deletions and substitutions needed to turn one word into the other, so a smaller distance means more similar words.
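The Levenshtein distance mentioned above can be computed with a standard dynamic-programming routine. The Python sketch below shows that routine, followed by one possible way of using it to flag posts. The `flag_post` helper, its watch list and its `max_distance` threshold are assumptions made for illustration; the cited paper's actual decision rule is not described here.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def flag_post(text, watch_list, max_distance=1):
    """Return the watch-list terms that some word in the post nearly matches."""
    words = text.lower().split()
    return sorted({
        term
        for term in watch_list
        for word in words
        if levenshtein(word, term) <= max_distance
    })


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(flag_post("selling stolen cardz today", ["cards", "weapon"]))
```

Matching on edit distance rather than exact equality lets a system catch deliberate misspellings of watched terms, at the cost of some false matches between short, unrelated words.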

In our proposed system, we apply the steps of data mining. The data set is collected and explored from various online forums such as KDnuggets, Reddit and GitHub.

The raw data is converted from unstructured to structured form using traditional analysis tools. We collect, cleanse, and format the data because some mining functions accept data only in a certain format. Typical tasks include preparing the data for the modeling tool by selecting tables, records, and attributes; the meaning of the data is not changed. The main techniques of crime data mining are clustering, association rule mining, classification and sequential pattern mining. Along with these techniques we use additional algorithms, such as stop-word selection and an emotion detection algorithm, to find clear and meaningful results and patterns.

Post Author: admin
