INTRODUCTION
This paper surveys several definitions of data mining in order to convey what the term means in general. The definitions vary, but each falls within a shared context that helps explain the concept. As a valuable tool in data management, data mining is no longer limited to statisticians; its applications now extend to other major fields such as business, computer systems, and manufacturing. Examples are given in this paper to illustrate this applicability.
The size of the world's data is estimated to be doubling every 20 months (1991). The average Fortune 500 company manages over a terabyte of electronic information daily (between 20 and 500 million document pages and records), with 57% annual growth ( 1998; , 1999). However, some, or even the majority, of this growing volume of data has not been used effectively, because the technology needed to analyze such large volumes has been lacking (1998). The problem is compounded by the pervasive use of Web applications such as e-commerce, which add further volumes of data to already intricate databases and data warehouses. Data mining emerged as a solution to this problem.
Data mining, also known as Knowledge Discovery in Databases (KDD), is an automated process of searching very large volumes of data for patterns that are useful for predicting future trends and for preparing contingency plans for the worst case that may be forecast. Some authors, such as (1997), see data mining as a "single step in a larger process that we call the KDD process." The steps of that process include data warehousing; target data selection; cleaning; preprocessing; transformation and reduction; data mining; model selection (or combination); evaluation and interpretation; and consolidation and use of the extracted knowledge. These definitions serve as an overview of the broad concept behind the term data mining.
DEFINITION
As mentioned previously, data mining has a broad meaning. To keep the discussion manageable, this paper limits itself to three definitions. The first describes data mining as the nontrivial extraction of implicit, previously unknown, and potentially useful information from data, and as the science of extracting useful information from large data sets or databases ( 2005).
The second describes data mining as a technique rooted in statistics, computer science, and related areas of science that typically uses large datasets to find hidden associations between variables; where such associations are significant, they can aid managerial decision-making.
The third describes data mining as a powerful approach with great potential to help organizations focus on the information already available in their existing databases. It provides tools to predict future trends and behaviors, thereby allowing managers to be proactive and make knowledge-driven decisions.
EXPLANATIONS
To statisticians, data mining implies a struggle against the game of chance. To computer people, or information technology professionals, however, data mining casts a positive light on their field. The databases they build are regarded as a resource from which valuable information can be drawn. Data mining can be applied to human performance data, text data, geospatial data, science and engineering data, bioinformatics (genetic) data, customer relationship management data, computer and network security data, image data, and manufacturing quality data.
Data mining, as a process of extracting previously unknown information from consolidated databases to support strategic and tactical managerial decision-making, invokes algorithms that enumerate patterns from, or fit models to, data (, 1997). From the extracted information, data mining can form a prediction or classification model by identifying relationships between database records. Those patterns or rules can be used to guide decision-making and forecast the effect of those decisions (, 1998; , 1999). Data mining therefore reveals new knowledge to managers, which in turn results in more informed decision-making.
(1998) grouped the operations of data mining into four common types. First, classification, the most commonly practiced mining activity, helps recognize patterns that describe the group to which an item belongs. Second, clustering uses segmentation to partition the database into clusters. Third, association identifies connections between records, based on association and sequence discovery. Fourth, forecasting provides estimates of the future value of continuous variables based on patterns within the data. In effect, data mining requires the completion of two general steps: first, selecting and transforming the data into a format recognizable to the mining operations (data warehousing); and second, applying analytical techniques to analyze the data and identify patterns and predictions for decision-making purposes.
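To make the association operation more concrete, the following is a minimal sketch in Python, not a method taken from the cited authors. It assumes a hypothetical set of market-basket transactions and scores two-item rules by support (how often the pair occurs) and confidence (how often the second item appears given the first). The data, the function name pair_rules, and the thresholds are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket data; each set is one customer transaction.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) for item pairs.

    The thresholds are illustrative assumptions, not recommended values.
    """
    n = len(transactions)
    # How many transactions contain each single item.
    item_counts = Counter(item for t in transactions for item in t)
    # How many transactions contain each unordered pair of items.
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


RULES = pair_rules(TRANSACTIONS)
for lhs, rhs, support, confidence in RULES:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

On this toy data only the bread and butter pair passes both thresholds. Real association mining tools search over larger item sets and use more efficient algorithms than this exhaustive pair count.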
However, for all the good that data mining can do, it has a few flaws: it may lead to the discovery of correlations that do not actually exist, and its use raises privacy concerns.
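The risk of non-existent correlations can be illustrated with a small simulation. The sketch below is an assumption-laden illustration rather than an analysis from the cited literature: it generates a random target variable and many unrelated random variables, then reports the strongest correlation found purely by chance. The function names, the number of variables, and the number of records are hypothetical choices.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_chance_correlation(n_variables=1000, n_records=20, seed=0):
    """Largest |r| between a random target and unrelated random variables."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_records)]
    best = 0.0
    for _ in range(n_variables):
        # Each candidate is pure noise, independent of the target.
        candidate = [rng.gauss(0, 1) for _ in range(n_records)]
        best = max(best, abs(pearson(candidate, target)))
    return best


BEST = strongest_chance_correlation()
print(f"strongest |r| among 1000 unrelated variables: {BEST:.2f}")
```

Because so many variables are searched against so few records, at least one of them is very likely to look strongly correlated with the target even though no real relationship exists. This is why patterns found by mining should be validated on data that was not used in the search.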
EXAMPLES
The predictive power of statistical models run against huge data sets forms the basis of actuarial work in all areas, and strategic planning and risk-benefit analysis rely heavily on analysis of large sets of past data to forecast future trends (, 1999). In marketing, for example, the outcomes of market and customer data analysis are expected to help marketing managers understand and predict future customer, product, or process behavior (, 1996). Another example comes from credit card companies. With data mining, a company can detect credit card fraud by identifying counterfeit, lost, and stolen cards, and can raise an alert when a cardholder's transactions do not match previous patterns. Similarly, data mining helps managers discover patterns that predict customers' purchasing behavior. For example, modeling customer behavior gives lenders a predictive tool that helps mine a bank's retail customers for mortgage product opportunities ( , 1999). Data mining enables savvy corporations to develop marketing strategies, target mailings and advertising messages, minimize risk, and eliminate wasteful expenditures as far as possible. In fact, a number of software tools have already been produced in response to the demand for data mining. An example of a general tool is Explora ( 1991; 1996); examples of more domain-specific tools are the Interactive Data Exploration and Analysis system of AT&T ( 1996), which permits one to segment market data and analyze the effect of new promotions and advertisements, and Advanced Scout (. 1997), which seeks interesting patterns in basketball games.
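The credit card example above, in which an alert is raised when a transaction does not match the cardholder's previous pattern, can be sketched very simply. The Python fragment below is only an illustration under stated assumptions: it treats the "previous pattern" as the mean and standard deviation of past transaction amounts and flags any amount more than three standard deviations away. The history values, the function name is_unusual, and the threshold are hypothetical; production fraud systems use far richer features and models.

```python
from statistics import mean, stdev


def is_unusual(history, amount, threshold=3.0):
    """Flag an amount that deviates strongly from the cardholder's history.

    The z-score rule and the threshold of 3.0 are illustrative assumptions.
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past amounts are identical; any different amount is unusual.
        return amount != mu
    return abs(amount - mu) / sigma > threshold


# Hypothetical past transaction amounts for one cardholder.
HISTORY = [42.0, 38.5, 51.0, 45.25, 40.0, 47.5]

for amount in (44.0, 950.0):
    status = "ALERT" if is_unusual(HISTORY, amount) else "ok"
    print(f"{amount:>8.2f}: {status}")
```

Here an amount close to the cardholder's typical spending passes, while an amount far outside that range triggers an alert. A flagged transaction is a prompt for review, not proof of fraud.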
CONCLUSION
Data mining makes information available to managers, companies, and marketers so that they can foresee what is going to happen and act pre-emptively on the plausible scenarios a business may encounter. Given its many applications, the results of data mining must be properly interpreted to realize its full benefits. However, data mining is also susceptible to abuse. When the collection of data involves individual people, many questions arise concerning privacy, legality, and ethics.
REFERENCES: