Data mining can unleash a gold mine of information from one’s own databases. Leon Perlman reports
Unbeknown to many corporates, they’re sitting on a gold mine. Buried deep inside their databases are vast repositories of data patterns that could unlock acres of productivity and millions of rands in revenue.
The key to this info labyrinth is a new and intelligent software technology called data mining. The idea is simple: find hidden data patterns within complex databases. Data mining programs use sophisticated analytical techniques to simplify these databases by finding patterns, groupings and consistent relationships within databases of any size. The links are displayed in a simple graphical format.
These programs are being used to analyse patterns within personnel communications, insurance claims, loan portfolios, telecommunications networks, equities trading, banking transactions, network traffic, disease epidemiology, marketing databases and goods inventories.
For example, a grocery chain may discover that a significant percentage of its customers who buy salmon also buy white wine. This might allow the grocer to forecast sales better, create a theme for merchandising, or target its marketing at the group of consumers most likely to purchase fish.
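The salmon-and-wine finding is the kind of relationship usually measured as an association between items in a shopping basket. The sketch below is an illustration only, not NetMap's method: the baskets are invented, and `rule_stats` is a hypothetical helper that reports how often the two items appear together (support) and how often salmon buyers also take white wine (confidence).

```python
# Hypothetical shopping baskets; the items and counts are invented for illustration.
baskets = [
    {"salmon", "white wine", "lemons"},
    {"salmon", "white wine"},
    {"salmon", "bread"},
    {"bread", "milk"},
    {"white wine", "cheese"},
]


def rule_stats(baskets, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    # Baskets containing the first item, e.g. salmon.
    with_antecedent = [b for b in baskets if antecedent in b]
    # Of those, the baskets that also contain the second item.
    with_both = [b for b in with_antecedent if consequent in b]
    support = len(with_both) / len(baskets)
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


support, confidence = rule_stats(baskets, "salmon", "white wine")
print(f"support={support:.2f} confidence={confidence:.2f}")
```

In this toy data two of the three salmon baskets also contain white wine, so the confidence is about 0,67. A real study would run over millions of till records and would compare that figure with how often white wine sells on its own.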
The technology will usually “mine” data from DB2 databases, as well as flat files containing data extracted from relational databases or generated from operational systems.
According to Kerry Evans of Synectics, the local distributor of the data mining software NetMap, a number of corporates and some government departments are using data mining techniques. These include Old Mutual, M-Net, some medical aid companies, NBS and a number of external auditors.
The software is restricted only by the limits of the hardware. Synectics provides a bureau service for companies that require only ad hoc studies and don’t have the requisite hardware.
NetMap’s genesis lies with Australian economist and engineering enthusiast John Galloway, who, in the late 1970s, became interested in ways of identifying the informal team structures that emerge within organisations and often drive their operation.
There are, of course, other data mining packages available, but NetMap’s great strength is its graphical eloquence. The most complex of relationships can be unravelled, with up to 13,5 million nodes, each identified by a four-character alphanumeric code, available for simultaneous screen display.
Its analytical capacity proved a boon for a large South African medical aid company, which slashed up to 15% off its claims bill by using data mining to identify duplicate, unnecessary and even fraudulent claims.
One study looked for patterns that would isolate claimants with a potential drug dependency problem by searching for members who had been to more than five different medical practitioners during a certain period.
The analysis revealed that some members had been seeing two practitioners on the same day.
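Both checks in that study come down to grouping claims by member and counting distinct practitioners. The sketch below shows one way this might be done; it is not the medical aid's actual analysis. The claim records, the field layout and the function names are all assumptions made for illustration, and the data is taken to be already limited to the period under review.

```python
from collections import defaultdict
from datetime import date

# Hypothetical claim records: (member_id, practitioner_id, visit_date).
claims = [
    ("M1", "P1", date(2000, 1, 3)),
    ("M1", "P2", date(2000, 1, 3)),
    ("M1", "P3", date(2000, 1, 10)),
    ("M1", "P4", date(2000, 1, 17)),
    ("M1", "P5", date(2000, 1, 24)),
    ("M1", "P6", date(2000, 1, 31)),
    ("M2", "P1", date(2000, 1, 3)),
    ("M2", "P1", date(2000, 1, 20)),
]


def many_practitioners(claims, threshold=5):
    """Members who saw more than `threshold` different practitioners."""
    seen = defaultdict(set)
    for member, practitioner, _ in claims:
        seen[member].add(practitioner)
    return {member for member, ps in seen.items() if len(ps) > threshold}


def same_day_visits(claims):
    """Map (member, day) to the practitioners seen, where there were two or more."""
    per_day = defaultdict(set)
    for member, practitioner, day in claims:
        per_day[(member, day)].add(practitioner)
    return {key: ps for key, ps in per_day.items() if len(ps) >= 2}


print(many_practitioners(claims))
print(same_day_visits(claims))
```

Here member M1 is flagged on both counts, while M2, who returned to the same practitioner, is not. A flag of this kind is only a prompt for a closer look, not proof of a problem.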
Dentists also came under the spotlight with a study that was aimed at isolating those dentists using unnecessarily expensive procedures instead of cheaper alternatives.
Multichoice/M-Net used data mining to investigate the way in which staff members relate to one another or work together. The information was gathered through carefully designed questionnaires and then stored in a database.
They also analysed the data traffic on their local area network (LAN), an extensive and complicated infrastructure involving some 100 servers and more than 800 workstations throughout the country. They identified components, down to network-card level, that represented critical or single points of failure.
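In graph terms, a single point of failure is a component whose removal cuts the network in two. The sketch below illustrates the idea with a brute-force check: remove each component in turn and test whether the rest can still reach one another. The topology and names are invented, the network is assumed to start out fully connected, and this is not a description of how NetMap or M-Net performed the analysis.

```python
from collections import deque

# Hypothetical topology: each pair is a direct link between two components.
links = [
    ("server-a", "switch-1"),
    ("server-b", "switch-1"),
    ("switch-1", "router"),
    ("router", "switch-2"),
    ("switch-2", "ws-1"),
    ("switch-2", "ws-2"),
    ("ws-1", "ws-2"),
]


def build_graph(links):
    """Build an undirected adjacency map from the list of links."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def is_connected(graph, skip):
    """True if every component other than `skip` can still reach the others."""
    nodes = [n for n in graph if n != skip]
    if not nodes:
        return True
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:  # breadth-first search that never passes through `skip`
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour != skip and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return len(seen) == len(nodes)


def single_points_of_failure(graph):
    """Components whose removal splits the remaining network."""
    return sorted(n for n in graph if not is_connected(graph, n))


graph = build_graph(links)
print(single_points_of_failure(graph))
```

In this toy network the router and both switches are flagged, while the workstations, which are also linked to each other, are not. Checking every component this way is slow on a large network, where a dedicated articulation-point algorithm would be the usual choice.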
In the UK, NetMap’s ability to find seemingly innocuous linkages between organisations and individuals helped expose the kingpin of an organised crime syndicate, who had evaded exposure by limiting his connection to the criminal network. The largely inconclusive investigation had spanned 16 person-years, but within half an hour of analysing the assembled data, the Fraud Office had identified four different frauds and the key player it had overlooked.
Our beleaguered Receiver of Revenue is reportedly also interested in using data mining software.