Mining Our Way to Improved Healthcare

In my previous posts, I introduced the concept of data mining and related advanced analytical techniques. I also enumerated some of the reasons why organizations are looking to incorporate these technologies and expertise into their toolset. In this post, I will describe some of the applications of data mining in the healthcare domain. The healthcare industry in this country comprises many different actors with varying objectives. Healthcare providers, health insurance plans, pharmacy benefit managers, and governmental organizations have all attempted to use data mining to manage utilization, improve health outcomes, and maintain financial stability in an increasingly complex economic landscape.

Institutional healthcare providers such as hospitals, skilled nursing facilities, and outpatient clinics have used data mining techniques to develop standardized treatment regimens that have shown success in clinical settings. Comparing the outcomes of different treatment options across patient cohorts with the same underlying indication allows healthcare providers to ascertain the most effective course of action for that disease or condition. Data mining can also help healthcare providers tailor treatment options to a patient's profile. Techniques such as cluster analysis and classification trees enable providers to group patients into profiles based on demographics, health history, response to prior treatments, and similar attributes, and then to prescribe treatment in real time based on a patient's profile and the results observed for similar patients.
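As a rough illustration of the cluster analysis idea, here is a minimal k-means sketch in plain Python. Everything in it is an assumption made for the example: the two features (age and number of prior visits), the six made-up patients, the choice of two clusters, and the simple "first k points" starting centroids. It is not any provider's actual method.

```python
import math

def kmeans(points, k, iterations=20):
    """Group points into k clusters with basic k-means.

    Starting centroids are simply the first k points, which keeps the
    example deterministic (real tools use smarter initialization).
    """
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid (Euclidean distance).
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids

# Hypothetical patients as (age, prior visits in the last year).
patients = [(25, 1), (30, 2), (28, 1), (70, 8), (65, 9), (72, 7)]
assignments, centroids = kmeans(patients, k=2)
print(assignments)
print(centroids)
```

Each resulting cluster can be read as a candidate patient profile. With real data, the features would normally be put on a common scale first so that one of them (age, here) does not dominate the distance calculation.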

Health insurance companies use data mining to manage healthcare resource utilization and provide quality care to their beneficiaries. This includes, but is not limited to, anticipating future inpatient and emergency events, lowering inpatient readmission rates, and improving outreach to beneficiaries to facilitate timely care. Each of these tasks requires collating and analyzing beneficiaries' prior claims, diagnoses, lab results, and demographic data. Health insurance companies can forecast clinical events and readmissions using predictive modeling. These predictive models can range from traditional techniques such as regression and classification to more advanced techniques such as machine learning algorithms. Health plan sponsors can use these predictive models to target outreach programs at the beneficiaries who are at the highest risk of an inpatient or emergency room event. These outreach programs help insurance companies limit expensive clinical events and improve overall health outcomes by encouraging beneficiaries to seek timely care and adhere to prescription regimens.
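To make the risk-ranking idea concrete, here is a small sketch that fits a logistic regression by gradient descent and then picks the highest-risk members for outreach. All of it is assumed for illustration: the two features (prior admissions and number of chronic conditions), the tiny synthetic training set, the member IDs, and the learning settings. A real model would draw on far richer claims and clinical data and would be validated before use.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # Prediction error for this row under the current model.
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict_risk(model, x):
    """Predicted probability of an event for one feature vector."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def outreach_list(model, members, k):
    """Return the IDs of the k members with the highest predicted risk."""
    ranked = sorted(members, key=lambda m: predict_risk(model, m[1]), reverse=True)
    return [member_id for member_id, _ in ranked[:k]]

# Synthetic history: (prior admissions, chronic conditions) and whether
# an inpatient or emergency event followed (1) or not (0).
history = [(0, 0), (0, 1), (1, 0), (1, 2), (1, 1),
           (2, 1), (2, 3), (3, 1), (3, 2), (4, 3)]
had_event = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
model = fit_logistic(history, had_event)

# Hypothetical current members to score.
members = [("A", (0, 1)), ("B", (4, 3)), ("C", (1, 0)), ("D", (3, 2))]
outreach = outreach_list(model, members, k=2)
print(outreach)
```

The value of `k` stands in for the outreach program's capacity: the plan contacts as many of the top-ranked members as its budget allows.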



The applications described here are just a few of the data mining applications that are prevalent in the healthcare landscape, but they demonstrate the immense benefit of using data to improve health outcomes in this country. The challenge is that health-related data remains highly fragmented, given existing regulations and legitimate concerns about data privacy. Securely integrating healthcare data across platforms should be the overarching goal, and implementing such a solution will require cooperation among the various healthcare entities.

Data Mining: Why the Gold Rush?

My earlier post introduced the concept of data mining and highlighted some of the domains and applications that have used this set of technologies and expertise for their analytical needs. Many companies are moving quickly to make this toolset part of their decision-making process. A casual survey of the current landscape seems to indicate that data mining has paid large dividends for many commercial and research enterprises.

At this point, you may be wondering about the reasons for this spike in interest in data mining techniques.  The simple answer to the question is that there has been an exponential increase in the amount of data generated while the computational cost of storing and analyzing the data has decreased dramatically.

Infographic by Domo


By some estimates, 2.5 billion gigabytes of data are generated each day globally, which represents an annual increase of 23% when measured over the last 2 decades.  From Facebook “likes” to daily credit card transactions and cholesterol measurements, the span of data that is available for analysis is mind-boggling.

The argument, therefore, is that with so much data available, reliance on human analysts and traditional techniques is no longer adequate to analyze the staggering breadth and complexity of data stores. Often, information is concealed in the data and is not readily evident through legacy analytical means or human inspection. Tools such as machine learning algorithms are better suited to searching for complicated multi-factor patterns in the data without loss of objectivity. Such automated algorithms also cost much less than employing additional analysts and statisticians.

Finally, with the availability of such data, commercial organizations face immense competitive pressure to transform the data into operational strategies and ultimately market share. Retail companies are using data mining and machine learning algorithms to forecast product demand and tailor incentives and promotions at the customer level. Financial institutions are using similar tools to build individual credit risk profiles prior to authorizing lines of credit. While it is a little early to pronounce the current state of data mining and machine learning a panacea for all organizational issues, a recent survey by MIT Technology Review Custom and Google Cloud found that more than half of both early-stage and mature-stage users reported that deploying machine learning and other data mining techniques has resulted in demonstrable ROI.

Now that we know why organizations are jumping on the data mining bandwagon, it is time to delve a little deeper into some of the intricacies of these analytical tools.  The next set of posts in this blog will try to address some of the techniques and their applications.  Stay tuned!

Data Mining Unearthed!

How does your favorite online retailer come up with product suggestions as you shop online?  How do credit card companies target customers with their promotional credit card offers?  The answer to these questions invariably involves a concept known as data mining.

When you hear the term data mining, it conjures images of a person sitting in front of a computer screen digging through a vast mountain of numbers to extract precious nuggets of information that can provide valuable insights and solutions.  While this slightly belabored metaphor is not entirely fiction, the reality of modern-day data mining is much more sophisticated.

In information technology parlance, data mining is not limited to the extraction of data from a data warehouse.  Instead, it is the process of discovering trends and patterns in data through the use of computing technology, algorithms, and statistical analysis.

In recent nomenclature, this process has been synonymous with the more generalized concept of “knowledge discovery in databases” or KDD.  Trendy industry terms such as machine learning, artificial intelligence, and text mining are some of the cornerstone data mining techniques currently being applied in a variety of fields such as healthcare, marketing, advertising, and online retail, among others.

My name is Unni Mundaya and I have studied and worked in the data analytics space for the past 8 years, primarily in the healthcare domain.  You can find more details about me in my bio.

There have been rapid advancements in data mining, both in terms of the technology and its applications over the last 20 years.  This blog will be devoted to demystifying some of the more esoteric concepts related to data mining and will highlight some of the applications of these techniques currently being employed by organizations worldwide.  I welcome your feedback as I explore this topic in my future entries.  Stay tuned!
