The concept of artificial intelligence can be traced back to the early 1950s, nearly half a century before the term “big data” was first coined. While big data is a fairly recent technological breakthrough, it is already shaping the future of AI in remarkable ways.
However, data scientists still face some significant roadblocks as they try to bring AI into the next century. One of their biggest challenges is dealing with the limitations of data liquidity.
What is data liquidity?
Data liquidity is an informal term with multiple possible definitions. However, most big data experts define data liquidity as the ease with which relevant data can be brought to the right end-user.
The term was first used in healthcare applications. Both healthcare providers and patients have become very dependent on big data in recent years. Big data is used to track diagnoses, identify likely health risks for individual patients and develop the most appropriate treatment programs.
This wouldn’t be possible if algorithms couldn’t determine the relationship between different datasets and the applicable patients. At best, big data would help healthcare providers make generalized observations about different categories of patients and then rely on their own subjective, imperfect judgment to decide which patients certain treatment regimens should be applied to. A team of researchers from the Institute of Medicine wrote a paper citing the importance of data liquidity in treating cancer. Here is an excerpt from their white paper:
“Rapid, seamless data exchange in cancer—throughout the continuum from research to clinical care—remains an unmet need with no clear path forward to a solution. We assert that there is a national urgency to find solutions to support and sustain the cancer informatics ecosystem (IOM, 2012c), and propose addressing this challenge through a coalition of diverse, interested stakeholders working in a precompetitive collaboration to achieve data liquidity. Such a coalition could work toward the goal of personalized cancer medicine, addressing the challenges in manageable, incremental steps.”
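The idea is easiest to see at the level of a single patient record. Below is a minimal, hypothetical sketch of what “bringing the relevant data to the right end-user” can look like when two clinical datasets share a common patient identifier; the table and column names are invented for illustration, and real clinical systems rarely make the join this easy.

```python
import pandas as pd

# Hypothetical patient-level datasets; in practice these would come from
# separate clinical systems that rarely share a common schema.
diagnoses = pd.DataFrame({
    "patient_id": [101, 102, 103],
    "diagnosis": ["type 2 diabetes", "hypertension", "melanoma"],
})
risk_factors = pd.DataFrame({
    "patient_id": [101, 103],
    "risk_factor": ["family history", "prior sun exposure"],
})

# Data liquidity, in this simplified sense, is the ease of bringing the
# relevant rows together for the right patient. A shared identifier makes
# the join trivial; without one, the relationship has to be inferred.
patient_view = diagnoses.merge(risk_factors, on="patient_id", how="left")
print(patient_view)
```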
Data liquidity will be the focus of future AI discussions
Most of the discussion about data liquidity still focuses on the healthcare sector. However, it also plays a growing role in artificial intelligence.
Renowned data scientist Roger Chen recently recorded a podcast on data liquidity for AI, in which he highlights a number of the challenges and opportunities it poses for AI developers.
Artificial intelligence is already being used in just about every sector of the economy. Here are some of the most common applications:
- Improving actuarial analysis for setting insurance premiums
- Using predictive analytics to forecast revenue for an individual business at a specific future date
- Identifying equilibrium price points for financial securities sold on public exchanges
- Determining the risk factors a specific patient has which may contribute to future health problems
- Helping people using online dating sites find more compatible matches
- Identifying decisions that could conflict with contracts and issuing alternative recommendations to avoid disputes
- Showing drivers the most economical routes by factoring in current gas prices
The list of applications for AI is virtually endless. They all depend on data liquidity at the macro or individual scale. Many of these solutions depend on data specific to the end user. They may also need to draw on third-party sources of data that are relevant to the user.
For example, consider apps such as GasBuddy that help customers estimate the cost of the gas they need for a given trip. These tools need to collect the following information:
- The customer’s vehicle
- The distance they are traveling
- The customer’s average driving speed
- Current traffic congestion levels
- The current cost of gas
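To make this concrete, here is a minimal sketch of how an estimate might be computed from those five inputs. The function name, the congestion adjustment, and the high-speed penalty are simplified assumptions for illustration; this is not GasBuddy’s actual model.

```python
def estimate_trip_fuel_cost(
    distance_km: float,          # the distance the customer is traveling
    litres_per_100km: float,     # fuel economy of the customer's vehicle
    avg_speed_kmh: float,        # the customer's average driving speed
    congestion_factor: float,    # 1.0 = free-flowing, >1.0 = heavy traffic
    price_per_litre: float,      # the current cost of gas
) -> float:
    """Rough fuel-cost estimate for a single trip.

    Consumption rises in stop-and-go traffic and at sustained high speeds;
    the adjustments below are illustrative, not a real vehicle model.
    """
    adjusted_consumption = litres_per_100km * congestion_factor
    if avg_speed_kmh > 110:  # penalize sustained high-speed driving
        adjusted_consumption *= 1.1
    litres_needed = distance_km / 100 * adjusted_consumption
    return litres_needed * price_per_litre

# Usage: a 350 km trip in a car burning 8 L/100 km, light traffic, $1.60/L.
print(f"${estimate_trip_fuel_cost(350, 8.0, 100, 1.05, 1.60):.2f}")
```

Note that if even one argument were pulled from the wrong user’s profile, say another customer’s vehicle fuel economy or trip distance, the estimate would be skewed with no visible error.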
All of these factors are very important for making an accurate calculation. If the algorithm mistook just one of these variables for that of another user, the projected cost could be off by a factor of five or more. Since these services rely almost exclusively on AI, a human employee won’t usually identify such a mistake unless a user files a complaint. This means that data liquidity problems often go unnoticed.
Many data scientists have encountered data liquidity issues in their projects. This is often the consequence of developers prioritizing data scalability while placing too little emphasis on data liquidity.
The good news is that a number of Hadoop tools now provide the data liquidity solutions data scientists need to address these challenges.
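As one illustration of what such tooling can look like, here is a minimal sketch of publishing raw data stored on HDFS as a named, queryable table with PySpark. It assumes a Spark-on-Hadoop deployment with Hive support enabled; the paths, table name, and column names are hypothetical.

```python
from pyspark.sql import SparkSession

# Entry point for a Spark job running on a Hadoop cluster.
spark = SparkSession.builder.appName("data-liquidity-sketch").getOrCreate()

# Raw records land on HDFS in whatever shape the producing system emits.
trips = spark.read.parquet("hdfs:///data/raw/trips")

# Publishing a cleaned, queryable view is one small step toward liquidity:
# any authorized tool in the cluster can now reach the same data by name.
trips.select("user_id", "distance_km", "fuel_price") \
    .write.mode("overwrite") \
    .saveAsTable("analytics.trips_clean")
```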