Posted by: gmontealegre | October 20, 2014

“Big Data”


from Spotlight

Correlating patient data from a broad variety of sources can help reveal patterns of illness, identify individuals most likely to use emergency services, cut healthcare costs, and improve patient outcomes

Aneesh Chopra, former chief technology officer for the federal government, believes that the value of data sources grows exponentially as new ones are added.
The term “Big Data” seems to be everywhere these days. It’s being used to describe how marketers learn about shoppers’ preferences, security organizations pinpoint potential risks, and demographers identify major trends. But nowhere does the use of big data have more potential to impact our quality of life than in healthcare.

As electronic medical records become the norm, and computers and mobile devices become ubiquitous, crunching large volumes of digital records to enhance healthcare decision-making is now possible.

Researchers are demonstrating how inventive uses of data can reveal patterns of illness that were previously obscure. Some hospitals in New Jersey, Pennsylvania, and other states are getting better at identifying and treating the sickest members of their communities. Insurance companies are tracking patient data as part of new schemes to reward doctors financially for keeping people well.

In point of fact, “Big Data” is used to cover a wide range of disparate activities enabled by information technology, whether they involve sifting through hundreds of millions of records or only a few thousand. It includes the “hot spotting” of frequent emergency-room users innovated by Dr. Jeffrey Brenner in Camden; a hospital workflow that makes sure diabetes patients get scheduled blood tests; a mapping project by Princeton economist Janet Currie that shows how home foreclosures lead to increased hospital admissions; and a smartphone app that lets users look up product recalls, among many other efforts.

Big data boosters say the field has great promise, with the potential to focus limited resources in ways that will improve the quality of patients’ lives, prevent needless deaths, and cut costs. At the same time, the productive use of data and analytics still faces a number of challenges, some of them unique to healthcare.

Privacy, in particular, is a concern. Current privacy laws often hamper research. Yet some of the most cutting-edge public health research efforts and commercial ventures seek to “mash up” multiple sets of health records. This can put patients’ information to uses they never envisioned, employing it in ways that make people uncomfortable.

A variety of solutions have been proposed for different kinds of privacy challenges, ranging from updated state and federal legislation to computer systems that allow data to be queried without revealing the subjects’ identities.


Healthcare organizations and researchers have been collecting and analyzing computer data for decades, but big data has gained currency as a buzzword only in the past two to three years. Experts refer to a new “volume, variety and velocity of data” that has resulted from the automated or large-scale collection of information — for example, from a wearable heart monitor — that allows real-time tracking and response.

Dr. Farzad Mostashari, the former national coordinator for health information technology at the U.S. Department of Health and Human Services, cited an early instance of relatively small “big data” from his work detecting disease outbreaks in New York 15 years ago.

While working for the Centers for Disease Control, he learned about the fire department’s records of ambulance calls, which were categorized by the problem described by the caller. While the information was scientifically unreliable “dirty data,” in the aggregate it showed “beautiful” patterns, like increases in respiratory calls at certain times.

The data turned out to reveal surges in flu cases well before individual doctors could become aware that something unusual was happening, Mostashari explained during a big data conference at Princeton University earlier this year.

“That was kind of my first exposure to this idea that you could take data, which is now electronic, because we had some sort of transactional system — and the data is being collected for some totally other purpose, right, to dispatch an ambulance — but if you could reuse and repurpose it and look for patterns within it, it might be useful,” he said.

At the very least, ambulance-call data could serve as an early-warning system, allowing hospitals to prepare for higher patient volume and public officials to broadcast advice on how to avoid getting sick. But for Mostashari and many others, the greater goal of big data work is prediction. They want to know who is likely to get sick, weeks or months in advance, so that interventions can be put in place and tested for effectiveness, and causes of illness can be studied in detail.

Predictive analytics is in its infancy and its long-term utility is unclear. At the clinical level, the term has been used to describe systems that monitor a premature baby’s vital signs and give earlier warnings of a new infection, for example. In the future, a computer might automatically adjust the baby’s medicine without a nurse’s intervention.

Danish Researchers Supersize Big Data, Analyze Nation’s Full Patient Registry
Working with medical records for more than 6 million people, Danish scientists uncover unknown disease patterns that could ultimately improve healthcare worldwide
In the United States, researchers can only dream of the ultimate health database: one that contains complete electronic records spanning decades for all Americans, allowing analysis of long-term patterns of illness.

A number of organizations are also researching ways to predict and prevent hospital readmissions, which are used as a measure of health quality. Providers with high readmission rates can be penalized by Medicare.

Geisinger Health System in Pennsylvania, an innovator in the advanced use of data, has studied the characteristics of readmitted patients and identified risk factors such as pulmonary disease, heart failure, and advanced age. Among patients with those factors who were also admitted in the previous year, fully half will die or end up back in the hospital within 30 days of being discharged, according to Dr. Jonathan Darer, Geisinger’s chief innovation officer.

But though Geisinger uses staff calls, robocalls, and home health visits to monitor certain sets of newly discharged patients, the organization is so far not using its findings on readmissions in a meaningful way, Darer said during a recent NJ Spotlight webinar on big data. It continues to analyze a long list of variables, including the patient’s home situation and other factors, in an effort to refine its predictive power.

Meanwhile, Brenner, who has won plaudits and awards for pioneering uses of patient health records, criticizes health IT advocates whom he calls “obsessed” with prediction. Instead of focusing on possible future illness, he says, healthcare organizations should get better at surveillance, drilling down into data and building systems that alert them to current patients’ problems.

“So we want to know, ‘Tell me which person is going to be hospitalized three months from now so I can call them on the phone.’ Meanwhile, the hospital is full of sick people who’ve been back over and over and over,” Brenner said during the Princeton conference. “Or, this month there’s a woman in Camden who’s been to the emergency room three times for sexually transmitted disease. No one is going to call her, no one is going to follow up, her primary care provider is unaware of it. So that’s a failure to surveil data.”

Brenner is best known for treating poor, chronically ill “super-utilizers” who generate astronomical medical costs. His organization, the Camden Coalition of Healthcare Providers, identifies them by looking at maps of ambulance calls or hospital admission records, or simply by asking doctors. Nurses and social workers visit those people and find out what they need — reminders to take medications, drug rehabilitation, or better housing, for example — and make sure they get it rather than repeatedly going to the emergency room for help.
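
The record-based side of this kind of hot spotting can be pictured as a simple count of visits per patient. The following is only a minimal sketch under stated assumptions, not the Camden Coalition's actual method: the function name `find_frequent_users`, the threshold of three visits, and the sample log are all hypothetical.

```python
from collections import Counter

def find_frequent_users(visits, min_visits=3):
    """Return (patient_id, visit_count) pairs for patients with at least
    `min_visits` emergency-room visits, most frequent first."""
    counts = Counter(patient_id for patient_id, _visit_date in visits)
    return [(pid, n) for pid, n in counts.most_common() if n >= min_visits]

# Made-up visit log of (patient ID, visit date) pairs. Not real data.
sample_visits = [
    ("patient-a", "2014-09-02"),
    ("patient-b", "2014-09-03"),
    ("patient-a", "2014-09-11"),
    ("patient-c", "2014-09-15"),
    ("patient-a", "2014-09-27"),
    ("patient-b", "2014-09-29"),
]

print(find_frequent_users(sample_visits))  # [('patient-a', 3)]
```

A list like this would only be a starting point; as described above, the follow-up work is done by nurses and social workers, not by the query.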

Mostashari cited a similar effort at a San Diego hospital system that received a grant from the federal Beacon Community program to make better use of information technology. He said it achieved $8 million in savings by focusing on just 32 high-cost patients, including one woman who was continually calling for ambulances, according to the system’s records.

“They’d had 100 ambulance dispatches going to her house, and not a single transport,” he recalled. “No one had stopped to say, ‘And what happens when you go to her house?’ They said, ‘Usually we make her a sandwich.’ So they got her Meals on Wheels. It’s a lot cheaper than scrambling a rig.”

Beacon hospital and others have also succeeded in improving health outcomes by installing and exploiting better communication and records systems. These may let ambulances send information about a patient ahead to the hospital, or keep a primary-care doctor in the loop when a patient sees another provider or visits the ER.

Such improvements are essential for the new accountable care organizations, or ACOs, that have sprung up since the passage of the Affordable Care Act. Hospitals and doctors in ACOs are paid for making sure members of their community undergo scheduled tests and stay well, particularly people with chronic conditions. Such systems require electronic health records, which often can be configured to send alerts to doctors, nurses, or even patients when gaps in care arise.

Digitizing ‘Bundles’ of Medical Procedures To Ensure Patients Get Complete Care
Geisinger Health System built a computer-based system that alerts nurses and other health practitioners when patients need to come in for tests, reducing so-called care gaps
Geisinger Health System began digitizing health records at its hospitals in rural Pennsylvania in the mid-1990s, well before most other providers. The system, which includes both providers and health plans, then created bundles of clinical care processes — a set of steps for every patient with a particular medical condition — and used its electronic records database as part of a reengineered workflow to make sure every step was followed.

At Geisinger, doctors design care bundles for target populations, such as people with diabetes. A bundle includes specific items — vaccinations, blood-pressure readings, and glucose tests, for example — that nurses order up, or that the computer automatically turns into work orders for providers. In population after population — people with diabetes, coronary disease, osteoporosis, and other conditions — the system has resulted in better patient outcomes, Darer said.
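
One way to picture how a care bundle turns into work orders is a check of each bundle item against the date it was last done. This is a rough sketch, assuming a bundle can be expressed as a maximum interval per item; it does not describe Geisinger's software. The bundle contents, the intervals, and the names `EXAMPLE_BUNDLE` and `care_gaps` are invented for illustration and are not clinical guidance.

```python
from datetime import date

# Hypothetical bundle: item -> maximum number of days allowed between occurrences.
EXAMPLE_BUNDLE = {
    "glucose_test": 180,
    "blood_pressure": 90,
    "flu_vaccination": 365,
}

def care_gaps(last_done, bundle, today):
    """Return the bundle items that are overdue or have never been recorded."""
    gaps = []
    for item, max_days in bundle.items():
        last = last_done.get(item)
        if last is None or (today - last).days > max_days:
            gaps.append(item)
    return gaps

# Made-up patient history: the last date each item was completed.
history = {
    "glucose_test": date(2014, 1, 15),
    "blood_pressure": date(2014, 9, 1),
}

print(care_gaps(history, EXAMPLE_BUNDLE, today=date(2014, 10, 20)))
# ['glucose_test', 'flu_vaccination']
```

In a workflow like the one described, each returned item would become an alert or a work order for a nurse or other provider.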


Beyond the clinical setting, careful analysis of large datasets can also reveal global patterns of disease and help policymakers decide how to channel resources.

Optum, a leading health analytics firm, has done large-scale hot spotting for a number of states, including Maryland, which has been working to make its Medicaid program more efficient. For example, Optum discovered a high rate of emergency-room admissions for colds, a relatively minor illness, and found that one hospital accounted for most of the visits, said Dr. Lewis Sandy, the senior vice president for parent company UnitedHealth Group.

With that information, the state could encourage the hospital and those patients to manage their colds using less expensive alternatives to the emergency room.

In New Jersey, the company created a statewide map, down to the level of census tracts, showing the prevalence of diabetes. That could be used to identify problems such as food deserts, where healthy food is hard to find, and drive improvements in programs like Medicare and Medicaid, Sandy said.

“It’s not just data from the healthcare delivery system. You can actually use data from personal health records, patient surveys, from publicly available data, for example, from the U.S. Census, or from other government programs,” Sandy said during the NJ Spotlight webinar. “This information can be brought together to bring knowledge and insight to improve public health programs.”

At the cutting edge of big data mashups, developers combine public data with mobile devices to show where health problems are happening in real time.

To help people with respiratory conditions, the company Propeller Health created a device that attaches to an inhaler and uses publicly funded GPS signals to record where and when it is used, giving the patient a precise electronic record. In addition, officials in Louisville, Kentucky used the aggregate data to map out the worst locations for respiratory problems in their city and to examine how they corresponded to environmental factors. They then redeployed city resources to reduce air pollution.

Aneesh Chopra, the former chief technology officer for the federal government, cited the Louisville trial as an example of a project that can illuminate a health problem by generating and drawing on multiple sources of data.

“From a mathematical standpoint, the value of data isn’t one source itself — ‘Hey, this is a GPS source.’ It’s the mashup of multiple sources,” Chopra told the audience at the Princeton conference. “Adding one more data source on your proprietary data source doesn’t create value in a linear fashion, but actually creates value in an exponential fashion. So keep thinking about ways you can enhance or enrich your data with external data that is increasingly open.”

Greater openness about cost data is the goal of another growing movement within the healthcare sector. Insurance companies, either voluntarily or under legislative mandate, are increasingly releasing data on the actual amounts patients pay for different medical procedures, as well as measures of their outcomes.

More than a dozen states have created or are creating all-payer claims databases (APCDs) so they can better understand the costs and quality of their healthcare systems. At the national level, three large insurers have given the Health Care Cost Institute cost data that consumers will be able to search using an online tool, and the organization recently won access to national Medicare claims data. Several universities have licensed the massive HCCI dataset so their faculty and students can use it for big data research.


The ubiquity of electronic data collection and the power of high-speed computer analysis have created a remarkably rich resource for innovation, but have also challenged established notions of privacy and even the definition of health data.

A frequently cited example of the dangers of big data comes from a New York Times article about analytics efforts at Target, the department store chain. By analyzing purchases, the company can determine with fairly high certainty if a customer is pregnant, and then will send her coupons for baby-related products. In the widely publicized incident, an angry father complained that the store should not be sending his teenage daughter such advertisements, only to apologize later after he learned she actually was pregnant.

Target did not have access to the young woman’s medical records, but did have her purchase history and potentially a wealth of financial, demographic, and other information obtained from data brokers and public sources. It was thus able to discern facts that had been known only to the woman and possibly her doctor.

“A big part of the big data project is not just analyzing information, it’s creating information,” Julie Brill, a commissioner of the Federal Trade Commission, said at the Princeton conference. “From innocuous retail purchases, health information is created.”

Brill said the federal government set up rules to protect consumers’ confidential information in the 1990s through the Health Insurance Portability and Accountability Act (HIPAA), the Fair Credit Reporting Act, and other legislation, but the laws do not address a newer generation of companies and products that have sprung up since then.

Brill raised the spectre of companies using proprietary and public data to learn whether a specific individual has diabetes, cancer, or mental illness, possibly when the person is ignorant of his or her condition. Using information on car ownership and other data, search firms have guessed that families are obese or diabetic and asked them to join medical trials, discomfiting some of those contacted, she said. Wearable devices like Fitbits record a user’s physical activity, but the person may not have complete control over how the data is used.

Even as privacy experts warn about gaps in legal protections, others chafe at existing restrictions that require patients to give explicit consent for most uses of their personal health information.

Janet Currie is the Henry Putnam Professor of Economics and Public Affairs at Princeton University and the Director of Princeton’s Center for Health and Well Being.
Princeton’s Currie has mashed up data from different sources to gain new insights. In addition to analyzing the relationship between foreclosures and poor health, she has conducted another study that shows correlations between flu season and premature birth.

In addition, the requirement that researchers use only deidentified data, from which names and other details have been removed, makes it difficult to do longitudinal studies that track super-utilizers or to review the effects of a drug over time, said Joel Cantor, director of the Center for State Health Policy at Rutgers.

For such projects a researcher needs to have all the hospital admission data for each particular person being studied. The deidentification efforts required exceed what understaffed and underfunded state agencies can do, he said.

“They said they no longer have the capacity to do that. We’re asking for too much,” Cantor said during the Princeton University conference.

A number of reforms and new systems have been suggested to ameliorate both privacy gaps and access problems.

To respond to concerns about how patient data is used, “baseline privacy legislation” is needed at the federal level, Brill said, while acknowledging such laws are not currently in the offing. HIPAA and other legislation could be amended to recognize that health data exists in places beyond clinical and insurance databases. Sound data management practices, risk analysis, privacy officers, and audits must be standard at any firm that handles sensitive information, she and other experts said.

To prevent privacy rules from handicapping data analysis, government agencies could use computer systems that let researchers submit statistical queries and get answers without possessing the data, said Edward Felten, a professor of computer science and public affairs at Princeton. Brill said the U.S. Census uses such a system and could serve as a model for others.
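
The idea of answering statistical queries without handing over the records can be sketched as a service that returns only aggregate counts and withholds any count that is too small. This is an illustration under stated assumptions, not a description of the Census Bureau's or any agency's system: the class `AggregateQueryService`, the minimum cell size of 5, and the sample records are all hypothetical.

```python
class AggregateQueryService:
    """Holds records privately and answers only count queries."""

    def __init__(self, records, min_cell_size=5):
        self._records = list(records)        # never returned to callers
        self._min_cell_size = min_cell_size  # hypothetical suppression threshold

    def count(self, predicate):
        """Return how many records match, or None if the count is too small to release."""
        matches = sum(1 for record in self._records if predicate(record))
        return matches if matches >= self._min_cell_size else None

# Made-up records: 12 people aged 30 to 41, alternating diagnoses. Not real data.
records = [
    {"age": 30 + i, "diagnosis": "asthma" if i % 2 == 0 else "diabetes"}
    for i in range(12)
]
service = AggregateQueryService(records)

print(service.count(lambda r: r["diagnosis"] == "asthma"))  # 6
print(service.count(lambda r: r["age"] > 39))               # None (only 2 matches)
```

Suppressing small counts alone does not stop someone from combining several answers to single out an individual, so a production system would need stronger protections than this sketch shows.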

As for the use of personal information outside of traditional healthcare settings, Chopra argues for engaging patients and teaching them how to view their own data. He advocates a control-panel model in which people are encouraged to actively decide how they want their data used and can easily opt out of giving access.

Others argue more aggressively for releasing data, while using institutional review boards or ethics review committees to weigh the potential benefits and risks.

A recent Health Affairs article on ethical concerns in predictive analytics said patients should be included in the early stages of big data projects, but developers also “should be allowed to use already collected patient data without explicit consent, provided that they comply with federal regulations regarding research on human subjects and the privacy of health information.”

Meir Rinde is a freelance writer based in Philadelphia.

