Whose Data Is It, Anyway?
Jennifer Hinkel
Nov 14 · 9 min read
A look into Project Nightingale, the privacy of health care data, and why this is a potentially big issue
So, what’s the huge deal about “Project Nightingale” and patient data privacy?
The Wall Street Journal broke the news on Monday that Google is working with Ascension Health to aggregate and analyze patient data across Ascension’s health care delivery system. The initiative is called “Project Nightingale.”
Google is a household name. Ascension may not be, but it owns hospitals and doctors’ offices in more than 20 states, is the US’s largest not-for-profit health system, is the world’s largest Catholic health system, and generates enough revenue to pay its CEO about $18M a year. Both of these companies are behemoths. Neither Ascension’s patients nor its clinical professionals were informed that their data would be shared with or potentially used by Google, and the public revelation of this project has ruffled more than a few feathers in the data privacy world. According to reports, the data includes patient names and other identifiable information. Even when health care data is “blinded” or anonymized, a record can often still be identifiable, particularly in the case of a rare diagnosis or a specific health history in a given geographic area.
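To make the re-identification point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the records are invented, the field names (`zip3`, `birth_year`, `diagnosis`) are hypothetical, and nothing here reflects how Ascension or Google actually store or process data. It simply counts how many "anonymized" records share the same combination of indirect details. Any record that is alone in its group can, in principle, be matched back to one person.

```python
from collections import Counter

# Invented example records with names already stripped ("anonymized").
# Field names are hypothetical and chosen only for this illustration.
records = [
    {"zip3": "631", "birth_year": 1980, "diagnosis": "type 2 diabetes"},
    {"zip3": "631", "birth_year": 1980, "diagnosis": "type 2 diabetes"},
    {"zip3": "631", "birth_year": 1980, "diagnosis": "type 2 diabetes"},
    {"zip3": "631", "birth_year": 1975, "diagnosis": "hypertension"},
    {"zip3": "631", "birth_year": 1975, "diagnosis": "hypertension"},
    {"zip3": "372", "birth_year": 1962, "diagnosis": "Erdheim-Chester disease"},
]

# Indirect details that, taken together, can point to one person.
QUASI_IDENTIFIERS = ("zip3", "birth_year", "diagnosis")


def group_sizes(rows, keys=QUASI_IDENTIFIERS):
    """Count how many rows share each combination of quasi-identifiers."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def unique_records(rows, keys=QUASI_IDENTIFIERS):
    """Return rows whose quasi-identifier combination appears exactly once."""
    sizes = group_sizes(rows, keys)
    return [row for row in rows if sizes[tuple(row[k] for k in keys)] == 1]


def k_anonymity(rows, keys=QUASI_IDENTIFIERS):
    """Size of the smallest group; k == 1 means someone stands alone."""
    return min(group_sizes(rows, keys).values())


if __name__ == "__main__":
    print(k_anonymity(records))          # 1
    print(len(unique_records(records)))  # 1
```

In this toy data set, the single patient with a rare diagnosis is the only record in their group, so removing the name did little to hide them. The common diagnoses sit in groups of two or three and are harder to single out.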
Many of Ascension’s facilities are not explicitly branded as Ascension, so you may have seen a doctor at an Ascension-owned facility without realizing it belonged to this large conglomerate.
Yes, some of your health care data may have been swept up in this project.
Wait, was this legal? What about HIPAA?
Before you shout, “But doesn’t HIPAA, or some other patient data privacy law, prevent this?”, the answer is: not really.
HIPAA (the Health Insurance Portability and Accountability Act, enacted in 1996) is perhaps one of the most poorly understood American health care laws; even many employees of health systems and medical offices don’t really understand its purpose and what it prohibits or allows. The “P” in the law does not stand for privacy; it stands for portability. HIPAA does govern several aspects of patient data privacy, but one of its provisions allows “covered entities” (such as a hospital or medical office subject to the law) to share data with Business Associates, which are companies that may help the covered entity carry out its health care functions.
Ascension and Google assert that Google, as a Business Associate of Ascension, can access the data to help Ascension carry out its functions of taking care of patients, as well as its business objectives (these could be as broad as lowering health care costs, or managing insurance and bills). This is likely a correct interpretation of the law, although the US Department of Health and Human Services (HHS) has already opened an inquiry to determine whether the law was interpreted correctly in this case and whether Google and Ascension have done anything untoward. Consumers, patients, and clinical professionals may not realize that health systems frequently share data with all manner of Business Associate entities, from electronic health record (EHR) companies and analytics companies to consulting firms and collections agencies. Hospitals and medical offices typically don’t employ their own data analysts or statisticians, so there are strong reasons for this exception to data privacy to exist.
Despite all of this, patients are rarely, if ever, informed that their data may be traveling into the hands of these different entities. Furthermore, even though these Business Associate entities are also bound to protect the data, it’s frequently difficult to determine whether their systems might have been breached or compromised. Data that Business Associates touch does not have to be anonymized; they might have access to everything.
Seriously? HIPAA doesn’t cover this?
Keep in mind that HIPAA was drafted in the mid-90s. That was long before many of today’s issues in technology, surveillance, and data privacy (from wearables to consumer-available genetic testing to cloud-stored data) even existed. At the time, a hospital transferring data to Business Associates likely meant operating within a closed network, or possibly handing over floppy disks or other physical media.
Before the days of Big Tech and consumer data aggregation, when HIPAA was put into action, it would have been very hard to imagine what the current data world would look like. After all, most hospitals and medical offices ran on paper records until the 2010s!
Does HIPAA restrict what Business Associates can do with data? What are the implications of this arrangement and its legality?
HIPAA doesn’t give Business Associates free rein. This is where Google could, hypothetically, wade into rough waters if it doesn’t get a clear legal opinion, and health care lawyers may disagree on what Google can do with the data. The issues at hand cover not only what HIPAA intends (or intended, in an age prior to AI and machine learning), but also how patient data itself might be viewed. Is this data, in and of itself, a valuable and proprietary asset that patients and their clinicians have generated and should own rights to? Or are patient records and health outcomes data merely the “exhaust” generated by the day-to-day work that hospitals and doctors do to take care of patients?
The usual interpretation of HIPAA is that Business Associates are allowed to use the shared data only for business purposes on behalf of the covered entity. Translated for this case, that would mean that Google can use the data to do things for Ascension’s usual business and health care purposes.
The law is also typically interpreted to say that the Business Associate can’t use the data for its own independent business purposes. Under this interpretation, Google would need to firewall this data within its organization so that it can only be used for Ascension.
What other activities might Google want to use the data for, but might not be able to under HIPAA restrictions? (The WSJ produced this handy video on why Big Tech may want your health care records.) Google may not be able to legally use data from Ascension’s patients to train Artificial Intelligence (AI) or Machine Learning (ML) algorithms that might then be ported beyond Ascension’s systems. If AI algorithms were developed based on Ascension patient data, those would potentially have to remain specific to Ascension.
Likewise debatable is whether Google could widely apply what it learns through Nightingale to other initiatives within Google or via its other health-related subsidiaries, such as Calico or Verily. Google also works with a number of other health care systems, including Mayo Clinic; in an ideal world, it might want to scale the learnings from any one of these systems across all of its health care customers, but if those insights are based on HIPAA-controlled data, it may be prohibited from doing so.
Lastly, Google may want to link health care data from Ascension or other parties to its rich trove of individual data gathered through its many other products. Other Big Tech players such as Facebook have already been questioned about how health-related data that users assume is private might be connected to other data and leveraged for advertising or monetization. While the insights that could come from linking health data to consumer data might be transformative, the issues around privacy, consent, and the clinical decision-making implications of relying on such data are thorny indeed.
So is Project Nightingale a Good Thing? Or a Bad Thing?
That depends on your perspective.
In concept, using high-quality, real-world data to power robust insights around health care delivery is undeniably positive. Most of the interventions we prescribe in health care are informed by clinical trials, which are performed in relatively controlled environments. As a result, the studied populations may not be representative of the population at large, or the clinical outcomes may not play out the same way “in the wild” as they did in a study. Access to “Real World Evidence” (RWE) is increasingly desirable for researchers and front-line clinicians alike. And yet, data held in EHRs may not be complete enough, high-quality enough, or representative enough to generate scientifically valid insights.
When it comes to protecting patient data, Project Nightingale raises some significant flags, regardless of strict legality. For instance, if patient data is going to be monetized, patient advocates would assert that some returns should accrue back to the patients whose data made such revenues possible. In an era when health systems can potentially profit from patient data while simultaneously pursuing legal action against patients who can’t pay their medical bills, advocates are clamoring for equilibrium. These sorts of revenue-generating tactics translate into especially poor optics for non-profit or faith-based systems.
Data brokers then aggregate this deidentified health information and sell it to third-party buyers; for example, Adam Tanner of the Harvard Institute for Quantitative Social Science estimates that a large pharmaceutical company might pay between $10 million and $40 million per year for data, consulting, and services from Iqvia alone.
What do patients want? What do we actually do with all of this data?
Patients increasingly want to know how their data is being used and to maintain some degree of control over that data. This issue has been exacerbated at a time when so many companies, particularly in the tech sector, have built business models on aggregating consumer data, and are now moving into the health care space, where the data points describe our most private and personal matters.
Health care data privacy and control, as an issue, is further complicated by myriad stories of patients struggling to obtain their own records, or, perhaps more troubling, discovering personal health records that contain wildly incorrect data.
EHR data, whether individual or aggregated, is often not correct, representative, or complete. Carried over to the algorithmic learning models that power AI, this means that algorithms trained on spotty EHR records could end up leading clinicians in the wrong direction by offering conclusions that are incorrect, yet appear to be derived from “data science”. These sorts of recommendations could be misinterpreted as more conclusive than clinical trials because of the sheer volume of RWE, when in fact they are far less conclusive because they lack scientific controls and statistical tests of significance.
Was there a better way to do this?
Even if this arrangement is perfectly legal, Ascension and Google could have taken a few steps to avoid poor optics. They could have proactively engaged with patient groups and privacy experts to galvanize support for some potentially cool advancements in clinical science and RWE, but instead they both took a slightly more problematic, if easier, road.
Studies evaluating patients’ willingness to share data indicate positive attitudes towards sharing; by and large, patients are willing to share if they think the data will contribute to new scientific discoveries, improvements in the health care system, or lower health care costs, even if those advancements will not benefit the specific patient, but rather society at large or patient groups with the same disease. Patients are more hesitant to share if their privacy could be compromised, or if the data will be used for commercial purposes. That’s mostly great news for research, and it leads us to wonder why Ascension wasn’t upfront with its patients and employees, allowing them to opt out if they felt uncomfortable with Ascension and Google’s arrangement.
Likely, the answer just comes down to the administrative hassle that would have been involved in informing patients and allowing them to opt in or out. Many health systems still communicate with their patients mainly through snail mail. Some have online portals, but few use email directly, ironically because ordinary email is generally not treated as HIPAA compliant. (Keep in mind the 1996 timing of the law, and that while email often can’t satisfy HIPAA, many fax machines can.)
Although Google is perhaps one of the strategic partners best equipped to undertake the type of analytic tasks Ascension likely seeks, it also (like many Silicon Valley companies) has an imperfect track record when it comes to transparency about data utilization. One comfort to concerned patients, however, may be that Google is a best-in-class company when it comes to data security, so as long as Google isn’t directly selling or sharing that data beyond its servers, the data is unlikely to be compromised by third parties.
So, what happens next?
We’re in a weird time when it comes to data ownership, privacy, aggregation, and control. Some would even say there’s a secret war going on for your data. On one side stand technology and information giants whose entire business models are predicated on the ability to aggregate personal data and monetize it. On the other side stand advocates for data privacy and ownership, regulations such as the EU’s GDPR (General Data Protection Regulation) and California’s CCPA (California Consumer Privacy Act), and companies that are building new tools to put data tracking, control, and monetization in the hands of individuals and data generators instead of data aggregators.
As of right now, we can’t predict entirely how this will play out, but the media attention on Project Nightingale and the volume of high-profile responses are hinting at where battle lines might be drawn.