According to the Johns Hopkins University COVID-19 Dashboard, by mid-August 2020, nearly 5.5 million cases of COVID-19 and almost 175,000 coronavirus-related deaths have been reported in the United States. Of almost 73 million tests, 7.8 percent have come back positive. Similar data are provided by the COVID Tracking Project, a data dashboard set up a few months ago by The Atlantic magazine to inform the public about relevant coronavirus developments. Universities and other private entities have established COVID-19 tracking systems in part to fill a vacuum in data created by official U.S. government sources.
While many assume that the Centers for Disease Control and Prevention (CDC) tracks infection and mortality statistics with precision, the truth is that the agency does not. Flawed statistical models and outdated, incomplete data characterize the CDC universe. Too often, data is phoned or faxed in, resulting in delays in tracking and follow-up, and errors in transcription. Likewise, reports and their information vary across states and reporting entities. Once data is reported, it remains siloed in more than 100 CDC reporting systems, giving decision-makers little on which to base informed decisions.
Despite multiple laws enacted in 2006, 2013, 2019, and 2020 that require it to do so, the CDC has yet to implement a modern and uniform information-management and reporting system to help guide policymakers in decisions about where to deploy resources, whether to shut down entire health systems and businesses, and when to re-open schools. Perhaps worse, reporting is too often a one-way street—from health care workers to public health officials, but not the other way. This means that clinical knowledge, patient data, and best practices are rarely shared promptly and at the point of care with frontline health workers so they can treat patients more effectively. This failure has harmed the national and global response to the pandemic by limiting our effectiveness in preventing infections and controlling disease spread.
Recognizing these shortcomings and their implications for pandemic response, the Trump Administration recently directed hospitals to report daily information on test results, available hospital beds, ventilators, how many patients are being treated for COVID-19, COVID-19 deaths, and available supplies directly to the state or to a government contractor—not the CDC. The Department of Health and Human Services (HHS) also contracted private firms to curate and amass the data into a manageable dashboard so it could better understand where supply shortages or capacity problems are emerging, something the CDC data was not showing adequately. This stopgap measure is intended to reduce reporting burdens and address the information crisis that the CDC created, although senior Administration officials have indicated that they intend to return this responsibility to the CDC, once it improves its system.
This Backgrounder lays out recommendations to address these challenges by establishing a system that collects, analyzes, and disseminates privacy-protected data in forms that are usable to frontline health care workers, policymakers, and public and private researchers. To be effective, data must be available in a standardized and timely manner across different reporting and management systems. Ultimately, HHS must take leadership and quickly ensure a solid information system is built to inform public health and clinical responses.
The CDC’s Flawed Data Collection and Dissemination
Any information system is only as good as its data. The CDC collects and uses key data gleaned from reports by more than 3,000 public health offices and hundreds of thousands of nurses, doctors, hospitals, and pharmacists. The U.S. lacks standard data on public health, including the coronavirus, and the data are not reported uniformly across providers, cities, counties, and states, typically due to differing state rules. The data may be reported in real time, daily, weekly, monthly, or never. It may be reported through a doctor’s electronic health record or a web portal or, more commonly, by fax, e-mail, or phone.
Data critical to responding to the pandemic include case reports (suspected or confirmed), demographics, mortality rates, positive and negative test results, immunization status, prescription history, and health system capacity and utilization. There are other important data points, but these are essential to producing a picture of disease spread, intensity, severity, and resource supply and use, and for informing a clinical response. Following are descriptions of these information elements—and where the gaps lie today:
Case Reports. COVID-19 is a reportable disease, meaning that cases must be reported to local, state, and national authorities when doctors or laboratories diagnose them. These case reports are intended to supply key information on patients, including demographic, clinical, and epidemiologic characteristics; exposure and contact history; and courses of clinical illness and care received. Because each state and territory has its own laws and regulations defining which diseases are reportable, data can be variable and incomplete. Almost 90 percent of the COVID-19 cases included in the CDC’s March report “lacked any data about underlying health conditions such as diabetes or chronic lung disease, and 75 percent lacked information about hospitalization.” Because the case report is not automatically transmitted from an electronic health record or pharmacy management system, busy frontline workers fill out CDC forms by hand and submit them by phone, fax, or e-mail. This process can take up to 30 minutes to complete and submit, potentially at the expense of spending time with patients. Less than 10 percent of the reports are submitted in a format that can be easily accessed and readily used.
Demographics. The CDC recently testified that states “have improved the completeness of their [case report form] reporting in the past two months; in particular, the percentage of reports that include race/ethnicity data has increased from 18 percent in April to 43 percent in early June.” This means that more than half of reports still lack the minimum information required by the CDC.
Mortality Rates. COVID-19 death counts are based on mortality data in the National Vital Statistics System. Actual death counts reported by the CDC can lag by an average of one to two weeks based on when a death certificate is submitted and coded by the National Center for Health Statistics. Delayed reporting of mortality data is a long-standing problem and may be exacerbated by coroners who do not file in a timely manner. Other challenges with mortality data include COVID-19 being misclassified as pneumonia or influenza, meaning that reported deaths from COVID-19 may be understated or otherwise misleading. Guidance from the CDC provides flexibility in reporting the actual cause and contributing factors to death. For example, a person with chronic lung disease who also has COVID-19 and dies may have died from either cause. CDC guidance is that the chronic condition be listed as a contributing factor to death, and COVID-19 as the primary cause.
Some have also speculated that health systems may be incorrectly categorizing heart ailments, cancer, or age-related natural causes of death as COVID-19-related to gain additional payments from Medicare and government funding. Congress is looking into this issue through the House Select Subcommittee on the Coronavirus Crisis, while the Administration has set up a committee of Inspectors General, the Pandemic Response Accountability Committee, to handle investigations into fraud, waste, and mismanagement, and to ensure that hospitals are not gaming their reporting to gain greater reimbursement.
Test Results. Public health, commercial, and clinical laboratories report the number of specimens tested and the number of positives. Of the 20 million lab reports received by health departments each year, 80 percent are electronic from 44 states. Nationally, about 80 percent of coronavirus test results do not include demographic information, and half do not have addresses, limiting their utility. The recently enacted Coronavirus Aid, Relief, and Economic Security (CARES) Act requires the reporting of both positive and negative lab results. In addition, the CDC is now suggesting, but not requiring, that data be reported using existing electronic health records technology. This is an advancement, to be sure, but electronic reporting has been uneven, with smaller, hospital-based labs faring worse than their bigger counterparts. AIMS, a messaging platform built by the Association of Public Health Laboratories in 2008, is a bright spot. It transmits lab results electronically to public health offices at the federal, state, local, territorial, and tribal levels. But not all entities use the AIMS platform to transmit lab results electronically, and results may still be submitted using an antiquated CDC spreadsheet format. Additionally, patient name, street address, date of birth, and ordering-provider information are not required as standard data elements, making follow-up contact from state officials or doctors more difficult.
Immunization Status. The HHS has allowed pharmacists to test for the presence of the virus, as well as for antibodies, during the public health emergency. In addition, pharmacists often administer immunizations, such as flu shots, in states that allow them to do so. Pharmacists administer more than 6 million influenza and 3.5 million pneumococcal vaccines annually, and combined with doctor-, clinic-, and nurse-administered vaccines, many are reported to immunization registries that compile the information. Some of the information is still manually reported, but some vendors are automating the process. As of 2018, 40 state systems also communicate patient immunization history to a provider upon request. This allows pharmacists and others in those states to see a complete view of the patient’s history and informs next steps in care. While the data is standardized, data latency remains a problem and pharmacy management and electronic health records may not “talk” to each other, making integration slow and tedious. Lack of immunization information complicates pharmacists’ ability to work with patients because current records are not always available to them. According to the CDC, such information currently is kept, inconsistently, via a state-based immunization registry. While federal regulations require electronic health records to provide functionality that allows the exchange of immunization data with other electronic health records, they do not always integrate with other systems, such as those used by pharmacists for administering vaccines. This may become even more problematic once a COVID-19 vaccine is available, because each dose of a multiple-shot vaccine must be linked to the same manufacturer for the same patient.
Prescription History. As with immunization history, pharmacists generally have information on medications dispensed at that pharmacy (or in some cases the chain), but not for those dispensed elsewhere. Today, prescribers can use the National Council for Prescription Drug Programs’ (NCPDP’s) SCRIPT Standard Medication History Transaction to see prescriptions across care settings. Pharmacists can use the NCPDP/HL7 Pharmacist eCare Plan to monitor and report a patient’s medication adherence, health assessments, lab results, and other activities (including medication optimization, medication-risk reduction, and disease-management education). However, few pharmacies can know which customers have been tested for COVID-19, or which patients have been treated with a prophylactic or an anti-viral drug. This increases the risk of unnecessary duplicate immunizations, process inefficiencies and delays, and increased cost and waste.
Health System Capacity and Utilization. Until July 15, the CDC’s National Healthcare Safety Network (NHSN) tracked health system capacity across several different measures. According to a report prepared by Republican staff of the U.S. House Energy and Commerce Committee, the CDC’s NHSN is fragmented and unreliable. The report concluded, “The U.S. does not currently have a unified, comprehensive, and designated national surveillance system specific to COVID-19.”
The HHS took some steps to address this problem recently by launching a system called HHS Protect. According to the HHS, the CDC used to receive data related to COVID-19 from 3,000 of the approximately 6,200 hospitals in the United States. HHS Protect increased that number by an additional 1,100 hospitals.
More recently, White House coronavirus task force head Dr. Deborah Birx announced the Administration’s intention to restore the CDC’s role in collecting COVID-19 data from hospitals.
Congressional Attempts to Reform the CDC’s Data Collection and Dissemination
After 9/11 and the subsequent anthrax attacks, Congress recognized that the CDC’s antiquated reporting system could lead to an ill-informed public health response that could cost lives during a public health crisis. It also recognized that a catastrophic public health situation could cripple the economy, sicken the country, and weaken national security. Access to real-time information was a key component in the congressionally mandated strategy to address emerging threats, including future terror attacks, Zika, Ebola, hurricanes, and opioid abuse.
In 2006, 2013, 2019, and 2020, Congress passed laws directing the CDC to modernize its antiquated and burdensome public health data systems. It directed the agency to create a near-real-time data network for information collection, analysis, and dissemination to help prepare public health offices and frontline health workers to respond to emerging threats.
- Congress passed the Pandemic and All-Hazards Preparedness Act in 2006. The act required the HHS to “establish a near real-time electronic nationwide public health situational awareness capability through an interoperable network of systems to share data and information to enhance early detection of, rapid response to, and management of, potentially catastrophic infectious disease outbreaks and other public health emergencies that originate domestically or abroad.” Data were to be standardized and transmitted electronically according to standards developed and used by the private sector. Congress set a deadline for the network to be completed: two years from the date of enactment, in 2008. In 2010, two years after the HHS missed the deadline, the Government Accountability Office (GAO) reported that the HHS had not taken even basic steps to establish the network.
- Congress passed another law in 2013, the Pandemic and All-Hazards Preparedness Reauthorization Act, and reiterated the mandate for a near real-time, interoperable public health network. The law also required the HHS to submit a strategy to Congress within 180 days of the law’s enactment for establishing the network. The strategy was to identify the measurable steps that the department would take to establish the data-sharing network and modernize public health data collection activities. The law again required the agency to use standards developed by the private sector to facilitate data exchange and reporting, as well as sharing information with frontline health workers. The law authorized $138 million per year for five years for the network. As in 2006, the 2013 law sought to reduce duplication and burdens in reporting. The GAO once again reported on the HHS’s lack of real progress in September 2017, finding that the CDC’s failure to implement statutory requirements means that “it will lack an effective tool for ensuring that public health situational awareness network capabilities have been established in accordance with all of the requirements defined by the law.”
- This now-familiar pattern was repeated once again in 2019 when Congress passed the Pandemic and All-Hazards Preparedness and Advancing Innovation Act. The law reiterated the call of the past two laws and—since the CDC had failed to implement it despite 13 years and two statutory mandates—added the requirement to submit within 18 months an implementation plan with measurable steps and performance benchmarks tied to specific dates on which each step would be implemented, to make sure the network was completed. It also required the CDC to work effectively with the private sector to help build and coordinate the functionality of the data system. Finally, the law requires the CDC to produce an annual budget plan that catalogues resources spent on, gaps in, and strategies to address ongoing inefficiencies in the public health data network. A GAO report assessing progress has yet to be issued, and the hope is it will not be as dismal as the assessments of the first two laws. But when asked by Senator Richard Burr (R–NC) about the CDC’s progress, CDC Director Robert Redfield said that he did not know, and could not confirm, how many specialists had been hired as authorized by the law. According to Senator Burr’s office, “The CDC later confirmed that the agency had hired no new biosurveillance specialists to date.”
- In March 2020, Congress passed the $2 trillion CARES Act, which provided $1 billion for public health data infrastructure modernization. The law required the CDC to develop a modernization plan by April 30 and report the results to Congress. The agency failed to do so.
The common feature of the four laws is that they require the CDC to integrate and improve public health reporting systems, and provide funding to do so, so that the President, his chief advisors, governors, and mayors have the information they need to make decisions. The laws also address the need to create feedback loops and information flows to nurses, doctors, and pharmacists so that they have relevant, timely information when treating patients.
While integration across different reporting systems has long been understood to be challenging, the lesson learned from 14 years of failures is that the CDC has not complied with the law’s requirements, lacks the internal expertise to create such a system, and has not contracted with private experts to quickly establish a robust information management system. A sad truth is that the CDC lacks transparency and is rarely held accountable. Solid oversight, direction, and management also appear to be lacking from the HHS, the CDC’s parent agency.
The HHS Protect Public Health Data Hub: Helpful but Incomplete Reform
In response to the pandemic and the CDC’s foot-dragging, the HHS created the HHS Protect Public Health Data Hub to fill the information gaps created by 14 years of inaction. According to its website, HHS Protect is
a secure platform for authentication, amalgamation, and sharing of healthcare information, so that the U.S. government can harness the full power of data for the COVID-19 response. With Protect, more than 200 disparate data sources are brought together into one ecosystem that integrates data across federal, state, and local governments and the healthcare industry. It provides a holistic view of the U.S. healthcare system so decision makers informed by Protect have information to guide action and save lives with a data-driven COVID-19 response.
While this stopgap measure is important, it is incomplete. For example, the website provides a view of hospital utilization and potential shortages but provides no information on testing and results, or on deaths and recoveries. In addition, results on intensive care and inpatient-bed capacity are estimates based on statistical models, not a complete and precise census.
HHS Protect does nothing to change the game for doctors, nurses, and others who still must enter the data into reports by hand. Finally, because data is pulled from existing CDC data systems and state partners, the information in HHS Protect suffers from the very data latency and inaccuracies that mark the current system. By the CDC’s own admission, it “maintains more than 100 surveillance systems for different uses, which creates a reporting burden and duplication of effort for partners, discrepancies among the data elements, and the need to use multiple information technology (IT) systems.” Fragmentation limits the effectiveness of these systems in collection and analysis of data and in providing actionable information to policymakers, public health officials, and clinical workers so they might make more informed and timely decisions.
Dr. Birx recently stated that the “CDC is working with us right now to build a revolutionary new data system so it [responsibility for collecting COVID-19 data from hospitals] can be moved back to the CDC.”
A New Model
It should not sound like a radical idea, but our primary recommendation is that the HHS implement the law enacted 14 years ago. The department must ensure that the CDC carries out its mission and must aggressively ensure the needed work is done. Specifically, to ensure that the CDC creates and runs the near-real-time, integrated public health data system mandated by Congress, the HHS should develop the needed infrastructure (a standards-based data hub) by issuing a contract for a public-private partnership between the HHS and a private entity with proven experience and ability in technology and information management. Within the next three months, the HHS should release a request for proposal (RFP) to build the system. The HHS should issue a contract shortly thereafter. Creating the system within a year is reasonable, but its implementation should be phased, with near-term deliverables that start with data that are already standardized, interoperable, and available through the reporting structure suggested above, such as case reports, medication history to prevent opioid abuse, and lab data.
We estimate the cost based on similar federal data facilitators, and other countries’ data hubs, to be close to $200 million. Such funds should be taken from the $1 billion in public health data infrastructure modernization provided by Congress through the CARES Act. There will also be ongoing costs for maintaining and upgrading the system, but the savings from preventing an outbreak or pandemic would justify these expenditures.
The new system should include comprehensive, privacy-protected data to provide the best information to support decision-making. It would help to inform best practices. It would significantly reduce burdens on frontline health workers and on state and local public health officials. This will result in a more effective, more efficient system that can better control and prevent disease, thereby drastically improving public health.
Robust Data. A significant amount of data are already collected and used in the public health universe. The HHS should identify the full set of data necessary to ensure a robust view of public health threats and any information needed to mitigate and address those threats. For the coronavirus, such data should at least include hospitalizations and mortality rates, the ages and comorbidities of patients who develop serious illness, and total net active cases. The data should be sufficiently robust to identify confirmed cases by race, ethnicity, disability, and income levels in order to increase understanding of which populations are most susceptible to the contagion. Data should also be geographically specific enough to enable policymakers to identify hotspots so they can adapt mitigation strategies to those unique circumstances. To accomplish this, data should be broken down at the county and zip code levels.
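To make the county-level breakdown concrete, the following is a minimal sketch in Python of how standardized case records might be rolled up into per-county rates and screened for hotspots. The field names (`county`, `confirmed`), the population figures, and the threshold are all hypothetical and chosen only for illustration; they do not reflect any existing CDC or HHS data format.

```python
from collections import Counter


def cases_per_100k(case_records, population_by_county):
    """Count confirmed cases per county and scale to a rate per 100,000 residents."""
    counts = Counter(r["county"] for r in case_records if r.get("confirmed"))
    return {
        county: 100_000 * counts[county] / population
        for county, population in population_by_county.items()
        if population > 0
    }


def hotspots(rates, threshold_per_100k):
    """Return the counties whose rate meets or exceeds the chosen threshold."""
    return sorted(c for c, rate in rates.items() if rate >= threshold_per_100k)


# Made-up illustrative data; field names and values are assumptions.
records = [
    {"county": "A", "zip": "00001", "confirmed": True},
    {"county": "A", "zip": "00002", "confirmed": True},
    {"county": "A", "zip": "00002", "confirmed": False},
    {"county": "B", "zip": "00003", "confirmed": True},
]
population = {"A": 1_000, "B": 10_000}

rates = cases_per_100k(records, population)
print(hotspots(rates, threshold_per_100k=100))
```

The same roll-up could be keyed on zip code instead of county; the point is only that such a calculation becomes trivial once records arrive with uniform geographic fields.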
An important goal in this effort should be to identify all the needed discrete data elements, and then prioritize the standardization of any data that remain undefined and not computable. Congress should establish a committee of public and private experts to identify any data that lack standardized fields, elements, vocabulary, or transport standards. Recommendations for improvements and prioritized data elements and standards, including privacy, should be made within 60 days of the committee’s formation. The HHS should adopt these standards no later than 90 days after receipt.
Senator Rick Scott (R–FL) recently introduced legislation along these lines. Enactment of such a process would ensure not only that data are standardized, but that the process is completed quickly to address the current pandemic and prepare for new threats. It would also ensure that the system is built according to existing marketplace standards for sharing and using data.
Privacy Protection. While reporting should remain robust, the new data hub should protect the privacy of personal health information. Existing federal privacy provisions in the Health Insurance Portability and Accountability Act (HIPAA) strike that balance by allowing public health authorities access to personal health information for the purpose of disease surveillance and contact tracing, while imposing civil and criminal penalties on those who improperly use or disclose personal health information. These penalties apply both to organizations and individuals. They escalate based on whether a violation was inadvertent or willful and can result in fines of up to $1.5 million and imprisonment for up to 10 years. These penalties should be enforced aggressively.
Federal privacy laws also currently permit the disclosure of individually identifiable personal health information to public health authorities without the written authorization of the patient. The public health authority is also authorized by law to collect or receive such information for the purpose of preventing or controlling disease or conducting public health surveillance, investigations, or interventions. Data from state and local public health offices should be deidentified to protect privacy before being sent to the CDC.
These rules protect patient privacy while, at the same time, recognizing that government has the authority and the obligation to protect public health.
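As a rough illustration of what deidentifying a report before it leaves a state or local office might involve, the sketch below (in Python) drops direct identifiers and replaces the date of birth with a 10-year age band. The field names and the list of identifiers are hypothetical, and this is not a statement of what HIPAA requires; an actual implementation would have to follow the applicable federal de-identification rules.

```python
from datetime import date

# Hypothetical list of fields treated as direct identifiers in this sketch.
DIRECT_IDENTIFIERS = {"name", "street_address", "phone", "date_of_birth"}


def age_band(date_of_birth, as_of, width=10):
    """Convert a date of birth into a coarse age band such as '60-69'."""
    years = as_of.year - date_of_birth.year - (
        (as_of.month, as_of.day) < (date_of_birth.month, date_of_birth.day)
    )
    low = (years // width) * width
    return f"{low}-{low + width - 1}"


def deidentify(report, as_of):
    """Drop direct identifiers and keep only a generalized age band."""
    cleaned = {k: v for k, v in report.items() if k not in DIRECT_IDENTIFIERS}
    if "date_of_birth" in report:
        cleaned["age_band"] = age_band(report["date_of_birth"], as_of)
    return cleaned


# Made-up example record; every value here is illustrative.
report = {
    "name": "Jane Doe",
    "street_address": "1 Example St",
    "date_of_birth": date(1950, 9, 1),
    "county": "Example County",
    "test_result": "positive",
}
print(deidentify(report, as_of=date(2020, 8, 15)))
```

The state or local office would retain the full record for contact tracing, while only the cleaned version would be forwarded.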
Less Burdensome Collection. One folly of the current system is that clinicians must document data more than once. Much of that data already exists in clinical management systems, such as electronic health records, and current data flows, such as claims transactions that are sent in the normal course of payment. As a result of $40 billion in taxpayer investments, almost 100 percent of hospitals and 75 percent of physician offices have adopted some kind of electronic health record system. These systems already contain the data that public health authorities need to inform decisions to combat the pandemic. The modernizations outlined below should be implemented using this existing infrastructure.
Specifically, the new health data hub should pull public health data automatically from electronic health records and forms submitted for payment of claims. This can include case reports, lab results, prescription drug history, testing and syndromic surveillance, ethnicity, age, and other demographic factors. This information is already required to be reported to a public health office, but clinicians have to transcribe it by hand onto case report forms. Automating that process and relying on existing electronic infrastructure will speed up the process and relieve burdens on providers.
Where data cannot be pulled automatically because they live outside an electronic health record or pharmacy management system, or where management systems are not sufficiently developed or connected, data should still be reported electronically in a standardized format, such as through an Internet portal.
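The automated extraction described above can be sketched in a few lines of Python. The example assembles a case report from a record already held in an electronic health record and lists any required elements that are absent, so that gaps are flagged at the source rather than discovered later. The required-field list and the record layout are hypothetical and are not drawn from any actual CDC case report form.

```python
# Hypothetical minimum data elements for a case report in this sketch.
REQUIRED_FIELDS = (
    "patient_id",
    "age",
    "race_ethnicity",
    "test_result",
    "hospitalized",
    "underlying_conditions",
)


def build_case_report(ehr_record):
    """Copy required fields from an EHR record and list those that are missing."""
    report = {field: ehr_record.get(field) for field in REQUIRED_FIELDS}
    missing = [field for field, value in report.items() if value is None]
    return report, missing


# Made-up EHR record; extra fields are ignored, absent ones are flagged.
ehr_record = {
    "patient_id": "p-001",
    "age": 54,
    "test_result": "positive",
    "hospitalized": False,
    "billing_note": "not part of the case report",
}

report, missing = build_case_report(ehr_record)
print(missing)
```

A value of `False` for `hospitalized` counts as present, since it is a real answer; only fields with no value at all are reported as missing.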
In order to improve reporting and make it less burdensome, Congress and the Administration should:
- Require hospitals and physician offices to transmit data on hospital capacity and resources, including any shortages, to the new data hub electronically on a daily basis. Such a system would more quickly identify gaps in resources while supplying reliable information on the capacity of the health system to absorb patients.
- Require all labs (commercial, public-health, and hospital-based) to report real-time results to providers and to local, state, national, territorial, and tribal public health authorities electronically. Real-time test results, whether positive or negative, would allow patients to give their permission for sharing positive test results with public health officials, thus facilitating contact tracing.
- Require hospitals and physicians to share standardized immunization data with immunization registries and make the data available at the point of care for other clinicians, including pharmacists, to determine whether to recommend a vaccination or a booster shot to a patient.
- Allow pharmacists access—with patients’ prior consent—to patients’ medication history, including whether the person has already been tested for the virus, the test result, and if the patient has been treated with a prophylactic or an antiviral. Such information would also inform the pharmacist on whether a patient is at risk of misuse, abuse, addiction, and overdose of opioids, and if there are any potentially harmful drug interactions. Combined with pharmacy-based virus and antibody testing that is already in place, this information would allow pharmacists to identify and vaccinate people against COVID-19 once vaccines are available.
- Impose a clear dividing line between reporting and service-provision reimbursement. As part of Congress’s response to the pandemic, it created additional reimbursement monies for providers that treat COVID-19 patients. This has led to concerns that reporting data on infection and mortality may be skewed due to financial incentives. Regardless of what has actually occurred, it is critical that policymakers recognize that public trust in the data is essential and address possible conflicts of interest.
A Forum for Best Practices. While a data hub could also serve as a public health forum where information is collected, disseminated, and then used to update “best practice” treatment guidelines, immediate, near-term critical problems exist due to the lack of this information. In the near term, the CDC could take initial steps to address this problem by regularly convening doctors to discuss new or emerging clinical treatments and best practices for COVID-19 treatments discovered by medical care providers practicing in hot spots. The CDC should use these conversations to supplement efforts to survey clinical results from hot spot areas and collate them to create the most updated clinical guidelines possible. This step would address critical needs for better information flow between frontline clinicians treating patients.
Considering the CDC’s track record, the HHS should lead the implementation of public health infrastructure modernization at the department level, specifically within the Secretary’s office. This will involve issuing and managing the contracting process and coordinating communication, staffing, and technology integration among the contractor, the CDC, and other entities. Considering the nature of the pandemic, Congress should grant the HHS a waiver from federal acquisition rules to pursue the project on an emergency basis.
Because the CDC has failed for more than a decade, it should finally face real consequences. After all, real consequences are currently being felt by all Americans as a result of the CDC not producing a near-real-time and interoperable biosurveillance system. Congress should make a portion of the HHS Secretary’s discretionary budget contingent on the department meeting implementation guidelines for the new data hub. If the Secretary, for example, failed to issue the RFP within three months, the office should not receive its full funding allocation for the fiscal year. Such reductions should increase each week that a contract is not issued.
The COVID-19 pandemic has demonstrated both the need for timely and accurate data and the CDC’s failure to provide it. This failure has deprived public officials and medical professionals of vital information as they implement policies and deliver health care that profoundly affect the well-being of hundreds of millions of Americans. The CDC has ignored repeated directives from Congress to reform its data systems, and disregarded assessments from independent and unbiased auditors. The country can no longer tolerate the CDC’s inaction. Congress should require the HHS and the CDC to enter into a public-private partnership to establish a modernized system that will equip policymakers to respond more effectively to public health crises. This requirement should be directly tied to consequences, including the automatic and escalating loss of the HHS Secretary’s discretionary funds and staff if the Secretary fails to produce results on time.
It cannot be overstated how transformative these reforms will be. Greatly reducing government-imposed reporting burdens will free up millions of man hours and public health dollars to be put to better use fighting the pandemic and meeting patient needs.
Implementing these recommendations will equip policymakers and clinicians with current and reliable information that will help them to make the best decisions in confronting public health crises. A system that facilitates the efficient collection and dissemination of near-real-time comprehensive data, and that is usable by frontline health care workers and researchers, is no luxury in times of crisis. It is a necessity.
Joel White, a former Staff Director of the U.S. House Ways and Means Subcommittee on Health, is the Executive Director of the Health Innovation Alliance. Doug Badger is a Visiting Fellow in Domestic Policy Studies, of the Institute for Family, Community, and Opportunity, at The Heritage Foundation.