4 Lessons for Collecting Race, Ethnicity, and Primary Language Data

Learnings from the First Cohort of HealthBegins’ REaL Data Accelerator Program

Can small healthcare organizations—with limited staff and resources—improve their race, ethnicity, and primary language (REaL) data collection quickly? As two Federally Qualified Health Centers proved, the answer is yes—if they employ strategies to improve communication, support staff, and accelerate learning.

In spring 2023, Imperial Beach Community Clinic in California and Siouxland Community Health Center in Iowa became the first healthcare organizations to complete HealthBegins’ REaL Data Accelerator program. The three-month online program equips healthcare professionals with the resources and best practices to strengthen their REaL data collection and use that data to advance equity-focused primary care improvement efforts.

To make significant progress toward improving health equity and health care equity, health systems first need to understand where specific inequities exist, both in their communities and in the care they provide. This calls on health systems to stratify patient data by race, ethnicity, gender identity, sexual orientation, and primary language across a variety of outcomes, clinical service lines, and locations. Collecting accurate, complete REaL data is a critical first step in that journey.
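
As a rough illustration of what stratifying an outcome by a demographic field can look like, here is a minimal Python sketch. The field names (`primary_language`, `bp_controlled`), the `stratify_rate` helper, and the sample records are all hypothetical and made up for this example; they do not reflect any health center's actual data or EHR export format.

```python
from collections import defaultdict

def stratify_rate(records, group_field, outcome_field):
    """Return {group value: share of records where outcome_field is truthy}."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for record in records:
        # Records with no value recorded are grouped under "Unknown" so that
        # gaps in data collection stay visible instead of being dropped.
        group = record.get(group_field) or "Unknown"
        totals[group] += 1
        if record.get(outcome_field):
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}

# Made-up records, for illustration only.
sample_patients = [
    {"primary_language": "Spanish", "bp_controlled": True},
    {"primary_language": "Spanish", "bp_controlled": False},
    {"primary_language": "English", "bp_controlled": True},
    {"primary_language": None, "bp_controlled": True},
]

rates = stratify_rate(sample_patients, "primary_language", "bp_controlled")
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%}")
```

Keeping an explicit "Unknown" group is one possible design choice: a large "Unknown" share signals that the underlying REaL data is too incomplete for the comparison to be trusted.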

Each health center entered the Accelerator with unique challenges. Imperial Beach, an under-resourced health center serving communities along the U.S.-Mexico border, was struggling to collect patients’ REaL data and get it into their electronic health record (EHR). Their team identified 891 patients—approximately 20% of their patient population—with no information recorded on their race, ethnicity, and/or sexual orientation. Siouxland had collected more of their patients’ REaL data, but their team was not confident that the information documented in the EHR was accurate. They were about to transition to a new EHR system and felt it was the perfect opportunity to validate existing data, but they also faced staffing and resource barriers.

By taking an extensive self-assessment (the first stage of the Accelerator program), teams were able to uncover both blind spots and successes in their previous data collection efforts, and identify opportunities to make meaningful changes within their resource constraints. They also set clear goals for the three-month period. Imperial Beach wanted to gather 60% of the missing REaL data, while Siouxland wanted to have staff interview a sample of patients to determine the accuracy of existing REaL data so they could plan further data efforts.

Over the course of 12 weeks, teams learned a lot about what it takes to collect REaL data effectively and efficiently. Here are some of the most essential lessons that shaped their work and can strengthen the work of others.

  1. Give patients context for why their race and ethnicity information is being collected. One common reason patients don’t share their race and ethnicity information during registration or with healthcare staff is that they don’t understand why it is needed or don’t trust how it will be used. Organizations can build trust and buy-in by being more transparent about their intentions. Imperial Beach circulated a flyer to let patients know that the clinic asks everyone to share this information and that the information is used to ensure and monitor quality of care at the health center. That flyer is now given to every patient as part of the registration process. Critically, outreach efforts focused on reaching Spanish-speaking patients in their primary language, both by having Spanish-speaking patient navigators review messages for accuracy and cultural awareness and by boosting access to interpreters.
  2. Provide resources, training, and scripts to staff, including registration staff and care managers. To properly collect REaL data, teams need resources and guidance on how to talk with patients about this data and how it is used. Shortly after staff at Siouxland began meeting with patients to verify the accuracy of their information in the EHR, staff reported to Accelerator faculty that they didn’t feel confident having those conversations. Specifically, they said they did not know how to answer all of the patients’ follow-up questions, particularly about how to define race and ethnicity. In response, Siouxland hosted a training for staff focused on how to speak to patients about REaL data collection and answer those questions. Afterwards, the health center’s Accelerator team continued to meet with staff to identify additional barriers and incorporate ongoing training. Imperial Beach also relied on training to support its staff members’ data collection efforts and provided scripts to registration staff to help them explain race, ethnicity, and primary language questions to patients. These efforts have deepened organizational awareness around the need for REaL data and elevated this data collection as a priority.
  3. Work with vendors and leverage technology to improve workflows. Several EHR vendors offer resources designed to help healthcare organizations improve their REaL data collection and reporting, and some even host EHR-user health equity workgroups that organizations can take part in to learn from their peers. Talking with these vendors can often lead to critical workflow changes to support teams’ data collection efforts. Imperial Beach worked with Phreesia, their patient intake software vendor, to implement three changes to increase the likelihood of collecting patients’ race, ethnicity, and primary language information via the health center’s registration questionnaire. First, Phreesia added a flyer to the online patient questionnaire explaining why the health center collects this information at registration. Second, they made it possible for patients to answer demographic questions via their mobile devices. Lastly, they improved the mapping of patients’ responses on those initial questionnaires to patients’ medical records to ensure the data is captured properly. Phreesia was also able to help Imperial Beach create a report that allows staff to identify all patients who are missing particular pieces of demographic information. Previously, staff had to look up each patient’s information one by one to see if REaL data fields were complete. Staff can now log and track their outreach efforts to collect missing data in the EHR, which prevents duplicated effort and saves team members considerable time.
  4. Engage in small tests of change often. To do this work successfully and sustainably, teams need to start small, set achievable goals, and build upon the work over time. It’s critical to create an iterative process where teams examine their outcomes at short intervals, make adjustments, and check again. This can make the work feel more manageable, and seeing progress can motivate staff to continue the work. To support this process, Accelerator participants were given a test-of-change project management tracker to set goals and measure their progress. Imperial Beach set a goal of reducing the number of patients missing race and ethnicity data in their patient records and wanted to see how much data they could collect in one month by training staff and implementing protocol changes. They saw a noticeable improvement after 30 days, which showed them they were on the right track. By the end of their Accelerator participation, the health center had collected almost all of the REaL data missing from the EHR, exceeding their initial goal.
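
For teams that want to approximate the kind of missing-data report and short-interval progress check described in lessons 3 and 4, a minimal Python sketch might look like the following. This is not Phreesia's report or the Accelerator's tracker. The field names, the `missing_data_report` and `share_collected` helpers, the sample records, and the follow-up count of 356 are all assumptions made up for illustration. Only the baseline of 891 patients and the 60% goal come from the account above, and that baseline also counted patients missing sexual orientation data.

```python
REAL_FIELDS = ("race", "ethnicity", "primary_language")  # hypothetical field names

def missing_real_fields(patient):
    """List the REaL fields that are absent or blank for one patient record."""
    return [f for f in REAL_FIELDS if not (patient.get(f) or "").strip()]

def missing_data_report(patients):
    """Map each patient ID to its missing REaL fields, skipping complete records."""
    report = {}
    for patient in patients:
        missing = missing_real_fields(patient)
        if missing:
            report[patient["patient_id"]] = missing
    return report

def share_collected(baseline_missing, still_missing):
    """Fraction of the baseline gap that has been closed so far."""
    if baseline_missing == 0:
        return 1.0
    return (baseline_missing - still_missing) / baseline_missing

# Made-up records, for illustration only.
sample_patients = [
    {"patient_id": "A1", "race": "White", "ethnicity": "Hispanic or Latino",
     "primary_language": "Spanish"},
    {"patient_id": "B2", "race": "", "ethnicity": None,
     "primary_language": "English"},
    {"patient_id": "C3", "race": "Asian", "ethnicity": "Not Hispanic or Latino"},
]

report = missing_data_report(sample_patients)
for patient_id, fields in report.items():
    print(patient_id, "is missing:", ", ".join(fields))

# Progress check: 891 is the baseline from the article; 356 is a hypothetical
# follow-up count chosen only to show the arithmetic against a 60% goal.
GOAL = 0.60
progress = share_collected(baseline_missing=891, still_missing=356)
print(f"Collected {progress:.1%} of missing data; goal met: {progress >= GOAL}")
```

Rerunning a check like this at short intervals (for example, every 30 days) gives a simple number to compare against the goal, which fits the small, iterative tests of change described above.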


Kate Marple is a Boston-based writer who specializes in helping nonprofit, health care, and legal services organizations develop practices to ensure that the stories they tell are shaped by and benefit people directly impacted by the issue(s) those stories are about. Her website is https://whotellsthestory.org

Learn how HealthBegins can help you move healthcare upstream. Contact us to learn more.