BACKGROUND

Free thyroxine (fT4) is often ordered when not indicated. The goal of the current study was to use quality improvement tools to identify and implement an optimal approach to reduce inappropriate fT4 testing throughout a large pediatric hospital system.

METHODS

After reviewing evidence-based guidelines and best practices, a thyroid-stimulating hormone with reflex to fT4 test and an outpatient thyroid order panel with clinical decision support at order entry, along with several rounds of provider education and feedback, were implemented. Outpatient and inpatient order sets and system preference lists were reviewed with subject matter experts and revised when appropriate. Tracking metrics were identified. Automated monthly run charts and statistical process control charts were created using data retrieved from the electronic health record. Charts were used to establish baseline and balancing measure data, monitor the impact of interventions, and identify future interventions.

RESULTS

Over a 44-month period, among nonendocrinology providers, fT4 and thyroid-stimulating hormone co-orders decreased from 67% to 15% and reflex fT4 tests increased from 0% to 77% across inpatient and outpatient settings. Direct cost savings from performing 5179 fewer fT4 tests over 3 years were $45 800.

CONCLUSIONS

After implementation of a reflex fT4 test, a novel order panel with clinical decision support, provider education, and changes to ordering modes, a large and sustainable reduction in fT4 tests that was associated with significant cost savings was achieved among nonendocrinology providers.

Value in health care can be increased by improving the patient experience, improving population health, and reducing costs.1  Choosing Wisely is a national effort led by the American Board of Internal Medicine to promote increased awareness of wasteful or unnecessary medical tests, procedures, and treatments.2  Implementing effective laboratory test use programs represents an opportunity to increase value.3,4  Benefits to optimizing laboratory test use include decreased costs, fewer phlebotomies, and improved patient satisfaction and outcomes.4  Reducing unnecessary tests also decreases chances of a cascade effect, which refers to a chain of downstream events triggered by an initial event that may cause otherwise avoidable adverse effects and/or morbidity.5  In the case of laboratory testing, such downstream events can include unexpected results that may be associated with misdiagnosis and lead to additional unnecessary testing, treatments, or referrals for specialty consultation.3,4  The pursuit of value improvement in health care has been slowed because of perverse fee-for-service payment models. With the increasing prevalence of value-based health care delivery models, such as accountable care organizations and population health management, reducing costs is essential. To accelerate value attainment, health care companies are increasingly partnering with laboratory management benefit programs that create and enforce laboratory testing policies and review laboratory claims to preserve funds for high-value services.6,7  Although many specialty guidelines and policies target adult populations, improvements in test utilization and incorporation of recommendations can also lead to improvements in the pediatric setting.

Thyroid testing is an area where evidence-based guidance from the American Thyroid Association, the Pediatric Endocrine Society, and Choosing Wisely supports judicious use of thyroid tests.8–16  Table 1 summarizes recommendations that are applicable in the pediatric setting. Considerable practice variation in thyroid function test ordering exists and represents an opportunity for organizations to better guide test use and reduce costs. In a study of 82 laboratories from 24 health care organizations in the United States, thyroid-stimulating hormone (TSH) and free thyroxine (fT4) testing was estimated to cost $1.6 billion per year.17  Similar studies focusing exclusively on a pediatric setting, to our knowledge, have not been performed.

TABLE 1

Summary of Evidence-Based Thyroid Function Test Guidelines Applicable to Pediatrics

Guideline | Recommendation
Choosing Wisely | Don’t order multiple tests in the initial evaluation of a patient with suspected nonneoplastic thyroid disease. Order TSH, and if abnormal, follow up with additional evaluation or treatment depending on the findings.8
Choosing Wisely | Avoid routinely measuring thyroid function and/or insulin levels in children with obesity.10
European Society of Pediatric Endocrinology and Endocrinology | Child with abnormal newborn screen for congenital hypothyroidism: repeat thyroid function testing with a serum sample and measure TSH and fT4.11,14–16,28
American Association of Clinical Endocrinologists and the American Thyroid Association | TSH measurements in hospitalized patients should be done only if there is an index of suspicion for thyroid dysfunction.13
American Association of Clinical Endocrinologists and the American Thyroid Association | TSH and fT4 are used for evaluating central hypothyroidism and monitoring adequacy of thyroid hormone replacement, monitoring of antithyroid treatment and management of thyroid carcinoma.13

fT4, free thyroxine; TSH, thyroid-stimulating hormone.

Laboratory benefit management programs that manage routine laboratory tests have developed coverage policies pertaining to thyroid testing.18,19  Measuring TSH is generally regarded as the most sensitive initial laboratory test for screening individuals with symptoms consistent with hypothyroidism or hyperthyroidism because small changes in fT4 result in larger changes in TSH values. Therefore, measuring TSH only as the initial test for suspected thyroid problems and adding fT4 if TSH is abnormal is becoming standard.8–16  As 1 strategy to promote appropriate resource stewardship, Choosing Wisely states: “Don’t order multiple tests in the initial evaluation of a patient with suspected thyroid disease. Order TSH, and if abnormal, follow up with additional evaluation or treatment depending on the findings.”8  To address this guidance, several studies have reported successful implementation of a TSH with reflex to fT4 (reflex fT4) test where fT4 is automatically performed when TSH is outside of the normal reference range.20–23  Clinical decision support panels at order entry have also been used.23  To investigate whether a change in practice was warranted at our institution, a baseline analysis of TSH and fT4 test orders placed by nonendocrinology providers was performed. It revealed 67% of TSH orders had a fT4 ordered on the same date by a single provider (ie, TSH/fT4 co-order). A quality improvement (QI) initiative to improve thyroid function test utilization was launched with an aim to reduce unnecessary TSH/fT4 test co-orders and align with best practice guidelines. The improvement team established specific aims to reduce TSH/fT4 co-orders from 67% to 30% and increase the percentage of reflex fT4 tests from 0% to 50% and sustain these changes over 8 months of intervention rollout. Here, we describe the results of these interventions over a 44-month period to promote high-value resource stewardship for thyroid function tests.

This QI initiative was conducted over a 44-month period between June 2019 and February 2023 at a large free-standing children’s hospital with 2 hospital locations, 400+ beds, and a primary care network of 37 primary care locations across 15 counties. The hospital system used computerized physician order entry via the electronic health record (EHR) (Epic Systems Corporation). All inpatient and outpatient sites were included in the project. All TSH, fT4, and reflex fT4 orders were included except for those placed by endocrinology providers. Ordering modes targeted for interventions included outpatient and inpatient order sets and system preference lists. Order sets consist of preconfigured sets of orders (eg, laboratory testing, imaging, medication) that are commonly ordered together and can be configured based on certain diagnoses. Preference lists are collections of individual orders that allow users to easily look up and select entries they most commonly use. Preference lists can be built at a system level to simplify (or restrict) options that are available to a group of users or they can be created by individual users who wish to have their own personalized list of frequently used orders. Panels can be created to group orders and/or medications for quick ordering and can also include clinical decision support.

This improvement project was initiated by the Laboratory Formulary Committee, a multidisciplinary group of physicians, nurses, and laboratory staff whose purpose is to promote best practices within laboratory medicine by ensuring high-quality, evidence-based, and cost-effective testing. In addition to several committee members, additional subject matter expert physicians, data analysts, nurses, and QI specialists were involved in this QI project. This QI project does not meet the definition of human subject research per our institutional review board and therefore does not fall under its purview.

Automated data reports were generated monthly from the EHR and included the number of TSH/fT4 co-orders, TSH, fT4, and reflex fT4 tests ordered using test definitions described in Table 2. The automated data reports also included the patient's name, medical record number, ordering mode (inpatient or outpatient), ordering provider, ordering department as well as the order set or preference list used to place the order. Percentages of TSH/fT4 co-orders and percentages of reflex fT4 test usage were calculated monthly between June 2019 and February 2023 using the formulas defined in Table 2. Two primary outcomes were measured: (1) the change in percentage of TSH/fT4 co-order usage and (2) the change in percentage of reflex fT4 test usage (Table 2). Automated monthly run charts were created using Qlik Sense Analytics Software (Qlik Technologies Inc., King of Prussia, PA) and statistical process control charts, namely p-charts, were created using Excel Macros from Cincinnati Children’s Hospital. Charts were used to track and display the 2 key outcome measures. Two secondary outcomes were measured: (1) yearly and cumulative direct cost savings as a result of performing fewer fT4 tests and (2) the fT4 utilization rate as measured by the ratio of fT4 to TSH tests performed (Table 2). The number of fT4 and TSH tests performed were determined by searching the EHR and Strata24  for CPT codes 84439 (fT4) and 84443 (TSH). fT4 costs were used from the hospitals’ cost accounting system from 2022. TSH tests performed as part of the Ohio newborn state screen were excluded.
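As an illustration of how the test definitions could be applied to an order extract, the following minimal Python sketch groups orders by patient, provider, and date and counts a TSH/fT4 co-order whenever both tests appear in the same group. The record layout (`patient_id`, `provider_id`, `order_date`, `test`) and the function name `classify_orders` are hypothetical and are not taken from the EHR report used in this project; duplicate orders of the same test within a group are collapsed, which is an assumption of the sketch.

```python
from collections import defaultdict
from datetime import date


def classify_orders(orders):
    """Count co-orders and standalone orders from an order extract.

    `orders` is an iterable of (patient_id, provider_id, order_date, test)
    tuples, where test is "TSH", "fT4", or "reflex fT4". This layout is a
    hypothetical stand-in for an EHR report.
    """
    # Group the tests ordered for one patient by one provider on one date.
    grouped = defaultdict(set)
    for patient_id, provider_id, order_date, test in orders:
        grouped[(patient_id, provider_id, order_date)].add(test)

    counts = {"co_order": 0, "TSH": 0, "fT4": 0, "reflex fT4": 0}
    for tests in grouped.values():
        remaining = set(tests)
        # TSH and fT4 on the same date by a single provider form a co-order.
        if {"TSH", "fT4"} <= remaining:
            counts["co_order"] += 1
            remaining -= {"TSH", "fT4"}
        # Anything left over is counted as a standalone order.
        for test in remaining:
            counts[test] += 1
    return counts
```

Grouping on the patient as well as the provider and date is a design choice of this sketch; it keeps two different patients seen by the same provider on the same day from being merged into one co-order.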

TABLE 2

Test Definitions and Key Project Measures

Category | Analyte or Measurement | Definition
Test definition | TSH | TSH ordered as a standalone test
Test definition | fT4 | fT4 ordered as a standalone test
Test definition | TSH/fT4 co-order | TSH and fT4 ordered on the same date by a single provider
Test definition | reflex fT4 | TSH always performed; fT4 performed when TSH is outside the normal reference range
Baseline analysis | % TSH standalone orders | # TSH standalone orders/(# TSH/fT4 co-orders + # TSH standalone orders) × 100
Baseline analysis | % TSH/fT4 co-orders | # TSH/fT4 co-orders/(# TSH/fT4 co-orders + # TSH standalone orders) × 100
Baseline analysis | % inappropriate fT4 tests performed | (# normal TSH tests from TSH/fT4 co-orders/# TSH/fT4 co-orders) × 100
Baseline analysis | balancing measure | (# abnormal fT4 tests with normal TSH results from TSH/fT4 co-orders/# TSH/fT4 co-orders) × 100
Primary outcome | % TSH/fT4 co-orders | # TSH/fT4 co-orders/(# reflex fT4 orders + # TSH/fT4 co-orders + # TSH standalone orders) × 100
Primary outcome | % reflex fT4 orders | # reflex fT4 orders/(# reflex fT4 orders + # TSH/fT4 co-orders + # TSH standalone orders) × 100
Secondary outcome | direct cost savings | Cost savings associated with performing fewer fT4 tests
Secondary outcome | fT4 utilization rate | # fT4 tests performed/# TSH tests performed

fT4, free thyroxine; TSH, thyroid-stimulating hormone.

After a review of the literature and best practice recommendations for thyroid function test use,8–16  ordering practices of nonendocrinology providers were analyzed by determining the percentage of TSH/fT4 co-orders, TSH standalone orders, and fT4 standalone orders over a 9-month period (June 1, 2019–January 31, 2020) using the calculations described in Table 2 under baseline analysis. Potential inappropriate fT4 testing was defined as co-ordering TSH/fT4 when TSH was within the laboratory’s reference range and was calculated using the formula in Table 2 under baseline analysis. As a balancing measure, during the baseline period, the percent of abnormal fT4 results in the setting of a normal TSH result from TSH/fT4 co-orders was calculated (Table 2).
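The Table 2 formulas can be written out as a short Python sketch. The function and key names below are illustrative only; the example counts passed in the usage note are the baseline counts reported in the Results (5510 co-orders, 2748 TSH standalone orders, 5195 co-orders with normal TSH, and 229 co-orders with normal TSH and abnormal fT4).

```python
def percent(numerator, denominator):
    """Return numerator/denominator as a percentage."""
    if denominator == 0:
        raise ValueError("denominator must be nonzero")
    return 100.0 * numerator / denominator


def baseline_measures(co_orders, tsh_standalone,
                      co_orders_normal_tsh,
                      co_orders_normal_tsh_abnormal_ft4):
    """Baseline analysis formulas from Table 2 (before the reflex test existed)."""
    total = co_orders + tsh_standalone
    return {
        "pct_tsh_standalone": percent(tsh_standalone, total),
        "pct_co_orders": percent(co_orders, total),
        "pct_inappropriate_ft4": percent(co_orders_normal_tsh, co_orders),
        "balancing_measure": percent(co_orders_normal_tsh_abnormal_ft4, co_orders),
    }


def primary_outcomes(reflex_orders, co_orders, tsh_standalone):
    """Primary outcome formulas from Table 2 (reflex orders join the denominator)."""
    total = reflex_orders + co_orders + tsh_standalone
    return {
        "pct_co_orders": percent(co_orders, total),
        "pct_reflex_orders": percent(reflex_orders, total),
    }
```

Calling `baseline_measures(5510, 2748, 5195, 229)` gives values that round to 33% TSH standalone orders, 67% co-orders, 94% potentially inappropriate fT4 tests, and a balancing measure of 4.2%.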

The Model for Improvement Framework25  was used and consisted of a project aim, tracking metrics monthly pre- and postinterventions, identifying change concepts/ideas, and conducting Plan Do Study Act cycles. A Key Driver Diagram was created to track the project work focusing on key points related to possible overutilization of fT4 testing (Fig 1). Educating providers on Choosing Wisely guidelines pertaining to thyroid function testing and EHR support were identified as key drivers. Pareto Charts were generated from pre- and postintervention data to identify and prioritize those changes that would be of greatest benefit for improvements in appropriate TSH and fT4 test utilization.

FIGURE 1

Key driver diagram to improve thyroid function test utilization demonstrating the SMART aim, target population, key drivers, and interventions to achieve the global aim.


Implementation of TSH With Reflex to fT4 Test

Baseline analysis over a 9-month period revealed 94% of TSH/fT4 co-orders had normal TSH results. The first intervention identified was to implement a reflex fT4 test order option where fT4 was automatically performed if TSH was outside the normal reference range. The reflex fT4 test was implemented on February 25, 2020. Downstream effects in the laboratory that resulted from implementation of the reflex fT4 test included: (1) requirements to create and validate a new test to ensure the fT4 test was triggered when TSH was outside the normal reference range; (2) delay in resulting of TSH when the fT4 test was triggered because the fT4 test had to be completed before resulting TSH; and (3) potential for increased work for laboratory personnel to manually retrieve samples if the reflex fT4 test was triggered and samples had been removed from the analyzer. Additionally, an increase in minimum test volume for the reflex fT4 test (1.5 mL blood) versus TSH only (0.5 mL blood) was required.
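The reflex rule itself is simple to state in code. The sketch below is a hedged illustration only: the class and function names are hypothetical, the reference range in the usage note is a placeholder rather than the laboratory's actual (typically age-specific) range, and `measure_ft4` stands in for whatever triggers the analyzer to run fT4 on the retained sample.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceRange:
    """Inclusive normal reference range for an analyte (placeholder values only)."""
    low: float
    high: float

    def contains(self, value):
        return self.low <= value <= self.high


def reflex_ft4_needed(tsh, tsh_range):
    """fT4 is reflexed only when TSH falls outside the normal reference range."""
    return not tsh_range.contains(tsh)


def run_reflex_order(tsh, tsh_range, measure_ft4):
    """TSH is always reported; measure_ft4() is called only if the reflex triggers."""
    result = {"TSH": tsh, "fT4": None}
    if reflex_ft4_needed(tsh, tsh_range):
        result["fT4"] = measure_ft4()
    return result
```

For example, with an assumed range of `ReferenceRange(0.5, 4.5)`, a TSH of 2.0 returns no fT4 result, whereas a TSH of 8.0 calls `measure_ft4`. Because the fT4 result is obtained before the combined result is returned, the sketch also mirrors the resulting delay for TSH described above.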

Order Entry Modifications and Clinical Decision Support

Concurrent with the implementation of the reflex fT4 test and after review of outpatient order sets, inpatient order sets, and system preference lists with subject matter experts, order entry modifications were made in the EHR and are detailed in Table 3. As seen, the reflex fT4 test replaced individual TSH and fT4 tests or was added to existing inpatient order sets, outpatient order sets, and preference lists. Either TSH or the reflex fT4 test is appropriate in the initial evaluation of a patient with suspected nonneoplastic thyroid disease (primary hypothyroidism and hyperthyroidism).8,9  Benefits of the reflex fT4 test over the TSH standalone test are avoidance of a second blood draw and faster fT4 results should TSH be abnormal. fT4 as a co-order with TSH is indicated as a follow-up to an abnormal newborn TSH screen as well as in suspected central (secondary or tertiary) hypothyroidism, hyperthyroidism, thyroid hormone resistance, TSH resistance syndrome, medication-induced hypothyroidism, and management of thyroid cancer. fT4 as an isolated test is indicated in assessing the adequacy of thyroid hormone replacement in patients with known central hypothyroidism.11–16

TABLE 3

Ordering Modes Modified With the Indicated Interventions and Rationale for Changes

Ordering Mode | Intervention | Rationale
Outpatient order set | Replaced separate TSH and fT4 orders with reflex fT4 test in appropriate outpatient order sets | Choosing Wisely TSH-centered approach8
Outpatient order set | Added reflex fT4 test to outpatient endocrinology order sets (diabetes, annual growth hormone, Turner syndrome, growth evaluation) | Therapy monitoring in hypothyroidism; because of variations in provider practice, included reflex fT4 test as an option
Outpatient order set | Neuro-oncology, pregestational diabetes, postpartum | No changes: TSH and fT4 kept as separate tests; not medically appropriate
Inpatient order set | Replaced separate TSH and fT4 orders with reflex fT4 test in appropriate order sets | Choosing Wisely TSH-centered approach8
Inpatient order set | Added reflex fT4 test (endocrinology) | Increased options for endocrinology providers should fT4 not be needed; both TSH and fT4 are recommended for secondary or tertiary (central) hypothyroidism, thyroid hormone or thyrotropin resistance syndromes, or medication-induced hypothyroidism
Inpatient order set | Kept separate TSH and fT4 orders (NICU, hematology/oncology) | NICU: TSH and fT4 are indicated to guide treatment initiation because of immaturity of hypothalamus-pituitary-thyroid axis and transient hypothyroidism. Hematology/oncology: TSH and fT4 are indicated due to variables such as use of high-dose steroids and other medications that could affect TSH results; central hypothyroidism risk in children with CNS tumors
System preference lists | Reflex fT4 test added to inpatient and outpatient preference lists that had TSH | Additional option for ordering thyroid function tests
System preference lists | Replaced separate TSH and fT4 orders with thyroid order panel in outpatient setting only except for neonatology, hematology/oncology, endocrinology | Provides clinical decision support at order entry (see Fig 2)
Laboratory database | Added reflex fT4 test | Contains all orderable tests

CNS, central nervous system; fT4, free thyroxine; TSH, thyroid-stimulating hormone.

All medical staff were informed of the Choosing Wisely recommendation8  and the new reflex fT4 test via an SBAR (situation, background, assessment, recommendation) e-mail communication 1 week before the test went live. A subsequent SBAR was distributed to medical staff 14 months after implementation of the reflex fT4 test that reported a >30% reduction of TSH/fT4 co-orders among nonendocrinology providers. The SBAR also communicated that the top departments co-ordering TSH and fT4 were pediatrics and pediatric gastroenterology and reminded providers to order TSH or the reflex fT4 test in the initial evaluation of a patient with suspected nonneoplastic thyroid disease. Because many outpatient providers were still co-ordering TSH and fT4 from preference lists in situations where the reflex fT4 test would be appropriate, the next intervention, implemented on May 17, 2022, was a thyroid order panel on outpatient system preference lists that provides clinical decision support at the time of electronic order entry (Fig 2). TSH, fT4, and reflex fT4 were added as panel synonyms so the panel would display if searching for individual thyroid function tests. In addition to the initial Choosing Wisely guidance,8  the panel included pediatric-specific outpatient order guidance: do not routinely measure thyroid function in children with obesity10  or during acute illness,13,26–28  and order both TSH and fT4 after an abnormal newborn screen for primary congenital hypothyroidism11,14–16,28  (Table 1). The reflex fT4 test was prechecked as the recommended test for suspected thyroid disease with the option to uncheck and instead order TSH and/or fT4. The thyroid order panel replaced TSH and fT4 on all outpatient system preference lists except for neonatology, endocrinology, and hematology/oncology where co-ordering was more likely to be appropriate. The rationale for interventions implemented using various ordering modes is detailed in Table 3. All medical staff were informed of the new outpatient thyroid order panel via an SBAR e-mail communication 1 week before it went live. Data were reviewed and minor changes to order sets and preference list content were made when the panel was added.

FIGURE 2

Thyroid order panel with clinical decision support added to outpatient preference lists at order entry in the electronic health record.


Baseline analysis of TSH/fT4 co-orders and TSH orders placed by nonendocrinology providers over a 9-month period revealed that of 8258 orders, 5510 (67%) had TSH/fT4 co-orders and 2748 (33%) had TSH orders only. During this period, there were 83 standalone fT4 orders only. A total of 94% (5195/5510) of TSH/fT4 co-orders had normal TSH during this time. Among the 5510 TSH/fT4 co-orders, fT4 was abnormal and TSH was normal in 229 cases (4.2%).

Primary Measures

The percent reduction in TSH/fT4 co-orders and the percent increase in reflex fT4 tests among outpatients and inpatients from tests ordered by nonendocrinology providers between June 1, 2019 and February 28, 2023, along with dates of implementation of the reflex fT4 test, an outpatient thyroid order panel, and associated communication to providers, are shown in the p-charts in Figs 3 and 4. After the first intervention (new reflex fT4 test and provider education), the percentage of TSH/fT4 co-order usage decreased from 67% to 29% (Fig 3), whereas the percentage of reflex fT4 test usage increased from 0% to 57% (Fig 4). After the second intervention (provider performance update), TSH/fT4 co-order usage decreased further to 23% (Fig 3), whereas reflex fT4 test usage increased to 66% (Fig 4). After the final intervention (new thyroid order panel and provider education), TSH/fT4 co-order usage decreased further to 15% (Fig 3) and reflex fT4 test usage increased to 77% (Fig 4), both of which were sustained for more than 8 months.
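For readers unfamiliar with p-charts, the following Python sketch shows the standard 3-sigma calculation for a proportion with varying monthly denominators. It is a generic textbook formulation, not a reproduction of the Excel macros used in this project, which may apply additional rules (for example, freezing the center line on a baseline period or shifting it after sustained change). The function name is illustrative.

```python
import math


def p_chart_limits(events, totals):
    """Compute a simple 3-sigma p-chart.

    events: monthly counts of the event of interest (e.g., TSH/fT4 co-orders)
    totals: monthly denominators n (co-orders + TSH + reflex fT4 orders)
    Returns the center line and, per month, (p, lcl, ucl, out_of_control).
    """
    if len(events) != len(totals) or not totals:
        raise ValueError("events and totals must be non-empty and equal length")

    # Center line: pooled proportion across all months supplied.
    p_bar = sum(events) / sum(totals)

    rows = []
    for x, n in zip(events, totals):
        sigma = math.sqrt(p_bar * (1.0 - p_bar) / n)
        lcl = max(0.0, p_bar - 3.0 * sigma)  # proportions cannot go below 0
        ucl = min(1.0, p_bar + 3.0 * sigma)  # or above 1
        p = x / n
        rows.append((p, lcl, ucl, not (lcl <= p <= ucl)))
    return p_bar, rows
```

Because the limits depend on each month's n, months with fewer orders have wider limits. A point outside the limits signals special cause variation, which is how a sustained shift after an intervention becomes visible.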

FIGURE 3

p-chart displaying monthly reductions in percentages of TSH/fT4 co-orders. The 3 interventions are labeled the month before implementation. n = monthly sum of TSH/fT4 co-orders, TSH, and reflex fT4 orders. The green arrow represents the expected direction of changes after interventions. fT4, free thyroxine; TSH, thyroid-stimulating hormone.

FIGURE 4

p-chart displaying monthly increase in percentages of reflex fT4 tests. The 3 interventions are labeled the month before implementation. n = monthly sum of TSH/fT4 co-orders, TSH, and reflex fT4 orders. The green arrow represents the expected direction of changes after interventions. fT4, free thyroxine; TSH, thyroid-stimulating hormone.


Secondary Measures (Direct Cost Analysis and fT4 Utilization Rate)

Among nonendocrinology providers, 5179 fewer fT4 tests were performed between 2019 and 2022, which resulted in a total 3-year direct cost reduction of $45 800. This reduction in fT4 testing yielded direct cost benefits of $25 693, $17 385, and $2722 in years 1, 2, and 3, respectively. The year-over-year cost benefits decreased because the yearly reduction in fT4 tests became smaller: the numbers of fT4 tests performed in 2019, 2020, 2021, and 2022 were 8160, 4972, 3601, and 2981, respectively, which translated to 3188, 1371, and 620 fewer tests performed yearly when compared with the previous year and an overall reduction of fT4 tests of 63% (5179/8160). Most of the cost savings were a result of implementation of the reflex fT4 test on February 25, 2020, because the thyroid order panel was not implemented until May 17, 2022. The fT4 utilization rate decreased yearly from 0.64 (8160 fT4/12 666 TSH tests performed) in 2019 to 0.22 (2981 fT4/13 486 TSH tests performed) in 2022.
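The arithmetic behind these secondary measures can be checked with a few lines of Python. The sketch uses only the counts and dollar amounts reported above; the variable names are illustrative, and no per-test cost is assumed because the text reports yearly totals only.

```python
# Yearly fT4 and TSH test counts and yearly direct savings, as reported in the text.
ft4_tests = {2019: 8160, 2020: 4972, 2021: 3601, 2022: 2981}
tsh_tests = {2019: 12666, 2022: 13486}
yearly_savings_usd = [25693, 17385, 2722]

years = sorted(ft4_tests)

# Fewer fT4 tests each year compared with the previous year.
yearly_reduction = [ft4_tests[prev] - ft4_tests[curr]
                    for prev, curr in zip(years, years[1:])]

# Cumulative reduction and its share of the 2019 volume.
total_reduction = sum(yearly_reduction)
pct_reduction = 100.0 * total_reduction / ft4_tests[2019]

# Cumulative direct cost savings over the 3 years.
total_savings_usd = sum(yearly_savings_usd)

# fT4 utilization rate: fT4 tests performed per TSH test performed.
utilization = {year: ft4_tests[year] / tsh_tests[year] for year in tsh_tests}

print(yearly_reduction, total_reduction, round(pct_reduction))
print(total_savings_usd, {y: round(u, 2) for y, u in utilization.items()})
```

The yearly reductions (3188, 1371, and 620) sum to 5179, which is 63% of the 2019 volume, and the yearly savings sum to $45 800, matching the figures reported.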

Following the Model for Improvement Framework25  and continuous monitoring of data over 44 months, this project demonstrated successful implementation of interventions to improve thyroid function test use. Among nonendocrinology providers, TSH/fT4 co-orders were reduced from 67% to 15% and usage of the reflex fT4 test increased from 0% to 77%, surpassing the SMART aim goals of 30% (TSH/fT4 co-order usage) and 50% (reflex fT4 test usage). These improvements were sustained for more than 8 months after implementation of a reflex fT4 test, an outpatient thyroid order panel, provider education, and necessary changes in EHR test builds. The reflex fT4 system was the biggest contributor to the reduction in fT4 testing, likely because the fT4 test could be performed when needed without a second blood draw. When TSH was abnormal, fT4 results were available sooner. Providers’ acceptance of and reliance on this system were likely enhanced by simultaneous education, revisions to order sets, and clinical decision support provided at order entry in outpatient preference lists. Tailoring EHR interventions to the appropriate clinical setting was key in reducing potential inappropriate TSH/fT4 co-ordering and was the rationale for revising separate TSH and fT4 orders in some but not all order sets and preference lists. Additionally, because the balancing measure performed at baseline revealed 4.2% of TSH/fT4 co-orders had a normal TSH and abnormal fT4 and because we wanted to minimize unintended consequences of missing these cases using a TSH-centered strategy, it was important to identify when TSH/fT4 co-orders were appropriate. Because this was not easily identified from system-level data, a goal of 50% usage of the reflex fT4 test was set. This goal was exceeded, as demonstrated by a sustained reflex fT4 test usage of 77%.

Several previous studies have reported their findings following reflex fT4 test implementation20–23  and clinical decision support.23  Barry et al20  reported significant (P < .001) reductions in TSH tests from 270 to 195/week and increases in reflex fT4 testing from 107 to 184/week after implementation of the reflex fT4 test but not a significant change in fT4 tests (95 to 91/week) (P = .386). It is unclear why fT4 volumes did not decrease; this may have resulted from concurrent ordering of reflex fT4 and fT4 tests. After reflex fT4 implementation, Taher et al,21  Gilmour et al,22  and Abitbol et al23  reported 39%, 34%, and 30% reductions in fT4 testing, respectively. It is likely our 63% reduction in fT4 use exceeded these studies because of our focus on changes among nonendocrinology providers only as well as differences in changes to order sets and preference lists, which ranged from none reported22  to removing fT4 from existing order sets21,23  to optimizing preference lists.20  Our approach was to replace individual TSH and fT4 tests with the reflex fT4 test or add the reflex fT4 test to existing order sets or preference lists. Abitbol et al23  reported cost savings of $43 000 per year as a result of performing fewer fT4 tests. It is likely this cost savings is an overestimate because one would predict direct cost savings to decrease yearly because fewer fT4 tests would be performed year-over-year. Abitbol et al23  also used clinical decision support at order entry for fT4 whereby providers were required to select an appropriate indication to order fT4. Our clinical decision support panel was upstream of fT4 ordering because we wanted to encourage reflex fT4 ordering, which our panel defaults to, while still allowing ordering of TSH and/or fT4 if indicated. Although similarities existed between our study and previous studies, learning from and improving on existing studies adds value in health care by focusing on continuous quality improvement.

The financial implications of this and other Choosing Wisely initiatives are important to understand. At its core, Choosing Wisely is designed to remove non–value-added care. In this instance, we demonstrated a reduction in unnecessary fT4 testing and reported a direct cost savings of $45 800 over 3 years at our single institution. A total of 5179 fewer fT4 tests were performed between 2019 and 2022 among nonendocrinology providers. After several rounds of provider education, the direct cost savings associated with avoiding inappropriate fT4 test orders was mainly attributed to performing the reflex fT4 test, and the cost savings were most pronounced in years 1 ($25 693) and 2 ($17 385). These cost savings were associated with corresponding reductions in fT4 utilization per case as measured by the ratio of fT4 to TSH tests. Although some initial costs in terms of technologist time and information technology resources were required to implement the reflex fT4 test, these 1-time build and setup costs were excluded from the overall cost analysis. True cost savings can be difficult to quantify because direct cost savings do not consider potential downstream effects such as repeat and/or unnecessary additional testing as well as referrals to specialists for incidental findings. On the reimbursement side, laboratory testing has historically been reimbursed in outpatient settings on a fee-for-service basis, with payment provided for each test performed. This outpatient payment model provides little financial incentive to manage test use. In the inpatient setting, laboratory testing is reimbursed as part of a diagnosis-related group, in which reimbursement is fixed regardless of the number of tests performed. Therefore, reducing test volume reduces costs but not revenue, so financial incentives exist for managing inpatient test use. It is likely that reimbursement models currently in place for tests performed on inpatients will, in the future, also encompass outpatients. Capitated care, managed care, and accountable care models are built on the same principle, which involves paying a predetermined per-patient amount to cover all services over a defined time. In these models, the financial impact of reducing fT4 testing is favorable, as it is under inpatient bundled payments. To improve value-based care, our study focused on improving thyroid function testing in both the outpatient and inpatient settings. Although this has a negative impact on hospital outpatient revenue, it ensures evidence-based best practice.
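
As a rough check on the reported figures, the arithmetic below derives an average direct saving per avoided fT4 test and the share of savings realized in years 1 and 2. It is a back-of-the-envelope sketch: it assumes the reported $45 800 total is exact rather than rounded, and the per-test average and year 3 remainder are derived here, not reported in the study.

```python
# Figures reported in the text above.
TOTAL_SAVINGS_USD = 45_800      # direct cost savings over 3 years
FEWER_FT4_TESTS = 5_179         # fT4 tests avoided over the same period
YEAR_1_SAVINGS_USD = 25_693
YEAR_2_SAVINGS_USD = 17_385

# Derived values (assumptions of this sketch, not study results).
average_savings_per_test = TOTAL_SAVINGS_USD / FEWER_FT4_TESTS
first_two_years = YEAR_1_SAVINGS_USD + YEAR_2_SAVINGS_USD
share_first_two_years = first_two_years / TOTAL_SAVINGS_USD
# Remainder attributable to year 3 if the reported total is exact.
year_3_remainder = TOTAL_SAVINGS_USD - first_two_years

print(f"Average direct saving per avoided test: ${average_savings_per_test:.2f}")
print(f"Share of savings in years 1-2: {share_first_two_years:.1%}")
print(f"Implied year 3 savings: ${year_3_remainder}")
```

Under these assumptions, roughly 94% of the savings fall in the first 2 years, which is consistent with direct savings tapering once co-ordering has already declined.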

There were some limitations to this improvement project. First, direct costs are calculated locally, and although the calculations would translate broadly to other organizations, the actual dollar values and thus the scale of cost savings may differ. Second, systems with weaker provider alignment may not achieve similar improvements. Third, data analytics may pose challenges because of (1) pulling data from different sources (ie, laboratory system versus EHR) and use of different analytics software; (2) defining how data are pulled (ie, units of service/billed tests versus cases/ordered tests); and (3) changes in testing methodology, reference range, and/or costs during a multiyear study.

In this study, orders placed by endocrinology providers were excluded from analysis based on the rationale that these individuals are experts who diagnose and treat complex thyroid disorders and are familiar with clinical practice guidelines and Choosing Wisely recommendations. Their patient population may include those with secondary or tertiary (central) hypothyroidism, thyroid hormone or thyrotropin resistance syndromes, or medication-induced hypothyroidism, in which monitoring and screening of fT4 and other thyroid function tests may be necessary. Previous studies have reported that test ordering patterns among endocrinologists vary with experience, with the least experienced having the highest proportion of potentially inappropriate test requests.29,30  This suggests that future investigations into ordering patterns among endocrinology providers may provide an opportunity for further improvements in thyroid function test utilization. Additional future interventions may also target redundant orders that are placed close together in time by different providers throughout the enterprise. Because clinical practice guidelines endorse a maximum TSH testing frequency of every 4 weeks, clinical decision support tools that define a minimum retesting interval of 4 weeks may be justified and have been used by other organizations.29,31 
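
A minimum retesting interval rule of the kind described above could be expressed as follows. This is a hedged sketch only: the function `retest_alert` and its alert wording are hypothetical and do not represent any EHR vendor's decision support API, and the 4-week default simply mirrors the interval named in the text.

```python
from datetime import date, timedelta
from typing import Optional

# Interval named in the text; exposed as a parameter so it can be changed.
MIN_RETEST_INTERVAL = timedelta(weeks=4)


def retest_alert(
    last_tsh_date: Optional[date],
    order_date: date,
    min_interval: timedelta = MIN_RETEST_INTERVAL,
) -> Optional[str]:
    """Return an advisory message if a TSH order falls inside the minimum
    retesting interval, or None if the order can proceed without an alert."""
    if last_tsh_date is None:
        # No previous TSH on record, so there is nothing to compare against.
        return None
    elapsed = order_date - last_tsh_date
    if elapsed < min_interval:
        days_remaining = (min_interval - elapsed).days
        return (
            f"TSH was last resulted on {last_tsh_date.isoformat()}; "
            f"the minimum retesting interval ends in {days_remaining} day(s)."
        )
    return None
```

For example, `retest_alert(date(2023, 1, 1), date(2023, 1, 15))` returns an advisory message, whereas an order placed exactly 28 days after the previous result returns `None`. Whether such an alert should be advisory or a hard stop is a local design decision that this sketch does not address.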

We thank Carlo Cornacchione and Shannon Niziolek for their assistance in implementing the reflex fT4 test in the laboratory.

Dr Warshawsky participated in the study design, analyzed data, drafted the initial manuscript, and critically reviewed and revised the manuscript; Drs Lemerman, Gunkelman, Davidson, Baccon, and Ms Love assisted with the study design, implemented electronic health record changes, analyzed data, and critically reviewed and revised the manuscript; Drs Mandalapu and Uli served as subject matter experts, participated in the study design, and critically reviewed and revised the manuscript for relevant content; Ms Patterson and Mr Gannon developed the data analysis plan, performed data collection, analyzed data, and critically reviewed and revised the manuscript; Ms Engler advised with quality improvement science tools, assisted with generation of the figures, and critically reviewed and revised the manuscript; Dr Bigham advised on the study design and critically reviewed and revised the manuscript for relevant content; and all authors approved the final manuscript as submitted and agree to be accountable for all aspects of the work.

FUNDING: No external funding.

CONFLICT OF INTEREST DISCLOSURES: The authors have indicated they have no potential conflicts of interest to disclose.

1. Berwick DM, Nolan TW, Whittington J. The triple aim: care, health, and cost. Health Aff (Millwood). 2008;27(3):759–769.
2. Cassel CK, Guest JA. Choosing wisely: helping physicians and patients make smart decisions about their care. JAMA. 2012;307(17):1801–1802.
3. Zhi M, Ding EL, Theisen-Toupal J, Whelan J, Arnaout R. The landscape of inappropriate laboratory testing: a 15-year meta-analysis. PLoS One. 2013;8(11):e78962.
4. Clinical and Laboratory Standards Institute. GP49. Developing and Managing a Medical Laboratory (Test) Utilization Management Program. Clinical and Laboratory Standards Institute; 2017.
5. Mold JW, Stein HF. The cascade effect in the clinical care of patients. N Engl J Med. 1986;314(8):512–514.
6. Optum. Laboratory benefit management. Available at: https://www.optum.com/business/health-plans/care-management-costs/laboratory-benefit-management.html. Accessed August 18, 2023.
7. CareSource. Care Source laboratory testing policies. Available at: avalonhcs.com/caresource-laboratory-testing-policies. Accessed August 18, 2023.
8. Choosing Wisely. American Society for Clinical Pathology. Thirty-five things physicians and patients should question. Available at: https://www.ascp.org/content/docs/default-source/get-involved-pdfs/istp_choosingwisely/ascp-35-things-list_2020_final.pdf. Accessed July 17, 2023.
9. Choosing Wisely Canada. Understand the gland. Available at: https://choosingwiselycanada.org/toolkit/understand-the-gland/. Accessed August 18, 2023.
10. American Family Physician. Choosing Wisely recommendations. Available at: https://www.aafp.org/pubs/afp/collections/choosing-wisely/355.html. Accessed July 21, 2023.
11. Rose SR, Wassner AJ, Wintergerst KA, et al; SECTION ON ENDOCRINOLOGY EXECUTIVE COMMITTEE; COUNCIL ON GENETICS EXECUTIVE COMMITTEE. Congenital hypothyroidism: screening and management. Pediatrics. 2023;151(1):e2022060420.
12. Ladenson PW, Singer PA, Ain KB, et al. American Thyroid Association guidelines for detection of thyroid dysfunction. Arch Intern Med. 2000;160(11):1573–1575.
13. Garber JR, Cobin RH, Gharib H, et al; American Association of Clinical Endocrinologists and American Thyroid Association Taskforce on Hypothyroidism in Adults. Clinical practice guidelines for hypothyroidism in adults: cosponsored by the American Association of Clinical Endocrinologists and the American Thyroid Association. Endocr Pract. 2012;18(6):988–1028.
14. van Trotsenburg P, Stoupa A, Léger J, et al. Congenital hypothyroidism: a 2020-2021 consensus guidelines update-An ENDO-European Reference Network Initiative Endorsed by the European Society for Pediatric Endocrinology and the European Society for Endocrinology. Thyroid. 2021;31(3):387–419.
15. Léger J, Olivieri A, Donaldson M, et al; ESPE-PES-SLEP-JSPE-APEG-APPES-ISPAE; Congenital Hypothyroidism Consensus Conference Group. European Society for Paediatric Endocrinology consensus guidelines on screening, diagnosis, and management of congenital hypothyroidism. J Clin Endocrinol Metab. 2014;99(2):363–384.
16. American Academy of Pediatrics AAP Section on Endocrinology and Committee on Genetics. American Academy of Pediatrics AAP Section on Endocrinology and Committee on Genetics, and American Thyroid Association Committee on Public Health: newborn screening for congenital hypothyroidism: recommended guidelines. Pediatrics. 1993;91(6):1203–1209.
17. Lin DC, Straseski JA, Schmidt RL; The Thyroid Benchmarking Group. Multi-center benchmark study reveals significant variation in thyroid testing in United States. Thyroid. 2017;27(10):1232–1245.
18. CareSource. Avalon.
19. BlueCross BlueShield of North Carolina. Corporate medical policy. Thyroid disease testing AHS – G2045. Available at: https://www.bluecrossnc.com/content/dam/bcbsnc/pdf/providers/policies-guidelines-codes/policies/commercial/laboratory/thyroid_disease_testing.pdf. Accessed August 12, 2022.
20. Barry C, Kaufman S, Feinstein D, et al. Optimization of the order menu in the electronic health record facilitates test patterns consistent with recommendations in the Choosing Wisely initiative. Am J Clin Pathol. 2020;153(1):94–98.
21. Taher J, Beriault DR, Yip D, Tahir S, Hicks LK, Gilmour JA. Reducing free thyroid hormone testing through multiple Plan-Do-Study-Act cycles. Clin Biochem. 2020;81:41–46.
22. Gilmour JA, Weisman A, Orlov S, et al. Promoting resource stewardship: Reducing inappropriate free thyroid hormone testing. J Eval Clin Pract. 2017;23(3):670–675.
23. Abitbol L, Tenedero CB, Sepiashvili L, Wasserman JD, Palmert MR. Routine T4 no more? Reducing excess thyroid hormone testing at a pediatric tertiary care hospital. J Pediatr. 2021;236:269–275.e1.
24. Strata. About. Available at: https://www.stratadecision.com. Accessed February 13, 2023.
25. Provost LP, Murray SK. The Health Care Data Guide, 2nd ed. Jossey-Bass; 2022.
26. Fliers E, Bianco AC, Langouche L, Boelen A. Thyroid function in critically ill patients. Lancet Diabetes Endocrinol. 2015;3(10):816–825.
27. Langouche L, Jacobs A, Van den Berghe G. Nonthyroidal illness syndrome across the ages. J Endocr Soc. 2019;3(12):2313–2325.
28. Stockigt JR. Guidelines for diagnosis and monitoring of thyroid disease: nonthyroidal illness. Clin Chem. 1996;42(1):188–192.
29. Gill J, Barakauskas VE, Thomas D, et al. Evaluation of thyroid test utilization through analysis of population-level data. Clin Chem Lab Med. 2017;55(12):1898–1906.
30. Miyakis S, Karamanof G, Liontos M, Mountokalakis TD. Factors contributing to inappropriate ordering of tests in an academic medical department and the effect of an educational feedback strategy. Postgrad Med J. 2006;82(974):823–829.
31. Sharma A, Salzmann M. The effect of automated test rejection on repeat requesting. J Clin Pathol. 2007;60(8):954–955.