Inter-rater Reliability and Validity of Good Pharmacy Practices Measures in Inspection of Public Sector Health Facility Pharmacies in Uganda

By: Brian Sekayombya, David Nahamya, Laura Garabedian, Morries Seru, Birna Trap
Publication: Journal of Pharmaceutical Policy and Practice, Jan. 2019; Vol. 12: 2. DOI:



The National Drug Authority (NDA) inspects and certifies private and public sector pharmacies in Uganda using an indicator-based inspection tool that measures adherence to good pharmacy practices (GPP). The tool's 67 measures cover four domains: premises, dispensing quality, stores management, and operating requirements. Although GPP measures are well recognized and used internationally, little is known about their validity and reliability. This study aimed to assess the validity of the GPP measures, defined as agreement between a gold-standard inspector and an NDA inspector, and their inter-rater reliability (IRR), defined as agreement among NDA inspectors.


We assessed validity and IRR using four teams of inspectors in eight government health facilities representing three levels of care. Each team comprised one central-level inspector, one district-level inspector, and one gold-standard inspector (i.e., a very experienced central-level inspector). Each team inspected two facilities, resulting in 24 total inspections (three inspectors at each of eight facilities). We calculated median validity and IRR for each GPP measure, overall, by indicator category (critical, major, or minor), by domain, by inspection decision (i.e., certified or not certified), and by whether the validity and IRR scores were adequate (i.e., score ≥ 75%).
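The calculation described above can be illustrated with a minimal sketch. It assumes that validity and IRR are simple percent agreement on pass/fail ratings, which the abstract does not state explicitly; the data, measure names, and function names are hypothetical and are not the study's actual analysis code.

```python
from statistics import median

ADEQUATE = 75.0  # adequacy threshold (%) given in the text

# Hypothetical pass/fail ratings per measure, one entry per facility.
ratings = {
    "measure_A": {"gold": [True, True, False, True],
                  "central": [True, True, False, True],
                  "district": [True, False, False, True]},
    "measure_B": {"gold": [True, False, True, True],
                  "central": [True, False, True, False],
                  "district": [False, False, False, True]},
    "measure_C": {"gold": [False, False, True, True],
                  "central": [False, False, True, True],
                  "district": [False, False, True, True]},
}

def percent_agreement(a, b):
    """Percentage of paired ratings that match."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("rating lists must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)

def score_measure(r):
    """Return (validity, irr) for one measure under the assumed definitions."""
    # Assumption: validity pools both NDA inspectors against the gold standard.
    validity = percent_agreement(r["gold"] * 2, r["central"] + r["district"])
    # Assumption: IRR is agreement between the central and district inspector.
    irr = percent_agreement(r["central"], r["district"])
    return validity, irr

def summarize(all_ratings):
    """Median scores across measures and counts of adequate measures."""
    scores = [score_measure(r) for r in all_ratings.values()]
    validity = [v for v, _ in scores]
    irr = [i for _, i in scores]
    return {
        "median_validity": median(validity),
        "median_irr": median(irr),
        "n_adequate_validity": sum(v >= ADEQUATE for v in validity),
        "n_adequate_irr": sum(i >= ADEQUATE for i in irr),
    }

if __name__ == "__main__":
    print(summarize(ratings))
```

Percent agreement is used here only because the results are reported as percentages; a chance-corrected statistic such as Cohen's kappa would be a different technique and would give different values.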


The median validity for all GPP measures was 69%, with 29 (43%) measures having adequate validity (≥75%). The median IRR for all GPP measures was 71%, with 31 (46%) measures having adequate IRR (≥75%). Validity did not differ significantly by indicator category, domain, or level of care. The proportion of measures with adequate (≥75%) IRR and validity was lowest for critical measures, which are key determinants of the certification decision, at 20% and 40%, respectively. District inspectors had lower median validity for critical indicators and premises, and higher validity for stores management. The validity of district inspectors' certification decisions was lower than that of central inspectors: across the eight facilities, the district inspector's decision agreed with the gold-standard inspector's in three, whereas the central inspector's decision agreed in all eight.


Our findings question the validity and reliability of many GPP inspection measures, particularly the critical measures that greatly affect the certification decision. This study demonstrates the need for assessments of, and interventions to improve, the validity and reproducibility of GPP measures and inspections.