Liu Liqun, Liu Xiaoyan, Liu Yang, Wan Xia, Yang Gonghuan. Building the standard operating procedure for improving health insurance data quality: Quality evaluation and improvement on the reimbursement records data of new rural cooperative medical system of a county in Henan province, 2013–2015[J]. Disease Surveillance, 2021, 36(3): 261-269. DOI: 10.3784/jbjc.202006120213

Building the standard operating procedure for improving health insurance data quality: Quality evaluation and improvement on the reimbursement records data of new rural cooperative medical system of a county in Henan province, 2013–2015

  •   Objective   To assess and improve the quality of the reimbursement records data of the new rural cooperative medical system (NRCMS) in a county in Henan province from 2013 to 2015, and to provide evidence for building a standard operating procedure for improving medical insurance data quality and for improving future data collection and collation.
      Methods   The research team checked the completeness and internal consistency of the raw data set, including checks for missing, abnormal and extreme values, comparison of the same information across sources, and logical consistency checks, and corrected the data set where possible. The team also restructured the data set, in particular by coding the diagnoses, and compiled a data dictionary. The team then checked the external consistency of the improved data set through descriptive statistics and comparisons of key variables such as diseases and costs.
      Results   In the data set, 27.11% of the diagnosis records were "none", missing or unreadable codes. The other records gave the diagnosis as unstructured text. After manual rearrangement and coding, the research team formed a "diagnoses - standardized diagnoses - ICD-10 codes" dictionary. After the text was converted to ICD-10 codes, up to 96.00% of the records showed clear diagnoses of diseases. Logical consistency between diagnosis and gender or age reached 98.67%. All the records had cost and date information. The comparison results indicated that the cost information in this data set could be considered accurate. A few people had two or more hospital visits with overlapping time periods, but the proportion was very small (0.59%). Only 1 348 records (0.27%) had no demographic information or had abnormal or extreme demographic information, although the recorded gender or age was sometimes inconsistent with that extracted from the identification card number. Overall, the research team considered the data quality of the studied NRCMS data set to be relatively high.
      Conclusion   Medical insurance data are an important resource for understanding population-based disease prevalence and disease-related payments. Data quality evaluation and improvement should be carried out following a standard operating procedure. A data dictionary could help improve data entry, so that recorded diagnoses are normalized according to international disease coding rules. This would allow medical insurance data to play a larger role in population health estimation and would support judgments on whether medical insurance payments are reasonable.
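
Three of the internal consistency checks named in the abstract (unusable diagnoses, overlapping hospital visits, and demographic fields that disagree with the identification card number) can be sketched as follows. This is only an illustrative sketch, not the research team's procedure. The record layout, field names and sample values are hypothetical. The identification card check assumes the 18-digit resident identity number format, in which digits 7 to 14 give the date of birth and an odd 17th digit indicates a male holder. The abstract does not state which format or which overlap rule the team used.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records; field names and values are made up for illustration
# and are NOT the NRCMS schema. The id_card values are synthetic.
RECORDS = [
    {"person": "A", "id_card": "410101198001011234", "sex": "M", "birth_year": 1980,
     "diagnosis": "hypertension", "admit": date(2013, 1, 5), "discharge": date(2013, 1, 12)},
    {"person": "A", "id_card": "410101198001011234", "sex": "M", "birth_year": 1980,
     "diagnosis": "none", "admit": date(2013, 1, 10), "discharge": date(2013, 1, 15)},
    {"person": "B", "id_card": "410101197506152345", "sex": "M", "birth_year": 1975,
     "diagnosis": "pneumonia", "admit": date(2014, 3, 1), "discharge": date(2014, 3, 4)},
]


def missing_diagnosis_share(records, placeholders=frozenset({"", "none"})):
    """Share of records whose diagnosis is empty or a placeholder such as 'none'."""
    missing = sum(
        1 for r in records
        if (r["diagnosis"] or "").strip().lower() in placeholders
    )
    return missing / len(records)


def persons_with_overlapping_stays(records):
    """People with at least two visits whose date ranges overlap.

    Assumption: a visit starting on the day the previous one ended
    is not counted as an overlap.
    """
    stays = defaultdict(list)
    for r in records:
        stays[r["person"]].append((r["admit"], r["discharge"]))
    flagged = set()
    for person, spans in stays.items():
        spans.sort()  # order each person's visits by admission date
        latest_discharge = spans[0][1]
        for admit, discharge in spans[1:]:
            if admit < latest_discharge:
                flagged.add(person)
            latest_discharge = max(latest_discharge, discharge)
    return flagged


def id_card_mismatches(records):
    """People whose recorded sex or birth year disagrees with the id card number."""
    mismatched = []
    for r in records:
        card = r["id_card"]
        if len(card) != 18 or not card[:17].isdigit():
            continue  # cannot be parsed under the assumed 18-digit format
        card_sex = "M" if int(card[16]) % 2 == 1 else "F"
        card_birth_year = int(card[6:10])
        if card_sex != r["sex"] or card_birth_year != r["birth_year"]:
            mismatched.append(r["person"])
    return mismatched
```

Each function returns a count or a list of flagged people rather than modifying the records, so that flagged cases can be reviewed manually before any correction, in line with the evaluate-then-improve order the abstract describes.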