Liu Liqun, Liu Xiaoyan, Liu Yang, Wan Xia, Yang Gonghuan. Building the standard operating procedure for improving health insurance data quality: Quality evaluation and improvement on the reimbursement records data of new rural cooperative medical system of a county in Henan province, 2013–2015[J]. Disease Surveillance, 2021, 36(3): 261-269. DOI: 10.3784/jbjc.202006120213

Building the standard operating procedure for improving health insurance data quality: Quality evaluation and improvement on the reimbursement records data of new rural cooperative medical system of a county in Henan province, 2013–2015

    Abstract:
      Objective   To assess and improve the quality of new rural cooperative medical system (NRCMS) data from a county in Henan province, 2013–2015, to develop a standard operating procedure for improving medical insurance data quality, and to inform data collection and collation in other regions.
      Methods   The research team first assessed the completeness and internal consistency of the raw data set through checks for missing, abnormal and extreme values, comparison of the same information across multiple sources, logical consistency checks and duplicate checks, and corrected or supplemented the data set where possible. The team then restructured the data set, in particular by coding the diagnoses, and compiled a data dictionary. Finally, the team assessed the external consistency of the improved data set through descriptive statistics and comparisons of key variables such as diseases and costs.
      Results   In the data set, 27.11% of the records had a diagnosis that was "none", blank or garbled text. The remaining records contained diagnoses as unstructured text. After manual sorting and coding, the research team built a "diagnosis - standardized diagnosis - ICD-10 code" dictionary. After coding, 96.00% of the diagnoses were found to be clearly specified, and the logical consistency between diagnosis and gender or age reached 98.67%. The cost and visit-time variables were 100% complete, and the cost information was largely accurate. A very small share of visits (0.59%) covered a period that overlapped with another visit by the same person. Only 1 348 records (0.27%) had missing, abnormal or extreme key demographic information, although the recorded gender or age was sometimes inconsistent with that extracted from the identification card number. Overall, the quality of the NRCMS data set from the studied county was relatively high.
      Conclusion   Medical insurance data are an important resource for understanding disease prevalence in the population and disease-related payments, and they should undergo standardized quality evaluation and improvement before analysis. Improving data-entry tools with technical aids such as a data dictionary, and standardizing the recording and coding of diagnoses according to ICD-10 coding rules, would allow medical insurance data to play a greater role in estimating population health and would help in judging whether medical insurance payments are reasonable.
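Two of the internal consistency checks named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the field names (`visit_id`, `person_id`, `admit`, `discharge`), the sample records and the rule that a visit overlaps when it starts before an earlier visit ends are all assumptions made for illustration. The only external fact relied on is the layout of the 18-digit resident identification number, in which characters 7 to 14 hold the birth date and the 17th digit is odd for males and even for females.

```python
from datetime import date


def sex_and_birth_from_id(id_number):
    """Derive (sex, birth date) from an 18-character resident ID number.

    Returns None if the number is malformed. The check character
    (18th position) is not verified in this sketch.
    """
    body = id_number[:17]
    if len(id_number) != 18 or not (body.isascii() and body.isdigit()):
        return None
    try:
        birth = date(int(id_number[6:10]), int(id_number[10:12]), int(id_number[12:14]))
    except ValueError:  # impossible calendar date such as month 13
        return None
    sex = "M" if int(id_number[16]) % 2 == 1 else "F"
    return sex, birth


def overlapping_visits(visits):
    """Return (earlier_id, later_id) pairs of visits by the same person that overlap.

    Assumed rule: a visit overlaps an earlier one if it starts strictly
    before the earlier one ends.
    """
    by_person = {}
    for visit in visits:
        by_person.setdefault(visit["person_id"], []).append(visit)

    pairs = []
    for person_visits in by_person.values():
        ordered = sorted(person_visits, key=lambda v: v["admit"])
        for i, earlier in enumerate(ordered):
            for later in ordered[i + 1:]:
                if later["admit"] >= earlier["discharge"]:
                    break  # sorted by start date, so no later visit can overlap
                pairs.append((earlier["visit_id"], later["visit_id"]))
    return pairs


# Hypothetical records, not taken from the study data.
visits = [
    {"visit_id": 1, "person_id": "A", "admit": date(2014, 1, 1), "discharge": date(2014, 1, 10)},
    {"visit_id": 2, "person_id": "A", "admit": date(2014, 1, 5), "discharge": date(2014, 1, 8)},
    {"visit_id": 3, "person_id": "A", "admit": date(2014, 2, 1), "discharge": date(2014, 2, 3)},
    {"visit_id": 4, "person_id": "B", "admit": date(2014, 1, 5), "discharge": date(2014, 1, 6)},
]
pairs = overlapping_visits(visits)
derived = sex_and_birth_from_id("110101199003071233")  # made-up number
```

Here `pairs` flags visits 1 and 2 of person A, and `derived` can be compared with the recorded gender and age to count inconsistent records.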

     
