Investigating Risk Factors for Cataract Using the Cerner Health Facts® Database

Background: A retrospective study was performed using the Cerner Health Facts® database, a HIPAA-compliant, de-identified database, to evaluate risk factors associated with cataract. Methods and Findings: Using ICD-9 codes, the study population was determined by selecting all patients in the database who visited an eye clinic. A data-driven approach was used to select variables, and odds ratio analysis was performed to determine the association of each risk factor with cataract formation. Odds ratio analysis indicated that 7 of 18 variables were associated with 20% or higher odds of developing cataract: gentamicin, hypertension, lipid metabolism disorder, obesity, steroids, type 2 diabetes, and lacrimal disorder. Conclusion: This is the first study to show a link between lacrimal disorders and the development of cataract.


Introduction
The human lens is located in the anterior part of the eye and contains a high concentration of crystallin proteins that focus the image of the visual field on the retina [1]. With little protein turnover, the crystallin proteins in the lens must last for the lifetime of an individual, and focusing ability and lens transparency decline with age. This decline results from the long-lived crystallin proteins being continuously challenged by endogenous and exogenous agents that drive them to undergo a wide variety of alterations and eventually form large light-scattering aggregates [2,3]. Clinically significant alteration and aggregation lead to the development of cataract. Cataract may also be congenital, due to genetic mutation, or result from trauma. However, most cataracts are age related. Cataracts can be broadly classified into nuclear, cortical, and subcapsular cataract based on anatomical location and pathophysiology. According to the World Health Organization, cataracts are the leading cause of blindness among people aged 40 and above worldwide [4]. The economic burden of cataract has been rising worldwide [5]; however, the actual cause of the increased prevalence of cataract is not known. Studies have suggested that it could be due either to an upward shift in the age demographics of the population or to increased exposure to risk factors that accelerate lens opacification. In the United States, the prevalence of cataracts rose by roughly 20%, from 20.5 million in 2000 to 24.4 million in 2010, and is expected to double by 2050 [6]. Consequently, the Medicare cost for treatment of cataract is continually increasing [7]. A study by the World Health Organization showed that delaying the onset of cataracts by 10 years would cut the number of people who need cataract surgery in half [8].
Therefore, understanding the risk factors associated with cataract would pave the way for better preventive measures, which may help to delay the onset of cataracts and subsequently decrease the financial burden of this disease.
A literature review revealed that the formation of light-scattering aggregates in the lens is accelerated by several factors, including lifestyle, educational status, medication, smoking, sunlight exposure, diabetes, body mass index, alcohol consumption, and many other risk factors [9,10]. It also suggested possible protective factors, including intake of antioxidants, higher physical activity, and use of certain medications and supplements such as aspirin, beta-carotene, and multivitamins [11,12]; however, the conclusions from a large number of studies are inconsistent [10]. Thus, a more robust approach that examines a large number of patients is needed to determine risk factors for the development of cataracts.
To determine risk factors for cataracts, data available in the Cerner Health Facts® database were used. The database captures and stores de-identified, longitudinal electronic health records (EHR), including data on patient demographics, encounters, diagnoses, medications, procedures, laboratory tests, hospital information, and billing. This database provided a unique opportunity to study more patient variables than previously possible. The aim of this study was to identify potential risk factors for the development of cataracts.

Data source
This study used the Cerner Health Facts® electronic database, which complies with the patient confidentiality requirements of the Health Insurance Portability and Accountability Act (HIPAA). Data in Health Facts are extracted directly from the electronic medical records (EMR) of hospitals with which Cerner has a data use agreement. Encounters may include pharmacy, clinical and microbiology laboratory, admission, and billing information from affiliated patient care locations. All admissions, medication orders and dispensing, laboratory orders, and specimens are date and time stamped, providing a temporal relationship between treatment patterns and clinical information. Cerner Corporation has established operating policies consistent with HIPAA to de-identify Health Facts data. The database contains unique clinical data on more than 48.9 million patients and uses an automated electronic medical record system to capture clinical events. Over 600 individual sites from 90 health systems contribute data to the database. The University of Missouri (MU) has agreements with Cerner to use these data for research purposes. The institutional review board (IRB) at the University of Missouri approved the study protocol.

Study cohort
All patients aged 35 and above who were diagnosed with an eye-related disease were included in this study to investigate risk factors for cataracts. Eye-related visits were determined using ICD-9 codes (361-379), and those with cataracts were then identified using ICD-9 ontology codes (366.01-366.04 and 366.10-366.19). Congenital and traumatic cataracts were not included in this study. Data were extracted on the resulting 947,059 patients.
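The code ranges above can be expressed as a small classification sketch. This is only an illustration, not the extraction logic actually run against Health Facts: it assumes diagnosis codes are stored as dotted strings such as "366.16", and the function names are hypothetical.

```python
def is_eye_related(code: str) -> bool:
    """True if the ICD-9 category (the part before the dot) lies in 361-379."""
    category = code.strip().split(".")[0]
    # Non-numeric categories (for example V or E codes) are never eye-related here.
    return category.isdigit() and 361 <= int(category) <= 379


def is_study_cataract(code: str) -> bool:
    """True for codes 366.01-366.04 or 366.10-366.19, the ranges named in the text."""
    category, _, sub = code.strip().partition(".")
    # Assumption: cataract codes carry exactly two digits after the dot.
    if category != "366" or len(sub) != 2 or not sub.isdigit():
        return False
    n = int(sub)
    return 1 <= n <= 4 or 10 <= n <= 19


def meets_age_criterion(age: int) -> bool:
    """Age 35 and above, as stated for the study cohort."""
    return age >= 35
```

Under these assumptions, a code such as 366.20 is eye-related but falls outside the two cataract ranges used to define cases, so a patient with only that code would stay in the comparison group.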

Inclusion/ exclusion criteria
Data processing was performed to ensure the quality of the data. Patients with incomplete demographic information were excluded from analysis, reducing the study population to 830,125 patients. Patients with none of the selected risk factors were also excluded, yielding a total of 699,680 unique patients for odds ratio analysis.
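The two exclusion steps can be sketched as a simple filter that also records the patient count after each step. The record layout and the demographic field names below are hypothetical, since the source does not list which demographic fields were required.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical field names; the source does not specify the demographic fields.
DEMOGRAPHIC_FIELDS = ("age", "gender", "race")


def apply_exclusions(
    patients: List[Dict], risk_factor_fields: Sequence[str]
) -> Tuple[List[Dict], Dict[str, int]]:
    """Apply both exclusion steps and report the count remaining after each."""
    counts = {"extracted": len(patients)}

    # Step 1: drop patients with any missing demographic field.
    complete = [
        p for p in patients
        if all(p.get(field) is not None for field in DEMOGRAPHIC_FIELDS)
    ]
    counts["complete_demographics"] = len(complete)

    # Step 2: drop patients who have none of the selected risk factors.
    analyzed = [
        p for p in complete
        if any(p.get(field) for field in risk_factor_fields)
    ]
    counts["with_risk_factor"] = len(analyzed)
    return analyzed, counts
```

In the study itself, the corresponding counts were 947,059 extracted, 830,125 with complete demographics, and 699,680 with at least one selected risk factor.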

Risk factor assessment
A data-driven approach was used to determine risk factors. The medication and diagnosis tables were summarized, and variables present in at least 4% of the samples were selected from each table. The 18 selected variables were hypertension, lipid metabolism disorder, type 1 diabetes, type 2 diabetes, glaucoma, obesity, ischemic heart disease, hypotension, atherosclerosis, lacrimal disorder, aspirin, multivitamins, steroids, atropine, bacitracin, gentamicin, allopurinol, and alcohol use.
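The 4% selection rule can be illustrated with a short sketch. It assumes the rule means that a variable was recorded for at least 4% of patients in the summarized table, which is one reading of the description above; the function and parameter names are hypothetical.

```python
from typing import Dict, List


def select_variables(
    counts: Dict[str, int], n_patients: int, min_fraction: float = 0.04
) -> List[str]:
    """Return the variables recorded for at least min_fraction of patients.

    counts maps a variable name (a diagnosis or medication) to the number
    of patients in whom it was recorded.
    """
    if n_patients <= 0:
        raise ValueError("n_patients must be positive")
    return sorted(
        name for name, n in counts.items() if n / n_patients >= min_fraction
    )
```

For example, with 1,000 patients, a variable recorded in 500 patients would be kept and one recorded in 10 patients would be dropped.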

Statistical analysis
The association between cataract and potential risk factors was assessed by univariate analysis. Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated using the SAS software package (version 9.4, SAS Institute Inc) to determine risk factors. Variables were considered risk factors when they were associated with at least 20% higher odds of cataract in patients with the variable than in patients without it.
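For readers unfamiliar with the calculation, a univariate odds ratio and its 95% confidence interval can be computed from a 2x2 table as sketched below. This uses the standard Wald interval on the log odds ratio; it is an assumption that this matches the interval SAS produced for the study, and the function name is hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.959964) -> Tuple[float, float, float]:
    """Odds ratio with a Wald confidence interval from a 2x2 table.

    a: exposed patients with cataract      b: exposed patients without cataract
    c: unexposed patients with cataract    d: unexposed patients without cataract
    z: normal quantile (1.959964 gives a 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this estimate")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for the Wald interval.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

On this reading, a variable with "20% or higher odds" corresponds to an odds ratio of at least 1.2, and a confidence interval that excludes 1 indicates that the association is unlikely to be due to chance alone.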

Results
A total of 699,680 patients who satisfied the inclusion/exclusion criteria of the study were identified from the Cerner Health Facts® database from 2000 to 2015 (Figure 1). The study group was 58% female (age range 35-93 years) and 42% male (age range 35-89 years) (Table 1). The majority of individuals were aged 60 years or older at the time of cataract diagnosis. Gender-based analysis showed that 19% of females and 18% of males were diagnosed with cataracts. By race, cataract was diagnosed in 20.2% of African-Americans, 19.7% of Pacific Islanders, 18.7% of Native Americans, 18.4% of Caucasians, 18.0% of Asians, 14% of Middle Eastern Indians, and 9.8% of Hispanics (Figure 2).

Discussion
This study used the Cerner Health Facts® database to investigate risk factors for cataract. Several other studies, ranging from case-control to population-based, have also attempted to investigate these factors [13][14][15][16][17][18]. West et al. performed a meta-analysis and addressed several risk factors including education, gender, smoking, alcohol, blood pressure/hypertension, ultraviolet radiation, and diabetes [10]. Older age is also widely reported as a vital risk factor [10,19,20]. Odds ratio analysis in this study identified 7 risk factors associated with 20% or higher odds of cataract. Of these 7 risk factors, 6 have also been identified in the literature: gentamicin, hypertension, lipid metabolism disorder, obesity, steroids, and type 2 diabetes. Most significantly, this study identified lacrimal disorders as a risk factor. This is the first report of lacrimal disorders being linked to cataract formation.
Hypertension is a well-known risk factor for cataract [10,21]. It is possible that blood pressure is a surrogate for the cataractogenic effects of certain anti-hypertensive medications [22]. Rim et al. reported the results of the Korean National Health and Nutrition Survey and found that hypercholesterolemia, a type of lipid metabolism disorder, is a risk factor for cataract development [23]. Many studies have shown that type 2 diabetes is associated with significantly higher odds of developing cataracts [19,20,24,25]. However, what causes the increased occurrence of cataract among diabetic patients is not known, whether it is the elevated sugar level alone or a subsequent complication of the disease. Biochemical analysis of cataractous lenses from patients with diabetes showed abnormalities in the levels of electrolytes, glucose, galactose, and glutathione, which can lead to hyperosmotic effects such as fiber cell swelling, vacuole formation, and lens opacification [26][27][28]. The consensus in the literature is that steroids are a risk factor for cataracts [10,29]. Although steroids are widely accepted as a risk factor, the mechanism behind steroid-induced cataract formation is not generally agreed upon. Studies have shown an association between gentamicin toxicity and vision loss [30]; however, its role in the development of cataracts is unknown. Overall, there are mixed views about obesity in the literature [31][32][33]. Our study found all 6 of these risk factors to be associated with 20% or higher odds of developing cataract.
This study is the first to show that lacrimal disorders have a strong positive association with cataract (OR 2.62, 95% CI 2.56-2.68). It is well known that lacrimal disorders are associated with dry eye syndrome. The lacrimal gland contributes multiple components to the tear film to maintain the health and transparency of the eye, and it may have a role in lens opacity as well. Further study is required to understand the biological consequences of lacrimal disorders and their association with the development of cataract. It is important for physicians to raise awareness among patients who have lacrimal disorders, as they may consequently be at higher risk of developing cataract. Routine follow-up and preventive measures will enable these patients to delay cataract or receive timely care.
Beyond these 7 risk factors, several others have been investigated, including gender, ethnicity, alcohol consumption, and multivitamin and aspirin use. Studies have found altered occurrence of lens opacities among menopausal women who have had hormonal therapy, suggesting a hormonal influence on lens opacification. This study showed a negligible difference in the prevalence of cataract between males (18%) and females (19%). The Age-Related Eye Disease Study (AREDS) reported that age, gender, education, smoking, and diabetes are strong risk factors for cataracts [19]. That study also reported that African Americans were at a higher risk of developing cataracts than Caucasians. The present study found an increased occurrence of cataract among African Americans compared with other ethnic groups. The race-based data reveal that Hispanic patients had the lowest prevalence of cataract among the study groups. The reason behind this trend is unknown, but it could reflect fewer eye clinic visits among the Hispanic population related to socioeconomic status. The literature has mixed results on alcohol's effect on cataract formation. Some studies affirm that light drinking is a protective factor [10], while others report that drinking is a risk factor [8,15,34]. West et al. reported that only heavy drinking is a risk factor for cataract development [10]. This study found that alcohol is associated with marginally higher odds of cataract formation. The underlying mechanism by which alcohol induces cataract is not clear. It has been suggested in the literature that conversion of alcohol to acetaldehyde is harmful because acetaldehyde can react with lens proteins and produce light-scattering changes [35]. Studies have shown that multivitamins offer a protective effect against cataract development [19,36,37], while another has shown no protective effect [38]. This study found that multivitamins are associated with lower odds of cataract development.
It is important to determine which components of these vitamins lead to the reduction in cataracts. Similarly, aspirin has mixed reports in the literature. A few studies have shown a protective effect [19] and others reported no effect [10]. This study found that aspirin is associated with significantly lower odds of cataract development. The mechanism of aspirin-mediated protection against cataract formation is not known; however, studies suggest that lowered plasma tryptophan levels and reduced aldose reductase activity may be involved [39]. Further, allopurinol use has been suggested as a possible risk factor for cataract formation [40]. A mechanism has also been proposed by which allopurinol could reduce the odds of cataract formation, through inhibition of the enzyme xanthine oxidase. This study shows no association between allopurinol and development of cataract. Studies have also shown that an excess of antioxidants is a risk factor for cataract [41][42][43].
While a retrospective analysis has many advantages, including a large population size, the ability to study more variables, and a constant database that others can analyze, this study has some limitations. The information available is limited to the data that the hospitals provide. Data are also limited to the institutions and geographical regions that Cerner's reporting hospitals serve. Patient histories in the database are incomplete: patients may have other conditions, or may have been diagnosed earlier, without this being recorded in the database. Longitudinal information also varies between patients. For instance, some patients may have 15 years of information while others have only 3 years of data. Another limitation is that the length of exposure and the frequency of drug use are not known. While these limitations are typical of a retrospective analysis, the conclusions are still significant.