Alternatives to animal experiments for eye irritation

5. Validation of alternatives to ocular irritancy tests

Worldwide, considerable resources have been invested in the development and validation of reliable alternatives to eye irritation tests. Many small validation studies were performed in the past, and several large validation programs have recently been completed. The latest studies were performed by EU/Home Office and COLIPA, and their goal was to provide the basis for a definitive conclusion regarding the replacement of animal experiments for ocular irritancy testing with alternative in vitro methods. This process has, however, been considerably more complicated than expected.

The EU/Home Office study

A recent study arranged by the EU Commission and the British Home Office investigated whether 9 alternative methods could replace the Draize eye irritation test (see table 5.1). No significant correlations between results from the in vitro methods and Draize test MMAS values were found for 59 test materials. In addition, no reliable predictions of Draize test results were obtained when the test substances were grouped into water-soluble chemicals (n = 30) and non-water-soluble substances (n = 18). Further, the alternative methods could not be used to identify severe irritants. The precision of the predictions of the Draize test results with the alternative methods was so low that the practical utility of the predictions was considered questionable. The only positive result was that relatively good correlations to in vivo eye irritation data were obtained with several in vitro methods for 12 surfactants (Balls et al., 1995).

Table 5.1
The BGA study

The outcome of the EU/Home Office study was not unexpected. A large German validation study on 136 mixed test substances concluded, for instance, that there was no significant correlation between in vivo eye irritation data and results from a cytotoxicity test with neutral red uptake. In contrast, the HET-CAM test was able to identify about 25% of the severe irritants (Spielmann et al., 1993).

A database on 200 test substances has recently been established. Results for 9 substances were excluded from further analysis because the quality of their in vitro data was unacceptable. In addition, results for 48 chemicals were excluded because original Draize eye test data were not available. Very precise individual rabbit eye irritancy data from Draize tests performed according to OECD guideline 405 for up to 21 days have, however, been published for more than 25% of the excluded chemicals (ECETOC, 1998), and it would be interesting to reinclude these substances in the BGA database.

Analysis of the database with the remaining chemicals showed that neither the cytotoxicity test nor the hen's egg test could identify R41 irritants with more than about 50% sensitivity, and an acceptable specificity of more than 80% could only be obtained with the HET-CAM test. For 129 of the 143 remaining chemicals, data from both types of in vitro tests were available, and a linear discriminant analysis was performed on combined endpoints from the two alternative methods. Using this procedure, a false-negative rate of 29% for R41 substances and a false-positive rate of 22% for other chemicals were obtained. The classification of R41 substances was, however, improved by taking into account the solubility of the chemicals in water and oil (Spielmann et al., 1996).
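The sensitivity, specificity, and false-negative and false-positive rates quoted for the BGA database are all derived from a 2x2 table of in vitro predictions against in vivo classifications. The following is a minimal sketch of that arithmetic, not the procedure used by the study authors. The function name and the counts in the usage example are hypothetical and chosen only for illustration; they are not data from the BGA study.

```python
def classification_rates(tp, fn, tn, fp):
    """Return rates for a binary in vitro classifier of severe (R41) irritants.

    tp: severe irritants correctly flagged by the in vitro test
    fn: severe irritants missed by the in vitro test
    tn: other chemicals correctly not flagged
    fp: other chemicals wrongly flagged as severe
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes need at least one chemical")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_negative_rate": 1.0 - sensitivity,
        "false_positive_rate": 1.0 - specificity,
    }


# Hypothetical counts, for illustration only (not BGA data).
rates = classification_rates(tp=15, fn=5, tn=80, fp=20)
for name, value in rates.items():
    print(f"{name}: {value:.0%}")
```

Note that the false-negative rate is the complement of the sensitivity and the false-positive rate is the complement of the specificity, so a test with about 50% sensitivity misses about half of the severe irritants.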
The CTFA study

The US Cosmetic, Toiletry, and Fragrance Association (CTFA) conducted a three-phase validation program on approximately 25 alternatives to rabbit eye irritation tests from 1990 to 1996. Low-volume eye test (LVET) MAS values were applied in the first two phases of the program, and Draize MAS values were used in the last phase. Albino rabbits are used in both tests, but in the LVET, 10 µl of the test substance is instilled onto the cornea, whereas 100 µl is instilled in the everted lower lid in the Draize test. Regression modelling was carried out on results from the tests showing the best in vitro/in vivo concordances in an initial analysis. The best fitting data transformations were in general two or three parameter logistic models.

The HET-CAM test, the neutral red release test, and the EYTEX™ test gave relatively good predictions for 10 hydroalcoholic formulations tested in phase I of the program (Gettings et al., 1996a), but none of the alternative tests were able to reliably predict the in vivo response to 18 oil-water emulsions in phase II (Gettings et al., 1998). Several tests did, however, give good predictions of Draize MAS values for 25 surfactant-based products in phase III (Gettings et al., 1996b). The variability of the HET-CAM test and the neutral red release test appeared to consistently exceed the variability of the in vivo tests. The other alternative tests were considerably more reproducible than the eye irritancy tests.

The IRAG evaluation

The results of the EU/Home Office study were in line with the outcome of an evaluation carried out in 1993 by a group of experts from various regulatory authorities (IRAG). IRAG performed an overall evaluation of existing data from a large number of in vitro assays for ocular irritation. Forty-one laboratories worldwide submitted 55 data sets, obtained with 23 in vitro methods on 9 to 133 test substances, to IRAG.
In the IRAG study, results from tests with chorioallantoic membranes of hen's eggs were in poor to moderate agreement with Draize test results for up to 93 mixed test substances. Using the HET-CAM test, good correlations to in vivo data were, however, obtained for surfactant-based substances and products, and the CAMVA test gave the best prediction for alcohol-based products (Spielmann et al., 1997). An assay with isolated rabbit eyes and a test with isolated bovine corneas (the BCOP test) were both considered to have potential for the identification of severe irritants, but no general prediction of Draize test results was obtained. Results from a test with isolated chicken eyes and an assay with isolated bovine lenses were both in good agreement with in vivo eye irritancy data, but the data sets of the two tests were too limited to allow a general evaluation (Chamberlain et al., 1997). IRAG considered various cytotoxicity tests to have potential for the prediction of Draize test results for water-soluble substances at normal pH values (Botham et al., 1997; Harbell et al., 1997). The EYTEX test showed a poor correlation to in vivo data for 454 cosmetic ingredients and formulations, but the predictions were good for individual groups of materials, e.g. petrochemicals and solvents. The bacterial Microtox assay was considered to be potentially useful for testing surfactant-based products. The SKIN2 ZK1200 tissue model was evaluated to be very useful in the screening of cosmetics and household products when the viability of the tissues was measured with the MTT test. The use of prostaglandin E2 as an irritation marker was evaluated to be problematic. Further development of the SKIN2 ZK1200 model was recommended, whereas the ZK1100 model with fibroblasts was not found to be suited for irritancy testing (Curren et al., 1997).
The JMHW/JCIA study

In 1991, the Japanese Ministry of Health and Welfare (JMHW) initiated a validation study together with the Japanese Cosmetics Industry Association (JCIA), national research institutes, universities and kit suppliers. A three-step validation study was initiated in 1993 and completed in 1996. Twelve alternative methods were evaluated using 38 cosmetic ingredients, and Draize tests were performed on the same lot of test substances.

Two SKIN2 models were used, both with an MTT test protocol. The ZK1100 fibroblast model was used in 6 to 8 laboratories, and the ZK1200 co-culture model (TEA) was used in 2 laboratories for the full set of test substances, and in 6 to 7 laboratories for 13 test substances. In the fibroblast model, the tissues were submerged in the culture medium and exposed to test substances dissolved in the medium. In the TEA model, the reference substances were applied to the surface of the epithelium, but the protocol differed from the dosing regime used in the COLIPA study, since the test substances were dissolved or suspended in the culture medium at 10%. The t50 values were derived from MTT time-response graphs.

High interlaboratory variability was observed with both tissue models, with an average CV of 44.5% (n=30) for the data obtained with the ZK1100 model and an average CV of 61.9% (n=9) for exact t50 values obtained with the ZK1200 model in 6 to 7 laboratories. There was, however, total agreement between the 2 laboratories testing the full set of substances on establishing cut-off values for 17 compounds, and a very good correlation (r=0.84) was obtained by reanalysis of log-transformed t50 values for 16 of the remaining test compounds with exact t50 values obtained in the 2 laboratories. Relatively poor linear correlations to Draize test MAS values were obtained with in vitro data from both the ZK1100 model (r=0.71) and the ZK1200 model (r=0.63) (Kurishita et al., 1999).
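The interlaboratory CVs reported above summarise, for each test substance, how much the results from different laboratories scatter around their mean, and the "average CV" is the mean of these per-substance values. A minimal sketch of that calculation is given below. It assumes the sample standard deviation (n-1 denominator) is used, which the source does not state, and the function names and t50 values are hypothetical illustrations rather than data from the study.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent: 100 * SD / mean.

    Assumption: sample standard deviation (n-1 denominator).
    """
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * stdev(values) / m


def average_cv(results_by_substance):
    """Mean of the per-substance interlaboratory CVs."""
    return mean(cv_percent(v) for v in results_by_substance.values())


# Hypothetical t50 values (minutes) from three laboratories per substance.
t50_by_substance = {
    "substance_A": [10.0, 12.0, 14.0],
    "substance_B": [2.0, 4.0, 6.0],
}
print(f"average interlaboratory CV: {average_cv(t50_by_substance):.1f}%")
```

Because the CV is scaled by the mean, substances with very short t50 values can dominate the average even when the absolute differences between laboratories are small.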
A reanalysis of the data obtained with the ZK1200 model in the two laboratories testing the full set of test substances was performed in the present study, based on the prediction model used in the COLIPA study. This improved the prediction of the in vivo results, and relatively good correlations (r>0.78) were obtained between observed and predicted Draize test MMAS values. Exclusion of test substances incompatible with the culture media (acids, alkalis and alcohols of low molecular weight) considerably improved the in vitro/in vivo correlation obtained with the ZK1200 model (r=0.84) (Ohno et al., 1999).

A MATREX tissue model, consisting of human fibroblasts grown in a collagen lattice, was also used in the study. Test substances were applied neat to the surface of the tissue for 24 hours, and the concentrations causing a 50% decrease in the MTT response were determined (EC50 values). Alternatively, a MATREX score indicating the lowest concentration reducing the viability by 20-80% was used. A total of 12 laboratories used the model, but the full set of reference substances was only tested in 3 laboratories. The interlaboratory reproducibility was considerably better using MATREX scores (CV=9.6%, n=39) than using EC50 values (CV=34.6%, n=33) (Ohno et al., 1999). Similar correlations to Draize MAS values (r=0.67) were obtained with EC50 values and with MATREX scores (Ohuchi et al., 1999). It appears to be possible to improve the predictive ability of the assay considerably if a non-linear logistic prediction model is developed.

A poor correlation (r=0.31) to Draize MAS values was obtained with the EYTEX test (Matsukawa et al., 1999). In addition, moderate in vitro/in vivo correlations were obtained with a test based on measurements of denaturation of isolated bovine haemoglobin, and the interlaboratory CVs exceeded 240%. An excellent correlation to Draize MAS values (r=0.91) was reported using 50% denaturation as an endpoint (Hatao et al., 1999).
The correlation was, however, based on tests of 8 substances only, and it appeared to be due to clustering of the data. Large interlaboratory variations (CVs>50%) were also observed with various hen's egg tests, and moderate correlations to Draize MAS values were obtained with the CAM trypan blue absorption test (r=0.69, n=52) and the HET-CAM test (r=0.72, n=55) (Hagino et al., 1999). The predictive ability of the HET-CAM test may be considerably improved if a non-linear prediction model is applied. The results from various cytotoxicity tests were moderately to well correlated to Draize MAS values: normal rabbit corneal cells (r=0.53, n=28), the red blood cell haemolysis test (r=0.63, n=17), mammalian cell lines (r>0.71, n=29), and rabbit corneal SIRC cells (r>0.81, n=29-30). For these tests, the interlaboratory reproducibility was acceptable, with CVs ranging from 24% to 37% (Ohno et al., 1999).

The BCOP workshop, 1997

The predictive ability of the BCOP test has recently been evaluated by a working group of researchers from laboratories with extensive in-house experience with the assay. A database of in vitro results on more than 200 test substances has shown concordances to Draize irritancy classes of 80-85%, and the assay has a good reproducibility (Sina and Gautheron, 1998). Tests of a large number of positive controls have shown an excellent intralaboratory reproducibility, with CVs of total BCOP scores ranging from 12% to 16% (Harbell and Curren, 1998). The fluorescein leakage test with MDCK cells has demonstrated a very poor ability to discriminate between Draize test irritancy classes, and measurements of fluorescein leakage through the corneas have been shown to be relatively non-predictive of ocular irritancy (Sina and Gautheron, 1998).

The COLIPA study

The results of the COLIPA validation study on alternatives to eye irritation tests confirmed the results of the previous studies.
Most of the in vitro methods used in the COLIPA study were not suited for the prediction of acute eye irritation caused by mixed ingredients and products (Brantom et al., 1997). This is in line with the conclusion of the EU/Home Office study, where none of the alternative methods used were found to be promising candidates for replacement of the Draize test (Balls et al., 1995). In the COLIPA study, most in vitro methods appeared to be suited for the prediction of ocular irritation caused by cosmetic ingredients, in particular surfactant-based substances. This was also the case in the EU/Home Office study and in several earlier validation studies (Rasmussen, 1993). In the COLIPA study, tests with chorioallantoic membranes of hen's eggs had a very poor reproducibility, whereas the other in vitro assays had relatively good interlaboratory reproducibilities.

The most important result of the COLIPA study was that an in vitro model, SKIN2 ZK1200, was demonstrated to give a very good prediction of a broad spectrum of eye irritancy data. The SKIN2 ZK1200 model was the only alternative method that was able to fully meet the criteria on the reproduction of the prediction model used in the COLIPA study. In addition, the interlaboratory reproducibility of the SKIN2 ZK1200 method was evaluated as satisfactory based on results obtained in 3 laboratories (Southee et al., 1999), and the method is, like other alternative tests, considerably more reproducible than the Draize test. The main outcome of the COLIPA study was that, for the first time, an alternative test showed a good potential for replacing animal experiments for acute eye irritation in a large blind validation study with mixed chemicals and products. The SKIN2 ZK1200 method has also been shown to be useful in the prediction of recovery from ocular irritation in a preliminary study (Espersen et al., 1997). For this reason, the method can be considered to show promise as a full replacement of the Draize test.
The manufacture and sale of the SKIN2 models ceased shortly after the COLIPA study was completed. This does not, however, make the findings with the systems insignificant, since general knowledge on the prospects for use of tissue equivalent models has been gained.

Alternative ocular tissue models

Other available eye models with tissues of keratinocytes grown on microporous membranes include an EpiOcular™ model and a REC model, both of which have shown promise as tests that may replace the Draize ocular irritancy test. In both models, substances are applied neat to the surface of the tissues, and the endpoints used are based on time-response relations obtained with the MTT assay.

Using the EpiOcular™ model, t50 values for 28 chemicals from the ECETOC databank were very well correlated (r=0.90) to Draize rabbit eye scores. Using a prediction model developed from this study, a good correlation (r=0.87) between predicted Draize scores and actual Draize scores of 41 finished products was obtained (Sheasgreen et al., 1996). In addition, relatively good concordance was found between Draize test classifications and t50 values obtained in the EpiOcular™ model with 43 samples including liquids, powders and gels (Stern et al., 1998). A two parameter logistic prediction model for MTT data obtained with EpiOcular™ tissues has recently been developed based on in vitro/in vivo data for 19 water-soluble chemicals and 41 finished products. A good correlation (r=0.90) was obtained between predicted and observed Draize MMAS values. Using the prediction model, a good prediction of Draize MMAS values (r=0.89) was later obtained for 11 finished products. The reproducibility of the MTT test with the EpiOcular™ model appears to be good. Based on tests of 132 samples, average CVs of approximately 5% were obtained with negative controls, and average CVs of approximately 25% were obtained with a positive control (0.3% Triton X-100) (Klausner et al., 1999).
The REC model closely resembles the EpiOcular™ model, and a good correlation (r=0.89) has recently been found between MTT test data obtained with the REC model and Draize test MMAS values for 40 cosmetic formulations covering a broad spectrum of the scoring scale. In addition, acceptable CVs were found in a reproducibility study with 1% SLS (n=12, CV=18%) and a surfactant-based product (n=15, CV=24%) (Doucet et al., 1998).
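Several of the studies above report that a non-linear, two parameter logistic prediction model relates MTT t50 values to Draize scores better than a linear one, and judge the model by the correlation (r) between predicted and observed scores. The sketch below illustrates the general idea only. The functional form, the parameter names a and b, the parameter values and the data are all assumptions made for illustration; the source does not give the equation or the fitted parameters. The upper limit of 110 is the maximum of the Draize scoring scale.

```python
import math


def predict_mmas(t50, a, b, max_score=110.0):
    """Assumed two-parameter logistic form: score = max / (1 + (t50 / a) ** b).

    a: t50 at which the predicted score is half of max_score (hypothetical)
    b: steepness of the curve (hypothetical)
    Short t50 values (fast loss of viability) give high predicted scores.
    """
    if t50 <= 0 or a <= 0:
        raise ValueError("t50 and a must be positive")
    return max_score / (1.0 + (t50 / a) ** b)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical t50 values (minutes) and observed scores, for illustration only.
t50_values = [2.0, 10.0, 30.0, 120.0]
observed = [95.0, 60.0, 25.0, 5.0]
predicted = [predict_mmas(t, a=12.0, b=1.5) for t in t50_values]
print(f"r(predicted, observed) = {pearson_r(predicted, observed):.2f}")
```

In practice the two parameters would be estimated by non-linear regression on a training set and then checked against an independent set of products, as described for the 11 finished products above.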