Given the rapid proliferation of nanomaterial varieties and ongoing concerns that these novel materials may pose emerging occupational and environmental risks, combined with the possibility that each variety poses a distinct risk arising from its unique combination of material properties, researchers and regulators have been searching for methods to identify hazards and prioritize materials for further testing. While several screening tests and toxic risk models have been proposed, most have relied on cellular-level in vitro data. This foundation allows answers to be developed quickly for any material, but it remains unclear how this information translates to more realistic exposure scenarios in people or other complex animals. A quantitative evaluation of these models, or at least of the input variables to these models, in the context of rodent or human health outcomes is necessary before their classifications can be trusted for the purposes of risk prioritization. This paper presents the results of a machine-learning-enabled meta-analysis of animal studies that attempts to use significant descriptors from in vitro nanomaterial risk models to predict the relative toxicity of nanomaterials following pulmonary exposures in rodents. A series of highly non-linear random forest models (each an ensemble of 1,000 regression trees) was created to assess the maximum possible information value of the in vitro risk models and related methods of describing nanomaterial variants and their toxicity in rat and mouse experiments. Chemical descriptors and quantitative chemical property measurements such as bond strength, surface charge, and dissolution potential, while important in describing observed differences in in vitro experiments, provided little indication of the relative magnitude of inflammation in rodents (explained variance was less than 32%).
Important factors in predicting rodent pulmonary inflammation, such as primary particle size and chemical type, demonstrate that there are critical differences between in vitro and in vivo toxicity assays that cannot be captured by a series of in vitro tests alone. Predictive models relying primarily on these descriptors explained more than 62% of the variance of the short-term in vivo toxicity results. This means that existing proposed nanomaterial toxicity screening methods are inadequate as they currently stand: either the community must accept slower and more expensive animal testing to evaluate nanomaterial risks, or further conceptual development of improved alternative in vitro screening methodologies is necessary before manufacturers and regulators can rely on them to promote safer use of nanotechnology.
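
As an illustration of how a random forest can be used to compare the information value of two descriptor sets by their explained variance, the following is a minimal sketch in standard-library Python. It is not the authors' implementation. The data are synthetic, the names `size`, `unrelated`, `build_tree`, and `oob_explained_variance` are hypothetical, and the forest is deliberately simplified: 50 shallow trees rather than 1,000, with one randomly chosen feature per split. Explained variance is computed as 1 - SS_res / SS_tot on out-of-bag predictions, which is an assumption about the scoring and may differ from the study's procedure.

```python
import random


def avg(values):
    return sum(values) / len(values)


def explained_variance(y_true, y_pred):
    """Fraction of variance explained: 1 - SS_res / SS_tot."""
    y_bar = avg(y_true)
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1.0 - ss_res / ss_tot


def build_tree(X, y, rng, depth=0, max_depth=4, min_leaf=3):
    """Grow a small regression tree; each split tries one random feature."""
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return avg(y)  # leaf: mean response of the samples in this node
    feat = rng.randrange(len(X[0]))
    values = sorted({row[feat] for row in X})
    best = None
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [t for row, t in zip(X, y) if row[feat] <= thr]
        right = [t for row, t in zip(X, y) if row[feat] > thr]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        ml, mr = avg(left), avg(right)
        sse = sum((t - ml) ** 2 for t in left) + sum((t - mr) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, thr)
    if best is None:
        return avg(y)  # no admissible split on this feature
    thr = best[1]
    lt = [(row, t) for row, t in zip(X, y) if row[feat] <= thr]
    rt = [(row, t) for row, t in zip(X, y) if row[feat] > thr]
    return (
        feat,
        thr,
        build_tree([r for r, _ in lt], [t for _, t in lt], rng, depth + 1, max_depth, min_leaf),
        build_tree([r for r, _ in rt], [t for _, t in rt], rng, depth + 1, max_depth, min_leaf),
    )


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf mean."""
    while isinstance(tree, tuple):
        feat, thr, left, right = tree
        tree = left if row[feat] <= thr else right
    return tree


def oob_explained_variance(X, y, n_trees=50, seed=0):
    """Bagged trees scored only on samples left out of each bootstrap."""
    rng = random.Random(seed)
    n = len(y)
    sums, counts = [0.0] * n, [0] * n
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap resample
        tree = build_tree([X[i] for i in idx], [y[i] for i in idx], rng)
        for i in set(range(n)) - set(idx):  # out-of-bag samples
            sums[i] += predict(tree, X[i])
            counts[i] += 1
    kept = [i for i in range(n) if counts[i] > 0]
    return explained_variance([y[i] for i in kept], [sums[i] / counts[i] for i in kept])


# Synthetic data (an assumption for illustration, not study data):
# the response depends non-linearly on `size` and not at all on `unrelated`.
data_rng = random.Random(1)
size = [data_rng.random() for _ in range(80)]
unrelated = [data_rng.random() for _ in range(80)]
response = [4 * s ** 2 + data_rng.gauss(0, 0.2) for s in size]

ev_informative = oob_explained_variance([[s] for s in size], response)
ev_noise = oob_explained_variance([[u] for u in unrelated], response)
print(f"explained variance, informative descriptor: {ev_informative:.2f}")
print(f"explained variance, unrelated descriptor:   {ev_noise:.2f}")
```

Scoring on out-of-bag samples keeps a flexible model from being credited for memorizing its training data, so a descriptor set that carries no information about the response scores near or below zero rather than appearing predictive.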