Mitigating Model Risk in AI: Advancing an MRM Framework for AI/ML Models at Financial Institutions

Financial institutions’ growing use of artificial intelligence (AI) and machine learning (ML) models introduces challenges that require significant updates to existing model risk management (MRM) frameworks. This paper compares key AI/ML risks and risk cultures between Silicon Valley and the financial services industry, exploring the nature of AI/ML models and the nuances of model risk. It also provides practical guidelines and best practices for firms integrating AI/ML capabilities into their MRM frameworks.

Crisil Integral IQ contributors

Prem Neupane – Head of Quantitative Solutions, Americas, Crisil Integral IQ

Shashi Sharma – Director, Crisil Integral IQ

Executive summary

Financial institutions are increasingly leveraging artificial intelligence (AI)/machine learning (ML) models to enhance decision-making, customer insights and operational efficiencies. This shift from traditional to AI-driven models introduces unique challenges that require significant updates to existing model risk management (MRM) frameworks. This paper compares key AI/ML risks and risk cultures between Silicon Valley, the purveyor of AI/ML technology, and the financial services industry, a regulated sector with a mature model risk practice. We will first explore the nature of AI/ML models and the nuances of model risk as applicable to financial services, and follow this with a discussion of AI/ML risk across various stages of the model lifecycle. Our goal is to provide practical guidelines to integrate AI/ML capabilities into MRM frameworks, drawing on best practices from larger institutions while considering the resource constraints that are typical of smaller institutions.

Key ideas covered in this paper include:

Understanding the shift in model risk. Unlike traditional models, AI/ML models are often complex, adaptive and data-intensive, requiring specialized risk controls to manage such issues as data bias, transparency and model drift. This shift necessitates enhancements to traditional MRM frameworks to account for AI/ML’s unique risks.

Bridging two cultures – Silicon Valley meets financial services. The financial services industry, with more than a decade of experience in MRM under such regulatory guidelines as SR 11-7, faces a new challenge of integrating cutting-edge AI/ML advances from Silicon Valley, where MRM was just an afterthought until recently. Bridging this gap requires firms to adopt Silicon Valley’s technical innovations while embedding the discipline and rigor of financial industry MRM practices to manage risks effectively.

Validation challenges with increasing model complexity. Independent model validation, a key pillar of MRM, faces challenges in validating AI/ML models that are not typically seen in traditional modeling. Moreover, it is not feasible to rigorously validate all AI/ML models, especially large language models that do not have clear boundaries on inputs and outputs. AI/ML models often use diverse, unstructured data and employ complex methodologies, leading to ‘black box’ challenges that limit interpretability. Validators must therefore be experts in assessing input quality, relevance and biases that can affect outcomes. They must also have a deep understanding of the various AI/ML modeling frameworks and their respective weaknesses – and the creativity to design tests for the unexpected ways these models might fail. Finally, the adaptive nature of AI/ML models means that MRM must set up robust monitoring to detect drift and ensure that models remain accurate and compliant over time.

Rise of ModelOps – standardization, automation and integration. Larger financial institutions are increasingly seeking to benefit from the industrialization of MRM processes, including the streamlining of model validation, monitoring and documentation. Practical steps include utilizing ModelOps platforms, implementing scalable workflows and automating routine validation tasks to increase efficiency and improve oversight. The proliferation of AI/ML models has accelerated this industry trend. Automation and industrialization of these processes, together with the development of specialized analytical tools for input data processing, feature engineering and post hoc explanation of model outputs, have become prerequisites for the adoption of AI/ML.

Meeting evolving compliance expectations. Regulatory bodies across jurisdictions recognize the risks of AI/ML in financial services, with emerging guidelines focused on transparency, explainability, fairness and consumer protection. This paper highlights regulatory trends in North America, the UK and the European Union (EU).

Tailoring MRM frameworks to financial institutions. We present an analysis of 10 AI/ML risks that have received attention in our client engagements, industry discussions, regulatory guidelines and academic literature. We consider these risks in light of AI/ML use cases at financial institutions, contrasting them with those of Silicon Valley. The 10 risks discussed in this paper are accountability; bias and fairness; purpose limitation; explainability and interpretability; third-party dependency; data integrity, protection and privacy; transparency and robustness; ethical and legal compliance; scalability and performance; and human-AI interaction.

Discussions are dominated by risks that are more likely to manifest in broad Silicon Valley mass-market use cases. We argue that not all these risks are of equal concern for financial institutions in their more limited use cases for AI/ML models. We present arguments that four out of the 10 deserve more MRM attention. Specifically, we recommend that financial institutions:

  • Adopt purpose limitation practices. Financial institutions should establish policies that require explicit approval for each use case to ensure fit-for-purpose deployment. Models should be implemented only within their intended scope, with ancillary uses subject to heightened scrutiny and monitoring.
  • Prioritize explainability. Financial institutions should favor inherently interpretable models over complex black-box solutions, even at the cost of marginal performance loss. They should utilize analytical tools like SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME) but recognize their limitations in regulatory compliance.
  • Mitigate third-party dependency. Institutions (especially smaller ones) susceptible to vendor lock-in and reduced oversight should develop robust due diligence procedures that focus on transparency in model design, training data and potential fourth-party dependency.
  • Strengthen data protection. Institutions should establish strict controls to safeguard sensitive customer data, limiting its use to permissible purposes and raising awareness of the risk of ‘hidden learning’ by AI/ML models.

While financial institutions should not overlook the remaining six risks, we believe that existing MRM frameworks, strengthened in the aftermath of the Great Recession, are better equipped to mitigate these risks.

By proactively enhancing MRM frameworks to accommodate AI/ML model risk, we believe that financial institutions can not only mitigate emerging risks but also improve productivity, meet regulatory standards and build a foundation for realizing the promise of AI/ML in a controlled and sustainable manner.

This paper is organized into two distinct sections to guide the reader through the evolving landscape of AI/ML adoption and its implications for MRM. In the first section, we explore the key trends shaping the use of AI/ML in the financial services industry: the proliferation of model use cases, the unique complexities of AI/ML compared to traditional models, the shift in demands and expertise between model developers and validators, the industrialization of MRM processes and the evolving regulatory landscape. These trends provide a foundational understanding of the opportunities and challenges for MRM posed by AI/ML.

In the second section, we delve deeper into specific AI/ML risks, examining how they manifest across different stages of the model lifecycle. We present our views on how these risk factors can be prioritized for MRM purposes, tailoring our suggestions to the constraints and strengths of financial institutions.

Section I: Key trends and the implications of the evolving AI/ML adoption landscape for MRM

Proliferation of model use for business decisions

The financial services industry has seen a rapid expansion in model use cases, driven by the need for deeper insights, faster decision-making and improved operational efficiency. From credit risk assessment to fraud detection and customer personalization, financial institutions are increasingly adopting models to enhance and automate critical decision-making. Models will also play a crucial role in automating adjacent tasks as the regulatory environment becomes stricter and the volume of work increases, whether that means handling more reporting requirements or meeting higher documentation standards. As the number of model applications increases, more firms will start to use sophisticated analytics in areas where manual processes and qualitative analysis were once the norm.

Key factors behind this proliferation are the democratization of model development through low-code and no-code platforms, the ubiquity of data capture and storage, and the availability of open-source statistical packages for AI/ML model development, which enable professionals with domain expertise but limited technical skills to become ‘citizen developers’. These platforms provide intuitive, visual interfaces that allow non-technical users to build and deploy models with minimal coding, turning loan officers, compliance specialists and customer service managers into model creators. For financial institutions, this shift promises productivity gains by empowering staff to build tailored solutions, reducing reliance on traditional research and development teams, and enabling faster responses to business needs. It also allows firms of all sizes to tap into data-driven insights, making advanced analytics accessible to smaller institutions that may lack large teams of econometricians and model developers.

The democratization of model development is opening up areas previously resistant to data-led decision-making. For example, such qualitative functions as customer service and compliance auditing are now increasingly using sentiment analysis models to gauge customer satisfaction or flag potential compliance issues. Similarly, relationship managers can use data models to predict client needs, enhancing customer experience through more personalized service. In HR, models are helping to optimize workforce management, predicting attrition and even improving diversity by flagging hiring biases. These areas were once driven primarily by intuition and experience, but the confluence of multiple technological advances is transforming them into data-rich environments, expanding the adoption of models across financial services.

Understanding the shift in model risk

AI/ML models differ fundamentally from traditional models in their structure, adaptability and complexity (see Table 1). Traditional models, such as linear regression or decision trees, are often based on predefined mathematical relationships that describe an underlying economic theory or pattern of human behavior. They are also interpretable, with a clear and transparent set of rules that map inputs to outputs. AI/ML models, particularly those using deep learning or complex algorithms, operate differently by learning patterns from data rather than relying on fixed rules or explicit programming. This learning process allows AI/ML models to identify intricate relationships within large, high-dimensional datasets, making them powerful tools for such tasks as image recognition, language processing and personalized recommendations. However, this also introduces challenges in terms of transparency and interpretability, as it can be difficult to fully understand how an AI/ML model arrives at its conclusions.

Another key difference is that AI/ML models are dynamic and data-driven, whereas traditional models are often static and theory-driven. AI/ML models learn and evolve from new data inputs, continuously updating their parameters to improve accuracy over time. This ability to adapt to changing conditions in ways traditional models cannot is a key advantage. However, this dynamic nature also makes them more susceptible to issues like data drift, where shifts in the underlying data distribution can lead to performance degradation over time. Consequently, AI/ML models require ongoing monitoring and retraining to maintain their effectiveness, adding layers of complexity to their lifecycle management, compared to traditional models.

Furthermore, AI/ML models require specialized techniques to manage unique risks, such as bias and ‘black box’ behavior. Unlike traditional models, where outputs and decisions are generally explainable, AI/ML models, especially deep learning models, often lack transparency, so tracing the model’s logic can be a challenge. This opacity, combined with their reliance on large volumes of data, increases the risk that they will inadvertently introduce biases from the data into their decisions, which can lead to unfair or unethical outcomes. Managing these risks requires advanced validation techniques, such as explainability tools and fairness assessments, which go beyond traditional model validation processes. Overall, their adaptability, opacity and dependence on data make AI/ML models fundamentally different from traditional models, necessitating enhanced oversight and specialized risk management practices.

Bridging two cultures – Silicon Valley meets financial services

The convergence of Silicon Valley’s technical ingenuity and the financial services industry’s decade-long expertise in MRM presents both an opportunity and a challenge (see Figure 1). The financial services industry, guided by such frameworks as SR 11-7, has developed a rigorous approach to managing model risk, with strong practices in validation, monitoring and governance. In contrast, Silicon Valley’s focus has traditionally been on speed, scalability and innovation, often at the expense of established risk management disciplines. As financial institutions increasingly integrate AI/ML models pioneered in Silicon Valley, they must reconcile these two worlds, adopting cutting-edge technologies while embedding the rigor needed to safeguard against the unique risks of financial applications.

A critical distinction lies in the scope and scale of impact. Silicon Valley’s AI innovations, deployed in mass-market applications like social media, autonomous vehicles and healthcare, can carry wide-ranging societal consequences. Such issues as algorithmic bias, data privacy violations and misinformation have the potential to affect millions globally, creating ethical and legal challenges. Financial institutions, by contrast, often deploy AI/ML models in more confined use cases, particularly in the banking book, where the primary risks are institution-specific. A poorly managed credit risk model, for instance, might lead to bad loans or localized financial losses but is unlikely to trigger systemic consequences. The challenge for financial institutions is to borrow Silicon Valley’s agility while maintaining the discipline required to mitigate the risks unique to their domain.

Certain financial applications, particularly on the trading side of banking, present risks that could rival those faced by Silicon Valley in terms of societal consequences. The use of AI/ML in trading strategies introduces the potential for an ‘arms race’ in algorithmic trading, whereby competing algorithms manipulate or destabilize financial markets. Catastrophic scenarios could arise from unanticipated feedback loops, market manipulation or systemic failures caused by AI/ML-based high-frequency trading algorithms. The speed and complexity of these systems can make it difficult to detect and contain such risks in real time, with implications that extend far beyond individual institutions to the global economy. From the perspective of smaller financial institutions, this potential for exponential risk is not as relevant, given that they typically refrain from engaging in such high-risk activities.

Bridging these two worlds requires more than technical integration – it demands a cultural and operational synthesis. Financial institutions must adopt Silicon Valley’s data-driven, fast-paced innovations but temper them with a deliberate and careful rhythm of changes.

Validation challenges as model complexity increases

The expertise threshold for developing models with AI/ML has been steadily declining. With sufficiently large amounts of data, one can relatively easily create models unconstrained by the need to understand the underlying relationship between data features and model outcomes. Ironically, this reverses the traditional pecking order between model development and model validation expertise. Validating AI/ML models demands significantly more expertise than developing them because of their inherent complexity, reliance on diverse data sources and dynamic nature. Validators must carefully assess the limitations of these models and ascertain when they are likely to fail. This is a tall order that requires not just an understanding of how the models work but also the technical prowess to develop tools that evaluate the many unexpected ways in which these models can fail.

One of the most challenging aspects of validating AI/ML models is addressing the ‘black box’ nature of such advanced algorithms as deep learning or ensemble methods. These models can have millions of parameters, making it nearly impossible to interpret their decision-making process without specialized tools. For example, in a financial fraud detection system that uses deep learning, the model might flag a transaction as suspicious but provide little explanation for its decision, complicating efforts to validate its accuracy and fairness. Validators must use explainability tools to uncover the reasoning behind model predictions and ensure they align with regulatory and business requirements. This need for explainability is critical in the financial industry, where decisions like loan approvals or investment recommendations must be defensible and transparent to stakeholders.
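
To make this concrete, the following minimal Python sketch (assuming the open-source shap and scikit-learn packages are available) shows how a validator might attribute a single flagged transaction’s score to its input features. The gradient-boosted classifier and synthetic data are illustrative stand-ins for an actual fraud detection model, not a prescribed approach.

```python
# Illustrative sketch: local explanation of one flagged observation using SHAP.
# The model and data are synthetic stand-ins for a production fraud model.
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic proxy for transaction features (amount, velocity, device risk, etc.)
X, y = make_classification(n_samples=2000, n_features=8, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# TreeExplainer computes exact SHAP values for tree ensembles
explainer = shap.TreeExplainer(model)
flagged = X[:1]                                      # a transaction the model flagged
contributions = explainer.shap_values(flagged)[0]    # per-feature log-odds contributions

# Rank the features that pushed the score toward 'suspicious'
for idx in np.argsort(-np.abs(contributions)):
    print(f"feature_{idx}: {contributions[idx]:+.3f}")
```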

Another critical challenge is the adaptive nature of AI/ML models. While this adaptability enhances performance, it also introduces the risk of model drift, in which the model’s behavior changes over time due to shifts in the underlying data distribution. For example, a credit risk model trained on pre-pandemic data may fail to account for post-pandemic economic conditions, leading to inaccurate risk assessments. Validators must implement robust monitoring frameworks to detect and evaluate drift promptly, ensuring the model remains reliable and relevant. Additionally, the frequent recalibration that drives this adaptability may change the original functional relationship between inputs and outputs, potentially nullifying the initial validation. Without a clearly defined boundary that distinguishes material model changes requiring revalidation from changes that are acceptable without it, a recalibrated model may remain in production without undergoing necessary revalidation, thus amplifying model risk.
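
As an illustration of such monitoring, the sketch below computes the population stability index (PSI), one commonly used statistic for flagging drift between a model’s development sample and recent production data. The 0.10 and 0.25 thresholds are conventional rules of thumb rather than regulatory requirements, and the data are synthetic.

```python
# Illustrative sketch: population stability index (PSI) for input/score drift.
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """Population stability index between two 1-D samples of a feature or score."""
    cuts = np.quantile(expected, np.linspace(0, 1, bins + 1))
    cuts[0], cuts[-1] = -np.inf, np.inf              # cover the full real line
    e_pct = np.histogram(expected, cuts)[0] / len(expected)
    a_pct = np.histogram(actual, cuts)[0] / len(actual)
    e_pct, a_pct = np.clip(e_pct, 1e-6, None), np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

train_scores = np.random.normal(0.0, 1.0, 10_000)    # development distribution
prod_scores = np.random.normal(0.3, 1.2, 2_000)      # shifted production data
value = psi(train_scores, prod_scores)
status = "stable" if value < 0.10 else "monitor" if value < 0.25 else "investigate"
print(f"PSI = {value:.3f} -> {status}")
```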

Rise of ModelOps – standardization, automation and integration

Large financial institutions are increasingly adopting the industrialization of MRM processes to handle the complexity and volume of AI/ML models effectively. The traditional ‘workshop approach’ to MRM – characterized by manual, bespoke efforts for validation, monitoring and documentation – lacks the scalability and consistency required for modern AI/ML-driven operations. In contrast, the industrialized ‘agile approach’ emphasizes standardization, automation and integration to streamline workflows, increase efficiency and improve oversight (see Table 2). This shift enables institutions to handle a growing inventory of models, while maintaining regulatory compliance and ensuring robust risk management. 

One practical step toward this transformation is the use of ModelOps platforms, which provide end-to-end solutions for model lifecycle management. For instance, a ModelOps platform can automate the deployment of ML models into production, integrating monitoring systems to flag performance degradation in real time. This replaces ad hoc manual monitoring processes and enables faster identification and resolution of issues like data drift. By standardizing deployment workflows and monitoring protocols, ModelOps platforms reduce operational overhead and ensure consistent application of governance standards across diverse models.

Another area of industrialization is the automation of such routine validation tasks as benchmarking model performance, assessing input data quality and generating compliance reports. Automation minimizes human error, providing more reliable and reproducible results, which are critical for audits.
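
As a hypothetical illustration of such automation, the sketch below runs simple input-data checks and emits machine-readable findings that could feed a validation or compliance report. The check names, tolerances and column schema are invented for the example and would, in practice, come from an institution’s own data governance standards.

```python
# Illustrative sketch of automating routine input-data quality checks.
# Check names, thresholds and schema are hypothetical.
import pandas as pd

def run_data_quality_checks(df: pd.DataFrame, schema: dict[str, str]) -> list[dict]:
    """Return a machine-readable list of findings suitable for a validation report."""
    findings = []
    for column, expected_dtype in schema.items():
        if column not in df.columns:
            findings.append({"check": "presence", "column": column, "status": "FAIL"})
            continue
        missing_rate = df[column].isna().mean()
        findings.append({
            "check": "missing_rate",
            "column": column,
            "value": round(float(missing_rate), 4),
            "status": "PASS" if missing_rate < 0.05 else "FAIL",   # 5% tolerance, illustrative
        })
        findings.append({
            "check": "dtype",
            "column": column,
            "value": str(df[column].dtype),
            "status": "PASS" if str(df[column].dtype) == expected_dtype else "FAIL",
        })
    return findings

# Example usage against a toy extract
sample = pd.DataFrame({"loan_amount": [250_000, None, 410_000], "fico": [710, 655, 690]})
report = run_data_quality_checks(sample, {"loan_amount": "float64", "fico": "int64"})
print(pd.DataFrame(report))
```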

The standardization of documentation practices is a third area where industrialization has significantly improved efficiency in MRM. In the traditional workshop approach, model documentation was often tailored for each model, leading to inconsistencies and inefficiencies. With the agile approach, institutions can implement template-based documentation systems that automatically populate sections based on predefined standards.
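
A minimal sketch of this idea follows: a standard documentation section is auto-populated from model metadata held in the inventory system. The field names and values are hypothetical.

```python
# Illustrative sketch of template-based documentation populated from model metadata.
from string import Template

SECTION_TEMPLATE = Template(
    "Model ID: $model_id\n"
    "Owner: $owner\n"
    "Intended use: $intended_use\n"
    "Validation status: $validation_status\n"
    "Last monitored: $last_monitoring_date\n"
)

# Hypothetical metadata record pulled from a model inventory system
metadata = {
    "model_id": "CRD-2024-017",
    "owner": "Retail Credit Analytics",
    "intended_use": "Probability-of-default scoring for unsecured consumer loans",
    "validation_status": "Approved with conditions",
    "last_monitoring_date": "2024-11-30",
}
print(SECTION_TEMPLATE.substitute(metadata))
```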

The shift from the workshop to the agile approach brings several advantages. Workshop-based MRM relies heavily on individual expertise and manual processes, making it slow and resource-intensive. By contrast, the agile approach is scalable, repeatable and less dependent on specific personnel, enabling institutions to handle large-scale model inventories with fewer resources. Standardized workflows ensure that every model meets the same rigorous governance criteria, reducing variability and improving overall quality.

Meeting evolving compliance expectations

Meeting evolving compliance expectations for complex and black-box AI/ML models is becoming a critical focus for financial institutions, as regulators worldwide seek to address the unique risks these models pose. One of regulators’ major concerns is harm to end consumers caused by decisions based on biased or unfair AI/ML model outputs. Additionally, AI/ML models’ voracious appetite for data has exacerbated the risk of data privacy breaches.

For instance, the EU’s Artificial Intelligence Act explicitly categorizes certain AI systems used in financial services, such as those assessing creditworthiness, as ‘high risk’ and mandates stringent requirements for documentation, risk assessment and transparency. In the US, the proposed Algorithmic Accountability Act would require companies to be transparent about their algorithms and to ensure that they are fair and unbiased. Similarly, Canada’s proposed Artificial Intelligence and Data Act (AIDA) includes provisions to ensure fairness and minimize risks in AI-driven decision-making processes. These regulations reflect growing concerns that, without proper controls, reliance on AI/ML models for decision-making in financial services may perpetuate or amplify historical bias. Fair lending regulations enacted before the emergence of AI/ML models, such as the Equal Credit Opportunity Act (ECOA) and Fair Housing Act (FHA) in the US, have also gained renewed attention in the context of AI/ML. These regulations prohibit discrimination based on race, gender, religion or other protected attributes.

The General Data Protection Regulation (GDPR) enacted in the EU is highly relevant to AI/ML models, as it governs the collection, processing and use of personal data. Its mandates are particularly significant for financial institutions employing AI/ML models, as these models require vast amounts of training data, some of which may be considered personal data. GDPR emphasizes the principles of data minimization, purpose limitation and transparency, requiring organizations to collect only the data necessary for a specific purpose and ensuring that individuals are informed about how their data is being used. Failure to comply with GDPR can result in severe penalties.

GDPR also enforces the right to explanation for automated decisions, ensuring that individuals can understand and challenge decisions made by AI/ML models. Similarly, the US Consumer Financial Protection Bureau (CFPB) rules mandate clear communication about the factors influencing decisions, particularly such consequential decisions as denial of credit or access to financial services. Both GDPR and CFPB recognize that black-box AI/ML models can undermine trust in financial institutions if their decisions are not explainable.

Supervisory guidance such as the US Federal Reserve’s SR 11-7 and the Bank of England’s SS1/23 is highly relevant in addressing the risks posed by AI/ML models. The foundational pillars of this guidance, including conceptual soundness, independent model validation, model monitoring and benchmarking, provide a robust framework that can be adapted to manage the complexities of AI/ML. For example, SR 11-7 emphasizes the importance of effective challenge – a principle that is essential for managing the model risk of advanced models such as AI/ML.

Section II: Addressing AI/ML risks across model lifecycles at financial institutions

Tailoring MRM frameworks

Institutions across tiers, but especially smaller financial institutions, face unique challenges when adapting MRM frameworks for AI/ML models. Limited resources, including expertise, budget and technological infrastructure, mean that these institutions cannot replicate the comprehensive frameworks employed by larger banks. Instead, smaller institutions must adopt a simplified, risk-based approach, focusing their efforts on genuine risks while avoiding unnecessary ‘check the box’ exercises. This approach can ensure compliance with regulatory expectations and support sustainable AI/ML adoption without overstretching resources.

Based on our direct client engagements, industry discussions, regulatory guidelines and a survey of the academic literature, we have identified the 10 most frequently cited AI/ML risks, discussed below. Because well-established traditional MRM frameworks are already in place, financial institutions have a solid foundation from which to address many of these risks. As a result, depending on the model use case, only a subset of the identified risks requires heightened attention and potential augmentation of MRM practices to ensure effective AI/ML model risk mitigation.

Specifically, we present arguments that four out of the 10 – purpose limitation; explainability and interpretability; third-party dependency; and data integrity, protection and privacy – deserve more MRM attention (see Figure 2). While we do not advocate dismissing the remaining six risks, we believe that existing MRM practices at most financial institutions, especially at banking organizations, are generally well-equipped to mitigate them. We thus recommend that enhancements to the MRM framework at these institutions be devoted to building safeguards against the above-mentioned four risk factors.

Risk factors deserving heightened MRM attention

1. Purpose limitation

Misuse or repurposing of AI/ML models beyond their intended scope can lead to severe operational, regulatory or reputational risks. These risks often manifest during deployment, underscoring the need for stringent controls at this stage. Such practices as clear documentation of intended use cases and thorough validation during independent review can help mitigate this risk. Model cards provide a standardized way to document a model’s limitations and appropriate applications, offering valuable guidance to ensure responsible usage. By incorporating these measures into their MRM frameworks, financial institutions can safeguard against the pitfalls of overgeneralized or improper use of AI/ML models.
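
The sketch below shows one possible shape of such a model card, together with a simple purpose-limitation gate that refuses use cases not explicitly approved. All identifiers, fields and use-case codes are hypothetical.

```python
# Illustrative sketch of a model card entry recording intended scope and
# limitations, loosely following the 'model card' idea referenced in the text.
model_card = {
    "model_id": "FRD-2025-003",
    "description": "Gradient-boosted classifier scoring card-not-present transactions",
    "intended_use": [
        "Real-time fraud screening of retail card transactions up to USD 10,000",
    ],
    "out_of_scope_uses": [
        "Credit underwriting or pricing decisions",
        "Screening of commercial or wire payments",
    ],
    "training_data": "Card transactions, Jan 2022 - Dec 2024, US retail portfolio only",
    "known_limitations": [
        "Not calibrated for newly issued cards with under 30 days of history",
        "Performance unverified outside the US portfolio",
    ],
    "approved_use_cases": ["UC-114: retail card fraud screening"],  # each new use case needs separate approval
}

def check_use_case(card: dict, use_case_id: str) -> bool:
    """Purpose-limitation gate: deploy only for explicitly approved use cases."""
    return any(uc.startswith(use_case_id) for uc in card["approved_use_cases"])

print(check_use_case(model_card, "UC-114"))   # True
print(check_use_case(model_card, "UC-207"))   # False -> requires a new approval
```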

Financial institutions must raise awareness about the importance of purpose limitation. Approvals should focus on specific use cases and fit-for-purpose assessments rather than the model as a whole. This practice, already well-established in larger institutions, ensures that models are deployed only in scenarios for which they were explicitly designed. Institutions should also foster a culture of skepticism, rigorously scrutinizing any unintended use cases, and impose heightened monitoring requirements when such adjacent use cases are approved.

2. Explainability and interpretability

Explainability and interpretability are unique risks in AI/ML models that were not significant concerns in traditional modeling approaches. Traditional models, built on the principle of parsimony, emphasize simplicity and interpretability, favoring clear, economic rationale over complexity. This allows stakeholders to understand how model outputs are derived from inputs and identify potential errors or risks. In contrast, AI/ML models, particularly deep learning algorithms, embrace complexity to optimize performance, at the expense of transparency. While these advanced models may deliver superior accuracy, their intricate architecture and reliance on vast datasets make it challenging to explain their decision-making processes. This lack of transparency undermines trust, complicates risk mitigation and makes it harder to identify when and why a model fails.

AI/ML models are often referred to as ‘black boxes’ because their internal workings are difficult to interpret, even for experts. This poses significant challenges at various stages of the model lifecycle, including validation, deployment and monitoring. Without a clear understanding of how inputs influence outputs, institutions face heightened risks, particularly in such high-stakes scenarios as credit decisions, fraud detection or regulatory compliance. For example, a credit risk model that cannot explain why it denied a loan could expose a financial institution to regulatory scrutiny and reputational damage.

For model use cases that are exposed to high reputational and non-compliance risks, financial institutions should prioritize modeling approaches that emphasize inherent transparency and interpretability over ‘black box’ models whose performance gains are only marginal.

For other use cases in which AI/ML models are employed, institutions should invest in analytical tools such as SHAP and LIME, which have emerged to address explainability challenges. Institutions should also establish clear guidance on when and how to use global (‘on average’) and local (at observation level) interpretability methods, including criteria for selecting appropriate techniques and the sample sizes needed to provide local and global explanations of model behavior.
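
The sketch below illustrates the distinction between global and local views using SHAP values on a synthetic classifier; a comparable local explanation could be produced with LIME’s tabular explainer. The shap and scikit-learn packages are assumed to be available, and the model and data are stand-ins.

```python
# Illustrative sketch: global ('on average') versus local (single observation)
# explanations derived from SHAP values on a synthetic model.
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=1500, n_features=6, random_state=1)
model = GradientBoostingClassifier(random_state=1).fit(X, y)

explainer = shap.TreeExplainer(model)
sv = explainer.shap_values(X[:500])               # (n_obs, n_features) log-odds contributions

# Global view: mean absolute contribution per feature across the sample
global_importance = np.abs(sv).mean(axis=0)
print("Global feature ranking:", np.argsort(-global_importance))

# Local view: contributions behind one specific decision
print("Local attributions for observation 0:", np.round(sv[0], 3))
```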

However, institutions should remain cognizant of the limitations of these analytical tools. Regulators have noted that such tools often provide generalized explanations rather than specific insights into individual outcomes. As a result, reliance on post hoc explainability tools, while helpful, cannot fully address the risk posed by opaque AI/ML models. Institutions must balance the use of these tools with proactive efforts to integrate transparency and interpretability into the design of their models from the outset. (The trend toward explainable AI remains a promising industry development.)

3. Third-party dependency

Financial institutions rely on third-party vendors for AI/ML tools and datasets, introducing the risk of vendor lock-in and reduced oversight. This reliance on external providers for AI/ML models, data inputs or other critical components poses heightened risks, particularly for smaller organizations, which may lack the internal expertise, bargaining power and resources to conduct rigorous due diligence on vendor models. Increasingly, vendor models may also expose institutions to fourth-party dependency – i.e., the reliance of a vendor’s model on other external data sources, algorithms or services provided by third parties unknown to the institution using the model. This compounds the risk, as it introduces potential vulnerabilities outside the direct oversight of both the financial institution and the primary vendor.

This lack of transparency creates significant challenges in assessing a model’s appropriateness, understanding its limitations and ensuring its alignment with regulatory and business requirements. For smaller institutions, there is a higher chance that errors in vendor models go undetected, as they may lack the technical sophistication to identify and address underlying issues. The inability to fully comprehend a vendor model exacerbates risks, particularly during the validation, deployment and monitoring stages.

Larger financial institutions can often demand more detailed disclosures and influence the vendor’s model development and enhancement roadmaps. Smaller institutions, on the other hand, may be forced to accept models with minimal insight into their inner workings, leaving them exposed to risks. Additionally, vendor models often require periodic updates or upgrades to maintain relevance and accuracy in changing environments. However, if these updates are not correctly classified as model changes, the failure to flag them for revalidation can introduce new vulnerabilities, as institutions continue to use flawed or poorly calibrated models without robust testing and validation. Establishing a rigorous review process to identify and understand all assumptions, limitations and fourth-party dependencies, prior to both the initial deployment and subsequent updates of vendor models, is essential to mitigating these risks.

Institutions will also benefit enormously from thoroughly evaluating vendors’ contingency plans in key areas, such as response strategies for model failures, mechanisms for minimizing service disruptions, processes for rapid issue resolution, and safeguards to ensure compliance with regulatory requirements. This proactive evaluation can help institutions mitigate risks associated with vendor dependencies and maintain operational resilience in the face of potential model breakdowns.

4. Data integrity, protection and privacy

AI/ML models inherently rely on vast amounts of data to achieve high levels of accuracy and performance. In the context of financial institutions, which handle highly sensitive customer data, this creates a dual concern. First, these institutions are legally obligated to safeguard customer information and limit its use to permissible purposes that are clearly disclosed. However, the insatiable data demands of AI/ML models increase the risk that restricted or unauthorized data may be used inadvertently or intentionally for model development. This can lead to violations of privacy laws and regulations, such as GDPR, and erode customer trust. Second, the large number of parameters in AI/ML models can result in indirect learning of protected characteristics like race, gender or age, even if such data was not explicitly used for model training purposes. This ‘hidden learning’ can lead to discriminatory inferences and decisions, raising concerns about fairness and compliance with anti-discrimination laws.

Financial institutions can adopt structured measures to mitigate these risks throughout the model lifecycle. During data preparation, robust data governance frameworks should ensure that data used for model training adheres to legal and ethical standards, with clear documentation of its sources and permissible uses, and with testing for the presence of socially constructed biases, inaccuracies and errors. In the model development and validation stages, techniques such as differential privacy and adversarial testing can be employed to ensure that models do not inadvertently learn sensitive attributes, and independent validation teams can test rigorously for compliance with data privacy standards. During deployment and monitoring, institutions can implement real-time monitoring for discriminatory outcomes.
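
One simple probe for ‘hidden learning’ is sketched below: if a protected attribute can be predicted accurately from the model’s input features, the model may be able to infer it indirectly even though the attribute was never supplied. The data, feature construction and 0.6 AUC tolerance are all illustrative.

```python
# Illustrative 'hidden learning' probe: can a protected attribute be predicted
# from the model's inputs? High AUC suggests proxy effects worth reviewing.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
protected = rng.integers(0, 2, n)                   # synthetic protected characteristic
zip_income = rng.normal(50 + 15 * protected, 10)    # correlated proxy feature
other = rng.normal(size=(n, 3))                     # unrelated model inputs
features = np.column_stack([zip_income, other])

X_tr, X_te, a_tr, a_te = train_test_split(features, protected, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, a_tr)
auc = roc_auc_score(a_te, probe.predict_proba(X_te)[:, 1])

print(f"Protected-attribute AUC from model inputs: {auc:.2f}")
if auc > 0.6:                                       # illustrative tolerance
    print("Warning: features encode the protected attribute; review for proxy effects")
```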

Risk factors that existing MRM frameworks are well-equipped to mitigate

5. Accountability

A lack of accountability is a significant risk for AI/ML models, as any adverse impacts of the technology, including those related to human health, safety and financial well-being, may go unaddressed. Financial institutions (especially banks), however, are well-positioned to mitigate this risk due to their adherence to such established regulatory guidelines as SR 11-7. These guidelines emphasize the importance of clear roles and responsibilities throughout the model lifecycle. Financial institutions typically maintain robust model risk governance frameworks that assign ownership of every model in the inventory to specific individuals or teams. By building on these existing practices, financial institutions can maintain accountability for their AI/ML models, even as the complexity and size of their model inventories increase.

6. Bias and fairness

Bias and fairness are critical concerns for AI/ML models, particularly because these models thrive on vast amounts of data, operate without the constraints of underlying economic theory and focus solely on optimizing performance metrics. This ‘more is better’ philosophy – more data, more parameters and more complexity – can unintentionally reinforce societal and historical biases present in the datasets. The absence of a rationale to guide the model output further increases the risk of inadvertently perpetuating and amplifying inequities.

However, financial institutions are well-positioned to address bias and fairness due to long-standing regulatory obligations, including ECOA and FHA. Fair lending laws require institutions to ensure equitable treatment of customers and prohibit discriminatory practices. In principle, these mandates have forced financial institutions to implement controls to look for outcomes that may be deemed unfair or discriminatory. While AI/ML introduces new challenges, including identifying and mitigating hidden biases in large datasets, these are extensions of familiar fairness concerns that banks have managed for decades. By leveraging such existing practices as disparate impact testing and monitoring, financial institutions can mitigate the concerns around fairness and bias related to their AI/ML models.
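
As an illustration of disparate impact testing, the sketch below applies the widely cited ‘four-fifths’ rule of thumb, comparing favorable-outcome rates across groups; the group labels, approval flags and 0.8 threshold are illustrative rather than a compliance standard.

```python
# Illustrative sketch of a disparate impact check using the 'four-fifths' rule.
import numpy as np

def disparate_impact_ratio(approved: np.ndarray, group: np.ndarray) -> float:
    """Ratio of approval rates: protected group relative to reference group."""
    rate_protected = approved[group == 1].mean()
    rate_reference = approved[group == 0].mean()
    return float(rate_protected / rate_reference)

# Toy decisions: 1 = approved, 0 = declined; group 0 = reference, 1 = protected
approved = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
group = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
ratio = disparate_impact_ratio(approved, group)
print(f"Disparate impact ratio: {ratio:.2f}"
      + ("  (below 0.8 threshold: investigate)" if ratio < 0.8 else ""))
```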

7. Transparency and robustness

Model transparency refers to full disclosure about how a model operates, including its development, training data, testing processes, envisioned use cases, foreseeable limitations and potential failure modes. Transparency risks arise during multiple stages of the model lifecycle. For instance, during model development, insufficient documentation about the data sources, preprocessing steps and selection criteria can create opacity that obstructs future validation or regulatory review. Similarly, in the model deployment stage, a lack of clarity around intended use cases and limitations can lead to misuse or misinterpretation of the model’s outputs. Transparency is a minimum standard for managing AI/ML risks, but it does not guarantee interpretability, as even fully disclosed models can operate as ‘black boxes’, with outcomes that remain difficult to explain.

Robustness is the ability of a model to deliver reasonable and consistent results under a wide range of conditions, including extreme, missing or erroneous inputs. For AI/ML models, robustness risks often emerge during the data preparation stage, where poor data quality or spurious relationships between input features and outputs can compromise reliability. Overfitting during model development is another key vulnerability, where the model learns noise in the training data instead of genuine patterns, leading to poor performance on out-of-sample data. Additionally, robustness issues can surface during model monitoring and maintenance when shifts in data distributions or adversarial inputs expose fragilities in the model’s design. AI/ML models, being more data-centric and less grounded in economic theory, are particularly prone to these challenges, making robust testing a critical focus.

Financial institutions are well-equipped to navigate these risks due to their experience with traditional model validation and risk management frameworks. For example, tools and techniques used for stress testing and sensitivity analysis in traditional models can be adapted for AI/ML models to test robustness against extreme scenarios. Similarly, model validation frameworks that require comprehensive documentation and use-case-specific disclosures naturally align with the demands for transparency in AI/ML models.
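
The sketch below shows one way such sensitivity analysis might be adapted to an ML model: shock one input at a time and measure how much the predicted probabilities move, with large swings from small, plausible perturbations pointing to fragility. The model, data and half-sigma shock size are illustrative.

```python
# Illustrative sensitivity analysis: perturb one feature at a time and measure
# the resulting shift in predicted probabilities. Synthetic model and data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=2000, n_features=5, random_state=2)
model = GradientBoostingClassifier(random_state=2).fit(X, y)
baseline = model.predict_proba(X)[:, 1]

for j in range(X.shape[1]):
    bumped = X.copy()
    bumped[:, j] += 0.5 * X[:, j].std()             # half-sigma shock to one feature
    shift = np.abs(model.predict_proba(bumped)[:, 1] - baseline)
    print(f"feature_{j}: mean shift {shift.mean():.3f}, max shift {shift.max():.3f}")
```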

8. Ethical and legal compliance

Ethical and legal compliance is a pressing concern in Silicon Valley, where AI/ML technologies are often designed for mass-market applications with far-reaching societal consequences. The unregulated operating environment of Silicon Valley can lead to significant gaps in ethical and legal compliance. However, in the financial services industry, such risks are less pronounced due to the industry’s heavily regulated status. Institutions face severe consequences, including fines, legal action and reputational damage, for any violations, creating strong deterrents to unethical practices.

Financial institutions are inherently structured to ensure compliance with these standards. Robust compliance departments play a critical role in overseeing adherence to laws and regulations, investigating potential issues and enforcing policies to prevent misconduct. These departments can work in tandem with risk management teams to address AI/ML-specific concerns (such as those related to bias and fairness risk, explainability and interpretability risk), ensuring that models align with ethical principles and legal mandates.

9. Scalability and performance

Scalability and performance are critical concerns for AI/ML models, especially as they process large volumes of data or perform in real-time scenarios. Scalability refers to a model’s ability to handle increasing data sizes or computational demands without degrading performance. Performance relates to a model’s ability to produce accurate and timely results consistently. These issues can arise during the deployment and monitoring stages of the model lifecycle, where computational efficiency, latency and reliability are put to the test in production environments. If a model cannot scale to meet operational needs or suffers from performance degradation, it can fail to deliver its intended outcomes, leading to business disruptions or customer dissatisfaction.

However, financial institutions typically deploy AI/ML models in limited, well-defined use cases such as credit risk assessment, fraud detection and loan underwriting, which have narrower scalability demands compared to those of the mass-market AI/ML systems that require intensive high-speed computational capacity. Moreover, the potential impact of performance failures is often contained within specific operational areas, limiting broader systemic consequences. For smaller institutions, which are limited in size (in terms of customers and transactions), and which typically avoid high-frequency trading or complex payment infrastructure roles, scalability and performance risks are inherently less critical.

10. Human-AI interaction

Human-AI interaction risks arise when users misunderstand AI/ML model outputs, leading to inappropriate decisions or actions. This risk can manifest at multiple stages of the model lifecycle, particularly during deployment and monitoring, where the interaction between the model and its users becomes operational. Examples include over-reliance on AI recommendations, neglecting to question or validate outputs, or improperly integrating AI insights into broader decision-making processes.

However, the financial services industry can be better equipped than others to mitigate these risks, thanks to its multiple layers of risk oversight, controls and audit structure. Financial institutions are typically required to maintain detailed tracking and audit trails, enabling them to identify and correct erroneous decisions promptly. Unlike the largely autonomous systems promoted in Silicon Valley, where human oversight may be minimal, the financial services industry prioritizes oversight and control. This regulated approach reduces the likelihood of catastrophic failures resulting from ungoverned human-AI interaction, making this risk more manageable in financial services than in less regulated industries.

Conclusion

The financial services industry is undergoing a transformation driven by the proliferation of models across a wide range of business decisions, and fueled by the rise of AI/ML. These technologies have democratized model development and use. While this trend shows promise in unlocking tremendous productivity and innovation, it exposes institutions to unique and novel risks. The ‘black box’ nature of AI/ML models and the unpredictability of their outputs make validation a highly specialized and resource-intensive task. This dynamic represents a striking role reversal, in which validators must now possess more technical expertise than developers. A growing talent gap poses a challenge for institutions of all sizes, but is particularly acute for smaller organizations with limited resources.

Integrating the agility of Silicon Valley’s advances with the disciplined MRM frameworks of the financial industry is imperative. This convergence of cultures must balance rapid innovation with robust risk oversight to ensure that AI/ML adoption is both effective and sustainable. The industrialization of MRM through such initiatives as ModelOps and process standardization is becoming essential, particularly as the use of AI/ML accelerates. Automation of validation, monitoring and documentation processes allows institutions to manage increasingly complex model inventories more efficiently.

A focus on risk-based prioritization – concentrating efforts on genuine risks rather than a ‘check-the-box’ approach – can help smaller institutions deploy AI/ML responsibly within their resource constraints. We discussed the 10 most common AI/ML risks across the model lifecycle and made the case that four – purpose limitation; explainability and interpretability; third-party dependency; and data integrity, protection and privacy – deserve more MRM attention. We argue that the existing MRM practices enforced upon financial institutions by regulators over the past decades equip them to manage the remaining six effectively, with minimal enhancements to their MRM function.

About Crisil Integral IQ (formerly Global Research & Risk Solutions)

Crisil Integral IQ delivers solutions and actionable intelligence to top financial institutions, driving strategic transformation, risk optimization and operational excellence. Our offerings across research, risk, lending, analytics and operations have empowered clients to navigate complex markets, mitigate risks and unlock new opportunities. Our domain expertise, innovative solutions and future-ready technologies such as AI and data science give clients the confidence to accelerate growth and achieve sustainable competitive advantage. Our globally diverse workforce operates in the Americas, Asia-Pacific, Europe, Australia and the Middle East.

For more information, visit IntegralIQ.Crisil.com.
