Double Materiality Benchmarking: Beyond ESG Ratings

Ratings are a ubiquitous part of the Information Age. Need to figure out where to go to dinner? You can use Google Reviews. Need to buy a car? You can check Consumer Reports. Need to find a doctor? You can use HealthGrades.

Ratings are designed to distill large amounts of information into a simple, accessible data point used to inform decisions. But not all ratings are created equal. Some ratings pull from a single source of data, such as public reports, whereas others combine multiple sources, such as third-party datasets, IoT device data, and survey data, to provide more robust insight. The value of a rating lies in the quality of the underlying data sources and methodology, which ultimately determine its reliability and validity.

In the past year, environmental, social, and governance (ESG) ratings have come under wider scrutiny. ESG ratings were designed to add a new set of factors to those traditionally included in investment analysis, in order to better predict financial performance. Despite their growth in popularity, ESG ratings have a number of structural limitations.

ESG Rating Flaws

The scope of ESG rating approaches is limited, and for investors interested in directing their money towards impactful companies, the methodology is fundamentally flawed. The list below outlines a few of these challenges and concerns:

Singular financial motivation: The singular goal of an ESG rating methodology is to identify environmental, social, and governance factors that predict financial risk or opportunity, not to specifically improve social or environmental outcomes. However, a growing number of investors are looking to invest in companies that not only perform financially, but also generate the most positive impact on the world. As outlined in ESG Mirage, the “first generation” ESG ratings simply don’t measure real impact, which is why McDonald’s -- which generates more greenhouse gas emissions than Portugal -- received an ESG rating upgrade from MSCI in 2021, whereas Tesla was recently kicked out of the S&P 500 ESG Index. As a result, prevailing ESG ratings overlook an entire segment of investors who want to incorporate the total societal impact of a company into investment decisions.

Top-down approach: Current ratings only measure the financial risks and operational factors deemed relevant by the ratings agency. ESG ratings leave out factors that the company’s customers, employees, and other stakeholders believe to be important, thereby neglecting entire issue areas that may not affect the short-term bottom line, but do affect long-term financial outcomes (e.g., customer loyalty, brand reputation, and employee retention). Factors identified through a bottom-up approach come from the individuals most affected by the product or service, or by the company’s day-to-day operations, and are therefore arguably more likely to predict the long-term financial success of a company than the standard sets of metrics currently used in ESG ratings.

Unreliable measurement methodology: Both in the US and globally, there is a lack of standard metric definitions, common units of measure, and rigorous third-party validation. In addition, many ESG raters use optical character recognition (OCR) or similar technology to automatically scrape metric results from publicly available reports, but this technique lacks the data intelligence needed to ensure consistency in definitions, units, and reporting timeframes. ESG raters also manually source self-reported data from companies, but this data is not validated and is thus subject to the same reliability issues. For these reasons, company- and metric-level results included in ESG ratings are likely incomplete and inconsistent, jeopardizing investors’ ability to make truly informed and effective decisions.

Instead of relying on ESG ratings that oversimplify measurement methodologies and are limited to short-term financial materiality, investors should demand a new paradigm of benchmarking -- one that prioritizes both financial and stakeholder materiality (i.e., double materiality) and that creates a more comprehensive and accurate representation of companies’ past and predicted performance to drive investment decisions.

Double Materiality Benchmarking

Double materiality benchmarking is grounded in the principle that the next generation of investors will increasingly need their investments to both achieve a financial return and optimize for positive impact. This cohort acutely understands that the current state of affairs is unsustainable, in part because they will be the ones to live through its direct ramifications.

Double materiality benchmarking also rests on the belief that investing with this dual focus is not concessionary. As outlined in my prior article, ESG & Impact: Why we need both for meaningful change, companies that meaningfully integrate the voices of their stakeholders into ongoing business strategy -- and adapt business practices to improve social and environmental outcomes -- will ultimately increase customer retention, reduce hidden operational risks, and create long-term sustainable value with stakeholders and shareholders alike.

The Data Fundamentals

The first -- and perhaps most critical -- step of any rating process is defining the scope of data and metrics used in the rating. Unlike ESG ratings that typically use a standard metric set for all companies in a given industry, double materiality benchmarking recognizes that even within sectors or industries, different companies have different stakeholders, products, and operational processes. This means that even within a single industry, what’s material to one company, whether financially or in terms of impact, may not be material to another. The scope of double materiality benchmarking should allow for objective, stakeholder-driven determination of the key performance indicators (KPIs) -- both qualitative practice indicators and quantitative performance indicators -- to be included in the rating. This means conducting an independent, objective double materiality assessment (with the dual focus on financial and impact materiality) by surveying stakeholders directly, and weighting issue areas and individual KPIs based on their perceived importance.
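The stakeholder-driven weighting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1-5 importance scale, the KPI names, and the `kpi_weights` function are hypothetical, and a real materiality assessment would also need to handle sampling, stakeholder groups, and financial materiality.

```python
from collections import defaultdict


def kpi_weights(responses):
    """Turn stakeholder importance scores into normalized KPI weights.

    `responses` is a list of (kpi_name, importance) pairs, where importance
    is an assumed 1-5 survey score. The returned weights sum to 1.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for kpi, score in responses:
        totals[kpi] += score
        counts[kpi] += 1
    # Mean perceived importance per KPI across all respondents.
    means = {kpi: totals[kpi] / counts[kpi] for kpi in totals}
    grand_total = sum(means.values())
    if grand_total == 0:
        raise ValueError("no positive importance scores")
    # Each KPI's weight is its share of the total mean importance.
    return {kpi: mean / grand_total for kpi, mean in means.items()}


# Hypothetical survey responses, for illustration only.
survey = [
    ("ghg_emissions", 5), ("ghg_emissions", 4),
    ("gender_wage_equity", 3), ("gender_wage_equity", 3),
    ("customer_quality_of_life", 5), ("customer_quality_of_life", 5),
]
weights = kpi_weights(survey)
```

With these made-up responses, the KPI that stakeholders rate highest on average receives the largest weight, and KPIs nobody was asked about simply do not appear in the rating scope.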

The second step in the double materiality rating process is determining where to source the data. Many ESG raters source data from public reports or low-fidelity self-reported data, but this has a number of challenges as described above. In addition, in many cases this method relies on the same data being publicly available, limiting the approach to only those companies that voluntarily disclose their information. This effectively creates a patchwork of disparate companies and metrics with varying data collection methods, levels of validation, and assurance quality. For investors in private markets, this approach is even further limited due to the lack of any disclosure requirements for private companies. Another option is to source benchmark data from public datasets, such as the World Bank, OECD, or a number of other open source providers, but many of these sources only have data at a country level, not an individual company level. Finally, these public datasets are also limited to specific KPIs and do not offer the ability for customized insight.

As business software has become a standard part of IT infrastructure in both the public and private sectors, a new option for data sourcing has recently emerged: privately shared, anonymized, raw datasets sourced directly from companies’ existing systems. Through direct data system integrations (a common, secure data transfer approach), companies and their investors can create an ecosystem of real-time aggregate performance data at the level of individual data points and metrics. By sharing their data, companies can in turn unlock the unattributed benchmark data from similar peers across the same metrics. This is the ideal data sourcing approach to create meaningful and reliable ESG and impact benchmarks and indices, as it ensures companies’ disclosed results are customizable, auditable, and ultimately investment grade.

Dynamic Indexing

Indexing is the process of measuring the performance of a group of similar organizations to develop a commonly accepted standard of comparison. Major regulations, including the SEC’s proposed rule on ESG investment practices and the EU’s SFDR, have included indexing as a mandatory part of disclosure.

ESG indices to date have been far too shallow in their methodology, and do not reflect the level of metric-level specificity and evaluation design needed for meaningful insights. Dynamic indexing is a core component of the double materiality benchmarking methodology, and requires two key characteristics that set it apart from its predecessor:

Increased Precision via Multifactor Filtering: Indices need to resemble the companies in question enough to serve as a valid point of comparison and contrast. For example, comparing a small consumer finance lending company in Colombia to a multinational bank index in the United States likely doesn’t provide enough normalization for meaningful assessment. But comparing a small consumer finance lending company in Colombia to a South American consumer finance index likely does. At an individual KPI- or metric-level, indices should implement filtering and adjusted benchmark values based on company size, geographic location, industry, and -- most importantly -- product or service type. This level of precision in normative benchmarking will ensure that companies get true comparison points that are similar enough to serve as a standard. In scientific terms, this effectively creates a control group for comparison.

Longitudinal Comparison to Benchmark Performance: Comparing a company’s performance at one point in time to a static index -- as many ESG ratings do -- limits evaluation of true performance improvements, which often take years to manifest. In order to discover which companies are authentically the top ESG and impact performers, investors need to determine whether a company’s performance is due to its own internal practices and policies or if this perceived outperformance is simply attributable to broader macro trends. In other words, they need to establish -- or come as close as possible to establishing -- causality. In the absence of experimental design, investors can compare the longitudinal performance of a given company (ideally over multiple years or similarly extended timeframes) relative to the benchmark performance over that same time period. By overlaying the benchmark in a time series graphic (see example in Exhibit A), investors can see how a company’s upward or downward trends fared relative to the benchmark. This allows investors to evaluate whether the company’s explicit actions led to change, or if the change reflects a broader trend that applied to other similar companies. In scientific terms, this resembles a difference-in-differences design.

Exhibit A: Longitudinal Comparison to Benchmark Performance

This graphic illustrates how dynamic benchmarking -- with longitudinal trend lines -- can show whether a portfolio company is outperforming relative to the broader industry.
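As a rough illustration of the longitudinal comparison, the sketch below subtracts the benchmark’s change over a period from the company’s change over the same period. The function name and all figures are hypothetical, and the result is only suggestive: reading it causally assumes the company would otherwise have followed the benchmark’s trend.

```python
def difference_in_differences(company, benchmark):
    """Company's change in a metric minus the benchmark's change.

    Both arguments are dicts mapping year -> metric value. Only the first
    and last years present in both series are compared.
    """
    years = sorted(set(company) & set(benchmark))
    if len(years) < 2:
        raise ValueError("need at least two shared years")
    start, end = years[0], years[-1]
    company_change = company[end] - company[start]
    benchmark_change = benchmark[end] - benchmark[start]
    return company_change - benchmark_change


# Hypothetical figures: share of women in management (%).
company = {2019: 30.0, 2020: 33.0, 2021: 38.0}
benchmark = {2019: 31.0, 2020: 32.0, 2021: 34.0}
gap = difference_in_differences(company, benchmark)
```

A positive `gap` means the company improved by more than its peers over the period; a value near zero suggests its improvement tracked the broader trend.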

The principles of dynamic indexing -- rooted in validated metric-level results, precise benchmark values, and longitudinal comparisons -- set the stage for a more reliable, valid, and effective method of predicting financial and impact performance.
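To make the idea of a precise benchmark value concrete, here is a minimal sketch of multifactor filtering. The `Peer` fields, the use of a median, the employee-count size filter, and the `min_peers` threshold are all assumptions chosen for illustration, not part of any published methodology.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Peer:
    name: str
    region: str
    industry: str
    product: str
    employees: int
    metric: float  # value of the KPI being benchmarked


def benchmark_value(peers, region, industry, product, max_employees, min_peers=3):
    """Median KPI value across peers that match every filter.

    Returns None when fewer than `min_peers` companies match, since a
    benchmark built on too few peers is not a meaningful comparison.
    """
    matches = [
        p.metric for p in peers
        if p.region == region
        and p.industry == industry
        and p.product == product
        and p.employees <= max_employees
    ]
    if len(matches) < min_peers:
        return None
    return median(matches)


# Hypothetical peer set, for illustration only.
peers = [
    Peer("A", "South America", "finance", "consumer lending", 120, 0.62),
    Peer("B", "South America", "finance", "consumer lending", 300, 0.70),
    Peer("C", "South America", "finance", "consumer lending", 80, 0.55),
    Peer("D", "North America", "finance", "retail banking", 90000, 0.91),
]
value = benchmark_value(peers, "South America", "finance", "consumer lending", 500)
```

Here the large North American bank is excluded from the comparison set, so the small South American lender is benchmarked only against similar peers.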

ESG+I Ratings

As a growing number of investors seek ways to invest more responsibly and effectively, they require a solution with more depth and reliability than the current ESG ratings. This solution -- while necessarily robust -- must also be simple to incorporate into existing processes and easy to understand. Using the principles from the double materiality benchmarking methodology, a new ESG+I (ESG + Impact) rating system could serve as the data intelligence engine that investors need. ESG+I ratings would include ratings on three different levels:

Data quality rating: Different companies have varying levels of rigor in their data collection and management processes (e.g., self-reported aggregate data versus directly integrated granular datasets, unvalidated results versus random sampling verification). The first level would provide a rating based on the overall confidence level in the reliability, validity, and accuracy of the results.

Practice rating: Companies that follow best practices in ESG and impact management most closely (e.g., those outlined in the UN’s SDG Impact Standards, such as stakeholder-driven materiality assessments) receive higher ratings, as this lays the foundation for high fidelity, double materiality benchmarking.

Performance rating: Performance ratings focus on a company’s qualitative and quantitative KPIs that are material to the company (e.g., greenhouse gas emissions, gender wage equity, customer quality of life).

Together, these three rating levels comprise the ESG+I rating. As illustrated in Exhibit B below, ESG+I ratings assign “grades” on a curve, based on the company’s year-over-year performance on metrics that are material to the company, relative to the company’s index performance. Results for each metric are rolled up to a company rating, and company ratings are rolled up to fund ratings.
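The roll-up described above, from metric to company to fund, might look something like the following sketch. It covers only the performance level, and the details are assumptions: the grade cutoffs, the use of percentile rank as the “curve,” and the holding-weighted fund average are hypothetical choices.

```python
def company_score(metrics):
    """Weighted average of per-metric scores.

    `metrics` is a list of (score, weight) pairs, where each score is the
    company's year-over-year change minus its index's change.
    """
    total_weight = sum(w for _, w in metrics)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(s * w for s, w in metrics) / total_weight


def curve_grades(scores, cutoffs=((0.75, "A"), (0.50, "B"), (0.25, "C"))):
    """Grade each company by its percentile rank within the group."""
    ranked = sorted(scores, key=scores.get)  # lowest score first
    n = len(ranked)
    grades = {}
    for position, name in enumerate(ranked):
        percentile = position / (n - 1) if n > 1 else 1.0
        # First cutoff the percentile clears; "D" if it clears none.
        grades[name] = next((g for cut, g in cutoffs if percentile >= cut), "D")
    return grades


def fund_score(scores, holdings):
    """Holding-weighted average of company scores."""
    total = sum(holdings.values())
    return sum(scores[name] * share for name, share in holdings.items()) / total


# Hypothetical companies and metric scores, for illustration only.
scores = {
    "Alpha": company_score([(2.0, 3), (-2.0, 1)]),
    "Beta": company_score([(0.5, 1)]),
    "Gamma": company_score([(-1.0, 1), (-3.0, 1)]),
    "Delta": company_score([(3.0, 2)]),
}
grades = curve_grades(scores)
fund = fund_score(scores, {"Alpha": 0.5, "Delta": 0.5})
```

Because grades are assigned by rank within the group, a company’s grade depends on how its peers performed as well as on its own score.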

By leveraging the principles of double materiality benchmarking to create ESG+I ratings, investors can be more confident in their ability to assemble a portfolio of assets that is more likely to achieve high financial and impact performance. In addition, companies will be better equipped to use concrete data to ‘move the needle’ on specific, material issue areas that matter to their stakeholders. This will create a positive feedback loop and lay the groundwork for systems change efforts, including performance-based investing. It is possible to bridge the gap between data-driven evaluation and meaningful business action -- now it’s up to our generation to build and use the technology needed to make it a reality.

This article was also published on Impact Entrepreneur.