The Shapley value is a concept from cooperative game theory that provides a way to fairly distribute the total value generated by a group of individuals or players among its members. It was introduced by Lloyd Shapley, an American mathematician and economist, in the early 1950s. In cooperative games, players come together to form coalitions, and each coalition produces some value or outcome. The Shapley value helps determine the fair distribution of this total value among the players based on their contributions to different coalitions. In the context of machine learning, it has found application in machine learning, particularly in the realm of model interpretability, feature importance, and understanding predictions.
Feature Importance: Shapley values provide a way to attribute the contribution of each feature in a predictive model to the final prediction. It assesses the impact of a feature by evaluating its contribution in different coalitions with other features.
Interpretability: They help in understanding how the inclusion or exclusion of a particular feature influences a model's output by considering all possible combinations of features and their contributions to predictions.
Model Explainability: Shapley values offer a coherent way to explain individual predictions made by complex models such as ensemble methods, neural networks, or gradient boosting machines. They provide insight into why a model makes a specific prediction for a given instance.
Calculation of the Shapley value involves considering all possible permutations of the players and evaluating their marginal contributions. For each player, it computes the average marginal contribution over all permutations. This average is the Shapley value for that player. The formula for calculating the Shapley value for player i in a game with N players is as follows:
.
N is the total number of players.
S is a coalition that does not contain player i.
v(S) is the value of coalition S (payout function).
|S| denotes the number of players in coalition S.
The Shapley value is widely used in various fields, such as economics, political science, and artificial intelligence, to fairly distribute benefits or costs among multiple stakeholders in a cooperative setting. It provides a mathematically sound and principled way to allocate resources and incentivize cooperation. In machine learning, the Shapley value has been adapted and applied to explain the predictions of complex models and to attribute contributions to individual input features.
Properties of the Shapley value
The Shapley value takes into account the of each player to all possible coalitions. A player’s marginal contribution is how much their presence adds to the value of a coalition when they join it. The Shapley value allocates the total value across all players, and the sum of the allocated shares is equal to the total value generated by the grand coalition, which includes all players. Moreover, the Shapley value is the only attribution method that satisfies the properties Efficiency, Symmetry, Dummy and Additivity, which together can be considered a definition of a fair payout.
Efficiency: The player contributions must add up to the difference of payout for the player x and the expected value for the game payout with all players.
Symmetry and fairness: The Shapley value satisfies important properties, such as symmetry (interchangeability of players) and fairness. It ensures that players with similar contributions receive equal shares of the total value. In other words, the contributions of two players j and k should be the same if they contribute equally to all possible coalitions.
Dummy: A player j that does not change the payout value – regardless of which coalition of players it is added to – should have a Shapley value of 0.
As an example in machine learning area, suppose you trained a random forest, which means that the prediction is an average of many decision trees. The Additivity property guarantees that for a feature value, you can calculate the Shapley value for each tree individually, average them, and get the Shapley value for the feature value for the random forest.
Advantages and Disadvantages
Advantages
The Shapley value has various advantages, making it an appealing and widely used approach in several fields. Some of the advantages of the Shapley value include:
Fairness: The Shapley value ensures fairness by distributing the value generated by a group of players among them based on their individual contributions. It is the unique allocation that satisfies several desirable axioms, such as efficiency, symmetry, and additivity.
Coalitional Rationality: The Shapley value promotes cooperation among players. Each player is incentivized to work together with others to form coalitions, as their contribution to the coalition’s value will be appropriately recognized.
Non-partisan: The Shapley value treats all players symmetrically. It considers the contribution of each player in all possible orders of coalition formation, preventing any bias towards specific players or subsets of players.
Recommended by LinkedIn
Uniqueness: Unlike some other allocation methods, the Shapley value has a unique solution for any cooperative game. This uniqueness ensures consistency and avoids conflicts when deciding on the value distribution. Independence of Irrelevant Alternatives: The Shapley value does not depend on the presence or absence of irrelevant players. The value assigned to each player is only determined by their individual contribution to the coalition, regardless of other players’ participation.
Widely Applicable: The Shapley value can be applied to various scenarios, including business partnerships, network routing, cost-sharing, and coalition formation in games. Its versatility makes it suitable for analyzing different types of cooperative systems.
Computationally Feasible: Although calculating the Shapley value can be challenging for large games due to its combinatorial nature, there are efficient algorithms available, such as the Shapley value Monte Carlo simulation or polynomial-time algorithms for specific types of games.
Overall, the Shapley value provides a principled and fair approach for distributing the benefits or costs among cooperative entities, making it a valuable tool in various fields to address allocation problems. However, it’s essential to consider the context and assumptions of its application, as certain situations may call for alternative methods depending on specific requirements.
The scope of the Shapley values can be modified by changing the background data set. The Shapley can be compared to a big data set or to a single instance. That makes that Shapley values have the possibility to give contrastive explanations by choosing a certain background data set.
Disadvantages
While the Shapley value has several advantages, it also has some disadvantages and limitations:
Computationally Intensive: Calculating the Shapley value can be computationally expensive, especially for large cooperative games with many players. The computation time for the Shapley values is exponential in the number of players. The Shapley values are namely the sum over all possible coalitions, so for N players this is a sum with 2 N terms. The Computationally Intensive disadvantage of shapley value make it impractical for large games. In games with a large number of players, it may not be feasible to compute the Shapley value directly due to its computational complexity. Luckily, in the machne learning area, there are some approximation methods like Kernel Approximation, Monte Carlo Approximation, and Likelihood Weighted Approximation. These methods are implemented in Python in the SHAP package and it makes possible to calculate the Shapley values for models with more features. However, the approximation of the Shapley values to the exact Shapley values becomes worse for higher numbers of features.
Not Suitable for All Types of Games: While the Shapley value is widely applicable, it may not be the most appropriate solution for certain types of cooperative games. For instance, in games where the players’ contributions are highly interdependent or where the coalition structure is restricted, other solution concepts like the core or nucleolus might be more suitable.
Lack of Individual Rationality: The Shapley value guarantees fairness at the coalition level but does not necessarily ensure individual rationality. Some players might receive values that are lower than their contributions to certain coalitions, leading to potential dissatisfaction.
Sensitivity to Player Order: The Shapley value depends on the order in which players join a coalition. This sensitivity can be seen as a disadvantage in some cases as it can introduce instability or uncertainty in the value allocations.
Assumption of Transferable Utility: The Shapley value assumes transferable utility, meaning that the value generated by a coalition can be easily divided among its members. In real-world scenarios, this assumption may not always hold true.
Unknown dataset distribution: Another disadvantage is that a background data set is necessary to calculate the Shapley value. For example, a prediction function of the ML model is not enough, because of two reasons. First, the Shapley values explain the difference between the prediction and the average of the background data set. Second, the calculation method of the Shapley value makes a new random instance by drawing some features from the background data set. Without a background data set, the random samples can have feature values that are not possible in the real world.
The last and maybe the biggest issue is that the calculation of Shapley values has problems when there are correlated features. Current implementations assume independence between features, for instance the kernel Shap implementation. Random sampling goes wrong when there are correlated features because each feature is sampled from the marginal distribution. If the features are dependent, then the random samples can have very unlikely features values.
conclusion
In this post I discussed that despite Shapley value has many disadvantages, the Shapley value remains a powerful and widely used concept in cooperative game theory due to its fairness properties and unique allocation guarantees. However, depending on the specific requirements of a given scenario, other allocation methods or solution concepts might be more appropriate.