Weighted Factor Model
This is one of 10 documents on Charity Entrepreneurship’s research process.
A summary of the full process is here.
Summary

This document explains why and how CE uses weighted factor models (WFMs) as part of its research process. A WFM consists of generating a set of criteria, assigning a weighting to each, and then assessing how a possible option scores on each of them. It is particularly useful because it allows researchers to combine a large number of objective and subjective factors and to identify which ones drive the results. However, it has some weaknesses, such as a lack of flexibility, which suggests that it is best used in combination with other methods.
CE uses WFMs at three stages of our research. At the first stage (idea sort), each intervention is assessed for twenty minutes using all of the methodologies, including WFMs. At the second stage (prioritization report), two hours are spent on each intervention using this method, but only for our animal research area. Finally, the WFM is one of the four methods used for the eighty-hour assessment of each of the top interventions (intervention report). Concretely, lead researchers fill in a prebuilt model taking into account four main criteria: strength of the idea, execution difficulty, limiting factors, and externalities. Each of these is estimated using built-in questions, which are more or less specific depending on the amount of time allocated.

Table of contents:
1. What is the weighted factor model
2. Why is this a helpful methodology
3. Why it is not our only or endline perspective
4. How much weight we give to the weighted factor model
5. How CE generated the weighted factor model
5.1. Factors summary
5.2. Z score
5.3. Color coding
6. Different lengths of weighted factor estimates
6.1. Five minutes
6.2. Two hours
6.3. Twenty hours
6.4. External expert data
7. Criteria
7.1. Strength of idea questions
7.2. Limiting factor questions
7.3. Execution difficulty questions
7.4. Externalities questions
8. Deeper reading

1. What is the weighted factor model

Broadly, the process of creating a WFM involves generating preset criteria and weightings and then evaluating how a possible option scores on each of these. WFMs often involve a number of preset criteria, typically ranging from three to twelve factors. They generate an endline score based on the option score and the criteria weighting (normally multiplied together). Both hard factors (such as population size in absolute numbers) and soft factors (such as a score out of ten for population size) can be used in WFMs. The way our team uses WFMs involves pre-generating consistent research questions that are asked across all charity ideas to produce a score for a given criterion.
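As an illustration of the mechanics described above, the sketch below computes an endline score from weighted criteria. The criterion names, weights, and scores are hypothetical placeholders chosen for this example, not CE's actual values.

```python
# Minimal weighted factor model sketch (hypothetical criteria, weights, and scores).
# Each idea receives a soft 0-10 score per criterion; the endline score is the
# sum of each score multiplied by its criterion weight.

WEIGHTS = {
    "strength_of_idea": 0.40,      # hypothetical weightings
    "limiting_factors": 0.25,
    "execution_difficulty": 0.20,
    "externalities": 0.15,
}

IDEAS = {
    "idea_a": {"strength_of_idea": 8, "limiting_factors": 5,
               "execution_difficulty": 6, "externalities": 7},
    "idea_b": {"strength_of_idea": 6, "limiting_factors": 8,
               "execution_difficulty": 7, "externalities": 5},
}

def endline_score(scores: dict) -> float:
    """Multiply each criterion score by its weight and sum the results."""
    return sum(WEIGHTS[criterion] * score for criterion, score in scores.items())

for name, scores in IDEAS.items():
    print(f"{name}: {endline_score(scores):.2f}")
```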
Example weighted factor model (Charity Entrepreneurship 2019)

2. Why is this a helpful methodology

The WFM is a highly versatile tool: it can clearly incorporate a large number of factors, including subjective ones, while still using a numerical calculation to determine the endline result. This allows it to produce surprising results and makes it easier to track down the factors that drive them.
Reasons this is a helpful tool (in rough order of strength)
Systematic idea comparison: WFMs encourage considering the same aspects of each criterion across multiple ideas. This allows much closer idea comparison than the other models, which each have more idea-to-idea variability. For example, comparing an idea’s limiting factor in very similar terms across all charity ideas gives a much stronger sense of how well an idea does in this respect.

Enables comparison of all ideas with equal rigor: We ensure that we apply equal rigor when evaluating ideas by answering the same research questions that define the criteria in the same way, spending the same amount of time on each idea, and evaluating each in the same way.

Reduced gaps: Many models can be heavily affected by unconsidered factors or gaps in the information. For example, if a single important factor were left out of a cost-effectiveness analysis (CEA), it would be hard to detect but could largely affect the results. Because the same questions are asked across all interventions and the same factors are filled in, there is a lower chance of gaps affecting one idea but not another in a WFM.

Allows integration of multiple factors: Many models are not conducive to combining many different factors into a single number. For example, CEAs do not handle limiting-factor concerns very well unless multiple CEAs are done for many different possible levels of scale. Similarly, many CEAs do not include strength of evidence other than as a simple discount at the end of the calculation, which does not capture how to weigh different types of uncertainty (e.g., Knightian vs. non-Knightian).

Sandboxing: A large difference between CEAs and WFMs is the total weight that a single factor can hold. In a CEA, one very large number can swamp many small numbers. For example, if an intervention affects a huge number of beings but has a very low chance of working, this initial huge number can make all the other numbers in the CEA trivial. Because each factor has an effective maximum weight, a single factor affects a WFM far less: its impact is “sandboxed” within a single factor.

Allows soft and hard inputs to be combined: Some important factors are easy to put a single hard number on, for example “total population affected by measles”; other factors are impossible to put a hard number on, for example “the tractability of founding a new charity in India.” In a WFM, these factors can be given a soft number in a consistent and comparable way. These soft numbers can then be combined with harder numbers via Z-scores to determine which ideas are outliers across many positive factors.

More angles for learning: One of the purposes of our research process overall is to generate better empirical information about how to rule charity ideas in or out more quickly in the future. The WFM is the only system we use in which subcomponents can be correlated individually with our endline results. For example, we could determine whether the evidence base strongly predicts which interventions are recommended after deep reports are conducted. Pulling out a single aspect like this from a CEA or expert interviews would not be easy.

Preregistration: In many ways a WFM leaves the fewest areas open to interpretation, with preset questions and descriptions of how different items would score decided ahead of time. This means that researchers with fairly different starting points and intuitions will more often reach the same conclusions than with systems that are more open to researchers’ interpretation. This concern most affects our informed consideration (IC) but can also largely affect CEAs.

Understandability: Intuitive systems can be built into a WFM, making it quick and easy to understand relative to other systems. Color coding is easily used to show areas of comparative strength and weakness across a large number of ideas. Both expert views (EpV) and IC lend themselves to written paragraphs, which are slower to digest. A CEA’s endline number is quick to understand, but the full logic and weightings behind it take longer to grasp than in any other system.

Encourages quantified consideration: Like CEAs, WFMs encourage quantified and numerical consideration of factors. By default, most people (including experts) do not think in quantitative terms. For example, when asked if an event will happen, most people treat this as a binary question (yes/no) rather than thinking about the probability of the event happening. WFMs require quantitative inputs for each variable, which encourages quantitative thinking and calibration (e.g., an event being 20% vs. 80% likely).

Can lead to novel conclusions: Like CEAs, WFMs can lead to surprising conclusions. Because the methodology and the calculations are preset, it is common that, after filling in the data, a WFM will suggest something to be high impact that would not have appeared so from a softer, higher-level look.

Makes it easier to communicate conclusions: Because all the factors are researched and scored separately, we can easily distill the advantages and disadvantages of each idea and explain why a given idea is better than the others.

3. Why it is not our only or endline perspective

Last year this model was the primary one we used when comparing charity ideas, although in some cases we also used unweighted factor models. We think that although this model has considerable promise, it also has many weaknesses, which can be counteracted by using multiple models. We also see considerable learning value in testing multiple models and seeing which ones best predict our endline mixed-model conclusions.
Flaws of WFM (in order of importance)
Not a commonly used methodology: WFMs are not commonly used in the same formal way we use them. Thus there are few established norms and a lower level of initial understanding from both researchers and readers. It also suggests there might be an unknown but good reason why this sort of methodology is not used more often.

Low flexibility: This system is the least flexible and adaptable across different charity ideas, with the questions, full methodology, and criteria weightings all preset. This reduces bias but can also give a large amount of weight to a factor that might be important overall but far less important for a specific idea.

Limited question cross-applicability: A subsidiary concern of flexibility is that some specific questions will not be important to cover, but research hours will go into them anyway. Likewise, a consideration that is important but specific to a given charity idea is less likely to be covered by this methodology.

Considerable upfront time required: A huge amount of upfront methodological time is required compared to other systems, because most of the methodology is designed ahead of time and closely followed throughout the process. This means that research is not produced for a long time at the start of a research year, and feedback loops for updating the methodology are slower.

Can make nonnumerical data look numerical: A concern with the WFM is that it assigns numerical ratings to nonnumerical data. This can both confuse and mislead people considering the objectivity of the system if not explained clearly.

Can be hard to determine the source or reasoning of weighting criteria: Endline weights are often the only factor closely examined, and because endline weightings represent a large number of questions and sources of evidence, it can be hard to track down which questions factored into a weighting and how heavily each was weighted.

4. How much weight we give to the weighted factor model

Despite its flaws, we view the WFM as a highly important aspect of our process; we see it as having many of the benefits of CEAs while being somewhat less error prone and less likely to have gaps. Ultimately we think that the WFM, as one of our four perspectives, will generally get between one-quarter and one-half of our total endline weighting, with considerable variation depending on the specific charity idea and cause area. We expect the WFM to be stronger in areas where there are many different factors at play and limited hard data.
5. How CE generated the weighted factor model

5.1. Factors summary

Related posts hyperlinked
Detailed information on what question is asked to research each criterion and how a score for each criterion is generated is covered below.

5.2. Z score

A Z-score is a numerical measurement, used in statistics, of a value’s relationship to the mean of a group of values, measured in terms of standard deviations from the mean. If a Z-score is 0, the data point’s score is identical to the mean score. A Z-score of 1.0 indicates a value that is one standard deviation from the mean. Z-scores may be positive or negative, with a positive value indicating the score is above the mean and a negative value indicating it is below the mean.
Z-scores can be used informally to:

i) Standardize values measured across multiple different criteria so they can be combined into an overall score and compared to other ideas. For example, we can have an overall Z-score for a given idea based on how it compares to an average idea in terms of CEA, expressed in $ per DALY; population size affected, expressed in millions; and crowdedness, expressed as the percentage of the problem addressed by other entities.

ii) Assess how a given idea scores compared to all the other ideas considered (including an average idea); for example, idea X is better than 70 percent of the ideas on our list.

iii) Spot which values are anomalous. For example, if one of the factors in the scale were an objective number such as population size, a Z-score would show which countries are outliers relative to others even though population size can differ by orders of magnitude.

iv) Reduce the risk of some biases. If scores are not converted to Z-scores, we may happen to use a higher range of values for one criterion but not for another, effectively changing its weight. For example, suppose a given intervention is evaluated on each factor on an arbitrary scale of 1 to 10. One criterion, scale, varies significantly, and you tend to give out sevens and eights frequently, while on the criterion of tractability you tend to give very consistent scores of four or five. The net effect is that even if you think tractability is more important, you end up weighting scale more heavily. Converting the scores to Z-scores takes care of this.

More on Z-scores can be found here and in “The Failure of Risk Management” by Douglas W. Hubbard.

5.3. Color coding

Color coding is used throughout the spreadsheet to increase ease of reading, with red values generally marking areas of weakness and green values marking areas of strength. This allows the reader to quickly see which areas to look into more deeply and which areas drive the resulting total score.
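As a rough sketch of how the standardization described in 5.2 and the color coding in 5.3 can fit together, the snippet below converts hypothetical raw criterion scores into Z-scores and buckets them into red, yellow, or green. The ideas, scores, and thresholds are illustrative assumptions rather than CE’s actual spreadsheet logic.

```python
import statistics

# Hypothetical raw scores for one criterion (e.g., "scale") across several ideas.
raw_scores = {"idea_a": 8, "idea_b": 7, "idea_c": 8, "idea_d": 3}

mean = statistics.mean(raw_scores.values())
stdev = statistics.stdev(raw_scores.values())

# Z-score: how many standard deviations each idea sits from the mean.
z_scores = {idea: (score - mean) / stdev for idea, score in raw_scores.items()}

def color(z: float) -> str:
    """Map a Z-score to a color bucket (illustrative thresholds)."""
    if z >= 0.5:
        return "green"   # area of comparative strength
    if z <= -0.5:
        return "red"     # area of comparative weakness
    return "yellow"      # roughly average

for idea, z in z_scores.items():
    print(f"{idea}: z = {z:+.2f} ({color(z)})")
```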
6. Different lengths of weighted factor estimates

6.1. Five minutes

Process
In five minutes, the larger-scale questions cannot be considered in depth. Instead, each factor can be intuitively considered and given a ranking; these rankings can then be added together, resulting in a total score. The factors should be understood first (by reading the content in this document and the linked content for each of the metrics).

Questions to consider
Expected outcomes
6.2. Two hours

Process
At the two-hour stage, each question can be considered from the full set of prebuilt questions, with deeper research occurring for the most important question within each section. At this stage the questions work more as a guide than as a necessary list to answer. This year, this stage will only be applied to animal advocacy; in other cause areas we will use a different two-hour methodology.

Questions to consider

The weighted factor model questions

Expected outcomes
6.3. Twenty hours

Process
At twenty hours, each question in the document should be considered, researched, and answered. Ratings should be given thoughtfully and updated as new evidence comes in.

Template for 20h WFM: These are guide questions and should be answered, even if tentatively. More questions that seem applicable to the specific charity idea can be added and answered.
7. Criteria

7.1. Strength of idea questions

Key question: When you look at the theory of change, including an assessment of the strength of evidence and a rough cost-effectiveness model, how promising does this idea look?
Theory of change and cost effectiveness

We suggest completing this section at the end of the factor model. NB: do not build a cost-effectiveness model here, because that will be an entirely different section. This section is more to flesh out other people’s CEA models and plausible theories of change, which will then be used to create a CEA.

Key question: What is the plausible path to impact of this charity idea? What do current estimates or expert views of cost effectiveness look like? Does it seem to compare favorably to other ideas in the area?
Evidence

Key question: Overall, how well evidenced does this intervention look? Does the supporting evidence come from many different robust sources?
Evidence: Robustness
7.2. Limiting factor questions

Key question: What is the main limiting factor to scaling this intervention? At what size does it cap the intervention?
Funding availability
Talent availability
Size of problem
Logistical bottlenecks
7.3. Execution difficulty questions

Key question: Overall, how hard is it to set up and run this intervention well relative to others on the list?
Difficulty of founding
7.4. Externalities questions

Key question: What other possible negative and positive effects will this charity have? How large are they estimated to be? How much evidence is there? How much confidence do you have in the effects?
Within cause area
Outside of cause area
Information value
Other
How WFM compares to other methods used (timeline of an 80-hour report)
Expected outcomes
8. Deeper reading

1) Our process for narrowing down which charity ideas to research
2) Metrics
3) Cost effectiveness
4) The importance of evidence
5) The importance of being flexible
6) Why you should care about scalability
7) Why you should care about indirect effects
8) How logistics will influence which intervention you pick
9) Counterfactual impact: what would happen if you didn’t act?
10) Why we look at the limiting factor instead of problem scale
11) Using a spreadsheet to make good decisions
12) Sequence thinking vs cluster thinking
13) Larrick, Richard P. “Broaden the decision frame to make effective decisions.” Handbook of Principles of Organizational Behavior (2009): 461–480.