Center for Problem-Oriented Policing

Step 20: Formulate hypotheses

Whenever we confront some new and perplexing crime pattern, we form hypotheses about its causes, often based on incomplete information. Experience and theory are good sources of hypotheses. You should (1) clearly state your hypotheses, (2) not be wedded to them, and (3) use data to objectively test them. Expect all hypotheses to be altered or discarded once relevant data have been examined, because no hypothesis is completely right. For this reason, it is often best to test multiple conflicting hypotheses.

A set of hypotheses is a roadmap for analysis. Hypotheses suggest types of data to collect, how these data should be analyzed, and how to interpret analysis results. If you were investigating drinking-related assaults in bars, you might begin with the question, "How many bars are problem locations?" Based on the 80-20 rule (Step 18), you would state the hypothesis that some bars will have many fights, but most will have few or none. You would then test this hypothesis by listing the licensed drinking places and counting the number of assault reports at each over the last 12 months.
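The counting test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `concentration`, the bar names, and the report counts are all hypothetical, and the 20 percent cutoff is simply the conventional reading of the 80-20 rule, not a fixed standard.

```python
from collections import Counter

def concentration(assault_reports, licensed_bars, top_share=0.2):
    """Return the share of assaults that occurred at the busiest `top_share` of bars.

    assault_reports: one bar name per assault report (hypothetical input format).
    licensed_bars:   every licensed drinking place, including those with no reports.
    Reports naming a place that is not on the licensed list are ignored.
    """
    counts = Counter(assault_reports)
    # Bars with zero reports must be included, or the concentration is understated.
    per_bar = sorted((counts.get(bar, 0) for bar in licensed_bars), reverse=True)
    total = sum(per_bar)
    if total == 0:
        return 0.0
    top_n = max(1, round(len(per_bar) * top_share))
    return sum(per_bar[:top_n]) / total

# Hypothetical data: ten licensed bars, one list entry per assault report
# filed over the last 12 months.
licensed_bars = [f"Bar {letter}" for letter in "ABCDEFGHIJ"]
assault_reports = (
    ["Bar A"] * 14 + ["Bar B"] * 10 + ["Bar C"] * 3 + ["Bar D"] * 2 + ["Bar E"]
)

share = concentration(assault_reports, licensed_bars)
print(f"Top 20% of bars account for {share:.0%} of assaults")
```

A share well above 20 percent supports the hypothesis that a few bars produce most of the assaults. A share close to 20 percent suggests assaults are spread fairly evenly across bars, which contradicts it.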

If your hypothesis was supported, you might ask the question, "What is different about the bars with many assaults compared to the bars with few?" The concept of risky facilities (Step 28) would help you form a set of three hypotheses:

  1. Risky bars have more customers.
  2. Risky bars have features that attract assaulters.
  3. Bar staff in risky bars either fail to control behaviors or provoke fights.

You can test these hypotheses by gathering data on the number of customers at high- and low-risk bars, analyzing the number and rate of assaults per customer, observing the interactions of people at troublesome and trouble-free bars, and interviewing staff and customers.

If your first hypothesis was contradicted by the data, and you found that there was no great difference in numbers of assaults across drinking establishments, then you might ask the question, "Why are so many bars troublesome?" This suggests another hypothesis: it is a perception problem, and the city has about as many bar assaults as other comparable cities. Testing this hypothesis requires data from comparable cities.

If, after you collected the relevant data, you found that your city has an abnormally high number of problem bars, you might ask the question, "What is common to most bars in the city that produces a large number of assaults?" One hypothesis is that it is the way liquor licenses are dispensed and bars regulated. Another hypothesis is that there is something about the nature of bar customers in your city. Testing each would require you to collect relevant data and assess the validity of the hypothesis.

Notice how the questions and hypotheses structure the analysis. Test results, whether positive or negative, reveal new, more specific questions. The objective is to start with broad questions and hypotheses and, through a pruning process, come to a set of highly focused questions that point to possible responses.

Hypotheses suggest the type of data to collect. In the bar assaults example, the test of each hypothesis requires specific data. Sometimes the same data can test multiple hypotheses (as is the case with choosing among the three alternative explanations for risky bars). Often a variety of data is required to select among alternative hypotheses (as is the case with the last set of hypotheses). The more specific your hypotheses, the more focused your data collection will be. This is why it is more important to have a clear hypothesis you personally dislike than an unclear hypothesis you approve of or, worse, no hypothesis at all.

Paralysis by Analysis

The lack of explicit hypotheses can lead to "paralysis by analysis": collecting too much data, conducting too much analysis, and not coming to any useful conclusion.

Hypotheses can help direct the analysis of data. Every clear hypothesis suggests a pattern of data that you should be able to observe if the hypothesis is correct. In the example above, the hypotheses derived from the concept of risky facilities can be tested using a simple analytical procedure. If a bar is a crime generator, then you should see a high number of assaults, a high number of customers, but a low assault rate (see Step 17). Failure to find this pattern suggests the hypothesis is wrong. So it is important to have a clear idea of what you should observe if your hypothesis is correct, and what you should observe if your hypothesis is wrong (see the third column of the table). If you cannot do this, your hypothesis may be too vague.
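As a rough illustration of checking for the crime generator pattern, the sketch below flags bars that have many assaults and many customers but a low assault rate. The function name `fits_generator_pattern`, the figures, and the use of the median as the dividing line between "high" and "low" are all assumptions made for this example. In practice you would choose thresholds that suit your own data.

```python
from statistics import median

def fits_generator_pattern(bars):
    """Flag bars with above-median assaults and customers but a rate at or below the median.

    bars: dict mapping bar name -> (assaults, customers); customers must be positive.
    Rates are expressed as assaults per 1,000 customers.
    """
    rates = {name: 1000 * a / c for name, (a, c) in bars.items()}
    med_assaults = median(a for a, _ in bars.values())
    med_customers = median(c for _, c in bars.values())
    med_rate = median(rates.values())
    return {
        name: a > med_assaults and c > med_customers and rates[name] <= med_rate
        for name, (a, c) in bars.items()
    }

# Hypothetical figures: (assaults in 12 months, estimated customer visits in 12 months)
bars = {
    "Bar A": (40, 400_000),  # many assaults, many customers, low rate
    "Bar B": (30, 20_000),   # many assaults, few customers, high rate
    "Bar C": (4, 25_000),
    "Bar D": (2, 15_000),
    "Bar E": (1, 10_000),
}

for name, matches in fits_generator_pattern(bars).items():
    print(name, "fits the crime generator pattern" if matches else "does not fit")
```

With these made-up numbers only Bar A matches. Bar B has many assaults but few customers and a high rate, so the crime generator hypothesis fails for it and one of the other explanations needs testing.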

Hypotheses help interpret the analysis results. Let's assume that the analysis of bar fights showed that a few bars had most fights, and observations of the high- and low-risk bars indicated that the security staff of the risky bars provoked fights. This immediately suggests a possible avenue for intervention. In short, the validity of a hypothesis must make a difference. That is, if the hypothesis is true, you will make a different decision than if it is false. If you will make the same decision regardless of the test results, then the hypothesis and its test are irrelevant.

In summary, hypotheses are important for guiding analysis. To formulate hypotheses you need to ask important questions, then create simple and direct speculative answers to these questions. These answers are your hypotheses. These speculations must be bold enough that they could be wrong, and there must be a way of showing whether they are right or wrong. If possible, create competing hypotheses.

Hypothesis formation is a useful group exercise, as it allows participants with contrary views to put their perspectives on the table in a way that allows clear and objective tests. In this way, participants contributing invalid hypotheses make substantial contributions to the analysis of the problem. If each hypothesis is linked to a potential solution, the test of these hypotheses simultaneously directs attention to feasible responses and rules out ineffective approaches.

Questions, Hypotheses, and Tests

| Question | Example Hypothesis | Possible Test |
| --- | --- | --- |
| Why is this place risky? | It is due to a large number of targets being available. | Count the targets at the location and calculate the crime rate. Compare this rate to rates for the surrounding area. If the risky place has a higher rate, the hypothesis is false; if it is about the same or lower, then the hypothesis is true. |
| Why are there more car thefts in the problem area than in nearby areas? | Residents park their cars on the street, unlike residents of other areas. | If the problem area has similar or lower on-street parking rates than the others, reject the hypothesis. If higher, accept it. |
| Why did the theft of copper piping from new construction suddenly increase? | A new owner bought a nearby scrap metal dealership. | Compare the thefts of piping before and after the change in owners. If the theft rate is the same before and after, or the trend in thefts was already going up before the change, then the hypothesis is probably false. If otherwise, the hypothesis appears reasonable. |
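The before-and-after test in the last row of the table can be sketched as follows. This is a minimal illustration, not a formal statistical test. The function names, the monthly counts, and both thresholds (`min_increase` and `max_prior_slope`) are hypothetical choices made for the example.

```python
from statistics import mean

def slope(values):
    """Least-squares slope of values against their position (0, 1, 2, ...).

    Requires at least two values.
    """
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den

def ownership_hypothesis_plausible(before, after, min_increase=0.25, max_prior_slope=0.5):
    """Apply the two checks from the table to monthly theft counts.

    The hypothesis looks reasonable only if thefts rose after the change in
    owners AND thefts were not already trending upward before it.
    Both thresholds are arbitrary assumptions for illustration.
    """
    rate_rose = mean(after) >= mean(before) * (1 + min_increase)
    already_rising = slope(before) > max_prior_slope
    return rate_rose and not already_rising

# Hypothetical monthly counts of copper piping thefts.
flat_then_jump = ownership_hypothesis_plausible([3, 4, 2, 3, 4, 3], [7, 8, 9, 8, 10, 9])
steady_climb = ownership_hypothesis_plausible([1, 2, 4, 5, 7, 8], [9, 10, 11, 12, 13, 14])

print("Flat, then a jump after the sale:", flat_then_jump)
print("Already climbing before the sale:", steady_climb)
```

In the first series thefts are flat and then jump, so the hypothesis appears reasonable. In the second the increase was already under way before the sale, so the hypothesis is probably false.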