The Midsummer Crime Cluster: an investigation
the true crime that's worth your time
For years, I'd thought in passing about putting my July-tragedy-cluster theory to the test…and then decided to think about something else instead. The theory itself – that major-case crime events and other horrific tragedies occur at a higher rate or volume in the middle of July – just sounds old-wives'-talesy on its face, too much so to bother running numbers on. (See also: this Forbes piece from last summer, primarily a lament about mass shootings followed by a shrug emoji on causation.)
Plus, there's the confirmation bias: because many of the horrors in question land on July 17, my non-Zodiac dad's birthday, that means I'll notice a cluster in a way I wouldn't if the same event density manifested on, say, October 17.
And what would such a test even look like? How much data would I have to gather; how would I really know what any of it meant? Would it turn into a whole foil-hat boondoggle – or worse, I'd become like that satanic-ritual "expert" witness in the West Memphis 3 case who got his certification from the back of Mad Magazine or some damn thing? "Who needs that," I decided. "Nobody cares anyway."
Cut to my esteemed colleague Kevin Smokler, caring. Kevin texted me a couple weeks ago to ask for help locating my (too-)frequent mentions of the July Tragedy Cluster in the Best Evidence archives. I chased down a couple links for him; then I bounced some data-collection discussion prompts off him the other day. Kevin has always quite generously humored me about the July Tragedy Cluster, and I thank him for his service – go read his books! – and urge you to read his recent piece on the San Ysidro massacre.
But despite my best efforts to postpone the arithmetic of it all with a conversation about methodology, it became clear that I would have to 1) do math but 2) try to trust the process. Like, let's see if there's a "what" there first and worry about whether it's serving a "why" later.

The first task was to list all the crimes/tragedies in the cluster "range" – which started out as a week, July 12-19, but which I expanded to July 10-20. I had a running list in an Excel file, plus I combed through date articles and lists of unsolved murders and disappearances on Wikipedia.
Then it seemed like I should do a rough "qual pass" to weed out non-criminal tragedies like Edwardian fires and wrecks; count the events remaining; count how many qualified as "major case"; and note down both the raw numbers and the rough percentages.
It didn't take long for a "vapor data" issue to arise, because things may happen on a case on a certain day – trials begin; sentences get handed down – that aren't necessarily the "date in chief" for a given case. Jeffrey MacDonald's trial started within the date range in 1979, for instance, but the date in chief for that case is, I would say, in February, not July. So, in an attempt to correct for confirmation bias, those sorts of "weak-tie" events got chucked out as well.
That pass left me with 75 crime "events" in the 10 days, 26 of them major case (a handful of those are borderline, but a handful I didn't count are also borderline, so it should even out). So, about a third of them; not sure what that'll end up meaning.
Next, I scrolled through a dozen pages of Exhibit B. Books's major-case tag, planning to jot down 12-15. I ended up with more like 50; in any event, the idea here was to compile a non-exhaustive but still wide-ranging list of bold-type cases (I also Googled phrases like "most notorious true crime cases" and "most popular crime stories") that looked like a useful cross-section, and chart all those dates. That list contains cases you'd expect (Manson, Lindbergh), and some that might not stand the test of major-case time or "heft," but seemed worth including for my purposes here (Murdaugh, Hae Min Lee).

The takeaway after the major-case date listing…well, there were a couple, starting with the serial-killer problem. Beyond the problem of their being evil scourges who terrorize society, that is – but they're also fucking up my research program, because when you run those search terms I mentioned, you get top-ten lists, the most-searched crimes in your state, blah blah blah fishcakes, but in most of the variations, the list is like 70 percent serial-killer cases. This one: 80 percent. This one: 70 percent, at least – I didn't count Gein or Alcala, so probably more like 80 there too.
The issue here is that, depending on the case and the number of victims "assigned to" or claimed by the killer, one or more of their murders falling within the July date range isn't significant. I mean, unless it is, and the guy had some weird capital-T Thing about midsummer moon phases, but that's not per se relevant either; it's a serial killer, his rationalizations aren't ours to understand.
But between the high volume of victims or "events" within the container of a single case, the delay between when a given victim disappeared or got killed and when their remains might have gotten discovered…it's just difficult to tell whether I should bother checking the best-known serials' charge sheets for murders in the date range. The apparent randomness that makes them so terrifying also makes them hard to contextualize properly, I guess.
To give you a specific example, Ted Bundy committed two murders in the range: Janice Ott and Denise Naslund, two of the better-known victims on Bundy's list, taken by Bundy within hours of each other at Lake Sammamish on July 14, 1974. But Bundy killed dozens of people, so it would strike me as odd if he didn't show up in this range, not to mention that he may have killed others in that ten-day span and never got around to specifying them to law enforcement. But he does. Probably doesn't mean nothing; could mean anything.
At the same time, you have other famous cases, responsible for dozens of deaths, that don't pop in the July range at all; exactly one scene in the Greek tragedy that is the Kennedys is in the range; and for what it's worth, the question as to the significance of serial killers' "charting" in the range also comes up with organized crime. The percentage is lower, somewhat remarkably so – it's just Carmine Galante, really, of the front-page gangster killings; what, they all go down the shore for the summer and the Seaside boardwalk is neutral territory? – but it's the same qualitative point, to wit: in a so-called high-volume case like BTK or one of the Colombo Wars, does it mean anything that a part or fraction of it happened on a given date? Or is it just a day-ending-in-Y thing?

Don't know yet! I do know this agonizing is what you get when a poetry major tries to run a data set during a heat wave! Hang in there, y'all. So, the other takeaway: it looks like there's another cluster range in late November: JFK, DB Cooper, George Parkman, Moscone and Milk, Natalie Wood if you want to count that.
A bunch of reasons that could happen, a bunch of things that could look like reasons but may have nothing to do with anything, a bunch of ways to frame the range so you get more or "bigger" cases into the set – and it's pointless to speculate further until I pull a random comp range and look at those stats.
Over to random.org to get an arbitrary day of the year, and plot the ten days around that date. Because I want to finish this before next July, I went more cazh with the data collection here – i.e., I didn't make an Excel file – but I did throw out any dates that had Valentine's Day, Halloween, the winter holidays, or my birthday within the range, and settled on April 8. Putting that at the center of the range, I looked at April 3-13, using most of the same searches I used for the July list, which left me with 53 cases overall, 19 of them major-case – and then I added 10 percent to each of those numbers, to account for the lists I didn't check this time.
Like I said, the comp is more back-of-the-envelope, but again, trying to finish this…ever, so the April range nets out to 58 cases, 21 major cases. Fewer cases overall; slightly higher percentage of household-namers (36.2 percent, versus 34.7 for the July range).
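For anyone who wants to check the envelope, here is a minimal Python sketch of that arithmetic. The counts come straight from the text above; the function names (pad, share) are made up for illustration, and rounding the padded counts to the nearest whole case is an assumption about how the 10 percent got applied.

```python
def pad(count, fraction=0.10):
    """Add a fractional allowance for unchecked lists, rounded to a whole case.

    Rounding to the nearest case is an assumption, not a documented method.
    """
    return round(count * (1 + fraction))


def share(major, total):
    """Percentage of cases in a window that count as major-case."""
    return 100 * major / total


# July window: counts after the "qual pass".
july_total, july_major = 75, 26

# April comp window: raw counts, then padded by 10 percent.
april_total = pad(53)
april_major = pad(19)

print(f"July:  {july_major}/{july_total} = {share(july_major, july_total):.2f}%")
print(f"April: {april_major}/{april_total} = {share(april_major, april_total):.2f}%")
```

The padded April counts land on 58 and 21, and the two shares sit less than two percentage points apart, which is the "fairly stable" behavior you'd expect from the major-case percentage.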
What else can we take from those numbers, though? N…ot a whole hell of a lot, from the numbers themselves. I think there's some meaning in the higher number in July, but I can't prove it, because there's simply too much air in the data: should I have counted all criminal-justice "contact events"; did I use an agreed-upon definition of "major case"; what splits do I get if I comb Murderpedia for significant dates in serials' files, on and on. It's too small a sample size, and any calculator I used to determine the validity gave me a result of, like, plus or minus 28% – so, about as reliable as arson forensics. (That means "not very.")
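To put a rough shape on "too small a sample size," here is a hedged sketch of two standard checks, using only Python's standard library. It is not the calculator mentioned above, and it won't reproduce that plus-or-minus-28 figure. The assumptions are loud ones: the two windows are the same length, the events are independent of each other (serial-killer cases alone strain that), and the padded April estimate gets treated as if it were a real count. The function names are hypothetical.

```python
from math import comb, sqrt


def equal_rate_p_value(count_a, count_b):
    """Two-sided exact binomial p-value for 'both windows have the same rate'.

    If the underlying rates are equal, each of the n events is a fair coin
    flip between window A and window B.
    """
    n = count_a + count_b
    hi = max(count_a, count_b)
    # Chance that one particular window gets at least `hi` of the n events.
    tail = sum(comb(n, k) for k in range(hi, n + 1)) / 2 ** n
    # Double it for the two-sided test; cap at 1 for evenly split counts.
    return min(1.0, 2 * tail)


def margin_of_error(major, total, z=1.96):
    """Normal-approximation 95% margin, in percentage points, on a share."""
    p = major / total
    return 100 * z * sqrt(p * (1 - p) / total)


p = equal_rate_p_value(75, 58)  # July total vs. padded April total
print(f"p-value for July vs. April totals: {p:.3f}")
print(f"July major-case share margin:  +/-{margin_of_error(26, 75):.1f} points")
print(f"April major-case share margin: +/-{margin_of_error(21, 58):.1f} points")
```

Under those assumptions, the p-value comes out well above the usual 0.05 cutoff, and the margins on the major-case shares run to ten-plus points each, far wider than the gap between the two percentages. None of which touches the definitional problems (what counts as an "event," what counts as "major case"); it only says that even taken at face value, one comp window can't settle it.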
Certainly the major-case percentages don't tell us anything – and I didn't expect them to; those should remain fairly stable across sets of data – although I might go back and look at the kinds of major cases, to see if, say, kidnappings cluster in a different range from U.S. assassinations or famous robberies. Maybe that's how I should have run the numbers from the beginning, now that I think about it, starting with the 10-12 most notorious of each type of case and seeing if/where they tended to bunch up.

But as far as the July range overall, I don't know what we can infer there. It may "matter" that the July number is higher; it may just be a thing that is. If I run the November range and get an elevated number, I could maybe draw some inferences – it's Thanksgiving; short-staffed institutions + frayed nerves + booze 'n' fam = you get it – but here again, it would depend on what kind of crimes get committed. If the range is the first ten days in August, same thing: what do Manson, Lizzie, and Watergate have in common besides single-name major-case status? Nothing.
So, the stats don't necessarily tell us anything, but the project of gathering them might, like the fact that, although a big part of what attracts us to true crime is the desire to control chaotic evil with information, serial killers in particular defy our predictive efforts. (Yeah yeah, VICAP, but with credit for the intent, I'd classify it as more of a who (or whether) database than a when algorithm, if that makes sense.) Like how we think about major cases. Like whether rising global temperatures will find another way to make climate change lethal.
I'll keep collecting info on this; it's worth noting that I had to update the numbers more than once after finishing the main draft of this piece, sigh. If any of you want to roll into the B.E. stat lab and fine-tune the July Crime Cluster idea; peek at my spreadsheets; or have other ideas about organizing or interpreting this data, I'd love to hear them; drop down to the comments, or shoot us a call/text at 919-75-CRIME.
First off, if you gather the data, I will be more than happy to run the statistical analysis on it. Second, my working hypothesis would be that crimes would go up in the Summer because people are more likely to be going out at night than in the winter (darker earlier, colder). This means more opportunities for criminals to encounter victims in the dark, when they’re outside of their homes. As my criminologist colleagues argue, crime requires three elements: a potential perpetrator, a potential victim, and a place where the two come together. If people are more mobile in the Summer, you’re going to get more of the third.