A couple of weeks ago, the Pandemic Mitigation Collaborative rolled out some updates to their models, which I’ve posted about before: here and here. Since then, I’ve seen Eric Topol, director of the Scripps Research Translational Institute and member of the National Academy of Medicine, cite the “one million cases per day” pseudo-statistic on PBS Newshour, which sucks. I’ve seen the figures from this model cited all over the news — in the absence of real information, grifters like Hoerger fill the void, and with the sorry state of media today, nobody is any the wiser and his bullshit claims get spun up and amplified into synthetic truths. It’s really awful to watch. I am critiquing these models as a PhD epidemiologist. This is exactly my lane, this is exactly my expertise, and people like me and our insights are being completely marginalized as the pandemic grifters crowd us out with fake data, hyperbole, truly horrendous and stigmatizing claims about “airborne AIDS,” and all other manner of dog shit.
The PMC published a “technical appendix” about the updates to their models. I reviewed the technical appendix, and it is one of the most bizarre documents I’ve ever laid eyes on. So, in today’s bonus edition of this newsletter, let’s close-read it — or at least some of it (it’s a Friday evening; I will get to anything I don’t cover today at a later time). The TL;DR: Dr. Hoerger definitely read my previous posts (LOL) but doesn’t really have a response to any of the critiques in them, and the technical appendix doesn’t go far enough in elucidating exactly what their modeling methodology is or how accurate it is. I am renewing my call for the PMC to publicly share the code they use to run these models (this can be done easily on GitHub and is in fact common practice among working scientists, at least in epidemiology) and will repeat it at the end.
I’m going to take it by sections as they appear in the appendix, with heading titles bolded and italicized, and direct quotes from the appendix italicized. My commentary will appear below any quoted text, not italicized.
***What’s new?***
*We previously relied solely on Biobot for forecasting and a Biobot-IHME data linkage for case estimation. It was a Biobot-heavy model. The current model is not tied strictly to any data set, but rather the PMC’s best estimate of the truth, a true-case model that uses multiple data sources in the spirit of IHME’s original work in this area. Essentially, we link all three data sources, which have been active over different points of the pandemic to derive a composite “PMC” indicator of true levels of transmission. The indicator is weighted based on which data sources were available and their perceived quality at each point in time. We scale this composite PMC indicator to the metric the CDC uses when helpful for comparisons with their website, and scale it with the true case estimates of the IHME otherwise, as true cases are more relevant than arbitrary wastewater metrics.*
Because it is not clear from the text: Biobot and the CDC are sources of wastewater data. The IHME was a source of modeled case estimates (estimates produced by a model, not counts of actual cases), but is no longer active. My first post was about the impossibility of estimating cases from wastewater levels. I won’t rehash that here, but it is extremely misleading for this author (is it Dr. Hoerger? I don’t know!) to refer to “true case estimates” from the IHME, since the IHME estimates a) were not counts of cases but modeled estimates and b) no longer exist. Linking together three shit data sources doesn’t give you a good data source. It gives you a bigger piece of shit. But we’re not done polishing it yet!
*A great feature of the model is that it continues to integrate real-time data from Biobot and the CDC. From the perspective of Classical Test Theory, this is a huge advantage, as it provides a much more reliable indicator of what is currently happening with transmission. Both sources often make retroactive corrections for the most recent week’s data, sometimes sizable, and pitting the two indicators against one another reduces measurement error on average, which offers vital improvements in forecasting.*
It might sound impressive that the model “continues to integrate real-time data from Biobot and the CDC” until you realize that this is meaningless — these are both sources of wastewater data. The fundamental flaw in the PMC model is that the methodology they are using to translate wastewater concentrations to case counts is bunk, so fatally flawed as to be not worth doing. I also must flag the mention of Classical Test Theory. Classical Test Theory is a theory of psychometric testing that attempts to deal with measurement error in people’s scores on psychometric tests (like IQ tests, the Myers-Briggs, or aptitude tests). I’m not that familiar with it, but it seems like pretty basic shit to me. I don’t think Classical Test Theory means that “pitting” the indicators against each other “reduces measurement error on average.” Averaging two noisy indicators can shrink their random error, but it does nothing about error the two sources share. I can sort of see where this is coming from, but it’s wrong, and it’s a (futile) attempt to make it seem like anyone at the PMC knows how to handle/model complex scenarios using aggregated data.
What’s more, measurement error is not the only kind of relevant error. In epidemiology (and probably lots of other disciplines as well), we roughly categorize errors as “random” or “systematic.” Measurement error can be a form of either type. Random errors are those that just happen randomly from measurement to measurement — if you take your temperature four or five times in a row, you’ll probably get roughly the same, but not exactly equivalent, readings each time. Systematic error, on the other hand, is also called bias. If the thermometer you were using were broken in some way such that it always shaved ten degrees off the temperature it recorded (or always added ten degrees), giving a quantity that is systematically shifted away from the true value, that would be an example of systematic error. It’s sort of a hoary old chestnut in statistics, called the “law of large numbers,” that as the sample size or number of observations increases, a sample statistic like the mean converges to the “true” value — that is, the random errors cancel each other out.* This does not apply in situations of systematic error: no amount of extra data averages away a bias. The wastewater data certainly feature systematic error, which you can refer back to my other posts to read more about.
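To make the distinction concrete, here is a toy simulation. Every number in it is invented (a thermometer reading a true temperature of 98.6°F); it has nothing to do with the PMC’s actual model, which remains unpublished:

```python
import numpy as np

rng = np.random.default_rng(0)
true_temp = 98.6

for n in (5, 500, 50_000):
    # Random error: readings scatter symmetrically around the truth
    noisy = rng.normal(loc=true_temp, scale=0.5, size=n)
    # Systematic error: a broken thermometer shaves ten degrees off
    biased = rng.normal(loc=true_temp - 10.0, scale=0.5, size=n)
    print(f"n={n:>6}  mean(noisy)={noisy.mean():.2f}  "
          f"mean(biased)={biased.mean():.2f}")

# mean(noisy) homes in on 98.6 as n grows (law of large numbers);
# mean(biased) homes in on 88.6. More data never fixes the bias.
```

The noisy mean gets arbitrarily close to 98.6; the biased mean gets arbitrarily close to the wrong number. That is the distinction the appendix’s invocation of Classical Test Theory glosses over: averaging more data, or more data sources, only cancels the random part of the error.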
***What are the biggest improvements in the model?***
*Accuracy in Real-Time Data – In integrating two active surveillance data sources, the real-time data will be more accurate. The biggest predictor of next week’s transmission levels, and the shape of how transmission is increasing or decreasing, accelerating or decelerating, is the current week’s real-time data. If the real-time data are off by 5% or 10%, the big-picture take on the forecast will still be reasonable, but a more precise estimate allows for greater accuracy in estimating the height and timing of waves.*
Misleading!!! Notice how the author of this document has slid effortlessly into talking about transmission — predictors of next week’s transmission, accelerating or decelerating transmission, etc. But the PMC models have no transmission data. They have wastewater data from CDC and Biobot, and they have extremely outdated modeled estimates (not actual cases) from the now-defunct IHME modeling project. The big issue, which remains unaddressed, is the method for estimating cases (transmission) from wastewater (not transmission). Spoiler alert — they will not address this directly.
***What’s the Same in the Current Model?***
*The analytic assumptions underlying the forecasting model remain the same. It uses regression-based techniques common across all industries, using a combination of historic data (median levels of transmission for each day of the year) and emerging data from the past four weeks to characterize how transmission is growing or shrinking. Holidays and routine patterns of behavior that map on well to a calendar are “baked in” to the historic data. “New variants” and atypical patterns of behavior are baked into the data on recent patterns of transmission. It’s a top-down big picture model.*
Regression-based techniques common across all industries… except epidemiology or infectious disease modeling, apparently. Care to state what those techniques actually are? The description is so vague that any number of models fit it (see the sketch below). New variants are “baked in” to the data on recent transmission, my ass. The PMC model does not have estimates of transmission. It’s not “top-down” or “big picture,” it’s a machine that turns garbage into more garbage.
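To illustrate just how underdetermined that description is, here is one of many possible models consistent with it. To be clear, this is purely my reconstruction; every function name, window, and weight below is my invention, not the PMC’s code:

```python
import numpy as np

def one_week_forecast(recent, seasonal_median, blend=0.5, weeks=4):
    """One of many models matching the appendix's description:
    extrapolate a linear trend from the last `weeks` observations,
    then blend it with the historic median for that calendar day.
    The blend weight is arbitrary; the appendix never specifies one."""
    y = np.asarray(recent[-weeks:], dtype=float)
    t = np.arange(weeks)
    slope, intercept = np.polyfit(t, y, 1)   # fit the recent trend
    trend_pred = intercept + slope * weeks   # extrapolate one step ahead
    return blend * trend_pred + (1 - blend) * seasonal_median
```

Change the blend weight, the window length, or the trend model and you get materially different forecasts, all equally compatible with “regression-based techniques common across all industries.” Which is exactly why the code needs to be public.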
***What are the Biggest Drawbacks of the New Model?*** (ed. note, the capitalization in this document drives me fucking nuts)
*Documentation of Accuracy – We have excellent data on the accuracy of the prior model and will submit a report for publication shortly. All prior reports are publicly available. Many report quick facts on longitudinal accuracy, international comparisons, use in news articles, and references to use in peer-reviewed scientific journal articles. We cannot document the real-time accuracy of the new model yet, but know that when using historical data, the model accounts for 98% of the variability in wastewater transmission 1-week into the future, which is 2% higher than our prior model. The vast majority of forecasting errors have been and will continue to be based on inaccuracies in the real-time data wastewater surveillance companies report, and the model changes reduce those issues. We hope you will trust our history and that the methodologic changes represent improvements.*
R-squared, the 98% number, is not an index of the accuracy of a model’s predictions. It simply isn’t. Dr. Hoerger can cite this till he’s blue in the face, and it will never be evidence that this model is accurate. On a smooth, autocorrelated time series, a huge one-week-ahead R-squared mostly measures how smooth the series is; a no-skill forecast that simply carries this week’s value forward scores nearly as well (see below). There is no basis to say that the “vast majority” of forecasting errors are based on inaccuracies in wastewater surveillance — in fact, the forecasting errors, which the PMC team are completely ignoring, are most likely a result of the erroneous wastewater-to-cases estimation method and the extremely messy, disjointed use of multiple shitty data sets from different time periods. The model changes do not reduce these issues, nor the issues with inaccuracies in the wastewater data (how would you even be able to fucking tell how accurate wastewater data is without transmission data?). As for the last sentence, how about a resounding FUCK NO? Why should anybody trust you? This is science, not team-building. Show your work and quit crying.
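Here is a quick demonstration, on a synthetic wave (fake data I made up, not any real wastewater series), of how cheap a 98% one-week-ahead R-squared is:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(156)  # three years of weekly observations
# A smooth synthetic wave, loosely wastewater-shaped; not real data
series = 50 + 40 * np.sin(2 * np.pi * t / 52) + rng.normal(0, 2, t.size)

# No-skill "model": forecast next week as simply this week's value
pred, actual = series[:-1], series[1:]
ss_res = np.sum((actual - pred) ** 2)
ss_tot = np.sum((actual - actual.mean()) ** 2)
print(f"R^2 of pure persistence: {1 - ss_res / ss_tot:.3f}")  # ~0.98
```

A forecast with zero knowledge of transmission, variants, or anything else “explains” roughly 98% of the variance one week out, simply because adjacent weeks of a smooth series are nearly identical. Matching that number is not evidence of skill.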
***Data Integration*** (ed. note, this is mostly word salad and has been abridged for length)
*In comparing data files, we noted that the lag phase varied marginally over time in some data sets (e.g., 7 day lag versus 2 day lag) and corrected the files accordingly. This allowed the longitudinal transmission estimates to line up closely. Then, we developed conversion multipliers to go from the metrics of one data set to another using a 10% trimmed mean. The intercorrelations among IHME, Biobot, and CDC ranged from .93 to .98 (all near perfect), indicating that they all are getting at the same thing. Correlations >.70 would be desirable, and the >.90 values indicate extremely high validity across multiple teams with different methodologies, basically that they all are getting at the same thing (construct validity). The minor discrepancies are likely reasonable given different assumptions or geographic coverage. All three data sources were converted to a single PMC transmission metric, which was then standardized to both the CDC levels as well as the IHME true case estimate. More details on the date, weights, and conversion multipliers will appear in an eventual publication; however, this should suffice to demonstrate the general methodologic approach. Overall, the intercorrelations among data sets were extremely high, near-perfect, and much more encouraging than what we would have imagined or been willing to integrate effectively.*
It’s absolutely wild to make correlations do all this work. Correlations don’t mean shit if your data are all crappy and your methodology is bunk. A correlation coefficient is blind to scale and offset: two series can be linear distortions of each other, off by orders of magnitude in level, and still correlate at .98 (see below). The author of the document is trying to reify a quantity that doesn’t exist. This does not suffice to demonstrate the general methodological approach. The IHME estimates are not true cases. The mention of construct validity makes me think that Dr. Hoerger has read my posts. Correlations don’t indicate “validity” of any kind, and certainly not construct validity, if they don’t take into account the deficiencies in the underlying data. Big yawn.
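Here is a toy example (all numbers invented, standing in for the three data sources) of why near-perfect intercorrelations prove nothing about case estimation:

```python
import numpy as np

rng = np.random.default_rng(2)
truth = 1000 + 800 * np.sin(np.linspace(0, 6 * np.pi, 150))

# Three fake "sources," each a different linear distortion of one signal
a = 0.002 * truth + rng.normal(0, 0.05, 150)      # tiny assay-like units
b = 5.0 * truth - 2000 + rng.normal(0, 150, 150)  # inflated and shifted
c = 0.5 * truth + 300 + rng.normal(0, 40, 150)    # deflated and shifted

for name, (x, y) in {"a,b": (a, b), "a,c": (a, c), "b,c": (b, c)}.items():
    print(name, f"r = {np.corrcoef(x, y)[0, 1]:.3f}")  # all ~0.99
```

All three series intercorrelate near-perfectly while sitting at levels that differ by orders of magnitude. Correlation cannot tell you which scaling, if any, corresponds to actual case counts; that is precisely the information a “simple multiplier” claims to supply.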
***Case Estimation*** (ed. note — also abridged to draw out the most interesting parts)
*The PMC composite estimate of transmission correlates near-perfectly (r=.98) with the IHME model’s estimate of “true” cases (not merely reported or counted cases), with the intercorrelations among individual data sets all >.93. A simple multiplier is used to convert the PMC composite indicator to IHME estimated daily cases.*
So my original critique was correct: they are doing this modeling incorrectly, and incorrectly interpreting (and falsely presenting) correlation coefficients as indicative of model accuracy. Further, I cannot say this enough: the IHME estimates are not true cases, and they don’t even exist anymore — so these correlations are based on past estimates of unspecified vintage whose relevance to the present day is completely unknown, but likely not very great. A single fixed multiplier, calibrated against a defunct model, also cannot adapt if the relationship between cases and wastewater signal ever shifts (see below).
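A toy illustration (every number below is invented) of how a fixed multiplier fails the moment the case-to-signal relationship changes, say because a hypothetical new variant doubles per-case shedding:

```python
import numpy as np

# True weekly cases are flat; per-case shedding doubles halfway through
true_cases = np.full(100, 500_000.0)
shedding = np.where(np.arange(100) < 50, 1.0, 2.0)
wastewater = true_cases * shedding * 1e-6   # arbitrary assay units

# One fixed multiplier, calibrated on the first half of the record
multiplier = true_cases[:50].mean() / wastewater[:50].mean()
estimated = multiplier * wastewater

print(f"{estimated[0]:,.0f}")    # 500,000: correct, before the shift
print(f"{estimated[-1]:,.0f}")   # 1,000,000: a phantom doubling of cases
```

Transmission never changed, but the multiplier method announces a million cases a day. And because the estimated series still correlates perfectly with the wastewater signal it was derived from, none of the correlation checks in this appendix would ever catch the error.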
*Wastewater data are extremely valuable for tracking transmission. Among individuals who are new to tracking wastewater surveillance data, a common concern is that new subvariants could lead to differences in the quantity of virus individuals excrete into the wastewater. Such concerns are reasonable when considering a particular local wastewater tracker with unknown methodology or even in the WastewaterSCAN dashboard (which we do not presently use), where the waves get unrealistically bigger and wider. However, in the CDC and Biobot data, such a critique would be a bit Dunning-Kruger as it would assume that one has a vastly more sophisticated understanding of how to standardize wastewater data than the environmental scientists and epidemiologists who are trained and experienced in that exact niche.*
Wastewater data is valuable for getting a broad and general sense of transmission trends. It is not in fact usable for “tracking transmission.” It’s the Dunning-Kruger reference (sooooo fucking weird for a technical appendix, no?) that really gets me. The author is referring to the Dunning-Kruger effect: “a cognitive bias in which people with limited competence in a particular domain overestimate their abilities.” I am trained as an epidemiologist — in exactly this — at a terminal, PhD level. My issue, as I keep repeating, is not with the wastewater surveillance, or the teams who run it, as I’m certain that those people do know how to standardize wastewater data (this refers to things like adjusting the data for the number of people/animals contributing to a watershed). The existence of wastewater data has never been my issue. My issue is that Dr. Hoerger uses wastewater data and spurious methods to cook up an estimate of daily cases, which is wrong, but which is nevertheless being widely cited and amplified in the information vacuum around COVID-19 created by the federal government. Again, my PhD is in doing exactly this: evaluating the quality of different data sources and statistical analyses, and actually doing statistical analyses of messy observational data to arrive at accurate or at least internally valid inferences. Dr. Hoerger’s PhD is in clinical psychology, which is not this. If anyone is suffering from the overestimation of their abilities in a domain of limited competence, I’m sorry to say, it’s him — he’s playing around in data of huge national import like a toddler in a sandbox.
*At a more simplistic level, we simply do not see true case estimates (or percent infectious estimates) bouncing around from absurdly low to impossibly high values, and certainly not at random time points throughout the year when a new subvariant become dominant. Thus, surveillance and case estimation should be rigorous, and any outright dismissal viewed as specious, and sometimes driven by substantial financial conflicts of interest where people are making considerable advertising and subscription revenue by feeding people pleasant denialism. Dozens of publications show the importance of wastewater surveillance for estimating community transmission and other metrics, which can be found through a Google Scholar search for covid wastewater cases and using similar terms.*
Well, one reason you don’t see case estimates bouncing around is that you do not have true case estimates. The case estimation strategy here is not rigorous, it is bald misinformation. The amount of time and energy I have put into debunking these shitty models is not “outright dismissal,” it’s very carefully considered and highly informed dismissal. And, I have no financial conflicts of interest — I’m not an academic anymore, I’m a person of zero relevance in any professional sphere, I’m broke, I’m not a covid denier, and this newsletter is free. LOL.
I’ll be back sometime soon to recap the last two pages of this weird document. The technical appendix is littered with references to fears that if they post any more details about their methodology, people will steal from them. This is a deflection; all sorts of researchers post all sorts of code on GitHub all the time, I’ve done it! Dr. Hoerger, if you’re reading this: post your code or get lost, grifter!