What constitutes “turnover-worthy?”
I looked it up. I was hoping that a skilled human being watched the film, made assessments, and tallied them (thus creating a number). Then, of course, all these numbers -- because they are numbers -- would be compiled and subjected to all sorts of sorting. And numbers (to the unguarded) mean "science." Voila! The real work, though, would have been done principally in the first step and then covered over.
Unfortunately, it's not as satisfactory as what I described above imo.
We’ll start with building the model for expected interceptions. Our independent variables will be air yards, pass location, qb hits, number of pass defenders and season.
Start by creating an incompletions dataframe, which filters out all plays that do not result in incomplete passes or interceptions. Additionally, create the pass broken up (pbu) variable based on the number of pass defenders listed. The assumption being if more defenders are listed as defending a given pass, the more congested the throwing lane was.
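For what it's worth, here is roughly how I read that step, as a minimal Python sketch. The field names (incomplete_pass, interception, pass_defense_1_player_id, pass_defense_2_player_id) are my assumption about how the play-by-play data is laid out, the helper build_incompletions is hypothetical, and the rows are made up for illustration, not real plays. I used plain lists of dicts instead of an actual dataframe to keep it self-contained.

```python
# Hypothetical play-by-play rows; field names are assumed, values are invented.
plays = [
    {"play_id": 1, "incomplete_pass": 0, "interception": 0,
     "pass_defense_1_player_id": None, "pass_defense_2_player_id": None},  # a completion
    {"play_id": 2, "incomplete_pass": 1, "interception": 0,
     "pass_defense_1_player_id": "D1", "pass_defense_2_player_id": None},
    {"play_id": 3, "incomplete_pass": 0, "interception": 1,
     "pass_defense_1_player_id": "D2", "pass_defense_2_player_id": "D3"},
    {"play_id": 4, "incomplete_pass": 1, "interception": 0,
     "pass_defense_1_player_id": None, "pass_defense_2_player_id": None},
]

# Assumed columns that name the defenders credited on a pass.
DEFENDER_FIELDS = ("pass_defense_1_player_id", "pass_defense_2_player_id")


def build_incompletions(rows):
    """Keep only incompletions and interceptions, and add a pbu count."""
    out = []
    for row in rows:
        if not (row["incomplete_pass"] == 1 or row["interception"] == 1):
            continue  # completions (and anything else) are dropped here
        new_row = dict(row)
        # pbu = how many defenders the stat sheet lists on the pass (0, 1, or 2)
        new_row["pbu"] = sum(row[f] is not None for f in DEFENDER_FIELDS)
        out.append(new_row)
    return out


incompletions = build_incompletions(plays)
for row in incompletions:
    print(row["play_id"], row["pbu"])
```

Under this reading, the completed pass (play 1) is filtered out before any model ever sees it, and pbu is nothing more than a count of names on the stat sheet.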
I don't understand how a completed pass in principle could not be "turnover-worthy." For example: an amazing catch of a terribly overthrown ball (with a DB behind the WR), a ball taken away from a better-positioned defender, a defender in position to make the catch but who falls, etc.
My guess is that it's because those types of things require judgement and can't rely completely on numbers already provided in available stats. Then I noticed how peculiar the instructions were in other regards. Again, as it appears to me, that comes from choosing to favor easily available numbers over judgement.
Additionally, create the pass broken up (pbu) variable based on the number of pass defenders listed [in the game stats]. The assumption being if more defenders are listed as defending a given pass, the more congested the throwing lane was.
I have seen superb balls expertly thrown into traffic. I have also seen balls deflected by a receiver into defenders. I have also seen defenders in a crowd tip terrible balls to a receiver. Etc.
They are going for speed in processing. Their name is NFL faster ("nflfastr"). They seem to be targeting the fantasy football world. Maybe you can even gamble on their numbers; who knows?
It is pretty interesting, though. I'll grant that. And the difference between a bottom-rung Levis score and the leaders, or even the middle of the pack, seems much too great to be the product of the kinds of things I pointed to. It's a really bad comp for him.
And there is a mountain of information on the site.
OTOH, they do train the model by using it to project next season's numbers and comparing the data. And prolly other things I didn't bother to pore over. Of course, there are variables pertinent to a QB's evaluation that change from year to year and even game to game. I have the impression (although I do not know) that the separate outfit that produces PFF numbers actually scores film. That doesn't mean it is a perfect system.
Anyway, I'll let you take it from here. I'll only conclude with my prediction that Devo is about to see a ghost made of loads of ectoplasm.
Building expected interceptions and expected fumbles models to find QBs likely to increase or decrease their interceptions and/or turnovers per dropback from 2019 to 2020.
opensourcefootball.com