Thanks to a decades-
A pair of unfortunate side effects of this ensciencification and the ever-
Two standard measures of accuracy in information retrieval and computational linguistics circles are precision and recall.
For the technically minded, precision is the percentage of correct results among the results returned for a given task. Precision is usually tractable to calculate in that it is typically possible to review the results returned and judge their adequacy. In some cases it may also be possible to achieve 100% precision by returning only one guaranteed-correct result.
In contrast, recall is the percentage of correct results returned out of all possible correct results for a given task. Recall can be difficult to calculate in that it may not be possible to survey the entire universe of possible results to determine which would have been adequate. In some cases it may also be possible to achieve 100% recall by returning all possible results, thus guaranteeing that all correct results are included as well.
Colloquially, recall measures “how much of the good stuff you got.” Precision measures “how much of what you got is good stuff.”3
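For readers who prefer code to prose, the two classical measures above can be sketched as follows (a minimal illustration; the function names and result sets are our own invention):

```python
def precision(returned, correct):
    """Fraction of returned results that are correct:
    'how much of what you got is good stuff.'"""
    if not returned:
        return 0.0
    return len(set(returned) & set(correct)) / len(returned)

def recall(returned, correct):
    """Fraction of all correct results that were returned:
    'how much of the good stuff you got.'"""
    if not correct:
        return 0.0
    return len(set(returned) & set(correct)) / len(set(correct))

returned = ["doc1", "doc2", "doc3", "doc4"]
correct = ["doc1", "doc2", "doc5"]
print(precision(returned, correct))  # 2 of the 4 returned are correct -> 0.5
print(recall(returned, correct))     # 2 of the 3 correct were returned
```

Note that the 100%-recall trick above falls out directly: pass the entire universe of results as `returned` and recall is guaranteed to be 1.0, whatever that does to precision.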
Balancing precision and recall is often difficult.
So, rather than trying to dumb things down and arrive at such a single “accuracy” number, we propose instead to dumb things up.
These two new measures are called recision and precall.
Recision is a measure of the amount of data that must be ignored (or surreptitiously dumped in the river with a new pair of cement shoes) in order to get publishable results. If 10% of your data must be “lost” in order to get good results that support your pre-determined conclusions, your recision score is 10%.
Precall is a measure of your team’s ability to quickly and correctly predict how well your algorithm or system will perform on a new data set that you can briefly review. Correctly predicting “This will give good results.” or “This is gonna suck!” 90% of the time translates directly into a precall score of 90%. Good precall (especially during live demos) can save a project when results are poorer than they should be. The ability to look at some data and accurately predict and, more importantly, explain why such data will give poor results shows a deep understanding of the problem space.5 Even when performance is decent, though, prefacing each data run with “We have no idea how this will turn out!” makes your team look lucky, at best, or, at worst, foolish.
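Under the definitions above, the two proposed metrics admit an equally rigorous implementation (a tongue-in-cheek sketch; the function names and example figures are our own):

```python
def recision(total_records, records_lost):
    """Fraction of the data that had to be 'lost' to obtain
    publishable results. Lower is better, at least for the reviewers."""
    return records_lost / total_records

def precall(predictions, outcomes):
    """Fraction of data runs whose quality the team correctly
    predicted ('good' or 'sucks') before running them."""
    hits = sum(p == o for p, o in zip(predictions, outcomes))
    return hits / len(predictions)

print(recision(1000, 100))  # 100 records swam with the fishes -> 0.1
print(precall(["good", "sucks", "good"],
              ["good", "sucks", "sucks"]))  # 2 of 3 predictions correct
```

A team that announces “This is gonna suck!” before every run and is right 90% of the time thus earns a precall of 0.9, regardless of how the runs themselves turn out.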
Spend time improving the quality and complexity of your algorithms to decrease your recision score. Spend money improving the quality and complexity of your team to increase your precall score. Once you have mastered these important conceptual metrics, success, fame, and/or tenure await you!
1 Here we use “need” in the sense of “that which is required to acquire funding and/or tenure.”
2 An affront to those who remember when our gentle form of madness was known as “philology”.
3 Interestingly, this is one of the few times when linguists, with more experience resolving subtle scoping differences, usually have a leg up on their computer science colleagues in properly internalizing technical vocabulary. If you are still lost, please do not submit your résumé to the Center for Computational Bioinformatics and Linguistics. Thanks.
4 But still good enough for government work!
5 A tip for our neophyte readers: mention your “deep understanding of the problem space” (or its equivalent, if you actually understand what that means) in every conversation you have with anyone who has influence over your funding.