Cynthia Rudin wins the 2021 AAAI Squirrel AI Award (duke.edu)
157 points by mcenedella on Oct 12, 2021 | 45 comments


I was also one of Cynthia's Ph.D. advisors when she was a graduate student at Princeton, some twenty years ago. It was obvious to me then that she would go on to do great things, so it's delightful to read this news this morning.

My fondest memory of Cynthia, however, has nothing to do with science, and everything to do with just being a kind person. We were at the NEC Research Institute's company picnic, where they had an inflatable dragon for the kids to jump around in. Cynthia, my wife, and I went inside without any kids and jumped around like idiots for a while. Cynthia and my wife got bored, so I stayed behind for One More Big Bounce. With the epic bounce, I also succeeded in cracking a vertebra, nearly passing out on the spot from the pain. Eventually, I crawled out, an ambulance was called, and I was brought to the Princeton ER.

I would have a full recovery, but I was in the ER for several hours that night. Cynthia came with us to the ER, and when she saw how uncomfortable I was on the gurney, she went back to her dorm to retrieve her favorite blanket, so that I would have even a small comfort. I am not sure how long she stayed, but I know that she was there with me longer than anyone else except my wife.

Anyhow, she's a lovely human being and I am honored and proud to have known her and witnessed the origins of her career.


In my senior year at Duke, I took her ML class (her first semester at Duke). She was an excellent professor, one of the absolute best I had while there. She focused heavily on both implementation and theory, which I found to be rare.

Her class became so popular within the add/drop period that Duke added a second section and also doubled the attendance for each section. I'm pretty sure she went from being supposed to teach about 70 students to teaching 300. Nevertheless, her teaching was top notch, and I learned more there than pretty much any other CS class, and still rely on this knowledge today!

I too am really glad she won this award.


Thanks for sharing. It’s frequently the little acts of everyday kindness that go a long way in this world, like the blanket example you cite.


Congratulations! I think Interpretable AI mentioned in the article is what is commonly called explainable AI.


Cynthia has a great paper where she distinguishes between "interpretable" and "explainable"

https://arxiv.org/pdf/1811.10154.pdf


"Explainable" just means you didn't build your app/service using any technology that isn't interpretable. Her argument is that the strategy of reverse engineering a method to convert it from inexplicable to explicable is inherently less effective than maintaining explicability at all times in the app's genesis -- from the design phase through implementation.

But Rudin's Premise is more philosophical than practical. If the problem at hand is better solved using a black box (in terms of accuracy, precision, robustness, etc.), her premise says, simply, don't do it. Unfortunately, in the cutthroat world of capitalism, that strategy can't compete with the cutting edge.

Where Rudin's Premise is more suitable is in writing regulations to address AI applications where social unfairness goes unchecked (like the COMPAS app that advises legal authorities on meting out parole decisions without explaining its reasoning). There are many such (ab)uses of AI today in social services and policing which merit rethinking, since AI-based injustice so often hides behind the proprietary opacity of such apps.

Another excellent discussion of problems like these is Cathy O'Neil's book "Weapons of Math Destruction". Too bad she couldn't share the Squirrel prize. https://www.amazon.com/Weapons-Math-Destruction-Increases-In...


From the paper, linked again [0] below, I think this represents Rudin's philosophical view, and why it could be practical:

  Here is the Rashomon set argument: Consider that the data permit a large set of reasonably accurate predictive models to exist. Because this set of accurate models is large, it often contains at least one model that is interpretable. This model is thus both interpretable and accurate. 

  Unpacking this argument slightly, for a given data set, we define the Rashomon set as the set of reasonably accurate predictive models (say within a given accuracy from the best model accuracy of boosted decision trees). Because the data are finite, the data could admit many close-to-optimal models that predict differently from each other: a large Rashomon set. I suspect this happens often in practice because sometimes many different machine learning algorithms perform similarly on the same dataset, despite having different functional forms (e.g., random forests, neural networks, support vector machines). As long as the Rashomon set contains a large enough set of models with diverse predictions, it probably contains functions that can be approximated well by simpler functions, and so the Rashomon set can also contain these simpler functions. Said another way, uncertainty arising from the data leads to a Rashomon set, a larger Rashomon set probably contains interpretable models, thus interpretable accurate models often exist.


  If this theory holds, we should expect to see interpretable models exist across domains. These interpretable models may be hard to find through optimization, but at least there is a reason we might expect that such models exist.

[0] https://arxiv.org/pdf/1811.10154.pdf
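
As a toy illustration of the argument (my own sketch, nothing from the paper -- the dataset, model choices, and epsilon are all arbitrary): fit a few model families on the same data, then check whether a simple, interpretable model lands within epsilon of the best score, i.e. inside the Rashomon set.

  # Hypothetical sketch of the Rashomon set argument (Python/scikit-learn).
  from sklearn.datasets import load_breast_cancer
  from sklearn.ensemble import RandomForestClassifier
  from sklearn.linear_model import LogisticRegression
  from sklearn.model_selection import train_test_split
  from sklearn.svm import SVC
  from sklearn.tree import DecisionTreeClassifier

  X, y = load_breast_cancer(return_X_y=True)
  X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

  models = {
      "random forest": RandomForestClassifier(random_state=0),
      "SVM": SVC(),
      "sparse logistic": LogisticRegression(penalty="l1", C=0.1,
                                            solver="liblinear"),
      "depth-3 tree": DecisionTreeClassifier(max_depth=3, random_state=0),
  }
  scores = {name: m.fit(X_tr, y_tr).score(X_te, y_te)
            for name, m in models.items()}

  eps = 0.02  # accuracy band defining the Rashomon set
  best = max(scores.values())
  print({n: round(s, 3) for n, s in scores.items()})
  print("in the Rashomon set:",
        [n for n, s in scores.items() if s >= best - eps])

If the depth-3 tree or the sparse logistic model shows up in that last list, the data admit an interpretable model that is about as accurate as the black boxes.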


>If the problem at hand is better solved using a black box (in terms of accuracy, precision, robustness, etc)

It's been a while since I read her work, but IIRC one of the positions she argues for, which I find plausible, is that interpretable models can be performance competitive. For example, it could be that the only reason black box methods outperform is because they've been more heavily researched, and if we were to put more research into interpretable methods, we could achieve parity. I also mentioned a few reasons why we might expect interpretable models to perform better a priori in this comment https://news.ycombinator.com/item?id=28838321


I'd find Rudin's argument a lot more convincing if she offered an existence proof, like using the same number of examples to train an equally discriminative SVM or random forest (or hybrid) that can equal the performance of AlexNet in 2012 on the ImageNet ILSVRC (or in another domain where DNNs are SOTA).

Until that can be done, I think few outside academia will invest time or money in alternative non-DNN methods in the hope of competing with today's even superior DNN variants. There's a decade of evidence now that DNNs are incontestable discriminators in numerous domains, relative to pre-2012 ML technology anyway.


>There's a decade of evidence now that DNNs are incontestable discriminators in numerous domains, relative to pre-2012 ML technology anyway.

Do we know that this is due to inherent superiority of DNNs, vs just experiencing a virtuous cycle of success leading to increasing investment leading to more success?


Oh thanks for the pointers


Explanatory quote from the linked paper (and thanks for the link, qPM9l3XJrF!):

"Rather than trying to create models that are inherently interpretable, there has been a recent explosion of work on “Explainable ML,” where a second (posthoc) model is created to explain the first black box model. This is problematic. Explanations are often not reliable, and can be misleading, as we discuss below. If we instead use models that are inherently interpretable, they provide their own explanations, which are faithful to what the model actually computes"


Hearty congratulations. I am very happy for her.

I am more familiar with her older work on ranking and boosting. I do not have any technical commentary to add, just a personal anecdote: she is one of the nicest, warmest people I have met. I wish her well with utmost sincerity.


Is this a respected prize? I never heard of it before.

That said, no doubt that explainable ML/AI is important.


> Is this a respected prize?

It's a very new prize (this is only the second winner), so it's still too early to tell. But it is backed by a reasonably respectable organization.


> is backed by a reasonably respectable organization.

Respectability of the prize will arise mostly from the people who receive it, not from the organization itself. Would you like to receive the same prize that all these other geniuses got?

The Nobel is respectable because so many great scientists got it. The composition of the Nobel committee is irrelevant, as long as they keep giving the prize to the best.


Is $1 Million respectable?


Can money buy respectability?

If I awarded $1 million to a random person every year, receiving that award wouldn't make that person more accomplished, and the award wouldn't be mentioned outside the local newspaper. On the other hand, an award that gives no money but consistently goes to the best researcher in the field can be very noteworthy.

What the $1 million does accomplish is make people pay attention, so everyone will much more quickly reach a verdict on whether this is a prize worth paying attention to. But two years is a bit quick for that verdict.


If you give me $1 million, I promise I'll respect you.


> Can money buy respectability?

Definitely!


Hello Golden Globes and the Hollywood Foreign Press Association!


Just tie it to a stochastic metric like random acts of kindness and make a big overblown press release about it. The Nobel Peace Prize speaks for itself on how useless metrics empower the well-connected in creating prestige.


I mean, respected or not, I would not say no to a free meal and US$1M.


Yeah, this is their second year, I assume.


Several great insights from a person who truly cares not only about the outcome of models but about what is causing the outcome. Her talks about parole guidelines being taken over by AI are great.


What’s with “Squirrel AI”? Does that name seem slightly out of place to anyone else?


"Squirrel AI is a Chinese online education company. It is the first large scale AI-powered adaptive education provider in China [...]" https://en.wikipedia.org/wiki/Squirrel_AI

Thousand Talents Obfuscation Initiative?


The award for the benefit of humanity [1] has only been a thing since 2021; Mrs. Rudin received the 2022 version.

So let's say it's been running for 2 years.

And the only other comparable scientific awards of such monetary value are Turing and Nobel?

Wow, very generous people.

1. https://www.aaai.org/Awards/squirrel-ai-award.php


Any link to her actual work?


Her most cited paper is https://arxiv.org/abs/1811.10154 and it references a lot of her other work. It provides a good representation of what she does and what she is well known for.


Of the "representative publications" on her homepage here https://ece.duke.edu/faculty/cynthia-rudin the one that seems most relevant to the topic of the article is this paper on interpretable financial lending: https://arxiv.org/pdf/2106.02605.pdf

"The machine learning model is a two-layer additive risk model, which resembles a two-layer neural network, but is decomposable into subscales. In this model, each node in the first (hidden) layer represents a meaningful subscale model, and all of the nonlinearities are transparent. Our online visualization tool allows exploration of this model, showing precisely how it came to its conclusion. We provide three types of explanations that are simpler than, but consistent with, the global model: case-based reasoning explanations that use neighboring past cases, a set of features that were the most important for the model’s prediction, and summary-explanations that provide a customized sparse explanation for any particular lending decision made by the model."

I was curious about the customized sparse explanation. It looks like there is an illustrative example from later in the paper:

"For all 700 (7.1%) people where:

• ExternalRiskEstimate ≤ 63 , and

• NetFractionRevolvingBurden ≥ 73,

the global model predicts a high risk of default."

"A rule returned by OptConsistentRule is globally-consistent, in the sense that there exists no previous case that satisfies the conditions in the rule but is predicted differently, by the global model, from what is stated in the rule. In contrast, explanations (from other methods) that are not consistent may hold for one customer but not for another, which could eventually jeopardize trust (e.g., “That other person also satisfied the rule but he wasn’t denied a loan, like I was!”)"

You can see the online visualization tool her team built here: http://dukedatasciencefico.cs.duke.edu/models/

In retrospect, it's not all that surprising to me that a model such as this is able to outperform a black box like a neural network. For example, one of the things this model does which black box models don't do is enforce "monotonicity constraints" which ensure that as risk factors increase, the estimated risk should also increase. It makes sense that this would be a useful inductive bias which improves generalization performance -- if a black box model found that an increase in risk factors decreased estimated risk, it seems likely that this would be a result of overfitting (or multicollinearity gone haywire).
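
(As an aside, monotonicity constraints aren't exclusive to inherently interpretable models; some mainstream libraries let you bolt them onto boosted trees as well. A sketch with scikit-learn's HistGradientBoostingClassifier -- the synthetic data and feature semantics are made up:)

  import numpy as np
  from sklearn.ensemble import HistGradientBoostingClassifier  # sklearn >= 1.0

  rng = np.random.default_rng(0)
  X = rng.normal(size=(500, 2))  # col 0: a "risk factor", col 1: noise
  y = (X[:, 0] + 0.1 * rng.normal(size=500) > 0).astype(int)

  # monotonic_cst: +1 forces predictions to be non-decreasing in feature 0;
  # 0 leaves feature 1 unconstrained
  clf = HistGradientBoostingClassifier(monotonic_cst=[1, 0]).fit(X, y)

  grid = np.column_stack([np.linspace(-2, 2, 5), np.zeros(5)])
  print(clf.predict_proba(grid)[:, 1])  # rises with the risk factor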

Of course another reason to expect simple/interpretable models to generalize better is Occam's Razor.

My big question about this sort of approach would be whether it's able to extend to the sort of unstructured data problems that deep learning has done really well on. It looks like some of her recent papers on Google Scholar address this: https://scholar.google.com/citations?hl=en&user=mezKJyoAAAAJ... (specifically thinking of the BacHMMachine paper and the Interpretable Mammographic Image Classification paper). Maybe someone else can summarize them.


Interesting article, but I think this sentence was unfair to other AI scholars, who also want AI to help society.

"While many scholars in the developing field of machine learning were focused on improving algorithms, Rudin instead wanted to use AI’s power to help society."


A Peter Norvig quote from yesterday's article about his transition to Stanford...

"In the past, the interesting questions were around what algorithm is best for doing this optimization. Now that we have a great set of algorithms and tools, the more pressing questions are human-centered: Exactly what do you want to optimize? Whose interests are you serving? Are you being fair to everyone? Is anyone being left out? Is the data you collected inclusive, or is it biased?"

[0] https://news.ycombinator.com/item?id=28833933


That makes the sentence quoted by GP sound like a category error to me. Developing better algorithms and developing more useful models to run on those algorithms are not an "either/or" situation.


Most AI papers I see aren't directly focused on using AI to help society. It's unclear how a small performance increase on ImageNet helps society, for instance.


I would imagine that it helps by saving computational resources, making the results cheaper to obtain.


I think this is meant as a distinction between theoretical and applied science, phrased to the lowest common denominator (which, as a consequence, causes miscommunication by not being specific enough).


Many AI researchers do focus on algorithms, no?


Plenty focus on theory.


I wish one day I'll read that a computer scientist _earns_ $1 million.

Most engineers here get like a £60k salary (£3600 a month after PAYE tax), while the companies they work for make billions off their work. Not only that, but they also don't contribute back to the local communities, because they use aggressive tax avoidance strategies. Corporations need to start sharing their profits with the workers and paying taxes, otherwise it will eventually spark another revolution.


I assume you're from the UK? Where? I ask because salaries in London are definitely higher. A senior engineer should get 80k, and a principal engineer/architect should be on 90-120k.


It is quite common for engineers with Cynthia Rudin’s ability to earn > 1 million. If you have that level of ML skill you can lead teams at many companies for over that amount.


> I wish one day I'll read that Computer Scientist _earns_ $1 Million.

They do.

Not that I personally believe they deserve it. For the exact same reasons I don't think a CEO is "worth" thousands of engineers, I don't think that just because you happened to graduate in ML you are worth tens or hundreds of times more than the others.

Or more accurately maybe they are worth that much, but the general population is severely underpaid.


IMO, the salaries are (ballpark) fine for the effort and risk involved, flexibility of working hours, stress levels, etc. I don’t see a good reason why engineers should make more than, say, teachers or nurses, just because the company they work for makes billions.

What’s wrong is that these companies make billions, most of it because they happened to get to the top of the food chain.


Reportedly, Ilya Sutskever was offered upward of $2 million/year to remain at Google as he departed for OpenAI. By many accounts, he's not alone in earning over $1 million/year as an individual contributor.

When Hinton, Krizhevsky, and Sutskever sold DNNresearch (incorporated only days before) to Google, their $44 million crossed that line too, since the company had no products or IP independent of UToronto, AFAIK. The three were effectively "hired" as individual contributors.



