Our research seeks to empower individuals and organizations to
control how their data is used. We use techniques from cryptography,
programming languages, machine learning, operating systems, and other
areas to both understand and improve the security of computing as
practiced today, and as envisioned in the future.
Everyone is welcome at our research group meetings
(most Fridays at 11am, but join the slack group for announcements). To
get announcements, join our Slack Group (any
@virginia.edu email address can join themsleves, or email me
to request an invitation).
Security Research Group Lunch (12 December 2017)
Haina Li, Felix Park,
Mainuddin Jonas,
Anant Kharkar,
Faysal Hossain Shezan,
Fnu Suya,
David Evans,
Yuan Tian,
Riley Spahn,
Weilin Xu,
Guy "Jack" Verrier
Projects
Recent Posts
Here's a video of Xiao Zhang's presentation at NeurIPS 2019:
https://slideslive.com/38921718/track-2-session-1 (starting at 26:50)
See this post for info on the paper.
Here are a few pictures from NeurIPS 2019 (by Sicheng Zhu and Mohammad Mahmoody):



Finding Black-box Adversarial Examples with Limited Queries
Black-box attacks generate adversarial examples (AEs) against deep
neural networks with only API access to the victim model.
Existing black-box attacks can be grouped into two main categories:
Transfer Attacks use white-box attacks on local models to find
candidate adversarial examples that transfer to the target model.
Optimization Attacks use queries to the target model and apply
optimization techniques to search for adversarial examples.
Hybrid Attack
We propose a hybrid attack that combines transfer and optimization attacks:
Transfer Attack → Optimization Attack — take candidate adversarial examples of the local models of transfer attacks as the starting points for optimization attacks.
Optimization Attack → Transfer Attack — intermediate query results from the optimization attacks are used to fine-tune the local models of transfer attacks.
We validate effectiveness of the hybrid attack over the baseline on three benchmark datasets: MNIST, CIFAR10, ImageNet. In this post, we only show the results of AutoZOOM as the selected optimization method. More results of other attacks can be found in the paper.
Local Adversarial Examples are Useful (Transfer → Optimization)
Below, we compare the performance of AutoZOOM attack when it starts
from 1) the local adversarial examples, and 2) the original
points. Here, we report results for targeted attacks on normal (i.e.,
non-robust) models:
Local AEs can substantially boost the performance of optimization
attacks, but when the same attack is used against robust
models, the improvement is small:
This ineffectiveness appears to stem from differences in the attack
space of normal and robust models. Therefore, to improve effectiveness
against robust target model, we use robust local models to produce the
transfer candidates for starting the optimization attacks. The figure
below compares impact of normal and robust local models when attacking
the robust target model:
Tuning with Byproduces Doesn’t Help Much (Optimization → Transfer)
Below, we compare the performance of AutoZOOM attack on MNIST normal
model when the local models are 1) fine-tuned during the attack
process, and 2) kept static:
Tuining local models using byproducts from the optimization attack
improves the query efficiency. However, for more complex datasets
(e.g., CIFAR10), we observe degradation in the attack performance by
fine-tuning (check Table 6 in the paper).
Batch Attacks
We consider a batch attack scenario: adversaries have limited
number of queries and want to maximize the number of adversarial
examples found within the limit. This is a more realistic way to
evaluate attacks for most adversarial purposes, then just looking at
the average cost to attack each seed in a large pool of seeds.
The number of queries required for attacking a specific seed varies
greatly across seeds:
Based on this observation, we propose two-phase strategy to prioritize easy seeds for the hybrid attack:
In the first phase, the likely-to-transfer seeds are prioritized
based on their PGD-steps taken to attack the local models. The
candidate adversarial example for seed seed is attempted in order to
find all the direct transfers.
In the second phase, the remaining seeds are prioritized based on
their target loss value with respect to the target model.
To validate effectievness of the two-phase strategy, we compare to two seed prioritization strategies:
Retroactive Optimal: a non-realizable attack that assumes adversaries already know the exact number of queries to attack each seed (before the attack starts) and can prioritize seeds by their actual query cost. This provides an lower bound on the query cost for an optimal strategy.
Random: this is a baseline strategy where seeds are prioritized in random order (this is the stragety assumed in most works where the adverage costs are reported).
Results for the AutoZOOM attack on a normal ImageNet model are shown below:
Our two-phase strategy performs closely to the retroactive optimal
strategy and outpeforms random baseline significantly: with same
number of query limit, two-phase strategy finds significantly more
adversarial examples comapred to the random baseline, and is closer to
the retroactive optimal case. (See the paper for more experimental
results and variations on the prioritization strategy.)
Main Takeaways
Transfer → Optimization: local adversarial examples can generally be used to boost optimization attacks. One caveat is, against robust target model, hybrid attack is more effective with robust local models.
Transfer → Optimization: fine-tuning local models is only helpful for small scale dataset (e.g., MNIST) and fails to generalize to more complex datasets. It is an open question whether we can make the fine-tuning process work for complex datasets.
Prioritizing seeds based on two-phase strategy for the hybrid attack can significantly improve its query efficiency in batch attack scenario.
Our results make the case that it is important to evaluate both
attacks and defenses with a more realistic adversary model than just
looking at the average cost to attack a seed over a large pool of
seeds. When an adversary only need to find a small number of
adversarial examples, and has access to a large pool of potential
seeds to attack (of equal value to the adversary), then the effective
costs of a successful attack can be orders of magnitude lower than
what would be projected assuming an adversary who cannot prioritize
seeds to attack.
Paper
Fnu Suya, Jianfeng Chi, David Evans and Yuan Tian. Hybrid Batch Attacks: Finding Black-box
Adversarial Examples with Limited Queries. In USENIX Security 2020. Boston, August 2020. [PDF] [arXiv]
Code
https://github.com/suyeecav/Hybrid-Attack
In this repository, we provide the source code to reproduce the results in the paper. In addition, we believe our hybrid attack framework can (potentially) help boost the performance of new optimization attacks. Therefore, in the repository, we also provide tutorials to incorporate new optimization attacks into the hybrid attack framework.
Xiao Zhang will
present our work (with Saeed Mahloujifar and
Mohamood
Mahmoody) as a spotlight at NeurIPS
2019,
Vancouver, 10 December 2019.
Recent theoretical results, starting with Gilmer et al.’s
Adversarial Spheres (2018), show
that if inputs are drawn from a concentrated metric probability space,
then adversarial examples with small perturbation are inevitable.c The
key insight from this line of research is that concentration of
measure
gives lower bound on adversarial risk for a large collection of
classifiers (e.g. imperfect classifiers with risk at least $\alpha$),
which further implies the impossibility results for robust learning
against adversarial examples.
However, it is not clear whether these theoretical results apply to
actual distributions such as images. This work presents a method for
empirically measuring and bounding the concentration of a concrete
dataset which is proven to converge to the actual concentration. More
specifically, we prove that by simultaneously increasing the sample
size and a complexity parameter of the selected collection of subsets
$\mathcal{G}$, the concentration of the empirical measure based on
samples converges to the actual concentration asymptotically.
To solve the empirical concentration problem, we propose heuristic
algorithms to find error regions with small expansion under both
$\ell_\infty$ and $\ell_2$ metrics.
For instance, our algorithm for $\ell_\infty$ starts by sorting the
dataset based on the empirical density estimated using k-nearest
neighbor, and then obtains $T$ rectangular data clusters by performing
k-means clustering on the top-$q$ densest images. After expanding each
of the rectangles by $\epsilon$, the error region $\mathcal{E}$ is
then specified as the complement of the expanded rectangles (the
reddish region in the following figure). Finally, we search for the
best error region by tuning the number of rectangles $T$ and the
initial coverage percentile $q$.

Based on the proposed algorithm, we empirically measure the
concentration for image benchmarks, such as MNIST and
CIFAR-10. Compared with state-of-the-art robustly trained models, our
estimated bound shows that, for most settings, there exists a large
gap between the robust error achieved by the best current models and
the theoretical limits implied by concentration.

This suggests the concentration of measure is not the only reason
behind the vulnerability of existing classifiers to adversarial
perturbations. Thus, either there is room for improving the robustness
of image classifiers or a need for deeper understanding of the reasons
for the gap between intrinsic robustness and the actual robustness
achieved by robust models.
Paper
Saeed Mahloujifar★, Xiao Zhang★, Mohamood Mahmoody and David Evans. Empirically Measuring Concentration: Fundamental Limits on Intrinsic Robustness. In NeurIPS 2019 (spotlight presentation). Vancouver, December 2019. [PDF] [arXiv]
Code
https://github.com/xiaozhanguva/Measure-Concentration
I was honored to particilate in a panel at an event on
Adult Education in the Age of Artificial Intelligence that was run by The Great Courses as a fundraiser for the Academy of Hope, an adult public charter school in Washington, D.C.
I spoke first, following a few introductory talks, and was followed by
Nicole Smith and Ellen Scully-Russ, and a keynote from Dexter Manley,
Super Bowl winner with the Washington Redskins. After a short break,
Kavitha Cardoza moderated a very interesting panel discussion. A
recording of the talk and rest of the event is supposed to be
available to Great Courses Plus subscribers. I’ve included a fairly
complete text from the script I wrote (with some modifications and
additional comments).
As a tenured professor at a well-endowed university, I have the good
fortune to be about as sheltered as anyone can be from disruptions in
employment. But, I do have two young children, so I have a strong
personal interest in there being good jobs available for them in this
time period.
My four-year old son drew the picture, and I hate to disappoint him
that the train driver position he dreams of probably won’t work out,
and I don’t think there is anything we can do to save it. But, I do
hope there will be something fulfilling for him to do when he grows
up.
For the actual
talk, I was able to use beautiful images from Getty Images which The
Great Courses has a license to (and not allowed to use Create Commons
images, since they do not count as non-commercial), so I've replaced
the images from the talk with CC-licensed images. Unfortunately, I wasn't able to find a cc image of a match making
factory, like the one in
this Getty image that I used in the talk. The replacement image is a
spinning room from Fall River, MA, 1912.
I want to start by talking about history. Machines taking jobs away from humans is not a new
thing - goes back hundreds of years.
The picture shows a match-making factory in the 1870s. No one has a job as this kind of match-maker today – and lots of the jobs the other kind of match-maker have also been taken over by machines.
Another example is automated teller machine. ATMs automated many of
the roles previously done by human bank tellers.
ATM Machines didn’t eliminate jobs for bank tellers – actually made
more, since they make banking cheaper and more accessible, and meant
banks could open more branches which still needed to hire tellers for
the more interesting and complicated banking activities.
We’ve seen many transformations like this, and many tasks that could
once only be done by humans are now routinely automated. Although past
technological advances have caused massive disruption and pain for
many individuals, as a species on the whole, these advances have not
diminished overall human employment, and there shouldn’t be any doubt
that on the whole, technological and scientific program have made all
of our lives tremendously better.
One way to measure that progress is what fraction of our workforce is needed to feed us.
As recently as 150 years ago, nearly everyone worked in agriculture, and we still couldn’t produce enough food to feed everyone. Now, only about 1 out of 75 people work in farming, and we are so ridiculously productive in producing food with we can burn about 1/3 of it as fuel.
So, if we were driven as a species to avoid work once our subsistence needs are met, we shouldn’t be working 8 hours or more a day.
If the ordinary wage-earner worked four hours a day, there would be enough for everybody and no unemployment... This idea shocks the well-to-do, because they are convinced that the poor would not know how to use so much leisure.
Bertrand Russell,
In Praise of Idleness (1932)
We should be working about
8 minutes a day, and still living better than people did 150 years ago.
Somehow we’ve managed improve productivity by a factor of several hundred, without reducing the demand for human work. All the previous scientific and technological progress that has automated work has just led to finding new productive things for humans to do.
Will Artificial Intelligence Change Everything?
The question for today is if we are reaching the end of that – are the
kinds of automation that can be achieved (or soon will be) with
artificial intelligence that they won’t just improve productivity by
automating some human tasks, but that they will eliminate the
opportunity for humans to contribute productively at all.
First, its important to distinguish AI from the traditional automation
that has driven productivity gains for the past hundred years.
We’ve only experienced the disruption computing causes over the past
half century, but a few people had the vision of what was possible
much longer ago.
The first person to see the potential for universal
computers, was Gottfried Leibniz back in the 17th century.
The human race will have a new kind of instrument which will increase the power of the mind much more than optical lenses strengthen the eyes and which will be as far superior to microscopes or telescopes as reason is superior to sight.
Gottfried Wilhelm Leibniz (1679)
Leibniz is talking about increasing the power of our minds the way a telescope
increases the power of our eyes. This is what traditional
computing has done.
More recently, Steve Jobs talked about computers as bicycles for the mind.
We have machines now that can do any computation we understand well enough to describe in simple steps quadrillions of times faster than humans can do it. We can also build computing systems, like the one most of us have in our pockets, that don’t just do one one human understands faster, but that can take the collective efforts of millions of humans over decades to make it possible for us to play “Angry Birds”.
This is an amazing accomplishment for our species, but AI goes beyond it.
All of the previous capabilities can only solve problems humans already understand well enough to solve.
AI allows us to solve problems no humans understand.
I’m going to focus now on Machine Learning, which is just one sub-field of AI, but it’s the one that has the most hype over the past decade, and one that is being rapidly deployed in ways that automate human tasks.
The main distinction between traditional computing and machine learning is that traditional automatons are programmed, while machine learning automatons are trained. Humans set up a training process, but the machine learns how to solve the problem on its own.
I’ll illustrate with a simple example – building a machine to distinguish women and men in pictures.
This is something most humans are quite good at, but if I asked you how you do it, you couldn’t explain it. And, since humans can’t explain how to do it, we don’t know how to write a program to do it, but we can train a model that does it well.
First, we need to collect training images.
Training complex models requires a lot of training data – a few
million images. Fortunately, you can find billions of images on the
Internet (at least if you don't care about copyright; for the talk, I was allowed to
use these images, but they blurred them out of the recording).
Next, we need to label the images. This is what is called
“supervised learning” – we are telling the machine the correct answers, and it is training to find a model that matches those answers.
Labeling 10M images might seem like a lot of work, but you can find low-paid humans to do it.
Now, we’re ready to train a model. But first, we need to decide on a model architecture.
This just means designing a function with lots of unknown parameters that takes all the pixels of an images as its inputs, and outputs a prediction of gender and confidence.
We train the model by starting with random values for all those parameters, testing the labeled images, and updating the weights until we get a model that makes good predictions.
There are some tricks for how to update the weights in the right direction, but if all goes well, we’ll find a model that has high accuracy on our tests before we run out of money.
Then we deploy the model and everything is great...
...except we see results like these.
We have people that look like men, but are identified as women. What they have in common is they are carrying umbrellas.
These examples were provided by
Tianlu Wang, and the
results showing how ML models learn biases (and a method to mitigate this) are in
Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints,
Jieyu Zhao, Tianlu Wang, Mark Yatskar, Vicente Ordonez, Kai-Wei Chang. (In
Empirical Methods in Natural Language Processing (EMNLP) 2017.)
The umbrella-revealing-gender example is from Balanced Datasets Are Not Enough: Estimating and Mitigating Gender Bias in Deep Image Representations by Tianlu Wang, Jieyu Zhao, Mark Yatskar, Kai-Wei Chang, Vicente Ordonez. (In International Conference on Computer Vision (ICCV), 2019.)
The problem is the learning process is just about learning statistical patterns in data – the model is not developing any real understanding of men and women.
Since most of the people in the labeled data who were holding umbrellas were women, the model learned a strong pattern that if you are holding an umbrella you must be a woman.
Being mis-gendered because of holding an umbrella may not be such a serious problem, but the same kinds of learning methods are used in many more sensitive tasks, like predicting who is a terrorist and who gets a job interview.
Amazon Created a Hiring Tool Using A.I. It Immediately Started Discriminating Against Women. (Slate, Oct 2018)
If you are traveling in Hong Kong, being face-recognized with an umbrella nearby may lead to other problems.
Predicting the Future
The wisest quote I know of about making predictions is from English
footballer Paul Gascoigne, who said "I never predict anything, and I never will."
The hard thing about making predictions, is that our brains and the
experiences we relate to are linear – we get one year older every
year, and can relate to linear change easily.
But, nearly everything we care about is actually changes exponentially.
People tend to talk about exponential change as though it is rare
and special, but it is actually the way nearly everything that matters
changes.
All it means to be exponential, is that the change is a
percentage of the current value, like increasing by 5% a year or
doubling every decade.
Here’s what doubling every year looks like for 10 years (compared to the blue line which is adding 100 every year).
Looks about the same, until we go out another year...
After 20 years, the linear line is so overwhelmed by the exponential
growth, that it is indistinguishable from the flat axis. In another
10 years, the doubling rate has produced another factor of 1000.
Almost everything we care about looks like this – here we see GDP per capita for a handful of countries.
The exponential growth is so powerful, that World War II looks like a little glitch.
We see similar exponential curves for almost anything we look at – here it is books published per person, and crop yield per acre.
Predicting exponential growth is the easiest and safest prediction
to make. The only way it goes wrong is if there is some physical limit
that stops the growth – Malthus thought we were at the limit of how
much an acre of land could produce in 1800 – but you can see from the
graph on the right that we weren’t, and this is one of the reasons he
got things so wrong.
The main challenge isn’t predicting the exponential growth, it is guess how much more growth is needed for things to work – that is, what are the labels on the vertical axis.
One property of exponential curves, is if you zoom in on any part of them, they look basically the same.
When autonomous vehicles started to work in 2005, if you thought we
were on a curve like this and only needed to get about 100x better,
you were at risk of making really bad predictions.
If you saw the rate of improvement in autonomous vehicles from 2004 to
2016 and extrapolated continued exponential growth, maybe it wasn't so
unreasonable to predict in 2016 that we would have coast-to-coast full
autonomy by 2017.
Of course, no we know it was a lot further off.
Even though its easy to predict exponential change, it is hard to
predict looking forward how much more change is needed to get to the
point where something actually works well enough.
Predicting dropping costs is easier. The red curve is also exponential, but here, it is decreasing by 7% a year instead of increasing.
For example, let’s look at the cost of communicating.
If you wanted to send a message from Washington to California in 1800, you couldn’t even imagine doing it.
By 1803, one person could – Thomas Jefferson envisioned the Lewis and Clark expedition.
It cost \$50,000 in 1803 dollars, which is hard to convert, but let’s guess about \$100M.
Congressional appropriation methods, however, haven't changed much - the initial request, shown in the letter, was for $2,500.
Let’s zoom in on the next 100 years.
By 1903, we had telephones. If you were really rich and important you could have one in your office.
A 3-minute call from NY to Chicago cost \$5.45 in 1903 dollars.
I couldn’t find out if you could call from Washington to California, and adjusting for inflation is tough, but somewhere around a few $1000 seems reasonable.
Why was it so expensive? Here’s the phone operators that were needed
to make it happen. (Those jobs don’t exist today either.)
By 2003, we had early smartphones.
Today, no one thinks twice about the cost of sending messages to
California.
Sending messages has gotten so ridiculously cheap, spam is profitable.
One way to think about the jobs of the future is to predict when the
cost of automating the job will drop below the cost of hiring a human
to do it today.
Nearly all individual jobs could be automated today, if you invested
enough in a special-purpose solution and in training the machine to do
that specific job.
Jobs that Won’t be Replaced by AI
It is hard to predict when particular jobs will be automated out of
existence, but it seems eventually most jobs people have today will
be.
I think there are three types of jobs that won’t be – the first two
will be a bit depressing, but don’t give up hope for a hopeful
conclusion until I get to the third one.
An example of a profession that is well organized to resist job losses
due to technology is
medicine. When
Johnson & Johnson marketed a machine for performing anesthesia,
the American Society of Anesthesiologists
objected.
The
medical professional organizations are so powerful, that
nearly all medical technology is designed and marketed to assist
doctors, not replace them.
Amazon Fulfillment Center, Richmond, VA
(picture by me)
It is well worth taking a tour of an Amazon Fulfillment Center. Other
than conveyer belts, scanners, and fork lifts, there are not many many
machines operating the warehouse — nearly all the work is done
by humans, being told by software what to do.
Its cheaper to get
humans to do this work since it is physically complex, varied, and
high skill, but the skills required to do it are readily available and
not highly valued by the marketplace, so humans with such skills can
be hired at low cost and easily replaced when necessary. It'll be a
long time before it is cheaper to do these jobs by machine.
We don’t watch sports to see the fastest, strongest, most agile
objects in the universe - we want to be inspired by the individual
achievements of other humans, and the synergy of a team of humans
working together.
Unfortunately for me, and most of us in this room (unlike Dexter Manley,
pictured, who was our keynote speaker), we don’t have the
athletic talents to succeed as professional athletes.
But this intrinsic value in work being done by humans applies to many other endeavors.
Machines can already produce music that musicologists can’t
distinguish from humans — but no one wants to listen to it.
We want to listen to music to connect with the humans (or at least
animals!) who wrote and performed it, and it matters who they are.
Burger flippers can be replaced by machines once the costs get low
enough, but not high-end chefs — food tastes more savory when we
can imagine it being cooked by a perfectionist chef lording over a
kitchen.
White House Photo (Public Domain)
Education works best when the teacher is a human who cares about what she is teaching, and best of all when the teacher also cares about the students as people.
## Hopeful Conclusion
In the future, we should all have jobs like these!
Everyone should have a job that values their intrinsic humanity, and
technology will soon advance to the point where we all can. In many
ways (recall the diminishing percentage of our workforce employed in
agriculture from earlier) we already have. We just have
to face the challenge of restructuring society to make this work.