Manuel Blum Santosh Vempala, Jeremiah Blocki Elan Rosenfeld, Michael Wehar, Samira Samadi, Bhiksha Raj

Yüklə 4,3 Mb.

Pdf görüntüsü

tarix	08.08.2018
ölçüsü	4,3 Mb.
	#61626

Manuel Blum

Santosh Vempala, Jeremiah Blocki

Elan Rosenfeld,

Michael Wehar, Samira Samadi, Bhiksha Raj

11/5/16

5/55

Human ComputaAon with an

ApplicaAon to Passwords

© Manuel Blum 2016

Human ComputaAon with an

ApplicaAon to Password GeneraAon

• 

This talk is about

human computaAon

. My

goal is to

apply CS ideas to what humans can

compute in their heads

.

• 

(

I hope to extend this eventually to allow wriCng. At ﬁrst only required

wriCng will be allowed: think Crossword Puzzles, Sudoku, and various

Crypto Protocols. In that case, what’s wriOen is then available for

further computaCon and output. There is no eraser and no other

paper/pencil.

)

• 

My running

Example

will be

Password

GeneraAon

. In this example, I would like all

computaCon to be done enCrely in the head. This

means no paper or pencil. Even the generated

password is to be invisible.

11/5/16

2/55

Major point: password generaAon is

best

viewed as a funcAon from

challenge

è

response

website

è

password

I require funcAons to be humanly computed

(computed in one’s head w/o paper pencil)

Password GeneraAon

11/5/16

• 

I need to show you that

it is possible for a

normal human being (me) to generate – and

regenerate – random looking passwords

quickly on the ﬂy.

• 

I do this by applying a (publishable) humanly

usable algorithm that I call a (public)

schema

which uses a randomly chosen (private)

key

parameter, to the website name.

• 

I want you to observe that I can use schema + key

to generate a random-‐looking password response

to any website quickly this way.

Password GeneraAon

11/5/16

Password GeneraAon

For this demonstraCon I need a

volunteer

– someone

whom you trust is not in cahoots with me – to run my

Python Program. This is to prove to you that I’m not

cheaCng, and to check my work (because I make mistakes).

I also need:

1.  Someone to give me a 5 or 6 leVer word typical of

a website name, like AMAZON or JASON, maybe

your own name, to use as challenge;

2.  Someone to write challenge and my response on

board behind me. You will see that I can create

passwords w/o paper, pencil or notes.

11/5/16

Do you see any paVern?

MAGIC

è 436115

MAGIC

è 292849

MAGIC

3

è 984239

MAGIC

4

è 123404

MAGIC

5

è 311575

MAGIC

6

è 609092

MAGIC

è 040782

MAGIC

è 778658

MAGIC

è 565927

11/5/16

If you know the alphabet A B C D E F … X Y Z,

then you can map each leOer of the challenge

(website) to the next leOer in the alphabet:

Example:

EBAY èFCBZ

Unfortunately

, AAAA èBBBB

Also,

unfortunately

, if all of you did this,

you would all have the same passwords.

The weakness is that this public

schema

does

not use a private

key!

An Easy-‐to-‐Use but Insecure

Algorithm for GeneraAng Passwords

11/5/16

A Public Schema for GeneraAng

Passwords must use a secret random KEY

For example, memorize a random phone number as KEY.

Example: Key =

(246) 357-‐0189

Use Key as shih:

E+2 = G, B+4 = F, A+6 = G, Y+3 = B

EBAY è E+

B+

A+

Y+

E+

B+

A+

…

è (GFG) BJI-‐AZMK (= GFGBJIAZMK)

AAAA è (CEG) DFH-‐AB I J (= CEGDFHAB I J)

Now AAAA is not a problem , and

Short challenges è 10 leVer passwords

To do this, you must memorize a phone number.

Are you willing to memorize a phone # you use frequently?

11/5/16

For example, memorize a random permutaCon on the

leOers, like "TH

E

Q

UICK

B

R

OWN FX JMPS V L

A

Z

Y

D

G."

EBAY

è

QRZD

, but

AAAA

è

ZZZZ

OUCH !!

To ﬁx the AAAA and password length problems, as already

suggested, memorize a random phone number like

(246) 357-‐0189

, then use it as a shih.

How much harder is it to memorize and use “TH

UICK

OWN FX…” than to memorize the key to a password vault?

It’s slightly harder. A social security number? Slightly harder.

How about 100 passwords?

A lot harder !!

Many diﬀerent Algorithms for

GeneraAng Passwords are possible

11/5/16

1.  Check if

Google

recognizes (typical) passwords that the

schema generates.

2.  Use a

password checker

passwordmeter.com

CauAon: Run tests 1 and 2 on faux passwords!

This completes PART ONE of this talk

To saAsfy password requirements

start every password with aA1@bB2$

AOaching aA1@bB2$ to every password gets

you nearly 100% from every password checker.

X

To test a password schema,

use not one but BOTH of the following

tests:

11/5/16

PART TWO: Human Usability

11/5/16

PART TWO: Human Usability

I will shortly present

A Model for Human ComputaAon

En route, you may rightly ask:

How will this model help me?

Will it

help me solve crossword puzzles or Sudoku beOer? Will it help me play a

beVer game of chess?

NO. The model of human computaCon won’t help you play chess beOer,

just as the model of a Turing machine won’t help you do matrix inversion

faster.

LOGIC

, the importance of the

Turing machine

is in disCnguishing what

is computable from what is not.

In

THEORETICAL COMPUTER SCIENCE

, the importance of the

Turing

machine

is in disCnguishing what is eﬃciently solvable from what is not.

In this

COG SCI COMP SCI

area, the

model of human computaAon

for disCnguishing what is humanly computable from what is not, and for

esCmaCng the human eﬀort required to do what is humanly computable.

11/5/16

What the Model of Human

Conscious ComputaAon Does

It provides a way to

esAmate human usability

of a

schema without having to test 100 people for each new

schema or for each modiﬁcaCon of a schema.*

Of course, we need to validate our model experimentally,

but we can then

count steps and prove human

usability mathemaAcally.

The model is needed to

prove the impossibility

schemas (humanly usable algorithms) for certain

problems. Example: are zero knowledge protocols

achievable with humanly usable schemas?

11/5/16

How the Theory of Human

Conscious ComputaAon Diﬀers

from Computer Science Theory

1.  A

major

diﬀerence between computers and humans

:

besides that

computers are much faster than humans

doing arithmeCc operaCons,

computers are constantly

improving in speed and memory, whereas human

replacements are the same old model.

On this account,

When counCng steps to determine the complexity of schemas,

speciﬁc constants are crucial. O(n)

is not good enough.

Step counCng must specify constants much more precisely, e.g.

use



n + O(1) for a speciﬁc c

rather than

O(n)

11/5/16

Human Conscious ComputaAon

versus Computer Science Theory

3. 

Guiding us is the following

Major Open Problem: can

humans compute 1-‐way funcAons (in their head) that a

powerful supercomputer cannot invert?

Think about it.

Do you think it possible? Is it even imaginable?

4.  Similarly, can a human generate pseudo-‐random

numbers that are indisAnguishable from truly random

numbers? More formally, can a human in a few minutes

transform 20-‐digit random nos into 40-‐digit pseudo-‐

random numbers that a supercomputer working for an

hour cannot disCnguish from truly random 40-‐digit nos?

Here and henceforth,

we pit human against

human_with_powerful_computer adversary.

11/5/16

A Model of Human

Conscious ComputaAon

For human usability, I need to present a

model of human

conscious computaAon*

This model is a Turing machine with two

disAnct memories in place of tape.

____________________

* Unconscious computaAon

: ride a bike.

* Conscious computaAon

: mulCply a 2-‐digit number

by a 1-‐digit number in your head. Recall image of the Mona Lisa.

11/5/16

The two memories of our

Turing machine are:

1.  Long term or permanent memory (LTM),

which

is hard to write (can only be wriOen to slowly) but easy to

read (can be read quickly). RetenAon in LTM requires

rehearsals on Wozniak's doubling schedule.

2.  Short term or working memory (STM)

, which is

easy to read & write, but Cny:

7±2

chunks. Storage is ﬂeeCng:

items in STM are pushed out by new deposits.

11/5/16

As computer scienAsts you recognize STM as

cache

(but STM is much more than cache)

.

Long term memory LTM

• 

Long term permanent memory (LTM) is

virtually

inﬁnite

but permanence requires rehearsals on Wozniak's doubling

schedule. LTM is used to store both Schema and Data.

• 

Virtually

inﬁnite

means that LTM can store info (like a

secret key) for life -‐ unCl the human can no longer use it (the

password) anyway.

• 

References

:

Piotr Wozniak

(not Steve Wozniak)

Super Memo

:

11/5/16

Magician

Juan Tamariz

teaches

how to memorize a random 1:1 map

from the 52 cards to {1,2,3,…,52} in 3

hours.

Short Term Memory STM

What I write in BLUE, fascinaAng as it is, is not strictly

part of the model.

DescripAon of the model is in black.

From

the literature, the contents of STM are what we are

consciously aware of.

11/5/16

•

 

STM is the stage in the theater of consciousness

(Bernard Baars).

•

The contents of STM are actors performing on

that stage.

•

 

Audience members are the many processes of

the brain. They pay aOenCon to the stage.

•

Any audience member with relevant info can

pass it back to the stage*

_______________________________________________________

*In this sense, STM is much more than cache.

• 

Chunks are pointers into memory.

• 

STM can send info to and receive info from

wherever that info is located in memory.

The items in STM are what

psychologists call chunks.

11/5/16

QuesAon: why does short-‐term memory have room for only

7±2 chunks, not more?

Are there reasons

why this number is so small… ?

Is it really beVer to pay full aVenAon to a

few important things than divided aVenAon

to many more things?

The CS model: PREP

araAon and

PROC

essing

Games and crypto-‐problems (Crossword Puzzles,

speed chess

,

Password

generaAon) have

2 parts: PREP and PROC.

1.  PREParaAon

has to do with 1. obtaining/generaCng data, and

2. wriCng schema and data into permanent LTM. E.g.

Learn/play chess

.

2.  PROCessing

has to do with applying schema (in LTM) with associated data

(in LTM) to the input and generate output.

Observe board; make move

.

x

Our measure of human eﬀort is human compute Ame.

1.  #PREP

is the preparaCon Cme – including rehearsals – to write schema and data

into LTM.

For password generaAon, we place an upper bound of at most 3 hours,

preferably 1 hour − including rehearsals − on what is permissible #PREP.

2.  #PROC

is the processing Cme it takes to apply the schema to the input and to

generate the output.

For password generaAon, we place upper bound of at most

6 secs/leVer, preferably 2 secs/leVer, on permissible #PROC.

11/5/16

PREP

PREParaAon stores the SCHEMA and associated DATA

in LTM.

For solving crossword puzzles, the SCHEMA is the set

of instrucCons one somehow acquires − by googling,

reading, talking to friends, or thinking about the

problem. Typical instrucCons include: “search ﬁrst for

short words,” and “an answer could be a phrase.”

The DATA includes such stuﬀ as a dicConary of short

words commiOed to memory.

For

password generaAon

, the SCHEMA is a humanly

usable (public i.e. publishable) algorithm and the

DATA is a private KEY (a parameter of the schema).

11/5/16

ProperAes of PREP

InstrucCons must be as detailed for humans as they are for

computers. For example, for humans to memorize a random map

from leOers to digits, I would create a 26x10 chart of leOers to digits,

then show only the 26 cells that must be memorized. Here are:

W -‐> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

11/5/16

A major challenge to schema designer is to

make required memorizaAons easy and fun.

For example, to memorize 2 independent random

funcCons from the standard 26 leOers è10 digits can be

confusing.

o  Instead, memorize a single random funcCon

f: 26 leOers è2-‐digit numbers.

Then set f

= msd(f) and f

= lsd(f).

o  Another possibility is to let F map capital leOers and

f map lower case leOers to digits.

To esCmate PREP Cme, point out a familiar

memorizaAon acAvity of the same sort.

Best is a scholarly study to measure Cme.

ProperAes of PREP

11/5/16

DeﬁniAon of #PREP

#PREP

:= a triple consisCng of

1. 

Work to generate random numbers, e.g. (number of

tosses of a k-‐sided die) x (log

k) for generaCon of

numbers in {1,…,k}.

2. 

Work to store data in permanent memory. This depends

crucially on the data structures used for storage, which

must be speciﬁed. Singly linked lists are much cheaper to

memorize than hash funcAons. State work as “a singly-‐

linked list of length L” or “a hash funcAon of N items”.

3. 

Work to store schema in usable form in permanent

memory, e.g. number of lines of code in the schema.

11/5/16

PROC

PROCessing (e.g. transforming challenge è

response) is a fast

execuCon of schema. For this reason, it does NOT write to LTM:

It may read and write to STM.

It may use the pointers in STM to read − but not write − LTM.

PROCessing uses a pointer from STM into schema (stored in LTM)

plus it uses pointers into relevant data.

In password computaCon, PROCessing the schema includes cycling

thru the challenge (the website name) and outpu•ng the response

(the password).

Our model requires schema have at most 3 pointers in STM.

This requirement is necessary for the schema to be humanly usable.

For

computaAon of a password, I bound the total number of steps

a schema may take – so that the schema takes at most a minute

to transform a 10 leOer challenge into a 10 leOer password.

11/5/16

DeﬁniAon of #PROC

#PROC := total number of steps to execute

the given schema. This number counts the

total number of atomic operaCons executed

by the schema, e.g. jump or condi-onal jump,

reset a given pointer in STM to head of a

given data structure in LTM, read an item in

STM or LTM, write or rewrite a variable in

STM, such as x := x+1.

11/5/16

ProperAes of Schemas

and STM

4.15. 2-‐3 pointers

. Pointers into a singly-‐linked list can

shih right (or circle round to the start of the list) but they

cannot shih leh

4.17. The model enables to count PROCessing steps.

Steps are simple: each takes at most 1 sec.

• 

Shih a pointer into a linked list, be it challenge or data,

by one symbol to the right.

• 

Reset a pointer into challenge to start or end of challenge.

• 

Add or mulCply 2 digits mod 10.

• 

Add or mulCply 2 digits mod 9 or 11.

*

Whence come these properAes and rules?

Answer:

From construcAng password schemas.

11/5/16

Typical

steps

x

1

1

1

2

ProperAes of Schemas

InstrucAons must be as detailed for humans, as they

are for computers.

• 

For example, instrucCons for storing data in LTM

must assert whether the data is stored in a singly

linked list (like the alphabet) or random access (like

names ot faces).

• 

In password generaCon, discuss possible

starAng

locaAons

in a challenge. For example,

Start 1 past 1

vowel.

Start 2 past 1

leOer that has a verCcal i.e. B D E F H ….

Start 1 past 2

leOer that rhymes with eee i.e. B C D E G P T V Z.

Start from 1

occurrence of a leOer in SHRDLU.

In general, spell out all possibiliCes from which a random

selecCon must be made.

11/5/16

Example 1: Telephone Schema

PREP

:

Store Schema. Create and memorize a random

hash funcAon (key)

h

mapping 26 leOers

è

10 digits. Create

and memorize random hash funcAon (key)

T

mapping 10

digits

è

10 digits. For human usability, view T iniCally as a

random 10-‐digit Telephone # T = 412 596-‐4063.

Treat challenge as a circular linked list.

PROC

: (

Input Challenge = CMU)

Evaluate Schema on Challenge:

• 

c:= 1

leOer of Challenge (= C M U ), a linked list ………….

• 

t:= 1

digit of Telephone # (T = 4 1 2... ) ………….…………

• 

Repeat unAl T is exhausted (10 Cmes): ………………………….

h

(c) := mapped value of c. …………………………………………….

o  sum :=

h

(c) + t mod 10 …………………………………………………

o  Output sum …………………………………………………………………..

o  Increment c ………………………………………………………………….

o  Increment t ………………………………………………………………….

11/5/16

Steps

1

1

1

1

1

1

1

53

Total

=

Roughly 20

to 60 secs

• 

STML is our current best suggesCon for humanly usable

and secure (in a manner to be explained) “Pseudo

Random Generators (PRG)”, 1-‐way funcCons, and

Password Generators.

• 

It’s deﬁniCon depends on the STML SubrouCne.

Example 2:

STML (Skip-‐To-‐My-‐Lou) Schema;

not to be confused with

STM (Short Term Memory).

11/5/16

• 

Preprocessing: Memorize a random f: leOers è digits.

The STML subrouAne is easier to understand if the input, e.g.

AMAZING, is replaced by its mapped value:

f(AMAZING) = 1316947

• 

Processing: Given a challenge consisCng of a string of leOers -‐

now viewed as a string of digits -‐ apply the following algorithm:

1.  Set SUM = last digit of challenge

2.  Set current digit (pointer) = ﬁrst digit of challenge

3.  Repeat unCl current digit shihs past last digit of challenge:

1.  Set SUM = SUM + current digit (mod 10)

2.  If SUM is at least 5, output SUM

3.  Shih current digit in challenge one digit to the right.

6. The STML (Skip-‐To-‐My-‐Lou) SubrouAne

11/5/16

• 

Example: We assume leOers are hashed to digits.

• 

Suppose the challenge is 31415926.

• 

The running sum is 6+3=9, 9+1=0, 0+4=4, 4+1=5…

resulCng in 31415926

è

90450917

è

9597

•

Processing measure of complexity =

[apply map +

set SUM + shih pointer + (n-‐1) x (apply map + add

mod 10 + compare to 5 + output (maybe) + shih

pointer)] = 3 + (n-‐1)(1+1.5+1+0.5+1)

=

5n-‐2.

STML Idea in an Example

11/5/16

• 

Example: We assume leOers are hashed to digits,

.

T = , T is memorized as hash funcAon.

Suppose the challenge (post hashing) is 31415926.

• 

The running sum is T(6)+3=2, T(2)+1=5+1=6, T(6)+4=3,

T(3)+1=9… resulCng in 31415926

è

59867978

• 

The following computaCon not only counts steps, but also shows in

which order operaAons are performed:

Processing measure of

complexity =

[apply map + set SUM + shih pointer + (n-‐1) x (apply

map + add mod 10 + compare to 5 + output (maybe) + shih pointer)]

= 3 + (n-‐1)(1+1.5+1+0.5+1)

5n-‐2 steps.

For n=10, 5n-‐2 = 48, Cme t

is in the range

16 secs < t < 48 secs

.

STML SubrouAne in an Example

11/5/16

123 456 7890

êêê êêê êêêê

758 429 -‐ 1360

• 

Preprocessing: Memorize a random f: leOers è digits.

Memorize a random T: digits è digits.

The STML subrouAne is easier to understand if the input, e.g.

AMAZING, is replaced by its mapped value:

f: AMAZING è 1316947

• 

Processing: Given a challenge consisCng of a string of leOers -‐

now viewed as a string of digits -‐ apply the following algorithm:

1.  Set SUM = last digit of challenge

2.  Set current digit (pointer) = ﬁrst digit of challenge

3.  Repeat unCl current digit shihs past last digit of challenge:

1.  Set SUM = T(SUM) + current digit (mod 10)

2.  If SUM is at least 5, output SUM

3.  Shih current digit in challenge one digit to the right.

6. The STML (Skip-‐To-‐My-‐Lou) SubrouAne

11/5/16

The n è2n STML "PRG”

Wlg, we assume that the random map x from the 26 leOers to the 10 digits

has already been applied to the challenge, C, so C is a random seed/string

of digits of a length n (to be determined).

The iniCal output F

with key x to challenge C. This string F

complete output of the PRG. Indeed, its expected length is just half the

length of C. To produce a longer string, at the cost of one more (expensive)

pointer, the idea is to run STML on many substrings of C and concatenate

the results. The substrings of C have to be chosen in a humanly usable way.

For a seed C of length n, let C

i

, be the substring of characters obtained by

skipping the ﬁrst i characters of C, then including the next i characters,

skipping the next i, including the next i, and so on. E.g., C

is the substring

of characters in even posiCons, C

2

is the substring of characters obtained by

alternately skipping two and including two, and so on… stopping when n/2

digits are skipped.

11/5/16

The STML (Skip To My Lou) Schema

Theorem:

An adversary with an ordinary laptop who knows

that the user is using STML on integer challenges of length 10,

mapping 10 leOer challenges to 20 leOer passwords, can with

high probability invert a password in just a few hours.

Open QuesAon:

Is it possible for a modern (2016)

supercomputer working for a week to invert a 40 digit

password?

Theorem (Michael Wehar):

The STML schema, which

takes n digit challenges to 2n digit responses, can recover a/the

challenge from the response in O(n

steps). Consequently,

STML is not what cryptographers would call a true pseudo-‐

random sequence generator.

11/5/16

Part three

Security

11/5/16

Part three

Security

• 

Big diﬀerence between computers and humans

besides

that they are much faster than humans in doing arithmeCc operaCons,

computers are constantly improving in speed and memory, while human

replacements are the same old model. For this and other reasons,

• 

When counCng steps to determine the complexity of schemas,

speciﬁc

constants are important

. O(n) is not good enough. Step counts

must specify constants beOer, e.g. c



n + O(1) for speciﬁc c is beOer than

O(n).

• 

1-‐way funcCons are examples. A typical 1-‐way funcCon is mulCplicaCon

of 2 primes. But any two numbers that a human can mulCply in their

head can be factored in a fracCon of a second by a laptop. The whole

quesCon whether humans can compute 1-‐way funcCons (in their head)

that a powerful supercomputer cannot invert is open. Is it possible? Is it

even imaginable? Santosh’s STML suggests a candidate schema to do it.

11/5/16

To deal with security, we deﬁne a

measure of the Quality

Q

of a Schema.

The deﬁniCon of a Schema’s Quality assumes that

challenges are sampled from a DistribuCon of

challenges called the

DicConary, D.

Examples:

may be the set of all strings of

length 10 on the 26 leOers, each with probability

1/26

, or D may be the set of all website names

with their associated probabiliCes of being called.

11/5/16

Password Schema Quality Q

• 

Password Schema Quality Q is then deﬁned in terms of a game

between an "all-‐powerful" adversary and a trusted judge.

• 

The adversary is a Turing machine that is "all-‐powerful” in the

sense that 1. it has unbounded but ﬁnite running Cme, and 2. it

knows the public Schema, but 3. it does not know the private Key.

• 

The judge knows the private key. IniCally, the judge draws a

random sample from D and gives it to the adversary. Aher each

such draw, the adversary makes 10 guesses at the password. If

none is correct, then the judge gives the adversary the correct

password and draws another challenge. This repeats unCl adversay

makes a correct guess.

• 

Q is deﬁned to be the expected number of draws with

replacement from D unCl the adversary guesses correctly.

11/5/16

The Quality Q of various Schemas

For the schema that maps each leOer

(like X) to a digit (like 4), independent of

the locaCon of X in the challenge (so X

è

implies XXX

è

444), we know that Q = 7.

The lower bound (Q ≧ 7) is info theoreCc.

The upper bound (Q ≦ 7) is proved by

supplying an opCmal program for the

adversary, due to Elan Rosenfeld.

11/5/16

The Quality Q of various Schemas

For the schema that maps each leOer

(like X) to a digit that depends solely on

the leOer and its posiCon in the challenge

(e.g. XX -‐> 41, YY -‐> 59; so XY -‐> 49), we

know Q = 14. The lower bound of 14 is

info theoreCc, and the upper bound of

14, which is harder to prove, is again due

to Elan Rosenfeld.

11/5/16

Future DirecAons

Major Open QuesAon:

Can a human generate

pseudo-‐random sequences in her head?

More formally:

Does there exist a publishable schema such that a

human with just

A FEW HOURS PREP

(to learn the public schema and to

generate and memorize a private random key) and

A FEW MINUTES

PROC

per input seed can transform a dozen private random 20-‐digit

numbers/seeds into corresponding 40 digit numbers such that a 2016

supercomputer running for an hour cannot disCnguish the dozen 40-‐digit

pseudo-‐random numbers from a dozen truly random 40 digit numbers?

Manuel’s principal interest is to extend the model of human computaCon

to applicaCons in which the computaCon is not done enCrely in the head.

Santosh’s principal interest is to invesCgate how far one can go in doing

cryptography in one’s head. He looks to construct secure humanly usable

Pseudo Random Generators, 1-‐way funcAons, Trapdoor funcAons, etc.

11/5/16

Yüklə 4,3 Mb.

Dostları ilə paylaş: