
Publishable Humanly Usable Secure Password Creation Schemas*

Manuel Blum and Santosh Vempala
CMU, Pittsburgh PA 15213 and Georgia Tech, Atlanta GA 30332
mblum@cs.cmu.edu, vempala@cc.gatech.edu


Abstract

What can a human compute in his/her head that a powerful adversary cannot infer? To answer this question, we define a model of human computation and a measure of security. Then, motivated by the special case of password creation, we propose a collection of well-defined password-generation methods. We show that our password-generation methods are humanly computable and, to a well-defined extent, machine uncrackable. For the proof of security, we posit that password-generation methods are public, but that the human's privately chosen seed is not, and that the adversary will have observed only a few input-output pairs. Besides the application to password generation, our proposed Human Usability Model (HUM) will have other applications.

Introduction

Passwords are maps from challenges (e.g., website names) to responses (strings of characters). The ideal password is hard to crack, i.e., an adversary that knows the password-generation method, but not the human's privately chosen seed, and that knows only a few challenge-response pairs, will provably (as opposed to probably) have little clue about the response to a new challenge. Even the knowledge of a half dozen challenge-response pairs will give a combination supercomputer-human adversary little chance to respond correctly.

Constructing a password is a trivial problem for a computer. All it needs to do is store a table containing a completely random string for each challenge of interest. It can then respond to any challenge by table look-up. In this case, an adversary who knows all but one challenge-response pair has negligible information for responding to the missing challenge.

 

But this does not work for humans [Shay et al., 2014]. Indeed, people commonly use passwords that are too simple [Bonneau, 2012; Cranor, 2014] and/or they use the same password for many different websites [Ives et al., 2004] and/or they use password vaults that, if cracked, expose all their passwords [Li et al., 2014]; see [Blocki, 2014] for a detailed discussion and references. People are mostly unwilling to make the effort to generate and memorize random responses to 100 challenges, so this password-generation method is not humanly usable. The motivating problem for this paper is the following:

Are there any humanly usable, secure and publishable password generation methods?

Informally, human usability has two requirements:

Preprocessing Time (PT1): Any initial long-term memorization should take at most 1 hour, preferably less than 20 minutes; future rehearsals should take at most 2 minutes each for a total of no more than 1 hour over the user's lifetime.

Processing Time (PT2): Generation of a 10-character password should be done entirely in the user's head (without the use of paper, pencil, phone), and should take at most 30 seconds, preferably less than 20 seconds.

Copyright © 2015, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

 

With these stringent requirements, what can a human do? What level of security can be achieved for password generation?

In the next section, we formally define the space of allowable password schemas. Following that, we give several candidate schemas. Then we introduce our Human Usability Model (HUM) to define and quantify the space of human algorithms. The subsequent section defines security parameters. Armed with these measures, we analyze the proposed schemas. This will let us state our (existence) theorem for good password schemas. As a note of caution, we also demonstrate a natural schema that is not humanly usable (an impossibility theorem).

 

Password schemas: definitions and desiderata

Passwords are maps from challenges to responses. We also use the word password (as is commonly done) to denote the response to a challenge.¹ With this definition, a challenge is typically a website name, e.g. AMAZON, AMEX, EBAY; the associated response is the password. To log in to a website such as AMAZON, for example, the user must enter -- besides the user's login name -- the password response to the challenge AMAZON.

* PHUSPCS, pronounced "phoospics" as in "toothpicks".
¹ Hopefully, this dual use of the word will not be cause for confusion.

A password schema consists of a sample space of allowable challenges called the dictionary and a set of instructions for transforming challenges in the dictionary into passwords. That set of instructions typically has two stages, of which the second is the processing stage for transforming challenges into passwords (about which we say more later). The first is the preprocessing stage, which explains:
1. what must be memorized (e.g., a random map from 26 characters to 10 digits), and
2. what operations are to be performed on the challenge (e.g., a direct map from the characters in the challenge to the corresponding digits in the response) to create the password.
Both 1 and 2 above must include explicit directions for memorizing and for operating on challenges so that a human can do these in a well-defined acceptable amount of time.

The goal is to have a schema for creating and generating passwords that is 1) humanly usable, 2) self-rehearsing, 3) secure, 4) publishable and 5) analyzable. Here are the meanings of these terms, in reverse order:



 

Analyzable means that the schema is so explicitly well-defined that a Turing Machine, in practice a computer without access to the web, can follow its directions. By comparison, most currently recommended schemas for constructing passwords are "soft": they are not algorithms. "Take a sentence and concatenate the first letters of all words to create the password," while good advice as far as it goes, invariably does not say how the sentence is to be chosen; on the rare occasions when it does say how, it does not say it in a way that a Turing Machine can generate the password. Precise (Turing Machine) algorithms are necessary for security and human usability to be analyzed.

 

Publishable means that the schema for generating and recalling passwords is publicly available for everyone to know. In particular, the adversary knows exactly how the user creates responses (passwords) to challenges (website names); the only unknown to the adversary is the particular random choices made by the user in the preprocessing stage. This selection is private (to the user).

Note that analyzable and publishable are not the same: many schemas have been published that are not sufficiently precise to be analyzable. Schemas that are analyzable are not in general published.

 

Secure means that an adversary cannot generate a user's password to a new website, a website whose user password she has not seen.

Security for password schemas is measured very differently from security for most (other) cryptographic schemas. With most cryptographic schemas, security can be based on the fact that the adversary has limited, typically polynomial-time-bounded, computational power. In password schemas the adversary is a human with a supercomputer, while the user is a human without a computer. As the adversary has vastly greater power than the human user, proofs of security are generally information-theoretic: the adversary is assumed to be capable of computing anything computable.

In this work, complexity considerations do arise, but only in the rather severe limits they place on the user. Even finite automata are too powerful to model the computations that a human can do in his head. The only satisfactory proof that a human can compute something in a given amount of time is to see a human do it. Equivalently, we can specify the steps that the human must perform to generate passwords, estimate the time for each, then put this all together to estimate the time it will take the human to do his part.



 

Self-rehearsing means that in the process of computing passwords for frequently entered websites, the user gets all the practice he needs to compute passwords for infrequently entered websites. In other words, computing the passwords for commonly visited sites rehearses both the algorithm (schema) and the maps that are used to construct the passwords for the other sites.

In our experience, at typical rates of password generation, all letters except Z, X, Q, J (22 letters in all) are rehearsed sufficiently often to be remembered without additional rehearsals. These are the self-rehearsed letters. As for the maps of Z, X, Q, J, they can either be deliberately rehearsed or all be given the same map value. In addition, only lines of code in the password algorithm that are used for the most frequently visited websites will be rehearsed often enough to be remembered without additional rehearsals.

 

Humanly usable means that a human can learn the schema, then use it to generate passwords, and that all this can be done satisfying certain time requirements (given earlier) for preprocessing and processing. In a later section, we will present a Human Usability Model. It defines a set of operations that humans can perform and gives measures for the costs of these operations.

 

The Dictionary: A password-generation schema must define the sample space of (allowable) challenges. Examples include:
1. All 3-letter strings of letters, with every 3-letter string equally likely.
2. A standard English dictionary of 10,000 or so words, each with its probability of occurrence in ordinary English text.
3. The top 500 currently most popular websites, each with its probability of being logged into.

Sample schemas

Most of our schemas will involve implicit or explicit character-to-character maps. We write letters of the challenge as capitals A, B, C… and their map values (A)f, (B)f, (C)f… as lower-case letters: (A)f=a, (B)f=b, (C)f=c…. We use (A)f instead of f(A) because human computation requires first retrieving A and then applying f.

Our first set of schemas uses a letter-to-digit map.


Digit Schema 1 (DS1)
Preprocessing: Memorize a random letter-to-digit map f (we discuss the important issue of how to do this later).
Processing: Apply the map to the distinct letters of the challenge in the order they appear.
Example: f(AMAZON) = (A)f (M)f (Z)f (O)f (N)f = amzon. (Note: the lower-case letters represent digits; we write f after the letter it is applied to; we skipped the second A.)
Caution: short challenges, or ones with many repeated letters, will lead to short responses, e.g., f(AAA) = (A)f = a. This is not a favorite schema.
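DS1 is simple enough to state as code. The following is a minimal sketch for illustration only; the helper for building f and the function names are ours, not part of the schema.

```python
import random
import string

def random_letter_to_digit_map(rng):
    """Preprocessing: a random map f from the 26 letters to the 10 digits."""
    return {c: str(rng.randrange(10)) for c in string.ascii_uppercase}

def ds1_response(challenge, f):
    """Processing: apply f to the distinct letters of the challenge,
    in the order of their first appearance (repeated letters are skipped)."""
    seen = []
    for c in challenge:
        if c not in seen:
            seen.append(c)
    return "".join(f[c] for c in seen)

f = random_letter_to_digit_map(random.Random(0))
# ds1_response("AMAZON", f) yields one digit per distinct letter: A, M, Z, O, N
```

Note how the caution above shows up directly: a challenge such as AAA has a single distinct letter and yields a 1-digit response.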

 

In what follows, we use ⊕ and ⊗ to denote addition and multiplication mod 10.



Digit Schema 2 (DS2)
Preprocessing: Memorize a letter-to-digit map f.
Processing: For a challenge A_0 A_1 … A_{n-1}:
1. Output b_0 = (A_{n-1})f ⊕ (A_0)f.
2. For i = 1, …, n-1: Output b_i = b_{i-1} ⊕ (A_i)f.

Remark 1: A variant of the above schema is to use a random-looking starting point for processing. This can be done by first computing a starting location from the challenge, starting the processing at that location, and then wrapping around the challenge when needed to get b_1 … b_{n-1}.
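A sketch of DS2 (function name and the dict representation of f are ours). Unlike DS1, every letter of the challenge, repeats included, produces an output digit, since each b_i chains on b_{i-1}.

```python
def ds2_response(challenge, f):
    """DS2: b_0 = (A_{n-1})f + (A_0)f (mod 10);
    b_i = b_{i-1} + (A_i)f (mod 10) for i = 1..n-1."""
    digits = [int(f[c]) for c in challenge]
    out = [(digits[-1] + digits[0]) % 10]   # last letter's digit plus first's
    for d in digits[1:]:
        out.append((out[-1] + d) % 10)      # running sum mod 10
    return "".join(str(b) for b in out)
```

For example, with the (illustrative) map A→1, M→2, Z→3, O→4, N→5, the challenge AMAZON yields 689261.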

 

Digit Schema 3 (DS3)
Preprocessing: Memorize a random letter-to-digit map f and a random digit permutation g.
Processing: For a challenge A_0 A_1 … A_{n-1}:
1. Output b_0 = ((A_{n-1})f ⊕ (A_0)f)g.
2. For i = 1, …, n-1: Output b_i = (b_{i-1} ⊕ (A_i)f)g.
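DS3 composes the DS2 recurrence with the digit permutation g. A sketch (the cyclic-permutation helper mirrors the construction suggested in the analysis section, where g is memorized as a 10-digit number with no repeated digits):

```python
def cyclic_digit_permutation(ten_digits):
    """g from a 10-digit string with no repeats: each digit maps to the
    next one in the string, and the last wraps around to the first."""
    return {int(ten_digits[i]): int(ten_digits[(i + 1) % 10]) for i in range(10)}

def ds3_response(challenge, f, g):
    """DS3: b_0 = ((A_{n-1})f + (A_0)f mod 10)g;
    b_i = (b_{i-1} + (A_i)f mod 10)g for i = 1..n-1."""
    digits = [int(f[c]) for c in challenge]
    out = [g[(digits[-1] + digits[0]) % 10]]
    for d in digits[1:]:
        out.append(g[(out[-1] + d) % 10])
    return "".join(str(b) for b in out)
```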


 

  The next few schemas do not use any arithmetic. 



Word Schema 1 (WS1)
Preprocessing: Memorize a letter-to-word map, i.e., pick a word for each starting letter A-Z. For this we suggest that the user first choose a topic (e.g., "Animals"), then generate a list of 10 or more words from that topic for each starting letter (using the internet if he wishes), picking one of those 10 words at random as his map value. If he cannot find 10 words for some letters, he can widen the topic (e.g., "Animals and Birds"). For rare letters such as Q and X, he can simply pick a random word. Let f map a letter to the first TWO consonants in the word following the starting letter. E.g., if A → AARDVARK, then (A)f = RD.
Processing: Given a challenge, apply the map f to the first k distinct letters of the challenge, or to all letters of the challenge if it is of length less than k. We suggest k = 4.
Example: Suppose
  A → AARDVARK
  M → MONGOOSE
  Z → ZORILLO
  O → ORANGUTAN
Then f(AMAZON) = (A)f (M)f (Z)f (O)f = RDNGRLRN.
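A sketch of WS1. The vowel set used to identify consonants is our assumption (the schema's description only says "consonants"; we treat Y as a consonant here).

```python
VOWELS = set("AEIOU")  # assumption: Y counts as a consonant

def ws1_map(word_for_letter):
    """f: each letter maps to the first TWO consonants of its chosen word,
    counted after the word's starting letter (A -> AARDVARK -> 'RD')."""
    return {letter: "".join(c for c in word[1:] if c not in VOWELS)[:2]
            for letter, word in word_for_letter.items()}

def ws1_response(challenge, f, k=4):
    """Apply f to the first k distinct letters of the challenge
    (or to all of them if there are fewer than k)."""
    seen = []
    for c in challenge:
        if c not in seen:
            seen.append(c)
    return "".join(f[c] for c in seen[:k])
```

With the AARDVARK / MONGOOSE / ZORILLO / ORANGUTAN table above, ws1_response("AMAZON", f) reproduces RDNGRLRN.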

 

Letter Permutation Schema 1 (LP1)
Preprocessing: Pick a random permutation of the 26 letters A-Z, and memorize the sequence. For this we suggest the user recite the permutation to a tune, e.g., the standard alphabet song:
ABCD EFG
HIJK LMNOP
QRS and TUV
W and X, Y and Z…
For example, if the random permutation is XBUDVOWTNRPGAHFJSZYLIMQKEC, then the tune could go:
XBUD VOW
TNRP GAHFJ
SZY and LIM
Q and K, E and C…
Let f map a letter to the next letter in this sequence, with the last mapping to the first.
Processing: Given a challenge, apply the map f to each distinct character of the challenge.

Note: The user does not have to make a pass through the challenge to determine which letters have not occurred earlier. On challenge MAMAMIA, for example, the user will recall, on the second and third occurrences of M and A, that he already printed (M)f and (A)f respectively.
Caution: It is advised that each challenge in the user's dictionary should contain at least 4 distinct characters.
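A sketch of LP1 (function names are ours):

```python
def lp1_map(perm):
    """f: each letter maps to its successor in the memorized permutation,
    with the last letter wrapping around to the first."""
    return {perm[i]: perm[(i + 1) % len(perm)] for i in range(len(perm))}

def lp1_response(challenge, f):
    """Apply f to each distinct letter of the challenge,
    in order of first appearance."""
    seen = []
    for c in challenge:
        if c not in seen:
            seen.append(c)
    return "".join(f[c] for c in seen)

f = lp1_map("XBUDVOWTNRPGAHFJSZYLIMQKEC")  # the example permutation above
```

With this permutation, AMAZON (distinct letters A, M, Z, O, N) maps to HQYWR, and C wraps around to X.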



Remark 2: For the above schemas that output characters, a random-looking start could be obtained by fixing a rule such as: "start one past the second vowel" or "start at the vowel before the last consonant".



Remark 3: Many real-life websites require that passwords contain special characters and/or both capital and lower-case letters. To comply with this, we suggest that the user employ a fixed rule for all passwords, such as: "capitalize the first letter and terminate the password with @1". The purpose of this is not to increase the security but to have self-rehearsing methods for making passwords legal.

 

Letter Permutation Schema 2 (LP2)
Preprocessing: Pick a random word with a great many different letters. In a table of the 26 letters, mark off those letters that appear in the word. Pick another random word that uses many unmarked letters (and possibly some marked ones) and mark off those letters in the table. Pick a third word that uses many unmarked letters and mark those letters off the table. Memorize the three words in the string word1-word2-word3, and a final string4 consisting of letters (if any) that do not appear in any word, e.g., QXZ if all letters except Q, X and Z appear in the three chosen words. Let f map each letter in the challenge to whichever consonant follows its first occurrence in the sequence word1-word2-word3-string4.
Processing: Apply f to the distinct letters of the challenge. If a letter is a repeat, map it to the next letter in the sequence, wrapping around the unused letters if needed.
Example: ANTIVIRAL-MOHAWKBUG-ESCAPED are the three words and FJQXYZ is string4. Then
f(AMAZON) = (A)f (M)f ((A)f)f (Z)f (O)f (N)f = NHTFHT.


 

The above schemas can be extended to multiple maps (letter-to-digit or letter-to-letter), allowing for proportionally more security, needing proportionally more preprocessing, but only slightly more processing. We call these multi-map schemas, and provide one illustration.

[MMS1]
Preprocessing: Memorize m letter-to-letter maps f_1, …, f_m.
Processing: Given a challenge of length n, compute t = n (mod m) and apply the t-th map f_t to the challenge.



 

Remark 4: There are several ways the user might choose a meta-map to decide which of his maps to apply to a challenge. E.g., he could use a letter-to-digit map and compute a digit from the challenge (mod m) to determine which map to use.
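A sketch of [MMS1]. Whether the selected map is applied to all letters or only the distinct ones is left open above; we apply it to all letters here (an assumption).

```python
def mms1_response(challenge, maps):
    """[MMS1]: with m memorized letter-to-letter maps f_1..f_m (here a
    list), pick t = n mod m from the challenge length n and apply f_t."""
    t = len(challenge) % len(maps)
    return "".join(maps[t][c] for c in challenge)
```

For instance, with two Caesar-style maps (shift-by-1 and shift-by-2, purely illustrative), a 6-letter challenge selects map 0 and a 3-letter challenge selects map 1.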

All the above schemas can be applied by making one or at most two passes over the challenge. In contrast, the following schema appears to require substantially more memorization and far too many passes.

Random Matrix Schema (RM)
Preprocessing: Memorize an n × n matrix M of randomly chosen digits and a random letter-to-digit map f.
Processing: Given a challenge A_0 A_1 … A_{n-1}, for i = 0, …, n-1: Output b_i = (M_{i,0} ⊗ (A_0)f) ⊕ (M_{i,1} ⊗ (A_1)f) ⊕ … ⊕ (M_{i,n-1} ⊗ (A_{n-1})f), i.e., the mod-10 inner product of row i of M with the digit string (A_0)f … (A_{n-1})f.
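Under the matrix-vector reading of RM given above (our reconstruction of the formula), the schema can be sketched as follows; note that each output digit forces a full pass over the challenge, which is why RM fails human usability.

```python
def rm_response(challenge, M, f):
    """RM (as reconstructed): b_i = sum_j M[i][j] * (A_j)f  (mod 10).
    Each of the n output digits needs a full pass over the challenge."""
    digits = [int(f[c]) for c in challenge]
    n = len(digits)
    return "".join(str(sum(M[i][j] * digits[j] for j in range(n)) % 10)
                   for i in range(n))
```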



 

   



The Human Usability Model (HUM)

We model the human as a finite state machine with both long-term and short-term memory. The long-term memory is written in the preprocessing stage, when the human stores the function(s) that must be memorized. That long-term memory is used in read-only mode together with a tiny, short-term, read-write memory in the processing stage to transform any given challenge to its response. The state diagram of the machine that does this, i.e., accesses both types of memory in order to transform challenge into response, is described in part in the preprocessing stage and in part in the processing stage of each password schema. The state diagram must be learnable in minutes, not hours, from a description of the code and a few examples. It must also be completely self-rehearsing in that every line of code defining that state diagram must be used in a substantial fraction (a quarter? a half? the larger the fraction, the more self-rehearsing the code) of challenge-response computations. A password schema that chooses its starting location in the challenge by randomly hashing the first letter of the challenge to the start location would not be self-rehearsing and therefore would not be permissible.

 

More on Long- and Short-term Memory

For the purpose of transforming challenges into responses, human memory consists of a large, relatively permanent long-term memory and a small short-term working memory.

1. Long-term memory is unbounded except for
● the (substantial, proportional) time and effort needed to acquire it (up to a concentrated hour or so),
● the relatively small time and effort (in spaced rehearsals) required to maintain it (measured in minutes over a lifetime), and
● the length of one's (mentally healthy) life.

2. Long-term memory (for our purpose) consists of sequences of text. The storage of these texts may be based on visual, auditory, sensory, motor, experiential and other memories, all of which can play a role in computing and remembering passwords. Since this work is concerned with transforming challenge strings into response strings, only the memory required to store strings of characters (rather than, say, visual images), and the rules/password schemas for transforming them, are described here.

3. Just as humans typically store their memory of the alphabet A, B, … Z in a singly-linked list, which can typically only be recited left-to-right from the starting point and a few other entry points in the list, the input challenge words are streamed into the machine looking much like a singly-linked list in long-term memory. The human has direct access to the input characters in the order they arrive. Short-term memory enables going back at most one character. The human also has pointer access to the first, second, last and last-but-one characters of the input sequence.

4. Working memory is severely bounded. In password computations, it typically consists of 2 characters (2 digits, 2 letters, or 1 letter and 1 digit) and a pointer into one or at most two sequences, these being typically the challenge itself and at most one additional sequence.

5. Working memory fades quickly. Items in working memory that are needed but not recalled for more than 2 or 3 operations must be reconstructed.

We are now ready to define our measure of the cost (or perhaps more precisely, the effort) of human computation.

HUM Processing Complexity = total number of reads from and writes to working memory.

The total combined Preprocessing Complexity (PC1) and Processing Complexity (PC2) is a vector: HUM = (PC1, PC2).

 



Complexity of some natural operations
● Working memory for letters and digits is a stack of two elements (letters or digits). The cost of accessing the top (most recent) entry is 1, and that of accessing the bottom entry is 2.
● Retrieving a sequence from recently retrieved long-term memory, i.e., retrieving a pointer to the beginning of the sequence, has Cost = 1.
● Following a pointer into a recently-retrieved sequence in long-term memory, moving it 1 step to the right, or resetting the pointer to start/start+1 or to end/end-1, has Cost = 1.
● Operations of =, + and × (mod 10) on two single-digit numbers: Cost = number of digits created (not necessarily all written) during the operation.
Example: 4+3 (mod 10) = 7 has cost 1; 4+9 (mod 10) = 3 has cost 2.
● Operations of =, + and × (mod 11) on two single-digit numbers: Cost = number of digits created (not necessarily written) before applying the mod operation, plus 1 for the mod operation on numbers ≥ 11.
Example: 4+3 (mod 11) has cost 1; 4+9 (mod 11) has cost 3.
● Operations of =, + and × (mod 2, 3, 4, 5, 9) on two single-digit numbers are treated similarly.
● Applying a map such as a character-to-digit map is the same as following a pointer and has Cost = 1.
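As a toy illustration of this cost model (the operation names and tallying function are ours), the per-operation charges can be mechanized; the 3n total for DS1 and the 4.5n + 2 total for DS2, derived in the analysis section, fall out directly.

```python
# Unit costs from the list above; 'add_mod10' is charged 1.5 on average,
# since additions producing a carry digit (cost 2) and those producing
# a single digit (cost 1) are equally likely in expectation.
COST = {"apply_map": 1.0, "output": 1.0, "shift": 1.0,
        "reset": 1.0, "add_mod10": 1.5}

def hum_cost(ops):
    """Total working-memory cost of a sequence of operations."""
    return sum(COST[op] for op in ops)

def ds1_cost(n):
    """DS1: (apply map, output, shift pointer) once per letter: 3n."""
    return n * hum_cost(["apply_map", "output", "shift"])

def ds2_cost(n):
    """DS2: startup (go to last, map, go to first, map, add, output),
    then (map, add, output, shift) for each of the remaining n-1 letters."""
    start = hum_cost(["reset", "apply_map", "reset", "apply_map",
                      "add_mod10", "output"])
    step = hum_cost(["apply_map", "add_mod10", "output", "shift"])
    return start + (n - 1) * step
```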

 

Remark 1: The HUM only indicates proportional effort for a human; the actual time will vary with the human. This is analogous to asymptotic complexity, which counts the number of operations, while the running time depends on the machine executing the algorithm and its current load. An important difference between human and machine computation counts is that, since challenges typically have length n ≈ 12 or less, care must be taken to estimate all constants, not just accept their big-O or big-Ω dependence on n.

Remark 2: A major reason for specifying the operations and their costs is to determine, albeit roughly, the expected human usability of proposed password schemas. These computed values should be confirmable experimentally.



 

 

Cost of memorization

How are we to measure the cost of preprocessing, i.e. long-term memorization? First, fix an upper bound on how much one can memorize in a single sitting and how many sittings one is willing to do per day. While this can vary from person to person, most people can memorize one 7-digit phone number in a single sitting. An efficient way to retain the number in long-term memory is the following: taking T = 15 minutes, a phone number memorized at time t should be rehearsed at times t+T, t+2T, t+4T, t+8T, …, a roughly "doubling" schedule [Pimsleur 1967; Woźniak & Gorzelańczyk 1994]. These purposeful rehearsals of a phone number can stop once the required time-between-rehearsals exceeds the expected time-between-successive-uses of the phone number.

Suppose the cost of memorizing one 7-digit phone number is C, and the cost of rehearsing a 7-digit phone number is c, c « C. Then we observe that n phone numbers can be memorized at a cost² of Cn + cn·(log n). If no more than n phone numbers are ever memorized, the next sitting will require just c·(log n) time, a big drop from C + c·(log n), and this c·(log n) cost will then decrease slowly to zero.
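The doubling schedule and the Cn + cn·(log n) bound can both be checked mechanically. A sketch (time units and the stopping horizon are illustrative, not prescribed by the text):

```python
import math

def rehearsal_times(t0, T=15, horizon=8 * 60):
    """Doubling schedule: rehearse at t0+T, t0+2T, t0+4T, ... (minutes),
    stopping once the gap exceeds the horizon, which stands in for the
    expected time between successive natural uses."""
    times, gap = [], T
    while gap <= horizon:
        times.append(t0 + gap)
        gap *= 2
    return times

def total_memorization_cost(n, C, c):
    """Sitting k costs C + c*log(k) (memorize the k-th number, plus the
    log k rehearsals retaining the earlier ones); summed over k = 1..n
    this is at most C*n + c*n*log(n)."""
    return sum(C + c * math.log(k) for k in range(1, n + 1))
```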

Security measures

The human usability of a password schema is measured by its preprocessing and processing complexities PC1 and PC2. We define additional measures Q and K to evaluate the security of a password schema.

1. Quality Q is formally defined by the password game below. Informally, Q is the expected number of randomly chosen challenges and associated responses that an adversary must see in order to have sufficient information to generate the correct response, in one of at most 10 tries, to the next randomly chosen challenge. Our Q's are typically in the range of 6 to 10. By comparison, cryptographic zero-knowledge protocols have Q = 10^(n-1) for integer passwords of length n.

² To see this, suppose a new phone number is memorized in each of sittings k = 1…n, and that sitting k, in which we memorize the k-th telephone number, also includes the log k rehearsals, each costing c, sufficient to retain all previously memorized phone numbers. The cost of this k-th sitting is then C + c·(log k). That is at most Cn + cn·(log n) total for all n phone numbers, reckoned on the sitting in which we memorize our last (n-th) phone number.

2. Exponent K is an integer defining a probability, typically 10^(-K) for an integer response to a challenge, and correspondingly smaller for a response consisting of letters and digits. This 10^(-K) is the probability that an adversary (Alice) can guess the correct response to a new challenge once she has seen (in previously observed challenge-response pairs) all but one of the characters in the current challenge. For example, suppose the schema maps letter strings (challenges) of length n to digit strings (responses) of length 2n by hashing each letter in the challenge to a pair of randomly chosen digits. Further suppose that on challenge EBAY, the adversary has previously seen E, B, A, but not Y. Then K = 2, indicating that the correct response to EBAY is any one of 10^K = 100 equally likely possibilities.



The Password Game (for determining Q)

Recall that the design of a password schema includes in its specification a sample space of permissible challenges, the dictionary. The game is played between the User (he) of the schema and the Adversary (she) under the watchful eye of a trusted Judge. There are no winners or losers, just a determination of Q.

Until the Adversary's 10 responses to a challenge are guaranteed to contain the correct one:
• The Judge selects a challenge at random (with repetition) from the dictionary.
• User and Adversary independently and privately supply responses to the Judge. Here,
1. the User supplies the unique correct response to the Judge;
2. the Adversary (without seeing the User's response) guesses/computes 10 possible responses, which she supplies to the Judge;
3. if none of the 10 responses is correct, then the Judge gives the Adversary the correct response.
End Until
Return the expected number of challenges, Q, up to and including the one on which the Adversary supplied a correct response.

Analysis of human usability and security for select schemas

[DS1]
Usability.
Processing: This schema requires the user to apply his map to the first letter of the challenge, output the value, and shift his "pointer" to the next letter of the challenge, each operation costing 1 unit in our model. Thus, for an n-letter challenge, the Human Usability Measure is HUM = #(apply map, output, shift pointer in challenge)·n = 3n, where #(x, y, z) denotes the number of steps to perform operations x, y, z in this order. The average time it takes to do this is about n seconds on an n-letter challenge. This comes to 10 seconds for a 10-letter challenge, which is well within the desired 20 seconds (and the required 30 seconds) per password to satisfy requirement PT2.



Preprocessing: The preprocessing complexity for memorizing a letter-to-digit map, while significant, can be made less than 1 hour by use of an appropriate memorization technique. Specifying a technique that upper bounds the human's memorization or computation time is an important aspect of human computability, akin to specifying an algorithm to achieve a certain run time.

After the user has tossed a 10-sided die 26 times to create a random map from 26 letters to 10 digits -- an amount of time that we do not count toward the time requirement -- he can memorize his map. This he does by using a 26-row × 10-column table in which each entry prescribes a visual way to memorize the associated letter-to-digit map. For example, row W, column 6 shows how to remember that the chosen map takes W → 6:

[Figure: visual mnemonic for the map W → 6]

The authors of this paper can do an initial memorization (up to the first successful recall of the entire map) in a concentrated 15-30 minutes using this table. Once the user has completed an initial memorization of the material, he can set a timer to rehearse his map using the recommended doubling schedule [Pimsleur, 1967; Woźniak & Gorzelańczyk, 1994; Blocki et al., 2013], which rehearses at intervals of 15 min, 30 min, 1 hr, 2 hrs, 4 hrs, etc. The first day is hell, but after that the rehearsals are short and (relatively) painless. Assuming each rehearsal takes 2 minutes, the total time spent over the user's lifetime is at most an additional 45 minutes, which comes to a total of one hour. This satisfies the PT1 time requirement.



Security. The adversary can respond correctly to a challenge only if she has seen all letters of the challenge in previous challenges. If she hasn't seen even one letter, the chance of guessing the response to a single letter of the challenge is at most 1/10; thus K = 1. The quality measure Q will be the expected number of challenge-response pairs up to (and including) the first challenge all of whose letters have appeared in earlier challenges. This value depends on the dictionary. The table below shows the Q value (expectation), its standard deviation and its 90%-value (i.e., the number of challenges at which the probability that Q is at least this value is at least 0.9) for three dictionaries: a large English dictionary, the top 500 domain names and random 7-letter strings. All estimates of Q were done using 100,000 trials with repetition.

 

Dictionary         Q     STD    90%-value
English 50K        6.23  1.80   4
Top 500 domains    6.60  2.03
Random 7-letter    7.54  1.58   6

Table 1. Q values

For a dictionary of random strings, Q depends only slightly on the length of the challenge, as shown in the next table.

Length  1     2     3     4     5     6     7     8     9     10
Q       7.09  8.89  9.12  8.82  8.39  7.96  7.54  7.12  6.97  6.91

Table 2. Q for random strings
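The Q estimates above can be reproduced in outline by simulation. A sketch for DS1 (all names are ours; the dictionary here is random 7-letter strings, the third dictionary of Table 1):

```python
import random

def estimate_q(dictionary, trials=100_000, seed=0):
    """Monte Carlo estimate of Q for DS1: draw challenges with repetition
    until one consists entirely of previously seen letters; Q is the
    expected number of draws, including that final challenge."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        seen, count = set(), 0
        while True:
            count += 1
            ch = rng.choice(dictionary)
            if set(ch) <= seen:      # every letter already revealed
                break
            seen |= set(ch)
        total += count
    return total / trials

letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
rng = random.Random(1)
dictionary = ["".join(rng.choice(letters) for _ in range(7)) for _ in range(500)]
```

With enough trials, estimate_q(dictionary) should land in the vicinity of the 7.54 reported for random 7-letter strings.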

[DS2]
Usability. Preprocessing for this schema is the same as for [DS1], requiring the user to memorize a letter-to-digit map. Processing involves applying this map, adding mod 10, outputting, and shifting the pointer. More precisely,
HUM = #(go to last letter, apply map, go to first letter, apply map, add mod 10, output) + #(apply map, add mod 10, output, shift pointer)·(n-1) = 6.5 + (n-1)(4.5) = 4.5n + 2.
Here we have counted 1.5 per addition mod 10, since some additions create 2 digits while others create one, and both types occur equally often in expectation.

Security. From knowledge of the schema and a challenge-response pair, the adversary can recover all the map values of letters in that challenge: (A_i)f = b_i ⊖ b_{i-1} for i ≥ 1 (where ⊖ denotes subtraction mod 10), and then (A_0)f = b_0 ⊖ (A_{n-1})f. Thus the parameters K and Q are the same for [DS2] as for [DS1].

  

[DS3] 



Usability.  Preprocessing: This schema needs the user to 

memorize both a letter-to-digit map as before (15-30 min 

up front, 1 hour total over the user’s lifetime), and also a 

random permutation on the digits 0-9.  He does this by 

choosing a random cyclic permutation of the digits, i.e., a 

single 10-digit number

 

with no repeated digits. The map 



takes each digit to the next digit, and the last digit to the 

first. This is as hard as memorizing a single 10-digit phone 

number, and takes 2 minutes up front and less than 10 min 

over the course of the user’s lifetime. We note that with 

only the 10-digit number memorized, access to an input 

digit takes longer initially – about 5 steps on average.  The 

user has to scan multiple digits to find the right one.  Each 

time he does this, however, he rehearses his map: this will 

soon give random access to the digit map. 

The  processing complexity is HUM = #(go to last letter, 

apply f, go to first letter, apply f, add mod 10, switch to g, 

apply g, output) + #(select next letter, apply f, add mod 10, 

switch to g, apply g, output, shift pointer)·(n-1) 

      = 8.5 + (n-1)(7.5) = 7.5n+1.  

The “switch to g” is for computing g after the user has 

memorized the map as a linked list but not yet as a random 

access hash.  Its cost would initially be 5 but would drop 

eventually to 1. 
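A sketch of the resulting computation (our own illustration; we assume the running sum is kept before g is applied, with g used only to produce each output digit):

```python
def ds3_response(challenge, f, cycle):
    """f: private letter-to-digit map; cycle: the memorized 10-digit
    number with no repeated digits, given as a list of digits."""
    g = {cycle[i]: cycle[(i + 1) % 10] for i in range(10)}  # next digit in the cycle
    c = challenge.lower()
    s = (f[c[-1]] + f[c[0]]) % 10      # running sum, seeded as in DS2
    out = [g[s]]
    for ch in c[1:]:
        s = (s + f[ch]) % 10
        out.append(g[s])
    return ''.join(str(d) for d in out)

# With the cycle 0,1,...,9, g just adds 1 mod 10:
print(ds3_response("abc", {'a': 1, 'b': 2, 'c': 3}, list(range(10))))  # -> 570
```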



Security. A computationally unbounded adversary can 

consider all possible cycles on 10 digits (9! =362,880 of 

them), and decode the user’s maps as follows: whenever 

she notices an inconsistency in the values of f for a particular g, that g is eliminated. The quality Q is the number of challenges up to and including the first for which all surviving (f, g) pairs give the same response. In our evaluations, the Q for [DS3] is about the same as for [DS1] and [DS2].

Nevertheless, [DS3] provides more security in at least one 

sense: the adversary needs to see multiple challenge-

response pairs before she can determine the value of f for 

even a single letter. 
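The elimination loop can be made concrete. The sketch below is our own implementation of the attack as described: since a cyclic order is unchanged by rotation, we fix 0 as the starting digit, giving the 9! candidate cycles, and reuse the DS2-style recovery of f-values (challenges of length at least 2 are assumed).

```python
from itertools import permutations

def consistent_f(pairs, ginv):
    """Return the letter-to-digit values forced by the observed
    (challenge, response) pairs under inverse output map ginv,
    or None if some letter is forced to two different values."""
    f = {}
    for challenge, response in pairs:
        c = challenge.lower()
        s = [ginv[int(d)] for d in response]       # pre-g running sums
        for i in range(1, len(c)):
            v = (s[i] - s[i - 1]) % 10             # f(c_i) = s_i - s_{i-1} mod 10
            if f.setdefault(c[i], v) != v:
                return None
        v0 = (s[0] - f[c[-1]]) % 10                # s_1 = f(c_1) + f(c_n) mod 10
        if f.setdefault(c[0], v0) != v0:
            return None
    return f

def surviving_cycles(pairs):
    survivors = []
    for rest in permutations(range(1, 10)):
        cycle = (0,) + rest
        # inverse of g: map each digit back to its predecessor in the cycle
        ginv = {cycle[(i + 1) % 10]: cycle[i] for i in range(10)}
        if consistent_f(pairs, ginv) is not None:
            survivors.append(cycle)
    return survivors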

 

[WS1] 



Usability. Preprocessing: The user has to create a map 

from letters to words of a topic, perhaps aided by a word 

generator, and then rehearse his map. Creating the map 

takes 30mins or so, but after that, the map is surprisingly 

easy to rehearse and to remember since the starting letter 

together with the topic gives a strong clue to the word. 

Budgeting 2min per rehearsal, with spaced repetition at 

doubling intervals, we get a total of at most 45min over the 

user’s lifetime. 

Processing: Recalling that we process k letters of the challenge and output 2k letters, the processing complexity is HUM = #(apply f, output two consonants, shift pointer on the challenge)·k = 4k = 2n. This is the lowest complexity per output character of all the schemas we discuss.
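A sketch of the processing step (our own reading: we assume "two consonants" means the first two consonants of the mapped word; the topic words shown are hypothetical):

```python
VOWELS = set("aeiou")

def ws1_response(challenge, word_map, k=4):
    """word_map: the user's private letter-to-word map for one topic.
    Maps each of the first k letters to two consonants of its word."""
    out = []
    for ch in challenge.lower()[:k]:
        consonants = [x for x in word_map[ch] if x not in VOWELS]
        out.extend(consonants[:2])
    return ''.join(out)

# Hypothetical food-topic map:
food = {'c': 'carrot', 'm': 'melon', 'u': 'udon'}
print(ws1_response("cmu", food, k=3))  # -> crmldn
```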

Security. As in the case of [DS1], an adversary has to have 

seen each of the first k characters of a challenge in previ-

ous challenges to be able to respond correctly. However, 

each challenge-response pair reveals only k values, where 

k is typically less than the length of the challenge. Using 

k=4 (responses of length 8) gives a significantly higher 

value of the quality Q compared to [DS1, DS2, DS3] that 

use the entire challenge (Table 1).  

 

Dictionary        Q      STD    90%-value
English 50K       7.25   2.02   4
Top 500 domains   7.56   2.24
Random 4-letter   8.82   2.21   6

Table 3. Q values for 4-letter challenges


The security parameter K is between 1 and 3. If there is 

one letter in the new challenge that the adversary has not 

seen, he has to map it to two letters. If these two letters 

were both completely random, then his chance would be only 1/21^2 (one of 21 x 21 consonant pairs). However, since we output consecutive consonants of a word, and consonants are not equally likely, the true probability is a bit higher.

 

[LP1] 

Usability.  Preprocessing: Memorizing a random letter 

permutation by singing it to a tune takes 5 mins initially 

and 1 min per rehearsal, giving a total of 20 additional 

minutes over the user’s lifetime. 



Processing. 

HUM = #(apply map, output, shift pointer)·n = 3n.  

We remark that while we have charged a cost of 1 for applying the "next-letter" map, this could initially take significantly longer, as the user would have to recite (sing) an expected 13 letters to compute his hash. With time, however, he will get faster. This schema will be truly self-rehearsing if and only if the user recites the entire permutation each time he generates a password.
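The LP1 computation itself is a single substitution pass (a minimal sketch; the permutation shown is a hypothetical example, not a recommended one):

```python
def lp1_response(challenge, perm):
    """perm: the user's memorized permutation of the alphabet, as a
    26-letter string; each letter maps to its successor in perm."""
    succ = {perm[i]: perm[(i + 1) % 26] for i in range(26)}
    return ''.join(succ[ch] for ch in challenge.lower())

# With the alphabet itself as the "permutation", each letter shifts by one:
print(lp1_response("cmu", "abcdefghijklmnopqrstuvwxyz"))  # -> dnv
```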

Security. The quality Q is the same as in previous schemas 

that use all n letters of a challenge (Table 1). 

  

[LP2] 

Usability. This schema has the distinct advantage of very low preprocessing complexity: once the user has chosen his three words, he has practically got them memorized already! The processing complexity is similar to [LP1], converging to HUM = 3n after sufficient practice.

Security. The quality Q is potentially lower, since an adversary who knows the schema could, after seeing some letters, try to guess the user's words, which come from a significantly smaller set of possibilities than all letter permutations.

 

[RM] 

Usability. This schema, introduced primarily to demonstrate the tradeoff between usability and security, requires a huge preprocessing effort: random-access memorization of an n x n matrix M of random digits, for n about 8, 9, 10, with the user learning a pointer to a value in each position (i,j) of the matrix. Unlike all the previous schemas, which we have successfully implemented on ourselves, we have not done or even tried to do this one. We do believe it is possible, given routine memory feats such as learning the digits of pi to hundreds of digits. The processing complexity, at least for the natural human algorithm, can be computed as follows: for each output digit, make one pass over the challenge: apply the matrix map, apply the letter-to-digit map, multiply mod 10, add mod 10, then shift the pointer into the challenge. Thus, for an n-digit output,
HUM = #((apply M, retrieve f, apply f, multiply, add, shift pointer)·n, output, shift M pointer)·n
      = (7.5n + 2)n = 7.5n^2 + 2n.

Here we counted 2 per multiplication since multiplying two single digits typically results in two digits. This implementation uses n passes over a challenge of length n. The resulting processing complexity is the highest of all the schemas we have considered. Is there a faster human algorithm? In a later section, we will prove that the number of passes needed by any human algorithm grows linearly with the length of the challenge.
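In matrix form, the response is a matrix-vector product mod 10. The sketch below is our own illustration; we fix one orientation, writing response digit j as the sum over i of M[j][i]·f(c_i) mod 10:

```python
def rm_response(challenge, f, M):
    """One pass over the challenge per output digit, as described above:
    response_j = sum_i M[j][i] * f(c_i) mod 10."""
    x = [f[ch] for ch in challenge.lower()]
    n = len(x)
    return ''.join(str(sum(M[j][i] * x[i] for i in range(n)) % 10)
                   for j in range(n))

# The "reversal" matrix used in the impossibility proof below:
# M[j][i] = 1 iff i = n-1-j, so the response is the digit string reversed.
n = 3
M_rev = [[1 if i == n - 1 - j else 0 for i in range(n)] for j in range(n)]
print(rm_response("abc", {'a': 1, 'b': 2, 'c': 3}, M_rev))  # -> 321
```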

Security. The adversary has to know the entire matrix M and the user's map on the letters of the next challenge to be able to output the correct response with probability more than 1/10. Each challenge-response pair gives n equations. Thus Q is at least n. An unbounded adversary could keep track of all 10^26 possible letter-to-digit maps f, and eliminate them as she finds inconsistencies. A rough lower estimate of Q is n + (26/n) = 11.25 for n=8.

 

[Multi-map schemas] 



Usability. The preprocessing complexity is m times the preprocessing complexity of a single map. For letter-to-digit maps, the user can instead memorize a single map from letters to m-digit numbers. For letter-to-word maps, he can pick a different topic for each map.
  While processing, the user first has to compute the map number t. Computing t costs HUM = n+1 with one rule for t, and HUM = 5 with the other. The remaining processing cost is that of a single-map schema. For example, for [DS1], whose processing cost for a single map is 3n, the processing cost of a multi-map schema would be 3n+5 with the latter rule for t.



Security. This schema has a Q whose expected value is nearly m times the Q-value for a single map. To see this, suppose for the sake of argument that it takes exactly Q challenge-response pairs to break a single map. If we think of the m maps as m bins and each challenge as a ball thrown into a random bin, then the game stops as soon as one of the bins reaches Q balls, which will happen slightly before all m of them do.
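This balls-in-bins argument is easy to check numerically. A sketch (our own code; it idealizes the single-map break as happening after exactly Q pairs):

```python
import random

def multi_map_breaking_time(m, Q, trials=2000):
    """Average number of challenges until some one of the m maps has
    received Q challenges (the point at which that map is broken)."""
    total = 0
    for _ in range(trials):
        bins = [0] * m
        t = 0
        while max(bins) < Q:
            bins[random.randrange(m)] += 1
            t += 1
        total += t
    return total / trials

print(multi_map_breaking_time(1, 6))            # exactly 6.0
print(round(multi_map_breaking_time(3, 6), 1))  # between 6 and 16, somewhat below m*Q = 18
```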

 

 

 



 

 

 



 

 

 



 

 

 



Schema   Preprocessing   Processing HUM    Q
DS1      15+45           3n                6-7
DS2      15+45           4.5n+2            6-7
DS3      17+55           7.5n+1            6-7
WS1      30+45           2n (=16)          7-8
LP1      5+20            3n                6-7
LP2      1+1             3n (eventually)   3-4
MWS1     m(30+45)        3n+5              6·m
RM       ??              7.5n^2 + 2n       11

Table 4. Comparison of Schemas


The ultimate proof that a schema is humanly usable is to 

see a human do it within the prescribed constraints. To do 

this, and to validate the HUM, we tested ourselves on most 

of these schemas. These results are in the tables and charts 

below. Just as a computer's timing depends on what other processes it is executing, a human's speed also depends on several factors such as energy level and distraction. To avoid these confounds, we timed all the schemas in one sitting, using 11 challenges in random order.

Table 5. Timings for H2 

We compared these observed timings to their HUM values by plotting the data for [DS1], [DS2], [DS3] and [WS1] against the length of the challenge (Figure 1). The HUM for [WS1] is 8 for a challenge with at least k=4 distinct letters, i.e., constant.

Using the same challenges, we timed 8 other human subjects on [DS1] and [LP1], with each schema timed in a separate sitting. The average timings are shown in Figure 2. The two schemas have comparable timings, consistent with their HUM values. We expect them to get even closer as the users rehearse their letter permutations.



 

 

 

 

 

 

A human password schema theorem… 

  Based on the previous section, we assert the following. 

 

THEOREM 1. There exist publishable PASSWORD 

SCHEMAS that are  

1. WELL-DEFINED (i.e., implementable unambiguously 

on a Turing machine),  

2. MACHINE UNCRACKABLE to a well-defined extent by unbounded Turing machines (i.e., information-theoretically, based on the Q value), and

3. HUMANLY USABLE as witnessed by a feasible human algorithm with a bounded HUM, and demonstrated by at least one normal dedicated human being with at most 1 hour of preprocessing and at most 30 seconds of processing (i.e., shown empirically).



…and an impossibility theorem 

Here we show that any human algorithm that implements 

the [RM] schema has to make a prohibitively large number 

of passes over the challenge, even for some simple M.  

 

THEOREM 2. There exist n x n matrices M such that, for any k, any algorithm that computes the response C·M by making at most k passes over any given n-digit challenge C must use a working memory of at least n/k digits.

The theorem is tight: we can compute n/k digits of output in each pass using n/k digits of working memory. However, since the working memory allowed in our human usability model is a small constant (2 digits), at least n/2 passes are needed to compute C·M for a challenge of length n.

 

 



Proof of Thm 2. Let M be the matrix that "reverses" the challenge C = c_1 c_2 ... c_n, i.e., (C·M)_i = c_{n+1-i}. First we prove the bound for any 1-pass algorithm. A 1-pass algorithm that scans C only once cannot output anything until it sees the last digit of C. Consider the working memory contents S at that point. The algorithm is able to compute the response R from just S and c_n. From R we can reconstruct C since M is full rank.



 

N DS1 DS2 DS3 WS1 LP1 LP2
CMU        3  6  16  5  5  6
APPLE      4  6  22  6  10
GMAIL      4  9  19  6  8  18
DELTA      4  6  14  6  7  19
CHASE      5  7  16  6  14  15
AMAZON     7  10  24  7  15  17
GATECH     6  8  23  6  14  24
PAYPAL     5  8  20  7  9  17
DROPBOX    7  8  11  22  6  14  18
PNCBANK    7  6  13  25  7  18  21
WELLFARGO  10  10  18  29  5  17  35



Figure 1. Timing comparison in one sitting 

Figure 2. Average times of 8 participants 

 

 




So S and c_n together determine C, which means S must have at least n-1 digits, and along with reading c_n (which is essential), this is a total of n digits of memory.
  To analyze multi-pass algorithms, we take the challenge to be a random digit sequence. For a proof by contradiction, suppose that the algorithm uses only s digits of memory, with s < n/k. Then by the above argument, after one pass, the algorithm can output at most s digits of the response R, and these must lie in the s-prefix of R. In the second pass, consider the point at which the digit R_{s+1} is output. After this, due to the space restriction, at most s-1 further digits can be output in the second pass, so at most a 2s-long prefix of R is output after two passes. This conclusion clearly holds if R_{s+1} is not output in the second pass. We repeat this argument, using R_{(i-1)s+1} in the i-th pass, concluding that after k passes fewer than n digits have been output, so at least k+1 passes are needed.         ■

 

  We remark that for a random matrix M, the above proof 



technique shows that with probability at least 9/10 (over 

the choice of M), any 1-pass algorithm needs memory of 

size at least n. This is because the last entry of the first row 

is nonzero with probability 9/10; in this case, the algorithm 

can start its output only after reading the last digit, at which 

point it needs as much memory as its total output.  

  This lower bound technique applies to many natural matrices. For other matrices, a human algorithm making only one or two passes can compute the product C·M. E.g., DS2 uses the matrix with 1's on and below the diagonal and zeros above it. This raises two questions for future research: for which matrices can C·M be computed by a human algorithm? Which of these gives the most security?



Further considerations 

Using letter frequencies. The frequency of letters in English words is highly skewed, and the first 10 letters by frequency make up more than 70% of all occurrences. Thus, an easier memorization option that works nearly as well as a complete map is to learn a map for just the ten most frequent letters: E, T, A, O, I, N, S, H, R, D. In the case of a letter-to-digit map, the corresponding digits could be read off as a single phone number. For a letter-to-letter map, we recommend mapping only to consonants, then inserting letters to make the 10 consonants form words and sounds that are more familiar and therefore easier to remember.

 

Changing passwords. Passwords sometimes have to be changed. How should a user who is using one of the password generation methods described here change his password(s)? The simplest, oft-suggested method to change passwords is to attach an extra character at the beginning or end of the password and cycle through its chosen range of values, e.g., 0,1,2,0,1,2,... This has a serious problem: a new password that differs from the old one in only one digit will not be particularly secure.

  A more effective method to change passwords is to set challenge' equal to one of challenge0, challenge1, ..., challenge9 (or 0challenge, 1challenge, ...). These challenge' will turn into passwords that provably look quite different from each other. This is in fact a desideratum for password generation methods.
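In code, the recommendation amounts to transforming the challenge rather than the emitted password (a minimal sketch of our own; `respond` stands for any of the schemas above, and the toy stand-in below is not a real schema):

```python
def versioned_response(respond, challenge, version):
    """Derive the version-th password by appending the version digit to
    the challenge itself, so the whole response changes, not one digit."""
    return respond(challenge + str(version))

# With any schema function, e.g. a toy stand-in:
toy = lambda c: c[::-1]          # placeholder for a real schema
print(versioned_response(toy, "amazon", 3))  # -> 3nozama
```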

 

Acknowledgements

 

We are deeply grateful to Jeremiah Blocki, Lenore Blum, Anupam Datta, Adam Kalai, V. V. Rao and Haofeng Zhang for helpful discussions, and to many others for trying out some of these schemas. This work was partially supported by the NSF.



  

References 

Blocki, J. 2014. Usable Human Authentication: A Quantitative Treatment. Ph.D. Thesis, Carnegie Mellon University, CMU-CS-14-108.
Blocki, J., Blum, M. and Datta, A. 2013. Naturally Rehearsing Passwords. ASIACRYPT (2): 361-380.
Bonneau, J. 2012. The science of guessing: analyzing an anonymized corpus of 70 million passwords. IEEE Symposium on Security and Privacy (SP), 538-552.
Bonneau, J., Herley, C., van Oorschot, P. C. and Stajano, F. 2012. The Quest to Replace Passwords: A Framework for Comparative Evaluation of Web Authentication Schemes. IEEE Symposium on Security and Privacy, 553-567.
Cranor, L. F. 2014. What's wrong with your pa$$w0rd? TED talk. http://www.ted.com/talks/lorrie_faith_cranor_what_s_wrong_with_your_pa_w0rd/transcript?language=en
Li, Z., He, W., Akhawe, D. and Song, D. 2014. The Emperor's New Password Manager: Security Analysis of Web-based Password Managers. USENIX Security 2014: 465-479.
Ives, B., Walsh, K. R. and Schneider, H. 2004. The Domino Effect of Password Reuse. Communications of the ACM, 47(4): 75-78.
Kruger, H., Steyn, T., Medlin, B. and Drevin, L. 2008. An empirical assessment of factors impeding effective password management. Journal of Information Privacy and Security, 4(4): 45-59.
Pimsleur, P. 1967. A Memory Schedule. The Modern Language Journal, 51(2): 73-75.
Shay, R., Komanduri, S., Durity, A. L., Huh, P. (S.), Mazurek, M. L., Segreti, S. M., Ur, B., Bauer, L., Christin, N. and Cranor, L. F. 2014. Can long passwords be secure and usable? CHI 2014: 2927-2936.
Woźniak, P. A. and Gorzelańczyk, E. J. 1994. Optimization of repetition spacing in the practice of learning. Acta Neurobiologiae Experimentalis, 54: 59-62.
