Add content for slides

Maximilian Friedersdorff 2019-05-30 13:43:56 +01:00
parent 9d72f9f066
commit 7d3423f860
4 changed files with 67 additions and 140 deletions

BIN
password_reuse_1.png Normal file

Binary file not shown (size: 80 KiB).

BIN
password_reuse_2.png Normal file

Binary file not shown (size: 65 KiB).

BIN
password_reuse_3.png Normal file

Binary file not shown (size: 36 KiB).


@ -1,158 +1,85 @@
Plotting with Matplotlib Why is Password Reuse a Problem?
------------------------ --------------------------------
.. image:: password_reuse_1.png
.. image:: password_reuse_2.png
.. image:: password_reuse_3.png
Also creating a presentation with rst2pdf About password strength
========================================= -----------------------
How is strength measured?
=========================
'Entropy' `s` depends on the size of the alphabet `a` and the length `n` of the
password:
Data Structures .. math::
--------------- s = log_2(a^n)
Favour simpler data structures if they do what you need. In order:
#. Built-in Lists * 0889234877724602 -> 53 bits
- 2xN data or simpler * ZeZJieatdH -> 60 bits
- Can't install system dependencies
#. Numpy arrays
- 2 (or higher) dimensional data
- Lots of numerical calculations
#. Pandas series/dataframes
- 'Data Wrangling', reshaping, merging, sorting, querying
- Importing from complex formats
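The two example figures on the entropy slide can be reproduced with a short sketch. The alphabet sizes are assumptions, since the slide does not state them: 10 symbols for the digit string and 62 (a-z, A-Z, 0-9) for the mixed-case one. ``entropy_bits`` is an illustrative helper name, not part of the slides.

```python
import math


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Return s = log2(a^n) = n * log2(a), in bits."""
    return length * math.log2(alphabet_size)


# 0889234877724602: 16 characters drawn from 10 digits (assumed alphabet)
digits_bits = entropy_bits(10, 16)
# ZeZJieatdH: 10 characters drawn from 62 alphanumerics (assumed alphabet)
alnum_bits = entropy_bits(62, 10)

print(round(digits_bits), round(alnum_bits))
```

The formula only holds if every character is chosen uniformly at random; a human-chosen password of the same length has less entropy.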
Shamelessly stolen from https://stackoverflow.com/a/45288000 Why are weak passwords problematic?
===================================
Loading Data from Disk Weak passwords are trivial to crack in many situations. A password with 53 bits
---------------------- of entropy may be cracked by a criminal organisation in less than an hour.
Natively
========
.. code-block:: python
>>> import csv
>>> with open('eggs.csv', newline='') as csvfile:
... spam = csv.reader(csvfile,
... delimiter=' ',
... quotechar='|')
... for row in spam:
... # Do things
... pass
Loading Data from Disk
----------------------
Numpy
=====
.. code-block:: python
>>> import numpy
>>> spam = numpy.genfromtxt('eggs.csv',
... delimiter=' ',
... dtype=None) # No error handling!
>>> for row in spam:
... # Do things
... pass
``numpy.genfromtxt`` will try to infer the datatype of each column if
``dtype=None`` is set.
``numpy.loadtxt`` is generally faster at runtime if your data is well formatted
(no missing values, only numerical data or constant length strings)
Loading Data from Disk
----------------------
Numpy NB.
=========
**Remind me to look at some actual numpy usage at the end**
- I think numpy does some type coercion when creating arrays.
- Arrays created by ``numpy.genfromtxt`` cannot in general be indexed like
``data[xstart:xend, ystart:yend]``.
- Data of unequal types are problematic! Pandas *may* be a better choice in
that case.
- Specifying some value for ``dtype`` is probably necessary in most cases in
practice: https://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html
Loading Data from Disk
----------------------
Pandas
======
.. code-block:: python
>>> import pandas
>>> # dtype=None is def
>>> spam = pandas.read_csv('eggs.csv',
... delimiter=' ',
... header=None)
>>> for index, row in spam.iterrows():
... # Do things
... pass
``header=None`` is required if the file does not have a header.
What about strong passwords?
============================
They are difficult to remember, which is especially a problem when you use a
different strong password for every service. You are also tempted to write them
down or reuse them.
Generating Data for Testing It's surprisingly difficult for humans to generate good passwords!
---------------------------
Generating the data on the fly with numpy is convenient. Password Managers to the Rescue!
--------------------------------
Password managers allow you to create a unique and strong password for every
service.
.. code-block:: python Additional benefits:
>>> import numpy.random as ran * Remembers passwords for you
>>> # For repeatability * Generates passwords for you
>>> ran.seed(7890234) * Automagically fills in passwords on websites for you (this is important!)
>>> # Uniform [0, 1) floats * Makes passwords available on all your configured devices
>>> data = ran.rand(100, 2) * Can store additional related data: usernames, answers to security questions,
>>> # Uniform [0, 1) floats PINs for debit/credit cards
>>> data = ran.rand(100, 100, 100)
>>> # Std. normal floats
>>> data = ran.randn(100)
>>> # 3x14x15 array of binomial ints with n = 100, p = 0.1
>>> data = ran.binomial(100, 0.1, (3, 14, 15))
Plotting Time Series Any of the mainstream password managers is equivalent in the above respects.
--------------------
Plot data of the form: Can you trust password managers?
--------------------------------
Yes*
.. math:: y=f(t) How do they keep passwords secure?
----------------------------------
1. User supplies a password
2. The password is used to derive an encryption key. This process is designed
to be slow, even on modern hardware
3. The key generated this way is used to encrypt/decrypt your passwords
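Steps 1 and 2 can be sketched with the standard library. The slides do not say which key derivation function a given password manager uses; PBKDF2-HMAC-SHA256, the iteration count and the 16 byte salt below are assumptions chosen for illustration, and ``derive_key`` is a hypothetical helper name.

```python
import hashlib
import os


def derive_key(master_password: str, salt: bytes,
               iterations: int = 200_000) -> bytes:
    """Derive a 32 byte key from the master password.

    The high iteration count makes every guess deliberately slow.
    """
    return hashlib.pbkdf2_hmac(
        "sha256",
        master_password.encode("utf-8"),
        salt,
        iterations,
        dklen=32,
    )


# The salt is random, not secret, and is stored next to the database.
salt = os.urandom(16)
key = derive_key("correct horse battery staple", salt)
print(len(key))
```

The derived key would then be handed to a symmetric cipher for step 3; that part is left out because the Python standard library does not ship one.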
Note that the security of the encryption depends on the strength of your
password. With a poor password (50 bits), it would take the entire computing
power of the world less than a month to crack the database. With a decent-ish
password (60 bits), it would take on the order of 50 years on average. With a
better password (70 bits), it would take on the order of 50,000 years.
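The three figures above are consistent with each other: every extra 10 bits multiplies the work by 2^10 = 1024. A rough sketch of that scaling follows. The guess rate is not a measured number; it is back-calculated from the "60 bits takes about 50 years" figure, and ``average_crack_years`` is an illustrative helper name.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def average_crack_years(bits: int, guesses_per_second: float) -> float:
    """On average, half of the 2^bits candidates must be tried."""
    return 2 ** (bits - 1) / guesses_per_second / SECONDS_PER_YEAR


# Assumed rate, calibrated so that 60 bits takes 50 years on average.
RATE = 2 ** 59 / (50 * SECONDS_PER_YEAR)

for bits in (50, 60, 70):
    years = average_crack_years(bits, RATE)
    print(f"{bits} bits: {years:.3g} years ({years * 365.25:.3g} days)")
```

With this calibration, 50 bits comes out at roughly 18 days and 70 bits at roughly 51,000 years, matching the orders of magnitude quoted above.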
Subplots Generating a Strong Password
-------- ----------------------------
Passphrases are better than passwords:
* Tr0ub4dor&3 -> 28 bits of entropy, hard to remember
* correct horse battery staple -> 44 bits of entropy, easy to remember
Saving Plots Use passphrases wherever you have to remember the password yourself.
------------
So far I've just displayed plots with ``plt.show()``. You can actually save Generate passphrases with Diceware
the plots from that interface manually, but when scripting, it's convenient ==================================
to do so automatically: 1. Roll five six-sided *physical* dice
2. Read the numbers left to right
.. code-block:: python 3. Find the word with that number on a list of 6^5 (7776) words
4. Repeat until the desired length is reached. For a password manager, use at
>>> # Some plotting has previously occurred least 7 words.
>>> plt.savefig('eggs.pdf', dpi=300, transparent=False) 5. Write down your passphrase on paper and keep it somewhere secure
6. If you are 100% confident that you will not forget the passphrase, destroy
The output format is interpreted from the file extension. the paper by burning
The keyword arguments are optional here. Other options exist.
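The Diceware lookup can be sketched in a few lines. This is only an illustration of the mechanics: the slides call for *physical* dice, and the ``secrets`` module stands in for them here. ``WORDLIST`` is a placeholder with made-up entries; a real Diceware list maps each five-digit key to an actual word.

```python
import itertools
import math
import secrets

# Placeholder list (assumption): keys "11111" .. "66666", dummy words.
WORDLIST = {
    "".join(key): f"word{i:04d}"
    for i, key in enumerate(itertools.product("123456", repeat=5))
}


def roll_key() -> str:
    """Simulate five six-sided dice, read left to right."""
    return "".join(str(secrets.randbelow(6) + 1) for _ in range(5))


def passphrase(n_words: int = 7) -> str:
    """Look up one word per roll of five dice."""
    return " ".join(WORDLIST[roll_key()] for _ in range(n_words))


print(passphrase())
# Each word contributes log2(6^5) bits of entropy.
print(7 * math.log2(len(WORDLIST)))
```

Each word adds about 12.9 bits, so the seven words suggested for a password manager give roughly 90 bits.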
Error Bars
----------
Stacked Bar Graph
-----------------
Resources
---------
NumPy User Guide: https://docs.scipy.org/doc/numpy/user/index.html
NumPy Reference: https://docs.scipy.org/doc/numpy/reference/index.html#reference
Matplotlib example gallery: https://matplotlib.org/gallery/index.html
Pandas: It probably exists. Good luck.
This presentation: https://git.friedersdorff.com/max/plotting_with_matplotlib.git