Compare commits

..

No commits in common. "43b2b1f813274c09be56678a4e1eca29eed7c6cf" and "6f9e7174819f2e0387443abafaa8f27ff7ecdae4" have entirely different histories.

6 changed files with 137 additions and 3941 deletions

@ -1,6 +1,6 @@
pageSetup:
size: null
width: 12cm
width: 16cm
height: 9cm
margin-top: 0cm
margin-bottom: 0cm
@ -44,7 +44,7 @@ styles:
table-heading:
parent: heading
backColor: #666666
backColor: black
alignment : TA_LEFT
code:

Binary file not shown (before: 80 KiB).

Binary file not shown (before: 65 KiB).

Binary file not shown (before: 36 KiB).

3787
slides.pdf

File diff suppressed because one or more lines are too long

@ -1,175 +1,158 @@
Surviving phishing
------------------
Password reuse, password managers and strong passwords
======================================================
.. contents:: :depth: 1
Plotting with Matplotlib
------------------------
Why is Password Reuse a Problem?
--------------------------------
.. image:: password_reuse_1.png
:height: 6.5cm
Also creating a presentation with rst2pdf
=========================================
Consider the following hypothetical users that reuse a strong password in
most places and the following common scenario:
Data Structures
---------------
Favour simpler data structures if they do what you need. In order:
+------------------+--------------------------+
| User             | Password                 |
+==================+==========================+
| mark1@gmail.com  | QUo5Qt+1Wa/Q1smDJRDbFg== |
+------------------+--------------------------+
| mark2@gmail.com  | +9Hz+/20rVkSkbcsmgdVFw== |
+------------------+--------------------------+
| mark3@gmail.com  | wnYkRcbi7Kkh7Fx2uR8EeA== |
+------------------+--------------------------+
#. Built-in Lists
- 2xN data or simpler
- Can't install system dependencies
#. Numpy arrays
- 2 (or higher) dimensional data
- Lots of numerical calculations
#. Pandas series/dataframes
- 'Data Wrangling', reshaping, merging, sorting, querying
- Importing from complex formats
#. User registers an account with a careless service, e.g. Facebook, Yahoo,
   Google, Equifax etc.
#. The service is hacked and the password and email are leaked
#. The hacker logs in to the email account
#. The hacker resets passwords on all important accounts tied to that email
   address
Shamelessly stolen from https://stackoverflow.com/a/45288000
Loading Data from Disk
----------------------
Natively
========
.. code-block:: python

    >>> import csv
    >>> with open('eggs.csv', newline='') as csvfile:
    ...     spam = csv.reader(csvfile,
    ...                       delimiter=' ',
    ...                       quotechar='|')
    ...     for row in spam:
    ...         # Do things
    ...         pass
Loading Data from Disk
----------------------
Numpy
=====
.. code-block:: python

    >>> import numpy
    >>> spam = numpy.genfromtxt('eggs.csv',
    ...                         delimiter=' ',
    ...                         dtype=None)  # No error handling!
    >>> for row in spam:
    ...     # Do things
    ...     pass
``numpy.genfromtxt`` will try to infer the datatype of each column if
``dtype=None`` is set.
``numpy.loadtxt`` is generally faster at runtime if your data is well formatted
(no missing values, only numerical data or constant length strings).
Loading Data from Disk
----------------------
Numpy NB.
=========
**Remind me to look at some actual numpy usage at the end**
- I think numpy does some type coercion when creating arrays.
- Arrays created by ``numpy.genfromtxt`` cannot in general be indexed like
  ``data[xstart:xend, ystart:yend]``.
- Data of unequal types are problematic! Pandas *may* be a better choice in
that case.
- Specifying some value for ``dtype`` is probably necessary in most cases in
practice: https://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html
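
The indexing caveat above can be seen with a small in-memory example. This is
a minimal sketch: the sample data and variable names are made up for
illustration, and ``encoding='utf-8'`` is passed explicitly on the assumption
that a reasonably recent numpy is in use.

.. code-block:: python

    import io
    import numpy

    mixed_text = 'a 1 2.5\nb 2 3.5\n'
    numeric_text = '1 2\n3 4\n'

    # Mixed column types give a 1-D structured array with one field per column
    mixed = numpy.genfromtxt(io.StringIO(mixed_text), delimiter=' ',
                             dtype=None, encoding='utf-8')
    # Purely numeric data gives an ordinary 2-D array
    numeric = numpy.loadtxt(io.StringIO(numeric_text), delimiter=' ')

    print(mixed.shape, mixed.dtype.names)
    print(mixed['f1'])         # select a column by field name
    print(numeric[0:1, 0:2])   # 2-D slicing works here

Columns of the structured array are selected by field name rather than by a
second index, which is why 2-D slicing fails on it.
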
Loading Data from Disk
----------------------
Pandas
======
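.. code-block:: python

    >>> import pandas
    >>> # dtype=None is the default
    >>> spam = pandas.read_csv('eggs.csv',
    ...                        delimiter=' ',
    ...                        header=None)
    >>> # Iterating over a DataFrame directly yields column labels,
    >>> # so use iterrows() to visit each row
    >>> for index, row in spam.iterrows():
    ...     # Do things
    ...     pass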
``header=None`` is required if the file does not have a header.
About password strength
-----------------------
How is strength measured?
=========================
'Entropy' `s` depends on the size of the alphabet `a` and the length `n` of the
password:
Generating Data for Testing
---------------------------
.. math::

   s = \log_2(a^n) = n \log_2(a)
Generating the data on the fly with numpy is convenient.
* 0889234877724602 -> 53 bits
* ZeZJieatdH -> 60 bits
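
As a rough check of the two figures above, the formula can be evaluated
directly. This is a minimal sketch: the function name ``entropy_bits`` and the
alphabet sizes (10 digits for the first password, 62 letters and digits for the
second) are assumptions made for illustration.

.. code-block:: python

    import math

    def entropy_bits(alphabet_size, length):
        """Return s = log2(a^n) = n * log2(a) in bits."""
        return length * math.log2(alphabet_size)

    # 16 characters drawn from the 10 digits (assumed alphabet)
    digits = entropy_bits(10, 16)
    # 10 characters drawn from a-z, A-Z and 0-9 (assumed alphabet of 62)
    alnum = entropy_bits(62, 10)
    print(round(digits), round(alnum))

The estimate only holds if every character is chosen uniformly at random from
the alphabet.
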
.. code-block:: python
Why are weak passwords problematic?
===================================
Weak passwords are trivial to crack in many situations. A password with 53 bits
may be cracked by a criminal organisation in less than an hour.
>>> import numpy.random as ran
>>> # For repeatability
>>> ran.seed(7890234)
>>> # Uniform [0, 1) floats
>>> data = ran.rand(100, 2)
>>> # Uniform [0, 1) floats
>>> data = ran.rand(100, 100, 100)
>>> # Std. normal floats
>>> data = ran.randn(100)
>>> # 3x14x15 array of binomial ints with n = 100, p = 0.1
>>> data = ran.binomial(100, 0.1, (3, 14, 15))
Plotting Time Series
--------------------
Plot data of the form:
.. math:: y=f(t)
What about strong passwords?
============================
They are difficult to remember, a problem especially when you use a different
strong password for every service. You are also tempted to write them down, or
reuse them.
It's surprisingly difficult for humans to generate good passwords!
A strong password, as of 2019, has at least 80 bits of entropy.
Password Managers to the Rescue!
--------------------------------
Password managers allow you to create a unique and strong password for every
service.
Additional benefits:
* Remembers passwords for you
* Generates passwords for you
* Automagically fills in passwords on websites for you; this is important!
* Makes passwords available on all your configured devices
* Can store additional related data, such as usernames, answers to security
  questions, PINs for debit/credit cards
All mainstream password managers are equivalent in the above respects.
Can you trust password managers?
--------------------------------
Yes*
How do they keep passwords secure?
----------------------------------
1. User supplies a password
2. A slow function derives an encryption key
3. The encryption key is used to encrypt/decrypt your passwords
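
Step 2 can be sketched with the Python standard library. The slides do not
name a specific slow function, so PBKDF2-HMAC-SHA256, the iteration count, the
salt size and the helper name ``derive_key`` are all assumptions chosen for
illustration. Step 3 is left out because the standard library has no suitable
cipher.

.. code-block:: python

    import hashlib
    import os

    def derive_key(master_password, salt, iterations=200_000):
        """Derive a 32 byte key from a password with PBKDF2-HMAC-SHA256."""
        return hashlib.pbkdf2_hmac('sha256',
                                   master_password.encode('utf-8'),
                                   salt,
                                   iterations,
                                   dklen=32)

    # A random salt, stored next to the encrypted data (assumed layout)
    salt = os.urandom(16)
    key = derive_key('example master password', salt)
    print(len(key))

The high iteration count is what makes each guess slow for an attacker.
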
Security of the encryption depends on the strength of your
password:
+---------+------------------------+
| Entropy | Time to crack,         |
|         | assuming 1 second per  |
|         | attempt per typical    |
|         | CPU                    |
+=========+========================+
| 50b     | < 1 Month              |
+---------+------------------------+
| 60b     | ~ 50 Years             |
+---------+------------------------+
| 70b     | ~ 50,000 Years         |
+---------+------------------------+
Generating a Strong Password
----------------------------
Passphrases are better than passwords:
* Tr0ub4dor&3 -> 28 bits of entropy, hard to remember
* correct horse battery staple -> 44 bits of entropy, easy to remember
If you have to remember it, use a passphrase.
Generate passphrases with Diceware_
===================================
1. Roll five six-sided *physical* dice
2. Read the numbers left to right
3. Find the word with that number on a list of 6^5 (7776) words
4. Repeat until the desired length is reached. For a password manager, use at
   least 7 words.
5. Write down your passphrase on paper and keep it somewhere secure
6. If you are 100% confident that you will not forget the passphrase, destroy
   the paper by burning it
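
The lookup in step 3 and the resulting entropy can be sketched as follows. The
rolls themselves should still come from physical dice. The helper name
``rolls_to_index`` is hypothetical, and it assumes a word list sorted by its
dice key, so that 11111 is the first word and 66666 the last.

.. code-block:: python

    import math

    def rolls_to_index(rolls):
        """Map five dice rolls (each 1-6) to a word list index 0..7775."""
        if len(rolls) != 5 or any(r not in range(1, 7) for r in rolls):
            raise ValueError('expected five rolls between 1 and 6')
        index = 0
        for roll in rolls:  # read left to right, as a base 6 number
            index = index * 6 + (roll - 1)
        return index

    # Each word contributes log2(7776) bits, so 7 words give about 90 bits
    bits_for_seven_words = 7 * math.log2(6 ** 5)
    print(rolls_to_index([1, 1, 1, 1, 1]), rolls_to_index([6, 6, 6, 6, 6]))
    print(round(bits_for_seven_words, 1))
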
What about phishing?
====================
* A password manager will refuse to fill out a password on a spoofed website,
for instance faceb00k.com vs facebook.com
* Using different passwords on every service protects all other services even
if phishing is successful on one of them
* Good password managers will navigate to the login page for you, reducing the
risk of spoofed websites
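
The first point can be illustrated with an exact hostname comparison. This is
only a sketch of the idea: the ``vault`` dictionary and ``credentials_for``
helper are hypothetical, and real password managers use their own, more
elaborate matching rules.

.. code-block:: python

    from urllib.parse import urlsplit

    # Hypothetical vault: exact hostname -> stored credentials
    vault = {'facebook.com': ('mark1@gmail.com', 'stored-password')}

    def credentials_for(url):
        """Return stored credentials only for an exact HTTPS hostname match."""
        parts = urlsplit(url)
        if parts.scheme != 'https':
            return None
        return vault.get(parts.hostname)

    print(credentials_for('https://facebook.com/login') is not None)
    print(credentials_for('https://faceb00k.com/login') is None)

A human may miss the difference between the two addresses, but a string
comparison does not.
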
Subplots
--------
Other advice
Saving Plots
------------
In no particular order:
* Only log in on webpages that you navigated to by typing in the URL yourself,
  by searching on Google, DuckDuckGo or some other reputable search engine, or
  from a bookmark. If clicking a link in an email takes you to a login page,
  it's probably a phishing attempt
* Only log in to webpages that are protected by SSL/TLS (HTTPS). Look for a
  green address bar, or a green lock icon or similar in your browser
* Use two factor or two step authentication wherever possible
* Turn off automatic image rendering. Better still, disable HTML rendering and
  authoring entirely in your email client
* Be suspicious of *all* emails. Risky things: HTML email, images, unknown
  sender, poor spelling/grammar, 'Your email client can't display this email,
  click here to view in your browser' or similar attempts to coerce you to click
  on things
So far I've just displayed plots with ``plt.show()``. You can actually save
the plots from that interface manually, but when scripting, it's convenient
to do so automatically:
.. code-block:: python

    >>> # Some plotting has previously occurred
    >>> plt.savefig('eggs.pdf', dpi=300, transparent=False)
The output format is interpreted from the file extension.
The keyword arguments are optional here. Other options exist.
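
For a self-contained script, something like the following should work. It is a
sketch under a few assumptions: the sine data, the axis labels and the
temporary output directory are made up for illustration, and the ``Agg``
backend is selected so that no display is needed.

.. code-block:: python

    import os
    import tempfile

    import matplotlib
    matplotlib.use('Agg')  # render without a display
    import matplotlib.pyplot as plt
    import numpy

    # Example time series y = f(t)
    t = numpy.linspace(0, 10, 200)
    y = numpy.sin(t)

    fig, ax = plt.subplots()
    ax.plot(t, y)
    ax.set_xlabel('t')
    ax.set_ylabel('y')

    # The .png extension selects the output format
    out_path = os.path.join(tempfile.mkdtemp(), 'eggs.png')
    fig.savefig(out_path, dpi=300)
    plt.close(fig)
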
Error Bars
----------
Stacked Bar Graph
-----------------
Resources
---------
NumPy User Guide: https://docs.scipy.org/doc/numpy/user/index.html
`EFF notes on Diceware`_. The EFF generally has good advice on these kinds of
topics.
NumPy Reference: https://docs.scipy.org/doc/numpy/reference/index.html#reference
`This Presentation`_
Matplotlib example gallery: https://matplotlib.org/gallery/index.html
`Keepass`_, an offline password manager
Pandas: It probably exists. Good luck.
`1Password`_, a pay to use password manager with some nice features
`LastPass`_, an online password manager with a gratis tier
.. _Diceware: http://world.std.com/~reinhold/diceware.html
.. _EFF notes on Diceware: https://www.eff.org/dice
.. _This Presentation: https://git.friedersdorff.com/max/intro_dice_and_pmgmnt
.. _Keepass: https://keepass.info/
.. _1Password: https://1password.com/
.. _LastPass: https://www.lastpass.com/
.. target-notes::
This presentation: https://git.friedersdorff.com/max/plotting_with_matplotlib.git