With this blog finally finished moving to its new .tech domain (announcement later), it's time to write some techy stuff.
So I think I'll be doing some tutorials. And the first one is about something I like, I use, and I like using: Secure Authentication!
So, this post is about what a webmaster can do to make his website secure and at the same time not too annoying for the user.
Step 1: Password storage
This is a big thing.
First and foremost, you need to store your users' passwords in a secure way, so that even you have no way of getting at the passwords in the database. In general this means that you need to hash the password (which basically means creating an overly complicated checksum of it).
For hashing there's like a metric ton of different algorithms available: MD5, SHA1, SHA256, bcrypt and MANY more. So which are good?
In general it's best to use an algorithm made for password hashing, which usually also means that it has a work factor and is made to be SLOW. Yes, you read that right, not fast but SLOW, and I mean it. On systems where only a single user is active, you can make hashing a password take a few seconds. On multi-user systems like web servers (where any client can be a user who requests to log in) you don't have that luxury.
Why? Simple. The longer it takes to try a password against a hash, the slower any attempt at bruteforcing the damn thing will be in total, a LOT slower. If you are the real user, waiting a split second isn't a lot, but for someone trying out a literal ton of passwords those split seconds add up.
Someone benchmarked MD5 (which sadly is still used way more often than it should be [pretty much each time it is used is 10 times too many]) and got a result of 200 gigahashes per second on 8 Nvidia GTX 1080s, which basically means that in one second someone can check 200 BILLION passwords for whether they are correct or not. And while 8 big GPUs are not really cheap, for someone who really wants those passwords this is nothing. Algorithms made to be slow can be accelerated too, but definitely not by such an extreme factor without going into extreme costs like supercomputers.
That benchmark also has numbers for bcrypt, which are interesting because they give us a good comparison. We need to note, though, that the benchmark uses bcrypt with a work factor of 5, which is even less than the old default of 6. More common today is 11, or 10 on more heavily used systems, and since the number is an exponent of 2, a difference of 6 means that we have to divide the hash rate by 2^6 = 64. The benchmark states 105.7 kilohashes per second for bcrypt at work factor 5. Divided by 64, this is about 1.65 kilohashes per second, meaning we go from over 200 BILLION hashes per second down to less than 2 thousand. That is roughly 121 million times slower, which means that what takes one second with MD5 takes 200 WEEKS (a bit less than 4 years) with bcrypt, and MD5 definitely isn't getting slower in the future.
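To make the arithmetic easy to re-check, here is a small Python sketch of the comparison. It only uses the rounded figures quoted in the text (200 GH/s for MD5, 105.7 kH/s for bcrypt at work factor 5), so the last digits can differ a little from a calculation with unrounded benchmark numbers. The function names are just made up for this example.

```python
# Rounded figures quoted in the text above.
MD5_HASHES_PER_SECOND = 200e9            # 200 GH/s on 8 GPUs
BCRYPT_COST5_HASHES_PER_SECOND = 105.7e3  # 105.7 kH/s at work factor 5
BENCHMARK_COST = 5
SECONDS_PER_WEEK = 7 * 24 * 3600


def bcrypt_rate_at(cost: int) -> float:
    """Estimated bcrypt hashes per second at another work factor.

    Each step of the work factor doubles the work, so the rate halves.
    """
    return BCRYPT_COST5_HASHES_PER_SECOND / 2 ** (cost - BENCHMARK_COST)


def slowdown_vs_md5(cost: int) -> float:
    """How many times slower bcrypt at this work factor is than MD5."""
    return MD5_HASHES_PER_SECOND / bcrypt_rate_at(cost)


def weeks_per_md5_second(cost: int) -> float:
    """Weeks bcrypt needs for the guesses MD5 gets through in one second."""
    return slowdown_vs_md5(cost) / SECONDS_PER_WEEK


if __name__ == "__main__":
    cost = 11
    print(f"bcrypt rate at cost {cost}: {bcrypt_rate_at(cost):.1f} hashes/s")
    print(f"slowdown vs MD5: {slowdown_vs_md5(cost):,.0f}x")
    print(f"one MD5-second of guessing takes {weeks_per_md5_second(cost):.1f} weeks")
```

Changing `cost` shows how quickly the gap grows: every extra step of the work factor doubles the attacker's time.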
A common choice is bcrypt because it is so widely available in languages like PHP. It's generally not bad, and its work factor can probably still be raised for quite a while, but it has one problem: it's only computation-intensive and doesn't need a lot of fast memory, meaning that you can throw together a machine with enough GPUs and have your fun that way, as stated above. That's still a lot better than traditional hashes, but we can do even better.
Memory-hard hashes. scrypt has become pretty well known for this, and while it wasn't made for password storage (it's actually a password-based key derivation function), it isn't bad, though it does have its problems. Aside from scrypt, recently (it was a few years ago, but historically speaking that is recent) there was a competition for password hashing algorithms. The winner, Argon2, is simple to make memory-hard, and that specifically makes it a lot harder to attack with ASICs or GPUs, because RAM is pretty expensive compared to raw computation power like GPUs, which apparently work pretty well for normal hashes.
And it gets better. With version 7.2, PHP's password_hash() function adds Argon2. Sadly it's just Argon2i, which is vulnerable to some time-memory tradeoffs (which basically just means that you save memory by throwing more computation power at the problem). As far as I have read it isn't too bad yet, but Argon2id would have made a nice addition, as there are fewer tradeoff possibilities there.
But as good as the hashes can be, there's one thing you should keep in mind: the cost of the hash. Obviously you don't want to set it low, because the point of password hashes is to be SLOW and MASSIVE, but on the other hand you don't want to overdo it, because you don't want your users to wait or to expose your server to some heavy DDoS.
Especially on shared web hosts finding a good line is a problem, because they usually allocate resources dynamically, and if no one else is around at the time, you end up using the resources of others, so what you measure isn't what you will have when the machine is busy. Also, some providers of shared servers apparently don't like it if you fully use your allotted resources all the time, even though those are precisely the resources they rent to you. In such cases it might be helpful to ask your provider what they think would be the best cost settings, and if they don't want you to use even half-decent settings, better switch to a provider who handles that better.
On your own server (or a rented dedicated one), or on full root virtual servers that don't have any "boosts" in the plan, finding the line is a lot easier because you can't dip into the resources of others. And especially on dedicated servers there can't be any overprovisioning: you rented the full machine, so any stupid ideas of putting more v-servers on a host than it could handle if all users really used them are out, because you are completely alone on that machine.
This means you can literally go all-out and throw together a benchmark script to see how far we can go.
But before that: on PHP and possibly other environments that can enforce a per-execution memory limit, you might want to bump the limit to as high as you are comfortable with, while still keeping at least 5 users on a small site (more on a larger site) able to log in at the same time.
Pro tip: if it is a website where only you and maybe a few other people, who all have no problem with 2FA, can log in, you could require 2FA at the same time as or before the password input. That lowers the possibility of anyone attacking your password space, meaning you can go for more memory.
After that I would loop the Argon2 hashing and verification functions, starting at 16 MB and going up to whatever you are comfortable with, at a time modifier of 1 and on all the threads you can spare, doubling the memory each time. For normal user authentication I would stop when hashing and verification each take 0.1 to 0.2 seconds. If you reach your maximum memory amount before reaching the time threshold, raise the time modifier one by one until you reach that time. It's also very important that you run this test in the same environment (e.g. PHP on Apache) on the same or a similar machine as the one you will actually be using, otherwise your results will be skewed, which you don't want.
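The loop described above could look roughly like this in Python. This is only a sketch under a few assumptions: `calibrate` and its parameters are names I made up, `hash_fn` is a placeholder for whatever Argon2 binding you actually use (it gets the memory in KiB and the time modifier), and `scrypt_stand_in` is NOT Argon2, it just swaps in the standard library's scrypt so the script runs without extra packages.

```python
import hashlib
import time
from typing import Callable, Tuple


def calibrate(
    hash_fn: Callable[[int, int], None],
    target_seconds: float = 0.1,
    start_kib: int = 16 * 1024,      # 16 MB as a small start
    max_kib: int = 256 * 1024,       # assumption: the most memory you can spare
    max_time_cost: int = 10,         # assumption: safety stop for the loop
    timer: Callable[[], float] = time.perf_counter,
) -> Tuple[int, int, float]:
    """Double the memory until the target time is hit; once the memory
    ceiling is reached, raise the time modifier one by one instead.

    Returns (memory_kib, time_cost, seconds_for_last_run).
    """
    memory_kib, time_cost = start_kib, 1
    while True:
        started = timer()
        hash_fn(memory_kib, time_cost)
        elapsed = timer() - started
        if elapsed >= target_seconds:
            return memory_kib, time_cost, elapsed
        if memory_kib * 2 <= max_kib:
            memory_kib *= 2
        elif time_cost < max_time_cost:
            time_cost += 1
        else:
            # Both limits reached without hitting the target time.
            return memory_kib, time_cost, elapsed


def scrypt_stand_in(memory_kib: int, time_cost: int) -> None:
    """Stand-in workload, NOT Argon2.

    With r=8 scrypt needs about 128 * n * r bytes, so n equals the KiB
    figure. scrypt has no time modifier, so the call is simply repeated.
    """
    for _ in range(time_cost):
        hashlib.scrypt(
            b"benchmark password",
            salt=b"0123456789abcdef",
            n=memory_kib,
            r=8,
            p=1,
            maxmem=2 * 1024 * memory_kib,
        )
```

Calling `calibrate(scrypt_stand_in)` gives a first feel for the machine, but for real settings pass a function that runs your actual hashing and verification calls, and time both.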
TL;DR: use Argon2 if you can, otherwise go with bcrypt. In any case, use as many resources as you can without making yourself vulnerable to DoS.
Step 2: Session Security
This thing is HUGE. Even if your passwords are safely stored, it'll be bad if an attacker can just go and forge a session. That's why we need some way to secure the session.
There are many ways of doing that:
- Using a random cookie and keeping the session data on the server. Nice in general, but if an attacker finds out that random secret, it's over until the session ends.
- Giving the client a signed (or otherwise tamper-resistant) cookie and fully relying on that. Generally not bad, but you still need a way to revoke those.
- Using a tamper-resistant cookie with server-side data. NOW we are talking.
Personally I like the concept of JWTs, because they are not really hard to use if you have JSON, base64url (or base64 and a character replace command), and HMAC. The point is that you basically store your data as JSON, encode it in base64url, and add a header and a signature over everything, which gives you one LONG string that contains everything.
Basically the signature makes sure that the cookie is something you gave out and that it hasn't been tampered with. This is pretty important, because it lowers the chances of anyone bruteforcing their way in DRASTICALLY.
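Here is a minimal sketch of such a token in Python, using only the standard library. Assumptions on my side: the function names are made up, I picked HMAC-SHA256 (the "HS256" algorithm of the JWT standard) as the signature, and the only claim that gets checked is the expiry time `exp`. It is meant to show the format, not to replace a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def b64url_encode(raw: bytes) -> str:
    """base64url without the trailing '=' padding, as JWTs use it."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Re-add the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _signature(signing_input: str, key: bytes) -> bytes:
    return hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()


def sign_token(payload: dict, key: bytes) -> str:
    """Build header.payload.signature, each part base64url-encoded."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return signing_input + "." + b64url_encode(_signature(signing_input, key))


def verify_token(token: str, key: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the payload if signature and expiry check out, else None."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        expected = _signature(header_b64 + "." + payload_b64, key)
        # Constant-time comparison, checked BEFORE trusting any content.
        if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
            return None
        header = json.loads(b64url_decode(header_b64))
        payload = json.loads(b64url_decode(payload_b64))
    except ValueError:
        return None  # wrong number of parts, bad base64 or bad JSON
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return None
    if not isinstance(payload, dict):
        return None
    exp = payload.get("exp")
    if exp is not None and now >= exp:
        return None  # token has expired
    return payload
```

Note that the signature only proves the token came from you unchanged. The payload itself is readable by anyone who holds the cookie, so nothing secret should go in there unencrypted.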
But a nice token isn't everything. You obviously need to make sure that these tokens can be invalidated before they expire. For example, if a user logs out, throwing the token into oblivion on the client isn't enough: the server needs to know that the token is dead and reject it.
One way of doing this is keeping a database with session data and checking the token against your DB. Another thing you could do, especially for long-lived sessions like a user having the "remember me" function active, is to "ratchet" the cookies from time to time, meaning that after a certain time has passed you swap the old cookie for a new one, so a cookie, even if stolen, becomes useless after a while. The workings of this are fairly simple. First, you obviously need to define how often you swap the cookie. Then, as soon as the ratchet time has come and the user does an action, you replace something that identifies the user's session with something else and keep that synchronized on both server and client.
For example, in one of my PHP auth scripts I have both the standard numeric session ID for quickly identifying a session, and an encrypted "session secret" inside my JWTs which has to match the hashed session secret in the database. In this case it would be enough to swap out the session secret, which makes sure that old session secrets don't work anymore. And I could even take this one step further: if someone tries to access a session with an old but correctly signed cookie, I can kill the session and warn the user that the session has potentially been stolen. Then, no matter whether the attacker has the recent or the old session cookie, they won't get in, because the session doesn't exist anymore.
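A rough Python sketch of that ratchet idea follows. It is not my PHP script, just an illustration with made-up names: `SessionStore`, `RATCHET_SECONDS` and the status strings are assumptions, the "database" is a plain dict, and the part where the secret travels inside a signed token is left out.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional, Tuple

RATCHET_SECONDS = 15 * 60  # assumption: swap the secret every 15 minutes


def _digest(secret: str) -> str:
    """Only the hash of the session secret is kept on the server."""
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()


class SessionStore:
    """In-memory stand-in for the session table in the database."""

    def __init__(self) -> None:
        self._sessions = {}

    def create(self, session_id: int, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        secret = secrets.token_urlsafe(32)
        self._sessions[session_id] = {
            "current": _digest(secret),
            "retired": set(),   # hashes of secrets that were ratcheted away
            "rotated_at": now,
        }
        return secret  # goes to the client inside the signed cookie

    def check(
        self, session_id: int, secret: str, now: Optional[float] = None
    ) -> Tuple[str, Optional[str]]:
        """Return (status, new_secret).

        status is "ok", "rotated" (send new_secret to the client),
        "stolen" (an old secret was replayed, session killed) or "invalid".
        """
        now = time.time() if now is None else now
        session = self._sessions.get(session_id)
        if session is None:
            return "invalid", None
        presented = _digest(secret)
        if hmac.compare_digest(presented, session["current"]):
            if now - session["rotated_at"] < RATCHET_SECONDS:
                return "ok", None
            new_secret = secrets.token_urlsafe(32)
            session["retired"].add(session["current"])
            session["current"] = _digest(new_secret)
            session["rotated_at"] = now
            return "rotated", new_secret
        if presented in session["retired"]:
            # Someone holds an outdated cookie: kill the whole session.
            del self._sessions[session_id]
            return "stolen", None
        return "invalid", None
```

One thing a real implementation has to think about: a browser can have several requests in flight while the swap happens, so you may want a short grace period before an old secret counts as stolen.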
Step 3: Secure Authentication for Users
I already touched on this topic a bit in step one, but one thing that REALLY helps with securing your site is giving the users tools to secure themselves. Especially people who like security will thank you for that.
So, what can you do? First and foremost, do something about 2-factor authentication. This is a simple yet very effective way to make your site a lot more secure against people who use keyloggers or other password stealers.
There are 2 things a website should implement at minimum for 2FA.
The first is the good old TOTP standard, a pretty interesting yet simple way to secure your website. It isn't even THAT hard to implement, and many security-aware users know it.
It basically works by using a shared secret (transferred in Base32, but used in binary as the key) and the Unix time, counted in time steps (usually 30 seconds) and written as a 64-bit value, as the input for an HMAC-SHA1 (SHA256 and SHA512 are also in the standard, but implementations usually lack these). Then there's a bit of figuring out which part of the hash to actually use for the crazy math that follows, and then you get your (usually 6-digit) number which you can enter. There are implementations for pretty much anything on this planet, and even if there isn't one, it isn't too hard to implement yourself, especially if you take a look at implementations in other languages.
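To show that the "crazy math" is really only a few lines, here is a sketch of TOTP in Python with nothing but the standard library. The function names and the `window` parameter are my own choices. The defaults (30-second steps, 6 digits, HMAC-SHA1) are the usual ones from the standard.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret_b32: str, unix_time: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Compute the TOTP code for a Base32 secret at the given Unix time."""
    cleaned = secret_b32.replace(" ", "").upper()
    # Base32 secrets are often handed out without padding, so re-add it.
    key = base64.b32decode(cleaned + "=" * (-len(cleaned) % 8))
    now = time.time() if unix_time is None else unix_time
    counter = int(now) // step                 # number of elapsed time steps
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                    # low 4 bits pick the window
    number = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10 ** digits).zfill(digits)


def verify_totp(secret_b32: str, code: str, unix_time: Optional[float] = None,
                step: int = 30, window: int = 1) -> bool:
    """Accept the code for the current step and `window` steps around it."""
    now = time.time() if unix_time is None else unix_time
    for drift in range(-window, window + 1):
        expected = totp(secret_b32, now + drift * step, step, len(code))
        if hmac.compare_digest(expected, code):
            return True
    return False
```

With the test secret from the TOTP RFC (the ASCII string `12345678901234567890`, which is `GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ` in Base32), `totp(secret, 59)` gives `287082`. On a real site you would also remember the last accepted time step per user, so that a code can't be used twice.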
The second one is U2F, which I also tend to call "foolproof smartcard", because that's basically what it is. If your browser supports U2F, you just plug the thing in, perhaps press some kind of button on the stick, and you are logged in.
This thing is a lot more secure than your average TOTP since it is a smartcard at heart, which means copying the keys from it is virtually impossible without destroying the stick, and that alerts the user because the stick is then lost or broken. And since it requires interaction on the hardware level (really cheap sticks have to be pulled out and plugged in again to be used again, and better ones have some kind of button to activate the U2F function), these things are literally malware-proof, unless the malware catches you in the middle of an authentication. That definitely isn't the easiest thing to do, and since there are easier targets, it is probably safe to assume that nobody will be trying stuff like this soon.
On the other hand, because of this hardware interaction it isn't a problem if you just keep your stick plugged in, unlike some tokens which automatically act upon request. This makes U2F one of the safest things you can offer to your users, and since there are no shared secrets, a server breach can at best be used for attacks that replace a device or add a new one, but that is something the user can be informed about and therefore notice.
Step 4: HTTPS
Why does this come so late? Well, building security begins with building the website, and HTTPS is something you do when the site is finished.
Anyway, HTTPS is one of THE MOST IMPORTANT things you can give to your website. Basically it secures all the traffic and makes sure that there's no fool in the middle trying out some bad ideas. Especially when transmitting passwords and other potentially sensitive data this is HUGELY IMPORTANT.
Long story short, HTTPS is encryption, aka confidentiality, meaning that anything the user sends to you can be READ by you and ONLY BY YOU (or rather, your server), plus authentication and integrity, meaning that what the user sees was truly SENT BY YOU (or rather, your server) and HASN'T BEEN TAMPERED WITH.
Depending on what you do on your website, you'll want to go for one of 3 certificate types:
- Domain Validation (DV), simple and cheap
- Organization/Identity Validation (OV/IV), not so cheap, needs some validation
- Extended Validation (EV), expensive, needs A LOT of validation
All certificates give the same advantages listed above, but they each answer one simple yet important question differently:
"Who is You?"
In the case of DV, the identity in the certificate is essentially just the domain name. For example, in the case of this blog it's just "blog.my1.tech". The certificate doesn't contain any information about who I am in real life, just the info that I am whoever has control of this domain. This type of identity is the simplest to validate, and therefore these certs are cheap and quickly made. They can also be obtained for free from services like Let's Encrypt.
But, depending on what you do, this is enough. In the case of a simple blog like mine, I need a safe login, and I can make sure that, for example, the email addresses of people who comment aren't seen in transit and that a comment isn't changed along the way.
Next we have IV and OV, which are a more or less basic validation of the real-life entity behind the website. This obviously includes DV, since the cert is also limited to specific domains, but it additionally has identity information stored.
Sadly, the average user won't see any difference between an OV/IV and a DV cert, but you can see it by viewing the certificate info. For example, on amazon.de you can see that the cert was issued to Amazon.com, Inc. in the USA. For services like Amazon, which are primarily known by their domain name anyway, this doesn't add much. But for things that are better known in real life, like the store chain Saturn, it makes a difference, because you can clearly tell that the website is truly operated by them and not by somebody else who got hold of the name. Since this kind of cert needs some validation of your identity or of the organization controlling the website, it takes a while to get one, and they are more expensive than DVs.
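If you would rather look at the certificate info from a script than click through the browser dialog, something like the following Python sketch works with the standard library. The function names are made up, and the rule "organization name present means more than DV" is only a rough heuristic of mine, not an official test: it tells you what the cert claims, not how thoroughly the CA checked it.

```python
import socket
import ssl


def fetch_certificate(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect via TLS and return the validated server certificate as a dict."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def subject_fields(cert: dict) -> dict:
    """Flatten the nested 'subject' tuples of getpeercert() into a dict."""
    fields = {}
    for rdn in cert.get("subject", ()):
        for name, value in rdn:
            fields[name] = value
    return fields


def describe_identity(cert: dict) -> str:
    """Answer "who is this cert issued to?" in one line (heuristic)."""
    fields = subject_fields(cert)
    organization = fields.get("organizationName")
    if organization is None:
        # No organization in the subject: the identity is just the domain.
        return "domain only: " + fields.get("commonName", "?")
    return "{} ({})".format(organization, fields.get("countryName", "?"))
```

For example, `describe_identity(fetch_certificate("blog.my1.tech"))` would need a network connection and should just name the domain for a DV cert, while a site with an OV cert shows the organization and country.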
The last and most expensive type of certificate is the Extended Validation one. I honestly don't see the point of those, because the requirements for EVs explicitly exclude pretty much everything the CAs use to compare EVs against the other cert types. EVs specifically aren't meant to prove that the business is active, legal, trustworthy or safe in any way, so aside from the green bar that most browsers will show and the fact that browsers are a bit more diligent when checking revocation, there doesn't seem to be much technical benefit to EVs.
I would say if you are a bank or a similar organization, go for EV just for the green bar, because users will (sadly) assume that the site is safer. At the very least phishing gets harder because of the green bar (at least unless you are using Internet Explorer), because the CAs that can get the green bar are compiled into the browser, and the absence of the green bar should be a pretty strong sign.
Otherwise, if you are a small non-commercial site like a forum or a blog like this one, just go with DV and use a free cert. Contrary to what CAs like to tell you, there's no difference between a Let's Encrypt cert and, for example, a Comodo DV cert, except that maybe some archaic browsers don't trust it. But as a rule of thumb, if a browser doesn't trust Let's Encrypt via its cross-signing with IdenTrust, don't use that browser.
Let's Encrypt as of now runs over an intermediate certificate signed by IdenTrust, whose root certificate has, according to the certificate itself, been active since 2000. And while it does take time to gain trust, aside from browsers in game consoles, which are badly maintained and probably pretty old anyway, there's no half-modern computer system or browser which doesn't have IdenTrust in its trusted CA list.
Back to the topic of which cert to use. If you run a shop or something similar, I would check whether an OV is possible. If yes, go for that, otherwise go for DV so that you at least have something, and get an OV later.
Well, that's it for now. If you have more ideas, just leave a comment.
I hope you'll be back to read more!