> Another important file is _users.csv, which contains user credentials and roles. It has the same format as other resources, but with a special _users collection name. There is no way to add new users via the API; they must be created manually by editing this file:
> Here we have the user ID (which is the user name), a version number (always 1), a salt for password hashing, and the password itself (hashed with SHA-256 and encoded as Base32). The last column is a list of roles assigned to the user.
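For illustration, a row matching that description could be produced with something like the Go sketch below. The values are made up, and the salt+password concatenation order, the Base32 variant, and the role column format are guesses rather than what pennybase necessarily does:

```go
// Purely illustrative: build a _users.csv row following the description above.
// Assumes the hash is SHA-256 over salt+password and standard Base32 encoding.
package main

import (
	"crypto/sha256"
	"encoding/base32"
	"fmt"
)

func main() {
	id, version, salt, password, roles := "alice", "1", "random-salt", "packers123", "admin"
	sum := sha256.Sum256([]byte(salt + password)) // concatenation order is a guess
	hash := base32.StdEncoding.EncodeToString(sum[:])
	fmt.Printf("%s,%s,%s,%s,%s\n", id, version, salt, hash, roles)
	// e.g. alice,1,random-salt,<56-character Base32 string>,admin
}
```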
I haven't had to handle password hashing in like a decade (thanks SSO), but isn't fast hashing like SHA-256 bad for it? Bcrypt was the standard last I did it. Or is this just an example and not what is actually used in the code?
Like others have guessed, I limited myself to what the Go stdlib offers. Since it's a personal/educational project, I only wanted to play around with this sort of architecture (similar to the k8s apiserver and various popular BaaSes). It was never meant to run outside of my localhost, so password security or the choice of database was never a concern -- whatever is in the stdlib and is "good enough" would work.
I also tried to make it a bit more flexible: to use `bcrypt` one can provide their own `pennybase.HashPasswd` function, and to use SQLite one can implement the five methods of the `pennybase.DB` interface. It's not perfect, but at around 700 lines of code it should be possible to customise any part of it without much cognitive difficulty.
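As a rough sketch of what such a replacement could look like: the code below uses Argon2id from golang.org/x/crypto rather than bcrypt, since bcrypt generates and embeds its own salt and doesn't fit a (password, salt) shape as cleanly. The assumed `func(password, salt string) string` signature for `pennybase.HashPasswd` is a guess, not the documented API -- check the pennybase source before wiring anything up:

```go
// Sketch of a stronger, deterministic hash that still takes an external salt.
package main

import (
	"encoding/base32"
	"fmt"

	"golang.org/x/crypto/argon2"
)

func argonHash(password, salt string) string {
	// Parameters follow the argon2 package's recommended defaults for interactive logins.
	key := argon2.IDKey([]byte(password), []byte(salt), 1, 64*1024, 4, 32)
	return base32.StdEncoding.EncodeToString(key)
}

func main() {
	fmt.Println(argonHash("packers123", "random-salt"))
	// pennybase.HashPasswd = argonHash // hypothetical override point, signature assumed
}
```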
Golang does not have built-in SQLite. It has a SQL database abstraction in the stdlib, but you must supply a SQLite driver yourself, for example one of the drivers benchmarked here: https://github.com/cvilsmeier/go-sqlite-bench
However, using the stdlib abstraction adds a lot of performance overhead, although it'll still be competitive with CSV files.
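For reference, wiring that up looks roughly like the sketch below. modernc.org/sqlite is used here as a pure-Go driver (mattn/go-sqlite3 is the common cgo alternative); neither is part of, or endorsed by, the project:

```go
// Minimal database/sql usage with a third-party SQLite driver.
package main

import (
	"database/sql"
	"log"

	_ "modernc.org/sqlite" // registers the "sqlite" driver with database/sql
)

func main() {
	db, err := sql.Open("sqlite", "file:pennybase.db") // file name chosen for the example
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if _, err := db.Exec(`CREATE TABLE IF NOT EXISTS users (id TEXT PRIMARY KEY, role TEXT)`); err != nil {
		log.Fatal(err)
	}
}
```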
Well, the project goal seems to be extreme minimalism and stdlib only, and the choice of human-readable data stores and a manually edited user list suggests the goal is to need only `vim` and `sha256sum` for administration.
Fast hashing is only a concern if your database becomes compromised and your users are incapable of using unique passwords on different sites. Making the hashing take forever is entirely about protecting users from themselves in an offline attack scenario. You are burning your own CPU time on their behalf.
In an online attack context, it is trivial to prevent an attacker from cranking through a billion attempts per second and/or to make the hashing operation appear to take a constant amount of time.
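As a rough illustration of that point, a per-user throttle on login attempts needs nothing beyond the standard library. This is a sketch of the general idea, not code from pennybase:

```go
// Tiny fixed-window throttle: cap login attempts per user within a time window.
package main

import (
	"sync"
	"time"
)

type throttle struct {
	mu    sync.Mutex
	seen  map[string][]time.Time
	limit int
	win   time.Duration
}

func newThrottle(limit int, win time.Duration) *throttle {
	return &throttle{seen: map[string][]time.Time{}, limit: limit, win: win}
}

// Allow reports whether another attempt for this user is within the limit.
func (t *throttle) Allow(user string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	now := time.Now()
	recent := t.seen[user][:0]
	for _, ts := range t.seen[user] {
		if now.Sub(ts) < t.win { // keep only attempts inside the window
			recent = append(recent, ts)
		}
	}
	if len(recent) >= t.limit {
		t.seen[user] = recent
		return false
	}
	t.seen[user] = append(recent, now)
	return true
}

func main() {
	lim := newThrottle(5, time.Minute) // at most 5 attempts per user per minute
	_ = lim.Allow("alice")
}
```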
Ok, but if the passwords are stored as fast SHA-256 hashes and the server is compromised, how do API keys prevent users who use “packers123” for every site from having their passwords exposed?
I think the more interesting conversation goes like:
How many CPU seconds should I burn for every user's login attempt to compensate for the remote possibility that someone steals the user database? Are we planning to have the database stolen?
Even if you spin for 30 minutes per attempt, someone with more hardware and determination than your enterprise could eventually crack every hash. How much money is it worth to play with a two-layer cake of unknowns?
Has anyone considered the global carbon footprint of all this bitcoin-style mining for passwords? How many tons of CO2 should be emitted for something that will probably never happen? This is like running the diesel generators 24/7/365 in anticipation of an outage because you couldn’t be bothered to pay for a UPS.