Between Christine Lemmer-Webber (nprofile…glmr)'s https://dustycloud.org/blog/how-decentralized-is-bluesky/ and bryan newbold (nprofile…8v5d)'s response https://whtwnd.com/bnewbold.net/3lbvbtqrg5t2t, I am still struggling to wrap my head around the storage bits.
Not for lack of grokking storage (full disclosure: I used to consult for iXsystems, the company behind the likes of TrueNAS [previously FreeNAS]).
But more like:
"our out-of-box relay implementation (bigsky) requires on the order of 16 TBytes of fast NVMe disk, and that will grow proportional to content in the network."
Really?
Why?
I am also kind of confused about why the storage rhetoric therein seems to default to hosting providers and their pricing (ranging from $414.20/month to $55k/year on their estimates, YIKES!). DevOps anti-pattern. ;( We need more than the 37signals folks decrying why the "cloud" is bad (I mean, there are others who have been saying as much, but I guess they get even less attention?).
Today, in late November 2024: a Western Digital 24TB "Gold Enterprise" yadda yadda hard drive costs around $569 on eBay (not factoring in taxes, shipping and handling, etc.). Sure, that is "spinning rust" and not as fast as a RAID0 (never do this, please!) or even (my preference) a RAIDz3 (ideally with some hot spares) of fast NVMe drives.
But y'know, OpenZFS exists. For quite a while now, folks have benefited from ZFS cache and log vdevs (yes, such things are older than the project being renamed OpenZFS) and hybrid pools where a bunch of slower, more economical hard drives are fronted by a fast flash-based ZIL (on a SLOG) or L2ARC.
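To put some rough numbers on that, here is a back-of-envelope sketch in Python; the vdev width and spare count are arbitrary choices of mine, and only the $569-per-24TB-drive price and the hosted quotes come from the figures above:

```python
# Back-of-envelope: a hypothetical RAIDz3 pool built from the 24TB drives above.
# Vdev width and spare count are arbitrary; tweak to taste.

DRIVE_TB = 24          # WD "Gold Enterprise" capacity from above
DRIVE_COST = 569       # eBay price from above, USD
VDEV_WIDTH = 12        # drives per RAIDz3 vdev (my arbitrary choice)
PARITY = 3             # RAIDz3 = 3 parity drives per vdev
HOT_SPARES = 2         # also arbitrary

usable_tb = (VDEV_WIDTH - PARITY) * DRIVE_TB   # ignores metadata/slop overhead
total_drives = VDEV_WIDTH + HOT_SPARES
drive_cost = total_drives * DRIVE_COST

hosted_low = 414.20 * 12   # the $414.20/month quote, annualized
hosted_high = 55_000       # the $55k/year quote

print(f"{usable_tb} TB usable from {total_drives} drives: ${drive_cost:,}")
print(f"hosted estimates: ${hosted_low:,.2f}/yr to ${hosted_high:,}/yr")
```

Hundreds of terabytes of redundant (if spinning) capacity for roughly the price of a couple of months at the high end of those hosted quotes.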
Even if you ignore that, I cannot, in good conscience, think of ANY reason why ANY person or organization (unless you are the organization selling such services?) should ever waste so much money on hosting-provider storage. $55k/year? For reference, in 2020 Nimbus was supposedly selling 100TB SSDs for $40k MSRP, and by 2022 they were saying 200TB SSDs were on the way; I am guessing their pricing is lower now even if their densities aren't, and IIRC their speed wasn't on par with Liqid/"Honey Badger" type products and whatnot. Western Digital sells 8TB NVMe SSDs (e.g. the SN850X, priced around $600 USD checking the usual suspects of online and brick-and-mortar retailers), and Samsung's recently announced PM9E1 will apparently come in 4TB densities and will probably be competitive in pricing too.
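Or, sized to the 16TB figure itself (again a rough sketch: the mirrored layout is my own arbitrary choice, and this ignores the chassis, CPU, RAM, and colocation that the rest of this post gets to):

```python
# Back-of-envelope: covering the stated 16TB fast-NVMe requirement with retail
# 8TB NVMe drives (~$600 each, per above) vs. the hosted quotes cited earlier.

NVME_TB = 8
NVME_COST = 600
REQUIRED_TB = 16

drives_for_capacity = -(-REQUIRED_TB // NVME_TB)   # ceiling division: 2 drives
drives_with_mirror = drives_for_capacity * 2       # naive redundancy: mirror everything

diy_cost = drives_with_mirror * NVME_COST
hosted_monthly_low = 414.20
hosted_yearly_high = 55_000

print(f"DIY drives (mirrored): ${diy_cost:,}")
print(f"months of the low hosted quote to match: {diy_cost / hosted_monthly_low:.1f}")
print(f"fraction of the $55k/yr quote: {diy_cost / hosted_yearly_high:.1%}")
```

The drives alone pay for themselves against the cheapest hosted quote in about half a year, and are a rounding error against the $55k one.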
Admittedly, even back in 2013/2014 I was administering a TrueNAS HA CARP pair with 384TB of raw disk, and I don't think it was insanely expensive or slow (quite performant, really), for that matter.
Circa 2020, SAP tried to headhunt me and told me of one of their systems with 48TB of RAM. That seemed kind of evil to me (what good applications on Earth require that much RAM? Even though, yes, some supercomputers have even more RAM than that), but their job application form wouldn't even accept UTF-8/Unicode characters, which seemed far worse. I never relish dragging tech-debt-laden companies kicking and screaming to the distant past (UTF-8 is from 1992 and is 32 years old now).
A Samsung 256GB DDR5 registered DIMM (288-pin, 4800MHz / PC5-38400) runs around $4k, but I see 1TB kits (4 x 256GB DIMMs, with ECC no less) from other vendors for around $5k; and there are claims that 1TB RAM sticks exist (though I cannot seem to find manufacturer part numbers nor sale prices on such things when I look a little).
Sure, decent server motherboards and chassis and power supplies (redundant, hot-swappable, for the love of all that is good) cost some, but you could build a HONKIN server (or two) if you had a $55k/year budget, rack it with some decent uplink connectivity in a data center (or two or more, maybe), and skip the overhead of a hosting provider without too much trouble, I think.
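Something like this is what I have in mind; it is purely hypothetical, only the RAM-kit and drive prices echo the figures above, and every other line item is an assumed ballpark rather than a quote:

```python
# Purely hypothetical single-server budget against the $55k/yr hosted figure.
# Only the RAM kit (~$5k/TB) and drive prices come from the figures above;
# every other line item is an assumed ballpark, not a quote.

parts = {
    "dual-socket board + CPUs (assumed)": 6_000,
    "1TB ECC DDR5 kit (4 x 256GB, per above)": 5_000,
    "chassis + redundant hot-swap PSUs (assumed)": 2_500,
    "4 x 8TB NVMe @ ~$600 (per above)": 2_400,
    "14 x 24TB HDD @ $569 (per above)": 7_966,
    "NICs, HBAs, cables, rails (assumed)": 2_000,
}
colo_per_month = 500   # assumed colocation + uplink, per month

build = sum(parts.values())
year_one = build + 12 * colo_per_month
print(f"build: ${build:,}  year one incl. colo: ${year_one:,}  vs. $55,000/yr hosted")
```

Even with generous padding on every assumed line, year one lands well under the hosted estimate, and years two onward are just the colocation bill.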
But it's more like: Why does Bluesky need 16TB of fast NVMe?
If it does: it is not nearly as expensive as hosting providers charge if you have a bit of a DIY angle (and I guess I am old enough that if you wrote code, you probably built hardware before you wrote software, so putting together a server should be a piece of cake. Just look at Linus Tech Tips: he does hardware builds all the time and never even graduated college! Albeit, it was only within the last year that they caught up to systems with specs comparable to ones I was administering 9 years ago).
Does it need to be that fast? If speed is really a bigger issue than storage density, as that claim seems to imply, are NVMe drives going to be fast enough? Maybe you need to be building a server with gobs of RAM? Even if you do: you can build servers with gobs of RAM for a lot less than they used to cost.
There is, of course, the software design malady of why Bluesky currently needs 16TB of fast storage and why that will continue to grow; but obviously, having worked for a company which had 384TB of storage in just two devices (and we had much more than those two devices), I am not unaccustomed to never-ending growth at scale being a design criterion. Some (relatively few) organizations have petabytes of growing data; some evil organizations even have exabytes of growing data. Maybe some (most certainly insidious, e.g. the NSA) have more? Generally speaking, the more never-endingly growing data an organization has, the more likely it is pure evil and evilnomics are afoot (think: the mercenary industrial complex, or for a more biological analogy, cancerous cells metastasizing).
This is why so many (to date: ALMOST ALL) so-called "social" networks have been complicit in surveillance capitalism: to "scale" the data from all their users without charging those users for storing it "indefinitely", they find other ways to monetize their user bases.
That design criterion is not mandatory. It's probably entirely unwise! NNTP (which predates Google/Alphabet Inc. and "Google Groups" by at least a decade, to the point where Bryan mentioning them in the same sentence struck me as repugnant and offensive, particularly since it ignored that DejaNews was an early Google acquisition that they unceremoniously ruined) did not have the criterion of storing data indefinitely. Sure, that was a constraint, but it was also smart. It was not a defect in the protocol design.
Warez folks glomming onto NNTP with NZBs and such, plus the burgeoning of commercial long-retention NNTP providers, even created a strange cottage industry around such things; but there, the longer-term storage of the system is paid for by the group of users who want it. Moreover, those are users who would probably be mortally affronted if their data were being resold unscrupulously to 3rd parties, as is commonplace in the current FB/Meta/IG, Alphabet/Google/YouTube, Twitter/X, etc. deplorable malaise.
There are many other models which value user privacy (the employer who had 384TB in that TrueNAS HA CARP pair over a decade ago also had paid subscribers and did not resell their data to 3rd parties; but it also did not really have user-supplied content. Think of it more like a Netflix, but older than Google and more "adult".)
But: why mandate that user data be stored indefinitely? And if so, why does it need to be on fast NVMe, when chances are it is only in aggregate usage that speed is a real consideration, given that most users have paltry bandwidth at home relative to cross-connects at datacenters and IXPs? That really seems like a mis-design: if users feel a real need or want to have extreme amounts of data shared online for an indeterminate period of time REALLY FAST, they should level up and learn how to do it, like the rest of us SysOps had to already.
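To make the retention point concrete, here is a toy model of an NNTP-style expiry window versus keep-everything-forever; the ingest rate below is entirely invented for the arithmetic and is not Bluesky's actual number:

```python
# Illustration of why indefinite retention, not speed, is the real cost driver.
# The daily ingest rate is an assumption made up for this sketch; the post only
# tells us the relay sits around 16TB today and "grows proportional to content".

DAILY_INGEST_TB = 0.03   # assumed average daily ingest (invented)

def footprint_tb(days_elapsed: int, retention_days: int | None) -> float:
    """Storage needed after days_elapsed, given an optional retention window."""
    if retention_days is None:                       # keep everything forever
        return days_elapsed * DAILY_INGEST_TB
    return min(retention_days, days_elapsed) * DAILY_INGEST_TB   # bounded

for years in (1, 3, 10):
    days = years * 365
    print(f"{years:>2}y  indefinite: {footprint_tb(days, None):6.1f} TB"
          f"   90-day window: {footprint_tb(days, 90):4.1f} TB")
```

With any expiry window at all, the footprint is capped by policy; without one, it grows without bound no matter how the hardware is procured.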
I can't help but think that BitTorrent already addressed some of the network effects of distributed data storage and retention, while minimizing costs, better than just about every other prior-art example, and to date it still seems to be faring better than every other NIH Syndrome/"reinventing the flat tire" effort as well. ;-/ The "rarest first" principle alone was a stroke of protocol design genius that continues to impress me decades later.
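For the unfamiliar, here is a minimal sketch of the idea (illustrative only, not the actual BitTorrent client logic): each downloader counts how many of its peers advertise each piece and requests the scarcest one first, so rare pieces get replicated before the few peers holding them go away.

```python
# Minimal sketch of "rarest first" piece selection (illustrative, not the
# real BitTorrent client code): request the piece held by the fewest peers.
from collections import Counter

def pick_rarest(peer_bitfields: list[set[int]], have: set[int]) -> int | None:
    """peer_bitfields: pieces each connected peer advertises; have: pieces we hold."""
    availability = Counter(p for bitfield in peer_bitfields for p in bitfield)
    wanted = [p for p in availability if p not in have]
    if not wanted:
        return None
    return min(wanted, key=lambda p: availability[p])   # scarcest piece wins

# Example: piece 3 is advertised by only one peer, so it gets fetched first.
peers = [{0, 1, 2}, {0, 1, 3}, {0, 2}]
print(pick_rarest(peers, have={0}))   # -> 3
```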
Bluesky isn't a realtime video streaming platform, is it?
I can understand when Netflix engineers like Drew Gallatin are giving presentations at EuroBSDCon on tuning FreeBSD to saturate the PCIe bus to push 400Gbps over their NICs (e.g. https://www.youtube.com/watch?v=_o-HcG8QxPc). Now that there are 800GbE devices, I expect a similar update from them or someone else like them, soon.
It's been my observation that PeerTube (nprofile…kz65) actually does as well as (if not better than) some of the realtime video streaming systems I used to administer for a previous employer, and I am pretty sure it is doing so on far humbler hardware than I had at my disposal or Netflix has at theirs, given their copious amounts of overpaying subscribers.
Maybe I am misreading this, but is Bigsky implemented in Python and expected to run in Docker: https://github.com/bluesky-social/indigo/tree/main/cmd/bigsky ?
Because if so, gosh do I see some remedial re-implementation in lower-level, compiled languages, and without container overheads, as being WAY MORE IMPORTANT than throwing money and hardware at Bluesky's storage woes. Didn't you learn that lesson in grade school as I did, back when BASIC was still the in-vogue interpreted language? There were 1980s-vintage compilers for BASIC which could net a 10x speed improvement, as mentioned here: https://www.codeproject.com/Articles/5347103/BASIC-is-Not-Dead-Time-to-Erase-the-Myths-about-Ba. BASIC was more memory efficient, too, and I would hazard to guess far more efficient than Python (even MicroPython requires a minimum of 16KiB of RAM, whereas BASIC ran happily in a stock Apple ][ with 4KiB; alas, lamentably, BASIC is also omitted from https://niklas-heer.github.io/speed-comparison/). As for containers: VMs and hypervisors add a 40% performance hit on average, even PCI pass-through paradigms such as FreeBSD's bhyve tend to hit 9% or so, and Docker, FreeBSD jails, and chroots also add overhead, whether you like it or not.
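As a trivial, unscientific illustration of the interpreter-overhead point (it says nothing about bigsky itself, it just shows the kind of gap between a pure-Python loop and the same work done in C):

```python
# Trivial, unscientific illustration of interpreter overhead: the same CRC-32
# computed by a pure-Python loop vs. zlib.crc32 (implemented in C). The gap is
# illustrative only and says nothing about bigsky itself.
import time
import zlib

def crc32_pure_python(data: bytes) -> int:
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0xEDB88320 * (crc & 1))
    return crc ^ 0xFFFFFFFF

payload = b"atproto" * 50_000   # ~350KB of arbitrary bytes

start = time.perf_counter()
slow = crc32_pure_python(payload)
py_secs = time.perf_counter() - start

start = time.perf_counter()
fast = zlib.crc32(payload)
c_secs = time.perf_counter() - start

assert slow == fast
print(f"pure Python: {py_secs:.3f}s   zlib (C): {c_secs:.6f}s   "
      f"~{py_secs / c_secs:.0f}x difference")
```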