The Marie Kondo Approach: How Cryptocurrency Projects Will Lead to Clean Data

The Life of a Data Janitor

Given all the noise and chaos in the space right now, a lot of you might be skeptical of what I’m about to say, but I do believe that one thing that will come out of the crypto/blockchain movement will be cleaner and more efficient data. Why? Because the projects that don’t follow good data practices will fail, and fail fast. “Fail hard and fail fast” is part of the startup world’s ethos anyway, but with crypto projects the speed at which these things happen is just that much quicker.

If you’ve ever tried to run an artificial intelligence model with bad data you’ll probably notice one thing — it doesn’t work. At all. It doesn’t matter if the AI is the smartest thing humanity has ever created — if you feed it bad information, it will make bad and nonsensical decisions as a result. The algorithm is the shiny object that everyone likes to talk about, but what makes any software model work — even for simple projects like building a personal website — is clean, reliable data. In that sense, blockchain/cryptocurrency projects are no different than the rest.

Since the vast majority of services on the Internet up until now have been given away for free (the ad-based model), most of the “effects” of bad data practices were subjective, hard to measure…and if we’re being totally honest here, not really all that impactful in the bigger scheme of things. Sure, targeted ads and content suggestions might be completely off-base some of the time or most of the time, occasionally bordering on offensive, but who cares? It’s free. And in cases where it does matter, like a hacked credit card for example, don’t you worry your pretty little head about it — the banks and corporations will just reimburse you, no questions asked!

While blockchain-esque technologies (like Hyperledger) have been in existence for a while now, what crypto brought to the tech space was something fairly simple: money. Even people who had no interest in software up until that point (like my dad) suddenly found themselves interested in what’s going on. Learning how to code in order to make memes/cat pictures? Who cares. Something that might actually help pay the bills and earn me an income? Tell me more!

There have been a lot of critiques of crypto tech from a technical standpoint, some more valid than others, but keep in mind that we have to talk about these issues in relation to what we’re using right now. I can’t speak for others, but I think it’s actually kind of absurd that we’re attempting to run a global economy on credit card systems, where all hackers need is 16 digits and your name/address in order to steal both your identity and entire savings account. (Sometimes it’s a wonder that anything even runs at all, really.) Given that blockchain systems are objectively more secure (this isn’t a matter of debate anymore; they simply are), there are lots of use cases out there that haven’t been fully considered, making the space fertile for further exploration.

I do, however, think that when talking about the details of these issues a lot of people have a tendency to miss the bigger picture, which is that crypto is more than just a technical tool — it represents a very real philosophical and cultural shift in the way we think about technology and technology practices in general. There have already been many articles written about this matter but here I’d like to focus on a specific issue: how this will affect how we use and think about data.

Does it Spark Joy? If not, feel free to hit that delete button.

Lately I’ve been obsessed with the idea of “Beauty of Subtraction” from Aya Miyaguchi (current director of the Ethereum Foundation), since I feel that a dose of her perspective is what the technology space needs right now, and pretty badly, in fact. For lack of a better metaphor, you can kind of think of Miyaguchi’s philosophy (influenced by Zen Buddhism) as the “Marie Kondo” approach to tech practices: less is more; smaller is better; balance over growth; quality over quantity. The fact that I’m Japanese-American might also have something to do with my obsession, since there’s a side of me that feels like this stuff literally runs through my veins, just because of my background.

In my experience, a lot of the data practices you find in organizations (large or small, public or private) tend to be driven by a “hoarding” mentality that I feel is mostly unhealthy. It’s generally assumed that the more data you have the better off you are, and that it’s always better to keep *everything* that you collect. And when you start a job as a data person (administrator, engineer, product manager, data scientist…any role with the word “data” in it, honestly) the first problem you run into is always the same: the database is a complete mess. I often jokingly call myself a “data janitor” when I introduce myself because most of the work I do, no matter where I am or what title I have, is clean-up work.

In practice, most of the data that you collect won’t really be used at all, since it’s just there as part of a collection process, and the reality is that in a world where things are changing all the time, data does “go bad” after a while. Unless the data is explicitly historical (date of birth, schools attended, sex-at-birth), most attributes are temporary: they were always meant to be understood as “snapshots” of a particular time and place, rather than a permanent expression of reality. But many projects are currently running real-time processes on old data sets that may be outdated or inaccurate, so we ought to be skeptical of the sorts of results that they spit out.

Given that that’s the case, when looking at a column of data, one should ask “does this spark joy?” and if the answer is no, just delete it from the database, guilt-free. As with the hoarder’s mentality, though, there’s always that sense that just because we paid for it in the past, it must still be worth something today. The reality is that most databases out there are a hoarder’s nightmare, full of useless junk and the decaying corpses of various dead animals that somehow found their way in. (Sorry for the graphic imagery, there.) We should be less apprehensive about throwing things out when the situation calls for it.
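To make the “does this spark joy?” test a little more concrete, here is a minimal Python sketch of what a column audit could look like. Everything in it is an assumption made for illustration: the function name `columns_to_review`, the example columns, and the two thresholds (a column is flagged if fewer than half its rows are filled, or if nobody has read or updated it in over a year). It only suggests candidates for review; a human still makes the call on what to delete.

```python
from datetime import datetime, timedelta


def columns_to_review(rows, last_used, now, max_idle_days=365, min_fill_ratio=0.5):
    """Return column names that look like candidates for deletion.

    rows: list of dicts, one per record; missing or None values count as empty.
    last_used: dict mapping column name -> datetime it was last read or updated
               (a hypothetical usage log; columns with no entry count as never used).
    """
    candidates = []
    columns = sorted({key for row in rows for key in row})
    for col in columns:
        filled = sum(1 for row in rows if row.get(col) is not None)
        fill_ratio = filled / len(rows) if rows else 0.0
        idle = now - last_used.get(col, datetime.min)
        # Flag the column if it is mostly empty or has sat untouched too long.
        if fill_ratio < min_fill_ratio or idle > timedelta(days=max_idle_days):
            candidates.append(col)
    return candidates


# Made-up example data, purely for illustration.
rows = [
    {"email": "a@example.com", "fax": None, "favorite_color": "blue"},
    {"email": "b@example.com", "fax": None, "favorite_color": "red"},
    {"email": "c@example.com", "fax": "555-0100", "favorite_color": None},
]
last_used = {
    "email": datetime(2019, 1, 10),
    "fax": datetime(2019, 1, 5),
    "favorite_color": datetime(2012, 6, 1),
}
now = datetime(2019, 2, 1)

print(columns_to_review(rows, last_used, now))  # ['favorite_color', 'fax']
```

Here `fax` is flagged because it is mostly empty, and `favorite_color` because it hasn’t been touched in years, while `email` survives on both counts. The thresholds are arbitrary starting points, not recommendations.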

Changing the culture around the way we think about data will not be an easy task, but it’s something that has to happen if we want to really make technology work for (rather than against) us in the future. Bad data practices not only lead to errors and annoyances, but in more serious circumstances can even become a source of injustice: if a government organization processes its low-income subsidies incorrectly, “oops” just won’t cut it. We need to put in the time and effort to do these things correctly.

But again, the hard work of figuring out how to put these ideas into practice is mostly happening in the blockchain/crypto communities right now, since incumbent organizations have gotten stuck trying to make incremental improvements to their current (very messy) legacy systems, which have now turned into a series of self-compounding problems. The Ethereum team in particular stands out as exceptional in that they’re one of the few organizations that really “get it” when it comes to where this technology can take us in the long term, so their progress is worth paying attention to. I also know that there are a lot of thoughtful people out there working on things right now that may end up surprising all of us some time in the near future. (The Gridcoin team deserves a shout-out here as well: their new wallet is actually very impressive.)

As far as actionable items go, I have a few brief recommendations:

For Developers: Remember that the code you’re writing will affect people’s financial situations in a very direct way, so there is now a greater responsibility on your part to get things right. Crypto/blockchain projects require a higher standard of personal conduct, and this is something we all have to embrace if we really want this movement to stay relevant in the future.

For Startups: As usual, just make cool things that people will want to use. As explained above, though, I do think that teams that have the discipline to maintain clean data environments despite the chaotic nature of the startup environment will do well. For the reasons mentioned above, blockchain/crypto projects have a tendency to fail *even faster* than traditional startups, but the good news is that as long as you’ve bought into the spirit of the startup ethos, this is something that you can learn to embrace.

For Corporations: A lot of companies are experimenting with this already, but blockchain technologies are great for creating shared database structures across departments, which is something that can be built within a realistic budget and time-frame. It’s also a great opportunity (and excuse) to clean out the cobwebs that might have been accumulating in the back-shed storage systems for years on end.

For Government: If you can ignore all the scams and nonsense surrounding the crypto movement right now for just a second (just 1 second!), blockchain technologies are objectively more secure than any of the systems we have right now (especially for processes that require multiple signatures), so if security and sensitive information are at stake, this technology is something that you can realistically consider. People have lost money due to personal scams and security breaches, but the blockchain systems of Bitcoin, Ethereum, and most of the major chains out there have *never* been breached, despite many attempts by many people.

Personally, I’d like to see the government directly involved in creating pan-corporate systems for things like content rights management and copyright, which would help improve the quality of digital content and also serve as an infrastructure layer to fight “fake news” and other types of misinformation campaigns. It’s in the government’s own interest to protect itself from the chaos of these types of attacks, after all.

At the end of the day, however, all of these situations will require good data practices from top to bottom in order to succeed. In the blockchain/crypto space, the will to tackle these problems head on is as valuable as the technology itself, so if you’re looking for talent with radical visions of how your data systems can be improved, that community would be the place to go. And if you’re looking for ways to separate the charlatans from the real deal, just take a look at their data structures and organizational structures in general. Does it spark joy?

