Do you remember the uproar about cookies in the late 90s?
I don’t mean the sugar-filled kind.
I’m talking about web cookies – the bite-sized bits of information that your browser stores locally so that when you return to a website it already knows who you are.
If you worked in tech at the time, you might have been confused by the uproar. Cookies make the web experience significantly better. They are used to keep you logged in to your favorite sites, they remember your preferences and settings, and they enabled one of the earliest forms of persistent identity on the web.
They helped us move from a one-size-fits-all view of the web, where the same information was disseminated to everyone, to a more personalized view, where your actions influenced what you saw.
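Mechanically, there isn’t much to a cookie: the server asks the browser to store a small key-value pair, and the browser sends it back with every later request to that site. Here’s a minimal sketch of the round trip, using Flask purely for illustration (the cookie name and its lifetime are arbitrary choices, not anyone’s actual implementation):

```python
# A minimal sketch of the cookie round trip, using Flask.
# The cookie name ("visitor_id") and its one-year lifetime are
# arbitrary illustrative choices.
import uuid

from flask import Flask, request, make_response

app = Flask(__name__)

@app.route("/")
def index():
    visitor_id = request.cookies.get("visitor_id")
    if visitor_id:
        # Returning visitor: the browser sent the cookie back.
        return f"Welcome back, visitor {visitor_id}"
    # First visit: ask the browser to store an identifier
    # via the Set-Cookie response header.
    resp = make_response("Hello, new visitor")
    resp.set_cookie("visitor_id", str(uuid.uuid4()), max_age=60 * 60 * 24 * 365)
    return resp
```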
But if you didn’t work in tech, cookies were scary. It was the first glimpse for most people of a world where the services they used knew who they were, what they were doing, and remembered it over time.
This is a far cry from how most of the physical world works. When I walk into a drugstore, the cashier doesn’t know what I bought on my last visit or that I am most likely to visit on a Tuesday morning. Of course, loyalty cards are changing this, but they were rare back then.
For most people, cookies raised the specter of being watched, of being tracked, of being monitored. Putting aside the recent NSA news, you can’t get any more un-American than this. And as a result, there was a big uproar, rightfully so.
Fast-forward 20 years and most people would be shocked if they knew how much the web-based services they used knew about them.
Taking a Hard Look at the Data We Collect
Most of us in the industry know that every page load, every click, every data field entered is being tracked.
We know that this data isn’t only being used in aggregate. We know that companies are profiling our individual behavior and changing the experience we see based on our own actions.
But how often do we take the time to think about it?
We probably notice it when we see a Converse ad on Facebook after shopping for a new pair of Chucks on Zappos.
But we might be less likely to notice it when Google adjusts our personal search results based on sites we’ve visited in the past.
We get a glimpse of it when we see books from our favorite authors highlighted on our Amazon home page.
But how often do we spend conscious time thinking about how much data companies collect? How much data we collect with our own products?
Until this past week, I didn’t at all.
Don’t get me wrong. I’ve always strived to keep the user’s best interest in mind as I’ve built products.
But I’ve never thought twice about adding tracking events throughout my products, about tracking open and click rates, or even building up profiles of individual preferences to feed machine learning algorithms.
After all, it’s what we do. It’s how the internet works. The more we track, the better our products get.
All In the Name of Faster, Better, More
Collecting this data isn’t inherently wrong. Most companies are using this information to make their products better.
The last thing I’m trying to do is scare you or spread panic. After all, you already know this. You work in this industry.
You know how helpful this data is.
You know how powerful it can be to personalize an automated product email based on the actions of the user’s last visit.
If you use KissMetrics or MixPanel, you know how much easier it is to track down edge cases when you can pull up the behavior profile of a user to see what events led up to the problem.
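The event stream behind those behavior profiles is simple to produce. Here’s a toy version of the idea; the class and method names are invented for illustration and are not any vendor’s actual API:

```python
# A toy event tracker in the spirit of tools like KissMetrics and
# MixPanel. The class and method names are invented for illustration;
# this is not any vendor's actual API.
import time
from collections import defaultdict

class EventTracker:
    def __init__(self):
        # user_id -> ordered list of events
        self._events = defaultdict(list)

    def track(self, user_id, event, properties=None):
        self._events[user_id].append({
            "event": event,
            "properties": properties or {},
            "timestamp": time.time(),
        })

    def profile(self, user_id):
        """The per-user timeline you pull up when debugging an edge case."""
        return list(self._events[user_id])

tracker = EventTracker()
tracker.track("user_42", "Viewed Pricing", {"path": "/pricing"})
tracker.track("user_42", "Clicked Upgrade", {"plan": "pro"})
tracker.track("user_42", "Checkout Error", {"status": 500})

# Replay the events that led up to the problem.
for e in tracker.profile("user_42"):
    print(e["event"], e["properties"])
```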
You also know that you would never use any of this data to do harm.
Your goal is to make your product better, to do right by the user.
And that’s why we don’t think about the broader implications. Our intent is good. So we don’t think twice about it.
Ask Yourself These Two Questions
- If your users knew how much information you collected about them, would they be okay with it?
- If your users knew exactly how you used that information, would they be okay with it?
For the first question, I mean everything.
Not just the data you store in your database. The data Google Analytics collects. The data in KissMetrics or MixPanel. Email open and click rates. The data your advertising partners are collecting when they serve an ad. The data you collect through Optimizely, Visual Website Optimizer, Pardot, Marketo, HubSpot, and the dozens of other tracking and analytics apps. The preferences you collect over time to drive your machine learning algorithms and your recommendation engines. The “big data” that sounds anonymous, but that we all know is often tied to unique identifiers.
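Even something as innocuous-sounding as an open rate relies on a mechanism most users have never heard of: a one-pixel image embedded in the email whose URL identifies the message, so the mere act of rendering it reports the open. A sketch of the idea (again in Flask; the route and identifier are hypothetical):

```python
# How an email "open" is typically measured: a 1x1 transparent GIF
# embedded in the message body; fetching it tells the sender who
# opened the email and when. The route and message_id parameter are
# hypothetical.
import base64

from flask import Flask, Response, request

app = Flask(__name__)

# The smallest valid transparent GIF, pre-encoded.
PIXEL = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)

@app.route("/open/<message_id>")
def record_open(message_id):
    # Rendering the email triggers this request automatically.
    print(f"message {message_id} opened from {request.remote_addr}")
    return Response(PIXEL, mimetype="image/gif")
```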
The second question is even trickier. You make product decisions intended to drive engagement, to grow revenue, and to fuel growth. But do these interests always line up with your users’ best interests?
Are you helping your users when you optimize for average revenue per user? Or are you aggressively pursuing revenue growth to get to cash-flow positive?
Do your users really want an email every time there’s new content on your site? Or are you bolstering your monthly active users to hit your next fundraising milestone?
Every business faces these trade-offs. But what makes the internet different is that we know far more about our users than businesses ever have before.
We are in uncharted territory. And we need to start asking ourselves how to do right by our users, not just how to do right by our investors and our business.
When We Get The Answers Wrong
Last week, we learned that the Facebook social contagion study was approved by Cornell’s Institutional Review Board as a dataset study. This means that it was approved after the data was collected. Facebook collected this data as part of their normal course of doing business.
Now it’s easy to vilify Facebook and argue that they never should have collected this data in the first place.
Let’s take a step back and remember that we aren’t talking about a faceless entity. The people at Facebook who collected this data are product managers, engineers, and data scientists, just like you and me.
They decided to collect data that they believed would make their product better. I’ve talked to several people who have a hard time believing this. They want to assume malicious intent.
But ask yourself, how many times in your job have you intentionally made a decision to harm your users?
I bet the answer is zero. Why would you think the people who work at Facebook are any different?
So let’s start by assuming positive intent.
But we also can’t ignore the reaction to this study. Many people were upset by it. They felt Facebook crossed a line. What went wrong?
From a product standpoint, here’s what Facebook did (sketched in code after the list):
- They collected data about the frequency of positive and negative words in status updates.
- They used that data to modify their newsfeed algorithm.
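The first step is the kind of analysis any of us might run. Here is a rough sketch of what counting positive and negative words looks like; the word lists are invented for illustration, and a real system would use an established sentiment lexicon rather than a hand-rolled list:

```python
# A rough sketch of scoring a status update by positive and negative
# word frequency. The word lists are invented for illustration; a real
# system would use an established sentiment lexicon.
POSITIVE = {"happy", "great", "love", "awesome", "fun"}
NEGATIVE = {"sad", "angry", "hate", "awful", "lonely"}

def sentiment_counts(update):
    words = [w.strip(".,!?") for w in update.lower().split()]
    positive = sum(w in POSITIVE for w in words)
    negative = sum(w in NEGATIVE for w in words)
    return positive, negative

print(sentiment_counts("Had a great day, love this weather!"))  # (2, 0)
```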
This is a black-and-white case: users found out what data a company was collecting about them and how it was being used, and they were not okay with it.
Facebook Is Not the Evil Empire
But before you feel smug and conclude that Facebook is the evil empire and that you would never do this yourself, let’s hear from Facebook.
Adam Kramer, data scientist at Facebook and the lead author on the PNAS article, writes in a public post:
“The reason we did this research is because we care about the emotional impact of Facebook and the people that use our product. We felt that it was important to investigate the common worry that seeing friends post positive content leads to people feeling negative or left out. At the same time, we were concerned that exposure to friends’ negativity might lead people to avoid visiting Facebook.”
So yes, Facebook was worried about losing users. Who among us isn’t?
But they also cared about the impact their product was having on their users.
Do you really think that Adam Kramer and friends were sitting around one day thinking to themselves, how can we manipulate our users?
Or do you think that perhaps they were asking themselves, just like you and I ask ourselves, how can we better serve our users? How can we improve our product?
And yet, from the reaction to this study, it is clear that people are not okay with the data Facebook collects or how they use it.
Our Responsibility to Our Users
What this tells me is that we can’t answer these two questions in a vacuum. We are not always going to be the best judge of what data we should collect and how we should use it.
As an industry, we need to get better at understanding where the line is. We need to do the work to help our users understand what we are collecting and how we are going to use it. We need to let them be the judge of what is socially acceptable.
It’s easy to write this off as a big company taking advantage of its users. But we know better.
We all collect this type of data. We all modify our algorithms to drive the behavior we want.
We all share in the responsibility of figuring out the right way to proceed.
Where We Go From Here
Most people aren’t consciously aware (even if they are intellectually aware) of how much information corporations are collecting about them, and they have no idea how this information is being used.
Even if we have the best data usage policies and our intent is aligned with the user’s best interest, this is still a problem.
Consumers have a right to know and we, as product builders, have a responsibility to our users to help them understand the implications of using our products.
We have gotten very good, in recent years, at iteratively testing our products, at collecting user feedback, and at understanding what the market needs.
Let’s take these same tools and apply them to our data collection and usage policies.
Let’s do a better job of helping the people who use our services understand how the internet works, what we know about them, and how we use that information.
And most importantly, let’s listen and adjust when they tell us we’ve gone too far. Let’s not lose this opportunity to learn from Facebook’s mistakes. Let’s do the work to get better as an industry.
P.S. If you’ve read this far and you still think this is all blown out of proportion, do me a favor. Pick up a copy of Dave Eggers’s The Circle. Read it and tell me you can’t imagine it as a possible near future.