How you, I, and all of us are turning into a cloud of data – and that's not good.

In partnership with the FCW Cultura Científica magazine, from the Conrado Wessel Foundation, The Conversation Brasil presents a series of articles on the impacts of social media on society. In the text below, Professor Marcelo Soares, a specialist in digital and data journalism, explains how social media collects and processes user data at multiple levels, with little transparency and with criteria and algorithms kept hidden, which hinders oversight and fosters the spread of misinformation. The logic of engagement prioritizes emotional and controversial content, creating personalized bubbles that isolate users and can push them toward extremism.
Eight out of ten internet users in Brazil use social media frequently, according to 2024 data from the Brazilian Network Information Center (NIC.br). Silently, as we browse, social media collects far more information than we consciously provide.
This data is collected in several layers. The "zeroth" layer corresponds to the basic information we fill in ourselves: name, age, interests, and our network of contacts. On top of that base, the other layers monitor a universe of behavioral data. The networks observe what we like, share, comment on, and watch, and how long we linger on each post or video. These patterns are statistically compared with those of other users, and from that comparison the platforms infer whom we resemble in tastes, desires, and even aspects of personality.
Danish media consultant Thomas Baekdal calls this first-order data: data created by the platforms themselves as they refine everything extracted from our activity, which lets them make inferences about what might open our wallets. This data is organized into profiles that are sold to advertisers automatically, in auctions that take place in microseconds.
They don't need to get it 100% right; if they get it wrong, the worst that can happen is that we ignore an ad. The second layer, then, is the platforms' commercial relationship with advertisers, built on our data. Based on everything they know about us and everything they infer, the platforms assemble highly detailed profiles. They don't necessarily need to know who you are; if they get your taste and your timing right, you might watch a video, click on an ad, or buy something. It's a bit of demographics, a bit of psychographics, and a bit of cunning.
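To make the mechanics less abstract, here is a minimal sketch in Python of how an inferred interest profile could feed an automated second-price ad auction, one common auction format. The advertiser names, interest scores, and bid values are all invented for illustration; this is not any platform's actual code or pricing formula.

```python
# Toy sketch (not any platform's real code): a second-price ad auction
# run against an inferred interest profile. All names and numbers are
# illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Bid:
    advertiser: str
    amount: float  # what the advertiser offers for this impression

# An inferred profile: interest scores the platform guessed from behavior.
profile = {"running_shoes": 0.9, "credit_cards": 0.4, "travel": 0.7}

# Each (hypothetical) advertiser bids more the stronger the inferred interest.
base_bids = {"ShoeCo": 1.20, "BankCo": 0.80, "AirCo": 1.00}
targets = {"ShoeCo": "running_shoes", "BankCo": "credit_cards", "AirCo": "travel"}

bids = [
    Bid(name, base * profile.get(targets[name], 0.0))
    for name, base in base_bids.items()
]
bids.sort(key=lambda b: b.amount, reverse=True)

winner, runner_up = bids[0], bids[1]
# In a classic second-price auction, the winner pays the runner-up's bid.
print(f"{winner.advertiser} wins the impression and pays {runner_up.amount:.2f}")
```

The point of the sketch is only that the inferred profile, not your explicit consent or identity, is what sets the price of your attention.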
Cookies served by third parties
There's also a third layer: third-party cookies, which are even more invasive than the platforms' own cookies. They are how ads follow you from one site to the next. Cookies are small files that websites store in your browser when you visit a page. Their purpose is to identify you on your next visit, and they become a sort of logbook of everything you've done on the site.
They can originate from the website itself: when you return to a news site, the links you've already clicked appear purple instead of blue. But in the lucrative third-party cookie business, the cookies dropped on your computer by the site you visit may come from ad networks that are also present on other sites. These networks have no direct relationship with you, only with the sites you visit. If the same cookie company monitors a news site and a food site, for example, it can see who visits both and who doesn't. This is why so many sites use pop-ups asking for your permission to use cookies: if they don't use third-party cookies, there's no need to ask for permission.
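As a rough illustration of that cross-site mechanism, the short Python simulation below shows how a single tracker cookie, set by an ad network embedded on two unrelated sites, lets that network link both visits to the same browser. No real browser, site, or ad network is involved; every name here is invented.

```python
# Minimal simulation of third-party tracking: one tracker, embedded on two
# different sites, recognizes the same browser through one cookie ID.
# Purely illustrative; real cookies are set via HTTP headers per domain.

import uuid
from collections import defaultdict

class ThirdPartyTracker:
    def __init__(self):
        self.visit_log = defaultdict(list)  # cookie_id -> pages seen

    def on_page_load(self, browser_cookies: dict, page: str) -> None:
        # If this browser has no tracker cookie yet, set one.
        cookie_id = browser_cookies.setdefault("tracker_id", str(uuid.uuid4()))
        # The same ID now appears no matter which site embedded the tracker.
        self.visit_log[cookie_id].append(page)

tracker = ThirdPartyTracker()
my_browser_cookies = {}  # stands in for the cookies stored by one browser

tracker.on_page_load(my_browser_cookies, "news-site.example/politics")
tracker.on_page_load(my_browser_cookies, "food-site.example/recipes")

# One ID, one combined browsing history across unrelated sites.
print(tracker.visit_log)
```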
These companies collect data from your browsing to refine a marketable profile of you. At a higher level, this data is sold and cross-referenced with external databases, such as leaked Serasa data, business histories, and even health data. This is where data brokers come in. They have no connection to your internet usage or the websites you visit. They are companies that compile all of this to sell complete profiles, often including name, CPF, phone number, or names of relatives. This market fuels everything from legitimate advertising to scams and fraud, whether through advertising (84% of complaints about ads are about digital ads) or through more direct means. People who run scams on WhatsApp, for example, have often bought the victim's profile, complete with a photo, from a website that provides exactly this.
Those who are in these databases have no idea they're being marketed as a product. Some of those who do realize it fall back on the notion that "all our data is already public anyway." It's a fallacy: in the more serious cases, the data was stolen or obtained under false pretenses.
Island of personalized stimuli
The most direct effect is that the content you see on social media isn't neutral. Platforms have an interest in keeping you on them for as long as possible, because that way they can show you more ads. You'll see more posts from the friends you interact with most, not from the ones you miss most. Generally, you interact more with those who use the platform more, posting more nonsense or more engaging things. If, instead of a friend, it's a topic, the platform will show you the topics that make you display more signs of interest, such as likes and shares.
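A toy sketch of this engagement logic is below, with weights and prediction numbers made up purely for illustration (no platform publishes its real formula): posts that are predicted to provoke reactions rise to the top, whether or not they come from the people you actually miss.

```python
# Illustrative engagement-based ranking with assumed weights and assumed
# predicted probabilities; not any platform's real scoring model.

posts = [
    {"author": "quiet_friend",      "pred_like": 0.05, "pred_share": 0.01, "pred_comment": 0.02},
    {"author": "loud_acquaintance", "pred_like": 0.60, "pred_share": 0.25, "pred_comment": 0.30},
    {"author": "news_outlet",       "pred_like": 0.20, "pred_share": 0.10, "pred_comment": 0.05},
]

# Hypothetical weights: shares and comments "count" more than likes.
WEIGHTS = {"pred_like": 1.0, "pred_share": 3.0, "pred_comment": 2.0}

def engagement_score(post: dict) -> float:
    return sum(WEIGHTS[key] * post[key] for key in WEIGHTS)

feed = sorted(posts, key=engagement_score, reverse=True)
for post in feed:
    print(post["author"], round(engagement_score(post), 2))
```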
If you open YouTube in a freshly installed browser, without logging in, you'll see a parallel universe: whatever engages most in Brazil at that moment. Once the platform gets to know you, it personalizes everything it can to keep you there. The more time passes, the more data it collects, the more seductive it becomes, and the longer you stay. It's a vicious cycle, and its logic has been compared to that of slot machines.
The infinite feed is one such tactic. The most aggressive platforms also "punish" links. Facebook, for example, reduces the reach of posts with links because they drive the user off the platform. Instagram doesn't even allow clickable links in regular posts. Twitter, now X, has also started to prioritize native content. The goal is to keep the user inside the bubble, so the platform can continue collecting data and selling ads. According to author Shoshana Zuboff, platforms transform user behavior to make it more predictable and to better define what to offer. The problem is that this creates very closed bubbles. It's a system that turns each of us into an island of personalized stimuli.
Algorithms that push towards extremes
Platforms use the idea of "transparency" very flexibly. There is consumer-facing transparency, such as the option to see which ad categories you've been placed in, but it's difficult to access, hidden in settings, and vaguely worded. Public transparency about how platforms operate is being systematically dismantled: Twitter's API, which allowed researchers to search for who was talking to whom about certain topics, was shut down after Elon Musk bought the company in 2022. Free access was replaced by an exorbitant fee, which was the nail in the coffin for these studies: we have lost the ability to monitor disinformation and extremism in real time and to inform demands for accountability.
Over the past decade, social media has made it harder to access reliable information. People have lost the habit of going directly to news sources, expecting the platforms to deliver whatever is essential. Besides harming journalism, this has reinforced the myth that "if it's important, it will reach me", an idea that took hold around 2012, when we were all in love with social media. But what circulates most on social media isn't what's essential; it's what's most clickable and emotionally charged.
Jonah Berger, a marketing professor at the University of Pennsylvania (USA) and author of the book "Contagious: Why Things Catch On," demonstrated this clearly in his research. Emotionally charged news is read and shared more often and is more likely to go viral, especially if it provokes anger. This permanent state of mobilization has political, social, and even interpersonal consequences. And, of course, the algorithm prioritizes this type of content precisely because it engages. The algorithm doesn't want you to be informed; it wants you to react by liking, sharing, and commenting, as long as it all happens inside the platforms' walled garden. Politically, this ends up demobilizing some people and mobilizing others asymmetrically.
Zeynep Tufekci, a professor of sociology and public affairs at Princeton University (USA), showed, using YouTube as an example, that this model can serve as a pathway to extremism. You start by searching for videos about healthy eating and, if you follow the platform's recommendations, it gradually escalates the tone until it is showing you conspiracy theories about vaccines. The algorithm pushes you toward the most extreme content on a given topic because that is what holds your attention the most. The more you follow the algorithm's suggestions, the more predictable your behavior becomes and the easier it is to target you with ads. Even if your own family can no longer stand you.
Linguistic stuffing machines
AI has long been present in these processes, especially in the form of machine learning. It analyzes, in aggregate, what was viewed, clicked, and ignored, compares it, and uses it to predict what people with similar tastes might want to watch next. This applies to social media, to Netflix, and to any platform that recommends content. In the worst case for the platform, you choose something else to watch on the same service, which is yet another valuable signal, or you turn off the TV and go to sleep.
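The "people with similar tastes" idea can be illustrated with a few lines of Python implementing a crude user-based collaborative filter. The viewing matrix and titles below are invented, and real recommenders are vastly more elaborate, but the principle of comparing your history with other people's is the same.

```python
# Crude user-based collaborative filtering on a made-up watch matrix.
# Illustrative only; real systems use far richer signals and models.

import numpy as np

items = ["series_a", "series_b", "documentary", "reality_show"]

# Rows: ana, bia, caio, you. 1 = watched, 0 = not watched.
watched = np.array([
    [1, 1, 0, 0],  # ana
    [1, 1, 1, 0],  # bia
    [0, 0, 1, 1],  # caio
    [1, 0, 0, 0],  # you
])

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

you = watched[-1]
similarities = np.array([cosine(you, other) for other in watched[:-1]])

# Score unseen items by how much similar users watched them.
scores = similarities @ watched[:-1]
scores[you == 1] = -np.inf  # don't re-recommend what you've already seen
print("recommended:", items[int(np.argmax(scores))])
```

In practice, thousands of such signals are combined, but each one follows the same pattern: compare, predict, and keep you watching.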
With generative AI, the scenario becomes even more complex. People are already using these models to produce content at scale (videos, books, podcasts), sometimes containing false information. In 2023, ten "biographies" of Claudia Goldin, winner of the Nobel Prize in Economics, appeared the day after the announcement, some of them generated by AI. These models simulate very convincing language, poured onto the screen on the basis of probability. I joke that they are "linguistic stuffing machines," which guess with the self-confidence of a middle-aged white man at a bar. Anyone who uses generative AI without properly checking its results has a high probability of spreading misinformation around the world. Anyone who does properly check and correct the results generated by these linguistic stuffing machines, on the other hand, risks losing all the time they gained by using the digital assistant. Or even more.
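To see why probability alone produces fluent but unchecked text, here is a deliberately tiny "language model" in Python that just samples a statistically likely next word from a made-up corpus. It is a caricature of how generative models work, not a description of any real system, but it shows that nothing in the process verifies facts.

```python
# Toy bigram text generator: output is driven purely by word-following
# statistics in an invented corpus, with no notion of truth.

import random
from collections import defaultdict

corpus = (
    "the nobel laureate wrote a famous book about the economy and "
    "the laureate wrote a famous study about the labor market"
).split()

# Bigram table: which words tend to follow which.
next_words = defaultdict(list)
for current, following in zip(corpus, corpus[1:]):
    next_words[current].append(following)

random.seed(42)
word, output = "the", ["the"]
for _ in range(10):
    word = random.choice(next_words[word]) if next_words[word] else "the"
    output.append(word)

print(" ".join(output))  # fluent-sounding, but nobody checked the facts
```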
One of the most pressing needs today is to ensure transparency on the platforms: forcing them to open up the workings of their algorithms to audits, so that we know who sees what, why they see it, and who paid to be featured, and to release post data, as Bluesky does and the old Twitter once did, something Facebook and Instagram never did. Likewise, it is essential to regulate influencers. As the CPI das Bets (the Brazilian parliamentary inquiry into online betting) made clear, their power to sell products, ideas, and even bets has operated without clear rules, commercially exploiting the trust built during the pandemic in an often abusive way, and that demands limits. Only by understanding this logic of how the networks work, and legislating on that basis rather than only on traditional media, can we curb this escalation of vulgarity.
At the same time, some social networks have different dynamics. Bluesky, for example, doesn't use virality-driven algorithms: those who post in the morning reach only those who are online at that time, and reposts are needed to extend reach. It doesn't penalize links or prioritize controversial content, which makes the environment less toxic. Self-moderation tools let you detach offensive comments, follow community-curated lists, and block profiles en masse. Mastodon runs on an even more open protocol than Bluesky and allows far more flexible approaches to moderation, but many found it too technical, and it never caught on in Brazil outside more technical communities.
Hope exists, but it depends on us: we need to better understand how these platforms work, demand transparency, support regulatory initiatives, and seek healthier environments for public debate. There are many good people researching and proposing paths forward, such as researchers Letícia Cesarino (UFSC), Rosana Pinheiro-Machado, Francisco Brito Cruz (founder of InternetLab and now independent), Rafael Evangelista (Labjor/Unicamp), and many other critical and technical voices.
Social media shapes what we see, what we think, and even how we behave; we can't leave all that power in the hands of companies that can be bought, or that dismantle user protection mechanisms. We need to think collectively about how we want to inform ourselves, communicate, and build our worldview, gradually weaning ourselves off dependence on each "big tech" and seeking alternatives. These alternatives exist, but they are bad at selling themselves.
Even more important would be to unplug from social media more often, to talk face-to-face without the need to go viral or be harassed by ads. Offline life is what gives meaning to anything that happens on the screen.