How To Fight AI Abusing You

Can PureOS stop AI from scraping our data?

It’s more than apparent that AI has become Stalkers’ favourite tool to get to know you and to use that knowledge against you, whether you like it or not.
A report from Euronews shows why and how we need to armour ourselves even more against the Stalkers that monitor our every move, including mouse/drag positions, and inject code into our devices in order to record and control us.

Not only is the article informative, but it is also a road map of why and how.
~s

2 Likes

No, and neither can any other operating system. Web scraping in general can be mitigated by rate-limiting, CAPTCHAs and authentication pages, but that is not enough to stop the activity entirely.
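To make the rate-limiting idea concrete, here is a minimal token-bucket sketch. It is only an illustration: the `TokenBucket` class, the `allow_request` helper and the default numbers are hypothetical names and values, not something from any particular web server. It also shows the limitation mentioned above: a patient scraper that stays under the limit, or spreads requests over many client IDs, still gets through.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum burst size
        self.clock = clock            # injectable clock (assumption: makes testing easier)
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        """Return True if the request may proceed, False if it should be rejected."""
        now = self.clock()
        # Refill in proportion to the time elapsed, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# One bucket per client (hypothetical key: an IP address or a session ID).
buckets = {}


def allow_request(client_id, rate=1.0, capacity=5):
    """Look up (or create) the client's bucket and ask it for one token."""
    if client_id not in buckets:
        buckets[client_id] = TokenBucket(rate, capacity)
    return buckets[client_id].allow()
```

A server would call `allow_request(ip)` per incoming request and answer with HTTP 429 when it returns `False`.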

2 Likes

News like this has popped up frequently during the last two years. It is about gathering data - or hoarding it, more like - for AI training. The key feature is the amount. They are not, in this case, interested in individuals, but want/need massive quantities of any data made by humans (because there are limits on how, and how much, synthetic data can be used to train AI models). The data is used to create large language model (LLM) type AIs in an effort to make them less dumb and hallucinate less. There is a privacy issue in that AIs may unintentionally include private info (if it was available online when scraped) in their hallucinations/outputs, but the difference is that private info is not being targeted for scraping - just anything easily and openly available (YouTube was scraped by Nvidia, it seems, OpenAI & Microsoft scraped the NY Times etc.).

The only defense or limitation seems to have been the robots.txt instruction file, which carries a request of “please don’t scrape here”, but that’s not much of a block if the site is not behind a proper login/paywall. For example, I’m pretty sure this forum’s open threads are scraped, but the hidden Round Table area is not (unless it’s hacked, which is much more serious than just scraping) - and neither is our user metainfo (IP, logs etc.). Then again, it might be a good thing that this forum is/would be scraped, as that would add some balance and varied views to the AI “brain” (and good that our Round Table discussions were not). The discussion might be (as it is elsewhere too): should that happen, and should it be for free or for a fee, and who’d get the payment/settlement (Purism? Users per word or per post or per like or per solution or per year or per tier or per something else? Or donate it?). The TOS is from 2013 and doesn’t specifically cover this point, but there’s a lot there to apply. Btw. take a look at the forum robots.txt: https://forums.puri.sm/robots.txt
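To show why robots.txt is a request rather than a block, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt content below is a made-up sample (it is not the forum's actual file), and `ExampleAIBot` and `may_fetch` are hypothetical names used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is asked to stay out entirely,
# everyone else is asked to avoid /private/ only.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(SAMPLE_ROBOTS_TXT.splitlines())


def may_fetch(user_agent, path):
    """What a *well-behaved* crawler would check before requesting `path`."""
    return parser.can_fetch(user_agent, path)


print(may_fetch("ExampleAIBot", "/t/some-thread"))   # the named bot is asked to keep out
print(may_fetch("SomeOtherBot", "/t/some-thread"))   # others may read open threads
print(may_fetch("SomeOtherBot", "/private/area"))    # but are asked to skip /private/
```

Nothing here is enforced by the server: a scraper that never runs this check, or simply reports a different user agent, is not stopped. That is why a login or paywall is the real barrier.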

[edit to add, example: I just tested with one AI and it answered about the L5 based on a Tuxphone article and about MiMi based only on Purism marketing materials. Would the forum add to that information? Would it be right?]

Just saying, this particular problem is limited to a specific area of unethical behaviour that’s focused on large caches of data, like websites, not on individual devices, like phones (although, technically, if you run a webserver from your L5, it’s potentially available for scraping). At the moment it seems more feasible to scrape/copy/steal data, as it’s valuable and the risk is seen as negligible, but I’m expecting big litigation which may change that sentiment. [edit to add: Yes, I think Linux is part of the solution due to it being mostly secure and preventing random scrapers getting into your computer, if they are not already stopped before that at the home router or such.]

(Btw. What doesn’t help is that CAPTCHAs, which are intended to stop bots, are pretty much useless against AI-bots - and Google has been using humans to make a profit with them.)

1 Like

Probably not but a privacy-respecting ecosystem can.

There are many ways in which data can be collected:

  • voluntarily if you publish your data, on social media or otherwise
  • involuntarily if you use a spyphone or own and operate other spying technology
  • involuntarily or semi-voluntarily in the many interactions with government or business that we may have daily

and there’s no one answer that will address all of that.

2 Likes

Boycott, bug out and/or bunker down.

Well, yes, OK - you can exit society completely and hence by definition exit the internet.

That will fully cease the collection process, although that leaves the legacy of all the data collected about you so far.

1 Like

That goes full circle:

1 Like

Fair points.

That might lead to a question: how can one, in practical terms, accelerate the process of data becoming stale? Governments / companies will keep the data anyway though, is my guess.

One answer might, paradoxically, be: rather than starve them of your personal data, instead feed them false personal data (particularly in respect of companies where it could be legal to do so).

Another interesting angle, given the presence of “AI” in the topic title … if your data is used to train AI, that definitely persists beyond even the deletion of the original data, and perhaps the rate at which it goes stale is lower. However maybe the OP’s original concern was only with the actual original data that directly relates to an individual.

1 Like

Let me take that down a theoretical path for a bit: Let’s say AI is inevitable like Thanos. If all those who feel they are on the margins, somehow at risk and with something to lose, denied their data, that could snap half of all data out of existence - or out of AI reach at least. The other half doesn’t have to be the majority or evil, just ignorant, complacent etc. But then the result would be that all the data that’s left and used - and therefore AI - would be skewed towards… something (hard to say how positive, negative or weird it would be, but that’s multiverse for you). At least the effect would be that AI would not be able to take into consideration minorities and smart people, since it wouldn’t have the data and precedent to base its calculations on (that one in fifteen million or so thing). So, although the data assassination plan has its appeal (just go for the head), in the long run a massive show of force by all the different heroes may be better to over-run the generic CGI-data-army. Perhaps we should start generating more of our kind of data to feed this phase of the AI universe, thus making sure it’s not the endless same old comic content in the future. :star_struck:
Not something that happens just by snapping fingers, I’m afraid. :sparkles:

1 Like

Making deliberate and conscious decisions to avoid public participation for an extended and/or indefinite length of time. An immediate and personal example was the time I lurked on the Purism community forums from 2018 to 2023. A more recent example is my ongoing efforts to become unbanked. The pandemic has taught me how to rationalize every decision to interact with the public, even within digital mediums.

1 Like

IMO:
It’s 2024 and no site needs to play Google’s puzzle games (CAPTCHAs, reCaptcha, or anything requiring the visitor to click to prove they are capable of clicking on demand). But I like:
“Please prove you are a human - enter your credit card and CVV numbers”

Back in the day (2023) we removed the tattlers and used a method that didn’t require notifying Google or playing click-the-bus garbage. The one method we used was to challenge the visitor’s computer.
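The post does not say which method was used to challenge the visitor's computer. One well-known way to do that without any third party is a hashcash-style proof of work, sketched below purely as an assumption of what such a challenge can look like. All function names and the choice of SHA-256 are illustrative.

```python
import hashlib
import secrets


def make_challenge():
    """Server side: a random, single-use challenge string."""
    return secrets.token_hex(16)


def leading_zero_bits(digest):
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge, nonce, difficulty):
    """Server side: a single hash is enough to check the visitor's answer."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge, difficulty):
    """Visitor side: brute-force a nonce; cost doubles with every extra bit."""
    nonce = 0
    while not verify(challenge, nonce, difficulty):
        nonce += 1
    return nonce


challenge = make_challenge()
nonce = solve(challenge, difficulty=12)       # roughly 4,000 hashes on average
print(verify(challenge, nonce, difficulty=12))
```

The design trade-off: a single visitor barely notices the work, while a bot making millions of requests pays for every one of them. It raises the cost of scraping rather than proving anyone is human.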

IMO
If Artificial Idiots are being trained to scrape, the PondScum abusing AI for personal gain might need another AI just to unravel it.

1 Like

I wonder when the opposite of those CAPTCHA Turing tests will appear - systems testing and preventing humans from getting in (“Find Waldo from a DB of 10 million faces” or “Which of these 100 buses has the median hue of #DAA520”), so we don’t mess things up :wink:

[Really, some admins would love having that feature to prevent idiots and knowitalls from doing stuff, and I’m not referring to using own personal systems]

[Edit to add: just funny: xkcd: Machine Learning Captcha]

1 Like

When I see any kind of captcha, I know the people that built the site don’t know enough about the evils of Google and just how many trackers, SMIRC’ers and stalkers they let ride in with the template they used, or with the plug-in bells and whistles they dress up the site with.
Any captcha where the visitor has to prove to Google they are real is part of the assimilation.

IMO, anything Google provides is like getting into bed with the mafia. You’ll never get out. We become part of the collective.

1 Like

This is just too cool not to mention here: How about a game of DOOM for CAPTCHA?

Though the same couldn’t be said for most of us mere mortals, Vercel CEO Guillermo Rauch had a productive festive period, resulting in a CAPTCHA that requires the user to kill three monsters in Doom – on nightmare mode.

"As The Register noted last year, OpenAI’s GPT-4 is capable of playing the game – badly. Give it a few years, though, and similar AI models may make mincemeat of Doom on nightmare. So, once again, CAPTCHA’s effectiveness as a bot defense comes into question."

Try it: https://doom-captcha.vercel.app/ :scream:

3 Likes

GitHub repository (on my GotHub instance):

1 Like

Where is “Anonymous” when we need them?

Most of the info around AI appears to be by US corporations for the US (and collateral areas). As people write, they seem to do so with borders. We won’t debate that part.

IMO, at the rate that AI is being abused (a machine cannot be abusive), I won’t be surprised if a powerful, well coded set of algorithms holds the New York Stock Exchange hostage, or all planes en route, or all of Musk’s satellites… One could make a movie about AI taking over the world.

Oops, that’s been done already: the movie Colossus: The Forbin Project, based on a book (a YouTube “Official Trailer”: https://www.youtube.com/watch?v=kyOEwiQhzMI), and as bright and intelligent as we think we are, we fail to heed its warning - still.

AI is progressing exponentially and at breakneck speed. AI in the wrong hands, and it will be soon, will be the world’s next biggest threat to all of our rights.

Doom and gloom
~s

1 Like

AIs are used to abuse. And AIs are abused to abuse others.
Plural. There is no one AI, “AI”, AI technology, nor AI system. Not even a singular (pun) idea of an AI. Even the big ones are thousands and millions of separate instances that do not communicate.

Stock exchanges have been running on algorithm-based high-speed trading for a long time now (and have been screwed by it too).

As things progress, I don’t see AIs becoming a threat to rights sooner than the other potential threats associated with AI technologies that can harm societies and individuals, which I’d be more worried about: economic bubbles created and burst to cause financial chaos and recession, the need for energy increasing the use of fossil and other fuels and the load on grids, the need for data centers increasing the need for water for cooling and for raw materials (some of them from very poor places), the use of AI for entertainment and frivolities instead of solutions to global problems etc. A bit of doom and gloom.

I have to ask, why should Anon set it’s sights against DOOM?

1 Like

Only my learned opinion of course but there is no “it’s” for Anonymous. Any one person may be “anonymous” and join ‘anonymous’ in their quests. Anonymous has no leader - no one is in charge, but they are able to police those that pull a stunt that disagrees with the ‘Anonymous’ intents.

I wouldn’t say “Anon set [it’s] sights against DOOM” - heck, if Doom and Gloom were to raise its ugly smug look, we should all set our sights on preventing the doom and gloom that lies ahead when someone manages to take control of an AI.

Also, Anonymous are people who come with a very highly rated set of skills. Being anonymous is one; hacking through “state of the art security” into police computers, power plants, silos, military aircraft, banks, Bitcoin, Canada Mint, hospitals etcetera is another… and not hacking for profit.
I think they would be very encouraged to list all the stakeholders of major AI and why.

Why Anonymous? Because of the skills they have. And the Anonymous people are people who want to see world peace and use their skills to oust those who work against honesty and freedoms, e.g.

Five Star Hacks: (randomly picked)

  • the collective campaigned against child pornography protected by anonymous hosting techniques. They temporarily DDoSed 40 child porn sites, published the usernames of over 1500 people frequenting one of those websites, and invited the U.S. Federal Bureau of Investigation and Interpol to follow up.

  • Anonymous hacked 485 Chinese government websites, some more than once, to protest the treatment of their citizens. They urged people to "fight for justice, fight for freedom".

There are dozens more. Like anything, there are good and bad Anonymous actors. Some have been proven to be police ‘hacks’ meant to make Anonymous look bad. Nothing new in that department.

I hope I have answered your question (say Yes or I’ll reply with another novel :rofl:)
~j

1 Like

I’ll take another novel, but you can also be concise, if you wish :slightly_smiling_face:

I’m aware of Anon. I was trying to ask, why should Anon be interested in DOOM the game, what does that do, why should Anon (or anyone else) care about DOOM that much? Why were you asking where they are, while quoting the DOOM game CAPTCHA? I thought there was something I missed, since that is intended as a tool to set boundaries to AIs - a jokey tool, but still.

… and speaking of bad jokey tool ideas, if AIs start to beat DOOM on nightmare level settings, I wonder if some other games could also be used - so how about “Tux racing CAPTCHA” (or maybe a little Counter Strike, or Civ, or… Sims? Full circle with that :upside_down_face:), forcing users to win a few laps before login or command is allowed. Mandatory fun to drive people nuts…?

1 Like

There is a very good movie called Anon, IMO, where

About Movie

“The Girl” can’t be ID’d and is a hacker by trade. She makes the famous statement: “It’s not that I have something to hide. I have nothing I want you to see.” (Anon, the movie)

But (wait for it)

AI and Supercomputer

"…an IBM supercomputer named Deep Blue defeated then-world chess champion Garry Kasparov in a 1997 rematch."
Isn’t that AI as well? The algorithms are not that different from each other. If it was not AI winning at chess, then what does it take to be an AI? Would AI be able to tell us why AI needs to be involved in a Doom CAPTCHA anyway? My head hurts now.

If a ‘super computer’ can beat the bajeepres out of a human chess champion in a game of chess, why should it be so hard to Glock a few floating heads? :thinking:

I was asking ‘where’ in the sense of where did they go, why are they so quiet, as in ‘Where’s a cop when ya need one.’
And I quoted the Doom game being abused to subject visitors to a difficult Captcha. That’s what AIs are going to be for :rofl:

Most are already nuts. I think the minority of folk will avoid the sites that use kiddie-kewl bell and whistle stuff. There’s enough bling on sites as it is. I say to them thar AI thinga-me-bobs, ‘not in my back yard ya binary bit byte’in whipper snappers!’

Or Google’s 5x5 find-the-pics-with-a-thing. I went by a captcha where it’s a challenge to one’s vision.

Off Topic a Bit

I personally know of people that wear a chip in the web between their thumb and finger. It logs them in to the office, and it’s bio-metric as well. Employees can be tracked wherever they go. They don’t play games to access rooms or devices. It can tell how many bars the chip went to after hours. It even logs how long they are in the lavatory.
A wonderful man I know was born into a wheelchair. They said he’d make it to twelve. He’s forty-one now. The chip in his web turns his lights on when he enters, and off when the room is empty. He manages heat, and access to his house.
I cannot imagine the many ways AI and IoT can help us all. I also see that AI can be misused in many, many criminal quests.

Things like those chips and IoT can better help humankind. Look at what they have accomplished in health care alone. Long distance surgery, and AI is already being tested at diagnosing patients.

AI is showing up in new places every day. And there are no laws, morals or ethical standards set up.

Bear or bare with me on this next one…
We should remember our history because we are repeating it - again.
On March 22, 1933 the Dachau concentration camp opened. The Nazis traveled through Germany hunting for Jews. They would go to villages and towns, raiding the records of the Bürgermeister (mayor) and the Catholic priests. Those two authorities kept the data on everyone’s birth date, marriage, death and religious affiliation. Get the picture?

Now Google has all that and much more. Google may even have something you’ve said or done on the Internet that you wouldn’t want your Mum to see. And many want to let AI loose? SkyNet?

We need to harness AI before it harnesses us. The people pushing their AI setup will have nothing to do with putting a harness on their alter ego.

Colossus, we are ripe for the assimilating.
~s
It’s been fun. I got to test the many talents of Discourse.

1 Like