How To Fight AI Abusing You

Are you able to tell us which AI you used?
TIA
~s

In addition, for publicly accessible pages, “scraping” is a normal part of web search indexing. (So if you successfully prevent scraping, by any means, then you are preventing anyone from finding your site’s pages via a web search. For a company that aims to derive revenue from its web site, that could be badness.)

Any web site can publish a resource /robots.txt that asks specific web crawlers, or all web crawlers, to index or not to index parts of the web site - but that is only a request, which a web crawler is free to ignore. A reputable web crawler will comply.
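To make the /robots.txt idea concrete, here is a minimal sketch of how a well-behaved crawler might check the rules before fetching a page. It uses Python's standard `urllib.robotparser`. The rules, the crawler names `ExampleBot` and `OtherBot`, and the `example.com` URLs are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one named crawler entirely,
# and ask everyone else to stay out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that we already hold in memory."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A reputable crawler asks before fetching; nothing forces it to.
print(parser.can_fetch("ExampleBot", "https://example.com/index.html"))
print(parser.can_fetch("OtherBot", "https://example.com/index.html"))
print(parser.can_fetch("OtherBot", "https://example.com/private/page.html"))
```

Note that the check runs entirely on the crawler's side, which is why a crawler that does not care can simply skip it.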

Just for fun … on my own web sites, in addition to telling web crawlers to f… off, those web crawlers that publish a list of the IP addresses they crawl from get to see a “virtual web site” that differs from the real web site. That is, the only resource such a web crawler sees is /robots.txt, and there is no other content to index if it chooses to ignore /robots.txt.
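A rough sketch of that “virtual web site” idea, assuming the crawler's published ranges have already been collected into a list. The networks below are reserved documentation ranges standing in for a real list, and `respond`, `REAL_SITE` and the page contents are hypothetical names, not how my actual setup is written.

```python
import ipaddress

# Placeholder ranges (reserved for documentation); a real list would
# come from whatever the crawler operator publishes.
CRAWLER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]

ROBOTS_TXT = "User-agent: *\nDisallow: /\n"

# Hypothetical real content, keyed by path.
REAL_SITE = {"/": "home page", "/about": "about page"}


def is_listed_crawler(client_ip: str) -> bool:
    """True if the client address falls inside a published crawler range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in CRAWLER_NETWORKS)


def respond(client_ip: str, path: str) -> tuple[int, str]:
    """Return (status, body); listed crawlers only ever get /robots.txt."""
    if path == "/robots.txt":
        return 200, ROBOTS_TXT
    if is_listed_crawler(client_ip):
        return 404, ""  # the virtual site has nothing else to index
    body = REAL_SITE.get(path)
    return (200, body) if body is not None else (404, "")
```

This only catches crawlers that arrive from the addresses they publish; anything crawling from an unlisted address still sees the real site.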

A moment’s thought though will reveal “gaps” in that approach.

I do what I can.

Sorry, it’s been a while. Probably one of the bigger ones.

IMO, that depends on the business. We could “drive revenue” without Google or real search engines.

Back to the scrape. IMO - We don’t need Google. What we need is a search engine.

A fellow with a local swimming pool care service wants his business to be the number 1 result found by Google. Google will push local businesses first. One does not need to do the $EO thing. Google will start with a list local to the info seeker any way it can, provided of course one isn’t looking for a vacation in St. Moritz! But one first needs to skip the paid-to-Google pages about watches and St. Moritz martinis.

I wouldn’t let Yahoo’s Slurp in to scrape tho.

I believe sites will be indexed (scraped?) anyway, with or without Google’s blessings, if someone types the business name into the search. Then the business comes up at the top.

BTW, both Google and Bing are capable of “scraping” web sites for illegal child abuse images - but won’t do it. I can back that up.

We shouldn’t have to fight off, and oftentimes pay to hide from, the hacking and slashing of our rights to privacy. It shouldn’t be like this.

just my opinions of course,
~s

Part 2.

Another way that I counter scraping on my own web sites is by having pages that are unreachable from the home page.

In other words, a typical scrape / crawl would start at the home page and look for the URLs of any pages that are referenced by the home page (and on the same web site). It would then apply that process iteratively (or recursively) to any pages so found, until it has found (and downloaded and indexed) all pages (on the web site) that can be reached from the home page by a finite number of navigations.

But that set of pages can be much less than the total set of pages on the site.
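The crawl described above is basically a breadth-first walk over the site's link graph. Here is a small sketch over an in-memory map of page to outgoing links, so nothing is actually fetched. `SITE_LINKS` and its paths are invented for illustration.

```python
from collections import deque


def reachable_pages(links: dict[str, list[str]], start: str = "/") -> set[str]:
    """Return every page reachable from `start` by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: /private/notes exists but nothing links to it.
SITE_LINKS = {
    "/": ["/about", "/products"],
    "/about": ["/"],
    "/products": ["/products/widget"],
    "/products/widget": [],
    "/private/notes": ["/"],
}

found = reachable_pages(SITE_LINKS)
hidden = set(SITE_LINKS) - found
print(sorted(found))
print(sorted(hidden))
```

The pages left in `hidden` are the ones such a crawl never discovers, although anyone who already knows the URL can still request them directly.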

Indeed.

A few other technical counters include authentication and using different overlay networks.

I get the feeling that we pay more and more for what we don’t want (anti-virus, anti-malware, anti-tracking, ads…), and protection is getting very expensive and complicated. In the end, a lot of sites block us if we don’t comply.

I’m debating with myself if I even need the Internet.

Thanks guys for the helpful tips too.
~s

I don’t know whether we are paying $$ for those things but we certainly pay with our time and attention.

I pay for anti-virus and anti-malware. Really have to with Windows still around. As a consumer, I think I help to pay for the ads I don’t want - it’s in the cost of the products and services we pay for.
Indirectly, Canadians helped pay to elect the President, and send US bombs to Israel.
Here in British Columbia, our electricity supplier charges extra to cover the cost of those hydro users that don’t, won’t or can’t pay their electricity bills, so I help pick up that cost too.

AI is already telling us how to behave, limiting our choices, changing our outlook (or else), … even making medical diagnoses and providing the prescriptions.

I still think the I in AI is I dio t. :rofl:
~s
