- will be operational in 2025
- infrastructure, called European Search Perspectives (EUSP), based in Paris
- infrastructure will be available for other independent search engines and technology companies
Here is a quote from the article that caught my eye:
Hopefully, the AI aspect is limited to improving indexing and search results.
Here’s a more in-depth article: Ecosia and Qwant, two European search engines, join forces on an index to shrink reliance on Big Tech | TechCrunch
Here is a quote from the latter article:
I assume their use of GenAI within their search engines will be to answer queries with text summaries.
Won’t Ecosia and Qwant just become Big Tech then? Even so, anything to reduce reliance on Google is to be applauded.
Likely some of the problems with LLMs and GenAI will just be replicated within the new infrastructure.
No, both entities are focusing only on creating an independent, privacy-first search index, not on creating proprietary operating systems and/or social media platforms. The rest of the infrastructure, notably LLMs, remains heavily reliant on API access from various third parties, which is still just as susceptible to EULAs, price hikes, rate limiting, etc.
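To illustrate that third-party dependence, here is a minimal Python sketch; the endpoint, model name, and environment variable are invented placeholders, not any real provider’s API:

```python
# Minimal sketch of a search feature calling a third-party LLM API.
# The URL, model name, and env var are hypothetical placeholders.
import os
import time

import requests

API_URL = "https://api.example-llm-provider.com/v1/completions"  # hypothetical
API_KEY = os.environ.get("LLM_API_KEY", "")


def summarize(query: str, retries: int = 3) -> str:
    """Ask the provider for a text summary, backing off on rate limits."""
    payload = {"model": "example-model", "prompt": f"Summarize results for: {query}"}
    headers = {"Authorization": f"Bearer {API_KEY}"}
    for attempt in range(retries):
        resp = requests.post(API_URL, json=payload, headers=headers, timeout=30)
        if resp.status_code == 429:
            # The provider decides the quota; all we can do is wait and retry.
            time.sleep(2 ** attempt)
            continue
        resp.raise_for_status()
        return resp.json().get("text", "")
    raise RuntimeError("rate limit exhausted the retry budget")
```

Everything that matters here - the quota, the pricing, the terms - sits on the provider’s side of the API boundary.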
With respect to LLMs etc., I was (very unclearly) commenting on a hypothetical future where even those were brought into the new “sovereign” digital infrastructure. In other words, yes, you are surely right about the situation today, as described in the article, but the logical extension would be to sever all dependencies … and yet there would still be, e.g., “hallucinations”. So it’s a bit like putting lipstick on a pig.
… which would definitely be a step forward but I was considering some of the wider issues with “Big Tech”. In fact, when I see the word “sovereign”, that rings alarm bells.
Right, “European digital sovereignty” is inherently a political term, but the current situation is that most search engines are based in the US:
List of search engines - Wikipedia (on my Wikiless instance)
As quoted earlier, Ecosia and Qwant are not planning to develop AI models themselves, so the reliance on third-party LLMs and their APIs will continue unless stated otherwise. Their plan is to provide the infrastructure for others to use, so I assume that doing so will bring more attention to them and potentially attract European investment to further their mission.
And another chance wasted to create a European open search index. Instead we get something “fancy with AI”. I didn’t expect anything else from companies.
FYI there already is an anonymous privacy-respecting European search engine:
And it’s a pretty good one at that! It integrates nicely with Firefox.
Startpage uses Google as its search engine under the hood.
Are you sure? I thought they used Bing (because in fact Google didn’t want to make a deal with them…)
Absolutely.
OK, that would be good news - Google is still better than any other (unfortunately).
You can use the list of search engines I linked from my Wikiless instance above if you want to compare the currently available search engines. If you want more backstory on my usage of Startpage, you can read about it on the Purism community forums:
Currently I use my own Whoogle instance:
Nice!
Would your Whoogle instance give the same results as a direct Google search?
I would not know, as I stopped directly using Google services a long time ago. What I do know is that my Whoogle instance fetches Google results and has not been rate-limited since I deployed it.
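For anyone curious, a self-hosted Whoogle instance can be queried programmatically like any web page; here is a minimal Python sketch, assuming the default deployment listening on localhost port 5000:

```python
# Minimal sketch: querying a self-hosted Whoogle instance.
# Assumes the default deployment on localhost:5000; adjust to your host.
import requests

WHOOGLE = "http://localhost:5000"

resp = requests.get(f"{WHOOGLE}/search", params={"q": "european search index"}, timeout=10)
resp.raise_for_status()
print(resp.text[:500])  # Whoogle returns an HTML results page, not JSON
```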
… and I would hope that when the project discussed in this topic is complete, Startpage changes over to using its fruits.
Here is another quote from the latter article:
Even if an independent European privacy-first search engine is realized and Startpage migrates over to this “solution”, Big Tech can and will use technical and legal measures to prevent gratis indexing of their proprietary platforms, effectively locking projects out of usable search results unless they pay for the “privilege”.
Up to a point that is reasonable. If the information is only accessible by logging in, and even then only for personal use, then it may be reasonable to prevent crawling or scraping. In any case, a polite web crawler should respect the “robots directives” (e.g. robots.txt) - and a company in this position should be issuing the same.
Where this would get very dodgy is if, say, a Big Tech company offers search and offers web site hosting - and they allow their crawler to crawl those web sites but they prevent any other crawlers from crawling those web sites, or by default they prevent that.
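To make that scenario concrete, here is a minimal Python sketch using the standard library’s robots.txt parser; “BigBot”, “IndependentBot”, and the site are made up for illustration:

```python
# Sketch of a robots.txt that whitelists the host's own crawler only.
# The user agents and site are invented; the parser is from the stdlib.
from urllib.robotparser import RobotFileParser

rules = """
User-agent: BigBot
Disallow:

User-agent: *
Disallow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("BigBot", "https://hosted-site.example/page"))          # True
print(rp.can_fetch("IndependentBot", "https://hosted-site.example/page"))  # False
```

The same parser is what a polite crawler would consult before fetching anything at all.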
I have no problem with a social media site outright preventing all web crawling.
Then there are the free, public web sites that want to prevent scraping in order to monetise their content - but are not averse to being indexed. (That can be handled by the site owner indexing its own content and then handing the index to the crawler.)
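As a rough sketch of that handoff idea, with an invented JSON format (real systems would presumably negotiate something more elaborate), the site owner could build and publish a tiny inverted index like this:

```python
# Hypothetical sketch of the "hand the index to the crawler" idea: the site
# owner builds a tiny inverted index and publishes it as JSON for ingestion,
# so the pages themselves are never scraped. The format is invented.
import json
from collections import defaultdict

pages = {
    "/articles/1": "european search index infrastructure",
    "/articles/2": "privacy first search engine in europe",
}

inverted = defaultdict(list)
for url, text in pages.items():
    for token in sorted(set(text.split())):
        inverted[token].append(url)

print(json.dumps(inverted, indent=2))
```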
You are right, though, that there are many legal and regulatory stoushes between where we are now and any kind of comprehensive European “sovereign” search / indexing capability.