- will be operational in 2025
- infrastructure, called European Search Perspectives (EUSP), based in Paris
- infrastructure will be available for other independent search engines and technology companies
Here is a quote from the article that caught my eye:
Hopefully, the AI aspect is limited to improving indexing and search results.
Here’s a more in-depth article: Ecosia and Qwant, two European search engines, join forces on an index to shrink reliance on Big Tech | TechCrunch
Here is a quote from the latter article:
I assume their use of GenAI within their search engines will be to answer queries with text summaries.
Won’t Ecosia and Qwant just become Big Tech then? Even so, anything to reduce reliance on Google is to be applauded.
Likely some of the problems with LLMs and GenAI will just be replicated within the new infrastructure.
No, both entities are focusing only on creating an independent, privacy-first search index, not on creating proprietary operating systems and/or social media platforms. The rest of the infrastructure, notably LLMs, remains heavily reliant on API access from various third parties, which is still just as susceptible to EULAs, price hikes, rate limiting, etc.
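To illustrate that third-party dependence, here is a minimal Python sketch; the endpoint, model name, and environment variable are invented placeholders, not any real provider’s API:

```python
# Minimal sketch of a search feature calling a third-party LLM API.
# The URL, model name, and env var are hypothetical placeholders.
import os
import time

import requests

API_URL = "https://api.example-llm-provider.com/v1/completions"  # hypothetical
API_KEY = os.environ.get("LLM_API_KEY", "")


def summarize(query: str, retries: int = 3) -> str:
    """Ask the provider for a text summary, backing off on rate limits."""
    payload = {"model": "example-model", "prompt": f"Summarize results for: {query}"}
    headers = {"Authorization": f"Bearer {API_KEY}"}
    for attempt in range(retries):
        resp = requests.post(API_URL, json=payload, headers=headers, timeout=30)
        if resp.status_code == 429:
            # The provider decides the quota; all we can do is wait and retry.
            time.sleep(2 ** attempt)
            continue
        resp.raise_for_status()
        return resp.json().get("text", "")
    raise RuntimeError("rate limit exhausted the retry budget")
```

Everything that matters here - the quota, the pricing, the terms - sits on the provider’s side of the API boundary.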
With respect to LLMs etc., I was (very unclearly) commenting on a hypothetical future where even those were brought into the new “sovereign” digital infrastructure. In other words, yes, you are surely right about the situation today, as described in the article, but the logical extension would be to sever all dependencies … and yet there would still be, e.g., “hallucinations”. So it’s a bit like putting lipstick on a pig.
… which would definitely be a step forward but I was considering some of the wider issues with “Big Tech”. In fact, when I see the word “sovereign”, that rings alarm bells.
Right, “European digital sovereignty” is inherently a political term, but the current situation is that most search engines are based in the US:
List of search engines - Wikipedia (on my Wikiless instance)
As quoted earlier, Ecosia and Qwant are not planning to develop AI models themselves, so the reliance on third-party LLMs and their APIs will continue unless stated otherwise. Their plan is to provide the infrastructure for others to use, so I assume that doing so will bring more attention to them and potentially attract European investment to further their mission.
And another chance wasted to create a European open search index. Instead we get something “fancy with AI”. I didn’t expect anything else from companies.
FYI there already is an anonymous privacy-respecting European search engine:
And it’s a pretty good one at that! It integrates nicely with Firefox.
Startpage uses Google as its search engine under the hood.
Are you sure? I thought they used Bing (because in fact Google didn’t want to make a deal with them…)
Absolutely.
OK, that would be good news - Google is still better than any other (unfortunately).
You can use the list of search engines I linked from my Wikiless instance above if you want to compare the currently available search engines. If you want more backstory on my usage of Startpage, you can read about it on the Purism community forums:
Currently I use my own Whoogle instance:
Nice!
Would your Whoogle instance give the same results as a direct Google search?
I would not know, as I stopped directly using Google services a long time ago. What I do know is that my Whoogle instance fetches Google results and has not been rate-limited since I deployed it.
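For anyone curious, a self-hosted Whoogle instance can be queried programmatically like any web page; here is a minimal Python sketch, assuming the default deployment listening on localhost port 5000:

```python
# Minimal sketch: querying a self-hosted Whoogle instance.
# Assumes the default deployment on localhost:5000; adjust to your host.
import requests

WHOOGLE = "http://localhost:5000"

resp = requests.get(f"{WHOOGLE}/search", params={"q": "european search index"}, timeout=10)
resp.raise_for_status()
print(resp.text[:500])  # Whoogle returns an HTML results page, not JSON
```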
… and I would hope that when the project discussed in this topic is complete, Startpage changes over to using its fruits.
Here is another quote from the latter article:
Even if an independent European privacy-first search engine is realized and Startpage migrates over to this “solution”, Big Tech can and will use technical and legal measures to prevent gratis indexing of their proprietary platforms, effectively locking projects out of usable search results unless they pay for the “privilege”.
Up to a point that is reasonable. If the information is only accessible by logging in, and even then only for personal use, then it may be reasonable to prevent crawling or scraping. In any case, a polite web crawler should respect the “robots directives” (e.g. robots.txt) - and a company in this position should be issuing the same.
Where this would get very dodgy is if, say, a Big Tech company offers search and offers web site hosting - and they allow their crawler to crawl those web sites but they prevent any other crawlers from crawling those web sites, or by default they prevent that.
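To make that scenario concrete, here is a minimal Python sketch using the standard library’s robots.txt parser; “BigBot”, “IndependentBot”, and the site are made up for illustration:

```python
# Sketch of a robots.txt that whitelists the host's own crawler only.
# The user agents and site are invented; the parser is from the stdlib.
from urllib.robotparser import RobotFileParser

rules = """
User-agent: BigBot
Disallow:

User-agent: *
Disallow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("BigBot", "https://hosted-site.example/page"))          # True
print(rp.can_fetch("IndependentBot", "https://hosted-site.example/page"))  # False
```

The same parser is what a polite crawler would consult before fetching anything at all.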
I have no problem with a social media site outright preventing all web crawling.
Then there are the free, public web sites that want to prevent scraping in order to monetise their content - but are not averse to being indexed. (That can be handled by the site owner indexing its own content and then handing the index to the crawler.)
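As a rough sketch of that handoff idea, with an invented JSON format (real systems would presumably negotiate something more elaborate), the site owner could build and publish a tiny inverted index like this:

```python
# Hypothetical sketch of the "hand the index to the crawler" idea: the site
# owner builds a tiny inverted index and publishes it as JSON for ingestion,
# so the pages themselves are never scraped. The format is invented.
import json
from collections import defaultdict

pages = {
    "/articles/1": "european search index infrastructure",
    "/articles/2": "privacy first search engine in europe",
}

inverted = defaultdict(list)
for url, text in pages.items():
    for token in sorted(set(text.split())):
        inverted[token].append(url)

print(json.dumps(inverted, indent=2))
```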
You are right, though, that there are many legal and regulatory stoushes between where we are now and any kind of comprehensive European “sovereign” search / indexing capability.