I’m going to start using Gigablast.com as my default daily search engine for the next 3 weeks or so.  Gigablast does not bill itself as a privacy search engine so I figure they are tracking some behavior.  However, they are not in the advertising business so I don’t think it matters too much for me.

Once again, I invite you to do the same with me and let me know what you think of Gigablast.  Just set Gigablast as your default search engine and use it first every time you search.

Liked this post? Follow this blog to get more. Follow

I have always thought web directories should get backfill from a search engine when you search them, especially general directories.

Yahoo (dubbed “Mighty Yahoo”) was the first directory to add a backfill provider to its search results. First Alta Vista, then Inktomi, then Google, then Inktomi again provided backfill results for Y!.  Snap/NBCi directory had backfill and so did Looksmart.

Here is how backfill worked: somebody would search the directory, and when the directory ran out of listings that matched the query, they would see search results from the backfill provider.  (Being Mighty Yahoo’s backfill provider cemented Google’s reputation and proved to be one of the biggest mistakes Yahoo ever made.)  What mattered was that the search portal had to return something that would satisfy the person searching, otherwise they would start going elsewhere. And they could go elsewhere because there was genuine competition in web search back then.
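For any coder picking this up, the flow described above can be sketched roughly as follows. This is a minimal sketch, not how Yahoo or any real directory implemented it. The names `search_with_backfill`, `directory_search`, `backfill_search`, and `page_size` are hypothetical, and the two search functions stand in for whatever directory database and search feed you actually have.

```python
from typing import Callable, List


def search_with_backfill(
    query: str,
    directory_search: Callable[[str], List[str]],
    backfill_search: Callable[[str], List[str]],
    page_size: int = 10,
) -> List[str]:
    """Return directory listings first, then top up from the backfill provider.

    Both search callables are assumptions: each takes a query string and
    returns a list of URLs, best match first.
    """
    # Directory listings always come first.
    results = list(directory_search(query))[:page_size]

    # Only call the backfill provider when the directory ran out of listings.
    if len(results) < page_size:
        seen = set(results)
        for url in backfill_search(query):
            if url not in seen:  # do not repeat a site the directory already listed
                results.append(url)
                seen.add(url)
            if len(results) == page_size:
                break
    return results
```

The backfill provider is only queried when the directory comes up short, which also keeps the number of paid search feed calls down.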

It gets harder for niche directories.  For example: if you go to a Star Trek niche directory and search for “uniforms,” the assumption is that you only want uniform sites that are related to Star Trek.  In this instance you don’t want a general backfill giving results for police uniforms or nurses’ uniforms.

For backfill to work on a niche subject directory you almost need a place to put in an extra search “slug” to bring it into context.  So in our example, you need to have a place in your admin panel to add “star trek” as a term to the backfill, so that somebody searching for uniforms on your Star Trek directory is automatically searching for “star trek uniforms” on the backfill.  The “star trek” is always added to the backfill.
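The slug idea could look something like this in code. Again, this is only a sketch under assumptions: `contextualize_query` and `context_slug` are made-up names, and the slug would really come from a setting in the directory's admin panel. One design choice here goes slightly beyond the description above: the slug is skipped when the visitor already typed it, so the backfill does not receive “star trek star trek uniforms.”

```python
def contextualize_query(user_query: str, context_slug: str) -> str:
    """Prefix the directory's context slug to a query bound for the backfill.

    `context_slug` is a hypothetical admin setting, e.g. "star trek".
    """
    # Collapse stray whitespace in both the query and the slug.
    query = " ".join(user_query.split())
    slug = " ".join(context_slug.split())

    # No slug configured, or the visitor already typed it: send the query as is.
    if not slug or slug.lower() in query.lower():
        return query
    return f"{slug} {query}"


# Usage: only the backfill query gets the slug, the directory search does not.
backfill_query = contextualize_query("uniforms", "star trek")
```

The directory's own search still receives the plain query, since every listing there is already about the niche.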

It gets even harder for a local or geographic directory.  Frankly, I think backfill should not be used with a local directory.

I did some looking around.  Google and Bing are way too expensive to get a search feed from.  Smaller engines like Mojeek.com (or Mojeek.co.uk) and Gigablast.com provide search results APIs.  Both are affordable, but Gigablast’s is really, really affordable.

Of course, I don’t know how to actually code this.  That’s what coders are for.  😀

But the trick here is this:

  • The more listings you get in your directory the more users will use the search function.
  • You have to have a search function.
  • No matter how sophisticated your directory search is, you have a finite number of listings.
  • Mobile devices make the search function even more important.
  • This is why most will need backfill.


First, let me say, I understand that the IndieWeb movement already has a lot on its plate and that it has already accomplished a lot.

It seems to me that, at some point, the problem of commercial silos in web search must be addressed, since 1. Google holds a near monopoly, and 2. both spidering engines (Google and Bing) are oriented towards brands, data mining of user profiles, advertising and the commercial.

How we build websites today is largely controlled by what Google likes and dislikes.  If you don’t build the site Google’s way, you don’t rank in Google.  If you don’t rank in Google, you might as well be dead.  It has happened slowly over time, but Google has warped the Web into its own image. We don’t build websites the same way we did Before Google (BG).

But there are alternative search engines, and I like and use several of them.

DuckDuckGo – protects user privacy. Draws search results from many sources. Bing is the backbone of their search results.  They do have their own spider and index but I’m not sure how large that is.  If Bing should cease operating, DDG will be in trouble.

StartPage – protects user privacy. Is basically a Google feed stripped of geolocation and personal search history data. But if Google ever turns off access to the search feed, StartPage is gone.

HotBot – claims to protect privacy.  Appears to be a straight-up Bing feed, stripped of advertising and tracking customization.  Clean, minimalist results pages. But if Bing should cease operating, HotBot will go down.

As much as I like DuckDuckGo and StartPage, both depend on the search indexes of large siloed companies.  With only two major search engines (Google and Bing) that spider the web, it seems like an unhealthy state of affairs.

What can the IndieWeb do?

Off-the-shelf solutions

There are a couple of smaller search engines that are trying to stay alive.

Mojeek.com – privacy protecting. UK based. Active spidering as money permits.  Has potential.

Gigablast.com – Open Source code. Active spidering as money permits.  Can index URLs very quickly. Has potential.

Both need larger databases. Both need funding. Both need some R&D to improve search results and ranking algorithms.

One option might be for the IndieWeb to campaign for private donations for one or both of these independent search engines. Publicity within the movement would help.

Build Our Own IndieWeb Search Engine(s)?

Here are a few resources available “off the shelf.”

Searx – (running instance). Open Source metasearch script.  At best this would be a stopgap solution.  One major problem is that it scrapes results from other engines without permission.  Sooner or later that will get shut down.

Gigablast – The script is Open Source.  Gigablast, as is, is quite impressively capable if one can afford to keep it indexing and provide the coding to fork it and improve it.

More Open Source and P2P Search Engines.

Curlie – is the revival/continuation of the Open Directory Project.  Large meta directories have had their day but Curlie could provide two things: 1. a big index for a starting crawl for a new search engine, 2. a ranking indicator of quality for web pages. Human-reviewed collections like ODP were used by Google in the early days as a quality indicator in ranking and can be used by new engines.

Wikipedia – 1. Using Wikipedia itself to answer queries, 2. Wikipedia contains a lot of outside links, so it would be another place to use as a starting crawl.
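To make the “starting crawl” and “quality indicator” ideas a little more concrete, here is a rough sketch. It is not how Google or any existing engine does it. The function names, the `boost` value of 1.5, and the idea of matching on hostname are all assumptions for illustration, and the seed URLs would have to come from a Curlie or Wikipedia export you obtain yourself.

```python
from typing import Iterable, Set
from urllib.parse import urlparse


def reviewed_hosts(seed_urls: Iterable[str]) -> Set[str]:
    """Collect the hostnames found in a human-reviewed list of URLs."""
    hosts = set()
    for url in seed_urls:
        host = urlparse(url).hostname  # lowercased by urlparse, None if missing
        if host:
            hosts.add(host)
    return hosts


def rank_score(base_score: float, url: str, reviewed: Set[str],
               boost: float = 1.5) -> float:
    """Multiply a page's base score when its host was human reviewed.

    `boost` is an arbitrary illustrative value, not a tuned number.
    """
    host = urlparse(url).hostname
    return base_score * boost if host in reviewed else base_score
```

The same seed list does double duty: the URLs are the first pages the spider visits, and the set of hostnames becomes a small trust signal at ranking time.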

Guerrilla Discovery Options

Webrings and Blogrolls – Webringo is free and seems underutilized.  Why not use it to create web rings for discovery?  Blogrolls, hey, why not?

Directories and Filter Blogs –  niche directories can still work when tied to an interest community (just ask almost any local Chamber of Commerce).  Filter blogs might work too. Boingboing is an example of a filter blog that leads you to things posted on the web.  Most bloggers already do some filter blogging.

I’m sure this has already been discussed within the IndieWeb community.  Coming up with a full-fledged search engine would be a monumental and expensive task.  But I’ve also tried to lay out smaller interim steps that will gain experience or help break the corporate silos.

Agree? Disagree? What am I missing? Feel free to comment.

 

 

 
