Ansearch answers

All had been quiet on the Ansearch front as I awaited a response from Ansearch CEO Dean Jones. He promised one a hair under two weeks ago, after I alluded to an earlier critique I’d written of the then-newcomer Ansearch while talking about the state of play with Australian search engines.

Dean picked up my post via Tech­no­rati, a blog search engine that uses RPC update ser­vices to track what peo­ple are talk­ing about in real-time. I was suit­ably impressed by this dili­gence and appar­ent desire to hear what the mar­ket has to say about their prod­uct: could this be the same com­pany whose birth was so marred by a spat of cyber-squatting, in what Dean Jones was reported to have described as a fit of “youth­ful exu­ber­ance”?

Apparently so. Ansearch’s beginnings, though marred by dubious practices[1], received praise from various quarters of the mainstream press (or, at least, those quarters not controlled by News Corp, whose domains had come under threat). However, the Internet community responded quietly, and those voices that were heard were mostly of disdain at Ansearch’s domain practices.

Strangely enough, my orig­i­nal post wasn’t about any of that. I hadn’t heard of Ansearch until I read an arti­cle on them in the SMH — an arti­cle which reads a lit­tle too much like a rehashed press release for my lik­ing: the tell­tale sign is in the clos­ing sen­tence “Ansearch is the search engine divi­sion of Optum Ltd.” — if it were filed in the Busi­ness sec­tion of their paper, I’d under­stand, but it wasn’t.

I wandered over to their site, played around for a bit, and decided their offering was mediocre. In hindsight, it probably didn’t help that I wasn’t shopping for anything in particular: according to a ZDNet article, “In the short term [Ansearch] is focusing very heavily on the commercial end of the market.” At that point in time, though, I also don’t think they’d tuned their listings particularly well, as a search for DashLite turned up my WordPress hack over commercial listings for the actual Dashlite brand whose name I had inadvertently used.

I say “at that point in time”, because it appears to have sub­stan­tially improved since, as per Jones’ claim: “Much has changed since your first arti­cle on us some 6 months ago.”

Much improved, it seems, on sev­eral fronts. Their core offer­ing has shaped up nicely, and some facets of my ini­tial com­plaints regard­ing acces­si­bil­ity have been met. Their ancil­lary prod­uct offer­ings seem to have devel­oped nicely: Ansearch CEO Jones claims “Each of [our prop­er­ties] goes through up to 7 stages rang­ing from an ini­tial, sim­ple SERP/Directory style page through to a more involved ser­vice, mini por­tal, search tool, etcetera.” He went on to say that these ancil­lary prop­er­ties (such as http://www.picsearch.com.au/, http://www.videosearch.com.au/, http://www.thefreedictionary.com.au/ and http://www.messengers.com.au/ amongst sev­eral oth­ers) are cur­rently being actively sep­a­rated from the core Ansearch site (he described it as “quar­an­ti­ning”), and the exact direc­tion of a num­ber of these projects would become clear over the com­ing months, with the appoint­ment of a full time man­ager of these online properties.

I’m a tad con­cerned about his descrip­tion of their strat­egy with regard to these — he said this would become clear over the months to come, and I’m hang­ing off two words here: dis­trib­uted por­tal. Whilst I can see this as being of value to users (espe­cially for generic, non-brand-specific/legally dubi­ous domains such as jokes.com.au and the ones listed above), it doesn’t seem to fit Ansearch’s core strength as I per­ceive it: as a com­mer­cial por­tal, and not as another Google. “We are not aim­ing to be another Google… we don’t have their bud­get and, to be frank, there are enough peo­ple try­ing to clone them: why build another?”

In fact, Jones sug­gested that Ansearch’s strengths lie in that it is not the ubiq­ui­tous search behe­moth, and that its index is “some­thing unique… some­thing faster… [and] against the so called “arms race” of search (my SE has more links than yours etc…)”. I’d agree this is indeed a strength, and also a rea­son for them not to try and be a por­tal. Aus­tralia already has Yahoo! and NineMSN for domes­tic por­tals, and I’m strug­gling to see what Ansearch will do to dif­fer­en­ti­ate them­selves in this: but I’m happy to be surprised!

Ansearch apparently holds an index of only 500,000 websites considered by its metrics to be “most popular”. I argued that this was potentially a bad thing, as relevant content might lie outside this realm: for example, this website performs well when people search for reviews of the HP 2610, information about Apache on Ubuntu Linux, or ACT files from MP3 players that record audio, but isn’t included in Ansearch’s core index.

Which is perfectly valid for a commercially-focussed site; I just think they could be missing out a little bit. They could leverage my content for their advertising impressions and potential clickthroughs, because they would have more valuable content showing up in their listing alongside advertised products. If someone reads my HP 2610 review after having found it in Ansearch, decides they’d like to buy the printer, and remembers having seen a “Buy HP printers!” ad on Ansearch, they’ll most likely click “back”. It’s abstract, behavioural stuff, but valuable nonetheless.

Whether it’s valu­able enough for them to bother is another mat­ter. “We spi­der our own con­tent… some­thing that over time will be done daily,” says Jones. “Hav­ing only 500,000 web­sites will allow us to index sites more often, and as is the case with the ‘site info’ pages, pro­vide far more info on these pages.” Which is a value-add, and worth pre­serv­ing. If that’s all resources per­mit, I think they’re doing the right thing as is. Jones openly admits Ansearch’s index of pop­u­lar­ity “has a com­mer­cial flavour to it” — and rightly so. Given their much-touted gen­der and age demo­graphic based search fea­ture, this makes sense.

Their index of pop­u­lar­ity seems to be fairly slow-moving. “Monthly we add around 20,000 sites… and take out 20,000.” I’d guess this would be the low­est 20,000 that gets shuf­fled, and this seems to make sense. One has to won­der whether all the higher-ranking pages can have sub­stan­tially fresh con­tent month after month, but pre­sum­ably they do — it’s one of the things the SEO experts have always cried from rooftops.

It was interesting to hear Jones speaking about these people, too: amusing, even! Web developers the world over often join in speculation as to what exactly makes search engines tick, such that we can boost our clients’ (or employers’) websites’ performance. It seems the reverse is also true: search engines all over the world similarly speculate as to what those horrible developers are doing to screw with their indexes day in and day out!

I don’t say this in jest, and I believe they’re right to com­plain: “The larger SE’s are hav­ing a very tough time com­ing up with clever ways to index con­tent to counter SEO… only to have SEO’rs quickly find ways around it. Cat and mouse…” I think “counter SEO” was a poor choice of words, given that rel­e­vant con­tent should hope­fully still be rewarded, but his point stands.

Just as inter­est­ing is Ansearch’s strat­egy to avoid falling prey to dodgy SEO tactics:

By only index­ing the root page, we remove almost all SEO trick­ery. This works in 2 ways. Firstly, peo­ple rarely put spam on their home page — that is, door­way pages, link farms, etc. usu­ally reside away from the main index… and, sec­ondly, it deletes mul­ti­ple results from the same web­site. It also stops the site owner/webmaster from say­ing they are rel­e­vant to 100 or 1000 key­words or phrases.

Kids, we just found a new argu­ment against clients who love their splash pages!
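To make the second half of that quote concrete, here is a minimal Python sketch of what “root page only, one result per website” could look like. It is purely illustrative and not Ansearch’s code: the function name `root_pages` and the sample URLs are my own inventions, and a real crawler would need to handle redirects, `www.` variants and so on.

```python
from urllib.parse import urlsplit


def root_pages(urls):
    """Collapse a list of URLs to one root page per host, in first-seen order.

    Hypothetical helper for illustration only; this is an assumption about
    how "index only the root page" might be approximated, nothing more.
    """
    seen = set()
    roots = []
    for url in urls:
        parts = urlsplit(url)
        host = parts.netloc.lower()  # host names are case-insensitive
        if not host or host in seen:
            continue  # skip malformed URLs and hosts we already have
        seen.add(host)
        roots.append(f"{parts.scheme}://{host}/")
    return roots


# Made-up sample data: doorway pages and link farms live away from the root.
crawled = [
    "http://example.com.au/",
    "http://example.com.au/doorway/cheap-printers.html",
    "http://EXAMPLE.com.au/links/farm.html",
    "http://another.example.net/blog/post-1",
]
print(root_pages(crawled))
# ['http://example.com.au/', 'http://another.example.net/']
```

Note what this does to the doorway page and the link farm: they never reach the index at all, which is the whole point Jones is making.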

Con­tent rich front pages aren’t, how­ever, an absolute solu­tion (at least, not in Ansearch’s index). Accord­ing to Jones, Ansearch’s pol­icy of “rank­ing sites in true usage pop­u­lar­ity, both on and off­site” is “SEO proof… or at the very least, extremely resis­tant.” I’d agree it’s a pow­er­ful met­ric, but my reser­va­tions above still stand.

One caveat of Ansearch’s algorithm that appears potentially exploitable is its failure to exclude content in the <head> from indexing. I don’t just speak of standard meta author/keywords data, but of something else.

A screenshot highlighting the inclusion of information between style tags in Ansearch's index

As highlighted in the screenshot above (click for original page, link may expire), Ansearch’s listing is including content between <style> tags. This presents potential for SEO abuse[2], as most browsers happily overlook errors in CSS, and <style> tags can be placed towards the top of a document, which (if we are to believe the SEO myths) increases their relevance in engines. Of course, it’s entirely possible the content bears no weight at all, but the question of why it is stored in their index at all remains unanswered.
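For the curious, here is a rough Python sketch of how an indexer might drop <style> (and <script>) content before storing a page’s text. It uses only the standard library’s `html.parser`; the class name `VisibleText`, the helper `visible_text` and the sample page are all made up for illustration, and I have no idea how Ansearch’s spider is actually built.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes, ignoring anything inside <style> or <script>.

    Illustrative sketch only; the names here are hypothetical.
    """

    SKIP = {"style", "script"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def visible_text(html):
    """Return the indexable text of a page as one space-joined string."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Made-up page with keyword stuffing hidden in a CSS comment.
page = """<html><head><title>Printer review</title>
<style>body { color: black; } /* cheap printers buy printers */</style>
</head><body><h1>HP review</h1><p>Prints well.</p></body></html>"""
print(visible_text(page))
# Printer review HP review Prints well.
```

The stuffed keywords in the CSS comment never make it into the stored text, while the title and body copy do.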

This is another reason to reward websites that use semantic markup properly, though at this stage that would exclude disproportionate amounts of the web, so I understand engines’ hesitance to embark on anything like this. “It’s not something a lot of sites use”, says Jones, before continuing “but it will be used more and more in the future.” Well, so much of the web community hopes.

This formed part of Ansearch’s defence for not having embraced semantic markup from the outset. According to Jones, it’s built on a technology developed for a pre-April 2000 (dot com crash) search engine, so that partially excuses the markup at launch time. Jones’ first comment on their failure to use semantic markup was simply that “The majors [Google and Yahoo!] don’t use it”. I’d dispute the validity of that argument, as Ansearch isn’t a “major” player and, as has been established, is chasing a fairly different market sector. Their core business is search, but it’s a different breed of search conducted in a different way: and semantic markup and accessibility are a different way. Encouragingly, Jones sees the potential for embracing semantic markup in the future on both technical and commercial grounds: “It makes sense to use it and as it does open us to a wider audience with various devices used to browse our site.”

He didn’t cite the “reduced bandwidth expenditure as a result of lightweight code” reason, presumably because their host, OzHosting/Destra, charges only for the link, not for transfers over it, on their dedicated server range.

Irre­spec­tive of their rea­sons, the future of Ansearch in terms of markup is promising:

Our long term goal is to have Ansearch web­site designed with­out any tables and heav­ily styled using the CSS, which even­tu­ally will gives us more con­trol on how we present our site to dif­fer­ent media types.

Ansearch has gone through sev­eral minor enhance­ments over the past 6 months with the releases of ver­sions 1 to 1.3. We are cur­rently plan­ning a major update for ver­sion 2.0 and the issues [of seman­tic markup and sep­a­ra­tion of pre­sen­ta­tion and con­tent] will be addressed.

But as we know, markup isn’t every­thing: con­tent is what ranks well in search engines erm… con­tent is what draws an audi­ence. Ansearch’s explo­ration into the devel­op­ment of por­tal envi­ron­ments is some­thing to be watched with inter­est over the com­ing months, as well as its other busi­ness aspects, includ­ing an adver­tis­ing net­work known as Soush that remains slightly enig­matic, and the mys­te­ri­ously named “Fac­tory” division.

An announcement is expected to be filed with the ASX later this week outlining something of Ansearch’s future direction. At this stage, I’m inclined to believe that the future is a positive one, as Ansearch moves away from its much-criticised practices at launch and towards a diverse range of product offerings that uniquely fulfil the needs of Australian Internet users.

Update: A fol­lowup to this has been posted, in response to a crit­i­cism that this review was overly tech­ni­cal in nature. Read on!

Notes

[1] Justified with the catch-cry “MSN do it, so we can, too!”, to which the only sensible reply is, “yes, but MSN do it with Internet Explorer, and as soon as you go and write your own web browser, feel free to hijack as many unused pages as you want.”
[2] I notified Ansearch of this shortly prior to publication in the hope that, if this is indeed an issue, it will be resolved before this post is noticed and widely acted upon. One hopes this potential problem disappears quickly.

Creative Commons Australia

I’m gratified to see an internationalisation work in progress for Creative Commons licensing schemes, specifically, the production of a document for here, in Australia (for obvious reasons, I have something of a vested interest in the integrity of this licence in my place of residence!). If only for reasons of correct domestic spelling, it’s enjoyable to read the Australian version over foreign (typically, United States) versions of the document!

Firefox 1.0

Firefox - Rediscover the web
Fire­fox 1.0, the stand­alone web browser tak­ing the world by storm, has been released today amidst much antic­i­pa­tion from the Inter­net com­mu­nity at large. This standards-compliant, secure and cross-platform browser has, even before its final ver­sion 1 release, received more than eight mil­lion down­loads globally.

If you haven’t already made the switch, visit http://getfirefox.com/ and redis­cover the web today!

I’ve got a mod­er­ately large poster (sized some­where between A3 and A2) printed out, and plan on propaganda-ing my part of UNSW tomor­row morn­ing… if you’ve got noth­ing bet­ter to do with your time tomor­row, engage in some propaganda-ing of your own!

# by Josh on November 10th, 2004