DNS oops

I may have forgotten to set up joahua.com to point to the new web server when I moved josh.st across. My bad. I changed it over a day or two ago (I forget exactly when) and now the old addresses work. I will probably get lots more search engine love accordingly, as all those old links that stopped working start functioning again. Something else that would probably get search engine love is posting new content, but it’s so easy to get lazy and not bother. Sigh. At any rate, after bothering to post some stuff my Adsense revenue actually did something this week for the first time in months. And I’m pretty sure none of the regulars even click the ads, so there we go! Don’t quite know how that happened, but… cool.

The magic 1st-cheque mark is approaching kind of like a curve approaches a line it never touches. I seem to recall this is something to do with Limits, but I never actually studied Calculus at all, and I know Calculus has something to do with it… I seem to recall functions made sense only because I already understood them in the context of programming random stuff… maybe Adsense can teach me maths!

by Josh on September 12th, 2007

RI revisited, Web standards, AJAX, LDAP and architecture

I vis­ited Raw Ideas today and was really quite excited by what I saw. They’re about to move office again so I was pop­ping in to return the keys (I still had them even though I haven’t worked there for sev­eral months now) and gen­er­ally catch up. Tino was work­ing on a tape library appli­ca­tion for archiv­ing DVCPro and Mini DV and HD(V, mostly) footage in a really search­able and gen­er­ally more-manageable-than-shelves-full-of-labels kind of way, and he was pretty keen to show it off. Freakin’ awe­some stuff. Aside from some DHTML gim­micks (fad­ing rollovers, etc., stuff that you think is cool when you’re devel­op­ing it but does noth­ing but irri­tate you once you have to sit down and finally use the appli­ca­tion for five min­utes!) it was great to see he’s using Scrip­tac­u­lous for some gen­uinely use­ful AJAX-based functionality.

Because it’s a library, it’s basi­cally one big search engine. Which means that auto­com­plete is a really handy thing to have, and being able to click on a piece of infor­ma­tion and edit it straight away (so, tak­ing plain text and con­vert­ing it into a textarea or input field for edit­ing imme­di­ately, with­out a sep­a­rate admin view) is absolutely price­less for mov­ing through a library quickly. This is so the way con­tent edit­ing should be head­ing — I’m hop­ing we all get there in the end.

But even more excit­ing than Javascript usabil­ity gim­micks was to see that he’s still using CSS, now more exten­sively and with­out assis­tance, and with kick-arse seman­tics. I looked at the source of his page quickly and the only com­plaint I had was his use of a span for a header instead of an Hx… totally won­der­ful to see a few months after the res­i­dent stan­dards nazi (that would be me) has taken off!

So we threw around ideas about that (includ­ing rip­ping time­code off DV tape and try­ing to set marker points, import­ing EDL’s for use inside the library, automat­ing transcod­ing processes and export­ing H.264 or FLV for pre­views, and a cou­ple of other equally fun things), then even­tu­ally started chat­ting about what I’m doing over here at Youth­works these days.

I think I made him kind of jeal­ous. I’ve seri­ously got one of the best jobs in the web devel­op­ment world right now. I get to come up with stuff that’s gen­uinely use­ful for users (and pro­duc­tive for the Gospel, yada yada — that’s the implicit goal of all of this), entirely in response to their needs, with­out being bur­dened in par­tic­u­lar by his­tory, or legacy sys­tems that need to inte­grate, or any major com­peti­tors — it’s won­der­ful. So we started talk­ing about plat­forms and what­ever and I said I was con­sid­er­ing Django (and got a big tick accord­ingly, which was nice) with an RDBMS (i.e. MySQL, just because that’s pretty much all I have expe­ri­ence with inso­far as DBs go) but then out­lined a bit more about the project and he rec­om­mended an LDAP sys­tem pretty strongly.

LDAP is a directory-based database which is strongly hierarchical and fine-grained in nature. Which is bloody useful when you’ve got a user structure five layers deep:

Simple CYIADA universe

But, of course, moderators do not “contain” leaders any more than leaders “contain” youth. All of these tiers exist independently of one another. They are internally defined by their extrinsic relations, even though their user experience of the website will vary depending on their hierarchical position. The latter makes LDAP seem entirely sensible, but the former definition of personal identity (that is, what constitutes a “self” or independent user entity, identified by a Distinguished Name in LDAP-speak) seems to rail against that directory concept.

“Mod­er­a­tor” is, in fact, a prop­erty of “Leader”. That is, it is a qual­ity belong­ing to the user, who belongs to the group “leader”. Users should be unique and belong to an Organ­i­sa­tional Unit (again, in LDAP speak) that reflects their role within the sys­tem. Thus, mod­er­a­tor­ship gen­er­ally will neces­si­tate belong­ing to two OUs: one does not cease to lead within their own group con­text if they are appointed as a sitewide mod­er­a­tor — like­wise, mod­er­a­tors may be appointed who do not have any for­mal role as a leader of a youth group. (This prob­lem may be cir­cum­vented by cre­at­ing such users at a CYIADA Global admin­is­tra­tion level, instead — for exam­ple, I do not lead a youth group in the tar­get demo­graphic, and I vol­un­teer to edit con­tent occa­sion­ally: I am not the web­mas­ter admin­is­tra­tor (hypo­thet­i­cally), but require mod­er­a­tion pow­ers with­out being a leader asso­ci­ated with any group).
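To make that a bit more concrete, here is a rough Python sketch (plain data structures, no real LDAP library involved) of one user entry that keeps a single identity while carrying more than one role. The DN strings, the role names, the group name and the `can_moderate` helper are all invented for illustration; this is one way the idea could be modelled, not a description of how the real system is built.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class User:
    # One unique entry per person, identified by a DN-style string (hypothetical).
    dn: str
    # Roles are qualities of the user, not containers the user lives inside.
    roles: Set[str] = field(default_factory=set)
    # Youth group the user is attached to; None for site-wide people.
    group: Optional[str] = None


def can_moderate(user: User) -> bool:
    """A user may moderate if they hold the moderator role, leader or not."""
    return "moderator" in user.roles


# A leader who has also been appointed as a sitewide moderator.
leader_mod = User("uid=anna,ou=people,dc=example,dc=org",
                  {"leader", "moderator"}, "example-youth-group")
# A moderator with no formal role as a leader of any group.
global_mod = User("uid=josh,ou=people,dc=example,dc=org", {"moderator"})
# A leader with no moderation powers.
plain_leader = User("uid=ben,ou=people,dc=example,dc=org",
                    {"leader"}, "example-youth-group")
```

Keeping roles as a set on the user means appointing a moderator never moves or duplicates the entry, which is the property the paragraph above is after.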

CYIADA universe with groups

Groups, of course, pose their own set of stupid difficulties. They appear to have no hierarchy at all: indeed, even where they could (for example, a Katoomba Convention branch with KYCK, KYLC, KEC, etc. sub-branches, or a CMS branch with Summer School, MMM, etc. sub-branches) this isn’t particularly useful (and, consequently, not desirable).

They don’t constitute OUs, because OUs have already been used to assign roles (probably a bastardisation of standard X.520 practice, but so much of this will be that I don’t particularly care). The only way I could see it working would be by defining multiple Organi[s/z]ation components, but even then…

I don’t know. My head has been in rela­tional data­base space for so long I want every­one to have a numeric iden­ti­fier link­ing them to another table chock full of organ­i­sa­tion records. It makes me com­fort­able. But then, LDAP would man­age authen­ti­ca­tion and roles, if not asso­ci­a­tion, and appears to gen­er­ally have poten­tial to make life a lot eas­ier. So per­haps there’s some way to con­nect direc­tory and RDBMS happily?
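For what it’s worth, one possible shape for a happy directory/RDBMS marriage is to let the directory answer “who is this and what may they do?” and let a relational table answer “which organisations do they belong to?”, joined on a shared user id. The sketch below is only that, a sketch: the `DIRECTORY` dict is a stand-in for a real LDAP lookup, and the table names, column names and sample rows are all made up.

```python
import sqlite3

# Stand-in for the directory: uid -> roles. In a real system this would be
# an LDAP query; a dict is used here purely as an assumption for the sketch.
DIRECTORY = {
    "anna": {"leader", "moderator"},
    "josh": {"moderator"},
}


def build_db():
    """Create an in-memory relational store for organisation membership."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE organisation (id INTEGER PRIMARY KEY, name TEXT)")
    db.execute(
        "CREATE TABLE membership ("
        "uid TEXT, organisation_id INTEGER REFERENCES organisation(id))"
    )
    db.execute("INSERT INTO organisation VALUES (1, 'Example Youth Group')")
    db.execute("INSERT INTO membership VALUES ('anna', 1)")
    return db


def profile(db, uid):
    """Combine roles from the directory with associations from the database."""
    rows = db.execute(
        "SELECT o.name FROM membership m "
        "JOIN organisation o ON o.id = m.organisation_id "
        "WHERE m.uid = ? ORDER BY o.name",
        (uid,),
    ).fetchall()
    return {
        "uid": uid,
        "roles": sorted(DIRECTORY.get(uid, set())),
        "organisations": [name for (name,) in rows],
    }
```

Calling `profile(build_db(), "anna")` merges both sources into one record, while a user like `"josh"` simply comes back with roles and an empty organisation list.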

Feed­back more than wel­come. I’m not wor­ried about plat­form specifics, just about the the­o­ret­i­cal archi­tec­ture of such a beast (and my con­cep­tion of LDAP in gen­eral). If you’re read­ing this and know any­thing about OpenL­DAP or AD or RHCS or any other plat­form, or just know about con­nect­ing to exist­ing sources and extend­ing them, please leave a com­ment and make me happy :-)

Why MSM and open paradigms don’t mix

Microsoft are now play­ing ball. They’re “get­ting” this whole clue­train gig, even for­mal­is­ing their enact­ment of it into a con­fer­ence billed as a 72-hour con­ver­sa­tion. They’re doing blogs. They’re lightly, if at all, mod­er­at­ing those blogs. And they’re respond­ing to con­tent on those blogs as appro­pri­ate (that is, ignor­ing the absolute rub­bish and closed-mind-open-source-supporting-nerds).

In every way what they’re doing and what they’re chang­ing is absolutely awe­some. As an IT com­pany maybe it’d be fair to say they’ve got a head­start on the rest of the world. They’re cer­tainly doing bet­ter than MSM are.

Say, for example, there was a social networking/photo site to be integrated into a TV programme’s community site: one that’s meant to actually connect with viewers, and falls under “Community” in the network’s structure (not the one that mindlessly pushes top-down content). And say that, because of concerns about moderation (chiefly stemming from the notion that public identities are untouchable and sacred in the network’s eyes, and the arrogance that comes as a part of that), the only advantages (politics and free bandwidth because of deep-linked photos aside) of integrating an external photo service are negated, and users have absolutely no incentive to sign up for a wider Yahoo! sign-on (which would allow them to comment on photos at Flickr, amongst other things).

So MSM struc­tures are still win­ning. I expected this would be the case. I think it’s going to take another five years before peo­ple can get over them­selves enough to realise that allow­ing peo­ple to com­ment (not anony­mously — that was never on the cards!) isn’t an intrin­si­cally dan­ger­ous thing. The idea that the greater fool is the one stop­ping to make flip­pant dis­parag­ing (even just seem­ingly so!) remarks about peo­ple they’ve never met is, in fact, turned on its head by the recog­ni­tion of such remarks. To acknowl­edge a fool’s power surely isn’t the most intel­li­gent thing one could do in response.

I digress. The point is, for as long as they’re thinking they have any chance of controlling what’s going on, this isn’t going to work. Wanna stop people commenting on a photo you stuck up on Flickr? Sure thing, feel free to disable it. If the comment is of consequence they’ll blog it anyway and the damage is out there and you’ve got a hell of a lot more work to do if you want to purge that blight on your carefully-constructed-cult-of-celebrity-image from the web… and if it’s not of consequence they won’t bother to publish it anywhere else, and, in all probability, it wouldn’t have done a great deal of harm were it to be published in the photo’s comments anyway. In many ways, inline commenting is actually a more restrictive form of social interaction in the online sphere because it’s centralised. I’m advocating it here because the audience has appalling electronic literacy (which is, I take it, typical of the bulk of the Australian population still: even if the SMH writes about blogs, only people who blog will bother to read an article that has “blog” in the headline… and then they’ll go and blog about it), so the blog thing is still, probably, 5 or so years off hitting “mainstream” audiences. (Incidentally, anyone proclaiming the death of radio/rise of podcasting should similarly anticipate no-one even knowing what they are talking about for a similar period of time. And no, the fact that iTunes has an obscure feature doesn’t help matters.)

Must finish with this priceless grab from a weekly newsletter, regarding viewer-directed content chosen via an online survey: “We always say our show is your show, so I think this segment makes a lot of sense.” And yet they’d rather not give viewers a voice at all. This isn’t giving viewers a say, it’s allowing them to effectively switch meta-channels (almost, presuming they’re voting with the majority). The segment makes sense from a MSM perspective, but the farcical nature of this “openness” comes to light pretty quickly as soon as any truly multi-directional communications channels come into play.

I think it’s going to be great fun watch­ing “them” (MSM gen­er­ally) slowly come to terms with this idea over the next cou­ple of years. MSM isn’t going away, but I think any of these “social” shows are going to flop unless they rad­i­cally re-think strate­gies (hybrid broadcast/Internet model, any­one?) or stop pretending…

A quick note: I haven’t men­tioned any­thing by name here because, well, no-one else is both­er­ing to blog about the site in ques­tion (an ear­lier blog post is on the first page of results for a par­tic­u­lar key­word, I’d rather not do that again!) Actu­ally it’s kind of funny because my site + seman­tic markup, etc., is blitz­ing the network’s core site (i.e. not our ancil­lary com­mu­nity site) in search engine rank­ings (well, Google at least, heh), but I digress! Not that I’ve writ­ten about any­thing sen­si­tive… every­thing here is digested pub­lic infor­ma­tion (or will be by the time this pub­lishes tomor­row) and is con­sis­tent with my usual rant­i­ngs and opin­ions about social media, IT, etcetera, and my usual cyn­i­cism and dis­dain for com­mer­cial (pri­mar­ily broad­cast — print is (paint­ing broad strokes) gen­er­ally less obvi­ously tainted) media! Good fun.

WordPress redeemed, a little; and, a rant about parallel blog universes

Well, not really… I’m just in less of a bad mood with it and have realised that TextPattern really isn’t that great unless you just want a blog and nothing more. And I’m loath to use Mambo or the like… though I imagine that’s probably largely poor brand perception on my part (having seen the horrible stuff people can create with it). I lump it into the same basket as phpBB and other bloated/insecure/inaccessible crap like that.

It’s prob­a­bly not really, but I’ll per­sist in my delu­sions until forced to learn oth­er­wise (either by myself or others!)

Any­way, syn­di­ca­tion ser­vices (Atom, RSS) rock my world and should be more broadly used even inter­nally for things that you mightn’t think would require it. This is the con­clu­sion I’ve come to hav­ing started putting together a new site (the one based around Word­Press I was whin­ing about) for my church and won­der­ing how best to inte­grate an upcom­ing events cal­en­dar on the front page.

It remains to be seen whether or not I actually do it that way, but it’d be nice if syndication was already so heavily a part of WordPress’ processing that it became a trivial thing to run a parser function on any page. I’m still trying to decide whether to set up custom queries in WordPress to read future-dated posts for events and make them accessible (able to be accessed, that is; not especially applied to broad audiences, assistive technologies, etc.) prior to when they’re scheduled to appear… or whether to simply build my own app on the side that either spits out an include I’ll grab with PHP in my templates (boring) or an Atom feed that WordPress can parse, and lots of non-IE browsers (Well, prior to version 7! Can’t wait!) can do UsefulStuff™ with, and that can integrate into a Dashboard widget for Mac users and a Konfabulator widget for PC users, etc.
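As a rough idea of what the “Atom feed that something else can parse” option involves on the consuming side, here is a small Python sketch using only the standard library. The sample feed, the event titles and the decision to read each entry’s `updated` element as the event date are assumptions made for the example; a real events feed might carry dates somewhere else entirely.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Namespace used by Atom 1.0 documents.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up feed; each entry's <updated> is treated as the event date.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Church events</title>
  <entry>
    <title>Working bee</title>
    <updated>2006-03-10T09:00:00+11:00</updated>
  </entry>
  <entry>
    <title>Easter service</title>
    <updated>2006-04-16T10:00:00+10:00</updated>
  </entry>
  <entry>
    <title>Youth night</title>
    <updated>2006-03-17T19:00:00+11:00</updated>
  </entry>
</feed>"""


def upcoming_events(feed_xml, now):
    """Return titles of entries dated at or after `now`, soonest first."""
    root = ET.fromstring(feed_xml)
    events = []
    for entry in root.findall("atom:entry", ATOM):
        title = entry.findtext("atom:title", default="", namespaces=ATOM)
        when = datetime.fromisoformat(
            entry.findtext("atom:updated", namespaces=ATOM)
        )
        if when >= now:
            events.append((when, title))
    return [title for when, title in sorted(events)]
```

Passing a timezone-aware `now` (for example `datetime.fromisoformat("2006-03-15T00:00:00+11:00")`) filters out anything already past, which is all a front-page “upcoming events” box really needs.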

Yeah, peo­ple mightn’t use it lots but it’s a cool idea ;-) This is what doing one web­site for a TV net­work has done to me — it’s all about eye-candy and out-gimmicking the opposition!

Speak­ing of the Oppo­si­tion (NineMSN, I guess) and Gim­micks, Win­dows Mes­sen­ger 8 Beta looks like it’s shap­ing up into some­thing I could actu­ally use with­out com­plain­ing too loudly. They’ve pulled off the disposing-of-normal-UI-occasionally thing far bet­ter than Win­dows Media Player ever has, and every­thing feels as though it gels really nicely.

I’m a little concerned they’re trying to pull users into their own ‘portal’ thing with Spaces and various other Live.com crap, but it’s hardly as if they’re the only ones doing that. It’s ironic that we’re getting into an era of allegedly-more-open citizen-powered media that’s becoming progressively more isolated because of service providers. For example, what the heck do Yahoo! do? I don’t get it. I don’t know anyone that uses their Messenger service, or their blog service (Yeah! They have one! What the heck?! Discovered this last week and was suitably shocked), or their email service. Same goes for AOL (nearly… I know a handful of people that have an AIM account and supposedly use it… but it’s literally a handful, as in I have enough fingers to count all of them, and I don’t know whether they actually use it or not, not having an account myself!). And as for MSN Spaces… hmm. Well, my MSN Spaces page says “This isn’t my real blog, go elsewhere.” I flicked through a couple of other people’s today (Messenger Beta makes that pretty easy, though not significantly any better than the latest stable release) and found more than a few who were uncertain as to whether they should keep their MSN space or just go with Blogger. Every non-geek I know who blogs uses Blogger. More power to Google.

But I’m sure these demo­graph­ics vary enor­mously depend­ing on who you know: the point is, I’m not see­ing any crossover, which is a lit­tle wor­ry­ing. Of course, I only ever search using Google, so go fire con­spir­acy the­o­ries around all you like… I reckon most blog con­tent on these ser­vices isn’t at all com­pelling, and doesn’t need to be. Blogs are, for the most part, mass-CC:-email sub­sti­tutes that really shouldn’t be archived… and these eas­ier to use ser­vices are prob­a­bly exac­er­bat­ing that problem.

I don’t excuse this blog from that entirely, of course, but there’s more than a lit­tle bit of con­tent here that draws search engine traf­fic and is “time­less” in a sense that “my dog ate crayons for break­fast this morn­ing and went to the vet and they said this hap­pens all the time” could never be. But I digress, hugely (a fail­ing of the medium, no doubt!)

So that’s all very inter­est­ing. Inter­ested to hear if oth­ers know peo­ple in mul­ti­ple “ser­vice provider uni­verses” or if everyone’s friends are, for the most part, con­fined to a par­tic­u­lar ser­vice (and what that ser­vice may be). If you’ve got a blog, this’d be a great time to play pingback/trackback tag instead of just com­ment­ing here… I’d love it if this could get a lit­tle viral and we could see what plat­forms peo­ple are using and “why”. For me, it’s mostly just that every­one I know is using a par­tic­u­lar ser­vice. What is it for you?

ImageBox Flash gallery app

I stum­bled across this post on RMW Web Publishing’s blog today, and it struck me the app they men­tioned could be use­ful for doing this whole CD/DVD thing for the year 12 photo web­site.

The purveyor’s website is horribly Flash encumbered (i.e. I’d have never found it if I were looking for it in a search engine; I actually temporarily lost the developer’s URL for a bit there, and had to trawl through my browsing history to find it again!), but the app itself is rather useful if you’re looking for a run-from-the-desktop gallery kinda thing. My only qualm is the difficulty of generating metadata for it to do interesting stuff with, but a quick spot of shell scripting should see that problem met, hopefully. (Or even just nagging Ben until he hacks support for this gizmo into Cat-scan natively… wink wink? :P) This is the kind of app that’s a prime candidate for XML application, not least because of Flash’s reputedly excellent support for that kind of stuff… but it uses boring and rather confusing (mostly because I don’t speak German so a few words are odd) flat files instead. With that one caveat, it’s an otherwise helpful application. Just don’t make the mistake of confusing applications with websites.

Selling an audience short?

Or, What Josh Said About Ansearch That Was Irrel­e­vant to Most Users.

Dean Jones responded to my Ansearch Answers post with the following:

All in all I feel [the post is] a fair rep­re­sen­ta­tion of the so called facts, but I stand by my recent email… namely that sim­ply review­ing us on tech­ni­cal issues that most peo­ple either

  1. wouldn’t have dis­cov­ered, or;
  2. would not likely care about,

is sell­ing your audi­ence short.

I’m inclined to dis­agree, and just wanted to quickly post to say that. I like to think I under­stand the ‘audi­ence’ here fairly well. They’re either peo­ple with (web-)geek ten­den­cies, and are hence inter­ested in any analy­sis and crit­i­cism I can deliver on the tech­ni­cal aspects of prod­ucts, etc., or (and this cat­e­gory is com­pletely unre­lated to the for­mer) stu­dents and humanities-focussed peo­ple read­ing var­i­ous con­tent I’ve pub­lished here — rang­ing from stage plots to a short story to an essay on the nature and effects of the dig­i­tal divide.

Most guests in the lat­ter cat­e­gory are just that: guests. They gen­er­ally dis­cover this con­tent via a search engine, read what they want, and leave. Over 80% of my vis­i­tors stick around for one minute or less, pre­sum­ably because they find what they need quickly, or dis­cover that the con­tent isn’t what they were look­ing for.

The “reg­u­lar” audience/participants, how­ever, are not that. I don’t think you’re all geeks, but this blog leans towards that style of con­tent, and you match that accord­ingly. You don’t come here look­ing for prod­uct rec­om­men­da­tions (the one excep­tion to that being some­one who viewed my post on Asterisk/VoIP, and asked me what my expe­ri­ences with it had been some time later: to which I replied, we haven’t both­ered, as we moved into a house with a Com­man­der sys­tem pre­in­stalled!). You come here, I think, for the qual­ity of writ­ing, for rants, for occa­sion­ally insight­ful (I hope) com­ment on var­i­ous facets of things I deem interesting.

This is a blog. This is not a news­pa­per, though it is pos­si­ble that search engines, iron­i­cally, are chang­ing the clout of this medium to some­thing sim­i­lar. The dis­tinc­tion between news­pa­per and blog becomes blurred with posts like the one that inspired this, because of the form it was writ­ten in. It is impor­tant, how­ever, to remem­ber the audience.

People don’t come here to shop for search engines. We might be interested in how they work, what they do, what the potential benefits and failings of each one are, but ultimately it doesn’t affect anyone’s choice in the real world. Similarly, investors are unlikely to come here, scoping out Ansearch’s offering before buying into parent company Optum. And, if they did, my concluding remarks were positive: I genuinely believe the story balanced out in their favour more than anything else. If I overplayed the significance of a small flaw that could potentially be abused, my apologies. I don’t, however, regret including it in there at all, because I think it’s something my audience is interested in.

As you stated in an ear­lier email… “I’m not 100% sure as to how one should go about review­ing a search engine.” Here’s a tip. like Google, Yahoo, MSN… we are a busi­ness. For us to stay in busi­ness we need to gen­er­ate revenue.

To do this we need to get more peo­ple to our SE, to get them to come back more often, and to, through their usage (CPM, CPC etc…) gen­er­ate revenue.

To achieve this we need to pro­vide a search ser­vice that the user finds use­ful. Given our rapid growth over the past months in UV’s and rev­enue, I would say we are doing OK.

Unfor­tu­nately for Ansearch and any­one else who wants to use this as an adver­tis­ing space, we don’t par­tic­u­larly care if you’re mak­ing money. It’s good to hear they’ve grown: if their evolv­ing prod­uct is any­thing to go by, they deserve it. But met­rics such as rev­enue and Unique Vis­i­tors mean lit­tle to this audi­ence, even if it’s what investors want to find out all about.

I think this is a fair assessment of this site’s ‘audience’ (the important ‘audience’, for me, being the minority that don’t come through search engines but subscribe by RSS and come back regularly), though, as always, your role is not restricted to that. You are participants. In light of this, I’d invite comment and discussion on this post as to your role as you understand it. It’s possible I’ve got this all wrong… but I doubt it.

Ansearch answers

All had been quiet on the Ansearch front as I awaited a response from Ansearch CEO Dean Jones, promised a hair under two weeks ago when I alluded to an ear­lier analysis/criticism I’d writ­ten when talk­ing about the state of play with Aus­tralian search engines, specif­i­cally refer­ring to the then-newcomer Ansearch.

Dean picked up my post via Technorati, a blog search engine that uses RPC update services to track what people are talking about in real-time. I was suitably impressed by this diligence and apparent desire to hear what the market has to say about their product: could this be the same company whose birth was so marred by a spate of cyber-squatting, in what Dean Jones was reported to have described as a fit of “youthful exuberance”?

Apparently so. Ansearch’s beginnings, though marred by dubious practices[1], received praise from various quarters of the mainstream press (or, at least, those quarters not controlled by News Corp, whose domains had come under threat). However, the Internet community responded quietly, and those voices that were heard were mostly of disdain at Ansearch’s domain practices.

Strangely enough, my orig­i­nal post wasn’t about any of that. I hadn’t heard of Ansearch until I read an arti­cle on them in the SMH — an arti­cle which reads a lit­tle too much like a rehashed press release for my lik­ing: the tell­tale sign is in the clos­ing sen­tence “Ansearch is the search engine divi­sion of Optum Ltd.” — if it were filed in the Busi­ness sec­tion of their paper, I’d under­stand, but it wasn’t.

I wandered over to their site, played around for a bit, and decided their offering was mediocre. In hindsight, it probably didn’t help that I wasn’t shopping for anything in particular (according to a ZDNet article, “In the short term [Ansearch] is focusing very heavily on the commercial end of the market.”), but at that point in time I also don’t think they’d tuned their listings particularly well, as a search for DashLite turned up my WordPress hack over commercial listings for the actual Dashlite brand whose name I inadvertently used.

I say “at that point in time”, because it appears to have sub­stan­tially improved since, as per Jones’ claim: “Much has changed since your first arti­cle on us some 6 months ago.”

Much improved, it seems, on sev­eral fronts. Their core offer­ing has shaped up nicely, and some facets of my ini­tial com­plaints regard­ing acces­si­bil­ity have been met. Their ancil­lary prod­uct offer­ings seem to have devel­oped nicely: Ansearch CEO Jones claims “Each of [our prop­er­ties] goes through up to 7 stages rang­ing from an ini­tial, sim­ple SERP/Directory style page through to a more involved ser­vice, mini por­tal, search tool, etcetera.” He went on to say that these ancil­lary prop­er­ties (such as http://www.picsearch.com.au/, http://www.videosearch.com.au/, http://www.thefreedictionary.com.au/ and http://www.messengers.com.au/ amongst sev­eral oth­ers) are cur­rently being actively sep­a­rated from the core Ansearch site (he described it as “quar­an­ti­ning”), and the exact direc­tion of a num­ber of these projects would become clear over the com­ing months, with the appoint­ment of a full time man­ager of these online properties.

I’m a tad con­cerned about his descrip­tion of their strat­egy with regard to these — he said this would become clear over the months to come, and I’m hang­ing off two words here: dis­trib­uted por­tal. Whilst I can see this as being of value to users (espe­cially for generic, non-brand-specific/legally dubi­ous domains such as jokes.com.au and the ones listed above), it doesn’t seem to fit Ansearch’s core strength as I per­ceive it: as a com­mer­cial por­tal, and not as another Google. “We are not aim­ing to be another Google… we don’t have their bud­get and, to be frank, there are enough peo­ple try­ing to clone them: why build another?”

In fact, Jones sug­gested that Ansearch’s strengths lie in that it is not the ubiq­ui­tous search behe­moth, and that its index is “some­thing unique… some­thing faster… [and] against the so called “arms race” of search (my SE has more links than yours etc…)”. I’d agree this is indeed a strength, and also a rea­son for them not to try and be a por­tal. Aus­tralia already has Yahoo! and NineMSN for domes­tic por­tals, and I’m strug­gling to see what Ansearch will do to dif­fer­en­ti­ate them­selves in this: but I’m happy to be surprised!

Ansearch appar­ently holds an index of only 500,000 web­sites con­sid­ered by its met­rics to be “most pop­u­lar”. I argued that this was poten­tially a bad thing as rel­e­vant con­tent might lie out­side this realm: for exam­ple, this web­site per­forms well when peo­ple search for reviews of the HP 2610 or infor­ma­tion about Apache on Ubuntu linux or ACT files from MP3 play­ers that record audio, but isn’t included in Ansearch’s core index.

Which is perfectly valid for a commercially-focussed site; I just think they could be missing out a little bit. They can leverage my content for their advertising impressions and potential clickthroughs, because they have more valuable content showing up in their listing alongside advertised products. If someone reads my HP 2610 review after having found it in Ansearch, and decides they’d like to buy it and remembers having seen a “Buy HP printers!” ad on Ansearch, they’ll most likely click “back”. It’s abstract, behavioural stuff, but valuable nonetheless.

Whether it’s valu­able enough for them to bother is another mat­ter. “We spi­der our own con­tent… some­thing that over time will be done daily,” says Jones. “Hav­ing only 500,000 web­sites will allow us to index sites more often, and as is the case with the ‘site info’ pages, pro­vide far more info on these pages.” Which is a value-add, and worth pre­serv­ing. If that’s all resources per­mit, I think they’re doing the right thing as is. Jones openly admits Ansearch’s index of pop­u­lar­ity “has a com­mer­cial flavour to it” — and rightly so. Given their much-touted gen­der and age demo­graphic based search fea­ture, this makes sense.

Their index of pop­u­lar­ity seems to be fairly slow-moving. “Monthly we add around 20,000 sites… and take out 20,000.” I’d guess this would be the low­est 20,000 that gets shuf­fled, and this seems to make sense. One has to won­der whether all the higher-ranking pages can have sub­stan­tially fresh con­tent month after month, but pre­sum­ably they do — it’s one of the things the SEO experts have always cried from rooftops.

It was interesting to hear Jones speaking about these people, too: amusing, even! Web developers the world over often join in speculation as to what exactly makes search engines tick, such that we can boost our clients’ (or employers’) websites’ performance. It seems the reverse is also true: search engines all over the world similarly speculate as to what those horrible developers are doing to screw with their indexes day in and day out!

I don’t say this in jest, and I believe they’re right to com­plain: “The larger SE’s are hav­ing a very tough time com­ing up with clever ways to index con­tent to counter SEO… only to have SEO’rs quickly find ways around it. Cat and mouse…” I think “counter SEO” was a poor choice of words, given that rel­e­vant con­tent should hope­fully still be rewarded, but his point stands.

Just as inter­est­ing is Ansearch’s strat­egy to avoid falling prey to dodgy SEO tactics:

By only index­ing the root page, we remove almost all SEO trick­ery. This works in 2 ways. Firstly, peo­ple rarely put spam on their home page — that is, door­way pages, link farms, etc. usu­ally reside away from the main index… and, sec­ondly, it deletes mul­ti­ple results from the same web­site. It also stops the site owner/webmaster from say­ing they are rel­e­vant to 100 or 1000 key­words or phrases.

Kids, we just found a new argu­ment against clients who love their splash pages!
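To make the two effects Jones describes concrete, here is a minimal sketch in Python. It is purely illustrative and not Ansearch's code: the function names are made up, and "root page" is assumed to mean a URL with an empty or "/" path and no query string.

```python
from urllib.parse import urlsplit


def is_root_page(url):
    """True when the URL points at a site's home page (assumed: path '' or '/')."""
    parts = urlsplit(url)
    return parts.path in ("", "/") and not parts.query


def one_result_per_site(urls):
    """Keep only root pages, and only the first one seen for each host."""
    seen_hosts = set()
    kept = []
    for url in urls:
        host = urlsplit(url).netloc.lower()
        if is_root_page(url) and host not in seen_hosts:
            seen_hosts.add(host)
            kept.append(url)
    return kept


crawl = [
    "http://example.com/",
    "http://example.com/doorway.html",    # doorway page: dropped
    "http://EXAMPLE.com",                 # same site again: dropped
    "http://example.org/links/farm.html", # link farm: dropped
    "http://example.org/",
]
print(one_result_per_site(crawl))
# ['http://example.com/', 'http://example.org/']
```

Anything parked away from the home page never reaches the index, and each site can appear at most once, which is also why a content-free splash page leaves a site with nothing to be ranked on.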

Content-rich front pages aren’t, however, an absolute solution (at least, not in Ansearch’s index). According to Jones, Ansearch’s policy of “ranking sites in true usage popularity, both on and offsite” is “SEO proof… or at the very least, extremely resistant.” I’d agree it’s a powerful metric, but my reservations above still stand.

One caveat of Ansearch’s algorithm that appears potentially exploitable is its failure to exclude content in the <head> from indexing. I don’t just speak of standard meta author/keywords data, but of something else.

A screenshot highlighting the inclusion of information between style tags in Ansearch's index

As highlighted in the screenshot above (click for original page, link may expire), Ansearch’s listing is including content between <style> tags. This presents potential for SEO abuse (see note 2), as most browsers happily overlook errors in CSS, and <style> tags can be placed towards the top of a document, which, if we are to believe the SEO myths, increases their relevance in engines. Of course, it’s entirely possible the content bears no weight at all, but the question of why it is stored in their index at all remains unanswered.
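For the curious, keeping stylesheet content out of an index is not hard. The following is a rough sketch using Python's standard html.parser module; it says nothing about how Ansearch's indexer actually works, and the class and function names are invented for illustration.

```python
from html.parser import HTMLParser


class IndexableText(HTMLParser):
    """Collect page text, ignoring anything inside <style> or <script>."""

    SKIPPED = {"style", "script"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep non-empty text found outside the skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def indexable_text(html):
    """Return the text a (hypothetical) indexer would store for this page."""
    parser = IndexableText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


page = (
    "<html><head><title>Widgets</title>"
    "<style>body { color: red; } cheap widgets buy widgets</style></head>"
    "<body><p>Hello</p></body></html>"
)
print(indexable_text(page))
# Widgets Hello
```

The keyword stuffing hidden in the stylesheet never makes it into the stored text, while the title and body copy do.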

This is another reason to reward websites that use semantic markup properly, though at this stage that would exclude disproportionate amounts of the web, so I understand engines’ hesitance to embark on anything like this. “It’s not something a lot of sites use,” says Jones, before continuing, “but it will be used more and more in the future.” Well, so much of the web community hopes.

This formed part of Ansearch’s defence for not having embraced semantic markup from the outset. According to Jones, it’s built on a technology developed for a pre-April 2000 (dot com crash) search engine, so that partially excuses the markup at launch time. Jones’ first comment on their failure to use semantic markup was simply that “The majors [Google and Yahoo!] don’t use it”, a defence I’d dispute, as Ansearch isn’t a “major” player and, as has been established, is chasing a fairly different market sector. Their core business is search, but it’s a different breed of search conducted in a different way: and semantic markup and accessibility are a different way. Encouragingly, Jones sees the potential for embracing semantic markup in the future on both technical and commercial grounds: “It makes sense to use it and as it does open us to a wider audience with various devices used to browse our site.”

He didn’t cite the “reduced bandwidth expenditure as a result of lightweight code” reason, presumably because their host, OzHosting/Destra, charges only for the link, not for transfers over it, on their dedicated server range.

Irre­spec­tive of their rea­sons, the future of Ansearch in terms of markup is promising:

Our long term goal is to have Ansearch web­site designed with­out any tables and heav­ily styled using the CSS, which even­tu­ally will gives us more con­trol on how we present our site to dif­fer­ent media types.

Ansearch has gone through sev­eral minor enhance­ments over the past 6 months with the releases of ver­sions 1 to 1.3. We are cur­rently plan­ning a major update for ver­sion 2.0 and the issues [of seman­tic markup and sep­a­ra­tion of pre­sen­ta­tion and con­tent] will be addressed.

But as we know, markup isn’t every­thing: con­tent is what ranks well in search engines erm… con­tent is what draws an audi­ence. Ansearch’s explo­ration into the devel­op­ment of por­tal envi­ron­ments is some­thing to be watched with inter­est over the com­ing months, as well as its other busi­ness aspects, includ­ing an adver­tis­ing net­work known as Soush that remains slightly enig­matic, and the mys­te­ri­ously named “Fac­tory” division.

An announcement is expected to be filed with the ASX later this week outlining something of Ansearch’s future direction. At this stage, I’m inclined to believe that the future is a positive one, as Ansearch moves away from its much-criticised practices at launch and towards a diverse range of product offerings that uniquely fulfil the needs of Australian Internet users.

Update: A fol­lowup to this has been posted, in response to a crit­i­cism that this review was overly tech­ni­cal in nature. Read on!

Notes

1 Justified with the catch-cry “MSN do it, so we can, too!”, to which the only sensible reply is, “yes, but MSN do it with Internet Explorer, and as soon as you go and write your own web browser, feel free to hijack as many unused pages as you want.”
2 I noti­fied Ansearch of this shortly prior to pub­li­ca­tion in the hope that, if this is indeed an issue, it will be resolved before this post is noticed and widely acted upon. One hopes this poten­tial prob­lem dis­ap­pears quickly.