<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Josh.st &#187; search engines</title>
	<atom:link href="http://josh.st/tag/search-engines/feed/" rel="self" type="application/rss+xml" />
	<link>http://josh.st</link>
	<description>Web, English, 中国, and various geekosity</description>
	<lastBuildDate>Wed, 18 Apr 2012 23:37:52 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>What Josh Does at Youthworks</title>
		<link>http://josh.st/2006/11/30/what-josh-does-at-youthworks/</link>
		<comments>http://josh.st/2006/11/30/what-josh-does-at-youthworks/#comments</comments>
		<pubDate>Thu, 30 Nov 2006 02:14:12 +0000</pubDate>
		<dc:creator>Josh</dc:creator>
				<category><![CDATA[Christianity]]></category>
		<category><![CDATA[Geek]]></category>
		<category><![CDATA[General]]></category>
		<category><![CDATA[Bible study leader]]></category>
		<category><![CDATA[communication tool]]></category>
		<category><![CDATA[Contact tools]]></category>
		<category><![CDATA[dead-tree products]]></category>
		<category><![CDATA[energy]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Josh Does]]></category>
		<category><![CDATA[leader]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[SMS]]></category>
		<category><![CDATA[social networking websites]]></category>
		<category><![CDATA[The Briefing]]></category>
		<category><![CDATA[time-sensitive advertising]]></category>
		<category><![CDATA[web strategy]]></category>
		<category><![CDATA[web usage]]></category>
		<category><![CDATA[youth group]]></category>

		<guid isPermaLink="false">http://joahua.com/blog/?p=1215</guid>
		<description><![CDATA[I’m employed by an organisation (the one I referred to in my first post about this project, wherein I didn’t bother explaining exactly what was going on, but hoped it would be clear to those who already knew) that exists to — amongst other things — resource youth ministry. One thing we’ve noticed (“we” is [...]]]></description>
			<content:encoded><![CDATA[<p>I’m employed by an organisation (the one I referred to in my <a href="/blog/2006/11/23/ri-a-few-months-on-and-a-bit-about-databases">first post about this project</a>, wherein I didn’t bother explaining <em>exactly</em> what was going on, but hoped it would be clear to those who already knew) that exists to — amongst other things — resource youth ministry.</p>
<p>One thing we’ve noticed (“we” is myself and a handful of others with an interest in the web) over the past twelve months is an uptake in web usage by youth ministries — for obvious reasons: that’s where kids are spending their time, and it’s a great communication tool, and everyone else is doing it.</p>
<p>When I say everyone else is doing it, I actually mean everyone else is trying to do it. Everyone has, for the last six to twelve months, been writing the same applications, integrating the same software, paying for the same software, attempting to train the same people, and generally doing a lot of the same stuff, separately. With no point of intersection or sharing or intelligent resource management.</p>
<p>This is understandable: after all, the web presents a relatively new front for churches in general, and whilst kids have been wasting time online for years, only with the relatively recent advent of social networking websites (I refer to it as ‘SocNet’ in these parts; no-one else seems to, but I like it, so whatever) have the less computer-inclined begun spending significant amounts of time in front of a keyboard.</p>
<p>There’s also a bit of a catch-22 when it comes to building these things. People ask, what are the benefits? We’ve never had someone come along to youth group because of our website! — well, no, you’re right. But you also don’t <em>have</em> a website, so that’s hardly fair, is it? Nine times out of ten people will not come along to church (generically) because they’ve searched for a church in a particular suburb in Google (though, speaking of that, I’ve got to do a bit of SEO work on the Matthias site — it’s not on the first page for a “Church in Paddington” query. Changed the title, it’ll be a while til that kicks in. We’ll see.)</p>
<p>They’ll come because a friend asked if they wanted to, or they were walking past and heard people inside, saw them going in, and wondered what it was all about.</p>
<p>But this is hardly exclusive to having a website. If they have those points of contact, a website is a great way to invisibly investigate further without needing to make themselves uncomfortable. It’s easy to find these sorts of websites through search engines — you walked past a church and noted its name, you remember the name of your friend’s church, etc.</p>
<p>The same goes for youth groups, obviously.</p>
<p>People have just been starting to realise this, or at least think of it at all and decide “yeah, we could do that”. So, there’s the rationale for it all. Most people with decent websites already may not have considered rationale in any great depth — they’ve got a good website because they know someone who makes them, and volunteered their time (maybe they’re a leader), throwing something together with <a href="http://www.xoops.org/">Xoops</a> in an afternoon. It’s quick and dirty, but effective.</p>
<p>We’re trying to spend a small but not insignificant amount of money to equip people to do these sorts of things, so it’s only sensible that some more time is spent considering what on earth we’re trying to achieve. Hence the lengthy prelude to what it actually does.</p>
<p>Now, the features. We have too many target audiences for it to be an altogether comfortable project, but that’s half the fun of it. The product is being marketed to churches (who pay for it) through leaders (who want to use it) and for youth (who actually aren’t the centre of the universe on this one, but we need to give them UX that says they are). Outside of these three, there are also the friends of the youth already in the application who are just checking out the youth group page.</p>
<p>Of course, it’s not <em>quite</em> that simple. We’re also marketing this to camps, high school scripture groups/lunchtime bible groups, and maybe bands/events. Which is great and technically only a small step, but it does pretty horrible things when you try and explain who’s paying for what in a concise business-like fashion. If you’ve read this far, chances are you’re well aware that concise-ness has never been my strong point.</p>
<p>So, with these targets in mind, we are (firstly) going to equip them with websites. Big woop. WordPress.com and Blogger eat your heart out. Cue yawns.</p>
<p>No, seriously. We’re going to give them (‘them’ being the various entities described above, not individuals so much — there’s no way I’m positioning this against other SocNet sites because I reckon it’s too fragmented to last… Facebook or Myspace or Bebo or.… yes.) web pages. Welcome to 1999.</p>
<p>They’re going to have web pages with calendars they can chock full of the schedule for the term, though. So that’s exciting.</p>
<p>And everyone’s going to have their own username, so they can leave comments on the inevitable blogging element with identity — this is wonderful for comment– and generic form-spam. Incidentally, I read a few blogs that Wild St people are writing and was really excited to see they’re actually enthusiastic about doing it. There’s quite the bunch of them on Blogger these days, and it’s all completely autonomous — so far as I know, no-one has pushed them to start doing it. I was so proud of their keenness and innovation for building up community and spreading the gospel! Another aside, my copy hasn’t arrived yet but I believe there’s something about blogging in <a href="http://www.matthiasmedia.com.au/">The Briefing</a> for December (it’s not on their website yet, either). <ins datetime="2006-11-30T11:18:00+00:00">My copy arrived today, and I discovered the current issue is in their webstore, just not on the main site. It’s <a href="http://secure.fellowworkers.com/cgi-bin/mmstore/ebrfg339.html">The Briefing #339</a>, if you’d care to read it.</ins></p>
<p>Anyway. Blogs will feature. Calendars will feature. All the stuff you’d reasonably expect to be able to do with a CMS tool these days will feature. Blogs, calendars, galleries, contact forms, static pages. Yay. So that’s the boring stuff that we’ve just got to do the grunt-work for at some point (I’m sure it can be fun, but, just between you and me, I’m not really looking forward to the couple of weeks we have to spend on that bit).</p>
<p>Now, for interesting and innovative features — because, let’s face it, the above is hardly enough to convince anyone to switch their existing website (if indeed they have one) across to a hosted platform for a nominal (to be determined, but probably only payable by church groups, and not for camps/events on account of these being once-off) monthly fee.</p>
<p>Contact tools. Yummy. We’re going to give them mailers that make it easy to send a message to, say, all the kids in year 10. Or just guys. Or girls in year 8. Or only to your co-leaders (we’ll have a resource area where they can share files — Word documents, PDFs, slide shows — on the site, too: that’s some of the fun CMS stuff). But email’s been done before. Everyone’s used email. Admittedly, sometimes you just wish there’s a better way to store and manage lists of people, and this tool will certainly do that, but it’s a little boring still.</p>
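<p>To make the list-slicing idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the <code>Member</code> fields and the <code>select_recipients</code> function are hypothetical names, not the project's actual schema or code.</p>

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Member:
    """Hypothetical contact record; field names are illustrative only."""
    name: str
    email: str
    school_year: Optional[int]  # None for people who are not at school
    gender: str                 # "m" or "f" in this sketch
    is_leader: bool = False


def select_recipients(
    members: Iterable[Member],
    school_year: Optional[int] = None,
    gender: Optional[str] = None,
    leaders_only: bool = False,
) -> List[Member]:
    """Return the members matching every filter that was supplied."""
    selected = []
    for member in members:
        if leaders_only and not member.is_leader:
            continue
        if school_year is not None and member.school_year != school_year:
            continue
        if gender is not None and member.gender != gender:
            continue
        selected.append(member)
    return selected
```

<p>Under these assumptions, <code>select_recipients(members, school_year=8, gender="f")</code> would pick out the girls in year 8, and <code>select_recipients(members, leaders_only=True)</code> just the co-leaders. Filters that are left out simply don't narrow the list.</p>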
<p>So we decided it’d be a good idea to throw SMS into the mix. It’s not just a gimmick: again, this is in response to what people are already doing. The only difference is it’s paid on a shared account (used by the leaders — the youth kids won’t have access to these tools, for fairly obvious reasons) and integrates the same contact management features as the mailer app. We’re hoping convenience will draw people across to this tool. Use scenarios are basically just that you’d use this tool to inform people of what’s going on this week at youth group, or reminding them that the group is on bringing supper this month, etcetera. The originating number will be that of a single leader, or it could even be that of that person’s own leader.</p>
<p>For example, one message is sent to all kids by the group co-ordinator, but that message is altered depending on who the individual recipient’s bible study leader is, so that it appears to originate from them. Obviously common sense would say that you wouldn’t do that without consultation, so we’d probably have a check box in the leader’s “my account” page that would say “Allow messages from other senders to originate from my mobile number”, or something to that effect.</p>
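<p>A rough sketch of that sender-selection rule, in Python. The class names, the <code>allow_delegated_sending</code> flag (standing in for the “my account” check box) and the fallback to the actual sender's number are all assumptions for illustration; no real SMS gateway is called.</p>

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Leader:
    name: str
    mobile: str
    # Hypothetical stand-in for the "Allow messages from other senders
    # to originate from my mobile number" check box.
    allow_delegated_sending: bool = False


@dataclass(frozen=True)
class Kid:
    name: str
    mobile: str
    study_leader: Optional[Leader] = None


def originating_number(sender: Leader, recipient: Kid) -> str:
    """Use the recipient's bible study leader's number only if they opted in."""
    leader = recipient.study_leader
    if leader is not None and leader.allow_delegated_sending:
        return leader.mobile
    # Assumed fallback: the message comes from whoever actually sent it.
    return sender.mobile


def plan_messages(
    sender: Leader, recipients: Iterable[Kid], text: str
) -> List[Tuple[str, str, str]]:
    """Build (from_number, to_number, text) tuples without sending anything."""
    return [(originating_number(sender, kid), kid.mobile, text) for kid in recipients]
```

<p>The point of keeping the opt-in check in one small function is that a leader who hasn't ticked the box never has their number used by someone else, whatever the rest of the mailer does.</p>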
<p>Beyond contact tools, we want to take advantage of the fact that this is a service-based product and entirely a hosted solution. Part of the reason we’re strongly pursuing that is it gives an opportunity to equip and direct in a way that decentralised sites can’t be. So, a few things we’re thinking of doing are centralised offerings like weekly newsletters (sent to leaders two days in advance so they’ve got an opportunity to see it first) and global blog properties that give reviews, current affairs commentary, etc.</p>
<p>That’s the end of the universal features that are great for kids and leaders alike, but there’s lots more for leaders. As I’ve already said, we want this to be self-funding. Part of this is selling electronic versions of dead-tree products, as DRM’d PDFs, or as unencumbered PDFs with watermarks/obviously time-sensitive advertising (so violation of copyright is glaringly obvious). The other part is (for me at least) far more exciting, and that’s reselling user generated/contributed content (UGC) under an iStockPhoto-esque model (Basically, profit sharing).</p>
<p>This isn’t just about words on a page — I want to get plenty of video stuff happening, too, because (especially in reformed evangelical Anglican/Baptist/Presbyterian, etc. churches) that doesn’t get nearly enough of a work out as is. It’s a really effective tool for supporting preaching/bible studies, and it’s been largely overlooked until probably early this year (I had my first conversation with someone about video resources for small group bible studies as late as July or August this year, I think! They had used a Matthias Media resource which I haven’t encountered, and thought it really helpful).</p>
<p>Pricing models for all that are still a little up in the air, but, from a consumer’s point of view, it’s definitely going to be affordable. The project will ultimately sit on a server maintained gratis and depend largely on volunteer labour to administer content. The only “costs” are those to the established Youthworks publishing division, but hopefully we can transition the way they do their high-school level content effectively, so they’re commissioning content for the web and selling it there. Something that’s really exciting is the possibility that, instead of commissioning content, it’s possible to purchase it directly and already created from a pool of resources on the website.</p>
<p>There’s definitely a workable model here, somewhere.</p>
<p>Prayer is greatly welcomed for:</p>
<ul>
<li>wisdom trying to figure that model out</li>
<li>energy and resources to make it happen (in whatever form)</li>
<li>adoption and enthusiasm from youth leaders and kids</li>
<li>effectiveness in web strategy as we attempt to use it as an evangelistic outreach tool, and a tool for the growth of existing ministry</li>
<li>and, hand-in-hand with that last point, that God’s will be done and if He wills it, that growth would be given!</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://josh.st/2006/11/30/what-josh-does-at-youthworks/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>FEVA not-marketing, motivation, and red wine</title>
		<link>http://josh.st/2006/11/25/feva-not-marketing-motivation-and-red-wine/</link>
		<comments>http://josh.st/2006/11/25/feva-not-marketing-motivation-and-red-wine/#comments</comments>
		<pubDate>Sat, 25 Nov 2006 12:45:09 +0000</pubDate>
		<dc:creator>Josh</dc:creator>
				<category><![CDATA[Christianity]]></category>
		<category><![CDATA[Church]]></category>
		<category><![CDATA[Design]]></category>
		<category><![CDATA[Life]]></category>
		<category><![CDATA[basic web]]></category>
		<category><![CDATA[Budd]]></category>
		<category><![CDATA[etc]]></category>
		<category><![CDATA[food]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[online games]]></category>
		<category><![CDATA[retail outlets]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[web media]]></category>
		<category><![CDATA[web presence]]></category>
		<category><![CDATA[web strategy speaker]]></category>

		<guid isPermaLink="false">http://joahua.com/blog/?p=1209</guid>
		<description><![CDATA[FEVA’s “Promoting the Word through Image and Text” conference (they will break my link fairly quickly, methinks, but it’s good whilst it lasts) was today, and it rocked. Sessions about architecture to creative strategies to the theology of “promotion” (which we don’t call marketing for fear of stirring the controversy pot) to a rather helpful [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.feva.org/conf.html">FEVA’s “Promoting the Word through Image and Text” conference</a> (they will break my link fairly quickly, methinks, but it’s good whilst it lasts) was today, and it rocked.</p>
<p><img src="/blog/wp-content/2006/11/ptwtiat.png" alt="" /></p>
<p>Sessions about architecture to creative strategies to the theology of “promotion” (which we don’t call marketing for fear of stirring the controversy pot) to a rather helpful copyright session (albeit one raising more questions than it answered), as well as great food, a comfortable venue, and generally excellent organisation, etc.</p>
<p>Go along next year.</p>
<p>And, now that positive recommendation is cemented firmly without mention of the web…</p>
<p>I did, however, take great exception to the web strategy speaker, who I am tempted to pour out all manner of vitriolic utterances against but will attempt to refrain. He essentially said that footer keyword-stuffing was fine, as was spamming meta tags (though, thankfully, he acknowledged search engines pay “less attention” to them these days; I would put that closer to “insignificant attention and not worth the markup bloat they so often are”). Everything he had to say about content for the web could be summarised in the keyword, “keywords”, paying no attention to the different copy-writing demands of web media and the flow-on effects of organic keyword enhancement. Further, he managed to suggest online games for youth and prize competitions as legitimate marketing tactics, which, to me, seems brain-dead, or perhaps I should just say “an unproductive use of time”. The entire presentation appeared to have been repurposed from a very basic web 101 presentation to small businesses, without much (or any) regard for audience feedback.</p>
<p>For example, he asked questions at the beginning to get an indication of where the audience was at in terms of web presence (I would say well over 90% had a website, with probably half of that being maintained in some capacity — yes, <a href="http://www.matthias.org.au/">our website</a> is getting touched up soon… heh, in all my free time) and then proceeded to completely ignore that (although he did act very surprised at the number of hands that went up) and tell everyone about how to get online in the first place. Complete with the worst in Powerpoint presentation technique.</p>
<p>Definitely not a highlight of the day!</p>
<p>Anyway, that aside, I went home feeling pretty motivated to GetStuffDone™ and started on the three gazillion changes pending for the Matthias site… then gave up when Budd called saying Borat was on. I’ve generally had a great evening, though — a few hours with a glass of red wine and a sense of accomplishment as content takes shape, then a conversation about using Google Maps to plot some 2,100 retail outlets effectively (no consensus as to how to achieve this yet, because that’s 2,100 points to be rendered client-side as an overlay, which would probably crash some browsers, if not make them run hideously slowly — but the brain is churning over), then watching that crazy movie. Yeah, you’ve got to laugh at it, but… gosh. Really hope they went back and explained it was satire to some of those people, if not apologising outright. Having said that, I think he’s reached the limits of the persona; it really got a bit repetitive and predictable (but still evoking laughter for shock value) in parts. I still laughed loudly.</p>
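<p>One possible way to tame those 2,100 points (a sketch of a common technique, not something we agreed on) is to bucket the outlets into a coarse latitude/longitude grid on the server and send one marker per occupied cell, with a count. The Python below assumes plain <code>(lat, lng)</code> tuples and a hypothetical <code>cluster_points</code> helper; it doesn't touch the Google Maps API at all.</p>

```python
import math
from collections import defaultdict
from typing import Iterable, List, Tuple

Point = Tuple[float, float]            # (latitude, longitude) in degrees
Cluster = Tuple[float, float, int]     # (centroid lat, centroid lng, count)


def cluster_points(points: Iterable[Point], cell_size_deg: float) -> List[Cluster]:
    """Group points into square grid cells and return one cluster per cell."""
    if cell_size_deg <= 0:
        raise ValueError("cell_size_deg must be positive")

    cells = defaultdict(list)
    for lat, lng in points:
        # floor() keeps negative coordinates in consistent cells.
        key = (math.floor(lat / cell_size_deg), math.floor(lng / cell_size_deg))
        cells[key].append((lat, lng))

    clusters = []
    for members in cells.values():
        count = len(members)
        mean_lat = sum(p[0] for p in members) / count
        mean_lng = sum(p[1] for p in members) / count
        clusters.append((mean_lat, mean_lng, count))
    return clusters
```

<p>The idea would be to pick a larger <code>cell_size_deg</code> when zoomed out and a smaller one when zoomed in, so the browser only ever draws a manageable number of markers. Whether that's fast enough in practice is exactly the bit I haven't tested.</p>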
<p>Anyway. More to come soon.</p>
]]></content:encoded>
			<wfw:commentRss>http://josh.st/2006/11/25/feva-not-marketing-motivation-and-red-wine/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>People versus search engines</title>
		<link>http://josh.st/2006/10/26/people-versus-search-engines/</link>
		<comments>http://josh.st/2006/10/26/people-versus-search-engines/#comments</comments>
		<pubDate>Thu, 26 Oct 2006 02:40:20 +0000</pubDate>
		<dc:creator>Josh</dc:creator>
				<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Usability]]></category>
		<category><![CDATA[Web Standards]]></category>
		<category><![CDATA[advertising model]]></category>
		<category><![CDATA[all-things-to-all-people content networks]]></category>
		<category><![CDATA[cash-cow-marketing-tool]]></category>
		<category><![CDATA[deputy]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[free web hosting services]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[internal search needs]]></category>
		<category><![CDATA[ISP]]></category>
		<category><![CDATA[King]]></category>
		<category><![CDATA[marked-up doing-everything-wrong-with-the-web]]></category>
		<category><![CDATA[meaningful search]]></category>
		<category><![CDATA[player]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[VOIP]]></category>
		<category><![CDATA[web hosting]]></category>
		<category><![CDATA[Whirlpool]]></category>
		<category><![CDATA[Yahoo!]]></category>

		<guid isPermaLink="false">http://joahua.com/blog/?p=1165</guid>
		<description><![CDATA[It seems that search engines are an immutable fact of early-twenty-first century existence. We can’t escape them in any immediate sense, and cannot believe they could ever disappear (I recall one instance on Whirlpool forums where a user thought his/her ISP’s international link must be down because he couldn’t access Google. This was one of [...]]]></description>
			<content:encoded><![CDATA[<p>It seems that search engines are an immutable fact of early-twenty-first century existence. We can’t escape them in any immediate sense, and cannot believe they could ever disappear (I recall one instance on Whirlpool forums where a user thought his/her ISP’s international link must be down because he couldn’t access Google. This was one of the very few times Google had actually dropped off the face of the planet for about twenty minutes. It was simply outside the realm of possibility.)</p>
<p>Yet, increasingly, our surfing habits are defined by this bizarre social concept that seems to be shaping certain acquisitions and web-two-point-oh-bubblism, wherein websites serve users by connecting them with one another, not on the basis of them knowing what they wanted, but rather in a bizarre <em>a priori</em> manner whereby degrees-of-separation (MySpace) or user-supplied-already-knowns (LiveJournal, Xanga, etc.) define connectedness and displayed content.</p>
<p>Search is no longer the macro-inter killer app, but an intra-site facility applied to microcosm — often based on “transparent” technology that has, on the basis of known knowns (in the words of a certain <a href="http://www.knownknowns.net/index.html">Rumsfeld</a>), already done some of the hard work for users (I should say people, but don’t out of habit: it is an industry hazard) without actually asking them anything. This is where location– and organisation-based matching (<i>cf</i>. MySpace, Facebook, etc.) come in.</p>
<p>But none of this data is intelligently searchable by generic engines.</p>
<p>None of this data (in the case of Myspace especially, horribly marked-up doing-everything-wrong-with-the-web technical entity that it is) is <em>available</em> for indexing by search engines because it’s not abiding by any defined semantics. There is not, for example, any overwhelming use of microformats (<a href="http://microformats.org/wiki/hcard">hCard</a>, etc.) for defining contact details in any common sense. Yet these things <em>are</em> searchable within a given website.</p>
<p>And, what’s more, these things are searchable with great precision within (social networking) sites. This is because of a very well defined internal semantic (<strong>not</strong> the “semantic web”, but internal data structures) and an enforced obedience to these structures that was never a part of pre-SocNet sites.</p>
<p>SocNet platforms are radically different from web 1.0 systems in that they are (ironically) <em>vastly more constricting</em>. As “web 1.0” I would cite <a href="http://geocities.yahoo.com/">Geocities</a> and free web hosting services, portals, and all-things-to-all-people content networks. Now, we’ve got blogs (precisely defined websites), <a href="http://myspace.com/">MySpace</a> (chiefly SocNet profiles with bits on the fringes common to the users, and now with enough impetus to appear unstoppable), <a href="http://flickr.com/">Flickr</a> (free, and fee-for-service that people actually pay for, web hosting, precisely defined as photo hosting), and, strangely, a portal (<a href="http://www.yahoo.com/">Yahoo!</a>) still on top of Alexa 500 rankings. A portal that owns both Flickr and Geocities, but has changed the model of the latter to place greater emphasis on fee-for-service hosting. But I digress into strategy; the point is not that, but rather the way social data is stored.</p>
<p>Flickr is meta-data rich. It uses a well defined system based on EXIF, intrinsic semantics (title, description, tags — tags that get used properly, unlike Facebook which doesn’t bother to make such things clear — I want Facebook to flop, by the way, because it annoys me, so don’t expect nice things to be said about it. It’s a poor closed-system imitator, albeit with a stupidly effective advertising model everyone else should be wishing they came up with first but haven’t seen in order to copy… because it’s a closed system (or used to be) exclusive in scope. Which makes it very effective SocNet/Web 2.0, by my own definition, so I don’t really have a basis for complaint.) and extrinsic semantics (groups, pools, etc.).</p>
<p>Profiles, unlike ‘pure’ SocNet (Myspace, Facebook), permit anonymity, but allow disclosure of as much as is desired: at any rate, that is not the purpose of the site. Myspace/Facebook’s <em>raison d’etre</em> is profiles. (Well, and that and cash-cow-marketing-tool of the *R**IA’s of the world) Accordingly, its profiles have very definite semantics even whilst the rest of the site may not (I speak of Myspace more, here). Myspace gives core “Details” profile info individual fields, whilst allowing a diverse “Interests &amp; Personality” information in freeform textareas that are designed to entice users into participation (and, possibly, aiding more fuzzy searches — but mostly I think it’s just compelling content, as there is no immediately obvious way to search that data).</p>
<p>“Interests &amp; Personality”, along with blog content, seems to be the only freeform contributed material available on the site. Want music or a video with your profile? You’ve got to browse to the band’s site, load the player (no go in Opera with Flash at the minute, it seems), and then select “Add” on the track. They (yeah, it’s kinda big-brotherish) know exactly what song you chose, what band it’s from, what genre, etc. — that is to say, unambiguously and certainly beyond a probably-common song title. This isn’t an upload-yourself-and-we’ll-manage-rights kind of thing. The officiality gives that internal data structure that much more depth: but, again, the point is that the data is internal and not open.</p>
<p>This, it seems, is the defining quality of SocNet. That’s what makes the ideas of <a href="http://googleblog.blogspot.com/2006/01/open-federation-for-google-talk.html">open federation advocated by Google Talk earlier this year</a> so bizarre for the rest of us. We don’t particularly care, because closed systems mean innovation (because we can define new data for ourselves to work with) and/or extensibility that isn’t possible in an open platform (if, for example, not all federated partners agree to a spec extension — take, for example, Google Talk’s own Jabber base and proprietary VoIP on top of that). Openness is in Google’s interests, because it’s so dependent on things being open for its core business (search). But real people want services that work, not services that push them to another site. I’ve never trusted sites that bounce me off to Google for their site’s search, even if it’s one of those crappy co-branded things. It doesn’t make sense. Why would you make someone inspect your website from an inferior perspective when <em>all the information</em> is stored in a database, with the possibility of more semantically meaningful search open internally only?</p>
<p>Google <em>won’t</em> deal with your internal search needs. It’s not designed to. It does a great job of dealing with publicly indexed materials completely aside from SocNet services. SocNet sites thrive on and are empowered by strong intrinsic semantics that make clever profile-based (or <abbr title="User Generated Content">UGC</abbr>–based) search possible, which builds loyalty etcetera in a way foreign to informational websites. SocNet is experiential and (surprise surprise) social — it doesn’t have to be <em>about</em> anything.</p>
<p>Content was deposed as king sometime in the middle of the first decade of the twenty-first century, and with that regime change his deputy, Search, was also shuffled to a somewhat less prominent position. Somewhere out of sight, Search’s identical twin, Query, is the real power behind the throne: it uses unindexed data and makes clever links to bring people closer together in a way that traditional search engines had never even envisaged.</p>
]]></content:encoded>
			<wfw:commentRss>http://josh.st/2006/10/26/people-versus-search-engines/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>On the follies of Copyright expectations</title>
		<link>http://josh.st/2006/01/29/on-the-follies-of-copyright-expectations/</link>
		<comments>http://josh.st/2006/01/29/on-the-follies-of-copyright-expectations/#comments</comments>
		<pubDate>Sun, 29 Jan 2006 01:24:14 +0000</pubDate>
		<dc:creator>Josh</dc:creator>
				<category><![CDATA[Geek]]></category>
		<category><![CDATA[Life]]></category>
		<category><![CDATA[citizen media]]></category>
		<category><![CDATA[communal media]]></category>
		<category><![CDATA[conventional media]]></category>
		<category><![CDATA[copyright law]]></category>
		<category><![CDATA[DNS]]></category>
		<category><![CDATA[energy crisis]]></category>
		<category><![CDATA[influential podcast-media personality]]></category>
		<category><![CDATA[Internet-based communities]]></category>
		<category><![CDATA[mainstream media]]></category>
		<category><![CDATA[mass media]]></category>
		<category><![CDATA[mass-media-hostile personality]]></category>
		<category><![CDATA[multimodal media]]></category>
		<category><![CDATA[network-wide media]]></category>
		<category><![CDATA[Paul Sheehan]]></category>
		<category><![CDATA[present copyright law]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[share media/experience]]></category>
		<category><![CDATA[temporal media]]></category>
		<category><![CDATA[web media]]></category>
		<category><![CDATA[web publishing]]></category>
		<category><![CDATA[web remains]]></category>
		<category><![CDATA[web-based exploration]]></category>

		<guid isPermaLink="false">http://www.joahua.com/blog/?p=842</guid>
		<description><![CDATA[I’ve been occupied the last few days trying to get an effective fileserving/sharing/roaming profile (domains) environment working with Samba, and was thinking this evening about the implications of a network-wide media share. At present, it’s illegal, though not particularly morally reprehensible in view of the fact that all content on it would be ‘licensed’ (just [...]]]></description>
			<content:encoded><![CDATA[<p>I’ve been occupied the last few days trying to get an effective fileserving/sharing/roaming profile (domains) environment working with Samba, and was thinking this evening about the implications of a network-wide media share. At present, it’s illegal, though not particularly morally reprehensible in view of the fact that all content on it would be ‘licensed’ (just not for duplication in a digital form, under present copyright law — <a href="/blog/2005/12/29/yes-clemency">scheduled to be overturned</a>). </p>
<p>It is a truth universally acknowledged… that the absence of a fair-use provision in Australian copyright law is simply an oversight on the part of legislators. (Apologies to Austen fans :P)</p>
<p>What if it’s not?</p>
<p>There is, now, what Paul Sheehan termed “<a href="http://www.smh.com.au/news/opinion/little-squares-that-define-the-nation/2006/01/01/1136050344123.html">little squares of light</a>”, signifying connectivity in an “advanced, ironic, post-ethnic polyglot societ[y]”. Before that? The “Dark Age” (also Sheehan). It did exist. There was a time before computers and multimedia were intrinsically connected (depending on your definition of multimedia–multimodal media is perhaps more apt). There was, indeed, a time before multimedia existed — though we can, perhaps, trace its origins to Wagner’s 1849 essay, “The Artwork of the Future” and the notion of <span lang="de" title="Synthesis of the Arts">Gesamtkunstwerk</span> — which, in turn, traces back to Greek drama, but no matter!</p>
<p>Yet irrespective of when this arose, legislators are meant to have acknowledged the imminent rise of the copyright-violating, citizen-empowering, content-producer-collaboration–<em>dictat</em> at the hands of the web. We’re expecting the wrong thing. Media has progressed, the law hasn’t. Yet.</p>
<p>But what if it doesn’t? Does this matter? When I spoke to an influential podcast-media personality yesterday afternoon, it became clear that a fissure had emerged between citizen media and conventional mechanisms, one that certain people were <em>very</em> reluctant to bridge. Suspicion exists between the two ‘industries’ (though it was suggested that an ‘industry’ cannot exist until someone is making money: perhaps not the case with citizen media, overblown acquisitions aside) where ‘citizen media’ is concerned that any partnerships with ‘conventional media’ will stifle innovation. Clearly, this is wrong, and ignores the ‘citizen’ part of ‘citizen media’: no partnership can exist without the ‘citizen’ remaining, thus changing conventional media. And if the ‘citizen’ component is dissolved, it becomes a meaningless acquisition as ‘media’ already exists, and ‘citizen media’ without the ‘citizen’ has no impetus whatsoever.</p>
<p>However, that aside, this (perhaps mutual) hostility raises interesting notions.</p>
<p>If we consider the two to exist in entirely distinct and disparate spheres, then new possibilities arise. We accept that citing and re-using ‘mass media’ material in new creations is, for a time, impossible. We accept that a ‘normalisation’ is taking place, to cite the much-lauded ‘village square’ concept of communal media: that we are returning to a ‘normal’ state, and that broadcast top-down media was a temporary hiccup in the state of human being. The difference, then, is that we now exist in a globalised state where those with whom we communicate (or share media/experience) are not limited by geography… though that communication remains limited in scope (sensual experience, for example, is rather inhibited by the tyranny of distance).</p>
<p>In two hundred years, assuming mass media assimilated back into ‘normality’ today, all copyright would have expired and all work could be cited, quoted, re-used and abused as people willed it. There is clearly no great possibility of this happening: acknowledged even by the mass-media-hostile personality interviewed. Should we care? Maybe. If there is material worth reproducing, that is.</p>
<p>The web is still a temporal medium. Never before have such vast volumes of information been so volatile, in part because such vast volumes of information have never been so accessible (in an entirely un-web-standards-related sense). Hence, it is possible that the loss of this access will hurt more than it would have had we not known what was possible. The nature of this detachment from the web isn’t something to be discussed here — suffice it to say, a global energy crisis, war, censorship (because the web remains relatively dependent on a small number of servers — DNS root servers particularly) and a variety of other factors could all play a part. But what would this mean?</p>
<p>Earlier, I alluded to the ‘globalised village’ concept, and how that, in some senses (no pun intended), fails. What we are now seeing is a series of online ‘communities’ existing in parallel, with very occasional (but also very complex) perpendicular relationships. <strong>There is no global village</strong>. There is a series of global communities, in which people can choose to participate and engage to whatever extent they deem desirable. A series of factors aside from the web and <acronym title="Mainstream Media">MSM</acronym> have also led to the decline of the physical ‘village’ environment — urban sprawl, globalisation in a physical sense (highly mobile populations, etcetera) and the like are examples of such — but there is something wrong with an entirely directed, specific, no-overlap environment. <a href="http://kitten-man.com">Ben</a> remarked a day or two ago that it’s intriguing his three best friends all have an affinity for English (and two of those teaching it), whilst he is indifferent about the language, as about teaching (<a href="http://kitten-man.com/2006/01/26/5th-form-maths/">though remarked it is ‘fun’ where maths is concerned</a>!).</p>
<p>Rarely, in Internet-based communities, have I seen someone engage with people outside of their own area of principal interest. Web <em>sites</em> work like that. They are <em>sites</em> with a purpose: and, if they do not have a purpose, the traffic they attract is often sporadic and undirected. Even this blog has a purpose — it must, to have attracted (and retained) the attention of <a href="http://matthom.com/">an American</a> with an interest in web publishing. Once attention is engaged on one front, it is possible to explore others — it’s possible that people with an interest in web publishing and accessibility will read this post simply because it popped up in their feed reader and looked vaguely interesting (though length is doubtless a deterrent!). Back to the term ‘site’ — clearly, this word’s etymology ensures it cannot be divorced from its real-world meaning.</p>
<p>People do not simply enter a building for no reason. This parallel fails to some extent as the power of search engines comes into the equation — but, remember, search engines must <em>also</em> discover a ‘site’ at some point (impossible without incoming links). Which brings us back to the parallel-with-occasional-perpendicular-bridges image (note, parallel cannot mean linear because of the nature of hyperlinks. Perhaps I speak of parallel Möbius strips?)</p>
<p>Irrespective of the mechanisms for web-based exploration, web media and mainstream media <em>both</em> fail to serve an encompassing purpose of human interaction. Copyright makes no difference to this. Observe how distracted this post is. Observe how I return to the topic of copyright harshly, how it does not link to the important defining qualities of human interaction (which, it must be said, the web in part facilitates). This was both intentional and unavoidable: there is no better link. Copyright doesn’t matter, and previously created content under copyright does not matter. Eventually, copyright will dissolve, and a harmonisation between formerly detached publishing mechanisms (I have decided that is all the difference is) will come about. People will continue to express themselves, drawing on the content of their time — ideas are aside from copyright — whilst, perhaps, drifting apart from this new media and back into the village…</p>
]]></content:encoded>
			<wfw:commentRss>http://josh.st/2006/01/29/on-the-follies-of-copyright-expectations/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Name change</title>
		<link>http://josh.st/2005/11/04/name-change/</link>
		<comments>http://josh.st/2005/11/04/name-change/#comments</comments>
		<pubDate>Fri, 04 Nov 2005 11:37:22 +0000</pubDate>
		<dc:creator>Josh</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Georgia]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[site-by-site]]></category>
		<category><![CDATA[Style Street]]></category>
		<category><![CDATA[Tsar]]></category>

		<guid isPermaLink="false">http://www.joahua.com/blog/2005/11/04/name-change</guid>
		<description><![CDATA[I’d been keeping the name “StreetComputing” around because… it fit like an old shoe. I now think it’s kind of ugly, and it’s been hinted I should do away with it a handful of times, but I’d kept putting it off. Brand recognition and all that, at least with the folks at school, many of [...]]]></description>
			<content:encoded><![CDATA[<p>I’d been keeping the name “StreetComputing” around because… it fit like an old shoe. I now think it’s kind of ugly, and it’s been hinted I should do away with it a handful of times, but I’d kept putting it off. Brand recognition and all that, at least with the folks at school, many of whom remembered it from Year 10 (when, it must be said, the audience from that part of the world was many times greater than what it has been ever since. Mostly as a result of the proliferation of videos in which stupid things were done, and, I would hazard, the fact that I wrote as though I were completely illiterate). But those days are done. And, further away from the Real World, your site could be called “The Hippopotamus Tsar; or, The Adventures of One Irrelevant Title to a Land Not So Far Away” and do perfectly well by the search engines, assuming you had the content to match it.</p>
<p>Rebranding in a commercial context is risky. But I just made this site look like a whitewashed wall, and have signalled my general lack of interest in complex aesthetics here at this particular point in time. Further away, perhaps, but that’s something entirely separate. I’ve got three designs in the works at the minute, one of which is this site, another of which is less austere but certainly not what would readily be described as “busy”, and another I envisage [having not yet commenced its design outside of my head] will be a fair distance from #fff (white, for those non-geeks in the audience) and hopefully more visually complex than the other two. I say this so everyone is aware I’m not a lost cause to this whole sparse design thing. I just wanted change, but change is isolated site-by-site.</p>
<p>So whilst I’m ripping down the establishment in that sense, I may as well do the same thing with the name! I struggle to define a purpose for this site, which is an overwhelmingly good thing (I think) at this particular point in time. It means I am not confined by the tyranny of Purpose and its twin, Rationale. Nepotism runs rife.</p>
<p>And, seeing as I have this domain here, it does make quite a lot of sense to make the blog name match the domain name, for as long as the blog remains the key component of the domain. StreetComputing would have been a convenient identifier had my blog related to computers and been of approximately equal or slightly less than equal (either way) importance within this domain space, but it doesn’t relate to computers for the most part, and I’m not keen on splitting up blogs into separate nerdery (language), geekery (pure IT), tech-ery (unsure about the appropriateness of that suffix there, but pertaining to A/V) and web-ery (CSS/markup/occasional server-side) blogs.</p>
<p>Though I am tempted. I could call one of them “Cascading Style Street” ;-)</p>
<p>Anyway. This is now called Joahua.com. And that was a really lengthy explanation.</p>
<p>My concern now is chiefly for Georgia’s capital J, on which the lower serif (there’s probably a more precise typographical term) seems disproportionally large:</p>
<p><img src="/blog/wp-content/2005/11/joahuacomgeorgia.png" alt="Joahua.com set in Georgia" /></p>
<p>That aside, this whole change thing is fun. Can you tell it’s [nearly] the end of my HSC? ;-)</p>
]]></content:encoded>
			<wfw:commentRss>http://josh.st/2005/11/04/name-change/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Selling an audience short?</title>
		<link>http://josh.st/2005/09/14/selling-an-audience-short/</link>
		<comments>http://josh.st/2005/09/14/selling-an-audience-short/#comments</comments>
		<pubDate>Tue, 13 Sep 2005 22:03:33 +0000</pubDate>
		<dc:creator>Josh</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[advertising space]]></category>
		<category><![CDATA[Ansearch]]></category>
		<category><![CDATA[Commander]]></category>
		<category><![CDATA[Dean Jones]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Optum]]></category>
		<category><![CDATA[Search engine]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[search service]]></category>
		<category><![CDATA[VOIP]]></category>
		<category><![CDATA[web]]></category>
		<category><![CDATA[Yahoo!]]></category>

		<guid isPermaLink="false">http://www.joahua.com/blog/?p=688</guid>
		<description><![CDATA[Or, What Josh Said About Ansearch That Was Irrelevant to Most Users. Dean Jones responded to my Ansearch Answers post with the following: All in all I feel [the post is] a fair representation of the so called facts, but I stand by my recent email… namely that simply reviewing us on technical issues that [...]]]></description>
			<content:encoded><![CDATA[<h4>Or, What Josh Said About Ansearch That Was Irrelevant to Most Users.</h4>
<p>Dean Jones responded to my <a href="/blog/2005/09/13/ansearch-answers">Ansearch Answers post</a> with the following:</p>
<blockquote><p>All in all I feel [the post is] a fair representation of the so called facts, but I stand by my recent email… namely that simply reviewing us on technical issues that most people either</p>
<ol>
<li>wouldn’t have discovered, or;</li>
<li>would not likely care about,</li>
</ol>
<p>is selling your audience short.</p></blockquote>
<p>I’m inclined to disagree, and just wanted to quickly post to say that. I like to think I understand the ‘audience’ here fairly well. They’re either people with (web-)geek tendencies, and are hence interested in any analysis and criticism I can deliver on the technical aspects of products, etc., <em>or</em> (and this category is completely unrelated to the former) students and humanities-focussed people reading various content I’ve published here — ranging from <a href="/blog/2004/10/29/ibsens-a-dolls-house-set-design">stage plots</a> to <a href="/blog/2005/08/24/abbreviated-human">a short story</a> to <a href="/blog/2005/04/25/what-is-the-digital-divide-and-what-implications-for-society-and-the-individual-are-seen-to-arise-from-this">an essay on the nature and effects of the digital divide</a>.</p>
<p>Most guests in the latter category are just that: guests. They generally discover this content via a search engine, read what they want, and leave. Over 80% of my visitors stick around for one minute or less, presumably because they find what they need quickly, or discover that the content isn’t what they were looking for.</p>
<p>The “regular” audience/participants, however, are not that. I don’t think you’re all geeks, but this blog leans towards that style of content, and you match that accordingly. You don’t come here looking for product recommendations (the one exception to that being someone who viewed <a href="/blog/2005/01/18/josh-wants-to-install-voip-asterisk">my post on Asterisk/VoIP</a>, and asked me what my experiences with it had been some time later: to which I replied, we haven’t bothered, as we <a href="/blog/2005/02/14/two-weeks-in">moved into a house with a Commander system preinstalled!</a>). You come here, I think, for the quality of writing, for rants, for occasionally insightful (I hope) comment on various facets of things I deem interesting.</p>
<p>This is a blog. This is not a newspaper, though it is possible that search engines, ironically, are changing the clout of this medium to something similar. The distinction between newspaper and blog becomes blurred with posts like the one that inspired this, because of the form it was written in. It is important, however, to remember the audience.</p>
<p>People don’t come here to shop for search engines. We might be interested in how they work, what they do, what the potential benefits and failings of each one are, but ultimately it doesn’t affect anyone’s choice in the real world. Similarly, investors are unlikely to come here, scoping out Ansearch’s offering before buying into parent company Optum. And, if they did, my concluding remarks were positive — I genuinely believe the story balanced out in their favour more than anything else. If I overplayed the significance of a small flaw that could potentially be abused, my apologies. I don’t, however, regret including it in there at all, because I think it’s something my audience is interested in.</p>
<blockquote><p>As you stated in an earlier email… “I’m not 100% sure as to how one should go about reviewing a search engine.” Here’s a tip. like Google, Yahoo, MSN… we are a business. For us to stay in business we need to generate revenue.</p>
<p>To do this we need to get more people to our SE, to get them to come back more often, and to, through their usage (CPM, CPC etc…) generate revenue.</p>
<p>To achieve this we need to provide a search service that the user finds useful. Given our rapid growth over the past months in UV’s <em>and</em> revenue, I would say we are doing OK.</p></blockquote>
<p>Unfortunately for Ansearch and anyone else who wants to use this as an advertising space, we don’t particularly care if you’re making money. It’s good to hear they’ve grown: if their evolving product is anything to go by, they deserve it. But metrics such as revenue and Unique Visitors mean little to <em>this</em> audience, even if it’s what investors want to find out all about.</p>
<p>I think this is a fair assessment of this site’s ‘audience’ (the important ‘audience’, for me, being the minority who, rather than arriving through search engines, subscribe by RSS and come back regularly) — though, as always, your role is not restricted to that. You are participants. In light of this, I’d invite comment and discussion on this post as to your role as <em>you</em> understand it. It’s possible I’ve got this all wrong… but I doubt it.</p>
]]></content:encoded>
			<wfw:commentRss>http://josh.st/2005/09/14/selling-an-audience-short/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Ansearch answers</title>
		<link>http://josh.st/2005/09/13/ansearch-answers/</link>
		<comments>http://josh.st/2005/09/13/ansearch-answers/#comments</comments>
		<pubDate>Tue, 13 Sep 2005 12:46:21 +0000</pubDate>
		<dc:creator>Josh</dc:creator>
				<category><![CDATA[Geek]]></category>
		<category><![CDATA[Usability]]></category>
		<category><![CDATA[Web Standards]]></category>
		<category><![CDATA[absolute solution]]></category>
		<category><![CDATA[advertising impressions]]></category>
		<category><![CDATA[advertising network]]></category>
		<category><![CDATA[Ansearch]]></category>
		<category><![CDATA[Ansearch CEO]]></category>
		<category><![CDATA[ASX]]></category>
		<category><![CDATA[australia]]></category>
		<category><![CDATA[author]]></category>
		<category><![CDATA[blog search engine]]></category>
		<category><![CDATA[CEO]]></category>
		<category><![CDATA[Dean]]></category>
		<category><![CDATA[Dean Jones]]></category>
		<category><![CDATA[demographic based search feature]]></category>
		<category><![CDATA[full time manager]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Internet community]]></category>
		<category><![CDATA[Internet Explorer]]></category>
		<category><![CDATA[Internet users]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Major]]></category>
		<category><![CDATA[media types]]></category>
		<category><![CDATA[MP3]]></category>
		<category><![CDATA[News Corp]]></category>
		<category><![CDATA[NineMSN]]></category>
		<category><![CDATA[online properties]]></category>
		<category><![CDATA[Optum Ltd.]]></category>
		<category><![CDATA[player]]></category>
		<category><![CDATA[Search engine]]></category>
		<category><![CDATA[search engine division]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[search tool]]></category>
		<category><![CDATA[site owner/webmaster]]></category>
		<category><![CDATA[ubiquitous search behemoth]]></category>
		<category><![CDATA[Ubuntu]]></category>
		<category><![CDATA[update services]]></category>
		<category><![CDATA[web browser]]></category>
		<category><![CDATA[web community hopes]]></category>
		<category><![CDATA[Web developers]]></category>
		<category><![CDATA[Yahoo!]]></category>

		<guid isPermaLink="false">http://www.joahua.com/blog/?p=687</guid>
		<description><![CDATA[All had been quiet on the Ansearch front as I awaited a response from Ansearch CEO Dean Jones, promised a hair under two weeks ago when I alluded to an earlier analysis/criticism I’d written when talking about the state of play with Australian search engines, specifically referring to the then-newcomer Ansearch. Dean picked up my [...]]]></description>
			<content:encoded><![CDATA[<p>All <em>had</em> been <a href="/blog/2005/09/08/all-quiet-on-the-ansearch-front">quiet on the Ansearch front</a> as I awaited <a href="/blog/2005/08/29/something-exciting-in-the-australian-search-space#comment-4550">a response from Ansearch CEO Dean Jones</a>, promised a hair under two weeks ago when I alluded to <a href="/blog/2005/04/04/something-about-backwards-search-engines">an earlier analysis/criticism</a> I’d written when talking about the state of play with Australian search engines, specifically referring to the then-newcomer <a href="http://www.ansearch.com.au/">Ansearch</a>.</p>
<p>Dean picked up my post via <a href="http://technorati.com/">Technorati</a>, a blog search engine that uses RPC update services to track what people are talking about in real-time. I was suitably impressed by this diligence and apparent desire to hear what the market has to say about their product: could this be the same company whose birth was so marred by a spate of cyber-squatting, in what Dean Jones was <a href="http://australianit.news.com.au/articles/0,7204,12618818%5E15318%5E%5Enbv%5E15306,00.html">reported to have described as a fit of “youthful exuberance”</a>?</p>
<p>Apparently so. Ansearch’s beginnings, though marred by dubious practices<sup><a href="#687fn1" id="#687fn1-base">1</a></sup>, received praise from various quarters of the mainstream press — or, at least, those quarters not controlled by News Corp, whose domains had come under threat. However, the Internet community responded quietly, and those voices that were heard were mostly of disdain at Ansearch’s domain practices.</p>
<p>Strangely enough, my original post wasn’t about any of that. I hadn’t heard of Ansearch until I read <a href="http://www.smh.com.au/news/Technology/New-Australian-search-engine-launched/2005/04/04/1112489391541.html">an article on them in the SMH</a> — an article which reads a little too much like a rehashed press release for my liking: the telltale sign is in the closing sentence “Ansearch is the search engine division of Optum Ltd.” — if it were filed in the Business section of their paper, I’d understand, but it wasn’t.</p>
<p>I wandered over to their site, played around for a bit, and decided their offering was mediocre. In hindsight, it probably didn’t help that I wasn’t shopping for anything in particular — according to <a href="http://www.zdnet.com.au/news/communications/soa/Ansearch_launches_amid_domain_name_dispute/0,2000061791,39186987,00.htm">a ZDNet article</a>, “In the short term [Ansearch] is focusing very heavily on the commercial end of the market.” — but at that point in time, I also don’t think they’d tuned their listings particularly well, as a search for DashLite turned up my WordPress hack over commercial listings for the actual Dashlite brand I inadvertently used.</p>
<p>I say “at that point in time”, because it appears to have substantially improved since, as per Jones’ claim: “Much has changed since your first article on us some 6 months ago.”</p>
<p>Much improved, it seems, on several fronts. Their core offering has shaped up nicely, and  some facets of my initial complaints regarding accessibility have been met. Their ancillary product offerings seem to have developed nicely: Ansearch CEO Jones claims “Each of [our properties] goes through up to 7 stages ranging from an initial, simple <acronym title="Search Engine Results Page">SERP</acronym>/Directory style page through to a more involved service, mini portal, search tool, etcetera.” He went on to say that these ancillary properties (such as <a href="http://www.picsearch.com.au/">http://www.picsearch.com.au/</a>, <a href="http://www.videosearch.com.au/">http://www.videosearch.com.au/</a>, <a href="http://www.thefreedictionary.com.au/">http://www.thefreedictionary.com.au/</a> and <a href="http://www.messengers.com.au/">http://www.messengers.com.au/</a> amongst several others) are currently being actively separated from the core Ansearch site (he described it as “quarantining”), and the exact direction of a number of these projects would become clear over the coming months, with the appointment of a full time manager of these online properties.</p>
<p>I’m a tad concerned about his description of their strategy with regard to these — he said this would become clear over the months to come, and I’m hanging off two words here: distributed portal. Whilst I can see this as being of value to users (especially for generic, non-brand-specific/legally dubious domains such as jokes.com.au and the ones listed above), it doesn’t seem to fit Ansearch’s core strength as I perceive it: as a commercial portal, and not as another <a href="http://www.google.com.au/">Google</a>. “We are not aiming to be another Google… we don’t have their budget and, to be frank, there are enough people trying to clone them: why build another?”</p>
<p>In fact, Jones suggested that Ansearch’s strengths lie in that it is not the ubiquitous search behemoth, and that its index is “something unique… something faster… [and] against the so called “arms race” of search (my SE has more links than yours etc…)”. I’d agree this is indeed a strength, and also a reason for them not to try and be a portal. Australia already has Yahoo! and NineMSN for domestic portals, and I’m struggling to see what Ansearch will do to differentiate themselves in this: but I’m happy to be surprised!</p>
<p>Ansearch apparently holds an index of only 500,000 websites considered by its metrics to be “most popular”. I argued that this was potentially a bad thing as relevant content might lie outside this realm: for example, this website performs well when people search for <a href="/blog/2005/08/26/hp-photosmart-2610-review">reviews of the HP 2610</a> or information about <a href="/blog/2005/03/06/ubuntu-apache-and-making-mod_rewrite-happy">Apache on Ubuntu Linux</a> or <a href="/blog/2004/11/08/mp3-player-and-act-files">ACT files from MP3 players that record audio</a>, but isn’t included in Ansearch’s core index.</p>
<p>Which is perfectly valid for a commercially-focussed site; I just think they could be missing out a little bit. They could leverage my content for their advertising impressions and potential clickthroughs, because they would have more valuable content showing up in their listings alongside advertised products. If someone reads my HP 2610 review after having found it in Ansearch, and decides they’d like to buy it and remembers having seen a “Buy HP printers!” ad on Ansearch, they’ll most likely click “back”. It’s abstract, behavioural stuff, but valuable nonetheless.</p>
<p>Whether it’s valuable enough for them to bother is another matter. “We spider our own content… something that over time will be done daily,” says Jones. “Having only 500,000 websites will allow us to index sites more often, and as is the case with the ‘site info’ pages, provide far more info on these pages.” Which is a value-add, and worth preserving. If that’s all resources permit, I think they’re doing the right thing as is. Jones openly admits Ansearch’s index of popularity “has a commercial flavour to it” — and rightly so. Given their much-touted gender and age demographic based search feature, this makes sense.</p>
<p>Their index of popularity seems to be fairly slow-moving. “Monthly we add around 20,000 sites… and take out 20,000.” I’d guess it’s the lowest-ranked 20,000 that get shuffled, and this seems to make sense. One has to wonder whether all the higher-ranking pages can have substantially fresh content month after month, but presumably they do — it’s one of the things the <acronym title="Search Engine Optimisation">SEO</acronym> experts have always cried from rooftops.</p>
<p>It was interesting to hear Jones speaking about these people, too: amusing, even! Web developers the world over often join in speculation as to what exactly makes search engines tick, such that we can boost our clients’ (or employers’) websites’ performance. It seems the reverse is also true: search engines all over the world similarly speculate as to what those horrible developers are doing to screw with their indexes day in and day out!</p>
<p>I don’t say this in jest, and I believe they’re right to complain: “The larger SE’s are having a very tough time coming up with clever ways to index content to counter SEO… only to have SEO’rs quickly find ways around it. Cat and mouse…” I think “counter SEO” was a poor choice of words, given that relevant content should hopefully still be rewarded, but his point stands.</p>
<p>Just as interesting is Ansearch’s strategy to avoid falling prey to dodgy SEO tactics:</p>
<blockquote><p>By only indexing the root page, we remove almost all SEO trickery. This works in 2 ways. Firstly, people rarely put spam on their home page — that is, doorway pages, link farms, etc. usually reside away from the main index… and, secondly, it deletes multiple results from the same website. It also stops the site owner/webmaster from saying they are relevant to 100 or 1000 keywords or phrases.</p></blockquote>
<p>Kids, we just found a new argument against clients who love their splash pages!</p>
<p>Content rich front pages aren’t, however, an absolute solution (at least, not in Ansearch’s index). According to Jones, Ansearch’s policy of “ranking sites in true <em>usage</em> popularity, both on <em>and</em> offsite” is “SEO proof… or at the very least, extremely resistant.” I’d agree it’s a powerful metric, but my reservations above still stand.</p>
<p>One quirk of Ansearch’s algorithm that appears potentially exploitable is its failure to exclude content in the &lt;head&gt; from indexing. I don’t just speak of standard meta author/keywords data, but of something else.</p>
<p><a href="http://ansearch.com.au/furtherinfo?id=zvzshyzdzm"><img src="/blog/wp-content/2005/09/ansearchengadget.png" alt="A screenshot highlighting the inclusion of information between style tags in Ansearch's index" /></a></p>
<p>As highlighted in the screenshot above (click for original page, link may expire), Ansearch’s listing is including content between &lt;style&gt; tags. This presents potential for SEO abuse<sup><a href="#687fn2" id="#687fn2-base">2</a></sup>, as most browsers happily overlook errors in CSS — and &lt;style&gt; tags can be placed towards the top of a document: if we are to believe the SEO myths, increasing their relevance in engines. Of course, it’s entirely possible the content bears no weight at all — but the question of why it is stored in their index at all remains unanswered.</p>
<p>This is another reason to reward websites that use semantic markup properly, though at this stage that would exclude disproportionate amounts of the web, so I understand engines’ hesitance to embark on anything like this. “It’s not something a lot of sites use”, says Jones, before continuing “but it will be used more and more in the future.” Well, so much of the web community hopes.</p>
<p>This formed part of Ansearch’s defense for not having embraced semantic markup from the outset. According to Jones, it’s built on a technology developed for a pre-April 2000 (dot com crash) search engine — so that partially excuses the markup at launch time. Jones’ first comment on their failure to use semantic markup was simply that “The majors [Google and Yahoo!] don’t use it” — something I’d dispute the validity of, as Ansearch isn’t a “major” player, and, as has been established, is chasing a fairly different market sector. Their core business is search, but it’s a different breed of search conducted in a different way: and semantic markup and accessibility <em>is</em> a different way. Encouragingly, Jones sees the potential for embracing semantic markup in the future on both technical and commercial grounds: “It makes sense to use it and as it does open us to a wider audience with various devices used to browse our site.”</p>
<p>He didn’t cite the “reduced bandwidth expenditure as a result of lightweight code” reason, presumably because their host, <a href="http://www.ozhostingadvanced.com/">OzHosting/Destra</a> charges only for the link, not for transfers over this, on their dedicated server range.</p>
<p>Irrespective of their reasons, the future of Ansearch in terms of markup is promising:</p>
<blockquote><p>Our long term goal is to have Ansearch website designed without any tables and heavily styled using the CSS, which eventually will gives us more control on how we present our site to different media types.</p>
<p>Ansearch has gone through several minor enhancements over the past 6 months with the releases of versions 1 to 1.3. We are currently planning a major update for version 2.0 and the issues [of semantic markup and separation of presentation and content] will be addressed.</p></blockquote>
<p>But as we know, markup isn’t everything: content is what <del>ranks well in search engines</del> erm… content is what draws an audience. Ansearch’s exploration into the development of portal environments is something to be watched with interest over the coming months, as well as its other business aspects, including an advertising network known as <a href="http://www.soush.com/">Soush</a> that remains slightly enigmatic, and the mysteriously named “Factory” division.</p>
<p>An announcement is expected to be filed with the <acronym title="Australian Stock Exchange">ASX</acronym> later this week outlining something of Ansearch’s future direction: At this stage, I’m inclined to believe that the future is a positive one, as Ansearch distances itself from its much-criticised practices at launch, to a diverse range of product offerings that uniquely fulfil the needs of Australian Internet users.</p>
<p><ins>Update: A followup to this has been posted, in response to a criticism that this review was overly technical in nature. Read on!</ins></p>
<h4>Notes</h4>
<p><sup><a href="#687fn1-base" id="#687fn1">1</a></sup> Justified with the catch-cry “MSN do it, so we can, too!” — to which the only sensible reply is, “yes, but MSN do it with Internet Explorer, and as soon as you go and write your own web browser, feel free to hijack as many unused pages as you want.“<br />
<sup><a href="#687fn2-base" id="#687fn2">2</a></sup> I notified Ansearch of this shortly prior to publication in the hope that, if this is indeed an issue, it will be resolved before this post is noticed and widely acted upon. One hopes this potential problem disappears quickly.</head></p>
]]></content:encoded>
			<wfw:commentRss>http://josh.st/2005/09/13/ansearch-answers/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Statistics for April</title>
		<link>http://josh.st/2005/05/01/statistics-for-april/</link>
		<comments>http://josh.st/2005/05/01/statistics-for-april/#comments</comments>
		<pubDate>Sun, 01 May 2005 00:13:09 +0000</pubDate>
		<dc:creator>Josh</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[certain syndication services]]></category>
		<category><![CDATA[Manitoba]]></category>
		<category><![CDATA[Search engine]]></category>
		<category><![CDATA[search engine spidering]]></category>
		<category><![CDATA[search engines]]></category>

		<guid isPermaLink="false">http://www.joahua.com/blog/2005/05/01/statistics-for-april</guid>
		<description><![CDATA[This month past has regrettably been one in which I haven’t spent enough time creating content, although March was fairly good, but the traffic didn’t really falter for it. Unique: 2091 Visits: 4281 Pages: 14158 Hits: 44419 Bandwidth: 484.76 MB Unique impressions are up by over 600 from last month, suggesting increased driven traffic (from [...]]]></description>
			<content:encoded><![CDATA[<p>This month past has regrettably been one in which I haven’t spent enough time creating content, although March was fairly good, but the traffic didn’t really falter for it.</p>
<p><strong>Unique:</strong> 2091<br />
<strong>Visits:</strong> 4281<br />
<strong>Pages:</strong> 14158<br />
<strong>Hits:</strong> 44419<br />
<strong>Bandwidth:</strong> 484.76 MB</p>
<p>Unique impressions are up by over 600 from last month, suggesting increased driven traffic (from external sites, search engines, etc., which accounted for 13.1% of total traffic, as opposed to incidental traffic — that is, regular visitors, syndication, etc.).  Bandwidth is slightly up on March, exceeding 500MB if non-viewed statistics are included (there was 192.48 MB of non-viewed traffic, which means search engine spidering as well as certain syndication services <em>I think</em>), which is getting fairly sizeable I think, especially compared to last year’s statistics for this month (which aren’t really valid, because there was a holder page up then, but it’s fun to point out) — 2.38MB of traffic and 30 unique visitors!</p>
<p>The most popular post on this website remains the <a href="/blog/2005/03/19/dashlite-an-alternative-dashboard-for-wordpress-15">original DashLite announcement</a>, although the updated version doesn’t really get a look in… which is okay, because it was released more out of social responsibility than any new <em>need</em>, and people can choose for themselves what they want.</p>
<p>See also:<br />
<a href="/blog/2005/03/01/statistics-for-february">February 2005 statistics</a><br />
<a href="/blog/2004/11/01/traffic-summary">September/October 2004 statistics</a><br />
<a href="/blog/2004/07/01/statistics-and-a-gimmick">June 2004 statistics</a></p>
]]></content:encoded>
			<wfw:commentRss>http://josh.st/2005/05/01/statistics-for-april/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Something about backwards search engines</title>
		<link>http://josh.st/2005/04/04/something-about-backwards-search-engines/</link>
		<comments>http://josh.st/2005/04/04/something-about-backwards-search-engines/#comments</comments>
		<pubDate>Mon, 04 Apr 2005 07:21:59 +0000</pubDate>
		<dc:creator>Josh</dc:creator>
				<category><![CDATA[Design]]></category>
		<category><![CDATA[Geek]]></category>
		<category><![CDATA[Usability]]></category>
		<category><![CDATA[Web Standards]]></category>
		<category><![CDATA[AAP]]></category>
		<category><![CDATA[australia]]></category>
		<category><![CDATA[brand new search engine]]></category>
		<category><![CDATA[fresh search engine]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[internal web team]]></category>
		<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[main search results]]></category>
		<category><![CDATA[search clutter]]></category>
		<category><![CDATA[Search engine]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[search terms]]></category>
		<category><![CDATA[software providers]]></category>
		<category><![CDATA[Sydney Morning Herald]]></category>
		<category><![CDATA[the Sydney Morning Herald]]></category>

		<guid isPermaLink="false">http://www.joahua.com/blog/2005/04/04/something-about-backwards-search-engines</guid>
		<description><![CDATA[No, I’m not talking about elgooG. The Sydney Morning Herald published an article entitled “New Australian search engine launched” today, the first paragraph of which reads “Australia’s newest search engine Ansearch opens for business today with a novel twist, demographic searching.” It’s not a particularly well written article, but the article vendor is AAP, not [...]]]></description>
			<content:encoded><![CDATA[<p>No, I’m not talking about <a href="http://www.alltooflat.com/geeky/elgoog/">elgooG</a>.</p>
<p>The Sydney Morning Herald published an article entitled “<a href="http://www.smh.com.au/news/Technology/New-Australian-search-engine-launched/2005/04/04/1112489391541.html">New Australian search engine launched</a>” today, the first paragraph of which reads “Australia’s newest search engine <a href="http://www.ansearch.com.au/">Ansearch</a> opens for business today with a novel twist, demographic searching.”  It’s not a particularly well written article, but the article vendor is AAP, not the SMH itself, so we’ll leave <em>that</em> alone, at least for the minute.</p>
<p>It goes on to laud the search engine for their innovation, both in this feature of demographic searching, and in other areas:</p>
<blockquote><p>Ansearch says it cuts down search clutter by displaying the main search results as single websites and not the individual pages of websites.</p></blockquote>
<p>What, like the Google [More results from domainname] feature?  You know, the one that actually <em>works properly</em>?  I say “works properly”, because a quick search of Ansearch reveals that their “cutting search clutter” feature is a tad broken — not to mention their character encoding.</p>
<p><img src="/blog/wp-content/2005/04/ansbrokendashlite.jpg" alt="Proof that it's broken, demonstrated by duplicate entries and incorrectly encoded characters" /></p>
<p><span id="more-523"></span></p>
<p>Speaking of broken character encoding, let’s take a look at their source… well, they get some marks — at least they bothered with a Content-Type.  Never mind if the content is <em>broken</em> when displayed with that Content-Type — it’s not like a search engine could actually do any useful data processing to make things display correctly when using a slightly redundant Content-Type… oh, wait, disregard that comment: <em>they’re not using a doctype, either</em>.</p>
<p>See, what gets me is that this search engine has <em>just been launched</em>.  Which means the climate in which it’s been developed isn’t the same as 5 years ago, when accessibility was just on the very edges of the radar — you’d (wishfully) imagine that at least a <strong>doctype</strong> wouldn’t be too much to ask for, even if they still insisted on using table-based layouts.  Interestingly enough, that’s what one of their software providers, <a href="http://www.omniture.com">Omniture</a>, have done.  Which leaves something of a foul taste in the mouth, too, because they’re reselling that garbage to people — including, if you believe their website, three of the five top Fortune 500 companies (aside: doesn’t that make them three of the top Fortune 5?).</p>
<p>Perhaps that criticism is unfair — their latest version (assuming that’s what powers their own website, although possibly not… maybe their internal web team accepts that their product would be overkill, and coded it in Dreamweaver, instead… some of the JavaScript certainly looks Dreamweaver-esque, and, if Ansearch’s website is any example, the doctype probably doesn’t come from the Omniture system!) seems to handle much better than what Ansearch are running: I say this, because apparently they’re using a version which was written back in 2003.  Hey, if it works… but we’ve already established it <em>doesn’t</em>.</p>
<p>And there concludes my rave review of yet another quite-some-way-from innovative and fresh search engine, this time in Australian waters.</p>
<p><small>Note: I’m not saying it’s any worse in terms of accessibility, usability, and semantics than most other search engines are — only that it has less excuse, being launched <em>now</em>, as opposed to 5 or 10 years ago.  It’s easier to make something work first time than it is to haphazardly patch over it later, especially something as gargantuan as I’d imagine a search engine would be.</small></p>
<p>Oh, and now for something that’s just plain amusing — the number 1 search terms on this brand new search engine, from befuddled users wondering why it sucks so much:</p>
<p><img src="/blog/wp-content/2005/04/anspopsearch.jpg" alt="Top search queries are Google for both Weekly and Monthly statistics" /></p>
<p>Yes indeed, the first thing users did was try to escape… how’s that for telling?</p>
]]></content:encoded>
			<wfw:commentRss>http://josh.st/2005/04/04/something-about-backwards-search-engines/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Graveyard retired</title>
		<link>http://josh.st/2005/04/03/graveyard-retired/</link>
		<comments>http://josh.st/2005/04/03/graveyard-retired/#comments</comments>
		<pubDate>Sun, 03 Apr 2005 02:13:18 +0000</pubDate>
		<dc:creator>Josh</dc:creator>
				<category><![CDATA[Before WordPress]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Search engine]]></category>
		<category><![CDATA[search engine traffic]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[search function]]></category>

		<guid isPermaLink="false">http://www.joahua.com/blog/2005/04/03/graveyard-retired</guid>
		<description><![CDATA[Some more attentive regulars (who don’t just peruse this website by means of syndication) may have noticed the disappearance of a link in the top bar in the last several hours. This is because I’ve finally got all the old content into WordPress, with no small amount of assistance from Michael, under a category called [...]]]></description>
			<content:encoded><![CDATA[<p>Some more attentive regulars (who don’t just peruse this website by means of syndication) may have noticed the disappearance of a link in the top bar in the last several hours.  This is because I’ve <em>finally</em> got all the old content into WordPress, with no small amount of assistance from <a href="http://www.bluetrait.com/">Michael</a>, under a category called “<a href="/blog/category/before-wordpress/">Before WordPress</a>” (this post is categorised similarly, and shall likely be the last ever entry into that category).</p>
<p>Practically, this means that that content is using semantically better markup, has better meta information for search engines, and is <em>internally</em> searchable, using the WordPress search function (it wasn’t before).</p>
<p>For most regulars, this probably doesn’t mean much, but the old articles attract the most search engine traffic, so this’ll help people find relevant content.  The old script was indexed rather poorly: there was no formal permalink structure, just a bunch of loose query strings, which search engines didn’t like.</p>
<p>&lt;/geek off&gt;</p>
]]></content:encoded>
			<wfw:commentRss>http://josh.st/2005/04/03/graveyard-retired/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A quick note on accessibility and SEO</title>
		<link>http://josh.st/2005/04/01/a-quick-note-on-accessibility-and-seo/</link>
		<comments>http://josh.st/2005/04/01/a-quick-note-on-accessibility-and-seo/#comments</comments>
		<pubDate>Fri, 01 Apr 2005 02:03:41 +0000</pubDate>
		<dc:creator>Josh</dc:creator>
				<category><![CDATA[Design]]></category>
		<category><![CDATA[Usability]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[www.simonbreakspear.com.au]]></category>

		<guid isPermaLink="false">http://www.joahua.com/blog/2005/04/01/a-quick-note-on-accessibility-and-seo</guid>
		<description><![CDATA[Just thought I’d brag for a moment about how, with only one incoming link in the entire world, and no visible text on the webpage, a site achieved the number 1 Google match globally in under 24 hours after adding the first link. Make that two links. The point stands, there’s not even any visible [...]]]></description>
			<content:encoded><![CDATA[<p>Just thought I’d brag for a moment about how, with only one incoming link in the entire world, and no visible text on the webpage, a site achieved the <a href="http://www.google.com/search?q=simon+breakspear&#038;start=0&#038;start=0&#038;ie=utf-8&#038;oe=utf-8">number 1 Google match globally</a> in under 24 hours after adding the first link.</p>
<p><a href="http://www.simonbreakspear.com.au/"><img src="http://www.joahua.com/blog/wp-content/2005/04/sb.jpg" alt="Simon Breakspear" /></a></p>
<p>Make that two links.  The point stands, there’s not even any visible text on that page, and, had it been created with Fireworks or Photoshop or something that autosplices and makes a table, search engines still wouldn’t know about it.</p>
<p><em>Full disclosure: <a href="http://www.simonbreakspear.com.au/">www.simonbreakspear.com.au</a> is a website developed by <a href="http://www.base10solutions.com.au/">base10solutions</a>, for whom I work.  The holder page located there at present should only persist a while longer, until the full website launches.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://josh.st/2005/04/01/a-quick-note-on-accessibility-and-seo/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Trackback spam</title>
		<link>http://josh.st/2005/01/06/trackback-spam/</link>
		<comments>http://josh.st/2005/01/06/trackback-spam/#comments</comments>
		<pubDate>Wed, 05 Jan 2005 22:53:49 +0000</pubDate>
		<dc:creator>Josh</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[filter systems]]></category>
		<category><![CDATA[Search engine]]></category>
		<category><![CDATA[search engine optimisation techniques]]></category>
		<category><![CDATA[search engine spiders]]></category>
		<category><![CDATA[search engines]]></category>

		<guid isPermaLink="false">http://www.joahua.com/blog/2005/01/06/trackback-spam</guid>
		<description><![CDATA[Oh. My. Goodness. Everyone else had been talking about it, but thus far my part of the world had remained relatively unaffected by it all — until this morning. I woke up at around 6, got up, checked my email, and (so I thought) my blog comments, before going back to sleep and waking again [...]]]></description>
			<content:encoded><![CDATA[<p>Oh. My. Goodness.  Everyone else had been talking about it, but thus far my part of the world had remained relatively unaffected by it all — until this morning.  I woke up at around 6, got up, checked my email, and (so I thought) my blog comments, before going back to sleep and waking again around 9ish.  At which point I was greeted with in excess of 40 trackback spam comments, most (all?  Does WP apply filters to trackbacks?  These were pretty blatant — sex, incest, bestiality — all plainly spelt for my filters to supposedly recognise — but still got through) of which were displayed on various pages of the blog.</p>
<p>Damn.</p>
<p>I very nearly disabled ping/trackbacks as a knee-jerk response to the whole thing, but didn’t.  What the hell do spammers aim to gain by doing this!?  In the case of blog-spam, especially!  <strong>DO YOU SEE MY REACTION?!  I’M NOT CLICKING YOUR LINKS TO SEE PICTURES OF INCEST AND BESTIALITY!  I’M CONSIDERING CLOSING OTHERWISE LEGITIMATE AVENUES OF COMMUNICATION JUST TO SEE YOU FAIL.</strong></p>
<p>Let’s all have a round of applause for their search engine optimisation techniques.  Not.  It’s not even a direct inbound link, which means it carries less weight with search engines anyway!  Considering you appear to be flouting all kinds of filter systems, at least take advantage of it and do it properly!  Morons.  And, you know what?  I care enough that next time I get hit, I’ll take the time to remove all the comment spam over again.  If a search engine spiders me in that moment, your rank increases, right up until it spiders me again and it’s all removed.  Dare I say it, designing porn sites with ALT text and semantics in mind for the ultimate “blind user” is actually more intelligent than their current spamming methods.</p>
<p>Realise something:  Your trackbacks are utterly unfiltered at this point in time.  WordPress is open source, free-as-in-beer (porn?) software.  Anyone who had a shred of IT knowledge and 20 minutes could tell you that the trackbacks would get through.  Further, they would tell you that direct links weigh higher than indirect content links from external pages in terms of SEO.  And yet the trackbacks are nothing more than strings of words, with the ONLY link being an originating location link (for the trackback) — I’m NOT saying the entire content should be wrapped in an <code>a</code> tag — I’m saying the content itself is utterly not what it should be.  Contrary to popular belief, even spammers and porn networks should be capable of competently constructing brief messages using correct sentence structure and the like, whilst keeping a maximum of keywords intact.  Why am I saying this?  I couldn’t clearly explain.  I’m angry, and the only way I can (rationally) vent is by demonstrating (technically, ignoring all moral or other issues) what they’re doing wrong, and how this makes them ignorant pigs.  Yes, I’d rather use stronger language.  No, I’m not going to.</p>
<p>As a matter of defiance, please, if you’ve read this post and hold similar views on the subject, send me a trackback.  Just to show there’s legitimate use for it, and it needn’t be disqualified because of the criminal acts of spammers.  And, if you know of any (vigilante or not) campaigns in the works similar to Make Love Not Spam, let me know.  Please.</p>
<p><small>I’ve disabled commenting for this post to encourage people to use track- and ping-backs.  Having said that, if you have something to say and don’t have a blog or access to ping/trackback facilities, head over to the <a href="/contact/">Contact</a> page and drop me a line with what you want to say, and we’ll figure out a way to get it posted.</small></p>
]]></content:encoded>
			<wfw:commentRss>http://josh.st/2005/01/06/trackback-spam/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>No, really, I’m here</title>
		<link>http://josh.st/2005/01/05/no-really-im-here/</link>
		<comments>http://josh.st/2005/01/05/no-really-im-here/#comments</comments>
		<pubDate>Wed, 05 Jan 2005 09:18:23 +0000</pubDate>
		<dc:creator>Josh</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[A Doll's House]]></category>
		<category><![CDATA[Amazon.com]]></category>
		<category><![CDATA[search engines]]></category>

		<guid isPermaLink="false">http://www.joahua.com/blog/2005/01/05/no-really-im-here</guid>
		<description><![CDATA[Again. Promise I’m not about to scurry away. Damn it, believe me! Anyway, to make up for the distinct lack of content this place has been suffering from (although the hits don’t suffer, thanks to search engines and previous relevant content regarding MP3 players, the set of A Doll’s House, and miscellaneous other things…), here’s [...]]]></description>
			<content:encoded><![CDATA[<p>Again.  Promise I’m not about to scurry away.  Damn it, believe me!</p>
<p>Anyway, to make up for the distinct lack of content this place has been suffering from (although the hits don’t suffer, thanks to search engines and previous relevant content regarding MP3 players, the set of A Doll’s House, and miscellaneous other things…), here’s a list of what I’ve been reading over the last week or two.  If you can’t read the titles on the cover images, hold your mouse over and text will appear.  Clicking images takes you to the relevant <a href="http://www.amazon.com/exec/obidos/redirect?tag=httwwwjoacom-20&amp;path=%2F">Amazon.com</a> product page (great for reviews that I’m too lazy to type now…).  I’ve got more to say on some of these books, but can’t be bothered typing right about now.</p>
<p><a href="http://www.amazon.com/exec/obidos/ASIN/0140444300/httwwwjoacom-20"><img border="0" src="/blog/wp-content/2005/01/lesmis.jpg" alt="Les Misérables (Hugo)" title="Les Misérables (Hugo)" /></a> <a href="http://www.amazon.com/exec/obidos/ASIN/0765342537/httwwwjoacom-20"><img border="0" src="/blog/wp-content/2005/01/dosadi.jpg" alt="The Dosadi Experiment (Herbert)" title="The Dosadi Experiment (Herbert)" /></a> <a href="http://www.amazon.com/exec/obidos/ASIN/0415943221/httwwwjoacom-20"><img border="0" src="/blog/wp-content/2005/01/futureactive.jpg" alt="Future Active: Media Activism and the Internet (Meikle)" title="Future Active: Media Activism and the Internet (Meikle)" /></a> <a href="http://www.amazon.com/exec/obidos/ASIN/0671027360/httwwwjoacom-20"><img border="0" src="/blog/wp-content/2005/01/angelsdemons.jpg" alt="Angels and Demons (Brown)" title="Angels and Demons (Brown)" /></a><br />
<a href="http://www.amazon.com/exec/obidos/ASIN/0312263120/httwwwjoacom-20"><img border="0" src="/blog/wp-content/2005/01/digitalfortress.jpg" alt="Digital Fortress (Brown)" title="Digital Fortress (Brown)" /></a> <a href="http://www.amazon.com/exec/obidos/ASIN/0521293669/httwwwjoacom-20"><img border="0" src="/blog/wp-content/2005/01/hamlet.jpg" alt="Hamlet (Shakespeare)" title="Hamlet (Shakespeare)" /></a> <a href="http://www.amazon.com/exec/obidos/ASIN/0802132758/httwwwjoacom-20"><img border="0" src="/blog/wp-content/2005/01/rosencrantzguildenstern.jpg" alt="Rosencrantz and Guildenstern are Dead (Stoppard)" title="Rosencrantz and Guildenstern are Dead (Stoppard)" /></a> <a href="http://www.amazon.com/exec/obidos/ASIN/0345410998/httwwwjoacom-20"><img border="0" src="/blog/wp-content/2005/01/streetboys.jpg" alt="Street Boys (Carcaterra)" title="Street Boys (Carcaterra)" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://josh.st/2005/01/05/no-really-im-here/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Assessments, nomadism and the ACCC</title>
		<link>http://josh.st/2004/11/16/assessments-nomadism-and-the-accc/</link>
		<comments>http://josh.st/2004/11/16/assessments-nomadism-and-the-accc/#comments</comments>
		<pubDate>Tue, 16 Nov 2004 07:18:06 +0000</pubDate>
		<dc:creator>Josh</dc:creator>
				<category><![CDATA[Geek]]></category>
		<category><![CDATA[Life]]></category>
		<category><![CDATA[School/Uni]]></category>
		<category><![CDATA[microsoft]]></category>
		<category><![CDATA[Reading]]></category>
		<category><![CDATA[search engines]]></category>

		<guid isPermaLink="false">http://www.joahua.com/blog/2004/11/16/assessments-nomadism-and-the-accc</guid>
		<description><![CDATA[All three items mentioned in the title of this post have, fortunately, nearly no connection, aside from the fact that they’re all prevailing influences in my life at the moment. Well, okay, the ACCC isn’t a “prevailing influence”, but it’s being thought of. Those of you following the “Dear Microsoft” thread will possibly understand what [...]]]></description>
			<content:encoded><![CDATA[<p>All three items mentioned in the title of this post have, fortunately, nearly no connection, aside from the fact that they’re all prevailing influences in my life at the moment.  Well, okay, the ACCC isn’t a “prevailing influence”, but it’s being thought of.<span id="more-151"></span>  Those of you following the “<a href="http://www.joahua.com/blog/2004/11/11/dear-microsoft">Dear Microsoft</a>” thread will possibly understand what I’m referring to: It’s been updated today, following a phone call from the ACCC.  Head over there to find out more, because I don’t care to repeat myself elsewhere… it’ll only confuse search engines, Microsoft, the ACCC and myself (although not necessarily in that order).</p>
<p>The house-finding thing is yet to be tied up, unfortunately… this means I’m going to be moving not once, but twice (at a minimum) in the coming weeks (months).  More glad than ever about that PO Box, now.  The moves shall be punctuated by a cruise to miscellaneous Islands, which should be an enjoyable experience, although I’m planning on taking copious quantities of books and reading my way through it: there’s only so much you can do on a boat, after all.  That, and there are a handful of assessments of which I have already received notification for next year, for which reading is fortunately requisite.  Oh, yeah, and the whole final year of school slash HSC thing means this holiday is probably the last chance I’ll have to read for about 12 months… yes, advantage shall be taken!</p>
<p>Speaking more immediately, my goal is to make it to the weekend alive — I’ve got an oh-so-slightly menacing Modern assessment (a speaking task at that, ahhh!!!) due Friday, which I’m stressing a little about, and then a (completely insubstantial) proposal due for my Extension 2 major work (that’d be English) on Monday — I say “completely insubstantial” because it’s only 1,500 words, but that doesn’t mean I’m stressing any less about it!</p>
<p>As per usual, regardless of whether anyone is actually <em>interested</em> in reading my submitted work, I’ll probably end up posting either a complete or somewhat abridged version of at least the Extension 2 proposal here… the Modern… well, perhaps.  I note with some disdain that the Wikipedia article entitled “<a href="http://en.wikipedia.org/wiki/Paris_Peace_Conference%2C_1919">Paris Peace Conference, 1919</a>” is, in the wiki-vernacular, a stub… I may contribute to it once all this is over, not because I consider myself to be particularly knowledgeable, but simply because <em>some</em> content is better than none! (Assuming aforementioned content is accurate, which I sincerely hope it is!)</p>
<p>Reading between the lines of this post probably yields a message stating that Josh doesn’t have time (or the resources) to post in the immediate future, so anticipate a slow-down.  Whilst that’s one possible reading of this post, I don’t know that’s entirely true.  I’ll have to wait and see, as will everyone else!</p>
]]></content:encoded>
			<wfw:commentRss>http://josh.st/2004/11/16/assessments-nomadism-and-the-accc/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Robots.txt</title>
		<link>http://josh.st/2004/11/06/robotstxt/</link>
		<comments>http://josh.st/2004/11/06/robotstxt/#comments</comments>
		<pubDate>Sat, 06 Nov 2004 06:00:17 +0000</pubDate>
		<dc:creator>Josh</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Apache]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Search engine]]></category>
		<category><![CDATA[search engine counts]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[web site directory]]></category>

		<guid isPermaLink="false">http://www.joahua.com/blog/2004/11/06/robotstxt</guid>
		<description><![CDATA[I’ve been racking up a few too many 404’s of late, which I think is due to hits on my robots.txt file falling flat on their face (ooh, alliteration!) — this means that search engines are looking for this page, and getting errors happening. Reduce your 404’s today, and let search engines use their time [...]]]></description>
			<content:encoded><![CDATA[<p>I’ve been racking up a few too many <code>404</code>’s of late, which I think is due to hits on my <code>robots.txt</code> file falling flat on their face (ooh, alliteration!) — this means that search engines are looking for this page, and getting errors happening.  Reduce your <code>404</code>’s today, and let search engines use their time (and your bandwidth) more efficiently!<span id="more-134"></span></p>
<p><code>robots.txt</code> is the first file requested by a search engine (or any other <em>robot</em>) as it begins to spider a website.  You can use this file to (theoretically) allow or disallow all search engines, or specific search engines, access to certain pages of your website; compliance is voluntary, so a badly behaved robot can simply ignore it.  Each spider should have its own unique identifier, or user-agent string, which can be used to apply specific rules to it in a <code>robots.txt</code> file.</p>
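<p>As a rough illustration of how user-agent groups work, here is a minimal sketch using Python’s standard <code>urllib.robotparser</code> module.  The robot names <code>ExampleBot</code> and <code>SomeOtherBot</code>, the <code>/private/</code> path and the rules themselves are made up for the example; they are not rules from this site.</p>

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one group for a specific robot, one blanket group.
EXAMPLE_RULES = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(rules_text):
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser


parser = build_parser(EXAMPLE_RULES)

# ExampleBot matches its own group, so it is shut out of everything.
print(parser.can_fetch("ExampleBot", "/index.html"))
# Any other robot falls through to the blanket "*" group.
print(parser.can_fetch("SomeOtherBot", "/index.html"))
print(parser.can_fetch("SomeOtherBot", "/private/notes.html"))
```

<p>A robot that honours the file would make this kind of check before requesting each page; one that doesn’t will never look.</p>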
<p>Personally, that’s of little relevance to me: blanket rules seem to work quite nicely.  A <code>robots.txt</code> file has to sit in the root of your website, since that is the only place robots look for it.  In the case of this website, that means <a href="http://www.joahua.com/robots.txt">http://www.joahua.com/robots.txt</a>.</p>
<p>With <a href="http://wordpress.org/">WordPress</a>, the blogging software in use here, there are some files which spiders have absolutely no need or reason to access.  Keeping robots away from them should not only reduce bandwidth use, but also (possibly) improve security.</p>
<p>My <code>robots.txt</code> file, as of around 4:30pm today, looks like this:</p>
<p><code>User-agent: *<br />
Disallow: /cgi-bin/<br />
Disallow: /blog/wp-comments-post.php<br />
Disallow: /blog/wp-login.php<br />
Disallow: /blog/wp-register.php</code></p>
<p>Its contents are fairly self-explanatory, I think, but basically this stops search engines from looking in my cgi-bin directory, from hitting the comment posting page, and from trying to log in to my administration area for WordPress.</p>
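<p>Each <code>Disallow</code> line is matched as a prefix of the requested path, which is why a single <code>/cgi-bin/</code> rule covers everything inside that directory.  Here is a simplified Python sketch of that check for the rules above.  It ignores user-agent groups and anything fancier than plain prefixes, and the function name <code>is_blocked</code> and the sample paths are just made up for the example.</p>

```python
# The Disallow prefixes from the robots.txt file in this post.
DISALLOWED_PREFIXES = [
    "/cgi-bin/",
    "/blog/wp-comments-post.php",
    "/blog/wp-login.php",
    "/blog/wp-register.php",
]


def is_blocked(path):
    """Return True if a rule-abiding robot should skip this path."""
    return any(path.startswith(prefix) for prefix in DISALLOWED_PREFIXES)


# Hypothetical paths, purely for illustration.
for path in ["/cgi-bin/stats.cgi", "/blog/wp-login.php", "/blog/2004/11/06/robotstxt/"]:
    print(path, "blocked" if is_blocked(path) else "allowed")
```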
<p>Whether a bad hit on <code>robots.txt</code> from a search engine counts as a <code>404</code> error with AWStats, I’m uncertain.  At any rate, the number of <em>logged</em> <code>404</code>s (speaking of raw Apache statistics) should decrease now that there’s actually a file there.</p>
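<p>To check the raw Apache numbers directly, something like the following Python sketch counts <code>404</code> responses for <code>/robots.txt</code>.  It assumes the common or combined log format; the sample lines, addresses and robot names are invented for illustration, not taken from this server’s logs.</p>

```python
import re

# Matches the request and status fields of a common/combined format line,
# e.g. "GET /robots.txt HTTP/1.0" 404
REQUEST_AND_STATUS = re.compile(r'"[A-Z]+ (\S+) [^"]*" (\d{3}) ')


def count_404s(log_lines, path="/robots.txt"):
    """Count log lines where a request for `path` was answered with 404."""
    count = 0
    for line in log_lines:
        match = REQUEST_AND_STATUS.search(line)
        if match and match.group(1) == path and match.group(2) == "404":
            count += 1
    return count


# Invented sample lines (assumption: combined log format).
SAMPLE_LOG = [
    '192.0.2.1 - - [06/Nov/2004:05:10:01 +0000] "GET /robots.txt HTTP/1.0" 404 209 "-" "ExampleBot/1.0"',
    '192.0.2.2 - - [06/Nov/2004:05:12:44 +0000] "GET /blog/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '192.0.2.3 - - [06/Nov/2004:05:15:30 +0000] "GET /robots.txt HTTP/1.1" 404 209 "-" "OtherBot/2.0"',
    '192.0.2.1 - - [06/Nov/2004:06:45:09 +0000] "GET /robots.txt HTTP/1.0" 200 143 "-" "ExampleBot/1.0"',
]

print(count_404s(SAMPLE_LOG))
```

<p>Running the same function over the real access log before and after adding the file would show whether the logged <code>404</code>s actually drop.</p>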
<p>More information can be had at <a href="http://www.robotstxt.org/wc/active.html">http://www.robotstxt.org/wc/active.html</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://josh.st/2004/11/06/robotstxt/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

