escaping lock-in

about a decade ago, i set up a yahoo email account as a spam catcher, and have used it for all sorts of registrations over the years. between the pathetic spam filtering and the annoying user interface, the time had finally come to ditch that account. but how to get the emails i still cared about out without paying ransom? freepops to the rescue.
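for the curious, here is roughly what the export looks like once freepops is running. this is a minimal python sketch, not the exact steps i used: it assumes freepops is listening as a local pop3 server on localhost port 2000 and that you log in with your full yahoo address. the function names `export_mailbox` and `connect` are made up for illustration.

```python
import os
import poplib


def export_mailbox(conn, out_dir):
    """Save every message on an open POP3 connection as a numbered .eml file."""
    os.makedirs(out_dir, exist_ok=True)
    count, _size = conn.stat()  # (number of messages, mailbox size in bytes)
    paths = []
    for number in range(1, count + 1):  # POP3 message numbers start at 1
        _resp, lines, _octets = conn.retr(number)  # lines is a list of bytes
        path = os.path.join(out_dir, f"{number:05d}.eml")
        with open(path, "wb") as fh:
            fh.write(b"\r\n".join(lines) + b"\r\n")
        paths.append(path)
    return paths


def connect(user, password, host="localhost", port=2000):
    """Open a POP3 session. Host and port are assumptions about the local
    freepops setup; adjust them to match yours."""
    conn = poplib.POP3(host, port)
    conn.user(user)
    conn.pass_(password)
    return conn
```

something like `export_mailbox(connect("me@yahoo.com", "secret"), "yahoo-backup")` would then leave one file per message in the `yahoo-backup` directory, ready to import elsewhere.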

i find it remarkable that yahoo is so insecure about their product that they feel the need to lock in users. in contrast, the superior gmail offers free pop access, so there is no lock-in. it is about time the industry took yahoo to task over this nonsense.

spam filtering as a proxy for search market share

why is it that the most basic spams and 419 scams make it past yahoo’s spam filters into my yahoo inbox? i was willing to give them a fair shot with their new ui, but their spam filtering is beyond bad, and makes their new mail beta just as unusable as the old one. almost makes me wonder if they have commercial reasons for letting a lot of spam through to their userbase. not having a capable contextual advertising platform must put them in some tight spots when revenue maximization time inevitably rolls around, and spam thresholds are early victims, i suppose. even more so at MSN, whose hotmail is even worse. by comparison, spam makes it to my gmail inbox only rarely, maybe once a month.

the less search market share your email provider has, the more spam you can expect in your inbox.

update: the new york times reports that gmail spam filtering is getting even better. meanwhile, the obvious spam in my yahoo inbox continues.

looks like yahoo can’t even afford an SSL certificate for their mail domain. oy. plus they insist on showing you a spammy ‘start page’ instead of your inbox. someone getting desperate in the monetization department?

yahoo mail sucks

two approaches to rss

i don’t normally comment on RSS, but i have recently had occasion to deal with two sets of RSS extensions. both extend RSS into the geographical realm, but are lacking test cases so far. so i wrote to the two originators of those extensions, and got wildly different responses. a clueless “shucks” from yahoo, collaboration from the georss community. luckily, the yahoo extension is not causing a lot of damage since they are limiting their own success with their ineptitude.

bot classes

ok, after a couple of days of robots.txt love, i now have much less crap in my logs. a good opportunity to see which bots are well-written. based on what i am seeing with /robots.txt, i am sure glad i blocked most of these festering piles of dung from my site.

not using conditional get while requesting /robots.txt

only kinjabot, OnetSzukaj/5.0 and Seekbot/1.0 get this right. all other bots, including google and yahoo, do not. lame.
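for bot authors wondering what getting it right means: remember the ETag and Last-Modified values from your last fetch, send them back as If-None-Match and If-Modified-Since, and an unchanged file costs the server a bodyless 304 instead of the full file. a minimal python sketch follows; the names `conditional_headers` and `fetch_robots` and the layout of the cache dict are my own, for illustration only.

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build the validator headers for a conditional GET."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_robots(url, cached=None):
    """Fetch robots.txt, reusing the cached copy if the server answers 304.

    `cached` is the dict returned by a previous call, or None on first fetch.
    """
    cached = cached or {}
    request = urllib.request.Request(
        url,
        headers=conditional_headers(cached.get("etag"), cached.get("last_modified")),
    )
    try:
        with urllib.request.urlopen(request) as response:
            return {
                "body": response.read(),
                "etag": response.headers.get("ETag"),
                "last_modified": response.headers.get("Last-Modified"),
            }
    except urllib.error.HTTPError as err:
        if err.code == 304:  # not modified: urllib reports this as an error
            return cached
        raise
```

pass the returned dict back in as `cached` on the next visit, and a well-behaved server does the rest.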

requesting /robots.txt too often

the biggest offender is VoilaBot, checking /robots.txt every 5 minutes, every day. you gotta be kidding me. google and yahoo are not much better. you’d think they’d have figured out by now how to share the state of /robots.txt across their different crawlers. other bots fare better by virtue of being less desperate.
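sharing that state is not rocket science. one way to do it is a cache keyed by host that every crawler consults before touching /robots.txt, refetching only once an entry is older than some time-to-live. the sketch below is python and purely illustrative: the class name `RobotsCache`, the injected `fetch` function and the 24-hour default are assumptions of mine, not how any of these bots actually work.

```python
import time


class RobotsCache:
    """Per-host robots.txt cache shared by all crawlers in one process."""

    def __init__(self, fetch, ttl_seconds=24 * 60 * 60, clock=time.time):
        self._fetch = fetch      # callable: host -> robots.txt body
        self._ttl = ttl_seconds  # assumed default: refetch at most once a day
        self._clock = clock      # injectable so the cache can be tested
        self._entries = {}       # host -> (fetched_at, body)

    def get(self, host):
        """Return the cached body, fetching only if missing or expired."""
        now = self._clock()
        entry = self._entries.get(host)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        body = self._fetch(host)
        self._entries[host] = (now, body)
        return body
```

with a daily time-to-live, a site sees one request per day for /robots.txt instead of the 288 that a 5-minute polling interval produces.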

update: problems like this are economic opportunities.