about a decade ago, i set up a yahoo email account as a spam catcher, and have used it for all sorts of registrations over the years. the time had come to finally ditch that account, between the pathetic spam filtering and the annoying user interface, but how to get all the emails i still cared about out without having to pay ransom? freepops to the rescue.
i find it remarkable that yahoo is so insecure about their product that they feel the need to lock in users. in contrast, the superior gmail offers free pop access, so there is no lock-in. it is about time the industry takes yahoo to task over this nonsense.
why is it that the most basic spams and 419 scams make it past yahoo’s spam filters into my yahoo inbox? i was willing to give them a fair shot with their new ui, but their spam filtering is beyond bad, and makes their new mail beta just as unusable as the old one. almost makes me wonder if they have commercial reasons for letting a lot of spam through to their userbase. not having a capable contextual advertising platform must put them in some tight spots when revenue maximization time inevitably rolls around, and spam thresholds are early victims, i suppose. even more so at MSN, whose hotmail is even worse. in contrast, it is a rare event, maybe once a month, that spam makes it to my gmail inbox.
the less search marketshare your email provider has, the more spam you can expect in your inbox.
update: the new york times reports that gmail spam filtering is getting even better. meanwhile, the obvious spam in my yahoo inbox continues.
looks like yahoo can’t even afford an SSL certificate for their mail domain.. oy. plus they insist on showing you a spammy ‘start page’ instead of your inbox. someone getting desperate in the monetization department?
what moron would expire news article URLs when disk space is plentiful and cheap? apparently yahoo doesn’t want me to use del.icio.us. the sad thing, just as with their RSS mess, is that they have people working for them who know better. that place needs its own mini. too many clueless middle managers.
i don’t normally comment on RSS, but i have recently had occasion to deal with two sets of RSS extensions. both extend RSS into the geographical realm, but are lacking test cases so far. so i wrote to the two originators of those extensions, and got wildly different responses. a clueless “shucks” from yahoo, collaboration from the georss community. luckily, the yahoo extension is not causing a lot of damage since they are limiting their own success with their ineptitude.
will the short and stupid end of the tail destroy the wonderful ontological ecology at del.icio.us? we will shortly know how much conceptual overlap there is between joe sixpack and the digerati. /. was once great, too..
ok, after a couple days of robots.txt love, i now have much less crap in my logs. a good opportunity to see which bots are well-written. based on what i am seeing with /robots.txt, i am sure glad i blocked most of these festering piles of dung from my site.
not using conditional get while requesting /robots.txt
only kinjabot, OnetSzukaj/5.0 and Seekbot/1.0 get this right. all other bots, including google and yahoo, do not. lame.
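for reference, a minimal sketch of what a conditional get for /robots.txt could look like from the crawler side, in python. the function names and the shape of the `cached` dict are made up for illustration, not taken from any real bot. the idea: send back the validators (`ETag`, `Last-Modified`) from the previous fetch, and treat a `304 Not Modified` as "keep using the cached copy".

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build the validator headers for a conditional GET (hypothetical helper)."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_robots(url, cached=None, opener=urllib.request.urlopen):
    """Fetch robots.txt, reusing the cached copy when the server answers 304.

    `cached` is an assumed dict with keys "body", "etag" and "last_modified"
    from the previous fetch, or None on the first request.
    """
    cached = cached or {}
    request = urllib.request.Request(
        url,
        headers=conditional_headers(cached.get("etag"), cached.get("last_modified")),
    )
    try:
        with opener(request) as response:
            # 200: the file changed (or this is the first fetch), store new validators.
            return {
                "body": response.read(),
                "etag": response.headers.get("ETag"),
                "last_modified": response.headers.get("Last-Modified"),
                "changed": True,
            }
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # urllib reports 304 as an HTTPError; nothing was transferred.
            return dict(cached, changed=False)
        raise
```

a bot would keep the returned dict around and pass it back in as `cached` on the next visit, so an unchanged /robots.txt costs a few headers instead of a full download.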
requesting /robots.txt too often
the biggest offender is VoilaBot, checking /robots.txt every 5 minutes, every day. you gotta be kidding me. google and yahoo are not much better. you’d think they’d have figured out a way by now to communicate the state of /robots.txt across different crawlers. other bots fare better by virtue of being less desperate.
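one way the "communicate the state across crawlers" part could work is a small shared cache keyed by host, sketched below in python. the class name, the injected `fetch` callable and the 24 hour `max_age` are all assumptions for illustration. no crawler is known to work exactly this way.

```python
import time


class RobotsCache:
    """Share one robots.txt copy per host across crawler workers (hypothetical)."""

    def __init__(self, fetch, max_age=24 * 60 * 60, clock=time.time):
        self._fetch = fetch      # callable: host -> robots.txt body
        self._max_age = max_age  # seconds before a cached copy counts as stale (assumed value)
        self._clock = clock      # injectable so the expiry logic can be tested
        self._entries = {}       # host -> (fetched_at, body)

    def get(self, host):
        now = self._clock()
        entry = self._entries.get(host)
        if entry is not None and now - entry[0] < self._max_age:
            # Still fresh: no request hits the site at all.
            return entry[1]
        body = self._fetch(host)
        self._entries[host] = (now, body)
        return body
```

with something like this in front of the fetcher, a bot polling every 5 minutes would hit the site once a day instead of 288 times.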
update: problems like this are economic opportunities.
i went ahead and blocked most crawlers in my robots.txt. there are too many of them, and for most, my ROI is negative anyway. if you had any doubts about how far search still has to go, or how many moronic copycat companies there are in this space, spend some time with your log files.
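a sketch of what a "block most crawlers" policy can look like, plus a quick check with python's standard `urllib.robotparser`. the rules below are not my actual file: the single allowed bot (Googlebot) is a hypothetical pick for illustration, and `allowed` is a made-up helper name.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules only: one allow-listed bot, everyone else shut out.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""


def allowed(agent, path, robots_txt=ROBOTS_TXT):
    """Return True if `agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for bot in ("Googlebot", "VoilaBot", "Seekbot/1.0"):
        print(bot, allowed(bot, "/index.html"))
```

an empty `Disallow:` means "nothing is off limits" for that bot, while the `*` record catches every agent not named above it. of course this only keeps out bots that bother to read and obey the file.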