What a Geek Like Me Wants from a Mobile Phone


I was setting up my new Zendbox-like server to host all of my toys when I found some interesting feed items in my reading list, all talking about the same thing: mobile phones. That brought me to Ian Hay's top-ten list of what people want from their mobile phones.

I'm not really good at making lists, so if someone asks me what I want from my mobile phone, I'll give them one single answer: control.

For that matter, I want my mobile phone to support open source software and have an open hardware architecture. That's all I need; I'll take care of the rest myself, thanks.

:)

Against the System: Rise of the Robots

…big difference between the web and traditional well controlled collections is that there is virtually no control over what people can put on the web. Couple this flexibility to publish anything with the enormous influence of search engines to route traffic and companies which deliberately manipulating search engines for profit become a serious problem.

That is a quote from Sergey Brin and Lawrence Page's paper about the prototype of the Google search engine, which at the time lived at http://google.stanford.edu/.

But I don't think either Brin or Page expected that their invention could create another problem, one that underlines what they meant by "no control over what people can put on the web".

Yesterday's post on the SecuriTeam blog shows that people can use Googlebot to attack other websites anonymously.

The idea is quite simple: all you have to do is create a malicious website that contains links attacking a vulnerable web application (CSRF), like this:

http://the-target.com/csrf-vulnerable?url=http://maliciousweb.com/attackcode

and submit it to Google. When Googlebot comes to your website and finds this link, it will dutifully try to index the URL. And when it does... bang! The robot does the job for you, attacking your target.
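
The "malicious website" needs nothing fancier than an ordinary page whose links point at the vulnerable URL; something along these lines (reusing the made-up URL from above):

<html>
  <body>
    <!-- Googlebot will follow this link and fetch it with a plain GET -->
    <a href="http://the-target.com/csrf-vulnerable?url=http://maliciousweb.com/attackcode">an innocent-looking link</a>
  </body>
</html>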

This is not a new idea, though. Michal Zalewski wrote about it in 2001 under the title "Against the System: Rise of the Robots". His introduction sums up the whole idea:

Consider a remote exploit that is able to compromise a remote system without sending any attack code to his victim. Consider an exploit which simply creates local file to compromise thousands of computers, and which does not involve any local resources in the attack. Welcome to the world of zero-effort exploit techniques. Welcome to the world of automation, welcome to the world of anonymous, dramatically difficult to stop attacks resulting from increasing Internet complexity.

However, this kind of attack is not only Googlebot's problem; other search engine bots, such as MSN's, Yahoo's and dozens of others, have the same ability to do the dirty job for you.

So who's to blame? Surely the bad guy who runs the malicious website. But you can also put some of the blame on the owners of the victim websites, who ignore security and leave all their pages open to any bot in exchange for higher PageRank.
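
The fix on the victim's side is old advice: never let a plain GET change state, and tie state-changing requests to a per-session token the bot cannot know. A minimal sketch in PHP, assuming a hypothetical do_the_dangerous_thing() handler for whatever the vulnerable page actually does:

<?php
session_start();

// Issue the token once per session when rendering the form, e.g.:
// $_SESSION['csrf_token'] = md5(uniqid(mt_rand(), true));

// Only accept state-changing requests via POST; Googlebot only sends GETs.
if ($_SERVER['REQUEST_METHOD'] !== 'POST') {
    header('HTTP/1.0 405 Method Not Allowed');
    exit('This action requires POST.');
}

// ...and only when the request carries the token we issued to this session.
if (!isset($_POST['csrf_token'], $_SESSION['csrf_token'])
    || $_POST['csrf_token'] !== $_SESSION['csrf_token']) {
    header('HTTP/1.0 403 Forbidden');
    exit('Invalid CSRF token.');
}

// Safe to act here: a crawler never knows the per-session token.
do_the_dangerous_thing($_POST['url']); // hypothetical action handler
?>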

My Delicious Linkblog

Gee, even Andrei has a linkblog now. I figured I had to make one too. So I took a little PEAR::Services_Delicious, added some ugly code of my own, and there you go:

<?php

// PEAR::Services_Delicious wraps the del.icio.us API
require_once 'Services/Delicious.php';

$d = new Services_Delicious('eristemena', 'guesswhat');
$r = $d->getRecentPosts();

// Group the recent posts by date
$lb = array();
$k  = 0;
foreach ($r as $l) {
    $tidx = date('d F Y', strtotime($l['time']));
    $lb[$tidx][$k]['href']        = $l['href'];
    $lb[$tidx][$k]['description'] = $l['description'];
    $lb[$tidx][$k]['extended']    = $l['extended'];
    $k++;
}

// Print each day's links as a blockquote
foreach ($lb as $day => $v) {
    echo '<strong>'.date('F jS, Y', strtotime($day)).'</strong><br /><hr size="1">';
    echo '<blockquote>';
    foreach ($v as $l) {
        echo '<p><a href="'.$l['href'].'">'.$l['description'].'</a><br />';
        echo $l['extended'].'</p>';
    }
    echo '</blockquote>';
}

?>

OK, I cheated by using Delicious here, so what? I have my own linkblog now, a delicious linkblog, just like everybody else.

What? You want the feed too? Alright, grab it here.

AJAXish Domain Search

I'm a long-time user of domaintools.com; they do a great job with domain search and suggestion. They even share the XML API for their domain spinner, which I've tried myself here.

But for those of you who are used to instant, Ajax-enabled results for everything, you have to try this: PCNames Domain Search. It does a virtually instantaneous domain search using Ajax, just like Google Suggest.

When a domain shows up as unavailable, a WHOIS button appears next to it so you can get more information about the site owner; if a domain is available for registration, the search engine lists companies you can register it with. I think this is where they make their money: not by direct reselling but from donations and commissions from affiliates.

The most impressive thing about this search engine is its speed; it really is fast and will save you a lot of time. There is also a variety of other tools on the site, equally useful for web authors and searchers alike.

One of my favourites is Word Search, which shows you which domains are available, deleted or expired for any word you enter (try: Google).

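I don't know what PCNames' actual API looks like under the hood, but the server side of this kind of Ajax domain check can be surprisingly small. Here is a hypothetical sketch in PHP; the endpoint name, the parameter and the DNS-based availability test are my own assumptions, not PCNames' real API:

<?php
// check.php?domain=example.com -- hypothetical endpoint returning a tiny
// JSON answer that the Ajax front end can render as the user types.
$domain = isset($_GET['domain']) ? strtolower(trim($_GET['domain'])) : '';

header('Content-Type: application/json');

if (!preg_match('/^[a-z0-9-]+\.[a-z]{2,6}$/', $domain)) {
    echo '{"error":"invalid domain"}';
    exit;
}

// Crude availability test: if the name has NS or A records, assume it is
// taken. A real service would query WHOIS for an authoritative answer.
$taken = checkdnsrr($domain, 'NS') || checkdnsrr($domain, 'A');

echo '{"domain":"'.$domain.'","available":'.($taken ? 'false' : 'true').'}';
?>

The front end then simply fires this request as you type and updates the result list in place, Google Suggest-style.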

Microsoft, Google, Yahoo! Unite to Support Sitemaps

Finally, Microsoft, Google and Yahoo! announced today that they will all begin using the same Sitemaps protocol to index sites around the web. Now based at Sitemaps.org, the system instructs webmasters on how to install an XML file on their servers that all three engines can use to track updates to their pages.

What and why?

Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.

Web crawlers usually discover pages from links within the site and from other sites. Sitemaps supplement this data to allow crawlers that support Sitemaps to pick up all URLs in the Sitemap and learn about those URLs using the associated metadata. Using the Sitemap protocol does not guarantee that web pages are included in search engines, but provides hints for web crawlers to do a better job of crawling your site.
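
For the record, a minimal Sitemap following the sitemaps.org schema looks something like this (the URL, date and values below are just placeholders):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2006-11-16</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>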

The protocol is offered under an Attribution-ShareAlike Creative Commons License, so it can be used by any search engine, derivative variations can be created under the same license, and it can be used for commercial purposes.

People who use Google Sitemaps don't need to change anything, because it already uses the protocol described at sitemaps.org; the only difference is that those maps will now also be indexed by Yahoo and Microsoft.

If you're using WordPress, you might like to use the sitemap generator plugin here. I've been using it for a couple of months, and Google has indexed these pages beautifully.

Gdata and The Future of Database

I've been playing around with GData for the last couple of months, and I must agree with Jeremy when he said that GData is the realization of the future Adam Bosworth talked about.

Adam gave us a different view of how to deal with huge amounts of data. Until now, we couldn't talk about data without talking about conventional database management systems with complex architectures. Apparently, it's not always about finding a simple solution to a complex problem; sometimes all you have to do is simplify the problem.

The web was something that sounded almost impossible to build. But Tim Berners-Lee simplified it, and with HTML (and HTTP) almost anyone can now build for the web. Why not do the same with managing data? That's where the Atom Publishing Protocol (APP) comes in: a simple way to manage data on the web. As you might know, GData is an extension of this publishing protocol.
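
To give a feel for how simple that is: in APP, and therefore in GData, you manage data with plain HTTP verbs on Atom feeds and entries. Here is a rough sketch in PHP with curl against a made-up collection URL; the endpoint is a placeholder for illustration, not a real GData service:

<?php
// A new "record" is just an Atom entry...
$entry = '<?xml version="1.0"?>
<entry xmlns="http://www.w3.org/2005/Atom">
  <title>My first record</title>
  <content type="text">Managing data with nothing but HTTP and Atom.</content>
</entry>';

// ...and creating it is just an HTTP POST to the collection's URL.
// (http://example.com/collection is a placeholder endpoint.)
$ch = curl_init('http://example.com/collection');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $entry);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/atom+xml'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$response = curl_exec($ch);   // the server replies with the stored entry
curl_close($ch);

// Reading is a GET on the feed, updating is a PUT on the entry's edit URL,
// deleting is a DELETE -- the whole "database API" is just HTTP.
?>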

Adam should know this very well, because he works for a company that has managed to deal with huge amounts of data: he is VP of Engineering at Google, and was one of the top engineers at Microsoft.

Just listen to his talk and you'll hear plenty of GData ideas in it.

Play now: http://assets.gigavox.com/flash/emff_comments.swf?src=/audio/stream/itconversations-571.mp3
Download MP3: http://assets.gigavox.com/flash/emff_comments.swf?src=/audio/download/itconversations-571.mp3

So it was only a matter of time until more applications adopted GData or GData-like protocols. Today it was announced that Windows Mobile also uses GData.

Web 3.0

R/WW has a great wrap-up of last week's Web 2.0 Summit. It seems to me that Web 2.0 has now become more attractive to the business world than to the geekosphere that gave birth to it.

… this year’s conference was a lot different from last year’s. It was still a great conference, but in a different way – perhaps reflected in the name change to Summit (a more business-sounding title). Last year there were a lot more developers and designers running around, this year the crowd was overwhelmingly from the media and business worlds. No doubt because of this, I also felt this year’s conference lacked in cutting edge new products – and I didn’t learn many new insights about Web technology.

So what were those geeks working on while they skipped the summit? This article from the NYT might give you a clue. Apparently, they're too busy playing around with their new toy: Web 3.0.

I don't know who gave it that name, but obviously it was the same people who coined the phrase "Web 2.0". And like that earlier phrase, which is meaningless from a developer's point of view, "Web 3.0" means nothing beyond what we've known for years as the "semantic web".

It is a noble attempt to change the way computers see the web: not just as a bunch of data, but as information that has meaning to the machine, so that it can decide things for us.

In other words, with the semantic web, a computer would be able to give a reasonable and complete response to a question like: "I'm looking for a warm place to vacation and I have a budget of $3,000. Oh, and I have an 11-year-old child."
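
Concretely, the semantic web approach is to publish that kind of fact as machine-readable statements rather than prose. A tiny sketch in RDF (Turtle syntax), using a vocabulary I just made up for illustration:

@prefix travel: <http://example.org/travel#> .   # made-up vocabulary

<http://example.org/packages/family-beach-week>
    a travel:VacationPackage ;
    travel:climate         "tropical" ;
    travel:totalPrice      2400 ;          # USD
    travel:suitableForAges "5-15" .

With data published like this, answering the vacation question becomes a query over facts instead of a guess over text.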

From where we stand now, that is almost impossible, because it depends so heavily on the data already on the web. The old wisdom about this kind of data is that you can't trust it: too many people lie when they put data on the web, and too many people publish invalid data that is hard for a computer to understand.

However, as long as it remains "impossible", there's always a chance someone will make it happen. As we all know, geeks like impossible things. They did it with Web 2.0.