This blog has moved elsewhere!

To make sharing content easier, I’ve decided to move my personal work blog to Tumblr. Thus, The Rice Cooker has now become The Electric Rice Cooker.


Google+ API crawler in Python and a few remarks to start with

We’ve started working on tools to crawl the newly released Google+ API. I got an e-mail notifying us of the availability of the API on September 16th. I think we’re the first ones to write third-party tools to download and cache some of the data.

I’ll post the database schema later, when it’s more stable.

For now, the API is read-only, and we’re limited to 1,000 requests per day. Since it is a first release, I was keen on collecting data right away, in case the terms changed.
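To stay within a daily request limit while caching what has already been downloaded, a crawler can wrap its fetch function in a small budget-aware cache. The following is only a minimal sketch, not our actual crawler: the class name `QuotaLimitedCache`, the injected `fetch` callable and the in-memory dictionary are all assumptions made for illustration.

```python
import datetime
from typing import Callable, Dict, Optional


class QuotaLimitedCache:
    """Cache response bodies and refuse to exceed a daily request budget."""

    def __init__(self,
                 fetch: Callable[[str], str],
                 daily_limit: int = 1000,
                 today: Callable[[], datetime.date] = datetime.date.today):
        self._fetch = fetch            # hypothetical function doing the real API call
        self._daily_limit = daily_limit
        self._today = today            # injectable clock, handy for testing
        self._cache: Dict[str, str] = {}
        self._day = today()
        self._used = 0

    def _roll_over(self) -> None:
        """Reset the counter when the calendar day changes."""
        current = self._today()
        if current != self._day:
            self._day = current
            self._used = 0

    def get(self, url: str) -> Optional[str]:
        """Return the cached body, fetch it if budget remains, else None."""
        if url in self._cache:
            return self._cache[url]    # cache hits cost no quota
        self._roll_over()
        if self._used >= self._daily_limit:
            return None                # budget exhausted for today
        self._used += 1                # count the attempt even if fetch raises
        body = self._fetch(url)
        self._cache[url] = body
        return body

    @property
    def remaining(self) -> int:
        """Number of requests still allowed today."""
        self._roll_over()
        return self._daily_limit - self._used
```

A real crawler would persist the cache to a database rather than keep it in memory, and would plug an HTTP client in as `fetch`.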

The API is interestingly minimalistic: People, Activities and Comments are the three data types you can search, list and get. There are many other types of data, but they are attached to these three. For instance, a person (a “People” resource) can have several organisations, urls, placesLived and emails, although I don’t think the last of these is available in the current version of the API.

As far as People are concerned, you may also get a hasApp (for the mobile app, we guess), languagesSpoken (an array of strings) and even an intriguing currentLocation (Latitude/Maps integration, anyone?). It’s interesting, but from a user’s point of view it’s also scary how much publicly accessible information there is.
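As a rough illustration of how these attached fields could be flattened into one record before caching, here is a minimal sketch. The field names hasApp, languagesSpoken, currentLocation, urls and placesLived come from the description above; the `id` and `displayName` keys, the exact JSON spelling `organizations`, and the `Person` class itself are assumptions, not a confirmed description of the API response.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Person:
    """Flat record for one People resource (illustrative, not the real schema)."""
    id: str
    display_name: str = ""
    has_app: Optional[bool] = None
    languages_spoken: List[str] = field(default_factory=list)
    current_location: Optional[str] = None
    organizations: List[Dict[str, Any]] = field(default_factory=list)
    urls: List[Dict[str, Any]] = field(default_factory=list)
    places_lived: List[Dict[str, Any]] = field(default_factory=list)


def person_from_json(doc: Dict[str, Any]) -> Person:
    """Build a Person from a decoded JSON document, tolerating missing keys."""
    return Person(
        id=doc["id"],                                   # assumed key
        display_name=doc.get("displayName", ""),        # assumed key
        has_app=doc.get("hasApp"),
        languages_spoken=list(doc.get("languagesSpoken", [])),
        current_location=doc.get("currentLocation"),
        organizations=list(doc.get("organizations", [])),  # assumed spelling
        urls=list(doc.get("urls", [])),
        places_lived=list(doc.get("placesLived", [])),
    )
```

Every optional field defaults to empty, so a profile that exposes little public information still parses without errors.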


Google+ in Greater China

This morning, I stumbled upon a website listing the users with the largest following (in circles) on Google+ so far. Mark Zuckerberg is at the top, with some 35,000 followers now, followed by a series of top Google execs such as Larry and Sergey.

The Top 50 contains many American techies and celebrities, but also a sizeable complement of Chinese Internet notables. The highest ranked Chinese celebrity is a blogger and software engineer named William Long (月光博客) at #19. He is trailed by Valen Hsu (許茹芸), a Taiwanese singer, who has been circled by 5,000 people so far, good for #20.

Taiwanese singer Valen Hsu is 20th among the most circled users on Google+

Prominent blogger Hecaitou (和菜头) is currently at #33. Another Chinese blogger, keso.me, is #45. Dahui Feng, a well-known tech commentator, is #49.

In the meantime, reports of Google+’s death in China may have been greatly exaggerated.


Visualising HK Transport Department traffic accident data in Google Fusion Tables

Step One: Download the data from the Transport Department website at http://www.td.gov.hk/en/road_safety/road_traffic_accident_statistics/2008/index.html. Scroll down and you will find a link to Road Traffic Accident Database 2008.

Step Two: Import to Google Fusion Tables. You have to save each sheet of the XLS file as an individual CSV, since Fusion Tables takes only one table at a time, and the row limit is lower for XLS files.

Step Three: Visualise. Here, we see that an overwhelming proportion of casualties on the road in 2008 involved men (coded as 1 in the data), but it might just be because of demographics.

Because the data contains little unique information to plot (such as the date and time of each accident), my suggestion is to aggregate on your column of interest (say, driver sex), plot that column as the entity, and use the count as the value. It could be nice to mix and match two criteria (are young men more frequently involved in accidents?).
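Outside Fusion Tables, the same aggregation can be sketched in a few lines of Python. The column names `driver_sex` and `driver_age` and the four sample rows are made up for illustration (only the coding of men as 1 comes from the data; 2 for women is an assumption), so adapt them to the actual CSV headers.

```python
import csv
import io
from collections import Counter
from typing import Dict, Iterable, Sequence


def count_by(rows: Iterable[Dict[str, str]], columns: Sequence[str]) -> Counter:
    """Count rows for each distinct combination of values in `columns`."""
    return Counter(tuple(row[c] for c in columns) for row in rows)


# Made-up sample standing in for one of the exported CSV files.
SAMPLE = """driver_sex,driver_age
1,23
1,41
2,23
1,23
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# One criterion: the entity is the sex code, the value is the row count.
by_sex = count_by(rows, ["driver_sex"])

# Two criteria mixed together: (sex, age) pairs.
by_sex_and_age = count_by(rows, ["driver_sex", "driver_age"])
```

The keys of each `Counter` are tuples, so the one-column and two-column cases are handled by the same function.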

If you want to play with the data yourself, here are the links to the tables, as imported in Google Fusion Tables:

1. Road Traffic Accident Stats in 2008: http://tables.googlelabs.com/DataSource?dsrcid=224727

2. Vehicles involved in Road Traffic Accidents in 2008: http://tables.googlelabs.com/DataSource?dsrcid=225310

3. Casualties in Road Traffic Accidents in 2008: http://tables.googlelabs.com/DataSource?dsrcid=225311

Here is how casualties compare by age, depending on whether the casualty was male or female (note that the scales differ, the one for women being much lower).


Male driver casualties in 2008 (plotted by age on the x-axis)


Female driver casualties in 2008 (plotted by age on the x-axis)


Overall driver casualties in 2008 (plotted by age on the x-axis)

The current problem with Google Fusion Tables (which is still a Labs product) is that it won’t allow you to compare more than two criteria at the same time in a practical format. For instance, I can’t superimpose graphs for deaths per sex and per age on one single view. It sounds like a pretty basic feature, so I wouldn’t be terribly surprised if it appeared in a couple of months, if not weeks.

The quality of the data is also questionable, since maybe 60-70 people listed as “drivers” are aged 16 or younger… Did they mean they were in the driver’s seat or actually driving when the accident occurred?!
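A quick way to pull out those suspicious records for inspection is to filter the casualty rows by role and age. This is only a sketch: the column names `role` and `age` and the value `"driver"` are hypothetical stand-ins for whatever codes the Transport Department file actually uses.

```python
from typing import Dict, Iterable, List


def underage_drivers(rows: Iterable[Dict[str, str]],
                     max_age: int = 16,
                     age_column: str = "age",
                     role_column: str = "role",
                     driver_value: str = "driver") -> List[Dict[str, str]]:
    """Return the rows recorded as drivers whose age is `max_age` or less."""
    flagged = []
    for row in rows:
        if row.get(role_column) != driver_value:
            continue
        try:
            age = int(row.get(age_column, ""))
        except (TypeError, ValueError):
            continue  # blank or non-numeric age: cannot be judged either way
        if age <= max_age:
            flagged.append(row)
    return flagged
```

Rows with a missing age are skipped rather than flagged, so the count is a lower bound on the questionable entries.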

***

On another note, I also imported the news agencies database from China’s General Administration of Press and Publication (GAPP), the state agency regulating news and print publication in the PRC. This data was retrieved around March 2010 from www.gapp.gov.cn using custom scripts that systematically read the GAPP’s webpages. After parsing it into a database-friendly format, I used it to build the China Media Map, which might soon start to include our annotations.

But frankly, there isn’t much to visualise with this data, aside from location, since it has no contextual data attached to it (it’s just an address/phone book, basically). If you can think of something to do with it, drop me a line.


China Media Map on Google Fusion Tables

http://tables.googlelabs.com/DataSource?snapid=68215

I just discovered Google Fusion Tables. Ten minutes later, I imported the China Media Map and it produced this map. Info windows can even be customized by the user!


Starting on GWT

One of my discoveries at Google I/O was the Google Web Toolkit (GWT). I’m starting to learn how to use it to build a tool to curate our various data on Chinese media, and potentially for other projects.

I used to use the Yahoo! User Interface library (YUI), which is not bad at all, but follows a totally different model of Web user interface development.


My impressions of Google I/O (video in French)

When I was at Google I/O 2010 last month, I found this cool YouTube upload booth, and decided to talk to it. I made a video explaining my thoughts up to Day 2 (in the morning), right after they had made the announcement for Google TV.

Although not quite what we might have expected, Google TV is to me another impressive demonstration that Google seeks to bring the power of the Internet to more places and hopefully make the Internet experience more and more seamless (more voice and gestures, and less keyboard and mouse).


Google I/O 2010 (Day II): Google TV, the HuffPost, NYTimes

Thursday, May 20:

Google TV

The second and last day of Google I/O was expected to be the big day for announcements. Google did not disappoint: during the keynote speech, it unveiled the latest version of Android, named Froyo, and the much-anticipated Google TV.

Google calls Google TV an “experience”. In fact, it is a system that will be built into television sets, Blu-ray players and boxes that plug into HDTVs. It is not IPTV (television over the Internet) or an alternative to satellite dishes.

It would be simplistic to say that it is what Google does best, search, only made for television. It is simply Google entering your living room by the front door.

One of the most interesting features of Google TV is that one of its fundamental building blocks is the Android operating system (OS).

Known as the operating system for mobile devices (and, more recently, tablets and netbooks), the OS with the green robot is expanding to the big screen. In effect, developers writing applications for mobile phones can now consider writing them for the no-longer-so-small small screen.

Android 2.2 Froyo

To users, some of the improvements in Android Froyo are obvious, such as support for Adobe Flash and official Wi-Fi tethering (sharing your phone’s Internet connection as a personal Wi-Fi hotspot).

To developers, Froyo also opens a world of possibilities. The Cloud to Device Messaging Framework (a “push” technology) is intriguing. It would allow, for example, a user signed into his or her Google account on a computer paired with an Android phone to click on a link to, say a restaurant, and the mobile handset would immediately call the number. Now, what if a user could interact with the “cloud” using a remote control?

The following video contains a demonstration of the Cloud to Device functionality with real-life applications:

Digital journalism at Google I/O

Huffington Post at Google I/O

The Huffington Post

Noteworthy booths at the Developer Sandbox, a demo area occupying the first-floor hall of Moscone West, where Google I/O was held, included The New York Times and the Huffington Post.

The HuffPost’s chief technical officer, Paul Berry, said the Post does not have project managers. Editors are thoroughly engaged in the production process. Ideas circulate fast between editors, Berry and developers, and can become a published media product within days.

The New York Times

Over at The New York Times, software developer Andre Behrens was behind the wheel.

An interesting discussion ensued about the organizational structure of digital media companies, specifically the fine balance between having flexible resources and specialists in different roles (project/product managers, developers, designers, integrators).

Behrens is the man behind Times Skimmer, a beautiful and elegant website built with the new HTML5 standard. It makes use of Web workers (a browser feature for running scripts in parallel, which makes the page faster overall) and CSS transitions (for rendering animations, for instance).

The New York Times is well-known in the media developer community for Open and Times Developer Network.

A Google I/O session

Google I/O 2010 Developer Sandbox

Google I/O Facts
– Held in late May since 2008
– Two days of Google I/O, one bootcamp day
– 5,000 developers
– 90 sessions

Originally posted on the main JMSC website


Google I/O 2010 (Pre-Event and Day 1): The Future of the Web

SAN FRANCISCO, CA — I arrived in San Francisco last Friday night, a few days ahead of the Google I/O 2010 conference. After watching inspiring sessions from I/O on YouTube last year, I decided that I would go through the trouble to attend the conference in person.


JMSC developer Cedric Sam at Facebook HQ

On the weekend and Monday before the conference, my hosts (current and former workers of the IT industry, of course) and I toured landmarks of the Bay Area tech industry. During our trek, I visited the offices of Cisco, PayPal and Yahoo! (all had cubicle farms and free espresso in common), and also the Intel museum at Intel Headquarters in Santa Clara.

Silicon Valley is perhaps a little underwhelming, aesthetically speaking. At the heart of it is the City of Sunnyvale, an expanse of bungalows and strip malls, with few office buildings taller than five stories (as it is a region highly susceptible to earthquakes). The valley does have the most impressive lineup of IT companies I’ve ever seen (Sunnyvale might be best known as Yahoo!’s home), living up to its reputation as the world’s technology capital.

HTML5: The future of the Web

Tuesday was the Google I/O bootcamp, a series of sessions for newcomers, with mainly tutorials and hands-on talks on topics to be covered in depth during Google I/O.

Google’s Sundar Pichai fronting the Google I/O keynote

The conference officially started yesterday (Wednesday). The keynote address on the first day of Google I/O put a clear emphasis on HTML5, the upcoming revision of the HTML standard. More than half of the two-hour introductory presentation was reserved for HTML5, with Google’s spokespeople and guest partners describing it as “the future of the Web” (a quote from Microsoft’s IEBlog).

In practice, for end users, a fully HTML5-compliant browser would mean that features we often take for granted, such as video, are supported natively by the browser rather than through a plugin such as Adobe Flash, as is widely the case today. It will allow webpages to be richer in ways that we may never have imagined before. HTML5 applications such as MugTug (an all-Web equivalent to Photoshop) and Clicker (an online video store and player) were showcased during the keynote.

Google I/O also has its importance for media watchers. Terry McDonell, the editor of Sports Illustrated Group, spoke during the keynote event to introduce the Sports Illustrated app (made by Wonder Factory) for Google’s brand new Chrome Web Store. The app is literally an HTML-only version of a Sports Illustrated magazine, laid out with typography, cool transitions and even video, all of it rendered by the browser (without the need for Flash), perhaps a preview of what full HTML5 sites could look like in one or two years. Throughout his talk, McDonell emphasised the importance of open standards, as well as the need to work with programmers and technical specialists to bring innovation to the field of publishing. He thinks that neatly designed apps like the HTML5 version of Sports Illustrated will bring in revenue from advertisers and customers. The following is a video of his talk:

Google’s annual developer conference has been a vehicle for product launches in its past two editions. In 2009, Google Wave and a new version of Android were launched. This year, technology observers (see article on FT.com) expect Google’s new Smart TV platform to be the big announcement. A new version of Android, 2.2 “Froyo”, with long-awaited support for Flash, is also expected during I/O. A giant bowl of frozen yogurt still under wraps could be seen in front of Google HQ in Mountain View.

Google I/O remains a showcase for Google’s products. The event is essentially composed of one-hour developer sessions, during which Google employees and other specialists from the industry talk about how to make use of development platforms and services. The sessions are technical, but they also let developers exchange ideas among themselves and directly with Googlers.

Ignite Google I/O and online availability

Again at this year’s I/O, Ignite was held as a session. Ignite is a presentation format in which successive speakers talk for five minutes each about a topic of their choosing, generally departing from the usual tech talk, while their slides advance automatically every 15 seconds (20 slides in all).

The presentations generally cover wacky topics (Matt Harding’s The Imaginary Line of Ancient Cosmic Weirdness) and hobbyist obsessions (James Young’s You Sank My Battleship! on 1/144-scale battleship battles), but also life-defining experiences (Bradley Vickers’s How to Row across the North Atlantic, Ration Food and Not Have Your Teammates Eat You). Some are totally “out of left field” demonstrations of mastery of the format, such as Ben Huh’s Evolution of the Meme (he is the founder of the award-winning, time-sucking site I Can Has Cheezburger). [See session webpage]

Knight Fellow and radio journalist Krissy Clark’s Narrative Landscapes (Or a Vision I Had in the Desert) was a compelling story of her epiphany with location journalism, or how the advent of augmented reality brings us to literally want to “click” on the environment surrounding us to read or hear stories.

All sessions (including Ignite Google I/O) should go online on YouTube within the next few days, if not hours, and will be available on the sessions page.