Friday, May 27, 2011

GData: I can't take it anymore

I have been playing with GData since I was on the Google Calendar team back in 2005/2006. My experiences with GData can be summarized by the following graph:

There are many reasons why GData continues to infuriate me:
  • GData is not about data—it is an exercise in XML masturbation. If you look at the content of a GData feed, most of the bytes are dedicated to crap that does not matter due to a blind devotion to the Atom Publishing Protocol. In recent history, GData has become better in providing clean JSON data, but the equivalent XML it sends down is still horrifying by comparison. I understand the motivation to provide an XML wire format, but at least the original Facebook API had the decency to use POX to reduce the clutter. Atom is the reason why, while I was at Google, the first GData JSON API we released had to have a bunch of dollar signs and crap in it, so that you could use it to construct the Atom XML equivalent when making a request to the server. When I wanted to add web content events to Calendar in 2006, most of my energy was spent on debating what the semantically correct Atom representation should be rather than implementing the feature. Hitching GData to the Atom wagon was an exhausting waste of time and energy. Is it really that important to enable users to view their updates to a spreadsheet in Google Reader?
  • REST APIs aren't cool. Streaming APIs are cool. If we want to have a real time Web, then we need streaming APIs. PubSub can be used as a stopgap, but it's not as slick. For data that may change frequently (such as a user's location in Google Latitude), you have no choice but to poll aggressively using a REST API.
  • GData has traditionally given JavaScript developers short shrift. Look at the API support for the GData Java client library or Python client library compared to the JavaScript client library. JavaScript is the lingua franca of the Web: give it the attention it deserves.
  • The notion of Atom forces the idea of "feeds" and "entries." That is fine for something like Google Calendar, but is less appropriate for hierarchical data, such as that stored in Google Tasks. Further, for data that does not naturally split into "entries," such as a Google Doc, the entire document becomes a single entry. Therefore, making a minor change to a Google Doc via GData requires uploading the entire document rather than the diff. This is quite expensive if you want to create your own editor for a Google Doc that has autosave.
  • Perhaps the biggest time sink when getting started with GData is wrapping your head around the authentication protocols. To play around with your data, the first thing you have to do is set up a bunch of crap to get an AuthSub token. Why can't I just fill out a form on and give myself one? Setting up AuthSub is not the compelling piece of the application I want to build—interacting with my data is. Let me play with my data first and build a prototype so I can determine if what I'm trying to build is worth sharing with others and productionizing, and then I'll worry about authentication. Facebook's JavaScript SDK does this exceptionally well. After registering your application, you can include one <script> tag on your page and start using the Facebook API without writing any server code. It's much more fun and makes it easier to focus on the interesting part of your app.
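To make the dollar-sign complaint in the first bullet concrete, here is a rough sketch of how a GData-style JSON entry maps back onto Atom XML. It assumes the convention that `$t` holds an element's text and that a key like `gd$when` stands in for the namespaced tag `gd:when`. The sample entry values and the helper names `qname` and `to_element` are made up for illustration; this is not Google's converter.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for the two prefixes used in this sketch.
NAMESPACES = {
    "atom": "http://www.w3.org/2005/Atom",
    "gd": "http://schemas.google.com/g/2005",
}
for prefix, uri in NAMESPACES.items():
    ET.register_namespace(prefix, uri)


def qname(key, default_prefix="atom"):
    """Turn a JSON key such as 'gd$when' into a namespaced XML tag name."""
    prefix, sep, local = key.partition("$")
    if not sep:
        # No dollar sign: treat the key as a plain Atom element.
        prefix, local = default_prefix, key
    return "{%s}%s" % (NAMESPACES[prefix], local)


def to_element(key, value):
    """Rebuild one XML element from its JSON object (hypothetical helper)."""
    elem = ET.Element(qname(key))
    for name, item in value.items():
        if name == "$t":
            # '$t' carries the element's text content.
            elem.text = item
        elif isinstance(item, dict):
            elem.append(to_element(name, item))
        elif isinstance(item, list):
            # Repeated child elements arrive as a JSON array.
            for child in item:
                elem.append(to_element(name, child))
        else:
            # Plain scalar values are treated as XML attributes.
            elem.set(name, str(item))
    return elem


# Invented sample data in the shape of a calendar entry.
sample_entry = {
    "title": {"type": "text", "$t": "Lunch"},
    "gd$when": [
        {"startTime": "2011-05-27T12:00:00", "endTime": "2011-05-27T13:00:00"}
    ],
}

entry_element = to_element("entry", sample_entry)
entry_xml = ET.tostring(entry_element, encoding="unicode")
```

Printing `entry_xml` shows how much Atom scaffolding (namespaces, attribute-versus-text distinctions) the JSON has to carry around just so the XML can be reconstructed from it.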
If GData were great, then Google products would be built on top of GData. A quick look under the hood will reveal that no serious web application at Google (Gmail, Calendar, Docs, etc.) uses it. If GData isn't good enough for Google engineers, then why should we be using it?

1 comment:

  1. You should take a look at:

    Things are getting better.