I’ve recently been checking out YQL, Yahoo’s new gift to developers! The idea behind it is to unify the complicated world of APIs into one solid SQL-like abstraction that covers Flickr feeds, search, weather, RSS feeds and more. It can also retrieve HTML, XML, CSV, Atom and other formats, all returned as structured data in either JSON or XML.
The fact that it has the capability to return JSON and even JSONP means cross-domain web querying is now a piece of cake. Just imagine being able to have access to the entire internet from the comfort of the client-side!
If you have a Yahoo account you should definitely check out the console; there’s a bunch of examples to choose from on the side. Try this query:
    SELECT * FROM html
    WHERE url='http://news.bbc.co.uk'
    AND xpath="//*[@id='livestats200']//ul[contains(@class,'popstoryList')]/li/a"
    LIMIT 1
The above query should return the most read article on the BBC website (at the time of querying). You can use XPath to retrieve specific nodes within the targeted document. It would be cool if YQL supported CSS selectors, since many more people know that syntax, but unfortunately we’re stuck with XPath. It’s not too hard to learn though; if you’re new to it then this page is worth a glance: http://www.w3schools.com/XPath/xpath_syntax.asp.
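Outside the console, a query like the one above has to be URL-encoded and sent to the public endpoint. Here is a minimal sketch of that step; the helper name `buildYqlUrl` is made up for illustration, and the endpoint and parameter names are simply the ones used in the code later in this post.

```javascript
// Endpoint as used by the code later in this post.
var YQL_ENDPOINT = 'http://query.yahooapis.com/v1/public/yql';

// Hypothetical helper: turn a YQL statement into a request URL.
function buildYqlUrl(query, format) {
    // encodeURIComponent escapes spaces, slashes, '=', '&' and double
    // quotes, so the whole statement travels as one parameter value.
    return YQL_ENDPOINT +
        '?q=' + encodeURIComponent(query) +
        '&format=' + encodeURIComponent(format || 'json');
}

var bbcQuery =
    "SELECT * FROM html WHERE url='http://news.bbc.co.uk' " +
    "AND xpath=\"//*[@id='livestats200']//ul[contains(@class,'popstoryList')]/li/a\" " +
    "LIMIT 1";

var bbcUrl = buildYqlUrl(bbcQuery);
```

Note that the query is encoded exactly as written: the XPath expression and the URL inside it are case-sensitive.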
I’ve started development on a “CSS to XPath” converter for JavaScript. It works quite well, but it’s not yet robust enough to deal with complicated selectors.
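To give a feel for what such a conversion involves, here is a deliberately tiny sketch. It is not the converter mentioned above; the function names are hypothetical, and it only handles tag names, `#id`, `.class`, and the descendant and child (`>`) combinators. Anything else throws.

```javascript
// Convert one compound selector such as "ul#nav.menu" to an XPath step.
function compoundToXPath(compound) {
    var tagMatch = /^[a-zA-Z][\w-]*|^\*/.exec(compound);
    var step = tagMatch ? tagMatch[0] : '*';
    var rest = tagMatch ? compound.slice(step.length) : compound;
    var tokenRe = /([#.])([\w-]+)/g;
    var consumed = 0;
    var match;

    while ((match = tokenRe.exec(rest)) !== null) {
        if (match.index !== consumed) { break; } // gap: something unsupported
        consumed = tokenRe.lastIndex;
        if (match[1] === '#') {
            step += "[@id='" + match[2] + "']";
        } else {
            // Match a whole class token, not just a substring of @class.
            step += "[contains(concat(' ', normalize-space(@class), ' '), ' " +
                    match[2] + " ')]";
        }
    }

    if (consumed !== rest.length) {
        throw new Error('Unsupported selector part: ' + compound);
    }
    return step;
}

// Convert a whole selector, e.g. "ul.list > li a".
function cssToXPath(selector) {
    var tokens = selector.replace(/\s*>\s*/g, ' > ').trim().split(/\s+/);
    if (!tokens[0]) {
        throw new Error('Empty selector');
    }
    var xpath = '';
    var axis = '//'; // descendant unless a '>' says otherwise

    for (var i = 0; i < tokens.length; i++) {
        if (tokens[i] === '>') {
            axis = '/';
            continue;
        }
        xpath += axis + compoundToXPath(tokens[i]);
        axis = '//';
    }
    return xpath;
}
```

For example, `cssToXPath('ul#latest-news')` gives `//ul[@id='latest-news']`. Attribute selectors, pseudo-classes and selector groups are where it gets complicated, and this sketch rejects them rather than guessing.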
I’ve also written a small interface for working with YQL within your JavaScript applications (framework agnostic). Have a look:
    // YQL serves JSONP (with a callback) so all we have to do
    // is create a script element with the right 'src':
    function YQLQuery(query, callback) {

        this.query = query;
        this.callback = callback || function(){};

        this.fetch = function() {

            if (!this.query || !this.callback) {
                throw new Error('YQLQuery.fetch(): Parameters may be undefined');
            }

            var scriptEl = document.createElement('script'),
                uid = 'yql' + +new Date(),
                // Encode the query as-is: URLs and XPath expressions are case-sensitive.
                encodedQuery = encodeURIComponent(this.query),
                instance = this;

            // Temporary global callback; cleans up after itself once called.
            YQLQuery[uid] = function(json) {
                instance.callback(json);
                delete YQLQuery[uid];
                document.body.removeChild(scriptEl);
            };

            scriptEl.src = 'http://query.yahooapis.com/v1/public/yql?q='
                         + encodedQuery + '&format=json&callback=YQLQuery.' + uid;

            document.body.appendChild(scriptEl);

        };

    }
Use it like this:
    // Alert the latest post title from Ajaxian.com

    // Construct your query:
    var query = "select * from rss where url='feeds2.feedburner.com/ajaxian' limit 1";

    // Define your callback:
    var callback = function(data) {
        var post = data.query.results.item;
        alert(post.title);
    };

    // Instantiate with the query:
    var ajaxianPosts = new YQLQuery(query, callback);

    // If you're ready then go:
    ajaxianPosts.fetch(); // Go!!

    /*
    Callback & query can be defined as properties also:
    ajaxianPosts.query = 'select * from...';
    ajaxianPosts.callback = function(){};
    */
Also, for you jQuery junkies, here’s a plugin:
    $.YQL = function(query, callback) {

        if (!query || !callback) {
            throw new Error('$.YQL(): Parameters may be undefined');
        }

        // Encode the query as-is: URLs and XPath expressions are case-sensitive.
        var encodedQuery = encodeURIComponent(query),
            url = 'http://query.yahooapis.com/v1/public/yql?q='
                + encodedQuery + '&format=json&callback=?';

        // The 'callback=?' placeholder makes jQuery treat this as JSONP.
        $.getJSON(url, callback);

    };

    // Usage:
    $.YQL("select * from rss where url='feeds2.feedburner.com/ajaxian' limit 1", function(data) {
        var post = data.query.results.item;
        alert(post.title);
    });
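Both examples read `data.query.results.item` directly, which is fine for `limit 1`. As a hedged sketch, here is a small helper (hypothetical name `yqlItems`) that tolerates other shapes. It assumes, without this post confirming it, that `results` can be `null` when nothing matches and that a key holds an array when several rows match.

```javascript
// Hypothetical helper: always return an array of rows for the given key.
// Assumed response shape: { query: { results: { item: {...} | [...] } } }
function yqlItems(data, key) {
    var results = data && data.query && data.query.results;
    if (!results || results[key] === null || results[key] === undefined) {
        return []; // no results at all, or none under this key
    }
    var value = results[key];
    return Array.isArray(value) ? value : [value];
}

// Example with a hand-written response object (not real YQL output):
var sampleResponse = {
    query: { results: { item: { title: 'Hello' } } }
};
var titles = yqlItems(sampleResponse, 'item').map(function (post) {
    return post.title;
});
```

A callback can then loop over `yqlItems(data, 'item')` regardless of how many rows came back.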
Even if you’re not planning on using YQL in a project, it’s definitely worth a look, and Yahoo’s console makes it incredibly easy to play around with.
Thanks for reading! Please share your thoughts with me on Twitter. Have a great day!
Your use of the comma operator in your last code section initially hid from me the fact that you were building the url parameter to use in your getJSON call. To clarify the example and add robustness, the parameters to your jQuery plugin could be type-checked for string and function, beyond checking that they are not undefined.
Thanks for the useful example of using the power of JSONP and YQL to bring a remote RSS feed into the client!
@Monty, Thanks for the tip about extra type-checking. To be honest, I was reluctant about integrating any input checking since I generally think it’s unnecessary in abstractions like these; I only included it for good form.
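For anyone who does want the stricter checks suggested above, a minimal sketch might look like this. The helper name `checkYqlArgs` is hypothetical and is not part of the plugin in the post.

```javascript
// Hypothetical validation helper: throws unless 'query' is a non-empty
// string and 'callback' is a function.
function checkYqlArgs(query, callback) {
    if (typeof query !== 'string' || query.replace(/\s+/g, '') === '') {
        throw new TypeError('query must be a non-empty string');
    }
    if (typeof callback !== 'function') {
        throw new TypeError('callback must be a function');
    }
}
```

It could be called as the first line of `$.YQL` in place of the `!query || !callback` test.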
Wow.. This is really amazing!
Just a newbie question: is there an easy method to rebuild the XHTML DOM structure from the JSON results? I’m trying to fetch some news from a magazine home page and inject them in a div.
Unfortunately, I discovered it’s impossible.
Because of the JSON structure YQL uses, the order of the child tags is arbitrary (and sometimes wrong).
If that’s OK for someone, here’s my function that tries to convert YQL JSON to HTML: http://snipplr.com/view/13389/yql2dom-jquery-plugin/
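To make the problem concrete, here is a rough sketch (not the linked plugin) of turning YQL-style JSON back into markup. It assumes a shape where child elements are nested objects or arrays keyed by tag name, attributes are plain string properties, and text lives under `content`. That shape is an assumption based on this discussion, and the function names are hypothetical.

```javascript
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

// Hypothetical converter: serialise one node of the assumed JSON shape.
function yqlNodeToHtml(tagName, node) {
    // A bare string (or missing value) is an element holding only text.
    if (node === null || typeof node !== 'object') {
        var text = (node === null || node === undefined) ? '' : node;
        return '<' + tagName + '>' + escapeHtml(text) + '</' + tagName + '>';
    }

    var attributes = '';
    var inner = '';

    Object.keys(node).forEach(function (key) {
        var value = node[key];
        if (key === 'content') {
            inner += escapeHtml(value);
        } else if (Array.isArray(value)) {
            // Several same-named children, grouped together by the JSON.
            value.forEach(function (child) {
                inner += yqlNodeToHtml(key, child);
            });
        } else if (value !== null && typeof value === 'object') {
            inner += yqlNodeToHtml(key, value);
        } else if (value !== null && value !== undefined) {
            // Guess: a plain value is an attribute (it may really be a child).
            attributes += ' ' + key + '="' + escapeHtml(value) + '"';
        }
    });

    return '<' + tagName + attributes + '>' + inner + '</' + tagName + '>';
}
```

The sketch can only emit children in the order the object's keys happen to appear, with same-named siblings grouped together, so any interleaving of different tags or of text and tags in the source page cannot be recovered. It also has to guess whether a plain string property is an attribute or a text-only child.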
@Valentino, I think the trick is to be as specific as you can with the query. So if you’re looking for all the content from ul#latest-news you could try:
Another cool thing you can do is select only certain properties/nodes, so instead of selecting everything (*) you can select all content:
Or, for example, all HREF attributes from all anchors:
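Queries of the kind described in this comment might look something like the following. This is a sketch under assumptions: the page URL is a placeholder, the helper name `htmlQuery` is hypothetical, and the field names `content` and `href` are taken from the wording above.

```javascript
// Hypothetical helper: build a query against the 'html' table.
// No quote escaping is attempted, so inputs must not contain double quotes.
function htmlQuery(fields, url, xpath) {
    return 'select ' + fields + ' from html' +
           ' where url="' + url + '"' +
           ' and xpath="' + xpath + '"';
}

var pageUrl = 'http://example.com/'; // placeholder, not a real target

// Everything inside ul#latest-news:
var wholeList = htmlQuery('*', pageUrl, "//ul[@id='latest-news']");

// Only the text content of the links in that list:
var textOnly = htmlQuery('content', pageUrl, "//ul[@id='latest-news']//a");

// Only the href attribute of every anchor on the page:
var hrefsOnly = htmlQuery('href', pageUrl, '//a');
```

Each string can be pasted into the console or passed to `YQLQuery` / `$.YQL` from the post.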
The order of the child tags returned has been correct for me so far. What led you to that conclusion?
It may make more sense for you to use Yahoo Pipes. It requires you to make a pipe yourself but once you’ve created it you can retrieve all the HTML within any page as a string. Have a look: http://pipes.yahoo.com/pipes/pipe.edit (drag in the “fetch page” source, link it up with the output box, then save it and retrieve as JSON)
I ran some tests and discovered more problems:
http://www.maverick.it/m2stream/yql.php
I found YQL has these problems with HTML:
* content disappears in some cases (more below)
* ‘class’ attributes are removed
* the order of tags changes
* some tags are changed (‘b’ becomes ‘strong’)
* ambiguity between tags and attributes (think of style or title, for example)
The biggest problem I noticed is that some content may disappear.
If I try to get this HTML:
The ‘content’ of the p object contains only “just a” and not “This”.
Anyway, my function had a little bug. It’s fixed now:
http://snipplr.com/view/13389/yql2dom-jquery-plugin/
Thanks for the suggestions. I didn’t know Yahoo Pipes has JSON output. I’ll give it a shot. ; )
Hi, you should use jQuery’s $.getScript function to optimize the YQLQuery class:
http://docs.jquery.com/Ajax/jQuery.getScript
Cheers!
Hi James,
I tested your code and it works fine except that Firebug seems to bring up the following error message regarding your “src” attribute value…
not well-formed
Line 34
scriptEl.src = 'http://query.yahooapis.com/v1/public/yql?q=' + encodedQuery + '&format=json&callback=YQLQuery.' + uid;
…not sure what you make of that, I couldn’t see anything wrong with it myself.
(there is no such error with IE so I don’t know if it’s a bug in the code or with the Firebug console itself).
Any feedback appreciated.
Thanks James.
Wow! Someone at Yahoo listened to my complaints:
http://ajaxian.com/archives/pimping-json-yql-now-offers-jsonp-x
Unfortunately, this example script doesn’t really work for yahoo search results on YQL.
(i.e. you can’t just replace the select statement).
Is there a workaround?