Chris Heilmann recently posted on how to use YQL to make cross-domain requests, which would usually be prohibited by the same-origin policy. I already knew about YQL, but I had no idea that it allowed retrieval of HTML from other sites via JSON, returned as a single string!
Instead of asking for JSON format, ask for XML, but also add a callback parameter to your query. Voila!
So, in short, YQL allows us to make cross-domain GET requests!
Chris also posted a demo!
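As a rough sketch of the request described above (not necessarily how Chris's demo builds it), the YQL URL could be assembled like this. The helper name buildYqlUrl and the callback name are hypothetical; the endpoint is the public YQL one, and the query assumes YQL's html table.

```javascript
// Hypothetical helper: builds a YQL URL that asks for a page's HTML
// as XML wrapped in a JSONP callback.
function buildYqlUrl(targetUrl, callbackName) {
    // Assumes targetUrl contains no double quotes; a real version
    // would need to escape or reject them.
    var query = 'select * from html where url="' + targetUrl + '"';
    return 'http://query.yahooapis.com/v1/public/yql' +
        '?q=' + encodeURIComponent(query) +
        '&format=xml' +
        '&callback=' + encodeURIComponent(callbackName);
}

var exampleUrl = buildYqlUrl('http://example.com', 'handleHtml');
```

The resulting URL can be loaded through a script tag, and the response arrives as a call to the named callback.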
With a bit of hacking, we can make jQuery work with YQL for all cross-domain GET requests. UPDATE: I’ve decided to put this in my “jQuery Plugins” repo at Github:
Cross-Domain Ajax mod @ Github
With this mod, any GET request made via jQuery.ajax to another domain will work!
$('#container').load('http://google.com'); // SERIOUSLY!

$.ajax({
    url: 'http://news.bbc.co.uk',
    type: 'GET',
    success: function(res) {
        var headline = $(res.responseText).find('a.tsh').text();
        alert(headline);
    }
});

// Works with $.get too!
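How the mod decides which requests to reroute is not shown here. A minimal sketch, assuming it only intercepts GET requests whose URL is absolute and points at a different origin, might look like this. The function name shouldUseYql and the pageOrigin parameter are hypothetical; in a browser, pageOrigin would be something like location.protocol + '//' + location.host.

```javascript
// Hypothetical check: should this request be rerouted through YQL?
function shouldUseYql(options, pageOrigin) {
    var url = String(options.url || '');
    // jQuery defaults to GET when no type is given.
    var isGet = /^get$/i.test(options.type || 'GET');
    // Relative URLs always stay on the current domain.
    var isAbsolute = /^https?:\/\//i.test(url);
    var isSameOrigin = url === pageOrigin ||
        url.indexOf(pageOrigin + '/') === 0;
    return isGet && isAbsolute && !isSameOrigin;
}
```

Anything that fails the check (POST requests, relative URLs, same-origin URLs) would fall through to the normal jQuery.ajax behaviour.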
Have fun!
The ability to scrape HTML from external services (and have it cached by Yahoo!) was one of the main things that excited me when I was at last year’s London Hackday.
I’ve been meaning to abuse… erm I mean test it for a while but just never got round to it.
And to make POST requests, see http://www.wait-till-i.com/2009/11/16/using-yql-to-read-html-from-a-document-that-requires-post-data/
Nice idea, really like it
Great idea!
YQL even supports SSL, so I would make the yql url dependent on the website’s protocol to avoid mixed content warnings in IE:
Besides that, you forgot a semicolon after the “var” declarations. 🙂
@Francois, looking into that now 🙂 Thanks for the link!
@Christopher, thanks, fixed it!
I use this for a little online recipe bookmarking script I made. It saves me so much time from having to copy and paste the title, URL, and image source. Now I just have to copy the URL, paste it into the URL field and the other two are filled. 😀
cool stuff.
Hmmm… Seems like someone got inspired by your script, and added the same functionality to MooTools: http://mootools.net/shell/aUgSz/
Yay….
Now when I visit homepages that look EXACTLY like my bank, and even let me log in… they will rob all my money…
and furthermore… steal my Facebook, Gmail, and MSDN, just because I can’t tell the difference between a real login that works and a fake login that works…
There are good reasons why browsers by default don’t support cross-site JavaScript POSTs/GETs
Gaaahhh
… Not to mention Content origin…
with Cross-Site JS… you may “steal” content without even using the server (which is done the old way)
By letting Clients grab content – you’ll expose the risk that content providers get hacked/grabbed by zillions of clients without knowing it…
@Martin, wow, do you even understand how this works? It’s using JSONP to retrieve arbitrary HTML… And the HTML retrieved is that which is seen by the YQL proxy — not by you. This isn’t breaking the same-domain-policy or even trying to. There’s no way that an attacker can harness this in that way, – it’d probably be better for them just to go to the bank’s website and copy the source.
@James
Yes, i perfectly understand whats going on – Heck i’ve grabbed HTML content from other sites myself – but always using a server-proxy:
client -> server -> X-Site -> server -> client
the way i read your post you are doing:
client -> X-Site -> client
which is exactly what I’m talking about, no ?
Nope.
client -> YQL Server -> X-Site -> YQL Server -> client
How is
client -> X-Site -> client
even possible??
Ahh… that makes me feel more safe 😀
i read that you retrieved HTML… not just JSON’like-data from a trusted server…
Cross-site Requests are not allowed – and only possible with custom plugins (afaik.)
Can we expect this in the official assembly of jQuery? 🙂
Umm, is it just me or does anyone else not like the idea of relying on Yahoo’s proxy? If yahoo is down for some reason you’re going to be SOL. Not that this is all that likely, but still, it’s an extra dependency. I mean… are we even sure Yahoo will be in business in a year? 🙂
@Eric: If you want to use your own proxy, go ahead. Good luck.
@Eric comments like this will eventually make me stop caring to build solutions like that. We built YQL because we run our own services on it. We then offer it to the world to make it better, and what we get is “I wonder if they’ll be around in a year”. While I fight the good fight in the company, I start to wonder why when I get messages like these back. I am quite sure that Yahoo will be around in a year’s time; otherwise I wouldn’t spend that much of my effort in there. Whether the people who constantly claim that Yahoo is dead, while blatantly praising everything some other companies or random startups do, will be around, I am not too sure.
Is it possible, with this plugin, to get the current location of an iFrame?
I mean, I need to get the actual location.href of an iframe, but i get permission denied with simple javascript…
Perhaps I’m just dense, but I encountered an anomaly with this plugin. I built a simple html file to test the
$('#container').load('http://google.com'); // SERIOUSLY!
example. I opened the file in the browser and nothing happened. After stepping through with Firebug, I noticed the request was using a FILE:// protocol to the Yahoo URL. I made the following change to the plugin:
//YQL = protocol + '/query.yahooapis.com/v1/public/yql?callback=?',
YQL = 'http://query.yahooapis.com/v1/public/yql?callback=?',
and it worked fine. Any thoughts on how to modify this plugin so it would work in this situation? I often perform initial development on local files, before moving to a server.
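One possible way to cover both cases, as a sketch only: take the scheme from the page's protocol when it is https, and fall back to plain http for anything else (such as file:). The name yqlEndpoint is hypothetical and is not taken from the plugin.

```javascript
// Hypothetical helper: choose the YQL endpoint scheme from the page protocol.
// pageProtocol is expected in the form location.protocol returns,
// for example 'http:', 'https:' or 'file:'.
function yqlEndpoint(pageProtocol) {
    // Only https pages need the https endpoint (avoids mixed content
    // warnings); local files and everything else use plain http.
    var scheme = /^https/i.test(pageProtocol) ? 'https' : 'http';
    return scheme + '://query.yahooapis.com/v1/public/yql?callback=?';
}
```

In a browser this would be called as yqlEndpoint(location.protocol).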
@Regent, doubtful. I’m actually quite worried about this getting too popular. The horde of jQuery beginners will think this works like same-domain XHR… which it absolutely doesn’t!
@David, nope. You can’t get the navigated-to location of an external-domain iframe. But you can get its first location…
iframeElement.src
@Clayton, ahh, yes, I added the protocol thing so it would work with https. I’ve updated it, so it should now work locally too. See the commit: http://github.com/jamespadolsey/jQuery-Plugins/commit/3db614a8e3a04f871bccbbe8f18442850ddf19bd
@James, Yes, you’re right, this is a good reason.
If the Yahoo server is not working, could another alternative be used to guarantee a 100% result?
For example CssHttpRequest – http://nb.io/hacks/csshttprequest
Just curious, but how would you install this in the simplest terms?
Also, is this basically an iFrame functions but with javascript? but without restraining the loaded page to the dimensions of an iframe but use the entire page as if it was loaded locally and not remotely?
Is there a way to send a request to an external domain (opened in a new popup window), and then, after the user clicks a button, have the external domain send some parameters back to the initial server? We have full access to both domains to put JS on them. Is there a way using jQuery?
Regards,
Ben Klaswer
neat & works like a charm 🙂
Unfortunately I can’t get it to work in IE 8 when I change the YQL query to xml for xml responses (select * from xml), am I missing something?
FF parses perfectly, only MSIE does not. The JSON response is OK, though, but neither jQuery.find() nor filter() works 🙁
This works great for me when I make the Ajax call immediately upon loading, but if I try to make the call after clicking a button, I get an error response. I’m trying to do this:
$('button').click(function() {
    $.ajax({
        //AJAX code here
    });
});
Clicking the button gives me an error, whereas it works fine immediately upon loading the page if I just have this:
$.ajax({
//AJAX code here
});
I must be missing something obvious…any thoughts?
Thanks!
Disregard my above comment. I figured it out, and it was something unique to my site.
Thanks for the great plug-in! 🙂
Thank you, James. This plugin absolutely made my day!
Having said the above, I’m noticing an anomaly in the content returned from YQL.
I’m using the plugin to .load() content from public Google Calendar events, the url for which is a variable parsed by jQuery. Sometimes the content returned is in German, and sometimes it is (correctly) in English!
Is this the fault of YQL, something I did, something you did, or something Google did? I can’t work it out.
Thanks.
Okay looks like modifying this portion of your plugin solves the error call!
if (_success && data.results[0] != undefined) {
    // Fake XHR callback.
    _success.call(this, {
        responseText: data.results[0]
            // YQL screws with <script>s
            // Get rid of them
            .replace(/<script[^>]+?\/>|<script(.|\s)*?<\/script>/gi, '')
    }, 'success');
}
else {
    o.error.call(this, 'not received', 'data is null');
}
Please let me know if I am correct or not 🙂
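A guard along these lines can also be pulled out into a small standalone function, which makes the empty-result case easier to test. This is only a sketch: extractHtml is a hypothetical name, and it assumes the YQL response carries the HTML as a string in data.results[0].

```javascript
// Hypothetical helper: pull the HTML string out of a YQL response,
// or return null when YQL sent back no results.
function extractHtml(data) {
    var results = data && data.results;
    if (!results || typeof results[0] !== 'string') {
        return null; // caller should invoke its error handler
    }
    // Strip self-closing and paired <script> elements.
    return results[0].replace(
        /<script[^>]*?\/>|<script[\s\S]*?<\/script>/gi,
        ''
    );
}
```

A caller would run the success callback when the return value is a string, and the error callback when it is null.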
So what would be the best practice to pass the xpath parameter? Now you use xpath="*", but what if I want to filter something on the page, for example xpath='//div[@class="someContents"]', and/or use the limit and offset keywords?
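One way this could be approached, purely as a sketch: build the YQL statement from options instead of hard-coding it. The function buildHtmlQuery and its options object are hypothetical, and the sketch assumes YQL accepts limit and offset after the where clause of an html table query.

```javascript
// Hypothetical helper: build a YQL statement with an optional
// XPath filter, limit and offset.
function buildHtmlQuery(url, options) {
    options = options || {};
    var xpath = options.xpath || '*';
    // The XPath is wrapped in single quotes so it can contain double
    // quotes; an XPath containing single quotes is rejected here.
    if (xpath.indexOf("'") !== -1) {
        throw new Error('xpath must not contain single quotes');
    }
    var query = 'select * from html where url="' + url + '"' +
        " and xpath='" + xpath + "'";
    if (options.limit != null) {
        query += ' limit ' + Number(options.limit);
    }
    if (options.offset != null) {
        query += ' offset ' + Number(options.offset);
    }
    return query;
}
```

For example, buildHtmlQuery('http://example.com', { xpath: '//div[@class="someContents"]', limit: 5 }) would restrict the result to matching div elements and cap it at five.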
Ermm, and this is so great because…? I’m routinely doing screen-scraping of OPS (other people’s sites) via PHP’s file_get_contents() function on my own server. This way I get the same code I see in Firebug, which is important to me for extraction. And I don’t have to rely on Yahoo.
Thanks for the hack!
One thing, since nobody seemed to note it before.
YQL will refuse to return content if webmaster has banned robots from his site (I tested query on yahoo site and response was very clear). In particular, I was trying to get information from google maps business pages (like http://www.google.com/maps/place?cid=17434047103649409317)
Perhaps someone can help me, i need to have jQuery render an xml feed in html cross domain from http://clinicaltrials.gov/search?term=%22lyme+disease%22&studyxml=true for instance.
tried the following but it did not work.
$(document).ready(function(){
    $.ajax({
        type: "GET",
        url: "http://clinicaltrials.gov/search?term=%22lyme+disease%22&studyxml=true",
        dataType: "xml",
        success: function(xml) {
            $(xml).find('site').each(function(){
                var nct_id = $(this).attr('nct_id');
                var title = $(this).find('title').text();
                var url = $(this).find('url').text();
                var condition_summary = $(this).find('condition_summary').text();
                $('').html(''+title+'').appendTo('#page-wrap');
                $(this).find('desc').each(function(){
                    var brief = $(this).find('brief').text();
                    var long = $(this).find('long').text();
                    $('').html(brief).appendTo('#link_'+id);
                    $('').html(long).appendTo('#link_'+id);
                });
            });
        }
    });
});
#christoff
you need a serverside-proxy…
Hey I’m getting a strange error:
Uncaught ReferenceError: jsonp1283340405175 is not defined
Any ideas? Seems like the JSON is not being parsed correctly?
Here’s my ajax request;
data.results[0] is undefined, in firebug console.
– any clue what am I doing wrong ?
Great solution for the cross-domain issue. But it didn’t work for some URLs. The URL that I’m trying to load contains mostly dynamic content rather than static. I don’t know if that’s the issue. The other problem is that this didn’t work for me in IE. It worked perfectly in Firefox. Is YQL browser dependent?
@James
When I change google.com to facebook.com in the jQuery.ajax function you give in your examples and tests I get an error in my console, that is data.results[0] is undefined. Can someone please give me some advice with this? Here is the code snippet:
$.ajax({
    type: 'GET',
    url: 'http://www.facebook.com',
    success: function(html){
        process(html);
    },
    error: function(){
        debug("ajax error");
    }
});