I was recently looking into automatically sourcing event data directly from venue websites. What I found was stuff like this:
And I cried a little. What happened to semantic HTML?
Semantic HTML is the use of HTML markup to reinforce the semantics, or meaning, of the information in webpages rather than merely to define its presentation or look.
We have all this semantically rich information (like cinema timings/locations for example) curated over countless man hours and it’s all just sitting on private database servers. The exposed representation, as you can see above, is not enriched, not semantic, and not really usable.
Ok, semantically-poor HTML is still usable — I just have to implement a generic date-parser, a natural-language processor, a domain specific location parser, and maybe I can throw in some neural networking too, huh?
All that effort developing software that can parse and make sense of data that should have been semantic to begin with. What a waste of time & effort.
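To make the contrast concrete, here is a minimal sketch of how little code is needed when the markup is semantic to begin with. The listing, the film, the venue and the `EventParser` class are all made up for illustration, and I'm assuming microdata-style `itemscope`/`itemprop` attributes plus a `<time datetime="...">` element. It also assumes the element carrying `itemscope` contains no nested element with the same tag name.

```python
from datetime import datetime
from html.parser import HTMLParser

# Hypothetical listing: the machine-readable date lives in the datetime
# attribute, so the human-friendly text never has to be parsed.
SEMANTIC_HTML = """
<ul>
  <li itemscope itemtype="https://schema.org/ScreeningEvent">
    <span itemprop="name">Example Film</span>
    <time itemprop="startDate" datetime="2013-05-04T19:30">Sat 4th May, 7.30pm</time>
    <span itemprop="location">Screen 2, Example Cinema</span>
  </li>
</ul>
"""


class EventParser(HTMLParser):
    """Collects one dict per itemscope element (sketch, not a full microdata parser)."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._current = None    # dict for the item being read, if any
        self._scope_tag = None  # tag name that opened the current item
        self._prop = None       # itemprop whose text content we are waiting for

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:
            self._current = {}
            self._scope_tag = tag
            self.events.append(self._current)
            return
        if self._current is None:
            return
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if tag == "time" and "datetime" in attrs:
            # Prefer the ISO 8601 attribute over the display text.
            self._current[prop] = datetime.fromisoformat(attrs["datetime"])
            self._prop = None
        else:
            self._prop = prop

    def handle_data(self, data):
        text = data.strip()
        if self._current is not None and self._prop and text:
            self._current[self._prop] = text
            self._prop = None

    def handle_endtag(self, tag):
        # Assumes no nested element shares the tag name of the item itself.
        if self._current is not None and tag == self._scope_tag:
            self._current = None
            self._scope_tag = None


parser = EventParser()
parser.feed(SEMANTIC_HTML)
print(parser.events)
```

No date-guessing, no natural-language processing: the date arrives as a real `datetime` because the publisher already said what it was.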
I know this plea is fruitless though. People want their magical web with embedded tweets and Instagram photos. They couldn’t care less about semantic enrichment.
But we should care. We understand the tenets of the web; the mere fact that via HTTP You or A Server In Timbuktu can request a resource and get back a corresponding representation in HTML. That universality is squandered every time we write crap HTML or choose not to progressively enhance.
The concept of the semantic web has been around since 1999, when Tim Berners-Lee expressed his vision of a future web in which computers could understand the context of human speech and thought, to be able to “understand” our meaning when expressing ourselves.
It’s cool though — we’re just holding our breaths for an AI that can decipher the mess we’ve created, and when we get there maybe the enriched data will be publicly accessible and not in the hands of a monopolising party. And in the meantime, while we await salvation, we can create short-lived sugar in that new MVVM framework you heard about.
PS. Watch this:
While the state of semantic markup is pretty bad at the moment, do not despair. The very problem you illustrate ensures that the next company that cares about clean semantic code will eventually win out: because their build and maintenance effort is so much lower, they will always have more time to build useful, enhanced experiences that customers really want.
If you cry about this stuff, get a hold of a few Japanese websites. Afaik, they are the worst on the planet when it comes to semantic markup. Especially the kinds of sites that I, as a returning tourist, need for doing some research before I go.
Not only tables-galore like it’s the end of all things, but also (quite annoyingly) texts-as-images with no alt/title attributes. Have fun Google Translating that. Add to that a total shitpile of banners, buttons, and affiliate links, and you’re in for a real treat.
Now, why a country as technologically advanced as Japan is so far behind on web tech is beyond me. Maybe it’s just like you stated: they couldn’t care less.
@davidelrizzo, the thing is, I’m not convinced that having semantic HTML pays off economically for a single business. In fact, it probably does the opposite: businesses that retain their own enriched data can capitalise upon it. Releasing it outright would, in most cases, not help them. The only way to convince a business to do this is with ‘SEO’. Unfortunately it seems to be the case that most businesses do not truly care for the tenet of a semantically enriched web, nor for doing a ‘good thing’ in the more general sense.
@Martijn, I think I’ll spare myself the pain of visiting such websites. It’s a great shame…
I suppose when it comes to frontend developers we just produce good code. That leads me to believe that such horribly bad code must come from serverside systems, where the HTML was hacked in years ago.
Combined with inflexible CMSes, customers feel forced to produce bad code, since that’s the only thing that works. It also gives them the idea that their systems can output “anything” into their CMS, because all the styling is in the HTML.
I don’t think frontend developers are to blame for this. Personally, from what I’ve seen happening, strange, blackboxed, and/or poorly maintained backends are the culprit for many of these kinds of bad code, combined with a process for mitigating such problems that is too costly, even when they’re acknowledged.
Right with you @James on the “But we should care.” front. The last thing this world needs is less clarity in communication between people. The irony, as one of the people in the film points out, is that the word semantic is seemingly losing its clear meaning.
I think that if browsers and assistive technologies, like JAWS, the Mac assistive screen reader and others, would better implement semantics we might see greater use. It’s hard to build a case for its importance when you can’t point to anyone using it well today. By that I mean, what positive impact does it have on the user and business? Where is the ROI for semantics? Unfortunately that’s all most businesses care about.
I’ve been riding the semantic bandwagon for a while, mostly because it “feels” right. I have trouble finding actionable reasons to justify semantics to others. It’d be great to see some case studies that put some dollar value on semantics. Maybe it’s simply a matter of saving time, and in turn money, in the long run by making data more maintainable.
Long live semantics! Unicorn forever! 😉
Haha well said! I sort of have to agree with @Martijn – old school CMS and inflexibility may have a big impact on this.
One thing that would make it a whole lot easier would be proper tools for the job. It’s currently very hard for a person not schooled in development to add semantic data to their websites using CMSes etc.
Even to your average developer, it can be rather daunting. There are some automated test tools here and there, but most of them suck massively at showing you if you did things right (that includes Google’s rich snippet tool). Often, you won’t find out if your efforts at enriching the text worked until results start showing up in search engines.
So, proper tools are one thing. Good standards that aren’t voodoo are another. RDFa-lite looks pretty good here, but after you stare at it for a while you’re left wondering: is this enough for my purposes or do I need RDFa proper? Unfortunately, you’re left with that question, because most of the examples and experts can’t be bothered to cater to your (aka the average developer’s) needs. What should you choose then? How would you go about deciding on this issue?
Until we answer questions like: “How do I properly mark up information in this blog post?”, “How do I mark up the information about this product and the related products?”, and most importantly “How do I test if I got it right and get hints when I haven’t?” – we won’t see widespread semantic html. It’s simply too much of a bother for your average dev to get into when the pay-out is not higher.
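As a rough illustration of the kind of “tell me what I missed” tool being asked for, here is a minimal sketch. The product snippet, the `REQUIRED` checklist and the `missing_properties` helper are all hypothetical; a real checker would take its list of required properties from the vocabulary in use rather than from a hard-coded list, and would understand item scopes instead of scanning the whole page.

```python
from html.parser import HTMLParser


class ItempropCollector(HTMLParser):
    """Records every itemprop name that appears anywhere in the document."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "itemprop" and value:
                # An itemprop attribute may list several space-separated names.
                self.found.update(value.split())


def missing_properties(html, required):
    """Return the required itemprop names that the markup never uses."""
    collector = ItempropCollector()
    collector.feed(html)
    return sorted(set(required) - collector.found)


# Hypothetical product snippet and a made-up checklist of expected properties.
PRODUCT_HTML = """
<div itemscope itemtype="https://schema.org/Product">
  <h1 itemprop="name">Example Widget</h1>
  <p itemprop="description">A widget used purely for illustration.</p>
</div>
"""
REQUIRED = ["name", "description", "offers"]

hints = missing_properties(PRODUCT_HTML, REQUIRED)
for prop in hints:
    print(f'hint: no element with itemprop="{prop}" found')
```

Something this small obviously doesn’t validate anything properly, but it shows that immediate feedback is cheap to provide compared with waiting for search results to change.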
I do agree with Peter Lind, but we have to add to this that many backend developers (hopefully that term suffices for the kind of devvers I’m talking about) just couldn’t be bothered with semantic markup. In my own work environment I’m seeing exactly that: unless there’s a reasonably good frontend developer around, the HTML is going to explode into “whatever works” code, rather than the stuff we love to see.
In part this is because backend developers have limited knowledge of HTML and related tech (which is understandable), but also because it’s simply not their field of interest. These guys usually know “tables are bad”, but hardly why exactly or what’s the best not-table-element to use.
Mostly, I think, it’s because it’s not in the task description. When “produce properly semantic code” is in the task description, it’s truly an exception. So perhaps higher-ups should get some of the blame.
I suppose the way to mitigate this particular cause of bad code is to simply have a frontend developer always assist in what may appear to be a purely serverside solution. But I realise there’s rarely ever enough budget for that.
To be honest, I think it’s way too easy to blame backend devs, lousy CMSes or management. If you honestly think you produce good code as a frontend dev, then that should include semantic markup as well. If the end-result does not include semantic markup, you’re also to blame – if for no other reason, then for not standing up to clients/management and insisting on doing things right. If clients don’t know about semantic markup and what could be gained there, it’s because you’re not doing a good enough job of telling them.
@Peter Lind
Unfortunately, it’s not always that simple. For one, the frontend developer at my company is often hardly involved anymore in a project that is in the stage of backend development. Secondly, backend devs have it easier: somehow for them it is “more okay” to say “it’s not possible to produce this and this HTML code”.
Trust me, I keep yelling about the latter. It’s bollocks of course. The backend needs to spew out HTML, and no CMS or other backend thing is to dictate what that HTML looks like… Until you find yourself in the real world, it seems.
In a commercial or production environment, it is vital to write good and clean code. The big issue is when SOMEONE ELSE has to work with your code or when you have to come back to it 6 months down the road and don’t remember squat. It’s obvious that those Japanese websites aren’t being maintained, mainly because it would be impossible with such crazy code (only the content is being updated).
If you don’t care about code quality, then you really shouldn’t be writing it, because it causes problems for anyone who has to touch it afterwards. I’m not saying you have to be a superstar either, but to care and do the best with what your tools will allow you to do is admirable and respectable.
Hey @James,
I just checked out your “Pulse plugin for jQuery” and became your #1 fan.
I love your articles and efforts.
By the way, I’m a BS (Hons) IT student currently doing my final-year project
(an HTML5-based game), and I found your articles (especially the first one, “My first “REAL” job”, as I’ll also need a development job in the coming days :p) and other stuff very useful.
Best wishes to you.
I hope we’ll be in touch from now on 🙂
Regards