-
Datatyping
- body
-
There is some work going on at the microformats site to provide a datatyping mechanism for microformats. Since XOXO is the most flexible format, it stands to gain much from such a system. While the work is not yet done, it has come a fair way since I first saw it. Here is a basic overview of the system as it applies to XOXO:
Add the class 'typed' to the root XOXO element to specify that this XOXO makes use of datatyping. Then, to each 'a' or 'dd' tag, add one of the following classes:
- 'string' for strings (this is the default if you do not specify a datatype class)
- 'boolean' for boolean values (should be in page as '1' or '0')
- 'int' or 'long' for integer values
- 'double' for decimal values
- 'datetime' for dates and times (should be encoded using the datetime-design-pattern)
- 'binary' for binary data (should be encoded using a data URI)
- 'nil' for NULL values
Note that other fields (such as the HREF and TITLE attributes) are always treated as strings; the datatype classes apply only to the direct contents of 'a' or 'dd' field tags.
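To make the class list above concrete, here is a minimal sketch in Python of how a consumer might read a typed XOXO fragment. The sample markup, the field names, and the function names are all made up for illustration, and the draft may well change; 'datetime' and 'binary' values are simply left as strings here.

```python
import xml.etree.ElementTree as ET

# Hypothetical typed XOXO fragment; the field names are invented for this example.
SAMPLE = """
<ol class="xoxo typed">
  <li>
    <dl>
      <dt>title</dt><dd>Hello</dd>
      <dt>published</dt><dd class="boolean">1</dd>
      <dt>comments</dt><dd class="int">4</dd>
      <dt>rating</dt><dd class="double">4.5</dd>
      <dt>parent</dt><dd class="nil"></dd>
    </dl>
  </li>
</ol>
"""

def convert(element):
    """Convert the direct text of an 'a' or 'dd' element using its datatype class."""
    classes = element.get("class", "").split()
    text = (element.text or "").strip()
    if "nil" in classes:
        return None
    if "boolean" in classes:
        return text == "1"
    if "int" in classes or "long" in classes:
        return int(text)
    if "double" in classes:
        return float(text)
    # 'string' is the default; 'datetime' and 'binary' stay raw in this sketch.
    return text

def parse_typed_xoxo(markup):
    """Return a dict of dt/dd pairs, typed only if the root carries class 'typed'."""
    root = ET.fromstring(markup)
    typed = "typed" in root.get("class", "").split()
    result = {}
    for dl in root.iter("dl"):
        key = None
        for child in dl:
            if child.tag == "dt":
                key = (child.text or "").strip()
            elif child.tag == "dd" and key is not None:
                result[key] = convert(child) if typed else (child.text or "").strip()
                key = None
    return result

print(parse_typed_xoxo(SAMPLE))
```

Without the 'typed' class on the root element, the same function returns every value as a plain string, which matches the default described above.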
-
The Importance of XML Well-formedness
- body
-
XHTML validity is a buzzword around the Internet, but many people generally agree that it is not all that important. It has its advantages, but it is not the end of the world if you can't quite get it. XML well-formedness, however, is very important. Why? Because it makes server-side hackery much easier. That may not be the only reason, but it is an important one. Some people have mastered the art of screen-scraping with RegExps, but I and others like me have never quite mastered that often-complicated technique. Instead, it is much easier to parse the webpage as XML and pull out the data that way. This works especially well when the page is known to conform to some standard (as in the code addition for Blogger Recent Comments).
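As a rough illustration of why this matters, here is a small Python sketch that pulls comment data out of a well-formed fragment with an ordinary XML parser and no RegExps. The markup, the 'comment' class name, and the function name are assumptions invented for the example, not the structure of any real template.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment of a comments list.
PAGE = """
<div id="comments">
  <div class="comment"><a href="http://example.com/alice">Alice</a><p>First!</p></div>
  <div class="comment"><a href="http://example.org/bob">Bob</a><p>Nice post &amp; thanks.</p></div>
</div>
"""

def extract_comments(markup):
    """Return one dict per <div class="comment"> found in the markup."""
    root = ET.fromstring(markup)
    comments = []
    for node in root.iter("div"):
        if node.get("class") != "comment":
            continue
        link = node.find("a")
        body = node.find("p")
        comments.append({
            "author": link.text,
            "url": link.get("href"),
            "text": body.text,
        })
    return comments

for comment in extract_comments(PAGE):
    print(comment["author"], "-", comment["text"])
```

If the page were not well-formed, the call to `ET.fromstring` would raise an error before any data could be read, which is the whole problem.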
While some leniency can be built in, here are some basic guidelines for keeping your pages well-formed and making our job that much easier:
- XHTML empty tags: Some tags, such as <br> and <link …>, used to be written in HTML exactly as you see them here. This breaks XML well-formedness. Instead, use <br />, <link … /> and the like. (Note to advanced users: this can be partially overcome using a RegExp line similar to $XMLdata = preg_replace('/<(img|meta|link|hr|br)([^<>]*?)([\/]?)>/i','<$1$2 />', $XMLdata); )
- Escaping ampersands: Many URLs contain the '&' character, and sometimes this character is used in content as well. If it is left unescaped, it breaks XML well-formedness. Use '&amp;' instead. (Note to advanced users: this can be mostly overcome using a RegExp line similar to $XMLdata = preg_replace('/&([^;]{10})/i','&amp;$1', $XMLdata); )
- Escaping scripts: JavaScript code will often contain characters that must be escaped in XML, but which cannot be escaped if the script is to work. To overcome this, add '//<![CDATA[' after every <script> tag and '//]]>' before every </script> tag.
- Closing tags: Some tags, such as <p>, are often inserted by web designers without a closing tag. Instead of '<p>text<p>more text' use '<p>text</p><p>more text</p>'. Note that XML is case-sensitive, so if you open a section with, say, <head>, you must end it with </head>, not </HEAD>.
- Quoting attributes: <p class="1">, not <p class=1>, etc. Quotation marks always go around attribute values, no matter what.
- Non-tag < >: If you reference a Blogger template tag (such as <$BlogID$>) or for some other reason need to include a < or > character in content, you must escape it as &lt; or &gt;, respectively.
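The two RegExp repairs noted in the first two guidelines can be sketched in Python as follows. This is not a straight port of the PHP lines: the empty-tag pattern gains a word boundary, and the ampersand pattern swaps the ten-character heuristic for a lookahead that leaves anything shaped like an entity alone. The sample string and the function name are made up for the example.

```python
import re
import xml.etree.ElementTree as ET

# Self-close the usual empty tags, whether or not they already end in "/>".
EMPTY_TAGS = re.compile(r'<(img|meta|link|hr|br)\b([^<>]*?)/?>', re.IGNORECASE)

# Match an '&' that does not start a named or numeric entity such as &amp; or &#160;.
BARE_AMP = re.compile(r'&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)')

def repair(markup):
    """Apply both repairs; other well-formedness problems are left untouched."""
    markup = EMPTY_TAGS.sub(r'<\1\2 />', markup)
    markup = BARE_AMP.sub('&amp;', markup)
    return markup

# Hypothetical input with an unclosed <br>, an unclosed <img>, and two bare ampersands.
SAMPLE = '<p>Fish & chips<br><img src="a.png?x=1&y=2"></p>'

fixed = repair(SAMPLE)
print(fixed)

# The repaired string now parses as XML.
root = ET.fromstring(fixed)
print(root.find("img").get("src"))
```

Neither pattern helps with unclosed <p> tags, unquoted attributes, or unescaped script contents, so the remaining guidelines still have to be followed by hand.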
A note about content: Blogger's post form and comment form are not very good at checking XML well-formedness. If you want to maintain an (at least mostly) well-formed page, you must follow these rules in any code entered in these forms. For example, if you enter a < character in the Blogger post form, it is not escaped for you; you must actually enter &lt;, and the same goes for the comment form. This is sometimes annoying if you are trying to maintain full XML well-formedness, because a well-meaning commenter can break it and you must then go and edit their comment. This is not usually the biggest problem, however, since it is usually one of the first two problems, which can be overcome as noted. You can check for XML well-formedness without validating XHTML using this tool.
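For a quick local check, a few lines of Python can report whether a string is well-formed XML without doing any XHTML validation. This is only a sketch and the function name is hypothetical; it relies on the standard library parser, which stops at the first error it meets.

```python
import xml.etree.ElementTree as ET

def check_well_formed(markup):
    """Return (True, None) if markup parses as XML, else (False, (line, column))."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        # err.position is the (line, column) where the parser gave up.
        return False, err.position
    return True, None

# A few of the problems from the guidelines above.
for snippet in ('<p>fine</p>', '<p>a < b</p>', '<p class=1>x</p>', '<p>one<p>two'):
    print(repr(snippet), check_well_formed(snippet))
```

Running a comment through a check like this before publishing it would catch a stray < or an unquoted attribute before it breaks the page.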






