This is not an SEO question.
I am curious how to mark up HTML in a semantically correct way with regard to the language used. Please correct me if my markup is mistaken.
My question is: do I need the lang attribute on the html tag when I already use the hreflang attribute on the link tag?
Are the two directives semantically different? I mean: will the self-reference in the link tag in both examples be understood semantically as indicating the language of the document?
The code samples below might clarify my question a bit:
Example of an English webpage
http://example.com/en/
<!DOCTYPE html>
<html lang="en">
<head>
<title>English webpage</title>
<link rel="canonical" href="http://example.com/en">
<link rel="alternate" href="http://example.com/en/" hreflang="en">
<link rel="alternate" href="http://example.com/nl/" hreflang="nl">
<link rel="alternate" href="http://example.com/en/" hreflang="x-default">
</head>
<body>
<p>This is a webpage written in English.
This page is also available in Dutch.
The default language of this page is English.
</body>
</html>
Example of a Dutch webpage
http://example.com/nl/
<!DOCTYPE html>
<html lang="nl">
<head>
<title>Nederlandse webpagina</title>
<link rel="canonical" href="http://example.com/nl">
<link rel="alternate" href="http://example.com/en/" hreflang="en">
<link rel="alternate" href="http://example.com/nl/" hreflang="nl">
<link rel="alternate" href="http://example.com/en/" hreflang="x-default">
</head>
<body>
<p>Dit is een Nederlandstalige web pagina.
Deze pagina is beschikbaar in het Engels.
De standaardtaal van deze pagina is Engels.
</body>
</html>
You should always provide the lang attribute on the html element.
Two reasons relevant to your case:
The HTML spec describes how the language of a node gets determined. The hreflang attribute plays no role here.
If you don’t provide lang on the html element, this node has no language.
An alternate+hreflang link is only interpreted to point to a translation of the current document if the value of link-hreflang differs from the value of html-lang:
If the alternate keyword is used with the hreflang attribute, and that attribute’s value differs from the root element’s language, it indicates that the referenced document is a translation.
If you don’t provide lang on the html element, the alternate+hreflang links are not considered to point to translations.
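Both reasons can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not how any particular user agent works: the names AlternateLinkCollector and translations are made up for this example, the document language is read from the lang attribute on html alone (any fallback the spec allows, such as language information from HTTP, is left out), and skipping x-default is this sketch's own choice, since that value is a search-engine convention rather than a language.

```python
from html.parser import HTMLParser


class AlternateLinkCollector(HTMLParser):
    """Collects the root element's lang and all alternate+hreflang links."""

    def __init__(self):
        super().__init__()
        self.root_lang = None  # taken from <html lang="...">, never from hreflang
        self.alternates = []   # (href, hreflang) pairs

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "html":
            self.root_lang = attributes.get("lang")
        elif tag == "link":
            rel_keywords = (attributes.get("rel") or "").lower().split()
            hreflang = attributes.get("hreflang")
            if "alternate" in rel_keywords and hreflang:
                self.alternates.append((attributes.get("href"), hreflang))


def translations(markup):
    """Return the hrefs that count as translations of the document."""
    parser = AlternateLinkCollector()
    parser.feed(markup)
    if not parser.root_lang:
        # Assumption following the answer: without a root language,
        # no alternate link is treated as a translation.
        return []
    root = parser.root_lang.lower()
    return [
        href
        for href, hreflang in parser.alternates
        # Language tags are compared case-insensitively; x-default is skipped
        # because it does not name a language (a choice made by this sketch).
        if hreflang.lower() not in (root, "x-default")
    ]


# Head section of the English example from the question.
EN_PAGE = """<!DOCTYPE html>
<html lang="en">
<head>
<link rel="alternate" href="http://example.com/en/" hreflang="en">
<link rel="alternate" href="http://example.com/nl/" hreflang="nl">
<link rel="alternate" href="http://example.com/en/" hreflang="x-default">
</head>
</html>"""

print(translations(EN_PAGE))                             # with lang on html
print(translations(EN_PAGE.replace(' lang="en"', "")))   # lang removed
```

With lang="en" present, only the Dutch URL is reported as a translation. With the attribute removed, the sketch reports none, which mirrors the point made above.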
Even if a user agent deduces the language of the document by taking self-referential¹ alternate+hreflang links into account, there are situations in which this could fail:
If the HTML document gets opened locally, it no longer has an HTTP URL, so a user agent can’t deduce that the alternate+hreflang link refers to this document.
If the HTML document gets retrieved over a different URL (e.g., with tracking parameters), the alternate+hreflang link no longer refers to the current URL, so a user agent can’t deduce that it applies to this URL, too.
(With a canonical link, both situations could be mitigated, but that’s one more thing a user agent would have to support. Not all do.)
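The two failure cases, and the canonical mitigation, can be sketched as a naive URL comparison in Python. This is an assumption about how a user agent might detect a self-reference, not a documented algorithm. The function name is_self_reference, the tracking parameter, and the local file path are all hypothetical.

```python
from urllib.parse import urljoin


def is_self_reference(document_url, href, canonical_url=None):
    """Naive check: does the link resolve to the document's own URL?

    If a canonical URL is supplied, compare against it instead of the URL
    the document was actually loaded from (the mitigation mentioned above).
    """
    reference = canonical_url or document_url
    # Resolve relative hrefs against the URL the document was loaded from.
    return urljoin(document_url, href) == reference


ALTERNATE_EN = "http://example.com/en/"

# Loaded from its regular URL: the link is recognised as a self-reference.
print(is_self_reference("http://example.com/en/", ALTERNATE_EN))

# Loaded with a (hypothetical) tracking parameter: the exact match fails.
print(is_self_reference("http://example.com/en/?utm_source=newsletter", ALTERNATE_EN))

# Opened from a (hypothetical) local file: no HTTP URL, so no match.
print(is_self_reference("file:///home/user/en.html", ALTERNATE_EN))

# With a canonical URL taken into account, the local copy matches again.
print(is_self_reference("file:///home/user/en.html", ALTERNATE_EN,
                        canonical_url="http://example.com/en/"))
```

An exact string comparison like this is strict: the canonical URL in the question's markup (http://example.com/en, without the trailing slash) would not match the alternate URL http://example.com/en/ unless the user agent normalised both first.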
¹ Strictly speaking, a self-referential alternate+hreflang hyperlink is not semantic, because alternate is defined to refer to "an alternate representation of the current document", but a document is of course not an alternate representation of itself. However, as Google Search documents its use, it’s now common to see this markup.