For HTML 4, the answer is technically:
ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods (".").
HTML 5 is even more permissive, saying only that an id must contain at least one character and may not contain any space characters.
The id attribute is case sensitive in XHTML.
As a purely practical matter, you may want to avoid certain characters. Periods, colons and '#' have special meaning in CSS selectors, so you will have to escape those characters using a backslash in CSS or a double backslash in a selector string passed to jQuery. Think about how often you will have to escape a character in your stylesheets or code before you go crazy with periods and colons in ids.
For example, the HTML declaration <div id="first.name"></div>
is valid. You can select that element in CSS as #first\.name
and in jQuery like so: $('#first\\.name').
But if you forget the backslash, $('#first.name')
, you will have a perfectly valid selector looking for an element with id first
and also having class name
. This is a bug that is easy to overlook. You might be happier in the long run choosing the id first-name
(a hyphen rather than a period), instead.
You can simplify your development tasks by strictly sticking to a naming convention. For example, if you limit yourself entirely to lower-case characters and always separate words with either hyphens or underscores (but not both, pick one and never use the other), then you have an easy-to-remember pattern. You will never wonder "was it firstName
or FirstName
?" because you will always know that you should type first_name
. Prefer camel case? Then limit yourself to that, no hyphens or underscores, and always, consistently use either upper-case or lower-case for the first character, don't mix them.
A now very obscure problem was that at least one browser, Netscape 6, incorrectly treated id attribute values as case-sensitive. That meant that if you had typed id="firstName"
in your HTML (lower-case 'f') and #FirstName { color: red }
in your CSS (upper-case 'F'), that buggy browser would have failed to set the element's color to red. At the time of this edit, April 2015, I hope you aren't being asked to support Netscape 6. Consider this a historical footnote.
I use javascript:void(0)
.
Three reasons. Encouraging the use of #
amongst a team of developers inevitably leads to some using the return value of the function called like this:
function doSomething() {
//Some code
return false;
}
But then they forget to use return doSomething()
in the onclick and just use doSomething()
.
A second reason for avoiding #
is that the final return false;
will not execute if the called function throws an error. Hence the developers have to also remember to handle any error appropriately in the called function.
A third reason is that there are cases where the onclick
event property is assigned dynamically. I prefer to be able to call a function or assign it dynamically without having to code the function specifically for one method of attachment or another. Hence my onclick
(or on anything) in HTML markup look like this:
onclick="someFunc.call(this)"
OR
onclick="someFunc.apply(this, arguments)"
Using javascript:void(0)
avoids all of the above headaches, and I haven't found any examples of a downside.
So if you're a lone developer then you can clearly make your own choice, but if you work as a team you have to either state:
Use href="#"
, make sure onclick
always contains return false;
at the end, that any called function does not throw an error and if you attach a function dynamically to the onclick
property make sure that as well as not throwing an error it returns false
.
OR
Use href="javascript:void(0)"
The second is clearly much easier to communicate.
Best Solution
Parsing HTML reliably using regexp is known to be notoriously difficult.
I think I would be looking for a HTML parsing library, or a "screen scraping" library ;)
If the HTML comes from an unreliable source, you have to be extra careful to handle malicious HTML syntax well. Bad HTML handling is a major source of security attacks.