Monday, September 6, 2010

Hyphen usage in package names

As always, the simple things in life make everything complicated. A co-worker came up with this question, and I would like to answer it and point you to some additional links.
Do we have a preferred form for our Java package names? Do we use the short form:
- de.xxxx.* or the long form
- com.xxxx-xxxx.* ?

The technical answer
Trivial? Not necessarily. The actual question is hidden. If you simply look at the semantics, you would tend to choose the "short form" (the de.xxxx). But let's look at the syntax first.
The true question is: Does the Java Language Specification (JLS) allow hyphens (dashes) in package names? A quick look into 7.7 Unique Package Names tells us the following about package names in general:
You form a unique package name by first having (or belonging to an organization that has) an Internet domain name, such as sun.com. You then reverse this name, component by component, to obtain, in this example, com.sun, and use this as a prefix for your package names, using a convention developed within your organization to further administer package names.
[...]
If the domain name contains a hyphen, or any other special character not allowed in an identifier (§3.8), convert it into an underscore.
(Source: JLS, 3rd edition)
Now you have to look at 3.8 Identifiers to find out about "special characters".
An identifier is an unlimited-length sequence of Java letters and Java digits, the first of which must be a Java letter.
[...]
The Java letters include uppercase and lowercase ASCII Latin letters A-Z (\u0041-\u005a), and a-z (\u0061-\u007a), and, for historical reasons, the ASCII underscore (_, or \u005f) and dollar sign ($, or \u0024).
You are right, the simple answer is: You are NOT allowed to use the hyphen in package names. If you have a unique package name built from a domain name containing a hyphen, you have to translate the hyphen to an underscore.
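
To see the rule from 3.8 in action, here is a minimal sketch that tests whether a single package name component consists only of Java letters and digits. The class name HyphenCheck and the method isIdentifier are made up for this illustration; the sketch relies on the standard Character.isJavaIdentifierStart and Character.isJavaIdentifierPart methods and deliberately ignores reserved keywords.

```java
class HyphenCheck {

    // Hypothetical helper: true if s starts with a Java letter and
    // continues with Java letters or digits only (keywords are not checked).
    static boolean isIdentifier(String s) {
        if (s.isEmpty() || !Character.isJavaIdentifierStart(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            if (!Character.isJavaIdentifierPart(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The hyphen is not a Java letter or digit, the underscore is.
        System.out.println(isIdentifier("xxxx-xxxx"));
        System.out.println(isIdentifier("xxxx_xxxx"));
    }
}
```

The first call reports the hyphenated component as invalid, the second accepts the underscore variant.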

For the com.xxxx-xxxx example above, this will look like:
- com.xxxx_xxxx.*
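
The translation described in 7.7 can be sketched in a few lines. This is only an illustration under some assumptions: the class PackagePrefix and the method fromDomain are hypothetical names, and the sketch covers just the reversal and the underscore replacement. The other cases the JLS mentions (components that are keywords or that start with a digit) are not handled here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class PackagePrefix {

    // Hypothetical helper: reverses the domain components and replaces every
    // character that is not allowed in an identifier with an underscore.
    static String fromDomain(String domain) {
        List<String> parts = Arrays.asList(domain.toLowerCase(Locale.ROOT).split("\\."));
        Collections.reverse(parts);

        StringBuilder prefix = new StringBuilder();
        for (String part : parts) {
            if (prefix.length() > 0) {
                prefix.append('.');
            }
            for (char c : part.toCharArray()) {
                prefix.append(Character.isJavaIdentifierPart(c) ? c : '_');
            }
        }
        return prefix.toString();
    }

    public static void main(String[] args) {
        System.out.println(fromDomain("xxxx-xxxx.com"));
        System.out.println(fromDomain("sun.com"));
    }
}
```

For the placeholder domain from the question, the sketch yields the com.xxxx_xxxx prefix shown above.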

The developer answer
Reading all this, the rule somehow feels outdated. Given the massive number of registered domain names, it is likely that the hyphen will become a more and more common part of our namespaces. Even if it technically does not make a big difference whether you use a "-" or a "_", there are some arguments against the underscore:
- You have to press an additional key (Shift) to type the underscore.
- It simply does not correspond to the domain name.
- It looks like a typo.
- The _ has to be "translated" back to get the real domain name.

Related to the concrete question, there are some more considerations:
- The de.xxx obviously is shorter
- The com.xxx sounds more international

The answer to the question is: Use the de.xxxx! It is shorter and correct. If any of the marketing guys tries to force you to pick the second option: resist! It is longer, makes development harder and simply does not feel right. At least to me.
Do you have your own thoughts about this? I would be glad to hear them.