Java – Why does a new SimpleDateFormat object contain calendar with the wrong year

calendar, date, java, simpledateformat

I came upon a strange behavior that has left me curious and without a satisfactory explanation as yet.

For simplicity, I've reduced the symptoms I've noticed to the following code:

import java.text.SimpleDateFormat;
import java.util.GregorianCalendar;

public class CalendarTest {
    public static void main(String[] args) {
        System.out.println(new SimpleDateFormat().getCalendar());
        System.out.println(new GregorianCalendar());
    }
}

When I run this code, I get something very similar to the following output:

java.util.GregorianCalendar[time=-1274641455755,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=1929,MONTH=7,WEEK_OF_YEAR=32,WEEK_OF_MONTH=2,DAY_OF_MONTH=10,DAY_OF_YEAR=222,DAY_OF_WEEK=7,DAY_OF_WEEK_IN_MONTH=2,AM_PM=1,HOUR=8,HOUR_OF_DAY=20,MINUTE=55,SECOND=44,MILLISECOND=245,ZONE_OFFSET=-28800000,DST_OFFSET=0]
java.util.GregorianCalendar[time=1249962944248,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2009,MONTH=7,WEEK_OF_YEAR=33,WEEK_OF_MONTH=3,DAY_OF_MONTH=10,DAY_OF_YEAR=222,DAY_OF_WEEK=2,DAY_OF_WEEK_IN_MONTH=2,AM_PM=1,HOUR=8,HOUR_OF_DAY=20,MINUTE=55,SECOND=44,MILLISECOND=248,ZONE_OFFSET=-28800000,DST_OFFSET=3600000]

(The same thing happens if I provide a valid format string like "yyyy-MM-dd" to SimpleDateFormat.)

Forgive the horrendous non-wrapping lines, but it's the easiest way to compare the two. If you scroll to about 2/3rds of the way over, you'll see that the calendars have YEAR values of 1929 and 2009, respectively. (There are a few other differences, such as week of year, day of week, and DST offset.) Both are obviously instances of GregorianCalendar, but the reason why they differ is puzzling.

From what I can tell, the formatter produces accurate results when formatting Date objects passed to it. Obviously, correct functionality is more important than the correct reference year, but the discrepancy is disconcerting nonetheless. I wouldn't think that I'd have to set the calendar on a brand-new date formatter just to get the current year…
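To make the "formatting still works" observation concrete, here is a minimal sketch. The class and method names are made up for illustration, and the UTC time zone and Locale.US are pinned only so the result does not depend on the machine it runs on.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class FormatIgnoresInitialYear {
    /** Formats the Unix epoch (time = 0) with a brand-new formatter. */
    static String formatEpochDate() {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        // Pin the zone so the epoch cannot roll back to 1969-12-31 locally.
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        // The Date passed in determines the output; whatever year the
        // formatter's internal calendar held beforehand does not show up.
        return format.format(new Date(0L));
    }

    public static void main(String[] args) {
        System.out.println(formatEpochDate()); // 1970-01-01
    }
}
```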

I've tested this on Macs with Java 5 (OS X 10.4, PowerPC) and Java 6 (OS X 10.6, Intel) with the same results. Since this is a Java library API, I assume it behaves the same on all platforms. Any insight on what's afoot here?

(Note: This SO question is somewhat related, but not the same.)


Edit:

The answers below all helped explain this behavior. It turns out that the Javadocs for SimpleDateFormat actually document this to some degree:

"For parsing with the abbreviated year pattern ("y" or "yy"), SimpleDateFormat must interpret the abbreviated year relative to some century. It does this by adjusting dates to be within 80 years before and 20 years after the time the SimpleDateFormat instance is created."

So, instead of getting fancy with the year of the date being parsed, they just set the internal calendar back 80 years by default. That part isn't documented per se, but when you know about it, the pieces all fit together.
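The window described in that Javadoc passage can be observed through the public set2DigitYearStart() method. The following is only a sketch: the class and helper names are hypothetical, and the window start is fixed at January 1 of a chosen year (rather than "now minus 80 years") so the results are repeatable.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

public class TwoDigitYearWindow {
    /**
     * Parses text such as "01/12/14" with pattern "MM/dd/yy", using a
     * 100-year window that starts on January 1 of windowStartYear, and
     * returns the four-digit year the parser settled on.
     */
    static int parseYear(String text, int windowStartYear) {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        SimpleDateFormat format = new SimpleDateFormat("MM/dd/yy", Locale.US);
        format.setTimeZone(utc);

        // Build the window start explicitly instead of relying on the
        // default of "creation time minus 80 years".
        GregorianCalendar start = new GregorianCalendar(utc, Locale.US);
        start.clear();
        start.set(windowStartYear, Calendar.JANUARY, 1);
        format.set2DigitYearStart(start.getTime());

        GregorianCalendar result = new GregorianCalendar(utc, Locale.US);
        try {
            result.setTime(format.parse(text));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + text, e);
        }
        return result.get(Calendar.YEAR);
    }

    public static void main(String[] args) {
        System.out.println(parseYear("01/12/14", 1929)); // 2014
        System.out.println(parseYear("01/12/30", 1929)); // 1930
        System.out.println(parseYear("01/12/14", 2000)); // 2014
    }
}
```

With a window starting in 1929, "14" lands in 2014 while "30" lands in 1930, because both must fall within the 100 years following the start date.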

Best Solution

I'm not sure why Tom says "it's something to do with serialization", but he has the right line:

private void initializeDefaultCentury() {
    calendar.setTime( new Date() );
    calendar.add( Calendar.YEAR, -80 );
    parseAmbiguousDatesAsAfter(calendar.getTime());
}

It's line 813 in SimpleDateFormat.java, which is very late in the process. Up to that point, the year is correct (as is the rest of the date part), then it's decremented by 80.

Aha!

The call to parseAmbiguousDatesAsAfter() is the same private function that set2DigitYearStart() calls:

/* Define one-century window into which to disambiguate dates using
 * two-digit years.
 */
private void parseAmbiguousDatesAsAfter(Date startDate) {
    defaultCenturyStart = startDate;
    calendar.setTime(startDate);
    defaultCenturyStartYear = calendar.get(Calendar.YEAR);
}

/**
 * Sets the 100-year period 2-digit years will be interpreted as being in
 * to begin on the date the user specifies.
 *
 * @param startDate During parsing, two digit years will be placed in the range
 * <code>startDate</code> to <code>startDate + 100 years</code>.
 * @see #get2DigitYearStart
 * @since 1.2
 */
public void set2DigitYearStart(Date startDate) {
    parseAmbiguousDatesAsAfter(startDate);
}

Now I see what's going on. Peter, in his comment about "apples and oranges", was right! The year in SimpleDateFormat is the first year of the "default century", the range into which a two-digit year string (e.g., "1/12/14") is placed when parsed. See the Javadoc for get2DigitYearStart(): http://java.sun.com/j2se/1.4.2/docs/api/java/text/SimpleDateFormat.html#get2DigitYearStart%28%29

So in a triumph of "efficiency" over clarity, the year in the SimpleDateFormat is used to store "the start of the 100-year period into which two digit years are parsed", not the current year!
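One way to check this conclusion without reading the JDK source is to compare a fresh formatter's calendar with the value returned by get2DigitYearStart(). This is a sketch with hypothetical class and method names, and it relies on the implementation detail quoted above (the constructor leaving the calendar on the window start), which the API does not promise.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class DefaultCenturyCheck {
    /**
     * Returns true if a freshly constructed formatter's internal calendar
     * holds exactly the start of its two-digit-year window.
     */
    static boolean calendarHoldsCenturyStart() {
        // Locale.US is an assumption made to get a Gregorian calendar.
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        Date calendarTime = format.getCalendar().getTime();
        Date windowStart = format.get2DigitYearStart();
        return calendarTime.equals(windowStart);
    }

    public static void main(String[] args) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        System.out.println("calendar:     " + format.getCalendar().getTime());
        System.out.println("window start: " + format.get2DigitYearStart());
        System.out.println("same instant: " + calendarHoldsCenturyStart());
    }
}
```

If the two dates match, the odd year printed by getCalendar() is simply the window start, roughly 80 years before the formatter was created.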

Thanks, this was fun -- and finally got me to install the JDK source (I only have 4GB total space on my / partition).
