Times and Dates in Python
Overview
Time is an essential component of nearly all geoscience data. Timescales span orders of magnitude, from microseconds for lightning, to hours for a supercell thunderstorm, to days for a global weather model, to millennia and beyond for the Earth’s climate. To properly analyze geoscience data, you must have a firm understanding of how to handle time in Python.
In this notebook, we will:
Introduce the time and datetime modules from the Python Standard Library
Look at formatted input and output of dates and times
See how we can do simple arithmetic on date/time data, making use of the timedelta object
Briefly make use of the pytz module to handle some thorny time zone issues in Python
Prerequisites
| Concepts | Importance | Notes |
| --- | --- | --- |
| | Necessary | Understanding strings |
| Basic Python string formatting | Helpful | |
Time to learn: 30 minutes
Imports
# Python Standard Library packages
# We'll discuss below WHY we alias the packages this way
import datetime as dt
import math
import time as tm
# Third-party package for time zone handling, we'll discuss below!
import pytz
Time Versus Datetime modules
Some core terminology
Python comes with time and datetime modules as part of the Standard Library included with every Python installation. Unfortunately, Python can be initially disorienting because of the heavily overlapping terminology concerning dates and times:
datetime module has a datetime class
datetime module has a time class
datetime module has a date class
time module has a time function which returns (almost always) Unix time
datetime class has a date method which returns a date object
datetime class has a time method which returns a time object
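To make the overlapping names concrete, here is a minimal sketch that touches each item in the list above. The variable names (moment, clock, day) are ours, chosen only for illustration.

```python
import datetime as dt
import time as tm

# Classes that live in the datetime MODULE
moment = dt.datetime(2021, 3, 14, 15, 9, 26)  # datetime class
clock = dt.time(15, 9, 26)                    # time class
day = dt.date(2021, 3, 14)                    # date class

# Methods on the datetime CLASS that return date and time objects
print(moment.date() == day)    # True
print(moment.time() == clock)  # True

# The time() FUNCTION in the time module returns Unix time as a float
unix_now = tm.time()
print(type(unix_now))          # <class 'float'>
```

With the dt and tm aliases, each line shows at a glance whether a name comes from the datetime module or the time module.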
This confusion can be partially alleviated by aliasing our imported modules, as we did above:
import datetime as dt
import time as tm
We can now reference the datetime module (aliased to dt) and the datetime class within it unambiguously.
pisecond = dt.datetime(2021, 3, 14, 15, 9, 26)
print(pisecond)
2021-03-14 15:09:26
Our variable pisecond now stores a particular date and time, which just happens to be \(\pi\)-day 2021 down to the nearest second (3.1415926…).
now = tm.time()
print(now)
1667937904.2637124
The variable now holds the current time in seconds since January 1, 1970 00:00 UTC (see What is Unix Time below).
time module
The time module is well-suited for measuring Unix time. For example, when you are calculating how long it takes a Python function to run (so-called “benchmarking”), you can employ the time() function from the time module to obtain Unix time before and after the function completes and take the difference of those two times.
start = tm.time()
tm.sleep(1) # The sleep function will stop the program for n seconds
end = tm.time()
diff = end - start
print(f"The benchmark took {diff} seconds")
The benchmark took 1.0012857913970947 seconds
Info
For more accurate benchmarking, see the timeit module.
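As a rough sketch of what more careful benchmarking might look like, the example below times a small function with both tm.perf_counter() and the timeit module. The function slow_sum is hypothetical and exists only to give us something to measure.

```python
import time as tm
import timeit


def slow_sum(n):
    """Hypothetical function to benchmark: sum the first n integers."""
    total = 0
    for i in range(n):
        total += i
    return total


# perf_counter() is a high-resolution clock intended for measuring intervals
start = tm.perf_counter()
result = slow_sum(100_000)
elapsed = tm.perf_counter() - start
print(f"One call took {elapsed:.6f} seconds")

# timeit calls the function repeatedly and returns the total time in seconds
total_time = timeit.timeit(lambda: slow_sum(100_000), number=20)
print(f"Average over 20 calls: {total_time / 20:.6f} seconds")
```

Averaging over many calls smooths out the run-to-run noise that a single measurement is subject to.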
What is Unix Time?
Unix time is an example of system time, which is the computer’s notion of passing time. It is measured in seconds from the start of the epoch, which is January 1, 1970 00:00 UTC. It is represented “under the hood” as a floating point number, which is how computers represent real (ℝ) numbers.
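A short sketch can make the epoch tangible. Here we convert Unix time 0, and the timestamp printed earlier in this notebook, into UTC datetime objects using datetime.fromtimestamp(). The variable names are ours.

```python
import datetime as dt

# Unix time 0 is the start of the epoch
epoch = dt.datetime.fromtimestamp(0, tz=dt.timezone.utc)
print(epoch)  # 1970-01-01 00:00:00+00:00

# The value printed by tm.time() earlier, converted to a UTC datetime
stamp = 1667937904.2637124
when = dt.datetime.fromtimestamp(stamp, tz=dt.timezone.utc)
print(when.replace(microsecond=0))  # 2022-11-08 20:05:04+00:00

# Going the other way: a time zone aware datetime back to Unix time
round_trip = when.timestamp()
print(round_trip)
```

Passing tz=dt.timezone.utc keeps the result independent of the local time zone of the machine running the code.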
datetime module
The datetime module handles time with the Gregorian calendar (the calendar we are all familiar with) and is independent of Unix time. The datetime module has an object-oriented approach with the date, time, datetime, timedelta, and tzinfo classes.
date class represents the day, month and year
time class represents the time of day
datetime class is a combination of the date and time classes
timedelta class represents a time duration
tzinfo (abstract) class represents time zones
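The following sketch creates one instance of each class listed above, so you can see how they fit together. The specific values are arbitrary and chosen only for illustration.

```python
import datetime as dt

d = dt.date(2021, 3, 14)               # day, month and year
t = dt.time(15, 9, 26)                 # time of day
both = dt.datetime.combine(d, t)       # a date and a time together
span = dt.timedelta(days=1, hours=12)  # a duration of 36 hours
utc = dt.timezone.utc                  # an instance of a concrete tzinfo subclass

print(both)                        # 2021-03-14 15:09:26
print(both + span)                 # 2021-03-16 03:09:26
print(isinstance(utc, dt.tzinfo))  # True
```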
The datetime module is effective for:
performing date and time arithmetic and calculating time duration
reading and writing date and time strings with various formats
handling time zones (with the help of third-party libraries)
The time and datetime modules overlap in functionality, but in your geoscientific work, you will probably be using the datetime module more than the time module.
We’ll delve into more details below, but here’s a quick example of writing out our pisecond datetime object as a formatted string. Suppose we wanted to write out just the date, and write it in the month/day/year format typically used in the US. We can use the strftime() method with a format specifier:
print('Pi day occurred on:', pisecond.strftime(format='%m/%d/%Y'))
Pi day occurred on: 03/14/2021
Reading and writing dates and times
Parsing lightning data timestamps with the datetime.strptime method
Suppose you want to analyze US NLDN lightning data. Here is a sample row of data:
06/27/07 16:18:21.898 18.739 -88.184 0.0 kA 0 1.0 0.4 2.5 8 1.2 13 G
Part of the task involves parsing the 06/27/07 16:18:21.898 time string into a datetime object. (The full description of the data is here.) In order to parse this string or others that follow the same format, you will employ the datetime.strptime() method from the datetime module. This method takes two arguments:
the date time string you wish to parse
the format which describes exactly how the date and time are arranged.
The full range of format options is described in the Python documentation. In reality, the format will take some degree of experimentation to get right. This is a situation where Python shines as you can quickly try out different solutions in the IPython interpreter (or in a notebook). Beyond the official documentation, Google and Stack Overflow are your friends in this process.
Eventually, after some trial and error, you will find the '%m/%d/%y %H:%M:%S.%f' format will properly parse the date and time. Here %y matches the two-digit year and %f the fractional seconds.
strike_time = dt.datetime.strptime('06/27/07 16:18:21.898', '%m/%d/%y %H:%M:%S.%f')
# print strike_time to see if we have properly parsed our time
print(strike_time)
2007-06-27 16:18:21.898000
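In practice you would parse the whole row, not just a hand-copied timestamp. The sketch below splits the sample row on whitespace and parses the first two fields. Treating the third and fourth fields as latitude and longitude is our assumption; check it against the data description before relying on it.

```python
import datetime as dt

row = "06/27/07 16:18:21.898 18.739 -88.184 0.0 kA 0 1.0 0.4 2.5 8 1.2 13 G"

fields = row.split()

# The first two whitespace-separated fields are the date and the time
stamp = " ".join(fields[:2])
strike_time = dt.datetime.strptime(stamp, "%m/%d/%y %H:%M:%S.%f")

# ASSUMPTION: fields 3 and 4 are latitude and longitude in decimal degrees
latitude = float(fields[2])
longitude = float(fields[3])

print(strike_time, latitude, longitude)
```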
Example usage of the datetime object
Why did we bother doing this? It might look like all we’ve done here is take the string 06/27/07 16:18:21.898 and reformat it to 2007-06-27 16:18:21.898000.
But in fact our variable strike_time is a datetime object that we can manipulate in many useful ways.
A few quick examples:
Controlling the output format with strftime()
Suppose we want to write out just the time (not the date) in a particular format like this:
16h 18m 21s
We can do this with the datetime.strftime() method, which uses the same format codes we employed for strptime(). After some trial and error from the IPython interpreter, we arrive at '%Hh %Mm %Ss':
print(strike_time.strftime(format='%Hh %Mm %Ss'))
16h 18m 21s
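A few more output styles may be useful to see side by side. This is only a sketch of some common options, not an exhaustive list; strike_time is recreated here so the block runs on its own.

```python
import datetime as dt

strike_time = dt.datetime(2007, 6, 27, 16, 18, 21, 898000)

# ISO 8601 output, an unambiguous format for exchanging data
print(strike_time.isoformat())                         # 2007-06-27T16:18:21.898000
print(strike_time.isoformat(timespec="milliseconds"))  # 2007-06-27T16:18:21.898

# Year and day of year (%j), often seen in geoscience file names
print(strike_time.strftime("%Y-%j %H:%M:%S"))          # 2007-178 16:18:21

# An f-string accepts the same format codes after the colon
print(f"{strike_time:%Y-%m-%d}")                       # 2007-06-27

# Writing with a format and parsing with the same format is a round trip
text = strike_time.strftime("%m/%d/%y %H:%M:%S.%f")
restored = dt.datetime.strptime(text, "%m/%d/%y %H:%M:%S.%f")
print(restored == strike_time)                         # True
```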
Calculating coastal tides with the timedelta class
Let’s suppose we are looking at coastal tide and current data, perhaps in a tropical cyclone storm surge scenario.
The lunar day is 24 hours, 50 minutes with two low tides and two high tides in that time duration. If we know the time of the current high tide, we can easily calculate the occurrence of the next low and high tides with the timedelta class. (In reality, the exact time of tides is influenced by local coastal effects, in addition to the laws of celestial mechanics, but we will ignore that fact for this exercise.)
The timedelta class is initialized by supplying a time duration, usually with keyword arguments that clearly express the length of time. Significantly, you can use the timedelta class with arithmetic operators (i.e., +, -, *, /) to obtain new dates and times, as the next code sample illustrates.
This convenient language feature is known as operator overloading, and it again illustrates Python’s batteries-included philosophy of making life easier for the programmer. (In another language such as Java, you would have to call a method, significantly obfuscating the code.)
Another great feature is that the difference of two datetime objects yields a timedelta object. Let’s examine all these features in the following code block.
high_tide = dt.datetime(2016, 6, 1, 4, 38, 0)
lunar_day = dt.timedelta(hours=24, minutes=50)
tide_duration = lunar_day / 4 # Here we do some arithmetic on the timedelta object!
# Here we add a timedelta object to a datetime object
next_low_tide = high_tide + tide_duration
next_high_tide = high_tide + (2 * tide_duration) # and so on
tide_length = next_high_tide - high_tide
print(f"The time between high and low tide is {tide_duration}.")
print(f"The current high tide is {high_tide}.")
print(f"The next low tide is {next_low_tide}.")
print(f"The next high tide is {next_high_tide}.")
print(f"The tide length is {tide_length}.")
print(f"The type of the 'tide_length' variable is {type(tide_length)}.")
The time between high and low tide is 6:12:30.
The current high tide is 2016-06-01 04:38:00.
The next low tide is 2016-06-01 10:50:30.
The next high tide is 2016-06-01 17:03:00.
The tide length is 12:25:00.
The type of the 'tide_length' variable is <class 'datetime.timedelta'>.
In the last print statement, we use the type() built-in Python function to show that the difference between two times yields a timedelta object.
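Because a timedelta can be multiplied by an integer, the same idea extends to a whole sequence of tides. The sketch below lists the alternating high and low tides over one lunar day, under the same idealized assumption used above, and converts the interval to hours with total_seconds().

```python
import datetime as dt

high_tide = dt.datetime(2016, 6, 1, 4, 38, 0)
tide_duration = dt.timedelta(hours=24, minutes=50) / 4

# Multiply the timedelta by 0, 1, 2, ... to step through successive tides
tides = [high_tide + n * tide_duration for n in range(5)]
for n, tide in enumerate(tides):
    kind = "high" if n % 2 == 0 else "low"
    print(f"{kind:>4} tide: {tide}")

# total_seconds() turns a timedelta into a plain number for further math
hours = tide_duration.total_seconds() / 3600
print(f"{hours:.3f} hours between tides")  # 6.208 hours between tides
```

The last entry falls exactly one lunar day (24 hours, 50 minutes) after the first.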
Dealing with Time Zones
Time zones can be a source of confusion and frustration in geoscientific data and in computer programming in general. Core date and time libraries in various programming languages inevitably have design flaws (Python is no different) leading to third-party libraries that attempt to fix the core library limitations. To avoid these issues, it is best to handle data in UTC, or at the very least operate in a consistent time zone, but that is not always possible. Users will expect their tornado alerts in local time.
What is UTC?
UTC is an abbreviation of Coordinated Universal Time and is, in practice, equivalent to Greenwich Mean Time (GMT). (Greenwich, at 0 degrees longitude, is a district of London, England.) In geoscientific data, times are often in UTC, though you should always verify that this assumption is actually true!
Time Zone Naive Versus Time Zone Aware datetime Objects
When you create datetime objects in Python, they are by default so-called “naive”, which means they are time zone unaware. In many situations, you can happily go forward without this detail getting in the way of your work. As the Python documentation states:
Naive objects are easy to understand and to work with, at the cost of ignoring some aspects of reality.
However, if you wish to convey time zone information, you will have to make your datetime objects time zone aware. The datetime module can create aware objects in UTC directly:
naive = dt.datetime.now()
aware = dt.datetime.now(dt.timezone.utc)
print(f"I am time zone naive {naive}.")
print(f"I am time zone aware {aware}.")
I am time zone naive 2022-11-08 20:05:05.336593.
I am time zone aware 2022-11-08 20:05:05.336628+00:00.
Notice that aware has +00:00 appended at the end, indicating zero hours offset from UTC.
Our naive object shows the local time on whatever computer was used to run this code. If you’re reading this online, then chances are it was executed on a cloud server that already uses UTC, so naive and aware will differ only at the microsecond level!
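If you are handed a datetime object and are not sure which kind it is, you can check. The helper is_aware below is hypothetical (it is not part of the datetime module); it follows the definition of “aware” given in the Python documentation. The sketch also shows that Python refuses to order a naive object against an aware one.

```python
import datetime as dt


def is_aware(moment):
    """Hypothetical helper: True if the datetime carries a usable UTC offset."""
    return moment.tzinfo is not None and moment.utcoffset() is not None


naive = dt.datetime(2022, 11, 8, 20, 5, 5)
aware = dt.datetime(2022, 11, 8, 20, 5, 5, tzinfo=dt.timezone.utc)

print(is_aware(naive))  # False
print(is_aware(aware))  # True

# Ordering comparisons between naive and aware objects raise TypeError
try:
    naive < aware
    mixed_ok = True
except TypeError as err:
    mixed_ok = False
    print(f"Cannot compare: {err}")
```

This error is a useful safety net: it stops you from silently mixing times whose time zones are unknown with times whose time zones are known.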
In the code above, we used dt.timezone.utc to initialize the UTC time zone for our aware object. The dt.timezone class only represents fixed offsets from UTC; by itself it knows nothing about named time zones or their daylight saving rules. (Python 3.9 added the zoneinfo module to the Standard Library for that purpose; this notebook uses the third-party pytz package.)
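For reference, here is a minimal sketch of what the Standard Library can do on its own: represent a fixed offset from UTC and convert to it. The fixed -07:00 offset standing in for Mountain Standard Time is our assumption, and a fixed offset knows nothing about daylight saving time, so it would be wrong for summer dates.

```python
import datetime as dt

# ASSUMPTION: a fixed -07:00 offset standing in for Mountain Standard Time
mst = dt.timezone(dt.timedelta(hours=-7), name="MST")

utc_time = dt.datetime(2022, 11, 8, 20, 5, 5, tzinfo=dt.timezone.utc)
local_time = utc_time.astimezone(mst)

print(local_time)              # 2022-11-08 13:05:05-07:00
print(local_time == utc_time)  # True: same instant, different wall clock
```

Two aware objects compare equal when they refer to the same instant, regardless of the time zone each one is expressed in.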
Full time zone support with the pytz module
For improved handling of time zones in Python, we will use the third-party pytz module, whose classes build upon (or “inherit from”, in OO terminology) datetime classes.
Here, we repeat the above exercise but initialize our aware object in a different time zone:
naive = dt.datetime.now()
aware = dt.datetime.now(pytz.timezone('US/Mountain'))
print(f"I am time zone naive: {naive}.")
print(f"I am time zone aware: {aware}.")
I am time zone naive: 2022-11-08 20:05:05.343482.
I am time zone aware: 2022-11-08 13:05:05.357801-07:00.
The pytz.timezone() function takes a time zone string and returns a tzinfo object which can be used to initialize the time zone. The -07:00 denotes we are operating in a time zone seven hours behind UTC.
Print Time with a Different Time Zone
If you have data that are in UTC, and wish to convert them to another time zone, Mountain Time Zone for example, you will again make use of the pytz module.
First, we will create a UTC time with the utcnow() method, which inexplicably returns a time zone naive object, so you must still specify the UTC time zone with the replace() method. (utcnow() is deprecated as of Python 3.12; dt.datetime.now(dt.timezone.utc) returns an aware object directly.) We then create a “US/Mountain” tzinfo object as before, but this time we will use the astimezone() method to adjust the time to the specified time zone.
utc = dt.datetime.utcnow().replace(tzinfo=pytz.utc)
# Note: %-I (hour without a leading zero) is platform-specific and may fail on Windows
print("The UTC time is {}.".format(utc.strftime('%B %d, %Y, %-I:%M%p')))
mountaintz = pytz.timezone("US/Mountain")
mountain = utc.astimezone(mountaintz)
print("The 'US/Mountain' time is {}.".format(mountain.strftime('%B %d, %Y, %-I:%M%p')))
The UTC time is November 08, 2022, 8:05PM.
The 'US/Mountain' time is November 08, 2022, 1:05PM.
Here we’ve also used the strftime() method to format a human-friendly date and time string.
Summary
The Python Standard Library contains several modules for dealing with date and time data. We saw how we can avoid some name ambiguities by aliasing the module names with import datetime as dt and import time as tm. The tm.time() function just returns the current Unix time in seconds, which can be useful for measuring elapsed time, but not all that useful for working with geophysical data.
The datetime module contains various classes for storing, converting, comparing, and formatting date and time data on the Gregorian calendar. We saw how we can parse data files with date and time strings into dt.datetime objects using the dt.datetime.strptime() method. We also saw how we can do arithmetic on time and date data, making use of the dt.timedelta class to represent intervals of time.
Finally, we looked at using the third-party pytz module to handle timezone awareness and conversions.
Resources and References
This notebook was adapted from material in Unidata’s Python Training.
For further reading on these modules, take a look at the official documentation for the time, datetime, and pytz modules.
For more information on Python string formatting, try:
A nice tutorial from RealPython