5.3. US treasury yield curves modelization#

A yield curve is a line that plots the yields (interest rates) with respect to their maturity date.

This guide will see an efficient approach to model and work with yield curves with Quasar. We will be using the public US Treasuries yield data as an example.

In this guide, we will:

  • Model yield curves with Quasar;

  • Query and plot the yield curve for a given date;

  • Query and plot the variation of a given maturity over time;

  • Query and plot the surface that represents the yield curve change over time.

We will be using Python, Pandas, and plotly. You can achieve similar results with any programming language or tool that can query Quasar.

5.3.1. Preparation#

5.3.1.1. Install and launch QuasarDB#

Please install & launch QuasarDB for your environment; the free community edition is sufficient.

For more information and instructions on how to install QuasarDB, please refer to our installation guide.

5.3.1.2. Prepare your Jupyter notebook#

Please follow the steps below to create and set up your Jupyter notebook:

# Install requirements
$ python3 -m pip install jupyter \
                         pandas \
                         numpy \
                         quasardb

# Launch local notebook
$ jupyter notebook ./qdb-howto.ipynb

[I 17:16:02.496 NotebookApp] Serving notebooks from local directory: /home/user/qdb-howto/
[I 17:16:02.496 NotebookApp] Jupyter Notebook 6.4.6 is running at:
[I 17:16:02.496 NotebookApp] http://localhost:8888/?token=...
[I 17:16:02.496 NotebookApp]  or http://127.0.0.1:8888/?token=...
[I 17:16:02.496 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

A new jupyter notebook should automatically open, otherwise please manually navigate and you can navigate your browser to http://localhost:8888/ with the environment prepared.

You are now completed with the preparations.

5.3.2. Yield curves#

When you borrow money, the maturity date, that is, the date at which you must reimburse the loan impacts the interest rate. For example, as of February 2022, the US Treasury can borrow money for 6 months at an interest rate of approximately 0.5%, whereas for a 30 years loan, the interest rate is around 2.2%.

The logic of interest (yield) that changes depending on the maturity date is called a yield curve. Yield curves exist across many financial products.

To quote the US Treasury home page:

These rates are indicative closing market bid quotations on the most recently auctioned Treasury Bills in the over-the-counter market as obtained by the Federal Reserve Bank of New York at approximately 3:30 PM each business day.

That means that rates move on two time dimensions:

  1. The day on which they are published. Depending on the investors’ appetite for the US treasuries, the rate will go up or down.

  2. The maturity date when the loan must be reimbursed. Each day, a new yield curve is published.

This guide will see how to model, store, and query yield curves. We will use the US Treasuries data.

Warning

The yield data provided is for educational purposes only. We offer no guarantee regarding its accuracy or completeness.

5.3.3. Modeling#

To model a yield curve, there are two approaches, wide or narrow.

Let’s take the following data for February 4th 2022:

Maturity

Yield

1 month

0.05

3 months

0.23

6 months

0.56

1 year

0.89

2 years

1.31

3 years

1.55

5 years

1.78

7 years

1.9

10 years

1.93

20 years

2.29

30 years

2.23

There are two ways you can model that in Quasar. The first one uses one column per duration and one row per curve.

Timestamp

1 month

3 months

6 months

1 year

2 years

3 years

5 years

7 years

10 years

20 years

30 years

2022-02-04

0.05

0.23

0.56

0.89

1.31

1.55

1.78

1.9

1.93

2.29

2.23

The advantage of this approach is plotting the yield curve is very easy as the data is “ready to use.” However, it has many drawbacks:

  • Querying for points based on maturity requires you to know the name of the columns, and you can’t quickly write things such as “give me all the yields for a maturity below 40 months.”

  • If you’re working with a yield curve that has thousands of points, it can become unwieldy

  • If you want to add or remove a maturity date, this requires a change in the schema, perhaps in your applications.

This why in this guide we will model the data using a second approach: one column for the maturity and multiple rows per timestamp for each maturity.

The data thus becomes:

Timestamp

Maturity

Yield

2022-02-04

1

0.05

2022-02-04

3

0.23

2022-02-04

6

0.56

2022-02-04

12

0.89

2022-02-04

24

1.31

2022-02-04

36

1.55

2022-02-04

60

1.78

2022-02-04

84

1.9

2022-02-04

120

1.93

2022-02-04

240

2.29

2022-02-04

360

2.23

The maturity is expressed in months and unlocks the possibility to query for maturity using a consistent time duration. This data model is fully supported by Quasar and delivers excellent performance. We will see how we can query the data to make it possible to plot your yield curve.

5.3.4. Importing the yield curves#

For simplicity, we have prepared a data set with the proper layout ready to be imported .

Here is how to load the data:

import pandas as pd
import requests
import io

data = requests.get('https://doc.quasardb.net/howto/yield-curve/usyield.csv').content
df = pd.read_csv(io.StringIO(data.decode('utf-8')),
                 index_col='date',
                 dtype={'maturity': 'int64', 'yield': 'float64'},
                 parse_dates=True)

print(df)

date

maturity

yield

2002-01-02

1

1.73

2002-01-02

3

1.74

2002-01-02

6

1.85

2002-01-02

12

2.28

2002-01-02

24

3.22

2022-02-04

60

1.78

2022-02-04

84

1.90

2022-02-04

120

1.93

2022-02-04

240

2.29

2022-02-04

360

2.23

We will store the yield curves in a single table named “us_yield”. The QuasarDB API will create the table with the right columns automatically for you based on the dataframe, as demonstrated here:

import quasardb
import quasardb.pandas as qdbpd
import datetime

def get_conn():
    return

# change this with the URL of you Quasar cluster
conn = quasardb.Cluster("qdb://127.0.0.1:2836")
t = conn.table('us_yield')

# remove the table if it exists
try:
    t.remove()
except:
    # Table did not yet exist
    pass

qdbpd.write_dataframe(df, conn, t, create=True, shard_size=datetime.timedelta(days=10000))

We chose a shard size of 10,000 days. For more information about how to pick a good shard size, see shard size impact on performance.

5.3.5. Displaying the yield curve for a given day#

The most frequent thing you want to do for a yield curve is plot the yield curve for a given date.

To get the yield curve for a given day, the query is:

SELECT maturity, yield FROM us_yield IN RANGE(2022-02-04, +1d)

Note how we use Quasar’s extended time syntax to select one day conveniently. The Python code to retrieve and plot the curve is thus:

import plotly.graph_objects as  go

q = "SELECT maturity, yield FROM us_yield IN RANGE(2022-02-04, +1d)"
df = qdbpd.query(conn, q)

fig = go.Figure(data=go.Scatter(x=df['maturity'],
                                y=df['yield']))
fig.update_xaxes(title="Maturity in months")
fig.update_yaxes(title="Yield")
fig.update_layout(template='plotly_dark',
                  title='US Yield curve for February 4th 2022')
fig.show()
../../_images/plot-11.png

Quasar’s API will return a Python Pandas dataframe ready to be used. Columns names in the Pandas dataframe match precisely the name of the columns in the query.

5.3.6. Displaying the change of yield for a given maturity over time#

Another thing you may want to visualize is how the yield changed over time for a given maturity. Let’s say we want to plot the yield change for a 1 year maturity. The query for that is:

SELECT $timestamp, yield FROM us_yield WHERE maturity=12 AND yield IS NOT NULL

You noted that we are filtering out any potential null values in the data set. That step is entirely optional, depending on the quality of your data.

Here is how you fetch the data and plot it in Python:

import plotly.graph_objects as  go

q = "SELECT $timestamp, yield FROM us_yield WHERE maturity=12 AND yield IS NOT NULL"
df = qdbpd.query(conn, q)

fig = go.Figure(data=go.Scatter(x=df['$timestamp'],
                                y=df['yield']))
fig.update_xaxes(title="Date")
fig.update_yaxes(title="Yield")
fig.update_layout(template='plotly_dark',
                  title='US 1y yield over time')
fig.show()
../../_images/plot-21.png

It’s, of course, possible to combine multiple yield curves on the same graph. If we would want to compare the yield curve for 1 and 2 years, we would do the following:

import plotly.graph_objects as  go

q = "SELECT $timestamp, yield FROM us_yield WHERE maturity={} AND yield IS NOT NULL"
df1 = qdbpd.query(conn, q.format(12))
df2 = qdbpd.query(conn, q.format(24))

fig = go.Figure()
fig.add_trace(go.Scatter(x=df1['$timestamp'],
                         y=df1['yield'],
                         name='1y'))
fig.add_trace(go.Scatter(x=df2['$timestamp'],
                         y=df2['yield'],
                         name='2y'))
fig.update_xaxes(title="Date")
fig.update_yaxes(title="Yield")
fig.update_layout(template='plotly_dark',
                  title='US 1y and 2y yield over time')
fig.show()
../../_images/plot-3.png

5.3.7. Displaying the yield surface#

The approach above can become messy as soon as you visualize more than one yield curve over time. The ideal visualization for yield over time is a surface plotted in 3D.

You will need to pivot the data on the maturity date to do that. Quasar supports that in the query language directly. While you could pivot the data with Pandas, being able to do it from the query enables you to have the data in optimal form, whichever tool you are using.

The query to obtain the data pivoted is the following:

SELECT * FROM us_yield LEFT ASOF JOIN us_yield PIVOT ON maturity

This query tells Quasar to pivot on the maturity date to transform the data from this

Timestamp

Maturity

Yield

2022-02-04

1

0.05

2022-02-04

3

0.23

2022-02-04

6

0.56

2022-02-04

12

0.89

2022-02-04

24

1.31

2022-02-04

36

1.55

2022-02-04

60

1.78

2022-02-04

84

1.9

2022-02-04

120

1.93

2022-02-04

240

2.29

2022-02-04

360

2.23

to this

Timestamp

1

3

6

12

24

36

60

84

120

240

360

2022-02-04

0.05

0.23

0.56

0.89

1.31

1.55

1.78

1.9

1.93

2.29

2.23

This pivot is necessary to have the data along a Z axis to draw a surface.

To fetch the data in Python, the code is thus:

q = """
SELECT *
FROM us_yield
LEFT ASOF JOIN us_yield
PIVOT on maturity
"""
df = qdbpd.query(conn, q)
# drop the $table column that is not necessary for drawing the surface
df = df.drop(columns=['$table']).set_index('$timestamp').reindex()
print(df)

$timestamp

1.yield

3.yield

6.yield

12.yield

24.yield

36.yield

60.yield

84.yield

120.yield

240.yield

360.yield

2002-01-02

1.73

1.74

1.85

2.28

3.22

3.75

4.52

4.97

5.20

5.86

5.56

2002-01-03

1.73

1.73

1.82

2.24

3.19

3.71

4.48

4.93

5.16

5.83

5.54

2002-01-04

1.72

1.72

1.82

2.25

3.19

3.72

4.50

4.97

5.18

5.87

5.57

2002-01-07

1.70

1.68

1.77

2.19

3.08

3.61

4.39

4.86

5.09

5.76

5.49

2002-01-08

1.70

1.68

1.77

2.19

3.07

3.60

4.39

4.86

5.10

5.77

5.51

2022-01-31

0.03

0.22

0.49

0.78

1.18

1.39

1.62

1.75

1.79

2.17

2.11

2022-02-01

0.04

0.19

0.48

0.78

1.18

1.39

1.63

1.76

1.81

2.19

2.12

2022-02-02

0.04

0.19

0.45

0.76

1.16

1.38

1.60

1.74

1.78

2.17

2.11

2022-02-03

0.03

0.20

0.48

0.78

1.19

1.42

1.66

1.78

1.82

2.20

2.14

2022-02-04

0.05

0.23

0.56

0.89

1.31

1.55

1.78

1.90

1.93

2.29

2.23

To have a surface that starts from the shortest maturity closest to us, let’s rename and reindex the columns:

month2maturity={'1': '1m',
                '3': '3m',
                '6': '6m',
                '12': '1y',
                '24': '2y',
                '36': '3y',
                '60': '5y',
                '84': '7y',
                '120': '10y',
                '240': '20y',
                '360': '30y'}
maturities=list(month2maturity.values())
maturities.reverse()

# Rename columns to make them easier to read
df = df.rename(axis='columns',
               mapper=lambda x: month2maturity.get(x.replace('.yield', '')))

# Sort data in order of maturity
df = df[maturities].reindex()

Now the data is ready to be plotted:

fig = go.Figure(data=[go.Surface(x=maturities,
                                 y=df.index,
                                 z=df.to_numpy(),
                                 colorscale=[[0, 'rgb(15, 25, 56)'],
                                             [1, 'rgb(152, 185,  224)']])])
fig.update_layout(scene_aspectmode='manual',
                  scene_aspectratio=dict(x=1, y=3, z=1),
                  scene_camera_eye=dict(x=2, y=1, z=1),
                  scene_camera_center=dict(x=0, y=0.25, z=0)
                  )
fig.update_layout(width=900,
                  height=525,
                  autosize=False,
                  template='plotly_dark',
                  title='US Yield surface',
                  scene={'xaxis_title': 'Maturity',
                         'yaxis_title': 'Date',
                         'zaxis_title': 'Yield'})
fig.show()
../../_images/plot-4.png

5.3.8. Conclusion#

Quasar makes the management of yield curves easy and fast. The approach we used in this guide will work for any yield curve and similar mathematical object.

5.3.8.1. Further reading#