5.3. US treasury yield curves modelization

5.3. US treasury yield curves modelization

A yield curve is a line that plots the yields (interest rates) with respect to their maturity date.

This guide will see an efficient approach to model and work with yield curves with Quasar. We will be using the public US Treasuries yield data as an example.

In this guide, we will:

  • Model yield curves with Quasar;

  • Query and plot the yield curve for a given date;

  • Query and plot the variation of a given maturity over time;

  • Query and plot the surface that represents the yield curve change over time.

We will be using Python, Pandas, and plotly. You can achieve similar results with any programming language or tool that can query Quasar.

5.3.1. Preparation

5.3.1.1. Install and launch QuasarDB

Please install & launch QuasarDB for your environment; the free community edition is sufficient.

For more information and instructions on how to install QuasarDB, please refer to our installation guide.

5.3.1.2. Prepare your Jupyter notebook

Please follow the steps below to create and set up your Jupyter notebook:

# Install requirements
$ python3 -m pip install jupyter \
                         pandas \
                         numpy \
                         quasardb

# Launch local notebook
$ jupyter notebook ./qdb-howto.ipynb

[I 17:16:02.496 NotebookApp] Serving notebooks from local directory: /home/user/qdb-howto/
[I 17:16:02.496 NotebookApp] Jupyter Notebook 6.4.6 is running at:
[I 17:16:02.496 NotebookApp] http://localhost:8888/?token=...
[I 17:16:02.496 NotebookApp]  or http://127.0.0.1:8888/?token=...
[I 17:16:02.496 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

A new jupyter notebook should automatically open, otherwise please manually navigate and you can navigate your browser to http://localhost:8888/ with the environment prepared.

You are now completed with the preparations.

5.3.2. Yield curves

When you borrow money, the maturity date, that is, the date at which you must reimburse the loan impacts the interest rate. For example, as of February 2022, the US Treasury can borrow money for 6 months at an interest rate of approximately 0.5%, whereas for a 30 years loan, the interest rate is around 2.2%.

The logic of interest (yield) that changes depending on the maturity date is called a yield curve. Yield curves exist across many financial products.

To quote the US Treasury home page:

These rates are indicative closing market bid quotations on the most recently auctioned Treasury Bills in the over-the-counter market as obtained by the Federal Reserve Bank of New York at approximately 3:30 PM each business day.

That means that rates move on two time dimensions:

  1. The day on which they are published. Depending on the investors’ appetite for the US treasuries, the rate will go up or down.

  2. The maturity date when the loan must be reimbursed. Each day, a new yield curve is published.

This guide will see how to model, store, and query yield curves. We will use the US Treasuries data.

Warning

The yield data provided is for educational purposes only. We offer no guarantee regarding its accuracy or completeness.

5.3.3. Modeling

To model a yield curve, there are two approaches, wide or narrow.

Let’s take the following data for February 4th 2022:

Maturity

Yield

1 month

0.05

3 months

0.23

6 months

0.56

1 year

0.89

2 years

1.31

3 years

1.55

5 years

1.78

7 years

1.9

10 years

1.93

20 years

2.29

30 years

2.23

There are two ways you can model that in Quasar. The first one uses one column per duration and one row per curve.

Timestamp

1 month

3 months

6 months

1 year

2 years

3 years

5 years

7 years

10 years

20 years

30 years

2022-02-04

0.05

0.23

0.56

0.89

1.31

1.55

1.78

1.9

1.93

2.29

2.23

The advantage of this approach is plotting the yield curve is very easy as the data is “ready to use.” However, it has many drawbacks:

  • Querying for points based on maturity requires you to know the name of the columns, and you can’t quickly write things such as “give me all the yields for a maturity below 40 months.”

  • If you’re working with a yield curve that has thousands of points, it can become unwieldy

  • If you want to add or remove a maturity date, this requires a change in the schema, perhaps in your applications.

This why in this guide we will model the data using a second approach: one column for the maturity and multiple rows per timestamp for each maturity.

The data thus becomes:

Timestamp

Maturity

Yield

2022-02-04

1

0.05

2022-02-04

3

0.23

2022-02-04

6

0.56

2022-02-04

12

0.89

2022-02-04

24

1.31

2022-02-04

36

1.55

2022-02-04

60

1.78

2022-02-04

84

1.9

2022-02-04

120

1.93

2022-02-04

240

2.29

2022-02-04

360

2.23

The maturity is expressed in months and unlocks the possibility to query for maturity using a consistent time duration. This data model is fully supported by Quasar and delivers excellent performance. We will see how we can query the data to make it possible to plot your yield curve.

5.3.4. Importing the yield curves

For simplicity, we have prepared a data set with the proper layout ready to be imported .

Here is how to load the data:

import pandas as pd
import requests
import io

data = requests.get('https://doc.quasardb.net/howto/yield-curve/usyield.csv').content
df = pd.read_csv(io.StringIO(data.decode('utf-8')),
                 index_col='date',
                 dtype={'maturity': 'int64', 'yield': 'float64'},
                 parse_dates=True)

df
maturity yield
date
2002-01-02 1 1.73
2002-01-02 3 1.74
2002-01-02 6 1.85
2002-01-02 12 2.28
2002-01-02 24 3.22
... ... ...
2022-02-04 60 1.78
2022-02-04 84 1.90
2022-02-04 120 1.93
2022-02-04 240 2.29
2022-02-04 360 2.23

55330 rows × 2 columns

We will store the yield curves in a single table named “us_yield”. The QuasarDB API will create the table with the right columns automatically for you based on the dataframe, as demonstrated here:

import quasardb
import quasardb.pandas as qdbpd
import datetime

def get_conn():
    return

# change this with the URL of you Quasar cluster
conn = quasardb.Cluster("qdb://127.0.0.1:2836")
t = conn.table('us_yield')

# remove the table if it exists
try:
    t.remove()
except:
    # Table did not yet exist
    pass

qdbpd.write_dataframe(df, conn, t, create=True, shard_size=datetime.timedelta(days=10000))
coercing deduplicate option from args =  {'deduplicate': False, 'deduplication_mode': 'drop'}
deduplication mode = {} drop
<quasardb.Table name='us_yield'>

We chose a shard size of 10,000 days. For more information about how to pick a good shard size, see shard size impact on performance.

5.3.5. Displaying the yield curve for a given day

The most frequent thing you want to do for a yield curve is plot the yield curve for a given date.

To get the yield curve for a given day, the query is:

SELECT maturity, yield FROM us_yield IN RANGE(2022-02-04, +1d)

Note how we use Quasar’s extended time syntax to select one day conveniently. The Python code to retrieve and plot the curve is thus:

import plotly.graph_objects as  go

q = "SELECT maturity, yield FROM us_yield IN RANGE(2022-02-04, +1d)"
df = qdbpd.query(conn, q)

fig = go.Figure(data=go.Scatter(x=df['maturity'],
                                y=df['yield']))
fig.update_xaxes(title="Maturity in months")
fig.update_yaxes(title="Yield")
fig.update_layout(template='plotly_dark',
                  title='US Yield curve for February 4th 2022')
fig.show()