Introduction

In electronic health record (EHR) databases, prescriptions are usually stored as one row per prescription record for each individual.

These need to be merged to form a treatment episode (Figure 1, Panel C).

Figure 1: Sertraline sample prescriptions

One common approach is to construct treatment episodes from continuous prescription records: two prescriptions are assigned to the same treatment episode if their periods of coverage overlap (Figure 1, Panel A).

In some EHR databases, the duration of exposure might be unavailable, and assumptions are needed to infer treatment episodes.

A simple example is shown here, assuming all prescriptions last for 28 days (Figure 1, Panel B).
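The 28-day assumption amounts to simple date arithmetic. The following is a minimal base R sketch, using made-up prescription dates (not taken from rx_demo_1) and a hypothetical `assumed_days` variable; it does not call any T-Rx function.

```r
# Hypothetical prescriptions for which only the start dates are recorded
rx <- data.frame(
  ID = c(1, 1, 1),
  drug = "sertraline",
  start_date = as.Date(c("2010-01-04", "2010-01-29", "2010-03-01"))
)

# Assumption: every prescription covers 28 days from its start date
assumed_days <- 28
rx$end_date <- rx$start_date + assumed_days

print(rx)
```

Once an assumed end date exists for every row, the prescriptions can be treated like records with known coverage.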

T-Rx provides options to merge prescriptions into treatment episodes as longitudinal periods of exposure by:

Function	Condition
`rx_merge`	when coverage (start and end date) of each prescription is available
`rx_infer`	when coverage of each prescription is not available

Sample dataframe

In this tutorial, we use rx_demo_1, a data object attached to the T-Rx package, for illustration.

It is a hypothetical prescription dataframe consisting of 35 rows and 8 columns (for 4 patients), with details described below.

Column name descriptions

Column	Details
`ID`	Numeric identifier for individual patients
`drug`	Name of the prescribed drug
`class`	Drug class or category (e.g., `SSRI`)
`start_date`	Start date of the prescription
`end_date`	End date of the prescription
`dose`	Dosage of the prescribed drug
`dose_unit`	Dosage unit of the prescribed drug
`quantity`	Quantity of the drug prescribed in the prescription

`rx_merge()` - Merge Prescriptions into Exposure Periods

Description

Consider the sertraline prescriptions in Figure 2 (Panel A/B).

The first two prescriptions overlap, while the third starts 4 days after the end date of the previous prescription.

rx_merge() aggregates prescriptions into the same treatment episode if their periods of coverage overlap (Panel C/D).

Figure 2: rx_merge function

rx_merge() also allows a gap window between prescriptions to account for real-world treatment complexity, such as shifts in follow-up timing due to medication stockpiling, so that short gaps are not misclassified as discontinuation of the treatment episode.

This window can be adjusted through the gap argument of rx_merge().

With a 7-day gap, prescriptions that are less than 7 days apart are merged into the same treatment episode (Panel E/F).
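The merging rule described above can be sketched in base R. This is an illustrative re-implementation under stated assumptions, not the T-Rx source code: the helper `merge_by_gap()` and the three prescription dates are hypothetical, and a prescription is assumed to continue an episode when it starts no more than `gap` days after the latest earlier end date (the exact boundary handling in rx_merge() may differ).

```r
# Hypothetical helper: merge prescriptions into episodes, allowing a gap (in days)
merge_by_gap <- function(rx_start, rx_end, gap = 0) {
  ord <- order(rx_start)
  s <- as.numeric(rx_start[ord])  # days since 1970-01-01
  e <- as.numeric(rx_end[ord])
  # Latest end date among all earlier prescriptions (-Inf for the first one)
  prev_end <- c(-Inf, cummax(e)[-length(e)])
  # A new episode starts when the gap exceeds the allowed window
  epi <- cumsum((s - prev_end) > gap)
  data.frame(
    start_date = as.Date(as.vector(tapply(s, epi, min)), origin = "1970-01-01"),
    end_date   = as.Date(as.vector(tapply(e, epi, max)), origin = "1970-01-01"),
    n_epi      = seq_len(max(epi))
  )
}

# Made-up dates: prescriptions 1 and 2 overlap; prescription 3 starts
# 4 days after prescription 2 ends
rx_start <- as.Date(c("2010-01-01", "2010-01-25", "2010-02-26"))
rx_end   <- as.Date(c("2010-01-29", "2010-02-22", "2010-03-26"))

no_gap   <- merge_by_gap(rx_start, rx_end, gap = 0)  # two episodes
week_gap <- merge_by_gap(rx_start, rx_end, gap = 7)  # one episode
print(no_gap)
print(week_gap)
```

With no gap allowed, the 4-day break splits the prescriptions into two episodes; with a 7-day window they collapse into one.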

Example & Expected Input

rx_merge() allows users to merge prescriptions based on two major options, specified by merge_option.

Option	Condition
`date`	By differences between prescription dates (when the start and end dates of prescriptions are available)
`duration`	By dates and duration of prescriptions (when the dates and durations of prescriptions are available)

The below demonstration of rx_merge() focuses on using the date option.

This is the more commonly used option in most EHR databases, since the dates of prescriptions are more readily available.

Visualizing the prescriptions

In the rx_demo_1 data object, the start date and end dates were available as start_date and end_date columns in the dataframe.

These prescriptions were visualized as shown on the right.

We have limited the prescriptions to those prescribed after 2010-01-01, for the ease of illustration.

Description of the data object (rx_demo_1) can also be found here.

Instructions

Firstly, try an initial run with the data object.

rx_date = rx_merge(rx_df = rx_demo_1,
                   merge_option = "date", gap = 30,
                   id_col = "ID",  rx_date_col = "start_date", rx_end_col = "end_date",
                   drug_group = c("drug"))

The code above specifies a 30-day gap between prescriptions (via the gap argument).

This means that prescriptions are treated as a continuous period of exposure if the following prescription starts within 30 days of the end date of the previous prescription.

Explanation on arguments

Argument	Details
`rx_df`	The prescription dataframe (`rx_demo_1`)
`merge_option`	Merging option; here, prescriptions are merged based on prescription dates (`date`)
`gap`	Number of days allowed between the end date of one prescription and the start date of the next for them to be merged into the same treatment episode (`30`)
`id_col`	Column name that contains patient ID (`ID`)
`rx_date_col`	Column name that contains the (start) dates of prescriptions (`start_date`)
`rx_end_col`	Column name that contains the end dates of prescriptions (`end_date`)
`drug_group`	A vector of column names that define the groups within which prescriptions are merged. We use `drug` here because we want to merge prescriptions that share the same drug name

Output Dataframe

Running rx_merge() returns a dataframe of treatment episodes of 4 patients (10 rows):

**Treatment episodes (using 30 as customisable gap between prescriptions)**
ID	drug	start_date	end_date	n_epi
1	fluoxetine	2010-02-22	2010-05-15	1
1	sertraline	2010-04-15	2011-02-17	1
2	amitriptyline	2007-01-30	2007-04-30	1
2	amitriptyline	2010-02-14	2010-05-15	2
3	paroxetine	2010-01-01	2010-05-17	1
4	amitriptyline	2010-09-30	2010-10-24	1
4	citalopram	2009-12-24	2010-03-27	1
4	fluoxetine	2010-02-25	2010-05-20	1
4	perphenazine	2010-09-30	2010-10-24	1
4	venlafaxine	2010-05-22	2010-09-25	1

Details of dataframe

Column	Details
`ID`	Patient ID (same as input dataframe)
`drug`	Medication (same as input dataframe)
`start_date`	Start date of the treatment episode (1st prescription)
`end_date`	End date of the treatment episode (final prescription)
`n_epi`	Episode number within the group (`drug`)

As a comparison, we have put in the dataframes for patient 1 before and after running rx_merge():

Before rx_merge()

ID	drug	class	start_date	end_date	dose	dose_unit	quantity
1	fluoxetine	SSRI	2010-02-22	2010-03-22	20	mg	28
1	fluoxetine	SSRI	2010-03-22	2010-04-19	20	mg	50
1	sertraline	SSRI	2010-04-15	2010-05-15	50	mg	14
1	fluoxetine	SSRI	2010-04-15	2010-05-15	20	mg	14
1	sertraline	SSRI	2010-05-18	2010-06-15	50	mg	30
1	sertraline	SSRI	2010-06-10	2010-07-04	50	mg	30
1	sertraline	SSRI	2010-07-02	2010-08-01	50	mg	30
1	sertraline	SSRI	2010-07-29	2010-08-26	50	mg	30
1	sertraline	SSRI	2010-08-24	2010-09-23	50	mg	30
1	sertraline	SSRI	2010-09-27	2010-10-25	50	mg	30
1	sertraline	SSRI	2010-10-25	2011-01-23	50	mg	90
1	sertraline	SSRI	2011-01-20	2011-02-17	50	mg	30

After rx_merge()

ID	drug	start_date	end_date	n_epi
1	fluoxetine	2010-02-22	2010-05-15	1
1	sertraline	2010-04-15	2011-02-17	1

After running rx_merge(), all of patient 1's fluoxetine prescriptions were merged into one treatment episode, and all sertraline prescriptions into another.

Customizing parameters

As in Figure 1, Panel C/D, users can specify a customizable gap between prescriptions to account for real-life treatment complexities, using the gap argument of rx_merge().

If gap is set to 0, prescriptions with any gap in between (as shown on the right) are treated as separate treatment episodes.

Run rx_merge() again, but set gap to 0.

rx_date = rx_merge(rx_df = rx_demo_1,
                   merge_option = "date", gap = 0,
                   id_col = "ID",  rx_date_col = "start_date", rx_end_col = "end_date",
                   drug_group = c("drug", "dose"))

Note: If users want to construct treatment episodes based on more than one criterion (e.g. not just the same drug, but also the same dose), these groups can be specified using the drug_group argument of rx_merge(). The resulting dataframe also shows these groups as additional columns (dose in this case).

The resulting dataframe for patient 1 is shown below.

ID	drug	start_date	end_date	n_epi
1	fluoxetine	2010-02-22	2010-05-15	1
1	sertraline	2010-04-15	2010-05-15	1
1	sertraline	2010-05-18	2010-09-23	2
1	sertraline	2010-09-27	2011-02-17	3
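To see why patient 1's sertraline prescriptions form three episodes with a gap of 0 but only one with a gap of 30, it helps to compute the gap preceding each prescription. The base R sketch below uses the sertraline rows from the "Before rx_merge()" table; the variable names and the `n_episodes()` helper are hypothetical, and it assumes an episode continues when the gap does not exceed the allowed window (the boundary rule in rx_merge() itself may differ).

```r
# Patient 1's sertraline prescriptions, already sorted by start date
start_date <- as.Date(c("2010-04-15", "2010-05-18", "2010-06-10", "2010-07-02",
                        "2010-07-29", "2010-08-24", "2010-09-27", "2010-10-25",
                        "2011-01-20"))
end_date <- as.Date(c("2010-05-15", "2010-06-15", "2010-07-04", "2010-08-01",
                      "2010-08-26", "2010-09-23", "2010-10-25", "2011-01-23",
                      "2011-02-17"))

# Latest end date among all earlier prescriptions (days since 1970-01-01)
prev_end <- cummax(as.numeric(end_date))[-length(end_date)]

# Days from that end date to the next start date (negative values mean overlap)
gap_days <- as.numeric(start_date)[-1] - prev_end
print(gap_days)

# Number of episodes = number of gaps exceeding the allowed window, plus one
n_episodes <- function(gap) sum(gap_days > gap) + 1
print(n_episodes(0))
print(n_episodes(30))
```

Only two gaps are positive (3 and 4 days), which correspond to the episode boundaries on 2010-05-18 and 2010-09-27 in the table above.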

`rx_infer()` - Making Inference on periods of exposure

Description

The end dates of prescriptions are not always available in EHR databases, such as UK Biobank.

As a result, the coverage of prescriptions cannot be ascertained and rx_merge() therefore cannot be applied.

In rx_infer(), T-Rx infers treatment duration and constructs treatment episodes based on blocks of “repeated prescriptions”, defined as consecutive prescriptions that were:

of the same drug;
of the same dosage (optional);
of the same frequency or quantity (optional); and
close enough in prescription dates (specified by rx_window_days argument).

Repeated Prescriptions

The figure below shows sample fluoxetine prescriptions (Panel A/B), with the segments representing the expected exposure period of each prescription (assuming each prescription lasts for 28 days).

Repeated prescriptions can be defined by users based on:

same drug name: merging all 4 prescriptions together;
same dosage and quantity: two separate prescribing episodes will be constructed (Panel C/D).

Key Arguments to infer episode lengths

Argument	Details
`rx_window_days`	Maximum number of days allowed between the prescription dates of two consecutive prescriptions for them to be treated as the same treatment episode; longer intervals indicate separate treatment episodes
`assume_days`	Assumed length (in days) of the final prescription (or a single prescription) of a treatment episode, i.e. the length of the segment for each prescription

Instructions

First, build the above dataframe of 4 fluoxetine prescriptions with the following code:

Code chunk to build prescription sample

fluoxetine_rx <- data.frame(
  ID        = c(4, 4, 4, 4),
  drug      = c("fluoxetine", "fluoxetine", "fluoxetine", "fluoxetine"),
  class     = c("SSRI", "SSRI", "SSRI", "SSRI"),
  start_date= as.Date(c("2010-02-25", "2010-03-17", "2010-04-20", "2010-06-30")),
  dose      = c(20, 20, 20, 40),
  dose_unit = c("mg", "mg", "mg", "mg"),
  quantity  = c(60, 60, 60, 60),
  stringsAsFactors = FALSE)

Now, try to build the prescription episodes with rx_infer(), using:

98 days (14 weeks) between consecutive prescriptions (rx_window_days) to distinguish separate treatment episodes; and
28 days as an assumption (assume_days) for the length of final prescriptions in the treatment episode.

An example call to rx_infer() is shown below.

fluoxetine_rx_infer = rx_infer(rx_df = fluoxetine_rx, id_col = "ID",
                               drug_col = "drug",
                               date_col = "start_date",
                               rx_window_days = 98, assume_days = 28)

Explanation on arguments

Argument	Details
`rx_df`	The prescription dataframe (`fluoxetine_rx`)
`id_col`	Column name that contains patient ID (`ID`)
`drug_col`	Column name that contains the drug names of prescriptions (`drug`)
`dose_col`	Column name that contains the drug doses of prescriptions (`NULL`)
`freq_col`	Column name that contains the drug quantities / frequencies of prescriptions (`NULL`)
`date_col`	Column name that contains the prescription (start) dates of prescriptions (`start_date`)
`rx_window_days`	Maximum number of days allowed between the prescription dates of two consecutive prescriptions for them to be treated as the same treatment episode; longer intervals indicate separate treatment episodes
`assume_days`	Assumed length (in days) of the final prescription (or a single prescription) of a treatment episode, i.e. the length of the segment for each prescription

In rx_infer(), dose_col and freq_col are optional arguments that give the names of the respective columns, for users who wish to group treatment episodes by the same dose and/or the same dosing frequency/quantity.

If users prefer to group repeated prescriptions by drug name only (as above), these arguments can be left as NULL.

Output Dataframe

Running rx_infer() as above would return one single prescription episode (since all of them are fluoxetine prescriptions and close enough, within 98 days).

ID	drug	start_date	end_date	n_epi
4	fluoxetine	2010-02-25	2010-07-28	1

Column name descriptions

Column	Details
`ID`	Numeric identifier for individual patients
`drug`	Name of the prescribed drug
`start_date`	Start date of the prescription episode
`end_date`	End date of the prescription episode
`n_epi`	Episode number (within the group; the same drug in this case)

The start date of the episode would be 2010-02-25, the prescription date of first prescription, whereas the end date of the episode would be 2010-07-28, 28 days (specified by assume_days) after the prescription date of the final prescription (2010-06-30).
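The arithmetic behind this single episode can be checked with a few lines of base R. This is only a sketch of the date calculations described above, with hypothetical variable names; it does not call rx_infer().

```r
# Prescription dates of the four fluoxetine prescriptions
rx_dates <- as.Date(c("2010-02-25", "2010-03-17", "2010-04-20", "2010-06-30"))
rx_window_days <- 98
assume_days <- 28

# Days between consecutive prescription dates
intervals <- as.numeric(diff(rx_dates))
print(intervals)

# TRUE when every interval is within the window, i.e. a single episode
single_episode <- all(intervals <= rx_window_days)
print(single_episode)

# Episode runs from the first prescription date to the last date plus assume_days
episode_start <- min(rx_dates)
episode_end <- max(rx_dates) + assume_days
print(c(episode_start, episode_end))
```

The intervals are 20, 34 and 71 days, all within the 98-day window, so the four prescriptions stay in one episode.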

Customizing parameters

If users want to define repeated prescriptions by the same drug, the same dose and the same frequency/quantity, the relevant columns can be passed to rx_infer() through the dose_col and freq_col arguments.

Sample code is shown below.

fluoxetine_rx_infer = rx_infer(rx_df = fluoxetine_rx, id_col = "ID",
                               drug_col = "drug", dose_col = "dose", freq_col = "quantity",
                               date_col = "start_date",
                               rx_window_days = 98, assume_days = 28)

If you run rx_infer() by grouping up prescriptions by dose / quantity, this would return 2 prescribing episodes as below.

ID	drug	dose	quantity	start_date	end_date	n_epi
4	fluoxetine	20	60	2010-02-25	2010-05-18	1
4	fluoxetine	40	60	2010-06-30	2010-07-28	1

It returns two prescribing episodes (as shown in the figure above, Panel C):

Episode	Duration
20mg	`2010-02-25` to `2010-05-18` (28 days after the 3rd fluoxetine prescription, `2010-04-20`)
40mg	`2010-06-30` to `2010-07-28` (28 days after the final fluoxetine prescription, `2010-06-30`)
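The grouping logic can be illustrated with a small base R re-implementation. This is a sketch under stated assumptions, not the T-Rx source code: the helper `infer_episodes()` is hypothetical, prescriptions are first grouped by the chosen key columns, and a new episode is assumed to start whenever the interval between consecutive prescription dates within a group exceeds the window.

```r
fluoxetine_rx <- data.frame(
  ID         = c(4, 4, 4, 4),
  drug       = rep("fluoxetine", 4),
  start_date = as.Date(c("2010-02-25", "2010-03-17", "2010-04-20", "2010-06-30")),
  dose       = c(20, 20, 20, 40),
  quantity   = c(60, 60, 60, 60),
  stringsAsFactors = FALSE
)

# Hypothetical helper: infer episodes within groups defined by the key columns
infer_episodes <- function(df, keys, date_col, window_days, assume_days) {
  df <- df[order(df[[date_col]]), ]
  groups <- split(df, df[keys], drop = TRUE)
  out <- lapply(groups, function(g) {
    dates <- g[[date_col]]
    # New episode when the interval from the previous prescription exceeds the window
    epi <- cumsum(c(TRUE, as.numeric(diff(dates)) > window_days))
    res <- g[!duplicated(epi), keys, drop = FALSE]
    d <- as.numeric(dates)
    res$start_date <- as.Date(as.vector(tapply(d, epi, min)), origin = "1970-01-01")
    # The final prescription of each episode is assumed to last assume_days
    res$end_date <- as.Date(as.vector(tapply(d, epi, max)), origin = "1970-01-01") +
      assume_days
    res$n_epi <- seq_len(nrow(res))
    res
  })
  out <- do.call(rbind, out)
  rownames(out) <- NULL
  out
}

# Grouping by drug only, then by drug, dose and quantity
by_drug <- infer_episodes(fluoxetine_rx, c("ID", "drug"), "start_date", 98, 28)
by_dose <- infer_episodes(fluoxetine_rx, c("ID", "drug", "dose", "quantity"),
                          "start_date", 98, 28)
print(by_drug)
print(by_dose)
```

Under these assumptions the sketch gives one episode when grouping by drug alone and two when dose and quantity are added, with the same start and end dates as the rx_infer() output tables shown in this section.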

Any questions?

Please post questions as an issue on the T-Rx GitHub repo here.

The T-Rx package is currently under beta testing. Most functions should have adequate documentation on possible errors.

Please kindly reach out to Chris Lo (chris.lowh@kcl.ac.uk) for feedback on documentation.
