Introduction


Logic for imputation

The algorithms require users to provide a dataframe of prescriptions (either available directly from EHR databases or extracted with T-Rx modules), with dosage and quantity information extracted for the majority of prescriptions.

The module then splits the prescription dataframe into:

  1. Reference dataframe: contains prescriptions with extracted/cleaned dosage or quantity information
  2. Imputation dataframe: contains prescriptions without dosage or quantity information

For every drug present in the reference dataframe, the most commonly occurring dosage/quantity is used to impute the missing information for prescriptions with the same drug name in the imputation dataframe.

Using the originally provided dataframe as the training dataset for imputation also gives the flexibility to capture differences in practice, as the most commonly prescribed dose/strength can differ across regions.
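The split-and-impute logic described above can be sketched in base R as follows. This is a minimal illustration on made-up data, not the T-Rx implementation: the helper most_common() and the toy dataframe are assumptions introduced for this example only.

```r
# Toy prescription dataframe; column names mirror the tutorial, values are made up.
rx <- data.frame(
  chem_name = c("sertraline", "sertraline", "sertraline", "sertraline",
                "citalopram", "citalopram"),
  strength  = c(50, 50, 100, NA, 20, NA),
  stringsAsFactors = FALSE
)

# Split into reference (strength known) and imputation (strength missing) dataframes.
ref_df    <- rx[!is.na(rx$strength), ]
impute_df <- rx[is.na(rx$strength), ]

# Hypothetical helper: most frequent value of a vector
# (ties go to the first value in sorted order).
most_common <- function(x) {
  counts <- table(x)
  names(counts)[which.max(counts)]
}

# One modal strength per drug, learned from the reference dataframe only.
modal_strength <- tapply(ref_df$strength, ref_df$chem_name, most_common)

# Look up each drug of the imputation dataframe; drugs absent from the reference stay NA.
impute_df$strength <- as.numeric(modal_strength[impute_df$chem_name])

imputed <- rbind(ref_df, impute_df)
```

Because the reference is built from the same dataframe that is being imputed, the modal strength reflects local prescribing practice rather than an external standard.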


The use of external references for the imputation of dosage and quantity is still under development. Please contact Chris Lo for details.

Figure 1: Overview of the imputation module


Sample dataframe

This tutorial uses a sample dataframe of uncleaned antidepressant prescriptions from UK Biobank (UKB) primary care records (5000 rows, 4 columns), containing the following columns:

Column name descriptions
| Column | Details | Note |
| --- | --- | --- |
| drug_name | Drug name (from raw UKB primary care records) | Please refer to UKB documentation |
| quantity | Quantity issued (from raw UKB primary care records) | Please refer to UKB documentation |
| chem_name | Drug name of active ingredient | Extracted from https://github.com/chiarafabbri/MDD_TRD_study |
| func_class | Drug class of active ingredient | Extracted from https://github.com/chiarafabbri/MDD_TRD_study |

Inspect prescription sample
Sample dataframe demonstration (first 6 rows)
| drug_name | quantity | chem_name | func_class |
| --- | --- | --- | --- |
| Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) | 30 tab - 20 mg | paroxetine | SSRI |
| Dutonin 100mg tablets (Bristol-Myers Squibb Pharmaceuticals Ltd) | 84 tabs | nefazodone | SARI |
| Paroxetine 10mg/5ml oral suspension sugar free | 60 ml(s) - 10 mg/5 ml | paroxetine | SSRI |
| Venlafaxine 37.5mg tablets | 1 - Pack of 56 | venlafaxine | SNRI |
| Clomipramine 25mg capsules | 140 capsules | clomipramine | tricyclic_antidepressant |
| Dosulepin 25mg capsules | 39 capsule(s) | dosulepin | tricyclic_antidepressant |



Imputation of strength / dosage

Initialization

The sample prescriptions are provided with T-Rx as the data object antidep_ukb_rx.

Load T-Rx and inspect the data object in R.

# load package
library(TRX)

# inspect sample data object (antidepressant prescriptions)
dim(antidep_ukb_rx) 
# [1] 5000    4


The sample prescriptions in UK Biobank do not have strength/dosage information extracted. Run the strength_extract() function to extract strength/dosage information.

antidep_ukb_extracted = strength_extract(rx_df = antidep_ukb_rx,
                      liquid_strength_unit = c("mg/5ml"),
                      solid_strength_unit = c("mg", "mcg", "microgram"),
                      combined_strength_unit = c("mg", "mcg", "microgram", "miligram"),
                      info_col = c("drug_name"), combined_strength = 2)

Note: For details on strength_extract(), please refer to the instructions here.

Resultant dataframe
Sample prescriptions with strength / dosage extracted (first 6 rows)
| drug_name | quantity | chem_name | func_class | strength_unit | strength |
| --- | --- | --- | --- | --- | --- |
| Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) | 30 tab - 20 mg | paroxetine | SSRI | mg | 20 |
| Dutonin 100mg tablets (Bristol-Myers Squibb Pharmaceuticals Ltd) | 84 tabs | nefazodone | SARI | mg | 100 |
| Paroxetine 10mg/5ml oral suspension sugar free | 60 ml(s) - 10 mg/5 ml | paroxetine | SSRI | mg/5ml | 10 |
| Venlafaxine 37.5mg tablets | 1 - Pack of 56 | venlafaxine | SNRI | mg | 37.5 |
| Clomipramine 25mg capsules | 140 capsules | clomipramine | tricyclic_antidepressant | mg | 25 |
| Dosulepin 25mg capsules | 39 capsule(s) | dosulepin | tricyclic_antidepressant | mg | 25 |

Log file (as messages)


Inspecting problematic prescriptions

From the log files, 31 rows of prescriptions did not have strength/dosage information available.

There are also prescriptions with multiple strengths, where the individual dosages are separated by a comma (,). These strengths still need to be matched to the specific drugs within the multi-strength product and expanded into separate rows.
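Both groups of problematic prescriptions can be pulled out with plain base R filters. The sketch below uses a small made-up dataframe with the same column names as the extracted output; the same two filters should apply to antidep_ukb_extracted, assuming its strength column uses NA for missing values and "," as the separator, as shown in the tables of this section.

```r
# Toy stand-in for the extracted dataframe; values are illustrative only.
extracted <- data.frame(
  drug_name     = c("SERTRALINE", "Sertraline 50mg tablets",
                    "TRIPTAFEN M tablets 2mg + 10mg [AMCO]"),
  chem_name     = c("sertraline", "sertraline", "amitriptyline_perphenazine"),
  strength_unit = c(NA, "mg", "mg,mg"),
  strength      = c(NA, "50", "2,10"),
  stringsAsFactors = FALSE
)

# Prescriptions with no strength extracted.
missing_strength <- extracted[is.na(extracted$strength), ]

# Prescriptions with several strengths in one cell (grepl() returns FALSE for NA).
multi_strength <- extracted[grepl(",", extracted$strength, fixed = TRUE), ]
```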

Missing strength/dosage information
Prescriptions with missing strength/dosage information
drug_name quantity chem_name func_class strength_unit strength
206 NEFAZODONE starter pack 56.000 nefazodone SARI NA NA
313 Triptafen tablets (AMCo) 400 tablet(s) amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic NA NA
363 Motipress tablets (Sanofi-Synthelabo Ltd) 30 tablets nortriptyline_fluphenazine tricyclic_antidepressant_typical_antipsychotic NA NA
556 Triptafen tablets (AMCo) tablet(s) amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic NA NA
591 NEFAZODONE STARTER PACK [TABS (STARTER PACK)] 1.000 nefazodone SARI NA NA
661 PARSTELIN tablets [GLAXSK CON] 84 tranylcypromine MAOI NA NA
1172 TRIPTAFEN-M TAB 120.000 amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic NA NA
1180 Triptafen tablets (AMCo) 30 tabs amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic NA NA
1254 TRIPTAFEN 56.000 amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic NA NA
1329 TRIPTAFEN TAB 60.000 amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic NA NA
1363 MOTIVAL tabs 60.000 nortriptyline_fluphenazine tricyclic_antidepressant_typical_antipsychotic NA NA
1592 Triptafen tablets (AMCo) 120.000 amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic NA NA
1641 Motipress tablets (Sanofi-Synthelabo Ltd) tablet(s) nortriptyline_fluphenazine tricyclic_antidepressant_typical_antipsychotic NA NA
1994 Trazodone Controlled Rele trazodone SARI NA NA
2374 Motival TABS 112.000 nortriptyline_fluphenazine tricyclic_antidepressant_typical_antipsychotic NA NA
2498 Triptafen tablets (AMCo) 30 tablet(s) amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic NA NA
2793 Triptafen tablets (AMCo) 2*100 tablet(s) amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic NA NA
2882 nefazodone starter pack 56 tablets nefazodone SARI NA NA
3053 Triptafen tablets (AMCo) 14 tablets amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic NA NA
3187 nefazodone starter pack tablet(s) - 14 x 50 mg, 14 x 100 mg, 28 x 200 mg nefazodone SARI NA NA
3404 PARSTELIN tablets [GLAXSK CON] x 60 tranylcypromine MAOI NA NA
3676 MOTIVAL TABLETS 50.000 nortriptyline_fluphenazine tricyclic_antidepressant_typical_antipsychotic NA NA
3689 NEFAZODONE starter pack 1.000 nefazodone SARI NA NA
3848 Triptafen tablets (AMCo) 240 tablet(s) amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic NA NA
4125 Triptafen tablets (AMCo) 30.000 amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic NA NA
4387 CITALOPRAM 28.000 citalopram SSRI NA NA
4392 nefazodone starter pack 56 tabs (starter pack) nefazodone SARI NA NA
4516 SERTRALINE 56.000 sertraline SSRI NA NA
4535 nefazodone starter pack 1 - tablets (14x50mg,14x100mg,28x200mg) nefazodone SARI NA NA
4668 Dutonin tablets treatment initiation pack (Bristol-Myers Squibb Pharmaceuticals Ltd) 56 nefazodone SARI NA NA
4782 Parstelin Tablet (GlaxoSmithKline Consumer Healthcare) 112.000 tranylcypromine_trifluoperazine MAOI_typical_antipsychotic NA NA
Multiple strength products
Prescriptions for multiple strength products
drug_name quantity chem_name func_class strength_unit strength
72 TRIPTAFEN M tablets 2mg + 10mg [AMCO] 30 amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic mg,mg 2,10
327 Motival 10mg/500microgram tablets (Sanofi) 3 op nortriptyline_fluphenazine tricyclic_antidepressant_typical_antipsychotic mg,microgram 10,500
403 perphenazine with amitriptyline tablets 2mg + 25mg 28 - tablet(s) amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic mg,mg 2,25
429 TRIPTAFEN M tablets 2mg + 10mg [AMCO] 120 tablets amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic mg,mg 2,10
498 perphenazine with amitriptyline tablets 2mg + 10mg 56 - tablet(s) amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic mg,mg 2,10
571 Motival 10mg/500microgram tablets (Sanofi) 30.000 nortriptyline_fluphenazine tricyclic_antidepressant_typical_antipsychotic mg,microgram 10,500
663 TRIPTAFEN M tablets 2mg + 10mg [AMCO] 168 tablet amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic mg,mg 2,10
883 Motival 10mg/500microgram tablets (Sanofi) 60 tablets nortriptyline_fluphenazine tricyclic_antidepressant_typical_antipsychotic mg,microgram 10,500
995 TRIPTAFEN M tablets 2mg + 10mg [AMCO] 30 tab amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic mg,mg 2,10
1102 Motival 10mg/500microgram tablets (Sanofi) 90 nortriptyline_fluphenazine tricyclic_antidepressant_typical_antipsychotic mg,microgram 10,500
1341 Motival 10mg/500microgram tablets (Sanofi) 168 nortriptyline_fluphenazine tricyclic_antidepressant_typical_antipsychotic mg,microgram 10,500
1377 TRIPTAFEN M tablets 2mg + 10mg [AMCO] 42 amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic mg,mg 2,10
1431 Motival 10mg/500microgram tablets (Sanofi) 28 tablets nortriptyline_fluphenazine tricyclic_antidepressant_typical_antipsychotic mg,microgram 10,500
1539 Motival 10mg/500microgram tablets (Sanofi) 84 tabs nortriptyline_fluphenazine tricyclic_antidepressant_typical_antipsychotic mg,microgram 10,500
1581 Motival 10mg/500microgram tablets (Sanofi) 120 nortriptyline_fluphenazine tricyclic_antidepressant_typical_antipsychotic mg,microgram 10,500
1634 perphenazine with amitriptyline tablets 2mg + 25mg 56 tablets amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic mg,mg 2,25
1705 Triptafen m 2mg+10mg Tablet (Goldshield Pharmaceuticals Ltd) 84.000 amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic mg,mg 2,10
1810 Motival 10mg/500microgram tablets (Sanofi) 90 tablets nortriptyline_fluphenazine tricyclic_antidepressant_typical_antipsychotic mg,microgram 10,500
1899 TRIPTAFEN M tabs 2mg + 10mg 84.000 amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic mg,mg 2,10
2045 TRIPTAFEN M tablets 2mg + 10mg [AMCO] 30 tablets amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic mg,mg 2,10
2079 perphenazine with amitriptyline tablets 2mg + 25mg 100 - tablet(s) amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic mg,mg 2,25
2281 Motival 10mg/500microgram tablets (Sanofi) 60 tabs nortriptyline_fluphenazine tricyclic_antidepressant_typical_antipsychotic mg,microgram 10,500
2695 Motival 10mg/500microgram tablets (Sanofi) 56 tab nortriptyline_fluphenazine tricyclic_antidepressant_typical_antipsychotic mg,microgram 10,500
2817 Motival 10mg/500microgram tablets (Sanofi) 60 - tablet(s) nortriptyline_fluphenazine tricyclic_antidepressant_typical_antipsychotic mg,microgram 10,500
3301 TRIPTAFEN M tablets 2mg + 10mg [AMCO] 112 tablet(s) amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic mg,mg 2,10
3862 TRIPTAFEN M tablets 2mg + 10mg [AMCO] 40 amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic mg,mg 2,10
3906 Motival 10mg/500microgram tablets (Sanofi) 30 tablet(s) nortriptyline_fluphenazine tricyclic_antidepressant_typical_antipsychotic mg,microgram 10,500
3942 TRIPTAFEN M tabs 2mg + 10mg 168.000 amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic mg,mg 2,10
4046 TRIPTAFEN M tablets 2mg + 10mg [AMCO] 28 tablets amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic mg,mg 2,10
4300 PERPHENAZINE + AMITRIPTYLINE tabs 2mg + 10mg 84.000 amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic mg,mg 2,10
4391 TRIPTAFEN M tablets 2mg + 10mg [AMCO] 56 amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic mg,mg 2,10
4928 Motival 10mg/500microgram tablets (Sanofi) 100 tablets nortriptyline_fluphenazine tricyclic_antidepressant_typical_antipsychotic mg,microgram 10,500



strength_impute()

Logic

strength_impute() primarily performs two actions:

  1. Creating an imputation reference and matching it to prescriptions without strength/dosage information;

  2. Expanding multiple-strength products (assuming each is contained in a single row with separator strings) into multiple rows, each representing the correct dosage of an individual ingredient within the product.

Note: Expansion of multiple-strength products assumes that each prescription of a multi-strength product is summarised in a single row, with information on the names, strengths and strength units of each constituent drug.
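The row-expansion step can be illustrated with a short base R sketch. It covers only the splitting of one row into several, under the simplifying assumption that drug names, strengths and units are listed in the same order within their cells. strength_impute() itself does not appear to rely on that order (in the output shown later, strengths are mapped to the correct drug even where the order in chem_name differs), so the helper expand_row() below is hypothetical and not the package's method.

```r
# Toy dataframe: one multi-strength product and one single-ingredient product.
rx <- data.frame(
  drug_name     = c("Motival 10mg/500microgram tablets (Sanofi)",
                    "Sertraline 50mg tablets"),
  chem_name     = c("nortriptyline_fluphenazine", "sertraline"),
  strength_unit = c("mg,microgram", "mg"),
  strength      = c("10,500", "50"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split one row into one row per constituent drug.
expand_row <- function(row, drug_sep = "_", value_sep = ",") {
  drugs     <- strsplit(row$chem_name, drug_sep, fixed = TRUE)[[1]]
  strengths <- strsplit(row$strength, value_sep, fixed = TRUE)[[1]]
  units     <- strsplit(row$strength_unit, value_sep, fixed = TRUE)[[1]]
  # Expand only when every constituent has exactly one strength and one unit.
  if (length(drugs) < 2 ||
      length(drugs) != length(strengths) ||
      length(drugs) != length(units)) {
    return(row)
  }
  out <- row[rep(1, length(drugs)), ]
  out$chem_name     <- drugs
  out$strength      <- strengths
  out$strength_unit <- units
  out
}

expanded <- do.call(rbind, lapply(seq_len(nrow(rx)), function(i) expand_row(rx[i, ])))
rownames(expanded) <- NULL
```

Rows that cannot be split unambiguously are returned unchanged, which keeps them available for manual review.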

Figure 2: The strength_impute() function



Instructions

Using the above antidep_ukb_extracted dataframe, run the strength_impute() function.

impute_df = strength_impute(rx_df = antidep_ukb_extracted,
                    drug_separator = "_", ref_drug_col = "chem_name",
                    strength_separator = ",", strength_colname = "strength",
                    strength_unit_colname = "strength_unit", strength_unit_separator = ",")

Note: If you only wish to perform imputation and do not need to expand multiple strength products, please still provide the drug_separator, strength_separator and strength_unit_separator arguments as default. This does not affect the imputation unless the drug names contain the specified separator string.

Description on arguments
| Argument | Details |
| --- | --- |
| rx_df | Prescription dataframe |
| drug_separator | The separator used to separate multiple drug names in a single cell |
| ref_drug_col | The column name of the drug name in the dataframe |
| strength_separator | The separator used to separate multiple strength values in a single cell |
| strength_colname | The column name for strength information |
| strength_unit_separator | The separator used to separate multiple strength unit values in a single cell |
| strength_unit_colname | The column name for strength units |

Log file (as messages)



Output

The log file shows:

  1. 52 prescriptions with multi-ingredient products (scanned based on the separator arguments)
  2. 31 prescriptions that require strength imputation (<NA> in strength_colname)

Here we list out prescription samples of missing strengths and multiple ingredients for inspection.

Details for imputed prescriptions


Prescription with missing strength (Sertraline as example)
drug_name quantity chem_name func_class strength_unit strength
4566 SERTRALINE 56.000 sertraline SSRI mg 50


Prescription with multiple strengths
drug_name quantity chem_name func_class strength_unit strength
perphenazine with amitriptyline tablets 2mg + 25mg 28 - tablet(s) perphenazine tricyclic_antidepressant_typical_antipsychotic mg 2
perphenazine with amitriptyline tablets 2mg + 25mg 28 - tablet(s) amitriptyline tricyclic_antidepressant_typical_antipsychotic mg 25

The single prescription row is now expanded into two rows, one for perphenazine and one for amitriptyline, each with the strength/dosage mapped to the correct drug.


Failure for imputation

Of the 31 prescriptions, 30 can be imputed. The remaining prescription, for which the dosage cannot be imputed, is coded with the string failure instead.

drug_name quantity chem_name func_class strength_unit strength
4833 Parstelin Tablet (GlaxoSmithKline Consumer Healthcare) 112.000 trifluoperazine MAOI_typical_antipsychotic failure failure



Imputation of quantity

COMING SOON


This section is currently under development. Please stay tuned and check later!


Additional functions for UKB primary care records


multi_num_infer()


Logic

The multi_num_infer() function infers quantity information from UKB-formatted records. Please use it with caution in other settings, as this function is currently not validated for prescription records outside of the UKB.

In UKB-formatted records, most details of the prescriptions are described in the drug_name and quantity columns.

The two columns usually contain information on:

  • Dosage / Strength
  • Quantity
  • Multiplier

Therefore, if the dosage and multiplier have already been extracted as integers, the remaining integer can be inferred to be the quantity.

To achieve this, the multi_num_infer() function finds all integers present in info_col and compares them with the integers already extracted as strength and multiplier.

If exactly one integer is unaccounted for, it is taken as the quantity. Otherwise, the quantity remains <NA> because the extraction is ambiguous.
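The "one number left over" rule can be sketched in a few lines of base R. The helper infer_quantity() is hypothetical and only mirrors the logic described here; it is not the code of multi_num_infer(). As an additional assumption, it also reads decimal notation, so that a UKB-style value such as "60.000" counts as the number 60.

```r
# Hypothetical helper: return the single number in `info` that is not already
# explained by the known numbers (e.g. strength and multiplier), otherwise NA.
infer_quantity <- function(info, known) {
  found <- as.numeric(regmatches(info, gregexpr("[0-9]+(\\.[0-9]+)?", info))[[1]])
  # Remove one occurrence of each known number from the numbers found.
  for (k in known) {
    hit <- match(k, found)
    if (!is.na(hit)) found <- found[-hit]
  }
  if (length(found) == 1) found else NA_real_
}

# Strength 30, multiplier 1: the only number present is the quantity.
q1 <- infer_quantity("60.000", known = c(30, 1))
# Strength 37.5, multiplier 1: after removing the multiplier, 56 is left.
q2 <- infer_quantity("1 - Pack of 56", known = c(37.5, 1))
# Strength 20, multiplier 2: 28, 2 and 14 remain, so the result is ambiguous.
q3 <- infer_quantity("2 - Pack of 28 (2 X 14)", known = c(20, 2))
# No number at all: nothing to infer.
q4 <- infer_quantity("tablet(s)", known = c(100, 1))
```

Here q1 is 60 and q2 is 56, while q3 and q4 stay NA, which corresponds to the conflicting-number and genuinely-missing cases discussed in the output section.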

Figure 3: The multi_num_infer() function



Initialization

The multi_num_infer() function works once dosage/strength, quantity and multiplier information has been extracted (fully or partly) using upstream functions in T-Rx, including

  • strength_extract() (with option to run strength_impute())
  • quantity_extract()
  • multiplier_extract()

Using antidepressant prescriptions in UK Biobank as an example (antidep_ukb_rx), run the above functions to clean the prescription records.

Functions
# Run strength extraction function with UK Biobank prescription sample
antidep_ukb_infer = strength_extract(rx_df = antidep_ukb_rx,
                      liquid_strength_unit = c("mg/ml","mg/5ml"),
                      solid_strength_unit = c("mg", "mcg", "microgram"),
                      combined_strength_unit = c("mg", "mcg", "microgram", "miligram"),
                      info_col = c("drug_name"), combined_strength = 2,
                      no_strength_unit_remove = FALSE, no_strength_unit = "mg")

# Run strength imputation function
antidep_ukb_infer = strength_impute(rx_df = antidep_ukb_infer,
                    drug_separator = "_", ref_drug_col = "chem_name",
                    strength_separator = ",", strength_colname = "strength",
                    strength_unit_colname = "strength_unit", strength_unit_separator = ",")

# remove prescription with failure in imputation
antidep_ukb_infer = antidep_ukb_infer[!grepl("failure", antidep_ukb_infer$strength),]

# Run quantity extraction function
antidep_ukb_infer = quantity_extract(rx_df = antidep_ukb_infer,
                      liquid_forms = c("susp", "suspension", "liq", "liquid", "syrup", "soln", "solution", "drop", "elixir"),
                      solid_forms = c("tab", "cap"),
                      form_info_col = c("quantity", "drug_name"),
                      dose_form_colname = "dosage_form",
                      form_quant_colname = "form_quantity",
                      liq_form_suffix = c("ml", "millilitre"),
                      quantity_info_col = c("quantity"), no_form_remove = TRUE)

# Run multiplier extraction function
antidep_ukb_infer = multiplier_extract(rx_df = antidep_ukb_infer,
                         multipliers =  c("pack", "pack of" , "\\*"),
                         alt_multipliers = c("x"),
                         info_col = c("quantity"),
                         multiplier_colname = "multiplier", qc_remove = TRUE)


Note: We set quality control parameters in quantity_extract() and multiplier_extract() to produce a cleaned prescription dataframe. Please refer to the instructions here for details.

Inspecting problematic prescriptions, with missing quantity (first 6 rows)
Prescriptions with missing quantity information (first 6 rows)
drug_name quantity chem_name func_class strength_unit strength dosage_form form_quantity multiplier
4 Venlafaxine 37.5mg tablets 1 - Pack of 56 venlafaxine SNRI mg 37.5 tab NA 1
9 Duloxetine 30mg gastro-resistant capsules 60.000 duloxetine SNRI mg 30 cap NA 1
10 AMITRIPTYLINE 25mg tablets 60.000 amitriptyline tricyclic_antidepressant mg 25 tab NA 1
11 Mirtazapine Orodispersible TABS 30MG 90.000 mirtazapine other mg 30 tab NA 1
13 Duloxetine Gastro Resistant CAPS 30MG 14.000 duloxetine SNRI mg 30 cap NA 1
19 DOTHIEPIN TABS 75MG 21.000 dosulepin tricyclic_antidepressant mg 75 tab NA 1



Instructions

Using the above half-cleaned dataframe, run the multi_num_infer() function:

antidep_ukb_infer_post = multi_num_infer(rx_df = antidep_ukb_infer,
                     info_col = "quantity", strength_colname = "strength",
                     strength_unit_colname = "strength_unit",
                     dose_form_colname = "dosage_form",
                     form_quant_colname = "form_quantity")
Description on arguments
| Argument | Details |
| --- | --- |
| rx_df | Prescription dataframe |
| info_col | The column name(s) in ‘rx_df’ to search for integers |
| strength_colname | The column name in ‘rx_df’ containing strength information |
| strength_unit_colname | The column name in ‘rx_df’ containing strength unit information |
| dose_form_colname | The column name in ‘rx_df’ containing dosage form information |
| form_quant_colname | The column name in ‘rx_df’ containing form quantity information |



Output

The multi_num_infer() function infers quantity information for many prescriptions where quantity_extract() could not extract it.

Run the lines below to compare the number of missing quantities before and after inference.

# Pre-inference
sum(is.na(antidep_ukb_infer$form_quantity)) # 2380
# Post-inference
sum(is.na(antidep_ukb_infer_post$form_quantity)) # 70

Of the 2380 prescriptions without extracted quantity information, only 70 remain without an inferred quantity.


Checking problematic prescriptions

Run the lines below to check the problematic prescriptions for which no inference could be made.

Code
antidep_ukb_infer_na = antidep_ukb_infer_post[is.na(antidep_ukb_infer_post$form_quantity),]
Prescriptions with missing information on quantity after inference
Prescriptions with missing quantity information
drug_name quantity chem_name func_class strength_unit strength dosage_form form_quantity multiplier
58 Trimipramine 25mg tablets trimipramine tricyclic_antidepressant mg 25 tab NA 1
79 Amitriptyline 10mg tablets tablet(s) Tablets amitriptyline tricyclic_antidepressant mg 10 tab NA 1
122 DOTHIEPIN TABS 75 MG dosulepin tricyclic_antidepressant mg 75 tab NA 1
142 Prozac 20mg capsules (Eli Lilly and Company Ltd) 2 - Pack of 28 (2 X 14) fluoxetine SSRI mg 20 cap NA 2
159 Fluanxol 1mg tablets (Lundbeck Ltd) tablet(s) flupentixol phenothiazine_antipsychotic mg 1 tab NA 1
193 Citalopram 20mg tablets tablet(s) Tablets citalopram SSRI mg 20 tab NA 1
213 Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) 2 - Pack of 30 (2 X 15) paroxetine SSRI mg 20 tab NA 2
311 Escitalopram 10mg tablets See Dosage For Quantity escitalopram SSRI mg 10 tab NA 1
531 Sertraline 100mg tablets tablet(s) sertraline SSRI mg 100 tab NA 1
562 Triptafen tablets (AMCo) tablet(s) amitriptyline tricyclic_antidepressant_typical_antipsychotic mg 10 tab NA 1
563 Triptafen tablets (AMCo) tablet(s) perphenazine tricyclic_antidepressant_typical_antipsychotic mg 2 tab NA 1
719 Mirtazapine 30mg tablets -1 - 30 mg mirtazapine other mg 30 tab NA 1
802 Mirtazapine 30mg tablets See Dosage For Quantity mirtazapine other mg 30 tab NA 1
805 Dosulepin 25mg capsules (Almus Pharmaceuticals Ltd) See Dosage For Quantity dosulepin tricyclic_antidepressant mg 25 cap NA 1
1082 Fluoxetine 20mg capsules 3 - Pack of 30 (2 X 15) fluoxetine SSRI mg 20 cap NA 3
1245 Citalopram tablets 20mg citalopram SSRI mg 20 tab NA 1
1353 Sertraline 100mg tablets 1 op - 100 mg sertraline SSRI mg 100 tab NA 1
1619 Prozac 20mg capsules (Eli Lilly and Company Ltd) capsule(s) Capsules fluoxetine SSRI mg 20 cap NA 1
1660 Motipress tablets (Sanofi-Synthelabo Ltd) tablet(s) nortriptyline tricyclic_antidepressant_typical_antipsychotic mg 10 tab NA 1
1661 Motipress tablets (Sanofi-Synthelabo Ltd) tablet(s) fluphenazine tricyclic_antidepressant_typical_antipsychotic microgram 500 tab NA 1
1688 Amitriptyline 10mg tablets amitriptyline tricyclic_antidepressant mg 10 tab NA 1
1741 Dosulepin 75mg tablets 1 months supply - 75 mg dosulepin tricyclic_antidepressant mg 75 tab NA 1
1926 Mianserin 10mg tablets mianserin tetracyclic_antidepressant mg 10 tab NA 1
1996 Fluoxetine 20mg capsules 1 month - 20 mg fluoxetine SSRI mg 20 cap NA 1
2026 Molipaxin 150mg tablets (Zentiva) trazodone SARI mg 150 tab NA 1
2092 Amitriptyline 25mg tablets one month amitriptyline tricyclic_antidepressant mg 25 tab NA 1
2166 Citalopram 10mg tablets 1 28x3 citalopram SSRI mg 10 tab NA 28
2203 Lustral 100mg tablets (Pfizer Ltd) See Dosage For Quantity sertraline SSRI mg 100 tab NA 1
2254 Citalopram 20mg tablets citalopram SSRI mg 20 tab NA 1
2358 Mianserin Hydrochloride Tablets 30 mg mianserin tetracyclic_antidepressant mg 30 tab NA 1
2458 Cipramil 20mg tablets (Lundbeck Ltd) 4/52 citalopram SSRI mg 20 tab NA 1
2470 Prothiaden 25mg capsules (Teofarma) 4/52 dosulepin tricyclic_antidepressant mg 25 cap NA 1
2473 CLOMIPRAMINE capsules 50mg [SANDOZ] clomipramine tricyclic_antidepressant mg 50 cap NA 1
2533 Venlafaxine 37.5mg tablets See Dosage For Quantity venlafaxine SNRI mg 37.5 tab NA 1
2612 Lofepramine 70mg tablets See Dosage For Quantity lofepramine tricyclic_antidepressant mg 70 tab NA 1
2643 Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) tablet(s) Tablets paroxetine SSRI mg 20 tab NA 1
2670 Surmontil 50mg capsules (Sanofi) 1 o.p. - 50 mg trimipramine tricyclic_antidepressant mg 50 cap NA 1
2693 MARPLAN TAB 10mg isocarboxazid MAOI mg 10 tab NA 1
2894 Amitriptyline 25mg tablets EIGHTY FOUR amitriptyline tricyclic_antidepressant mg 25 tab NA 1
3071 Maprotiline 25mg tablets See Dosage For Quantity maprotiline tetracyclic_antidepressant mg 25 tab NA 1
3084 Oxactin 20mg capsules (Discovery Pharmaceuticals) 1 30 x 2 fluoxetine SSRI mg 20 cap NA 30
3108 Lustral 50mg tablets (Pfizer Ltd) tablets sertraline SSRI mg 50 tab NA 1
3155 Amitriptyline 25mg tablets 1 o.p - 25 mg amitriptyline tricyclic_antidepressant mg 25 tab NA 1
3167 Fluoxetine 20mg capsules 3 Pack of 30 (2 X 15) fluoxetine SSRI mg 20 cap NA 3
3223 Maprotiline 50mg tablets See Dosage For Quantity maprotiline tetracyclic_antidepressant mg 50 tab NA 1
3251 Dosulepin 75mg tablets tablet(s) Tablets dosulepin tricyclic_antidepressant mg 75 tab NA 1
3257 Lofepramine 70mg tablets 2 /52 lofepramine tricyclic_antidepressant mg 70 tab NA 1
3266 Citalopram 20mg tablets -1 - 20 mg citalopram SSRI mg 20 tab NA 1
3319 Reboxetine 4mg tablets 1 pack - 4 mg reboxetine NRI mg 4 tab NA 1
3378 Prothiaden 25mg capsules (Teofarma) dosulepin tricyclic_antidepressant mg 25 cap NA 1
3479 Manerix 150mg tablets (Meda Pharmaceuticals Ltd) tablets moclobemide MAOI mg 150 tab NA 1
3648 Lentizol 25mg modified-release capsules (Pfizer Ltd) amitriptyline tricyclic_antidepressant mg 25 cap NA 1
3651 Dosulepin 75mg tablets tablet(s) dosulepin tricyclic_antidepressant mg 75 tab NA 1
3654 Paroxetine 20mg tablets paroxetine SSRI mg 20 tab NA 1
3696 Paroxetine 30mg tablets paroxetine SSRI mg 30 tab NA 1
3738 Nortriptyline 10mg tablets nortriptyline tricyclic_antidepressant mg 10 tab NA 1
3789 Fluvoxamine 50mg tablets fluvoxamine SSRI mg 50 tab NA 1
3840 Mianserin Hydrochloride Tablets 10 mg mianserin tetracyclic_antidepressant mg 10 tab NA 1
3889 Seroxat 30mg tablets (GlaxoSmithKline UK Ltd) See Dosage For Quantity paroxetine SSRI mg 30 tab NA 1
4003 Venlafaxine 150mg modified-release capsules venlafaxine SNRI mg 150 cap NA 1
4348 Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) 1 months supply - 20 mg paroxetine SSRI mg 20 tab NA 1
4505 Prozac 20mg capsules (Eli Lilly and Company Ltd) fluoxetine SSRI mg 20 cap NA 1
4600 Citalopram 40mg tablets 4/52 citalopram SSRI mg 40 tab NA 1
4617 Lustral 50mg tablets (Pfizer Ltd) tablet(s) sertraline SSRI mg 50 tab NA 1
4646 Lofepramine 70mg tablets 2 Pack of 56 (4 X 14) lofepramine tricyclic_antidepressant mg 70 tab NA 2
4879 Paroxetine 20mg tablets 2 - Pack of 30 (2 X 15) paroxetine SSRI mg 20 tab NA 2
4895 SERTRALINE tabs 50mg sertraline SSRI mg 50 tab NA 1
4922 Lustral 50mg tablets (Pfizer Ltd) 1 Pack of 28 (2 X 14) sertraline SSRI mg 50 tab NA 1
4992 PROTHIADEN tablets 75mg dosulepin tricyclic_antidepressant mg 75 tab NA 1
4995 Mirtazapine 15mg tablets mirtazapine other mg 15 tab NA 1


The remaining prescriptions without quantity information primarily consist of:

  1. Genuine missingness in quantity
  2. Quantity summarised as duration (e.g.: 4/52)
  3. Conflicting numbers (e.g.: 2 Pack of 56 (4 X 14))

It is recommended to handle these prescriptions manually, or use some of the downstream functions (such as duration_handling(), quantity_impute()).



Optional Arguments

If users wish to remove the prescriptions without quantity information inferred, change ambig_filter to TRUE.

antidep_ukb_infer_post = multi_num_infer(rx_df = antidep_ukb_infer,
                     info_col = "quantity", strength_colname = "strength",
                     strength_unit_colname = "strength_unit",
                     dose_form_colname = "dosage_form",
                     form_quant_colname = "form_quantity",
                     ambig_filter = TRUE)

If the ambig_filter argument is set to TRUE, prescriptions without quantity information inferred would be removed.



duration_handling()


Logic

The duration_handling() function infers quantity information from duration-based strings.

Please use it with caution in other settings, as this function is currently not validated for prescription records outside of the UKB.

Prescriptions are sometimes written as duration-based strings, such as the following:

| Duration | Strings |
| --- | --- |
| months | /12, month, months |
| weeks | /52, week, weeks |
| days | /365, day, days |


The duration_handling() function handles these strings and converts them into quantities.
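As a rough illustration of this conversion, the base R sketch below turns a duration-based string into a quantity. The helper duration_to_quantity() is hypothetical and is not the code of duration_handling(). The factors of 30 per month and 7 per week are consistent with the output shown later in this section; the factor of 1 per day is an assumption, and all three imply one dose unit per day.

```r
# Hypothetical helper: convert a duration-based string into a quantity,
# assuming 30 units per month, 7 per week and 1 per day.
duration_to_quantity <- function(x) {
  units <- list(
    list(pattern = "([0-9]+)[[:space:]]*(/12|months?)", per_unit = 30),
    list(pattern = "([0-9]+)[[:space:]]*(/52|weeks?)",  per_unit = 7),
    list(pattern = "([0-9]+)[[:space:]]*(/365|days?)",  per_unit = 1)
  )
  for (u in units) {
    # m[1] is the full match, m[2] the captured number of months/weeks/days.
    m <- regmatches(x, regexec(u$pattern, x, ignore.case = TRUE))[[1]]
    if (length(m) > 0) return(as.numeric(m[2]) * u$per_unit)
  }
  NA_real_
}

d1 <- duration_to_quantity("4/52")                     # four weeks
d2 <- duration_to_quantity("2 /52")                    # two weeks
d3 <- duration_to_quantity("1 months supply - 75 mg")  # one month
d4 <- duration_to_quantity("tablet(s)")                # no duration string
```

A string that matches none of the patterns returns NA, so such prescriptions remain flagged for manual handling.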

Figure 4: The duration_handling() function



Initialization

It is recommended to run duration_handling() function when the users are certain that there are no other combinations of strings outside of the table of duration-based strings listed above.

We use antidep_ukb_infer_post, in which dosage and quantity information has been inferred for most prescriptions, and its subset antidep_ukb_infer_na as starting points.

Please refer to multi_num_infer() sections for details.

antidep_ukb_infer_na = antidep_ukb_infer_post[is.na(antidep_ukb_infer_post$form_quantity),]



Instructions

For simplicity, run the duration_handling() function on the antidep_ukb_infer_na dataframe, which contains the prescriptions whose quantity could not be inferred by multi_num_infer().

antidep_ukb_infer_na = duration_handling(rx_df = antidep_ukb_infer_na,
                       info_col = c("quantity"), form_quant_colname = c("form_quantity"))


The function looks for duration-based strings in the columns listed in the info_col argument, but only for prescriptions without pre-existing quantity information (in the column specified by the form_quant_colname argument).

Description on arguments
| Argument | Details |
| --- | --- |
| rx_df | Prescription dataframe |
| month_string | A character string or vector of strings representing months |
| week_string | A character string or vector of strings representing weeks |
| day_string | A character string or vector of strings representing days |
| info_col | The column names in ‘rx_df’ to search for duration-based strings |
| form_quant_colname | The column name in ‘rx_df’ where form quantities are stored |

Note: We have configured default strings for duration-based strings from the above table. If users wish to change the strings for search, change the month_string, week_string or day_string arguments.


Output (only showing prescriptions with duration-based strings)
Prescriptions with duration-based strings
drug_name quantity chem_name func_class strength_unit strength dosage_form form_quantity multiplier
22 Dosulepin 75mg tablets 1 months supply - 75 mg dosulepin tricyclic_antidepressant mg 75 tab 30 1
24 Fluoxetine 20mg capsules 1 month - 20 mg fluoxetine SSRI mg 20 cap 30 1
31 Cipramil 20mg tablets (Lundbeck Ltd) 4/52 citalopram SSRI mg 20 tab 28 1
32 Prothiaden 25mg capsules (Teofarma) 4/52 dosulepin tricyclic_antidepressant mg 25 cap 28 1
47 Lofepramine 70mg tablets 2 /52 lofepramine tricyclic_antidepressant mg 70 tab 14 1
61 Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) 1 months supply - 20 mg paroxetine SSRI mg 20 tab 30 1
63 Citalopram 40mg tablets 4/52 citalopram SSRI mg 40 tab 28 1



Troubleshooting

Please post questions as an issue on the T-Rx GitHub repo here.

The T-Rx package is currently under beta testing. Most functions should have adequate documentation on possible errors.

Please reach out to Chris Lo for feedback on documentation.