Introduction


Logic for extraction

The algorithm uses regular expression (regex) patterns to extract strength / dosage and quantity information from prescription records.

In this module, users are required to specify:

  1. dosage units (e.g.: “mg”, “mcg”, “mg/5ml”)
  2. dosage forms that capture units of quantity (e.g.: “tab”, “cap”, “suspension”)

Using this information, the numbers preceding the user-specified dosage units and dosage forms are extracted as dose and quantity, respectively.
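
The idea can be sketched in a few lines of base R. This is only an illustration of the "number preceding a keyword" logic under simple assumptions (first match only, case-insensitive matching, keywords treated as regex fragments). The helper name number_before() is hypothetical and is not part of T-Rx.

```r
# Hypothetical helper (not a T-Rx function): return the first number that
# directly precedes one of the supplied keywords, plus the keyword matched.
number_before <- function(text, keywords) {
  # Try longer keywords first so that "mg/5ml" is preferred over "mg"
  keywords <- keywords[order(nchar(keywords), decreasing = TRUE)]
  pattern <- paste0("([0-9]+(?:\\.[0-9]+)?)\\s*(",
                    paste(keywords, collapse = "|"), ")")
  hit <- regmatches(text, regexec(pattern, text,
                                  ignore.case = TRUE, perl = TRUE))[[1]]
  if (length(hit) == 0) {
    return(c(number = NA_character_, keyword = NA_character_))
  }
  c(number = hit[2], keyword = tolower(hit[3]))
}

# Dose: number preceding a dosage unit in the drug name
number_before("Seroxat 20mg tablets (GlaxoSmithKline UK Ltd)",
              c("mg", "mcg", "mg/5ml"))
# Quantity: number preceding a dosage form in the quantity text
number_before("30 tab - 20 mg", c("tab", "cap"))
```

The first call returns the pair 20 / mg and the second returns 30 / tab; a string with no matching keyword yields NA for both elements.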


Sample dataframe

In this tutorial, we provide a sample dataframe of antidepressant prescriptions (5000 rows, 4 columns) whose structure mirrors that of the UK Biobank (UKB) primary care records. The prescription strings can be mapped using publicly available READv2 / BNF / dm+d codes, with details described at https://github.com/chiarafabbri/MDD_TRD_study.

Column name descriptions

| Column | Details | Note |
| --- | --- | --- |
| drug_name | Drug name (from raw UKB primary care records) | Please refer to UKB documentation |
| quantity | Quantity issued (from raw UKB primary care records) | Please refer to UKB documentation |
| chem_name | Drug name of active ingredient | Extracted from https://github.com/chiarafabbri/MDD_TRD_study |
| func_class | Drug class of active ingredient | Extracted from https://github.com/chiarafabbri/MDD_TRD_study |

Inspect prescription sample

Sample dataframe demonstration (first 6 rows)

| drug_name | quantity | chem_name | func_class |
| --- | --- | --- | --- |
| Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) | 30 tab - 20 mg | paroxetine | SSRI |
| Dutonin 100mg tablets (Bristol-Myers Squibb Pharmaceuticals Ltd) | 84 tabs | nefazodone | SARI |
| Paroxetine 10mg/5ml oral suspension sugar free | 60 ml(s) - 10 mg/5 ml | paroxetine | SSRI |
| Venlafaxine 37.5mg tablets | 1 - Pack of 56 | venlafaxine | SNRI |
| Clomipramine 25mg capsules | 140 capsules | clomipramine | tricyclic_antidepressant |
| Dosulepin 25mg capsules | 39 capsule(s) | dosulepin | tricyclic_antidepressant |


For details of UKB primary care records, please refer to the UKB documentation here.



Extraction of strength / dosage

Initialization

The sample prescriptions are included in T-Rx as the data object antidep_ukb_rx.

Load T-Rx and inspect the data object in R.

# load package
library(TRX)

# inspect sample data object (antidepressant prescriptions)
dim(antidep_ukb_rx) 
# [1] 5000    4



strength_extract()

Specifying strength / dose units

The strength_extract() function requires the user to specify the strings used for strength / dosage units of solid and liquid dosage forms.

Note: Common examples of dosage forms:

  • Solid: tablets, capsules, pills, granules
  • Liquid: solutions, suspensions, mixtures, elixirs, syrups, drops

For each of these dosage forms, users need to identify the strings of dosing units commonly used in their data.

The choice of strength / dosage units should be based on the clinical context and the data source of interest.

Taking antidepressants as an example, many are available as:

  • Solid: mg, mcg, milligram, microgram
  • Liquid (as concentrations): mg/5ml, mg/1ml, mg/ml

These strings are specified under the liquid_strength_unit and solid_strength_unit arguments in the strength_extract() function.
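
One practical point when supplying both solid and liquid units: "mg" is a prefix of "mg/5ml", so the order in which a regex tries its alternatives matters. The base R sketch below illustrates this general regex behaviour (with perl = TRUE, alternatives are tried left to right). It makes no claim about how strength_extract() orders its patterns internally, and first_unit() is a hypothetical helper.

```r
drug <- "Paroxetine 10mg/5ml oral suspension sugar free"

# Return the unit captured by a "number + unit" pattern built from `units`
first_unit <- function(text, units) {
  pattern <- paste0("[0-9.]+\\s*(", paste(units, collapse = "|"), ")")
  regmatches(text, regexec(pattern, text, perl = TRUE))[[1]][2]
}

first_unit(drug, c("mg", "mg/5ml"))   # "mg": the shorter alternative wins
first_unit(drug, c("mg/5ml", "mg"))   # "mg/5ml": the concentration is kept
```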



A simplified example

As an illustration, we created a hypothetical sample dataframe (with a structure similar to the UKB primary care records) containing 5 prescriptions, shown below.

Hypothetical prescription sample

| drug_name | quantity |
| --- | --- |
| Venlafaxine 75mg modified-release capsules | 28 capsules - 75 mg |
| Motival 10mg/500microgram capsules (Sanofi) | 1*28 capsule(s) |
| TRIPTAFEN tabs 2mg + 25mg | 2 packs of 28 tablet(s) |
| LOFEPRAMINE sf susp 70mg/5ml | 100ml |
| Amitriptyline 10mg tablets (Wockhardt UK Ltd) | 28 tablets |

Code chunk to create sample dataframe
df <- data.frame(drug_name = c("Venlafaxine 75mg modified-release capsules","Motival 10mg/500microgram capsules (Sanofi)","TRIPTAFEN tabs 2mg + 25mg","LOFEPRAMINE sf susp 70mg/5ml","Amitriptyline 10mg tablets (Wockhardt UK Ltd)"),
                 quantity = c("28 capsules - 75 mg","1*28 capsule(s)","2 packs of 28 tablet(s)","100ml","28 tablets"))


Using this sample dataframe, run the strength_extract() function using the following arguments:

# Extract single strength information
single_df = strength_extract(rx_df = df, liquid_strength_unit = c("mg/5ml"), 
                                  solid_strength_unit = c("mg", "mcg", "microgram"),
                                  info_col = c("drug_name"))
Explanation of arguments

| Argument | Details |
| --- | --- |
| rx_df | The prescription dataframe (df) |
| liquid_strength_unit | Possible strength units of liquid dosage forms |
| solid_strength_unit | Possible strength units of solid dosage forms |
| info_col | Names of the columns from which strength and strength units are extracted (drug_name) |


Afterwards, check the outputs in R / RStudio:

Log file (as messages)

Extracted prescription dataframe

| drug_name | quantity | strength_unit | strength |
| --- | --- | --- | --- |
| Venlafaxine 75mg modified-release capsules | 28 capsules - 75 mg | mg | 75 |
| Motival 10mg/500microgram capsules (Sanofi) | 1*28 capsule(s) | mg | 10 |
| TRIPTAFEN tabs 2mg + 25mg | 2 packs of 28 tablet(s) | mg | 2 |
| LOFEPRAMINE sf susp 70mg/5ml | 100ml | mg/5ml | 70 |
| Amitriptyline 10mg tablets (Wockhardt UK Ltd) | 28 tablets | mg | 10 |


Running strength_extract() creates two new columns in the prescription dataframe:

  1. strength_unit: Strength units extracted using the liquid_strength_unit / solid_strength_unit arguments specified by the user.
  2. strength: Numbers preceding strength_unit.

Note: strength_extract() does not handle combination products by default.

Therefore, only the first ingredient in TRIPTAFEN tabs 2mg + 25mg is extracted, as 2 mg. To extract every ingredient, specify the maximum number of ingredients expected with combined_strength and the corresponding unit strings with combined_strength_unit.



UKB prescriptions with combination products

In practice, prescription data often contain combination products with multiple ingredients.

T-Rx also offers functionality to handle these products, assuming that + or / is used as the separator between multiple strengths / strength units, as in 2mg + 25mg and 2mg/25mg.

To extract all strengths in combination products, add these two arguments:

| Argument | Details |
| --- | --- |
| combined_strength | Maximum number of ingredients expected in combination products |
| combined_strength_unit | The strings of strength units used to extract strengths in combination products |

Note: It is recommended to use only solid dosage form units (e.g.: mg, mcg) to avoid confusion with liquid dosage forms. In combination products, strengths are rarely expressed as concentrations, even for liquid dosage forms.
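
To make the expected result concrete, here is a minimal base R sketch that collects up to a maximum number of "number + unit" pairs from a drug name and joins them with commas, giving values such as mg,mg and 2,25. The function combined_sketch() and its max_n argument are hypothetical stand-ins for illustration, and the sketch does not check which separator sits between the strengths.

```r
# Hypothetical sketch (not the T-Rx implementation)
combined_sketch <- function(text, units, max_n = 2) {
  # Longer unit strings first, so "microgram" is tried before "mg"
  units <- units[order(nchar(units), decreasing = TRUE)]
  pattern <- paste0("[0-9]+(?:\\.[0-9]+)?\\s*(?:",
                    paste(units, collapse = "|"), ")")
  # All "number + unit" hits, truncated to the expected number of ingredients
  hits <- regmatches(text, gregexpr(pattern, text,
                                    ignore.case = TRUE, perl = TRUE))[[1]]
  hits <- hits[seq_len(min(length(hits), max_n))]
  strength <- sub("^([0-9.]+).*$", "\\1", hits)
  unit <- tolower(sub("^[0-9.]+[[:space:]]*", "", hits))
  c(strength_unit = paste(unit, collapse = ","),
    strength = paste(strength, collapse = ","))
}

solid_units <- c("mg", "mcg", "microgram")
combined_sketch("TRIPTAFEN tabs 2mg + 25mg", solid_units)
combined_sketch("Motival 10mg/500microgram capsules (Sanofi)", solid_units)
```

With max_n = 1 the same sketch keeps only the first ingredient, which corresponds to the single-strength behaviour described earlier.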


Run the strength_extract() function again, this time on the UKB prescription sample (with multiple ingredients expected).

# Re-run strength extraction with the multi-ingredient options added
antidep_ukb_rx = strength_extract(rx_df = antidep_ukb_rx, liquid_strength_unit = c("mg/5ml"),
                            solid_strength_unit = c("mg", "mcg", "microgram"),
                            info_col = c("drug_name"), 
                            combined_strength = 2,
                            combined_strength_unit = c("mg", "mcg", "microgram", "miligram"))
Inspecting the output (First 100 rows)
drug_name quantity chem_name func_class strength_unit strength
Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) 30 tab - 20 mg paroxetine SSRI mg 20
Dutonin 100mg tablets (Bristol-Myers Squibb Pharmaceuticals Ltd) 84 tabs nefazodone SARI mg 100
Paroxetine 10mg/5ml oral suspension sugar free 60 ml(s) - 10 mg/5 ml paroxetine SSRI mg/5ml 10
Venlafaxine 37.5mg tablets 1 - Pack of 56 venlafaxine SNRI mg 37.5
Clomipramine 25mg capsules 140 capsules clomipramine tricyclic_antidepressant mg 25
Dosulepin 25mg capsules 39 capsule(s) dosulepin tricyclic_antidepressant mg 25
Efexor 37.5mg tablets (Wyeth Pharmaceuticals) 14 tablets - 37.5 mg venlafaxine SNRI mg 37.5
Fluvoxamine 50mg tablets 30 tab fluvoxamine SSRI mg 50
Duloxetine 30mg gastro-resistant capsules 60.000 duloxetine SNRI mg 30
AMITRIPTYLINE 25mg tablets 60.000 amitriptyline tricyclic_antidepressant mg 25
Mirtazapine Orodispersible TABS 30MG 90.000 mirtazapine other mg 30
Sertraline 100mg tablets 10 tablets sertraline SSRI mg 100
Duloxetine Gastro Resistant CAPS 30MG 14.000 duloxetine SNRI mg 30
Yentreve 20mg gastro-resistant capsules (Eli Lilly and Company Ltd) 56 caps duloxetine SNRI mg 20
Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) 120 tablet(s) - 20 mg paroxetine SSRI mg 20
Amitriptyline 25mg tablets 74 tablet amitriptyline tricyclic_antidepressant mg 25
Duloxetine 30mg gastro-resistant capsules 30 capsule - 30 mg duloxetine SNRI mg 30
Nortriptyline 10mg tablets 14 tablet - 10 mg nortriptyline tricyclic_antidepressant mg 10
DOTHIEPIN TABS 75MG 21.000 dosulepin tricyclic_antidepressant mg 75
Zispin SolTab 45mg orodispersible tablets (Merck Sharp & Dohme Ltd) 60 tablet mirtazapine other mg 45
Paroxetine 20mg tablets 1 x 90 days paroxetine SSRI mg 20
Lofepramine 70mg tablets 84 tablet(s) - 70 mg lofepramine tricyclic_antidepressant mg 70
Citalopram 10mg tablets 28.000 citalopram SSRI mg 10
Fluoxetine 20mg capsules (Teva UK Ltd) 3 packs of 30 capsule(s) fluoxetine SSRI mg 20
ESCITALOPRAM TABLETS 20MG 56.000 escitalopram SSRI mg 20
Fluoxetine CAPS 20MG 1000.000 fluoxetine SSRI mg 20
Venlafaxine 37.5mg tablets 2*56 tablets venlafaxine SNRI mg 37.5
Zispin SolTab 45mg orodispersible tablets (Merck Sharp & Dohme Ltd) 30 tab mirtazapine other mg 45
Trazodone 150mg tablets 14 tablet trazodone SARI mg 150
Amitriptyline 25mg tablets 21 TABLET amitriptyline tricyclic_antidepressant mg 25
Ludiomil 75mg tablets (Novartis Pharmaceuticals UK Ltd) 28 tabs maprotiline tetracyclic_antidepressant mg 75
Fluoxetine 20mg capsules 70 capsule - 20 mg fluoxetine SSRI mg 20
Clomipramine 50mg capsules 150 capsules clomipramine tricyclic_antidepressant mg 50
Venlafaxine 75mg modified-release capsules 28 capsule(s) - 75 mg venlafaxine SNRI mg 75
Dosulepin 25mg capsules 252 capsule(s) - 25 mg dosulepin tricyclic_antidepressant mg 25
Trimipramine 25mg tablets 28 - tablet(s) trimipramine tricyclic_antidepressant mg 25
Sertraline 50mg tablets (Sandoz Ltd) 2 packs of 28 tablet(s) sertraline SSRI mg 50
Clomipramine 10mg capsules 300 capsules clomipramine tricyclic_antidepressant mg 10
Clomipramine 10mg capsules 224 capsule clomipramine tricyclic_antidepressant mg 10
Venlafaxine 225mg modified-release tablets 56.000 venlafaxine SNRI mg 225
Venlafaxine 37.5mg tablets 1 venlafaxine SNRI mg 37.5
Gamanil 70mg tablets (Merck Serono Ltd) 28 tablets - 70 mg lofepramine tricyclic_antidepressant mg 70
CLOMIPRAMINE caps 25mg 14.000 clomipramine tricyclic_antidepressant mg 25
DOTHIEPIN tabs 75mg 30.000 dosulepin tricyclic_antidepressant mg 75
Prothiaden 25mg capsules (Teofarma) 30 days - 25 mg dosulepin tricyclic_antidepressant mg 25
Amitriptyline 10mg tablets 28 tablets - 10 mg amitriptyline tricyclic_antidepressant mg 10
Imipramine 25mg tablets 30 tabs imipramine tricyclic_antidepressant mg 25
FLUPENTIXOL TABLETS 1MG 120.000 flupentixol phenothiazine_antipsychotic mg 1
Paroxetine 10mg tablets 30 paroxetine SSRI mg 10
Lustral 50mg tablets (Pfizer Ltd) 28 tablet - 50 mg sertraline SSRI mg 50
VENLAFAXINE MR CAPSULES 75MG 30.000 venlafaxine SNRI mg 75
Mirtazapine 30mg tablets 7 - tablet(s) mirtazapine other mg 30
LOFEPRAMINE TAB 70mg 6.000 lofepramine tricyclic_antidepressant mg 70
Imipramine 25mg tablets 112 tablets imipramine tricyclic_antidepressant mg 25
Sertraline 50mg tablets (Teva UK Ltd) 2 packs of 28 tablet(s) sertraline SSRI mg 50
Nortriptyline TABS 25MG 224.000 nortriptyline tricyclic_antidepressant mg 25
Escitalopram 5mg tablets 28 escitalopram SSRI mg 5
Trimipramine 25mg tablets trimipramine tricyclic_antidepressant mg 25
NORTRIPTYLINE TABLETS 10MG 200.000 nortriptyline tricyclic_antidepressant mg 10
MIRTAZAPINE TABLETS 30MG 84.000 mirtazapine other mg 30
Nefazodone Hydrochloride Tablets 100 mg 56.000 nefazodone SARI mg 100
AMITRIPTYLINE HYDROCHLORIDE tablets 50mg 28.000 amitriptyline tricyclic_antidepressant mg 50
Amitriptyline 10mg tablets 12 tab amitriptyline tricyclic_antidepressant mg 10
Mirtazapine Orodispersible tablets 15mg 30.000 mirtazapine other mg 15
Prothiaden 25mg capsules (Teofarma) 56 capsule dosulepin tricyclic_antidepressant mg 25
Amitriptyline 10mg/5ml oral solution sugar free 300 millilitres amitriptyline tricyclic_antidepressant mg/5ml 10
Mirtazapine 15mg tablets 30 tablet - 15 mg mirtazapine other mg 15
Nortriptyline 10mg tablets 1 x 28 nortriptyline tricyclic_antidepressant mg 10
Fluoxetine 20mg capsules 120 - capsule fluoxetine SSRI mg 20
DOTHIEPIN HCL TAB 75MG 56.000 dosulepin tricyclic_antidepressant mg 75
Prozac Capsules 20 mg 60.000 fluoxetine SSRI mg 20
TRIPTAFEN M tablets 2mg + 10mg [AMCO] 30 amitriptyline_perphenazine tricyclic_antidepressant_typical_antipsychotic mg,mg 2,10
Surmontil 50mg capsules (Sanofi) 60 capsules trimipramine tricyclic_antidepressant mg 50
Amitriptyline 10mg tablets 80 tablets - 10 mg amitriptyline tricyclic_antidepressant mg 10
Dosulepin 75mg tablets 28 - tablets dosulepin tricyclic_antidepressant mg 75
Fluoxetine 20mg capsules 6 capsule fluoxetine SSRI mg 20
Escitalopram Tablets 20 mg 28.000 escitalopram SSRI mg 20
Amitriptyline 10mg tablets tablet(s) Tablets amitriptyline tricyclic_antidepressant mg 10
Paroxetine 20mg tablets 90 days tablet(s) - 20 mg paroxetine SSRI mg 20
Zispin 30mg tablets (Organon Laboratories Ltd) 14 tablets - 30 mg mirtazapine other mg 30
Amitriptyline 25mg tablets 56.00 amitriptyline tricyclic_antidepressant mg 25
Venlafaxine Hydrochloride M/R capsules 150 mg 28.000 venlafaxine SNRI mg 150
Amitriptyline 25mg tablets 200 tablets amitriptyline tricyclic_antidepressant mg 25
Dosulepin 25mg capsules 210 capsule(s) - 25 mg dosulepin tricyclic_antidepressant mg 25
Paroxetine 20mg tablets 28 tablet(s) Temp. paroxetine SSRI mg 20
DOTHIEPIN TABLETS 75 MG 30.000 dosulepin tricyclic_antidepressant mg 75
Nortriptyline 10mg tablets 28 tablet - 10 mg nortriptyline tricyclic_antidepressant mg 10
Fluoxetine 20mg/5ml oral solution 280 millilitres fluoxetine SSRI mg/5ml 20
Escitalopram 20mg tablets 56 tab escitalopram SSRI mg 20
Nortriptyline 10mg tablets 180.000 nortriptyline tricyclic_antidepressant mg 10
Citalopram 10mg tablets (A A H Pharmaceuticals Ltd) 1 pack of 28 tablet(s) citalopram SSRI mg 10
Citalopram TABS 40MG 30.000 citalopram SSRI mg 40
Venlafaxine 150mg modified-release capsules 56 tablets venlafaxine SNRI mg 150
Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) 2*30 tablet - 20 mg paroxetine SSRI mg 20
Amitriptyline Hydrochloride Oral solution 10 mg/5 ml 300.000 amitriptyline tricyclic_antidepressant mg/5 ml 10
Amitriptyline 25mg tablets 12 amitriptyline tricyclic_antidepressant mg 25
Paroxetine 20mg tablets 14 tablet(s) paroxetine SSRI mg 20
Citalopram 40mg tablets 56 tablet - 40 mg citalopram SSRI mg 40
PAROXETINE tabs 10mg 48.000 paroxetine SSRI mg 10
FLUPENTIXOL tabs 500 micrograms 60.000 flupentixol phenothiazine_antipsychotic microgram 500

Rows of antidep_ukb_rx containing combination products

Users can also expand combination products into the correct dosage of each individual ingredient using the strength_impute() function, with documentation listed here.

| Row | drug_name | quantity | chem_name | func_class | strength_unit | strength |
| --- | --- | --- | --- | --- | --- | --- |
| 72 | TRIPTAFEN M tablets 2mg + 10mg [AMCO] | 30 | amitriptyline_perphenazine | tricyclic_antidepressant_typical_antipsychotic | mg,mg | 2,10 |
| 122 | Nortriptyline 10mg / Fluphenazine 500microgram tablets | 100 | nortriptyline | tricyclic_antidepressant | mg,microgram | 10,500 |
| 327 | Motival 10mg/500microgram tablets (Sanofi) | 3 op | nortriptyline_fluphenazine | tricyclic_antidepressant_typical_antipsychotic | mg,microgram | 10,500 |
| 403 | perphenazine with amitriptyline tablets 2mg + 25mg | 28 - tablet(s) | amitriptyline_perphenazine | tricyclic_antidepressant_typical_antipsychotic | mg,mg | 2,25 |
| 429 | TRIPTAFEN M tablets 2mg + 10mg [AMCO] | 120 tablets | amitriptyline_perphenazine | tricyclic_antidepressant_typical_antipsychotic | mg,mg | 2,10 |
| 498 | perphenazine with amitriptyline tablets 2mg + 10mg | 56 - tablet(s) | amitriptyline_perphenazine | tricyclic_antidepressant_typical_antipsychotic | mg,mg | 2,10 |

Log files



Quality control

Strength information was not extracted for 31 prescriptions.

The likely reason is that these prescriptions do not have a strength unit attached to them.

| Row | drug_name | quantity | chem_name | func_class | strength_unit | strength |
| --- | --- | --- | --- | --- | --- | --- |
| 206 | NEFAZODONE starter pack | 56.000 | nefazodone | SARI | NA | NA |
| 313 | Triptafen tablets (AMCo) | 400 tablet(s) | amitriptyline_perphenazine | tricyclic_antidepressant_typical_antipsychotic | NA | NA |
| 363 | Motipress tablets (Sanofi-Synthelabo Ltd) | 30 tablets | nortriptyline_fluphenazine | tricyclic_antidepressant_typical_antipsychotic | NA | NA |
| 556 | Triptafen tablets (AMCo) | tablet(s) | amitriptyline_perphenazine | tricyclic_antidepressant_typical_antipsychotic | NA | NA |
| 591 | NEFAZODONE STARTER PACK [TABS (STARTER PACK)] | 1.000 | nefazodone | SARI | NA | NA |
| 661 | PARSTELIN tablets [GLAXSK CON] | 84 | tranylcypromine | MAOI | NA | NA |


If prescriptions without strengths / strength units are present, it is recommended to handle them as part of quality control, or to use the strength_impute() function here.

Note: To remove these prescriptions, set no_strength_unit_remove to TRUE. You can also keep these prescriptions and specify a default strength unit with no_strength_unit.
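
Before deciding between removal and a default unit, it can help to inspect the affected rows directly. A minimal base R sketch follows, using a small hand-made dataframe (toy_rx, hypothetical) that has the same strength_unit and strength columns as the output of strength_extract(). Storing strength as character is an assumption made here because combination products appear as comma-separated values.

```r
# Toy output resembling the dataframe after strength_extract()
toy_rx <- data.frame(
  drug_name = c("Seroxat 20mg tablets (GlaxoSmithKline UK Ltd)",
                "NEFAZODONE starter pack",
                "Triptafen tablets (AMCo)"),
  strength_unit = c("mg", NA, NA),
  strength = c("20", NA, NA),
  stringsAsFactors = FALSE
)

# Count and list prescriptions without an extracted strength
missing_strength <- is.na(toy_rx$strength)
sum(missing_strength)                  # 2
toy_rx$drug_name[missing_strength]

# Manual equivalent of dropping these rows
toy_rx_qc <- toy_rx[!missing_strength, ]
nrow(toy_rx_qc)                        # 1
```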



Extraction of quantity

Initialization

The sample prescriptions used in this part of the tutorial are the output of the strength_extract() function above.

For simplicity, we removed prescriptions without strengths extracted and used this quality-controlled dataframe as a starting point.

# Load T-Rx package
library("TRX", quietly = TRUE)
# run strength_extract() function
antidep_ukb_rx = strength_extract(rx_df = antidep_ukb_rx, liquid_strength_unit = c("mg/5ml"),
                            solid_strength_unit = c("mg", "mcg", "microgram"),
                            info_col = c("drug_name"), 
                            combined_strength = 2,
                            combined_strength_unit = c("mg", "mcg", "microgram", "miligram"),
                            no_strength_unit_remove = TRUE)



quantity_extract()

Description

The quantity_extract() function uses regular expressions to extract numeric quantity values and their associated dosage forms from the specified columns, in the following steps:

  1. Extract the dosage form strings provided by users (liquid_forms / solid_forms).
  2. For solid forms, pull out the number preceding the dosage form unit, e.g.: 30 preceding tab will be pulled from 30 tab.
  3. For liquid forms, pull out the number preceding liq_form_suffix, instead of the dosage form unit.
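
Steps 2 and 3 can be sketched in base R as follows. The helper quantity_sketch() is hypothetical and only mimics the described behaviour: for a solid form the number before the form string is taken, while for a liquid form the number before the suffix (ml here) is taken.

```r
# Hypothetical sketch (not the T-Rx implementation)
quantity_sketch <- function(quantity_text, form, is_liquid,
                            liq_form_suffix = "ml") {
  # Liquids are counted by volume, so anchor on the suffix instead of the form
  key <- if (is_liquid) liq_form_suffix else form
  pattern <- paste0("([0-9]+(?:\\.[0-9]+)?)\\s*", key)
  hit <- regmatches(quantity_text,
                    regexec(pattern, quantity_text,
                            ignore.case = TRUE, perl = TRUE))[[1]]
  if (length(hit) == 0) NA_real_ else as.numeric(hit[2])
}

quantity_sketch("30 tab - 20 mg", form = "tab", is_liquid = FALSE)          # 30
quantity_sketch("60 ml(s) - 10 mg/5 ml", form = "susp", is_liquid = TRUE)   # 60
quantity_sketch("1 - Pack of 56", form = "tab", is_liquid = FALSE)          # NA
```

The last call returns NA because the number in the quantity text does not directly precede the form string.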



Specifying dosage forms

The quantity_extract() function requires the user to specify strings for solid and liquid dosage forms.

The strings of dosage forms can be specified under the liquid_forms and solid_forms arguments.

The choice of dosage form strings should be based on the clinical context and the data source of interest.

It is recommended to use less specific strings (e.g.: tab rather than tablet) where appropriate, so that more variants are captured.
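
A quick way to see why shorter strings capture more variants is to test candidate strings against a few quantity values with grepl(). This is a plain base R check on example strings taken from the sample data, independent of T-Rx:

```r
quantity <- c("30 tab - 20 mg", "84 tabs", "120 tablet(s) - 20 mg", "21 TABLET")

# "tab" is a substring of every variant above
grepl("tab", quantity, ignore.case = TRUE)     # TRUE TRUE TRUE TRUE
# "tablet" misses the abbreviated forms
grepl("tablet", quantity, ignore.case = TRUE)  # FALSE FALSE TRUE TRUE
```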

Note: Common examples of dosage forms:

  • Solid: tablets, capsules, pills, granules
  • Liquid: solutions, suspensions, mixtures, elixirs, syrups, drops

Run quantity_extract() with the following arguments as an initial trial:

df <- quantity_extract(rx_df = antidep_ukb_rx,
                        liquid_forms = c("susp", "suspension", "liquid", "syrup"),
                        solid_forms = c("tab", "cap"),
                        form_info_col = c("quantity", "drug_name"),
                        quantity_info_col = "quantity",
                        liq_form_suffix = "ml")
Explanation of arguments

| Argument | Details |
| --- | --- |
| rx_df | The prescription dataframe |
| liquid_forms | Possible liquid dosage forms |
| solid_forms | Possible solid dosage forms |
| form_info_col | Names of the columns from which dosage form information is extracted, in order (quantity, drug_name) |
| quantity_info_col | Names of the columns from which quantity information is extracted (quantity) |
| liq_form_suffix | String to capture quantity of liquid dosage forms (default: ml) |


Afterwards, check the outputs as below:

Output dataframe after quantity_extract()

| drug_name | quantity | chem_name | func_class | strength_unit | strength | form | form_quant |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) | 30 tab - 20 mg | paroxetine | SSRI | mg | 20 | tab | 30 |
| Dutonin 100mg tablets (Bristol-Myers Squibb Pharmaceuticals Ltd) | 84 tabs | nefazodone | SARI | mg | 100 | tab | 84 |
| Paroxetine 10mg/5ml oral suspension sugar free | 60 ml(s) - 10 mg/5 ml | paroxetine | SSRI | mg/5ml | 10 | susp | 60 |
| Venlafaxine 37.5mg tablets | 1 - Pack of 56 | venlafaxine | SNRI | mg | 37.5 | tab | NA |
| Clomipramine 25mg capsules | 140 capsules | clomipramine | tricyclic_antidepressant | mg | 25 | cap | 140 |
| Dosulepin 25mg capsules | 39 capsule(s) | dosulepin | tricyclic_antidepressant | mg | 25 | cap | 39 |
| Efexor 37.5mg tablets (Wyeth Pharmaceuticals) | 14 tablets - 37.5 mg | venlafaxine | SNRI | mg | 37.5 | tab | 14 |
| Fluvoxamine 50mg tablets | 30 tab | fluvoxamine | SSRI | mg | 50 | tab | 30 |
| Duloxetine 30mg gastro-resistant capsules | 60.000 | duloxetine | SNRI | mg | 30 | cap | NA |
| AMITRIPTYLINE 25mg tablets | 60.000 | amitriptyline | tricyclic_antidepressant | mg | 25 | tab | NA |
| Mirtazapine Orodispersible TABS 30MG | 90.000 | mirtazapine | other | mg | 30 | tab | NA |
| Sertraline 100mg tablets | 10 tablets | sertraline | SSRI | mg | 100 | tab | 10 |
| Duloxetine Gastro Resistant CAPS 30MG | 14.000 | duloxetine | SNRI | mg | 30 | cap | NA |
| Yentreve 20mg gastro-resistant capsules (Eli Lilly and Company Ltd) | 56 caps | duloxetine | SNRI | mg | 20 | cap | 56 |
| Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) | 120 tablet(s) - 20 mg | paroxetine | SSRI | mg | 20 | tab | 120 |
| Amitriptyline 25mg tablets | 74 tablet | amitriptyline | tricyclic_antidepressant | mg | 25 | tab | 74 |
| Duloxetine 30mg gastro-resistant capsules | 30 capsule - 30 mg | duloxetine | SNRI | mg | 30 | cap | 30 |
| Nortriptyline 10mg tablets | 14 tablet - 10 mg | nortriptyline | tricyclic_antidepressant | mg | 10 | tab | 14 |
| DOTHIEPIN TABS 75MG | 21.000 | dosulepin | tricyclic_antidepressant | mg | 75 | tab | NA |
| Zispin SolTab 45mg orodispersible tablets (Merck Sharp & Dohme Ltd) | 60 tablet | mirtazapine | other | mg | 45 | tab | 60 |


Running quantity_extract() creates two new columns in the prescription dataframe:

  1. form: Dosage form extracted using the liquid_forms / solid_forms arguments specified by the user.
  2. form_quant: Numbers preceding form.

Note: Dosage form (form) was not extracted from 79 rows of prescriptions. To remove these, set no_form_remove to TRUE.

Log files

Preliminary checks from log file


Number of prescriptions WITHOUT dosage form extracted

If many are present, consider expanding the list of strings used to capture dosage forms (liquid_forms / solid_forms arguments), or set a default dosage form with the no_form argument, to be assigned when this information is missing.


Number of prescriptions WITHOUT quantity value extracted

If many are present (while most prescriptions have a dosage form extracted), the quantity information may be available but may not directly precede the dosage form strings.

In that case, consider using multiplier_extract() (described below) and multi_num_infer().


Quality control

To remove prescriptions for which no dosage form was extracted with liquid_forms and solid_forms, set no_form_remove to TRUE.

These changes will be documented in the updated log file.

Log files



Extraction of multipliers (Additional function for UKB primary care records)

Relevance

In UKB primary care records, information on multipliers is sometimes included as semi-structured text in the prescription dataframe.

A few example prescriptions, extracted from the antidep_ukb_rx data object, are shown below.

Demonstration of prescriptions with multiplier information

Prescriptions with multipliers demonstration (first 6 rows)

| Row | drug_name | quantity | chem_name | func_class | strength_unit | strength |
| --- | --- | --- | --- | --- | --- | --- |
| 24 | Fluoxetine 20mg capsules (Teva UK Ltd) | 3 packs of 30 capsule(s) | fluoxetine | SSRI | mg | 20 |
| 27 | Venlafaxine 37.5mg tablets | 2*56 tablets | venlafaxine | SNRI | mg | 37.5 |
| 37 | Sertraline 50mg tablets (Sandoz Ltd) | 2 packs of 28 tablet(s) | sertraline | SSRI | mg | 50 |
| 55 | Sertraline 50mg tablets (Teva UK Ltd) | 2 packs of 28 tablet(s) | sertraline | SSRI | mg | 50 |
| 91 | Citalopram 10mg tablets (A A H Pharmaceuticals Ltd) | 1 pack of 28 tablet(s) | citalopram | SSRI | mg | 10 |
| 94 | Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) | 2*30 tablet - 20 mg | paroxetine | SSRI | mg | 20 |

Code chunk to create example dataframe
library("TRX", quietly = TRUE)

# Examples of "multipliers"
example = antidep_ukb_rx[grepl("pack|\\*", antidep_ukb_rx$quantity), ]
print(head(example))



multiplier_extract()

Description

To extract the strings related to multipliers in UKB primary care records, users can specify strings under the multipliers and alt_multipliers arguments of the multiplier_extract() function.

Because prescription text varies considerably across datasets, we recommend using the default set of strings (listed in the table below) for NHS-formatted records.

After the strings are specified, the multiplier_extract() function searches for numeric values preceding the multiplier strings in the following order:

  1. multipliers: Standard strings that likely pinpoint multiplier information, e.g.: packs, pack of
  2. alt_multipliers: Non-standard strings that might indicate multiplier information, but as a non-specific / loose match, e.g.: “x”.
  3. If no numeric value can be found with the multiplier strings, the multiplier is assigned as 1 or 999 (as quality control, see below).

Default strings

| Argument | Strings | Details |
| --- | --- | --- |
| multipliers | pack, pack of, * | Preferred multiplier strings (prioritized) |
| alt_multipliers | x | Alternative multiplier strings (less specific, use cautiously) |
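
The search order can be sketched in base R as below. The helper multiplier_sketch() is hypothetical, and two details are assumptions made for illustration rather than documented behaviour: a value of 1 is returned when no multiplier string is found, and 999 is returned when more than one candidate is found.

```r
# Hypothetical sketch (not the T-Rx implementation)
multiplier_sketch <- function(text, multipliers, alt_multipliers) {
  # Numbers that directly precede any of the given strings
  numbers_before <- function(keys) {
    pattern <- paste0("([0-9]+)\\s*(?:", paste(keys, collapse = "|"), ")")
    hits <- regmatches(text, gregexpr(pattern, text,
                                      ignore.case = TRUE, perl = TRUE))[[1]]
    as.numeric(sub("^([0-9]+).*$", "\\1", hits))
  }
  # Preferred strings are tried first, then the looser alternatives
  for (keys in list(multipliers, alt_multipliers)) {
    found <- numbers_before(keys)
    if (length(found) == 1) return(found)
    if (length(found) > 1) return(999)   # assumption: ambiguous, flag for QC
  }
  1                                      # assumption: no multiplier present
}

m <- c("pack", "pack of", "\\*")
multiplier_sketch("2 packs of 28 tablet(s)", m, "x")                           # 2
multiplier_sketch("2*56 tablets", m, "x")                                      # 2
multiplier_sketch("1 x 90 days", m, "x")                                       # 1
multiplier_sketch("tablet(s) - 14 x 50 mg, 14 x 100 mg, 28 x 200 mg", m, "x")  # 999
```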



Execution

Run multiplier_extract() with the following arguments as an initial trial:

df <- multiplier_extract(rx_df = antidep_ukb_rx,
                         multipliers = c("pack", "pack of" , "\\*"),
                         alt_multipliers = c("x"),
                         info_col = c("quantity"),
                         qc_remove = FALSE)
Explanation of arguments

| Argument | Details |
| --- | --- |
| rx_df | The prescription dataframe |
| multipliers | Preferred multiplier strings (prioritized) |
| alt_multipliers | Alternative multiplier strings (less specific, use cautiously) |
| info_col | Names of the columns from which multiplier information is extracted (quantity) |


After running multiplier_extract(), you will notice that 2 rows in the resulting dataframe are coded as “999”.

Log file (as messages)

Inspect dataframe

Example dataframe (failure to extract multipliers)

| Row | drug_name | quantity | chem_name | func_class | strength_unit | strength | multiplier |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 3187 | nefazodone starter pack | tablet(s) - 14 x 50 mg, 14 x 100 mg, 28 x 200 mg | nefazodone | SARI | NA | NA | 999 |
| 4535 | nefazodone starter pack | 1 - tablets (14x50mg,14x100mg,28x200mg) | nefazodone | SARI | NA | NA | 999 |


The 2 rows shown here each contain several candidate multipliers, which is too difficult to resolve within the scope of multiplier_extract().

We suggest manual handling of these prescriptions if that happens (usually these are specific drug sets or prescription regimens that require special handling).
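
As one possible starting point for manual handling, the count and strength pairs in such a regimen can be pulled apart with a regex. The base R sketch below is an illustration on the quantity string of one starter pack row. How the resulting pairs should be used is a clinical decision that this sketch does not make, and the object names are hypothetical.

```r
regimen <- "tablet(s) - 14 x 50 mg, 14 x 100 mg, 28 x 200 mg"

# Find every "count x strength mg" fragment
pattern <- "([0-9]+)\\s*x\\s*([0-9]+)\\s*mg"
fragments <- regmatches(regimen, gregexpr(pattern, regimen,
                                          ignore.case = TRUE, perl = TRUE))[[1]]

# Split each fragment into its two numbers
parts <- regmatches(fragments, regexec(pattern, fragments,
                                       ignore.case = TRUE, perl = TRUE))
pairs <- data.frame(
  count = as.numeric(vapply(parts, `[`, character(1), 2)),
  strength_mg = as.numeric(vapply(parts, `[`, character(1), 3))
)
pairs   # three rows: 14 x 50, 14 x 100, 28 x 200
```

The same pattern also matches the compact spelling 14x50mg, since the whitespace around x and before mg is optional.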



Quality control and optional parameters

As quality control for the above prescriptions, change the qc_remove argument to TRUE.

df <- multiplier_extract(rx_df = antidep_ukb_rx,
                         multipliers = c("pack", "pack of" , "\\*"),
                         alt_multipliers = c("x"),
                         info_col = c("quantity"),
                         qc_remove = TRUE)


You will see a log file as follows:
Log file (as messages)


To change the name of the multiplier column, specify it with the multiplier_colname argument.

df <- multiplier_extract(rx_df = antidep_ukb_rx,
                         multipliers = c("pack", "pack of" , "\\*"),
                         alt_multipliers = c("x"),
                         info_col = c("quantity"),
                         multiplier_colname = "multi",
                         qc_remove = TRUE)
Inspect dataframe

Example (multiplier column renamed)

| drug_name | quantity | chem_name | func_class | strength_unit | strength | multi |
| --- | --- | --- | --- | --- | --- | --- |
| Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) | 30 tab - 20 mg | paroxetine | SSRI | mg | 20 | 1 |
| Dutonin 100mg tablets (Bristol-Myers Squibb Pharmaceuticals Ltd) | 84 tabs | nefazodone | SARI | mg | 100 | 1 |
| Paroxetine 10mg/5ml oral suspension sugar free | 60 ml(s) - 10 mg/5 ml | paroxetine | SSRI | mg/5ml | 10 | 1 |
| Venlafaxine 37.5mg tablets | 1 - Pack of 56 | venlafaxine | SNRI | mg | 37.5 | 1 |
| Clomipramine 25mg capsules | 140 capsules | clomipramine | tricyclic_antidepressant | mg | 25 | 1 |
| Dosulepin 25mg capsules | 39 capsule(s) | dosulepin | tricyclic_antidepressant | mg | 25 | 1 |

Troubleshooting

Please post questions as an issue on the T-Rx GitHub repo here.

The T-Rx package is currently under beta testing. Most functions should have adequate documentation on possible errors.

Please reach out to Chris Lo with feedback on the documentation.