The algorithm uses regular expression (REGEX) patterns to extract strength / dosage and quantity information from prescription records.
In this module, users are required to specify strings of dosage units and dosage forms.
Using this information, the numbers preceding the user-specified dosage units / dosage forms are extracted as dose and quantity respectively.
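The idea of pulling the number that precedes a user-specified string can be illustrated with a minimal base R sketch. The helper `number_before()` is hypothetical and is not part of T-Rx; it only shows the general technique under the assumption that the number sits directly before the string, optionally separated by whitespace.

```r
# Hypothetical helper (NOT a T-Rx function): return the number that directly
# precedes a given string, e.g. 37.5 from "37.5mg" or 30 from "30 tab".
number_before <- function(text, unit) {
  # a number with an optional decimal part, optional whitespace, then the string
  pattern <- paste0("([0-9]+\\.?[0-9]*)[[:space:]]*", unit)
  hits <- regmatches(text, regexec(pattern, text, ignore.case = TRUE))
  # regexec() returns the full match first and the captured number second
  vapply(hits,
         function(m) if (length(m) >= 2) as.numeric(m[2]) else NA_real_,
         numeric(1))
}

number_before("Venlafaxine 37.5mg tablets", "mg")  # 37.5
number_before("30 tab - 20 mg", "tab")             # 30
number_before("NEFAZODONE starter pack", "mg")     # NA (no unit present)
```

When no match is found the sketch returns `NA`, which mirrors how prescriptions without a strength unit appear later in this tutorial.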
In this tutorial, we provide a sample dataframe of antidepressant prescriptions (5000 rows, 4 columns), structured in the same way as UK Biobank (UKB) primary care records. These strings can be mapped using READv2 / BNF / dm+d codes that are publicly available, with details described at https://github.com/chiarafabbri/MDD_TRD_study.
Column | Details | Note |
---|---|---|
`drug_name` | Drug name (from raw UKB primary care records) | Please refer to UKB documentation |
`quantity` | Quantity issued (from raw UKB primary care records) | Please refer to UKB documentation |
`chem_name` | Drug name of active ingredient | Extracted from https://github.com/chiarafabbri/MDD_TRD_study |
`func_class` | Drug class of active ingredient | Extracted from https://github.com/chiarafabbri/MDD_TRD_study |
drug_name | quantity | chem_name | func_class |
---|---|---|---|
Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) | 30 tab - 20 mg | paroxetine | SSRI |
Dutonin 100mg tablets (Bristol-Myers Squibb Pharmaceuticals Ltd) | 84 tabs | nefazodone | SARI |
Paroxetine 10mg/5ml oral suspension sugar free | 60 ml(s) - 10 mg/5 ml | paroxetine | SSRI |
Venlafaxine 37.5mg tablets | 1 - Pack of 56 | venlafaxine | SNRI |
Clomipramine 25mg capsules | 140 capsules | clomipramine | tricyclic_antidepressant |
Dosulepin 25mg capsules | 39 capsule(s) | dosulepin | tricyclic_antidepressant |
For details of UKB primary care records, please refer to the UKB documentation here.
The sample prescriptions are hosted in T-Rx as the data object `antidep_ukb_rx`.
Load T-Rx and inspect the data object in R.
# load package
library(TRX)
# inspect sample data object (antidepressant prescriptions)
dim(antidep_ukb_rx)
# [1] 5000 4
strength_extract()
The `strength_extract()` function requires the user to specify strings of strength / dosage units for solid and liquid dosage forms.
Note: Common samples of dosage forms:
For these dosage forms, users need to determine the strings of common dosing units.
The choice of strength / dosage units should be based on the clinical context and the data source of interest.
Taking antidepressants as an example, many were available as:
These strings are specified under the `liquid_strength_unit` and `solid_strength_unit` arguments in the `strength_extract()` function.
As an illustration, we created a hypothetical sample dataframe (with a structure similar to UKB primary care records) containing 5 prescriptions, as below.
drug_name | quantity |
---|---|
Venlafaxine 75mg modified-release capsules | 28 capsules - 75 mg |
Motival 10mg/500microgram capsules (Sanofi) | 1*28 capsule(s) |
TRIPTAFEN tabs 2mg + 25mg | 2 packs of 28 tablet(s) |
LOFEPRAMINE sf susp 70mg/5ml | 100ml |
Amitriptyline 10mg tablets (Wockhardt UK Ltd) | 28 tablets |
df <- data.frame(drug_name = c("Venlafaxine 75mg modified-release capsules","Motival 10mg/500microgram capsules (Sanofi)","TRIPTAFEN tabs 2mg + 25mg","LOFEPRAMINE sf susp 70mg/5ml","Amitriptyline 10mg tablets (Wockhardt UK Ltd)"),
quantity = c("28 capsules - 75 mg","1*28 capsule(s)","2 packs of 28 tablet(s)","100ml","28 tablets"))
Using this sample dataframe, run the `strength_extract()` function with the following arguments:
# Extract single strength information
single_df = strength_extract(rx_df = df, liquid_strength_unit = c("mg/5ml"),
solid_strength_unit = c("mg", "mcg", "microgram"),
info_col = c("drug_name"))
Argument | Details |
---|---|
`rx_df` | The prescription dataframe (`df`) |
`liquid_strength_unit` | Possible strength units of liquid dosage forms |
`solid_strength_unit` | Possible strength units of solid dosage forms |
`info_col` | Column names from which strength and strength units are extracted (`drug_name`) |
Afterwards, check the outputs in R / RStudio:
drug_name | quantity | strength_unit | strength |
---|---|---|---|
Venlafaxine 75mg modified-release capsules | 28 capsules - 75 mg | mg | 75 |
Motival 10mg/500microgram capsules (Sanofi) | 1*28 capsule(s) | mg | 10 |
TRIPTAFEN tabs 2mg + 25mg | 2 packs of 28 tablet(s) | mg | 2 |
LOFEPRAMINE sf susp 70mg/5ml | 100ml | mg/5ml | 70 |
Amitriptyline 10mg tablets (Wockhardt UK Ltd) | 28 tablets | mg | 10 |
Running `strength_extract()` creates two new columns in the prescription dataframe:

- `strength_unit`: Strength units extracted using the `liquid_strength_unit` / `solid_strength_unit` arguments specified by the user.
- `strength`: Numbers preceding `strength_unit`.

Note: `strength_extract()` does not handle combination products by default. Therefore, only the first ingredient in `TRIPTAFEN tabs 2mg + 25mg` is extracted, as `2mg`.
The expected number of ingredients in combination products can be specified with the `combined_strength` and `combined_strength_unit` arguments.
In practical situations, one should expect the prescriptions to contain combination products with multiple ingredients at the same time.
T-Rx also offers a function to handle these products, assuming `+` or `/` are used as separators between multiple strengths / strength units. Examples include `2mg + 25mg` and `2mg/25mg`.
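To make the multi-ingredient case concrete, the following base R sketch collects every number-unit pair in a drug name and joins them with commas. The helper `all_strengths()` and its `max_n` parameter are hypothetical stand-ins used only for illustration; they are not the T-Rx implementation, and the sketch assumes each strength is written as a number immediately followed by one of the listed units.

```r
# Hypothetical helper (NOT a T-Rx function): collect all number-unit pairs
# from one drug name and return them as comma-separated strings.
all_strengths <- function(text, units = c("microgram", "mcg", "mg"), max_n = 2) {
  pattern <- paste0("([0-9]+\\.?[0-9]*)[[:space:]]*(",
                    paste(units, collapse = "|"), ")")
  pairs <- regmatches(text, gregexpr(pattern, text, ignore.case = TRUE))[[1]]
  if (length(pairs) == 0) {
    return(c(strength = NA_character_, strength_unit = NA_character_))
  }
  pairs <- head(pairs, max_n)  # keep at most max_n ingredients
  c(strength      = paste(gsub("[^0-9.]", "", pairs), collapse = ","),
    strength_unit = paste(gsub("[0-9.[:space:]]", "", pairs), collapse = ","))
}

all_strengths("TRIPTAFEN tabs 2mg + 25mg")
all_strengths("Motival 10mg/500microgram capsules (Sanofi)")
```

Because the pattern only looks for number-unit pairs, the `+` and `/` separators do not need to be matched explicitly in this sketch.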
To extract all strengths in combination products, these two arguments need to be added:
Argument | Details |
---|---|
`combined_strength` | Maximum number of ingredients expected in combination products |
`combined_strength_unit` | The strings of strength units used to extract strengths in combination products |
Note: It is recommended to use only solid dosage form units (e.g. `mg`, `mcg`) to avoid confusion with liquid dosage forms. In combination products, strengths are rarely expressed as concentrations, even for liquid dosage forms.
Run the `strength_extract()` function again, this time on the UKB prescription sample (with multiple ingredients expected).
# Re-run strength extraction function again, but add multi-ingredient options
antidep_ukb_rx = strength_extract(rx_df = antidep_ukb_rx, liquid_strength_unit = c("mg/5ml"),
solid_strength_unit = c("mg", "mcg", "microgram"),
info_col = c("drug_name"),
combined_strength = 2,
combined_strength_unit = c("mg", "mcg", "microgram", "miligram"))
drug_name | quantity | chem_name | func_class | strength_unit | strength |
---|---|---|---|---|---|
Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) | 30 tab - 20 mg | paroxetine | SSRI | mg | 20 |
Dutonin 100mg tablets (Bristol-Myers Squibb Pharmaceuticals Ltd) | 84 tabs | nefazodone | SARI | mg | 100 |
Paroxetine 10mg/5ml oral suspension sugar free | 60 ml(s) - 10 mg/5 ml | paroxetine | SSRI | mg/5ml | 10 |
Venlafaxine 37.5mg tablets | 1 - Pack of 56 | venlafaxine | SNRI | mg | 37.5 |
Clomipramine 25mg capsules | 140 capsules | clomipramine | tricyclic_antidepressant | mg | 25 |
Dosulepin 25mg capsules | 39 capsule(s) | dosulepin | tricyclic_antidepressant | mg | 25 |
Efexor 37.5mg tablets (Wyeth Pharmaceuticals) | 14 tablets - 37.5 mg | venlafaxine | SNRI | mg | 37.5 |
Fluvoxamine 50mg tablets | 30 tab | fluvoxamine | SSRI | mg | 50 |
Duloxetine 30mg gastro-resistant capsules | 60.000 | duloxetine | SNRI | mg | 30 |
AMITRIPTYLINE 25mg tablets | 60.000 | amitriptyline | tricyclic_antidepressant | mg | 25 |
Mirtazapine Orodispersible TABS 30MG | 90.000 | mirtazapine | other | mg | 30 |
Sertraline 100mg tablets | 10 tablets | sertraline | SSRI | mg | 100 |
Duloxetine Gastro Resistant CAPS 30MG | 14.000 | duloxetine | SNRI | mg | 30 |
Yentreve 20mg gastro-resistant capsules (Eli Lilly and Company Ltd) | 56 caps | duloxetine | SNRI | mg | 20 |
Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) | 120 tablet(s) - 20 mg | paroxetine | SSRI | mg | 20 |
Amitriptyline 25mg tablets | 74 tablet | amitriptyline | tricyclic_antidepressant | mg | 25 |
Duloxetine 30mg gastro-resistant capsules | 30 capsule - 30 mg | duloxetine | SNRI | mg | 30 |
Nortriptyline 10mg tablets | 14 tablet - 10 mg | nortriptyline | tricyclic_antidepressant | mg | 10 |
DOTHIEPIN TABS 75MG | 21.000 | dosulepin | tricyclic_antidepressant | mg | 75 |
Zispin SolTab 45mg orodispersible tablets (Merck Sharp & Dohme Ltd) | 60 tablet | mirtazapine | other | mg | 45 |
Paroxetine 20mg tablets | 1 x 90 days | paroxetine | SSRI | mg | 20 |
Lofepramine 70mg tablets | 84 tablet(s) - 70 mg | lofepramine | tricyclic_antidepressant | mg | 70 |
Citalopram 10mg tablets | 28.000 | citalopram | SSRI | mg | 10 |
Fluoxetine 20mg capsules (Teva UK Ltd) | 3 packs of 30 capsule(s) | fluoxetine | SSRI | mg | 20 |
ESCITALOPRAM TABLETS 20MG | 56.000 | escitalopram | SSRI | mg | 20 |
Fluoxetine CAPS 20MG | 1000.000 | fluoxetine | SSRI | mg | 20 |
Venlafaxine 37.5mg tablets | 2*56 tablets | venlafaxine | SNRI | mg | 37.5 |
Zispin SolTab 45mg orodispersible tablets (Merck Sharp & Dohme Ltd) | 30 tab | mirtazapine | other | mg | 45 |
Trazodone 150mg tablets | 14 tablet | trazodone | SARI | mg | 150 |
Amitriptyline 25mg tablets | 21 TABLET | amitriptyline | tricyclic_antidepressant | mg | 25 |
Ludiomil 75mg tablets (Novartis Pharmaceuticals UK Ltd) | 28 tabs | maprotiline | tetracyclic_antidepressant | mg | 75 |
Fluoxetine 20mg capsules | 70 capsule - 20 mg | fluoxetine | SSRI | mg | 20 |
Clomipramine 50mg capsules | 150 capsules | clomipramine | tricyclic_antidepressant | mg | 50 |
Venlafaxine 75mg modified-release capsules | 28 capsule(s) - 75 mg | venlafaxine | SNRI | mg | 75 |
Dosulepin 25mg capsules | 252 capsule(s) - 25 mg | dosulepin | tricyclic_antidepressant | mg | 25 |
Trimipramine 25mg tablets | 28 - tablet(s) | trimipramine | tricyclic_antidepressant | mg | 25 |
Sertraline 50mg tablets (Sandoz Ltd) | 2 packs of 28 tablet(s) | sertraline | SSRI | mg | 50 |
Clomipramine 10mg capsules | 300 capsules | clomipramine | tricyclic_antidepressant | mg | 10 |
Clomipramine 10mg capsules | 224 capsule | clomipramine | tricyclic_antidepressant | mg | 10 |
Venlafaxine 225mg modified-release tablets | 56.000 | venlafaxine | SNRI | mg | 225 |
Venlafaxine 37.5mg tablets | 1 | venlafaxine | SNRI | mg | 37.5 |
Gamanil 70mg tablets (Merck Serono Ltd) | 28 tablets - 70 mg | lofepramine | tricyclic_antidepressant | mg | 70 |
CLOMIPRAMINE caps 25mg | 14.000 | clomipramine | tricyclic_antidepressant | mg | 25 |
DOTHIEPIN tabs 75mg | 30.000 | dosulepin | tricyclic_antidepressant | mg | 75 |
Prothiaden 25mg capsules (Teofarma) | 30 days - 25 mg | dosulepin | tricyclic_antidepressant | mg | 25 |
Amitriptyline 10mg tablets | 28 tablets - 10 mg | amitriptyline | tricyclic_antidepressant | mg | 10 |
Imipramine 25mg tablets | 30 tabs | imipramine | tricyclic_antidepressant | mg | 25 |
FLUPENTIXOL TABLETS 1MG | 120.000 | flupentixol | phenothiazine_antipsychotic | mg | 1 |
Paroxetine 10mg tablets | 30 | paroxetine | SSRI | mg | 10 |
Lustral 50mg tablets (Pfizer Ltd) | 28 tablet - 50 mg | sertraline | SSRI | mg | 50 |
VENLAFAXINE MR CAPSULES 75MG | 30.000 | venlafaxine | SNRI | mg | 75 |
Mirtazapine 30mg tablets | 7 - tablet(s) | mirtazapine | other | mg | 30 |
LOFEPRAMINE TAB 70mg | 6.000 | lofepramine | tricyclic_antidepressant | mg | 70 |
Imipramine 25mg tablets | 112 tablets | imipramine | tricyclic_antidepressant | mg | 25 |
Sertraline 50mg tablets (Teva UK Ltd) | 2 packs of 28 tablet(s) | sertraline | SSRI | mg | 50 |
Nortriptyline TABS 25MG | 224.000 | nortriptyline | tricyclic_antidepressant | mg | 25 |
Escitalopram 5mg tablets | 28 | escitalopram | SSRI | mg | 5 |
Trimipramine 25mg tablets | | trimipramine | tricyclic_antidepressant | mg | 25 |
NORTRIPTYLINE TABLETS 10MG | 200.000 | nortriptyline | tricyclic_antidepressant | mg | 10 |
MIRTAZAPINE TABLETS 30MG | 84.000 | mirtazapine | other | mg | 30 |
Nefazodone Hydrochloride Tablets 100 mg | 56.000 | nefazodone | SARI | mg | 100 |
AMITRIPTYLINE HYDROCHLORIDE tablets 50mg | 28.000 | amitriptyline | tricyclic_antidepressant | mg | 50 |
Amitriptyline 10mg tablets | 12 tab | amitriptyline | tricyclic_antidepressant | mg | 10 |
Mirtazapine Orodispersible tablets 15mg | 30.000 | mirtazapine | other | mg | 15 |
Prothiaden 25mg capsules (Teofarma) | 56 capsule | dosulepin | tricyclic_antidepressant | mg | 25 |
Amitriptyline 10mg/5ml oral solution sugar free | 300 millilitres | amitriptyline | tricyclic_antidepressant | mg/5ml | 10 |
Mirtazapine 15mg tablets | 30 tablet - 15 mg | mirtazapine | other | mg | 15 |
Nortriptyline 10mg tablets | 1 x 28 | nortriptyline | tricyclic_antidepressant | mg | 10 |
Fluoxetine 20mg capsules | 120 - capsule | fluoxetine | SSRI | mg | 20 |
DOTHIEPIN HCL TAB 75MG | 56.000 | dosulepin | tricyclic_antidepressant | mg | 75 |
Prozac Capsules 20 mg | 60.000 | fluoxetine | SSRI | mg | 20 |
TRIPTAFEN M tablets 2mg + 10mg [AMCO] | 30 | amitriptyline_perphenazine | tricyclic_antidepressant_typical_antipsychotic | mg,mg | 2,10 |
Surmontil 50mg capsules (Sanofi) | 60 capsules | trimipramine | tricyclic_antidepressant | mg | 50 |
Amitriptyline 10mg tablets | 80 tablets - 10 mg | amitriptyline | tricyclic_antidepressant | mg | 10 |
Dosulepin 75mg tablets | 28 - tablets | dosulepin | tricyclic_antidepressant | mg | 75 |
Fluoxetine 20mg capsules | 6 capsule | fluoxetine | SSRI | mg | 20 |
Escitalopram Tablets 20 mg | 28.000 | escitalopram | SSRI | mg | 20 |
Amitriptyline 10mg tablets | tablet(s) Tablets | amitriptyline | tricyclic_antidepressant | mg | 10 |
Paroxetine 20mg tablets | 90 days tablet(s) - 20 mg | paroxetine | SSRI | mg | 20 |
Zispin 30mg tablets (Organon Laboratories Ltd) | 14 tablets - 30 mg | mirtazapine | other | mg | 30 |
Amitriptyline 25mg tablets | 56.00 | amitriptyline | tricyclic_antidepressant | mg | 25 |
Venlafaxine Hydrochloride M/R capsules 150 mg | 28.000 | venlafaxine | SNRI | mg | 150 |
Amitriptyline 25mg tablets | 200 tablets | amitriptyline | tricyclic_antidepressant | mg | 25 |
Dosulepin 25mg capsules | 210 capsule(s) - 25 mg | dosulepin | tricyclic_antidepressant | mg | 25 |
Paroxetine 20mg tablets | 28 tablet(s) Temp. | paroxetine | SSRI | mg | 20 |
DOTHIEPIN TABLETS 75 MG | 30.000 | dosulepin | tricyclic_antidepressant | mg | 75 |
Nortriptyline 10mg tablets | 28 tablet - 10 mg | nortriptyline | tricyclic_antidepressant | mg | 10 |
Fluoxetine 20mg/5ml oral solution | 280 millilitres | fluoxetine | SSRI | mg/5ml | 20 |
Escitalopram 20mg tablets | 56 tab | escitalopram | SSRI | mg | 20 |
Nortriptyline 10mg tablets | 180.000 | nortriptyline | tricyclic_antidepressant | mg | 10 |
Citalopram 10mg tablets (A A H Pharmaceuticals Ltd) | 1 pack of 28 tablet(s) | citalopram | SSRI | mg | 10 |
Citalopram TABS 40MG | 30.000 | citalopram | SSRI | mg | 40 |
Venlafaxine 150mg modified-release capsules | 56 tablets | venlafaxine | SNRI | mg | 150 |
Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) | 2*30 tablet - 20 mg | paroxetine | SSRI | mg | 20 |
Amitriptyline Hydrochloride Oral solution 10 mg/5 ml | 300.000 | amitriptyline | tricyclic_antidepressant | mg/5 ml | 10 |
Amitriptyline 25mg tablets | 12 | amitriptyline | tricyclic_antidepressant | mg | 25 |
Paroxetine 20mg tablets | 14 tablet(s) | paroxetine | SSRI | mg | 20 |
Citalopram 40mg tablets | 56 tablet - 40 mg | citalopram | SSRI | mg | 40 |
PAROXETINE tabs 10mg | 48.000 | paroxetine | SSRI | mg | 10 |
FLUPENTIXOL tabs 500 micrograms | 60.000 | flupentixol | phenothiazine_antipsychotic | microgram | 500 |
antidep_ukb_rx
Users can also expand the combination products to correct the dosage of individual ingredients using the `strength_impute()` function, with documentation listed here.
row | drug_name | quantity | chem_name | func_class | strength_unit | strength |
---|---|---|---|---|---|---|
72 | TRIPTAFEN M tablets 2mg + 10mg [AMCO] | 30 | amitriptyline_perphenazine | tricyclic_antidepressant_typical_antipsychotic | mg,mg | 2,10 |
122 | Nortriptyline 10mg / Fluphenazine 500microgram tablets | 100 | nortriptyline | tricyclic_antidepressant | mg,microgram | 10,500 |
327 | Motival 10mg/500microgram tablets (Sanofi) | 3 op | nortriptyline_fluphenazine | tricyclic_antidepressant_typical_antipsychotic | mg,microgram | 10,500 |
403 | perphenazine with amitriptyline tablets 2mg + 25mg | 28 - tablet(s) | amitriptyline_perphenazine | tricyclic_antidepressant_typical_antipsychotic | mg,mg | 2,25 |
429 | TRIPTAFEN M tablets 2mg + 10mg [AMCO] | 120 tablets | amitriptyline_perphenazine | tricyclic_antidepressant_typical_antipsychotic | mg,mg | 2,10 |
498 | perphenazine with amitriptyline tablets 2mg + 10mg | 56 - tablet(s) | amitriptyline_perphenazine | tricyclic_antidepressant_typical_antipsychotic | mg,mg | 2,10 |
There are 31 prescriptions without strength information extracted.
The likely reason is that these prescriptions do not have a strength unit attached to them.
row | drug_name | quantity | chem_name | func_class | strength_unit | strength |
---|---|---|---|---|---|---|
206 | NEFAZODONE starter pack | 56.000 | nefazodone | SARI | NA | NA |
313 | Triptafen tablets (AMCo) | 400 tablet(s) | amitriptyline_perphenazine | tricyclic_antidepressant_typical_antipsychotic | NA | NA |
363 | Motipress tablets (Sanofi-Synthelabo Ltd) | 30 tablets | nortriptyline_fluphenazine | tricyclic_antidepressant_typical_antipsychotic | NA | NA |
556 | Triptafen tablets (AMCo) | tablet(s) | amitriptyline_perphenazine | tricyclic_antidepressant_typical_antipsychotic | NA | NA |
591 | NEFAZODONE STARTER PACK [TABS (STARTER PACK)] | 1.000 | nefazodone | SARI | NA | NA |
661 | PARSTELIN tablets [GLAXSK CON] | 84 | tranylcypromine | MAOI | NA | NA |
If prescriptions without strengths / strength units are present, it is recommended to perform quality control to handle these prescriptions, or to use the `strength_impute()` function here.
Note: To remove these prescriptions, set `no_strength_unit_remove` to `TRUE`. You can also keep these prescriptions and specify a default strength unit with `no_strength_unit`.
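The two options above (removing the rows, or keeping them with a default unit) can be pictured with a small base R sketch on a toy dataframe. This is only an illustration of the two outcomes, not how T-Rx implements `no_strength_unit_remove` or `no_strength_unit`; the toy dataframe `rx` and the default unit `"mg"` are assumptions made for the example.

```r
# Toy dataframe (hypothetical) with one prescription lacking strength information
rx <- data.frame(
  drug_name     = c("Amitriptyline 10mg tablets", "NEFAZODONE starter pack"),
  strength      = c("10", NA),
  strength_unit = c("mg", NA),
  stringsAsFactors = FALSE
)

# Flag prescriptions without an extracted strength or strength unit
no_strength <- is.na(rx$strength) | is.na(rx$strength_unit)
sum(no_strength)  # number of prescriptions to review

# Option 1: remove the flagged prescriptions
rx_removed <- rx[!no_strength, ]

# Option 2: keep them and assign an assumed default strength unit
rx_default <- rx
rx_default$strength_unit[is.na(rx_default$strength_unit)] <- "mg"
```

Counting the flagged rows before removing them makes it easier to judge whether the list of strength unit strings should be expanded instead.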
The sample prescriptions used in this tutorial are the outputs of the `strength_extract()` function above.
For simplicity, we removed prescriptions without strengths extracted and used this quality-controlled dataframe as a starting point.
# Load T-Rx package
library("TRX", quietly = TRUE)
# run strength_extract() function
antidep_ukb_rx = strength_extract(rx_df = antidep_ukb_rx, liquid_strength_unit = c("mg/5ml"),
solid_strength_unit = c("mg", "mcg", "microgram"),
info_col = c("drug_name"),
combined_strength = 2,
combined_strength_unit = c("mg", "mcg", "microgram", "miligram"),
no_strength_unit_remove = TRUE)
quantity_extract()
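The `quantity_extract()` function uses regular expressions to extract numeric quantity values and their associated dosage forms from specified columns with the following steps:

1. The dosage form is identified using the user-specified strings (`liquid_forms` / `solid_forms`).
2. The number preceding the dosage form string is extracted as the quantity. For example, `30` preceding `tab` will be pulled from `30 tab`.
3. For liquid dosage forms, the number preceding `liq_form_suffix` is extracted, instead of the number preceding the dosage form unit.

The `quantity_extract()` function requires the user to specify strings of solid and liquid dosage forms.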
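The extraction steps described above can be sketched in base R for a single prescription. The function `sketch_quantity()` is hypothetical and is not the T-Rx implementation; it assumes the dosage form is searched for in the quantity text first and then in the drug name, and that liquid quantities are written as a number followed by the suffix (e.g. `ml`).

```r
# Hypothetical sketch (NOT the T-Rx implementation) of form and quantity extraction
sketch_quantity <- function(quantity, drug_name, liquid_forms, solid_forms,
                            liq_form_suffix = "ml") {
  forms <- c(liquid_forms, solid_forms)
  # Find the first dosage form string, looking in quantity and then drug_name
  form <- NA_character_
  for (text in c(quantity, drug_name)) {
    found <- forms[vapply(forms, grepl, logical(1), x = text, ignore.case = TRUE)]
    if (length(found) > 0) {
      form <- found[1]
      break
    }
  }
  if (is.na(form)) return(list(form = NA_character_, form_quant = NA_real_))
  # Liquids use the suffix (e.g. "ml"); solids use the dosage form string itself
  token <- if (form %in% liquid_forms) liq_form_suffix else form
  pattern <- paste0("([0-9]+\\.?[0-9]*)[[:space:]]*", token)
  hit <- regmatches(quantity, regexec(pattern, quantity, ignore.case = TRUE))[[1]]
  list(form = form,
       form_quant = if (length(hit) >= 2) as.numeric(hit[2]) else NA_real_)
}

sketch_quantity("60 ml(s) - 10 mg/5 ml",
                "Paroxetine 10mg/5ml oral suspension sugar free",
                liquid_forms = c("susp", "suspension", "liquid", "syrup"),
                solid_forms = c("tab", "cap"))
```

In this sketch a quantity such as `1 - Pack of 56` yields a dosage form from the drug name but no quantity, since no number directly precedes the form string.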
The strings of dosage forms can be specified under the `liquid_forms` and `solid_forms` arguments.
The choice of dosage form strings should be based on the clinical context and the data source of interest.
It is recommended to use less specific strings (e.g. `tab` rather than `tablet`) whenever necessary to capture more strings.
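A quick base R check shows why the shorter string captures more records. The quantity strings below are taken from the sample tables in this tutorial; the comparison uses plain `grepl()` and is independent of T-Rx.

```r
# Quantity strings as they appear in the sample prescriptions
quantity <- c("30 tab", "84 tabs", "120 tablet(s) - 20 mg", "21 TABLET")

# The specific string misses abbreviated forms
grepl("tablet", quantity, ignore.case = TRUE)  # FALSE FALSE TRUE TRUE

# The less specific string matches every variant
grepl("tab", quantity, ignore.case = TRUE)     # TRUE TRUE TRUE TRUE
```

The trade-off is that very short strings may also match unrelated text, so the results are worth inspecting on the data source of interest.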
Note: Common samples of dosage forms:
Run `quantity_extract()` with the following arguments as an initial trial:
df <- quantity_extract(rx_df = antidep_ukb_rx,
liquid_forms = c("susp", "suspension", "liquid", "syrup"),
solid_forms = c("tab", "cap"),
form_info_col = c("quantity", "drug_name"),
quantity_info_col = "quantity",
liq_form_suffix = "ml")
Argument | Details |
---|---|
`rx_df` | The prescription dataframe |
`liquid_forms` | Possible liquid dosage forms |
`solid_forms` | Possible solid dosage forms |
`form_info_col` | Column names from which dosage form information is extracted, in order (`quantity`, `drug_name`) |
`quantity_info_col` | Column names from which quantity information is extracted (`quantity`) |
`liq_form_suffix` | String to capture quantity of liquid dosage forms (default: `ml`) |
Afterwards, check the outputs as below:
quantity_extract()
drug_name | quantity | chem_name | func_class | strength_unit | strength | form | form_quant |
---|---|---|---|---|---|---|---|
Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) | 30 tab - 20 mg | paroxetine | SSRI | mg | 20 | tab | 30 |
Dutonin 100mg tablets (Bristol-Myers Squibb Pharmaceuticals Ltd) | 84 tabs | nefazodone | SARI | mg | 100 | tab | 84 |
Paroxetine 10mg/5ml oral suspension sugar free | 60 ml(s) - 10 mg/5 ml | paroxetine | SSRI | mg/5ml | 10 | susp | 60 |
Venlafaxine 37.5mg tablets | 1 - Pack of 56 | venlafaxine | SNRI | mg | 37.5 | tab | NA |
Clomipramine 25mg capsules | 140 capsules | clomipramine | tricyclic_antidepressant | mg | 25 | cap | 140 |
Dosulepin 25mg capsules | 39 capsule(s) | dosulepin | tricyclic_antidepressant | mg | 25 | cap | 39 |
Efexor 37.5mg tablets (Wyeth Pharmaceuticals) | 14 tablets - 37.5 mg | venlafaxine | SNRI | mg | 37.5 | tab | 14 |
Fluvoxamine 50mg tablets | 30 tab | fluvoxamine | SSRI | mg | 50 | tab | 30 |
Duloxetine 30mg gastro-resistant capsules | 60.000 | duloxetine | SNRI | mg | 30 | cap | NA |
AMITRIPTYLINE 25mg tablets | 60.000 | amitriptyline | tricyclic_antidepressant | mg | 25 | tab | NA |
Mirtazapine Orodispersible TABS 30MG | 90.000 | mirtazapine | other | mg | 30 | tab | NA |
Sertraline 100mg tablets | 10 tablets | sertraline | SSRI | mg | 100 | tab | 10 |
Duloxetine Gastro Resistant CAPS 30MG | 14.000 | duloxetine | SNRI | mg | 30 | cap | NA |
Yentreve 20mg gastro-resistant capsules (Eli Lilly and Company Ltd) | 56 caps | duloxetine | SNRI | mg | 20 | cap | 56 |
Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) | 120 tablet(s) - 20 mg | paroxetine | SSRI | mg | 20 | tab | 120 |
Amitriptyline 25mg tablets | 74 tablet | amitriptyline | tricyclic_antidepressant | mg | 25 | tab | 74 |
Duloxetine 30mg gastro-resistant capsules | 30 capsule - 30 mg | duloxetine | SNRI | mg | 30 | cap | 30 |
Nortriptyline 10mg tablets | 14 tablet - 10 mg | nortriptyline | tricyclic_antidepressant | mg | 10 | tab | 14 |
DOTHIEPIN TABS 75MG | 21.000 | dosulepin | tricyclic_antidepressant | mg | 75 | tab | NA |
Zispin SolTab 45mg orodispersible tablets (Merck Sharp & Dohme Ltd) | 60 tablet | mirtazapine | other | mg | 45 | tab | 60 |
Running `quantity_extract()` creates two new columns in the prescription dataframe:

- `form`: Dosage form extracted using the `liquid_forms` / `solid_forms` arguments specified by the user.
- `form_quant`: Numbers preceding `form`.

Note: Dosage form (`form`) was not extracted from 79 rows of prescriptions. To remove these, set `no_form_remove` to `TRUE`.
Preliminary checks from log file
Number of prescriptions WITHOUT dosage form extracted
If many are present, please consider expanding the list of strings used to capture dosage forms (`liquid_forms` / `solid_forms` arguments), or set the `no_form` argument to assign a default dosage form when the information is missing.
Number of prescriptions WITHOUT quantity value extracted
If many are present (but most prescriptions have a dosage form extracted), it is possible that quantity information is available but does not directly precede the dosage form strings.
Please consider `multiplier_extract()` here and `multi_num_infer()`.
If the user wishes to remove prescriptions without a dosage form extracted by `liquid_forms` and `solid_forms`, set `no_form_remove` to `TRUE`.
These changes will be documented in the updated log file.
In UKB primary care records, information on multipliers is sometimes included as semi-structured text in the prescription dataframe.
A few example prescriptions, extracted from the `antidep_ukb_rx` data object, are shown below.
row | drug_name | quantity | chem_name | func_class | strength_unit | strength |
---|---|---|---|---|---|---|
24 | Fluoxetine 20mg capsules (Teva UK Ltd) | 3 packs of 30 capsule(s) | fluoxetine | SSRI | mg | 20 |
27 | Venlafaxine 37.5mg tablets | 2*56 tablets | venlafaxine | SNRI | mg | 37.5 |
37 | Sertraline 50mg tablets (Sandoz Ltd) | 2 packs of 28 tablet(s) | sertraline | SSRI | mg | 50 |
55 | Sertraline 50mg tablets (Teva UK Ltd) | 2 packs of 28 tablet(s) | sertraline | SSRI | mg | 50 |
91 | Citalopram 10mg tablets (A A H Pharmaceuticals Ltd) | 1 pack of 28 tablet(s) | citalopram | SSRI | mg | 10 |
94 | Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) | 2*30 tablet - 20 mg | paroxetine | SSRI | mg | 20 |
library("TRX", quietly = TRUE)
# Examples of "multipliers"
example = antidep_ukb_rx[grepl("pack|\\*", antidep_ukb_rx$quantity), ]
print(head(example))
multiplier_extract()
To extract the strings related to multipliers in UKB primary care records, users can specify strings under the `multipliers` and `alt_multipliers` arguments in the `multiplier_extract()` function.
Due to the complexity of prescription text across different datasets, we recommend using the default set of strings below for NHS-formatted records:
After the strings are specified, the `multiplier_extract()` function searches for numeric values preceding the multiplier strings in the following order:

1. `multipliers`: Standard strings that likely pinpoint multiplier information, e.g. `packs`, `pack of`.
2. `alt_multipliers`: Non-standard strings that might indicate multiplier information, but as a non-specific / loose match, e.g. "x".
3. Otherwise, the multiplier is coded as `1` or `999` (as quality control, see below).

Argument | Strings | Details |
---|---|---|
`multipliers` | pack, pack of, * | Preferred multiplier strings (prioritized) |
`alt_multipliers` | x | Alternative multiplier strings (less specific, use cautiously) |
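The prioritized search can be sketched in base R as follows. The function `sketch_multiplier()` is hypothetical and is not the T-Rx implementation. It assumes that a prescription with no multiplier string is treated as a multiplier of 1, and that a prescription with several candidate multipliers is flagged as 999 for manual review; both are readings of the outputs shown in this tutorial rather than documented behaviour.

```r
# Hypothetical sketch (NOT the T-Rx implementation) of the prioritized search
sketch_multiplier <- function(text, multipliers = c("pack", "\\*"),
                              alt_multipliers = c("x")) {
  find <- function(strings) {
    pattern <- paste0("([0-9]+)[[:space:]]*(",
                      paste(strings, collapse = "|"), ")")
    hits <- regmatches(text, gregexpr(pattern, text, ignore.case = TRUE))[[1]]
    as.numeric(gsub("[^0-9]", "", hits))  # keep only the leading number
  }
  values <- find(multipliers)                               # preferred strings first
  if (length(values) == 0) values <- find(alt_multipliers)  # then the loose match
  if (length(values) == 0) return(1)    # assumption: no multiplier means 1
  if (length(values) > 1) return(999)   # assumption: ambiguous rows are flagged
  values
}

sketch_multiplier("3 packs of 30 capsule(s)")
sketch_multiplier("2*56 tablets")
sketch_multiplier("tablet(s) - 14 x 50 mg, 14 x 100 mg, 28 x 200 mg")
```

Searching the preferred strings before the loose `x` match keeps strength text such as `14x50mg` from being picked up when a clearer multiplier is present.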
Run `multiplier_extract()` with the following arguments as an initial trial:
df <- multiplier_extract(rx_df = antidep_ukb_rx,
multipliers = c("pack", "pack of" , "\\*"),
alt_multipliers = c("x"),
info_col = c("quantity"),
qc_remove = FALSE)
Argument | Details |
---|---|
`rx_df` | The prescription dataframe |
`multipliers` | Preferred multiplier strings (prioritized) |
`alt_multipliers` | Alternative multiplier strings (less specific, use cautiously) |
`info_col` | Column names from which multiplier information is extracted (`quantity`) |
After running `multiplier_extract()`, you will notice that 2 rows in the resultant prescription dataframe are coded as "999".
row | drug_name | quantity | chem_name | func_class | strength_unit | strength | multiplier |
---|---|---|---|---|---|---|---|
3187 | nefazodone starter pack | tablet(s) - 14 x 50 mg, 14 x 100 mg, 28 x 200 mg | nefazodone | SARI | NA | NA | 999 |
4535 | nefazodone starter pack | 1 - tablets (14x50mg,14x100mg,28x200mg) | nefazodone | SARI | NA | NA | 999 |
For the 2 rows highlighted here, multiple multipliers can be extracted, which is too difficult to resolve within the scope of `multiplier_extract()`.
We suggest handling these prescriptions manually if this happens (usually these are specific drug sets or prescription regimens that require special handling).
As quality control for the above prescriptions, change the `qc_remove` argument to `TRUE`.
df <- multiplier_extract(rx_df = antidep_ukb_rx,
multipliers = c("pack", "pack of" , "\\*"),
alt_multipliers = c("x"),
info_col = c("quantity"),
qc_remove = TRUE)
You would see a log file as follows:
If you want to change the column name for the multiplier column, please specify it with the `multiplier_colname` argument.
df <- multiplier_extract(rx_df = antidep_ukb_rx,
multipliers = c("pack", "pack of" , "\\*"),
alt_multipliers = c("x"),
info_col = c("quantity"),
multiplier_colname = "multi",
qc_remove = TRUE)
drug_name | quantity | chem_name | func_class | strength_unit | strength | multi |
---|---|---|---|---|---|---|
Seroxat 20mg tablets (GlaxoSmithKline UK Ltd) | 30 tab - 20 mg | paroxetine | SSRI | mg | 20 | 1 |
Dutonin 100mg tablets (Bristol-Myers Squibb Pharmaceuticals Ltd) | 84 tabs | nefazodone | SARI | mg | 100 | 1 |
Paroxetine 10mg/5ml oral suspension sugar free | 60 ml(s) - 10 mg/5 ml | paroxetine | SSRI | mg/5ml | 10 | 1 |
Venlafaxine 37.5mg tablets | 1 - Pack of 56 | venlafaxine | SNRI | mg | 37.5 | 1 |
Clomipramine 25mg capsules | 140 capsules | clomipramine | tricyclic_antidepressant | mg | 25 | 1 |
Dosulepin 25mg capsules | 39 capsule(s) | dosulepin | tricyclic_antidepressant | mg | 25 | 1 |
Please post questions as an issue on the T-Rx GitHub repo here.
The T-Rx package is currently under beta testing. Most functions should have adequate documentation on possible errors.
Please kindly reach out to Chris Lo (chris.lowh@kcl.ac.uk) for feedback on documentation.