Statistical applications in the pharmaceutical and chemical fields

by Riccardo Bonfichi

MULTIPLE LINEAR REGRESSION: A POWERFUL STATISTICAL TOOL TO UNDERSTAND AND IMPROVE APIs MANUFACTURING PROCESSES

10/26/2020

Abstract

Over time, all production processes tend to drift from their initial conditions for many reasons, such as changes in materials, personnel or environment.
This process variability often goes unnoticed, yet it is captured by the data that Quality Control systematically collects for batch release purposes.
If these data are analyzed using Multiple Linear Regression (MLR), they reveal a lot about the manufacturing processes that generated them.
This knowledge is of great practical use to the company, as it makes it possible to:
• understand which parameters most affect product quality and how they interact with each other,
• establish whether the parameters currently controlled are really the ones needed or which others would be better to consider,
• define / improve a product control strategy based on experimental data and quantitative models rather than speculation,
• define and graphically represent the design space (ICH Q8) inherent to the production process considered,
• identify possible ways to improve process performance and steer this improvement scientifically,
• mitigate the regulatory impact in case of changes.
This post details, step by step, how this ready-to-use process knowledge can be obtained from readily available experimental data.
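
As a rough illustration of the kind of model involved, the sketch below fits a multiple linear regression by ordinary least squares using only the Python standard library. It is not the analysis from the post: the batch records, the two predictors (temperature and time), the response (an impurity level) and all function names are hypothetical, and the response is generated without noise so that the fit can be checked by eye.

```python
# Minimal multiple linear regression sketch (ordinary least squares).
# All data and names below are hypothetical, chosen only for illustration.

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy [a | b] so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(predictors, response):
    """Return [intercept, b1, b2, ...] minimizing the sum of squared residuals."""
    design = [[1.0] + list(row) for row in predictors]  # intercept column
    p = len(design[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, response)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def r_squared(predictors, response, beta):
    """Coefficient of determination of the fitted model."""
    fitted = [beta[0] + sum(b * x for b, x in zip(beta[1:], row)) for row in predictors]
    mean_y = sum(response) / len(response)
    ss_res = sum((y - f) ** 2 for y, f in zip(response, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in response)
    return 1.0 - ss_res / ss_tot


# Hypothetical batch records: (reaction temperature in °C, reaction time in h).
batches = [(60, 4.0), (62, 4.5), (65, 4.0), (58, 5.0),
           (63, 5.5), (66, 4.8), (61, 5.2), (64, 4.2)]
# Hypothetical QC result per batch (impurity, %), generated without noise.
impurity = [0.02 * t + 0.05 * h - 1.0 for t, h in batches]

beta = fit_mlr(batches, impurity)
print("intercept, temperature, time:", [round(b, 4) for b in beta])
print("R^2:", round(r_squared(batches, impurity, beta), 4))
```

With real QC data the fit would not be exact, and residual diagnostics and coefficient significance would need to be examined before drawing conclusions; a statistical environment such as R reports these directly.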


QUALITY METRICS AND DATA CONSISTENCY – Part 2

08/01/2020

Abstract

This second part continues and completes the previous one.
This second post covers the following points:
Read more

QUALITY METRICS AND DATA CONSISTENCY – Part 1

08/01/2020

Abstract

In 2002, FDA launched the “Pharmaceutical cGMPs for the 21st Century” initiative with the aim of promoting a modern, risk- and science-based approach to production. In 2015, still in that context, FDA asked the industry for input to define an “FDA Quality Metrics program”, and in December 2019 it announced that the implementation of a “Quality Metrics Program” had become a priority. Taking its cue from these FDA stimuli, this post and the next deal with the use of quantitative tools (or Quality Metrics) for understanding, monitoring and possibly improving pharmaceutical manufacturing processes. Real case studies that show the practical application of Quality Metrics to typical QA / QC topics are discussed, and their statistical analysis is detailed step by step. In practice, it is shown how, from data normally available at the company, it is possible to easily extract useful information on the state of the processes and, above all, to predict their possible outcome. It is exactly this combination of two aspects, one descriptive and the other predictive, that makes it possible to truly understand a given process, control it and possibly even improve it. This knowledge is also useful for managing issues like OOS, OOT, deviations, etc. In fact, poor knowledge of the process and of its quality indicators can lead one to treat as anomalous what is not. Given the number of Quality Metrics considered and the breadth of the case studies discussed, the topic was split into two parts. This first post covers the following points:
Read more

Basics of Statistical Risk Analysis

07/23/2020

Abstract

Risk is an essential part of daily life, and society as a whole needs to take risks to continue growing and developing. Risk management is the process of identifying, analyzing and responding to risk factors. According to ICH Q9, Risk Assessment consists of the identification of hazards and the analysis and evaluation of risks associated with exposure to those hazards. Apart from a few exceptions (e.g., quantitative FTA), most of the risk analysis tools commonly used in the pharmaceutical field (e.g., FMEA) are basically subjective. However, in some cases, there are statistical techniques that allow us to assess the extent of the risk associated with a decision. A typical example is the decision on whether a lot conforms, based on the analysis of a sample taken from it. In such a decision two parties must be considered, the PRODUCER and the CUSTOMER (or CONSUMER), who run two different types of risk. The PRODUCER runs the risk of rejecting a “good lot”, while the CUSTOMER (or CONSUMER) runs the risk of accepting a “non-compliant” or “poor quality” product. This post briefly addresses this topic.
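
To make the two risks concrete, here is a minimal sketch for a single sampling plan by attributes, assuming a binomial model (lot much larger than the sample). The plan (sample size `n`, acceptance number `c`) and the two quality levels are hypothetical values chosen for illustration, not figures from the post.

```python
from math import comb


def acceptance_probability(n, c, p):
    """P(accept the lot) = P(at most c defectives in a sample of n items),
    when each item is defective with probability p (binomial model)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))


# Hypothetical sampling plan: inspect 50 items, accept if at most 1 is defective.
n, c = 50, 1

# Hypothetical quality levels (assumptions, not from the post):
good_lot_fraction = 0.01   # defect rate of a lot the producer considers "good"
poor_lot_fraction = 0.10   # defect rate of a lot the customer considers "poor"

# Producer's risk: probability of rejecting a good lot.
producer_risk = 1 - acceptance_probability(n, c, good_lot_fraction)
# Consumer's risk: probability of accepting a poor lot.
consumer_risk = acceptance_probability(n, c, poor_lot_fraction)

print(f"producer's risk: {producer_risk:.4f}")
print(f"consumer's risk: {consumer_risk:.4f}")
```

Evaluating `acceptance_probability` over a range of defect rates gives the operating characteristic curve of the plan, which shows how both risks change when `n` or `c` is modified.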


Regulatory Technical Writing - Labor Ergo Scribo!

07/17/2020

Abstract

Those who work must necessarily write! The aims are many: to communicate the results of one's studies, to give operating instructions, to respond to requests, etc. In all cases, however, if the message does not reach the recipient, the entire communication process fails, and the consequences can be significant. Consider that at least a third of an executive's time is spent writing documents, and that the perceived quality of a given piece of work, and the decision to continue it, interrupt it or finance it, are often determined solely by the document that describes it! The focus of this presentation is therefore to analyze the structure of a technical document and provide practical suggestions for its preparation. Writing, however, is much more than this, and therefore the presentation also considers, more generally, what it means to write and how to do it.


Solvents Classification using a Multivariate Approach: Cluster Analysis.

07/16/2018

Abstract

This post continues and completes the analysis, begun in the previous post, of a database of 64 solvents, each described by eight physico-chemical descriptors. The subject of this study is the application of Cluster Analysis to find groups in the data, i.e., to identify which observations are alike and categorize them into groups, or clusters. As clustering is a broad set of techniques, this study focuses only on the so-called hard clustering methods, i.e., those that assign observations with similar properties to the same group and dissimilar data points to different groups. Two types of algorithms have been considered: hierarchical and partitional. Regardless of the technique chosen, the experimental evidence indicates the presence, in the database, of:
• three main groups, each consisting of individuals categorized as similar to one another, and
• a few isolated individuals dissimilar from the others.
A similar finding was also obtained in the previous post using 2D contour plots. A closer examination of these three main groups of solvents shows a finer structure consisting of smaller groups of highly similar individuals, e.g., members of a given chemical family (such as alcohols or chlorinated hydrocarbons) or chemical entities sharing common characteristics (such as aprotic dipolar solvents).
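
As an illustration of what a partitional hard clustering method does, the sketch below implements a basic k-means loop in plain Python. It is only an example of the family of methods mentioned above, not necessarily the algorithm used in the post; the nine two-dimensional points and the starting centers are made up and stand in for (standardized) solvent descriptors.

```python
# Minimal k-means sketch: every point is assigned to exactly one cluster.
# Data and starting centers are hypothetical, for illustration only.

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, initial_centers, max_iter=100):
    """Alternate assignment and center update until the labels stop changing."""
    centers = list(initial_centers)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = mean_point(members)
    return labels, centers


# Hypothetical observations forming three well-separated groups.
points = [
    (1.0, 1.1), (1.2, 0.9), (0.8, 1.0),
    (5.0, 5.2), (5.1, 4.8), (4.9, 5.0),
    (9.0, 1.0), (9.2, 1.2), (8.8, 0.9),
]
labels, centers = k_means(points, initial_centers=[points[0], points[3], points[6]])
print(labels)
print(centers)
```

The result of k-means depends on the number of clusters and on the starting centers, and the variables should normally be put on a common scale first, which is one reason to compare partitional results with a hierarchical method.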


Solvents Classification using a Multivariate Approach: Correlation and Principal Component Data Analysis.

06/01/2018

Abstract

The identification of data-driven criteria for making an informed choice of solvents for practical applications is a long-standing issue in the chemical field. Solvents, in fact, are mainly selected based on the chemist's experience and intuition, guided by parameters such as polarity, basicity and acidity. As early as 1985, at least two research groups approached the issue of solvent selection using multivariate statistical methods. These scientists, using different databases, each based on different types of physicochemical descriptors, obtained different classification patterns. This post takes one of those databases and repeats the data analysis, detailing it systematically. It deals with the first part of the process, covering the intercorrelation among the physicochemical descriptors used to characterize the solvents under study and Principal Component Analysis. The correlations found make it possible to capture 70% of the initial data variability using just two principal components, the first of which is related to the “polarity/polarizability” and “lipophilicity” of molecules and the second to the “strength of intermolecular forces”. The use of these two principal components suggests the possibility of grouping solvents into aggregates (or clusters) of similar individuals, and this aspect will be covered in the following post.
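
The idea of "variability captured by the first principal components" can be sketched as follows. The example computes the correlation matrix of three made-up descriptors, extracts its two largest eigenvalues by power iteration with deflation, and reports the fraction of total variance they explain. The data, the choice of three descriptors and all function names are assumptions for illustration; the numbers do not reproduce the 70% figure quoted above.

```python
from math import sqrt

# Minimal PCA sketch on the correlation matrix (hypothetical data).

def standardize(columns):
    """Center each column and scale it to unit sample standard deviation."""
    out = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        out.append([(v - mean) / sd for v in col])
    return out


def correlation_matrix(columns):
    z = standardize(columns)
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]


def leading_eigenpair(matrix, iterations=500):
    """Power iteration: largest eigenvalue and its unit eigenvector."""
    p = len(matrix)
    v = [1.0 / sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p))
    return eigenvalue, v


def deflate(matrix, eigenvalue, vector):
    """Remove the component already found, so the next one can be extracted."""
    p = len(matrix)
    return [[matrix[i][j] - eigenvalue * vector[i] * vector[j] for j in range(p)]
            for i in range(p)]


# Three hypothetical descriptors measured on six solvents (one list per descriptor).
descriptors = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.1, 3.9, 6.2, 7.8, 10.1, 12.0],   # strongly correlated with the first
    [5.0, 3.0, 6.0, 2.0, 7.0, 4.0],     # weakly correlated with the others
]

corr = correlation_matrix(descriptors)
total_variance = sum(corr[i][i] for i in range(len(corr)))  # equals the number of variables

lam1, pc1 = leading_eigenpair(corr)
lam2, pc2 = leading_eigenpair(deflate(corr, lam1, pc1))

explained_one = lam1 / total_variance
explained_two = (lam1 + lam2) / total_variance
print(f"variance explained by PC1: {explained_one:.3f}")
print(f"variance explained by PC1 + PC2: {explained_two:.3f}")
```

The entries of `pc1` and `pc2` are the loadings of each descriptor on the component, which is what allows a component to be interpreted in chemical terms. In practice a library routine would be used instead of hand-written power iteration.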


A different way to look at pharmaceutical Quality Control data: multivariate instead of univariate.

05/09/2018

Abstract

In the pharmaceutical industry, Quality Control (QC) data are typically arranged in data tables, each row of which refers to a specific production lot and contains the results of different types of measurements (chemical and microbiological). Since there is a specific data table for each active chemical entity or dosage form, and since all lots listed in it are manufactured using the same approved process, the data table contains the “analytical fingerprint” of that specific manufacturing process. In spite of their tabular form, QC data are usually reviewed, evaluated and trended in a univariate mode, i.e., each type of data is analyzed individually using statistical tools such as control charts, box plots, etc. The dataset is therefore studied “by columns”. This post proposes a different way to analyze QC data, i.e., a multivariate approach that improves upon separate univariate analyses of each variable by using information about the relationships between the variables. Moreover, the combination of multivariate methods with the power of the programming language R and its graphics tools makes it possible to analyze data relying mainly on graphics and, as stated by Chambers et al., “there is no statistical tool that is as powerful as a well-chosen graph”. This post shows how, using R for combined multivariate data analysis and visualization, the information contained in a QC chemical dataset can be easily extracted and converted into “knowledge ready to use”.


Riccardo Bonfichi

Hi and welcome to my website!

I am a chemist and I have worked in the pharmaceutical industry since 1982, gaining experience in Analytical R&D, Quality Control and Quality Assurance. Over the last six or seven years, I have developed a deep, personal interest in statistical data analysis. After starting with Minitab and the univariate approach, I later discovered R/RStudio and Multivariate Analysis. Both discoveries impressed and fascinated me, and they are among the main reasons for creating this website. I hope it will allow me to get in touch with scientists working in the field of Multivariate Analysis and Clustering, to learn from them and cooperate with them. Therefore, please get in touch to talk about statistical methods in the pharmaceutical / chemical industries and, in particular, Multivariate Analysis and Data Clustering.
The content of this website and the opinions therein have nothing to do with my current position or with my previous or current employers.

UNIVERSITY EDUCATION

1986 Master in Analytical and Chemical Methods of Fine Organic Chemistry
Polytechnic University of Milan, Italy

1981 Graduated in Chemistry
University of Milan, Italy

Training courses
• Statistical Process Control for the FDA regulated Industry, Pragmata, Teramo, May 3rd - 4th, 2016
• Statistics for Data Science with R, Quantide, Legnano, October 19th - 20th, 2018
• Data Mining with R, Quantide, Legnano, February 15th - 16th, 2018
• Intermediate R Course, DataCamp, February 27th, 2018
• Data Visualization and Dashboard with R, Quantide, Legnano, June 25th - 26th, 2018

Affiliations
• Member of the Italian Council of Chartered Chemists (since July 1996)
• Member of the Italian Statistical Society (since May 2018)

Languages
My mother tongue is Italian. From 1989 to 1992, I lived and worked in Basel (Switzerland), where I learned German. Besides this, I also speak English and a bit of French.

r-bloggers.com

quantide.com

r-project.org

rstudio.com/





Privacy policy

Law D.Lgs. n. 196/03

COOKIES LAW
This site doesn't use any type of cookies (technical cookies or profiling cookies).
Pursuant to Section 122 of the “Italian Privacy Act” and Authority Provision of 8 May 2014, no consent is required from site visitors.
Garante della privacy (en-it)
PERSONAL DATA
This website doesn't collect or store any kind of personal data.
