Lab02 - Investigation of prognostic factors (factors involved in pathogenesis) through case-control studies

For help during the practical activities, see: Instructions and Interpretations.


Case-control studies are an important category of clinical studies that evaluate the link between one or more prognostic factors and various pathological conditions.

These studies assess the relationship between the factor(s) and the occurrence of a disease, or whether a particular factor changes the progression of a disease (amelioration or healing). Prognostic factors can be risk factors (favoring the occurrence of a disease) or protective factors (preventing the disease or helping the healing process). The relationship between possible prognostic factors and disease can be investigated through any type of data collection, without intervention by the researcher.
A case-control study starts from the premise that the disease is rare. The cases (subjects who have a certain disease) are identified first, and then the controls are chosen: subjects with similar characteristics (gender, age, socio-economic status) who do not have the disease. Only after that does the researcher look for prognostic factors in these groups, using retrospective or cross-sectional methods. Methods for identifying prognostic factors include laboratory determinations, standardized questionnaires, interviews with the subject or family members, and consultation of the medical records of the enrolled subjects.
Strengths of the case-control study:
· Efficient for rare diseases or diseases with a long latency period
· Relatively easy to carry out (inexpensive, takes a short period of time)
· Allows the analysis of several prognostic factors
Weaknesses of the case-control study:
· Can yield false information (e.g. subjects forget some previous exposures, or sick subjects tend to remember even non-significant exposures more readily than healthy people)
· The time between the onset of exposure and the onset of the illness may be difficult to determine

Example of a case-control study:

An example of a case-control study is the one conducted by Destaalem Gebremedhin, Haftu Berhe and Kahsu Gebrekirstos, entitled "Risk Factors for Neonatal Sepsis in Public Hospitals of Mekelle City, North Ethiopia, 2015: Unmatched Case Control Study", published in 2016 in the journal PLoS One (ISI, IF: 2.80, Q1). Article source:

Definitions used in the study:
Neonatal sepsis - systemic inflammatory response in the presence of, or as a result of, a proven or suspected infection in a newborn. The infection can be bacterial, viral or fungal.


Gebremedhin D et al. conducted a study on newborns registered at public hospitals in Mekelle, North Ethiopia, from December 2014 to June 2015, to identify risk factors for neonatal sepsis. The data needed for the study were collected from the infants' medical records and a questionnaire administered to the mothers. Two groups of subjects were formed: a case group of 78 newborns diagnosed with neonatal sepsis and a control group of 156 newborns without sepsis.
The inclusion criteria for cases were:
· newborns registered in pediatric or neonatal intensive care units in public hospitals in Mekelle City, North Ethiopia, during the study period with at least one of the following IMNCI (Integrated Management of Neonatal and Childhood Illness) clinical features: fever (≥37.5 °C) or hypothermia (≤35.5 °C), fast breathing (≥60 breaths per minute), severe chest retraction, movement only when stimulated, seizures, lethargy or unconsciousness
· at least 2 haematological criteria: total white blood cell count (12000 cells/mm3), absolute neutrophil count (<1500 cells/mm3 or >7500 cells/mm3), erythrocyte sedimentation rate (>15 mm/1 h), platelet count (>440 cells/mm3)
The inclusion criteria for the control group were:
· newborns who did not meet the sepsis criteria and who were enrolled in the pediatric or neonatal intensive care units of public hospitals in Mekelle, North Ethiopia, during the study period

Research protocol

1. Aim and objectives

The aim of this study was to evaluate the association between the mother's history of urinary tract infection (UTI) or sexually transmitted infection (STI) and neonatal sepsis.

Objectives:
· testing for the link between the factor (maternal history of UTI/STI) and the disease (neonatal sepsis)
· quantifying the importance of this link: estimation of the odds ratio (OR)

2. Domain of research: risk factors and/or prognostic factors

3. Study type:
· Based on the study objectives: analytical study
· Based on the researcher's intervention: observational study
· Based on the matching technique used in choosing the groups: without matching

Note:
analytical study = groups of patients are compared and associations between different clinical characteristics are tested
descriptive study = describes a number of cases or a single group of patients and does not search for possible associations/links
observational study = a type of study in which the researcher does not intervene in the progression of the disease
matching in studies = for each subject in the case/exposed group there is a subject with similar characteristics in the control/unexposed group, e.g. matching by same sex, similar age (identical or +/- 2 years), same risk factors (e.g. diabetics). Matching can be 1:1, or 1:2, 1:3, 1:4 (one case to 1/2/3/4 controls).

4. Target population and study sample:
Target population: newborns delivered in the obstetrics departments of public hospitals in Mekelle, North Ethiopia, December 2014 - June 2015.
Study sample:
Inclusion criteria: babies born between December 2014 and June 2015 in the public hospitals of Mekelle, North Ethiopia.
Exclusion criteria: Note: in this scenario, the exclusion criteria are not mentioned.
Sample size: the authors mention in the article that the sample size was estimated at 234 subjects for a type II error of 0.20 (a study power of 80%), a proportion of UTI/STI among neonates without sepsis of 13% (determined in another study on a similar population), a control-to-case ratio of 2:1, and an OR = 2.87.
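The inputs quoted for the sample size (exposure proportion among controls, target OR, control-to-case ratio, power) can be plugged into a standard approximation for unmatched case-control studies. The sketch below uses a Kelsey-type formula without continuity correction; the formula choice, the two-sided significance level of 0.05 and the function name are assumptions, since the text does not state which method the authors used. The result therefore need not equal the 234 subjects reported, which may also include a continuity correction or an allowance for non-response.

```python
from math import ceil
from statistics import NormalDist


def case_control_sample_size(p0, odds_ratio, controls_per_case=2.0,
                             alpha=0.05, power=0.80):
    """Approximate sample size for an unmatched case-control study
    (Kelsey-type formula, no continuity correction).

    alpha = 0.05 (two-sided) is an assumption, not taken from the article.
    """
    # Exposure proportion among cases implied by p0 and the target OR.
    p1 = odds_ratio * p0 / (1 + p0 * (odds_ratio - 1))
    r = controls_per_case
    # Weighted average exposure proportion across both groups.
    p_bar = (p1 + r * p0) / (r + 1)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n_cases = ((z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) * (r + 1)
               / (r * (p1 - p0) ** 2))
    cases = ceil(n_cases)
    controls = ceil(r * cases)
    return cases, controls


cases, controls = case_control_sample_size(p0=0.13, odds_ratio=2.87)
print(f"cases = {cases}, controls = {controls}, total = {cases + controls}")
```

Changing `controls_per_case` shows how the 2:1 design trades fewer cases for more controls.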
We believe that the size of the sample was sufficient.

5. Data collection method
· based on the studied population: systematic random sampling (see the Methods section of the article)
· based on the duration of data collection: longitudinal retrospective
· based on the grouping method: case-control (the two groups were defined according to the presence/absence of neonatal sepsis)
!Note: Types of probabilistic sampling
Simple random sampling: each element of the population has the same chance of being included in the sample
Systematic random sampling: a starting number is chosen at random, and each further unit (element) of the sample is obtained by adding a fixed step to it
Stratified random sampling: the research population is divided into strata according to certain characteristics (gender, age groups); sub-samples proportional to the size of each stratum are drawn at random from each stratum and then combined into a single sample
Cluster random sampling: the research population is divided into clusters, and a number of clusters are randomly selected together with all the units they contain
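Systematic random sampling, the method named for this study, can be illustrated with a minimal sketch. The sampling frame, the sample size and the function name below are hypothetical and serve only to show the "random start plus fixed step" idea.

```python
import random


def systematic_sample(units, sample_size, rng=None):
    """Systematic random sampling: random start, then every k-th unit."""
    rng = rng or random.Random()
    step = len(units) // sample_size      # fixed sampling interval k
    if step < 1:
        raise ValueError("sample_size cannot exceed the number of units")
    start = rng.randrange(step)           # random start within the first k units
    return units[start::step][:sample_size]


# Hypothetical sampling frame: record numbers 1..1000 (not study data).
frame = list(range(1, 1001))
sample = systematic_sample(frame, sample_size=50, rng=random.Random(2015))
print(sample[:5])
```

Passing a seeded `random.Random` makes the draw reproducible, which is convenient when the selection has to be documented.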
6. Statistical analysis
Demographic and clinical characteristics of the mother:
Age: continuous quantitative variable
Civil status, religion, ethnicity, occupation: nominal qualitative variables
Educational level (primary school, secondary school, college or university): ordinal qualitative variable
History of urinary tract infection or sexually transmitted infection (UTI/STI), hypertensive disorders, the birth environment, prolonged rupture of membranes, intrapartum fever: dichotomous qualitative variables
Demographic and clinical features of the newborn:
Newborn gender (M/F), Apgar score at 1 minute (<7 coded as 1 versus ≥7 coded as 0), Apgar score at 5 minutes (<7 coded as 1 versus ≥7 coded as 0), resuscitation at birth (yes/no), sepsis (yes/no): dichotomous qualitative variables

Weight at birth (1500-2500 grams,  ≥2500 grams), gestational age ( 42 weeks): ordinal qualitative variables

Data description: It can be done either through a frequency table, contingency table, or vertical or horizontal column chart.

The bivariate association between the exposure factor and the disease can be tested with the Chi-square test or Fisher's exact test (the latter applies if the expected frequencies in the contingency table are below 5 in more than 20% of the cells).
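The choice between the two tests depends on the expected frequencies, which can be computed directly from the row and column totals. The sketch below checks the usual rule of thumb (expected count below 5 in more than 20% of cells) and computes the Pearson Chi-square statistic for a 2x2 table. The counts are hypothetical, not the study data, and the function names are illustrative.

```python
from math import erfc, sqrt


def expected_counts(table):
    """Expected cell counts under independence (row total * column total / N)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def fisher_recommended(table):
    """True if more than 20% of cells have an expected count below 5."""
    cells = [e for row in expected_counts(table) for e in row]
    return sum(e < 5 for e in cells) / len(cells) > 0.20


def chi_square_2x2(table):
    """Pearson Chi-square statistic (no continuity correction) and p-value."""
    expected = expected_counts(table)
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    # With 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return stat, erfc(sqrt(stat / 2))


# Hypothetical counts, NOT the study data:
# rows = exposed / unexposed, columns = cases / controls.
table = [[20, 10], [30, 40]]
if fisher_recommended(table):
    print("Expected counts too small: use Fisher's exact test")
else:
    stat, p = chi_square_2x2(table)
    print(f"Chi-square = {stat:.2f}, p = {p:.3f}")
```

The p-value shortcut via `erfc` is valid only for one degree of freedom, which is the case for a 2x2 table.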

Multivariate association between the exposure factor and the disease: through multivariate logistic regression.

To quantify the importance of the association between the risk factor and the disease, the odds ratio (OR) for sepsis and the associated 95% confidence interval (CI) will be calculated.
· If the estimated odds ratio OR > 1 and both ends (lower limit and upper limit) of the confidence interval are greater than 1, then it can be stated that the exposure factor is a risk factor
· If the estimated odds ratio OR < 1 and both ends (lower limit and upper limit) of the confidence interval are less than 1, then we can assume that the exposure factor is a protective factor
· If the odds ratio OR = 1 or the confidence interval contains 1, then we do not have enough evidence to demonstrate that the exposure factor is a risk or a protective factor
!Note: The odds ratio can be determined in the unadjusted form (also referred to as the "crude odds ratio"), representing the odds of disease in exposed versus non-exposed patients, and/or in the adjusted form (the "adjusted odds ratio"), representing the odds of disease in exposed versus non-exposed patients adjusted for the presence of other covariates.
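The three decision rules above can be sketched in a few lines. The example computes a crude odds ratio from a 2x2 table with a Wald (Woolf) confidence interval on the log scale, which is one common method and not necessarily the one used by EpiInfo or by the authors. The counts are hypothetical, not the study data, and the function names are illustrative.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_ci(a, b, c, d, level=0.95):
    """Crude OR with a Wald (Woolf) confidence interval.

    a = exposed cases,   b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    or_ = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of ln(OR)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return or_, exp(log(or_) - z * se_log), exp(log(or_) + z * se_log)


def interpret(lower, upper):
    """Apply the decision rules to the confidence limits."""
    if lower > 1:          # whole interval above 1, so OR > 1 as well
        return "risk factor"
    if upper < 1:          # whole interval below 1, so OR < 1 as well
        return "protective factor"
    return "not enough evidence"


# Hypothetical counts, NOT the study data.
or_, lower, upper = odds_ratio_ci(a=20, b=10, c=30, d=40)
print(f"OR = {or_:.2f} (95% CI: {lower:.2f}-{upper:.2f}): {interpret(lower, upper)}")
```

A zero in any cell makes this formula undefined; in that situation an exact method or a continuity correction would be needed.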

Expected results. Data analysis and presentation

1. Sample description

In the present study, the description of the characteristics measured on the sample was made using the contingency tables presented in a condensed form in the Results section of the article.

Table 1. Description of the socio-demographic characteristics of the study groups


Source: image taken from the article Gebremedhin D, Berhe H, Gebrekirstos K. Risk Factors for Neonatal Sepsis in Public Hospitals of Mekelle City, North Ethiopia, 2015: Unmatched Case Control Study. PLoS One. 2016 May 10;11(5):e0154798.


2. Bivariate and multivariate association between a history of urinary tract infection or sexually transmitted infection and neonatal sepsis

The contingency table of risk factor (UTI/STI) and disease (sepsis):

[2 x 2 contingency table: UTI/STI (yes/no) by sepsis (yes/no)]

UTI/STI = the mother's history of urinary tract infection or sexually transmitted infection

Table 2. Bivariate and multivariate association between UTI/STI and neonatal sepsis


Source: taken from the article Gebremedhin D, Berhe H, Gebrekirstos K. Risk Factors for Neonatal Sepsis in Public Hospitals of Mekelle City, North Ethiopia, 2015: Unmatched Case Control Study. PLoS One. 2016 May 10;11(5):e0154798.

Column Chart for the relationship between risk factor and disease (made in Excel):


Figure 1. Relationship between the mother's history of UTI/STI and neonatal sepsis

The odds ratio (unadjusted OR) and the associated confidence interval (Table 2). The EpiInfo program can be used to calculate the point estimate and the associated confidence interval. Reporting format: point estimate (95% CI: lower limit-upper limit); OR = 6.8 (95% CI: 3.6-12.8)

The p-value (obtained in EpiInfo) is reported as p = value - name of the test used, with a maximum of 3 decimal places; if p < 0.001, report "p < 0.001": p < 0.001 - Chi-square test
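The two reporting conventions described here (point estimate with its confidence limits, and the p-value with at most 3 decimal places) can be captured in small helper functions. The helper names are hypothetical; the example values are the ones reported in the text.

```python
def format_or(or_, lower, upper, decimals=1):
    """Point estimate followed by the 95% CI as lower limit-upper limit."""
    return (f"OR = {or_:.{decimals}f} "
            f"(95% CI: {lower:.{decimals}f}-{upper:.{decimals}f})")


def format_p(p, test_name):
    """p-value with 3 decimal places, or 'p < 0.001' for smaller values."""
    if p < 0.001:
        return f"p < 0.001 - {test_name}"
    return f"p = {p:.3f} - {test_name}"


print(format_or(6.8, 3.6, 12.8))
print(format_p(0.0004, "Chi-square test"))
```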

Note: In the present study, the authors also estimated the odds ratio adjusted for various covariates such as maternal age, background, type of birth, prolonged rupture of membranes, etc.


Interpretation of data. Discussions

1. Interpreting the results from a statistical point of view:
The aim of the study was to test and quantify the association between the mother's history of UTI/STI and neonatal sepsis.
From a statistical point of view, the association was tested by formulating two hypotheses (the null hypothesis and the alternative hypothesis), the null hypothesis being the one tested. If it is rejected, we conclude in favor of the alternative hypothesis.
Null hypothesis: There is no significant association between the mother's history of UTI/STI and neonatal sepsis.
Alternative hypothesis: There is a significant association between the mother's history of UTI/STI and neonatal sepsis.
Because p < 0.05, there is a significant association between the mother's history of UTI/STI and neonatal sepsis.
From a statistical point of view, the association was quantified by the point estimate of the OR (unadjusted and adjusted) and its associated 95% confidence interval.
Point estimate: OR = 6.8; newborns whose mothers had a history of UTI/STI during pregnancy had 6.8 times higher odds of developing sepsis compared to neonates whose mothers did not have UTI/STI.
Confidence interval for OR: 95% CI: 3.6-12.8 - we are 95% confident that the odds ratio in the study population lies between 3.6 and 12.8 (if we repeatedly drew samples of the same size from the study population, 95% of the resulting confidence intervals would contain the odds ratio in the target population of newborns).
In addition, OR > 1 and both ends of the confidence interval were greater than 1, so the mother's history of UTI/STI is a risk factor for neonatal sepsis.
!Note: In the study presented, the authors also analyzed the association between UTI/STI and neonatal sepsis in the presence of other covariates (maternal and neonatal factors) such as hypertension in pregnancy, intrapartum fever, birth complications, low Apgar score at 5 minutes, etc., and obtained an adjusted OR = 5.23 (95% CI: 1.8-15.04).
Since the results were statistically significant, it can be argued that the mother's history of UTI/STI remains an independent risk factor for neonatal sepsis.
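A quick plausibility check of the reported estimates is possible from the published numbers alone. The sketch below assumes the intervals are Wald-type, that is, symmetric around ln(OR) on the log scale; the source does not state how they were computed, so this is an assumption, and the function name is illustrative. Under that assumption the geometric mean of the two limits should be close to the point estimate, and the interval width gives the standard error, a z statistic and an approximate two-sided p-value.

```python
from math import erfc, log, sqrt
from statistics import NormalDist


def wald_check(or_, lower, upper, level=0.95):
    """Back-calculate SE, z and p from an OR and its CI.

    Assumes a Wald-type interval on the log scale (not confirmed by the source).
    """
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    se_log = (log(upper) - log(lower)) / (2 * z_crit)
    z = log(or_) / se_log
    return {
        "centre": sqrt(lower * upper),     # should be close to the OR
        "se_log": se_log,
        "z": z,
        "p": erfc(abs(z) / sqrt(2)),       # two-sided normal p-value
        "width_ratio": upper / lower,      # larger ratio = less precise
    }


crude = wald_check(6.8, 3.6, 12.8)         # unadjusted OR from the text
adjusted = wald_check(5.23, 1.8, 15.04)    # adjusted OR from the text
for name, res in (("crude", crude), ("adjusted", adjusted)):
    print(f"{name}: centre = {res['centre']:.2f}, z = {res['z']:.2f}, "
          f"width ratio = {res['width_ratio']:.1f}")
```

The wider ratio for the adjusted interval reflects the loss of precision that usually accompanies adjustment for several covariates.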
2. Interpreting the results from a clinical point of view:
The size of the OR indicator in clinical context:
• Very important / moderate / not important
OR = 6.8 shows a moderate magnitude in the context of similar studies performed on the same population, which reported an OR = 12.9 (see the Discussion section of the article)
Accuracy of the result (see the confidence interval):
• Relatively accurate / imprecise (broad interval - imprecise results, narrow interval - accurate results)
The 95% confidence interval 3.6-12.8 can be considered relatively accurate
• Relatively important clinical connection (both ends with significant clinical value)
The link between the maternal factor (UTI/STI history during pregnancy) and neonatal sepsis can be considered clinically relevant and relatively important (when estimating the sample size, the authors considered an effect size of OR = 2.87 to be of interest).

Conclusions of the study

The mother's history of urinary tract infection or sexually transmitted infection is a maternal factor that contributes to the risk of neonatal sepsis and is an independent risk factor.

On the Infomed server, create a new LAB02 folder in your own folder. Download the following file to this folder and answer all the required practical activities inside it. Save the file.

LAB 02 - Practical Activity

Data file - EXCEL

Results obtained with EpiInfo - for those who cannot install it
