Categorical Data Analysis
Study Course Description
Course Description Status: Approved
Course Description Version: 5.00
Study Course Accepted: 14.03.2024 11:48:07
Study Course Information | |||||||||
Course Code: | SL_117 | LQF level: | Level 7 | ||||||
Credit Points: | 2.00 | ECTS: | 3.00 | ||||||
Branch of Science: | Mathematics; Theory of Probability and Mathematical Statistics | Target Audience: | Life Science | ||||||
Study Course Supervisor | |||||||||
Course Supervisor: | Maksims Zolovs | ||||||||
Study Course Implementer | |||||||||
Structural Unit: | Statistics Unit | ||||||||
The Head of Structural Unit: | |||||||||
Contacts: | 14 Baložu street, Block A, Riga, statistikarsu[dot]lv, +371 67060897 | ||||||||
Study Course Planning | |||||||||
Full-Time - Semester No.1 | |||||||||
Lectures (count) | 7 | Lecture Length (academic hours) | 2 | Total Contact Hours of Lectures | 14 | ||||
Classes (count) | 5 | Class Length (academic hours) | 2 | Total Contact Hours of Classes | 10 | ||||
Total Contact Hours | 24 | ||||||||
Part-Time - Semester No.1 | |||||||||
Lectures (count) | 7 | Lecture Length (academic hours) | 1 | Total Contact Hours of Lectures | 7 | ||||
Classes (count) | 5 | Class Length (academic hours) | 2 | Total Contact Hours of Classes | 10 | ||||
Total Contact Hours | 17 | ||||||||
Study course description | |||||||||
Preliminary Knowledge: | • Familiarity with probability theory and mathematical statistics. • Basic knowledge of the Jamovi software. • Basic knowledge of data analysis. | ||||||||
Objective: | Since most statistical data are categorical in nature, the aim of the course is to point out the specific features of this type of data and to teach the corresponding methods of statistical analysis. The course focuses mainly on methods and applications; the mathematical background and justification of the methodology are provided to some extent. The statistical software Jamovi is used in computer labs, where students analyse the real datasets covered in lectures so that they connect theory with practice and become confident in applying the methodology to practical data analysis. | ||||||||
Topic Layout (Full-Time) | |||||||||
No. | Topic | Type of Implementation | Number | Venue | |||||
1 | The nature of categorical data. Classification by purpose and scale. Types of studies. Probability distributions. Overdispersion. | Lectures | 1.00 | auditorium | |||||
2 | Joint distribution of categorical variables. Conditional and marginal distributions. Maximum likelihood estimates of probabilities. | Lectures | 1.00 | auditorium | |||||
3 | Independence. Measures of dependence: relative risk, odds, odds ratio. 2 x 2 frequency tables. Estimates from a frequency table. Conditional probabilities – sensitivity, specificity. True negative, false positive. | Lectures | 1.00 | auditorium | |||||
4 | Introduction to "Jamovi". Visualising categorical data. Comparing distributions. Frequency tables. Conditional frequencies. Estimating measures of dependence. | Classes | 1.00 | computer room | |||||
5 | Tables larger than 2 x 2. Measures of dependence for ordinal and nominal data. Hypotheses about the population distribution. Chi-square test. | Lectures | 1.00 | auditorium | |||||
6 | Measures of dependence for ordinal and nominal data. Hypotheses about independence and conditional independence. | Classes | 1.00 | computer room | |||||
7 | Large-sample case: hypotheses about independence, chi-square and likelihood ratio tests. Small-sample case: Fisher’s exact test. | Lectures | 1.00 | auditorium | |||||
8 | Hypotheses about independence and conditional independence. | Classes | 1.00 | computer room | |||||
9 | Asymptotic distribution of multinomial frequencies. Confidence intervals for the odds ratio and relative risk. Testing homogeneity of marginal distributions for paired observations. | Lectures | 1.00 | auditorium | |||||
10 | Interval estimation for measures of dependence. McNemar test. | Classes | 1.00 | computer room | |||||
11 | Models for a binary outcome variable – logit and log-linear models. Models for retrospective studies. Decision trees for classification. | Lectures | 1.00 | auditorium | |||||
12 | Modelling categorical data. Classified data and raw data. Modelling and decision trees. Classification error. | Classes | 1.00 | computer room | |||||
Topic Layout (Part-Time) | |||||||||
No. | Topic | Type of Implementation | Number | Venue | |||||
1 | The nature of categorical data. Classification by purpose and scale. Types of studies. Probability distributions. Overdispersion. | Lectures | 1.00 | auditorium | |||||
2 | Joint distribution of categorical variables. Conditional and marginal distributions. Maximum likelihood estimates of probabilities. | Lectures | 1.00 | auditorium | |||||
3 | Independence. Measures of dependence: relative risk, odds, odds ratio. 2 x 2 frequency tables. Estimates from a frequency table. Conditional probabilities – sensitivity, specificity. True negative, false positive. | Lectures | 1.00 | auditorium | |||||
4 | Introduction to "Jamovi". Visualising categorical data. Comparing distributions. Frequency tables. Conditional frequencies. Estimating measures of dependence. | Classes | 1.00 | computer room | |||||
5 | Tables larger than 2 x 2. Measures of dependence for ordinal and nominal data. Hypotheses about the population distribution. Chi-square test. | Lectures | 1.00 | auditorium | |||||
6 | Measures of dependence for ordinal and nominal data. Hypotheses about independence and conditional independence. | Classes | 1.00 | computer room | |||||
7 | Large-sample case: hypotheses about independence, chi-square and likelihood ratio tests. Small-sample case: Fisher’s exact test. | Lectures | 1.00 | auditorium | |||||
8 | Hypotheses about independence and conditional independence. | Classes | 1.00 | computer room | |||||
9 | Asymptotic distribution of multinomial frequencies. Confidence intervals for the odds ratio and relative risk. Testing homogeneity of marginal distributions for paired observations. | Lectures | 1.00 | auditorium | |||||
10 | Interval estimation for measures of dependence. McNemar test. | Classes | 1.00 | computer room | |||||
11 | Models for a binary outcome variable – logit and log-linear models. Models for retrospective studies. Decision trees for classification. | Lectures | 1.00 | auditorium | |||||
12 | Modelling categorical data. Classified data and raw data. Modelling and decision trees. Classification error. | Classes | 1.00 | computer room | |||||
Assessment | |||||||||
Unaided Work: | 1. Individual work with the course material in preparation for lectures according to the plan. 2. Independently prepared homework assignments that practise the concepts studied in the course. To allow the quality of the study course as a whole to be evaluated, students should fill out the study course evaluation questionnaire on the Student Portal. | ||||||||
Assessment Criteria: | Assessment on the 10-point scale according to the RSU Educational Order: • 2 independent homework assignments – 50%. • Attendance and active participation during practical classes – 25%. • Final written exam – 25%. | ||||||||
Final Examination (Full-Time): | Exam (Written) | ||||||||
Final Examination (Part-Time): | Exam (Written) | ||||||||
Learning Outcomes | |||||||||
Knowledge: | • On successful completion of the course, students will be familiar with a range of statistical analysis methods available for categorical data. They will know and be able to interpret both large-sample and small-sample tests. • Students will recognise the nature of categorical data and know how to measure dependence between categorical variables based on the type of study and the type of variables (nominal or ordinal). • Students will be able to demonstrate how to model a binary outcome variable using continuous or categorical variables. | ||||||||
Skills: | • Students will understand and be able to explain the effect of different data collection methods on the random nature of the frequency table, and interpret the distributional models for the frequency table and for its rows and columns. • Can explain, interpret and estimate the dependence measures defined on the joint distribution of two categorical variables (relative risk, odds ratio, etc.). • Can test the goodness of fit of data to an assumed distributional model and test the independence of categorical variables. • Can model a categorical variable (in the special case, a binary one) using other variables. • Can independently apply their knowledge to real data. | ||||||||
Competencies: | • On successful completion of the course, students will be competent to read and critically assess scientific publications that use categorical data in their analysis. • Students will be competent to plan and execute data analysis with categorical data. | ||||||||
Bibliography | |||||||||
No. | Reference | ||||||||
Required Reading | |||||||||
1 | Agresti, Alan. Categorical Data Analysis. Wiley, 2012 (or 1990, 2002 editions). | ||||||||
Additional Reading | |||||||||
1 | Agresti, Alan. An Introduction to Categorical Data Analysis. Wiley, 2019 (or 1996, 2007 editions). |