-
Presentation
This Course Unit introduces Data Science concepts oriented to biotechnology and related areas. Students are introduced to the mathematical formulation of tools from Statistics and Probability, which supports the implementation, development and interpretation of scripts in Python. Using supervised and unsupervised learning methods, students learn to build solutions that cover the processing, analysis and visualization of relevant data sets, whether or not these are large (big data). The Course Unit also covers the study and modeling of applications in the biotechnological domain, ranging from the prediction of diseases in living beings to effects in genetically modified crops, in order to support decision-making in academic research or in biobusiness.
-
Class from course
-
Degree | Semesters | ECTS
Bachelor | Semestral | 5
-
Year | Nature | Language
3 | Optional | Portuguese
-
Code
ULHT6643-22371
-
Prerequisites and corequisites
Not applicable
-
Professional Internship
No
-
Syllabus
Introduction to Data Science in Biotechnology; Mathematical Preliminaries for Data Science; Data Munging. Scores and Rankings; Statistical Analysis; Data Visualization; Mathematical Models in Data Science; Linear Algebra; Linear and Logistic Regression; Machine Learning; Big Data
-
Objectives
The main objective of this Course Unit is to give students the main techniques and methodological principles of data acquisition, processing and interpretation in biotechnology, using computation in Python. Students will be able to: understand, implement or develop simple quantitative, classification and predictive mathematical models based on existing data sets; understand, implement or develop methodologies based on machine learning, through supervised and unsupervised mechanisms; and apply the Python programming language within Data Science to solve problems and draw inferences in the area of biotechnology.
-
Teaching methodologies and assessment
The theoretical classes present the program content through presentations and simulations, encouraging discussion between students and teachers. In the theoretical-practical classes, students solve exercises of progressively increasing complexity.
Assessment may be continuous or non-continuous. Continuous assessment has a theoretical component (CT) and a theoretical-practical component (CTP).
CT: completion of two written tests or one exam.
CTP: submission of two solved exercises via Moodle during the semester, followed by their discussion (40% exercises, 60% discussion), with no minimum grade.
The final grade for the Course Unit is calculated as follows: Final Grade = 50% CT + 50% CTP
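The weighting stated above can be checked with a short Python sketch. The function names and the example marks are hypothetical and chosen only for illustration. The page does not say how the two CT tests are combined, so CT is taken here as a single already-computed mark.

```python
def ctp_grade(exercises: float, discussion: float) -> float:
    """Theoretical-practical component: 40% exercises, 60% discussion."""
    return 0.4 * exercises + 0.6 * discussion


def final_grade(ct: float, ctp: float) -> float:
    """Final Grade = 50% CT + 50% CTP."""
    return 0.5 * ct + 0.5 * ctp


# Hypothetical marks, not taken from the course page.
ctp = ctp_grade(exercises=16.0, discussion=14.0)
final = final_grade(ct=12.0, ctp=ctp)
print(f"CTP = {ctp:.1f}, final grade = {final:.1f}")
```

Because the discussion carries 60% of CTP and CTP carries half of the final grade, the discussion alone accounts for 30% of the final grade and the submitted exercises for 20%.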
-
References
Trevor Hastie, Robert Tibshirani, Jerome Friedman (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd edition. Springer (ISBN-13: 978-0387848570)
Laura Igual, Santi Segui (2017). Introduction to Data Science: A Python Approach to Concepts, Techniques, and Applications. 1st edition. Springer (ISBN-13: 978-3319500164)
-
Office Hours
-
Mobility
No