-
Presentation
This Curricular Unit introduces Data Science concepts applied to biotechnology and related areas. Students are introduced to the mathematical formulation of tools from Statistics and Probability, which support the implementation, development and interpretation of scripts in Python. Using supervised and unsupervised learning, students learn to build solutions that cover the processing, analysis and visualization of relevant data sets, whether large (big data) or not. The unit also covers the study and modeling of applications in the biotechnological domain, ranging from disease prediction in living beings to effects in genetically modified crops, in order to support decision-making in academic research or in biobusiness.
-
Class from course
-
Degree | Semesters | ECTS
Bachelor | One semester | 5
-
Year | Nature | Language
3 | Optional | Portuguese
-
Code
ULHT6643-22371
-
Prerequisites and corequisites
Not applicable
-
Professional Internship
No
-
Syllabus
Introduction to Data Science in Biotechnology; Mathematical Preliminaries for Data Science; Data Munging; Scores and Rankings; Statistical Analysis; Data Visualization; Mathematical Models in Data Science; Linear Algebra; Linear and Logistic Regression; Machine Learning; Big Data.
-
Objectives
The main objective of this Curricular Unit is to give students the main techniques and methodological principles of data acquisition, processing and interpretation in biotechnology, using Python for computation. Students will be able to: understand, implement or develop simple quantitative, classification and predictive mathematical models based on existing data sets; understand, implement or develop methodologies based on machine learning, via supervised and unsupervised mechanisms; and apply the Python programming language within Data Science to solve problems and draw inferences in the area of biotechnology.
-
Teaching methodologies and assessment
In theoretical classes, the program contents are presented through slides and simulations, encouraging discussion between students and teacher. In theoretical-practical classes, students solve exercises of progressively increasing complexity. Assessment can be continuous or non-continuous. Continuous assessment comprises a written component (theoretical component, TC) and the delivery of two exercises solved during the semester (theoretical-practical component, TPC). TC consists of two midterm tests or one exam. TPC consists of two solved exercises submitted via Moodle. The final grade for the Curricular Unit is calculated as: Final Grade = 50% TC + 50% TPC, where TC and TPC are the averages of the assessments within each component. Alternatively, at the beginning of the semester, the student can choose the non-continuous assessment mode. In that case, the student sits a single exam and must score at least 9.5 to pass.
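As an illustration only, the grade calculation described above can be sketched in Python, the language used in this unit. The function names and the example grades are hypothetical and not part of the official regulations; the theoretical-practical component is written TPC here, and the 9.5 threshold is applied only to the non-continuous exam, as stated in the text.

```python
from statistics import mean

# Minimum exam grade stated for the non-continuous assessment mode.
PASS_MARK = 9.5


def continuous_final_grade(tc_grades, tpc_grades):
    """Final Grade = 50% TC + 50% TPC.

    TC and TPC are taken as the plain averages of the grades
    obtained within each component (hypothetical helper).
    """
    tc = mean(tc_grades)
    tpc = mean(tpc_grades)
    return 0.5 * tc + 0.5 * tpc


def passes_non_continuous(exam_grade):
    """Return True when the single exam reaches the stated minimum."""
    return exam_grade >= PASS_MARK


if __name__ == "__main__":
    # Hypothetical grades: two midterm tests and two delivered exercises.
    print(continuous_final_grade([12.0, 14.0], [16.0, 18.0]))
    print(passes_non_continuous(9.5))
```

With the hypothetical grades above, TC averages 13 and TPC averages 17, so each component contributes half of its average to the final grade.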
-
References
Trevor Hastie, Robert Tibshirani, Jerome Friedman (2009) The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd edition. Springer (ISBN-13: 978-0387848570)
Laura Igual, Santi Segui (2017) Introduction to Data Science: A Python Approach to Concepts, Techniques, and Applications. 1st edition. Springer (ISBN-13: 978-3319500164)
-
Office Hours
-
Mobility
No