-
Presentation
The Introduction to Data Science course aims to give students essential data analysis skills from a multidisciplinary perspective, in keeping with the nature of Data Science itself. By presenting fundamental methodologies and techniques for treating, transforming, constructing, and analyzing data, this curricular unit aims to enable students to translate that analysis into knowledge and value that support decision making in a sustainable way. The practical component is a fundamental aspect of the course, so the ability to translate knowledge into practical actions and analysis decisions is particularly valued. The course also illustrates the close connection with the business world by showing how data analysis answers business questions.
-
Class from course
-
Degree | Semesters | ECTS
Master's Degree | Semestral | 7
-
Year | Nature | Language
1 | Mandatory | Portuguese
-
Code
ULHT6347-23265
-
Prerequisites and corequisites
Not applicable
-
Professional Internship
No
-
Syllabus
1. Introduction to Data Science: importance and applications of Data Science; project workflow, with practical examples; data types (structured, semi-structured, and unstructured); challenges in Data Science.
2. Python for Data Science: setup (Jupyter notebook); NumPy; Pandas; Matplotlib.
3. Data pre-processing: cleaning and preparing structured data (data wrangling: slicing, groupby, pivoting, missing values, imputation, duplicates, outliers, etc.); unstructured data processing for text (lemmatization, stemming, etc.).
4. Introduction to Machine Learning, supervised and unsupervised models: basic concepts; Linear Regression; Logistic Regression; KNN; Dimensionality Reduction (PCA); Clustering; model performance evaluation metrics.
-
Objectives
The course aims to give the student the skills to:
LG1. Understand the importance of Data Science in the real world.
LG2. Understand the nature of data.
LG3. Understand, through practice, the main Python programming techniques and methods used by data scientists.
LG4. Perform basic data preparation and pre-processing tasks.
LG5. Carry out exploratory data analysis implemented in Python.
LG6. Understand a data scientist's workflow and be able to think about solving problems with data.
LG7. Understand and implement supervised and unsupervised machine learning methods.
LG8. Know the performance metrics of a model.
-
Teaching methodologies and assessment
Theoretical concepts are introduced in class and then complemented with real-world examples. For each topic, students are given a set of exercises in which they apply the theoretical concepts. Exercises are discussed and solved in class, and students are invited to raise any questions they might have. Support materials and exercises with suggested solutions will be available on Moodle. Continuous assessment, adapted to the students' progress, is considered good practice. Individual monitoring and availability to answer questions whenever necessary are essential for students and their performance.
-
References
Grus, J. (2019). Data Science from Scratch: First Principles with Python. O'Reilly Media.
-
Office Hours
-
Mobility
No