COMP3314 Tutorial 1
Basics for Python Programming & Assignment 1
Agenda
● Introduction
● Basics of Python
○ Setting Up Python Environment
○ Installing Miniconda
○ Managing Python Virtual Environments
○ Installing Libraries
● Development Environment
○ Introduction to Jupyter Notebook
○ Leveraging Google Colab
○ Using Visual Studio Code for Python Development
● Overview of Assignment 1
● Summary and Q&A
Basics of Python
● What is Python?
○ High-Level, Interpreted Language: Known for simplicity and readability.
○ Created by Guido van Rossum in the late 1980s.
○ Purpose: Designed for ease of use, quick application development.
● Key Features of Python
○ Intuitive Syntax: Ideal for beginners.
○ Versatile Use: From web development to automation.
○ Open Source: With a large, supportive community.
○ Rich Libraries: For data analysis, ML, scientific computing.
○ Interpreted Nature: Facilitates quick prototyping.
● Python in Machine Learning & Data Science
○ Dominant Language: Due to simplicity and powerful libraries.
○ Strong Community Support: Resources and forums for learning.
○ Efficient for Prototyping: Quick experimentation with ML models.
One of the most popular programming languages
● Python is the second most popular programming language on GitHub
Installing Miniconda
● What is a Python virtual environment?
○ A virtual environment is a "container" for a set of installed Python libraries and executables
○ Best practice: use a separate environment for each project
● What is Miniconda?
○ A popular tool for managing Python virtual environments
○ Miniconda is the "mini" version of Anaconda: a minimal installer that ships conda and Python; recommended for general use
● Installing Miniconda
○ Find the proper version for your OS and follow the steps
■ https://docs.conda.io/projects/miniconda/en/latest/
○ Optional: prevent conda from activating base automatically
■ https://stackoverflow.com/a/54560785/1255535
● conda config --set auto_activate_base false
● Live demo for installation on macOS/Linux
○ Please refer to: https://asciinema.org/a/YhEyleUmEHeKfPRKIX4nxlKuK
● Windows installation
○ Please refer to: https://www.youtube.com/watch?v=oHHbsMfyNR4
Managing Python Virtual Environments with Conda
● Creating a new virtual environment
○ # Create an environment called "demo"
○ conda create -n demo python=3.8
● Activating and deactivating environments
○ # Check existing environment
○ conda env list
○ # Activate "demo" environment
○ conda activate demo
○ # Check python version
○ python --version
○ # Deactivate environment
○ conda deactivate
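After activating an environment, it can help to confirm from inside Python which interpreter is actually running. The following is a minimal sketch using only the standard library; the helper name `describe_interpreter` is hypothetical and not part of conda.

```python
import sys

def describe_interpreter():
    """Return the interpreter path, environment prefix, and version string."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    return {
        "executable": sys.executable,  # path of the running python binary
        "prefix": sys.prefix,          # root directory of the active environment
        "version": version,
    }

info = describe_interpreter()
for key, value in info.items():
    print(f"{key}: {value}")
```

If the "demo" environment is active, the prefix would typically end in a directory named after the environment; the exact path depends on where Miniconda was installed.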
Installing Python Libraries
● Introduction to pip and conda
● Common libraries for machine learning
○ NumPy, scikit-learn, PyTorch, TensorFlow, Jupyter
● Installing libraries using pip commands
○ # Activate your virtual environment first!
○ conda activate demo
○ # Install Python libraries
○ pip install numpy
○ pip install scikit-learn
○ pip install jupyter
○ ...
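To check whether the pip installs above succeeded, one option is to query the installed package metadata from Python. This is a sketch that assumes Python 3.8 or newer (where `importlib.metadata` is in the standard library); the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package_name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None

# Package names as they are passed to "pip install"
for name in ["numpy", "scikit-learn", "jupyter"]:
    found = installed_version(name)
    print(f"{name}: {found if found else 'not installed'}")
```

Run this with the target environment activated; a package reported as "not installed" was probably installed into a different environment.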
NumPy
● What is NumPy?
○ NumPy: A fundamental package for numerical computation in Python.
○ Core Feature: Multidimensional array object (ndarray).
○ Purpose: Optimized for numerical operations, linear algebra, and random number generation.
● Key Features of NumPy
○ Efficient Array Computing: Fast, memory-efficient array processing.
○ Mathematical Functions: Comprehensive mathematical functions.
○ Interoperability: Works well with other libraries.
import numpy as np
# Creating a NumPy array
arr = np.array([1, 2, 3, 4, 5])
# Performing element-wise operations
squared = arr ** 2
# Computing basic statistics
mean_value = np.mean(arr)
print(f"Original Array: {arr}")
print(f"Squared Array: {squared}")
print(f"Mean Value: {mean_value}")
Original Array: [1 2 3 4 5]
Squared Array: [ 1 4 9 16 25]
Mean Value: 3.0
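The linear algebra and random number features mentioned above can be sketched in a few lines. This is an illustrative example, not part of the original slides; the matrix `A`, the vector `b`, and the seed value are made up for the demonstration.

```python
import numpy as np

# A 2-D array (matrix) and a 1-D array (vector)
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])

# Broadcasting: the scalar 10 is applied to every element of A
scaled = A * 10

# Linear algebra: solve the system A @ x = b for x
x = np.linalg.solve(A, b)

# Random numbers: a seeded generator gives reproducible samples
rng = np.random.default_rng(seed=0)
samples = rng.normal(size=3)

print(f"Scaled matrix:\n{scaled}")
print(f"Solution x: {x}")
print(f"A @ x matches b: {np.allclose(A @ x, b)}")
print(f"Number of samples: {samples.shape[0]}")
```

Using `np.linalg.solve` is generally preferable to computing an explicit inverse, since it is faster and numerically more stable.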
scikit-learn
● What is Scikit-Learn?
○ Scikit-Learn: A Python library for machine learning.
○ Purpose: Offers simple and efficient tools for data mining and data analysis.
● Key Features of Scikit-Learn
○ Wide Range of Algorithms: Classification, regression, clustering, etc.
○ Data Preprocessing Tools: Feature scaling, normalization, etc.
○ Model Evaluation: Cross-validation, metrics for performance evaluation.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load dataset
iris = load_iris()
X, y = iris.data, iris.target
# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
# Train a model
classifier = DecisionTreeClassifier()
classifier.fit(X_train, y_train)
# Predict and evaluate
predictions = classifier.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy:.2f}")
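The preprocessing and cross-validation features listed above can be combined in a single pipeline. The sketch below swaps in `LogisticRegression` instead of the decision tree, since feature scaling has little effect on tree models; that choice, along with `cv=5` and `max_iter=200`, is an assumption made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load features and labels directly as arrays
X, y = load_iris(return_X_y=True)

# The scaler is refitted on the training part of each fold,
# so no information from the held-out fold leaks into preprocessing
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))

# 5-fold cross-validation; returns one accuracy score per fold
scores = cross_val_score(model, X, y, cv=5)

print(f"Fold accuracies: {scores}")
print(f"Mean accuracy: {scores.mean():.3f}")
```

Compared with a single train/test split, cross-validation gives a less noisy estimate of performance because every sample is used for testing exactly once.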
PyTorch
● What is PyTorch?
○ PyTorch: An open-source machine learning library developed by Facebook's AI Research lab.
○ Purpose: Preferred for deep learning and artificial intelligence projects.
○ Features: Dynamic computational graph and tensor computation with strong GPU acceleration.
● Key Features of PyTorch
○ Dynamic Computation Graphs: Flexibility and ease in defining and modifying neural networks.
○ Tensor Library: Similar to NumPy, but with GPU support.
○ Autograd Module: Automatic differentiation for gradient calculations.
import torch
import torch.nn as nn
import torch.optim as optim
# Simple neural network
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc = nn.Linear(1, 1)
    def forward(self, x):
        return self.fc(x)
# Create a model, criterion and optimizer
model = Net()
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)
# Dummy data
inputs = torch.tensor([[1.0], [2.0], [3.0]])
targets = torch.tensor([[2.0], [4.0], [6.0]])
# Forward pass, backward pass, optimize
optimizer.zero_grad()
outputs = model(inputs)
loss = criterion(outputs, targets)
loss.backward()
optimizer.step()
print(f"Loss: {loss.item()}")
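The autograd feature listed above can be seen in isolation, without a network or optimizer. This is a minimal sketch; the function y = x^2 + 2x and the input value 3.0 are chosen only for illustration.

```python
import torch

# requires_grad=True tells autograd to track operations on x
x = torch.tensor(3.0, requires_grad=True)

# Build a small computation: y = x^2 + 2x
y = x ** 2 + 2 * x

# Backpropagate: computes dy/dx and stores it in x.grad
y.backward()

# Analytically, dy/dx = 2x + 2, which is 8 at x = 3
print(f"y = {y.item()}")
print(f"dy/dx at x = 3: {x.grad.item()}")
```

This is the same mechanism that `loss.backward()` relies on in the training step: gradients are accumulated in the `.grad` attribute of each parameter, which is why `optimizer.zero_grad()` is called before each backward pass.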