Difference between revisions of "Getting Started with Machine Learning"
Revision as of 03:19, 20 May 2024
This is User:Ben's guide to getting started with machine learning.
Dependencies
Here are some useful dependencies that I use:
- uv
  - This is similar to pip, but it is written in Rust and is much faster
  - It has nice management of virtual environments
  - You can use Conda instead, but it is much slower
- GitHub Copilot
- mlfab
  - This is a Python package I made to make it easy to quickly try out machine learning ideas in PyTorch
- Coding tools
  - black: code formatter
  - ruff: alternative to flake8
uv
To get started with uv, pick a directory you want your virtual environment to live in. ($HOME is not recommended.) Once you have changed into that directory with cd, run
uv venv
To activate your virtual environment, run the following while in the directory where you created your .venv:
source .venv/bin/activate
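If you are not sure whether the activation worked, a quick check from Python can help. The following is a minimal sketch, assuming the environment is a standard Python virtual environment (which is what `uv venv` is expected to create); the helper names `in_virtualenv` and `describe_interpreter` are made up for this example.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the interpreter it was created from.
    return sys.prefix != sys.base_prefix


def describe_interpreter() -> str:
    """Summarize which interpreter is running and whether a venv is active."""
    if in_virtualenv():
        status = "inside a virtual environment"
    else:
        status = "not inside a virtual environment"
    return f"{sys.executable} ({status}, prefix: {sys.prefix})"


if __name__ == "__main__":
    print(describe_interpreter())
```

Run it with `python` after activating. If it reports that you are not inside a virtual environment, the activation probably did not take effect in the current shell.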
Installing Starter Project
- Go to this project and install it
Opening the project in VSCode
- Create a VSCode config file that looks something like this:
{
  "folders": [
    {
      "name": "Getting Started",
      "path": "/home/ubuntu/Github/getting_started"
    },
    {
      "name": "Workspaces",
      "path": "/home/ubuntu/.code-workspaces"
    }
  ],
  "settings": {
    "cmake.configureSettings": {
      "CMAKE_CUDA_COMPILER": "/usr/bin/nvcc",
      "CMAKE_PREFIX_PATH": [
        "/home/ubuntu/.virtualenvs/getting-started/lib/python3.11/site-packages/torch/share/cmake"
      ],
      "PYTHON_EXECUTABLE": "/home/ubuntu/.virtualenvs/getting-started/bin/python",
      "TORCH_CUDA_ARCH_LIST": "'8.0'"
    },
    "python.defaultInterpreterPath": "/home/ubuntu/.virtualenvs/getting-started/bin/python",
    "ruff.path": [
      "/home/ubuntu/.virtualenvs/getting-started/bin/ruff"
    ]
  }
}
- Install the VSCode SSH extension
- SSH into the cluster (see K-Scale Cluster for instructions)
- Open the workspace that you created in VSCode
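The paths in the config above are specific to one machine, so it can be handy to generate the file instead of editing it by hand. Here is a rough sketch of that idea, not an official VSCode tool: `build_workspace` is a hypothetical helper, it only fills in the Python and ruff settings (the CMake settings are left out), and the example paths are copied from the config above.

```python
import json
from pathlib import PurePosixPath


def build_workspace(name: str, project_dir: str, venv_dir: str) -> dict:
    """Build a minimal VSCode workspace config for one project and one venv."""
    # PurePosixPath keeps forward slashes, since the cluster paths are Linux paths.
    venv_bin = PurePosixPath(venv_dir) / "bin"
    return {
        "folders": [
            {"name": name, "path": project_dir},
        ],
        "settings": {
            "python.defaultInterpreterPath": str(venv_bin / "python"),
            "ruff.path": [str(venv_bin / "ruff")],
        },
    }


if __name__ == "__main__":
    # Example values taken from the config above; substitute your own paths.
    config = build_workspace(
        "Getting Started",
        "/home/ubuntu/Github/getting_started",
        "/home/ubuntu/.virtualenvs/getting-started",
    )
    print(json.dumps(config, indent=2))
```

Save the printed JSON to a file ending in `.code-workspace` and open that file in VSCode.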
Useful Brain Dump Stuff
- Use breakpoint() to debug code
- Check out the mlfab examples directory for some ideas
- It is a good idea to try to write the full training loop yourself to figure out what's going on
- Run nvidia-smi to see the GPUs and their statuses/any active processes
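To make the "write the full training loop yourself" advice concrete, here is a minimal sketch of what such a loop contains. It deliberately uses plain Python rather than PyTorch or mlfab, so it runs without any dependencies; the function `train_linear` and the toy data are invented for this example. The same forward pass, loss, gradient, update structure carries over to a real PyTorch loop.

```python
def train_linear(xs, ys, lr=0.05, epochs=1000):
    """Fit y = w * x + b by full-batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    losses = []
    for _ in range(epochs):
        # Forward pass: predictions with the current parameters.
        preds = [w * x + b for x in xs]
        # Loss: mean squared error over the whole dataset.
        losses.append(sum((p - y) ** 2 for p, y in zip(preds, ys)) / n)
        # Backward pass: gradients of the loss with respect to w and b.
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
        # Optimizer step: plain gradient descent.
        # A breakpoint() call placed here is a good way to inspect the values.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, losses


if __name__ == "__main__":
    # Toy data generated from y = 2x + 1, so w should approach 2 and b should approach 1.
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [2 * x + 1 for x in xs]
    w, b, losses = train_linear(xs, ys)
    print(f"w={w:.3f} b={b:.3f} first loss={losses[0]:.3f} last loss={losses[-1]:.6f}")
```

Watching the loss shrink while stepping through one iteration with the debugger is a quick way to see what each stage of the loop does.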