Getting Started with Machine Learning

From Humanoid Robots Wiki

Latest revision as of 03:33, 20 May 2024

This is User:Ben's guide to getting started with machine learning.

Dependencies

Here are some useful dependencies that I use:

  • uv
    • Similar to pip, but written in Rust and much faster
    • Manages virtual environments nicely
    • You can use Conda instead, but it is much slower
  • GitHub Copilot
  • mlfab
    • A Python package I made to make it easy to quickly try out machine learning ideas in PyTorch
  • Coding tools
    • mypy: static analysis
    • black: code formatter
    • ruff: linter, an alternative to flake8

uv

To install uv on the K-Scale clusters, run

curl -LsSf https://astral.sh/uv/install.sh | sh

To get started with uv, pick a directory you want your virtual environment to live in. ($HOME is not recommended.) Once you have changed into that directory with cd, run

uv venv

If you are on the clusters, you may instead want to run

uv venv --python 3.11

to ensure that the virtual environment uses Python 3.11. This is because, by default, uv uses the system's version of Python (whatever the command which python reports), and the clusters are running Python 3.10.12. Python 3.11 is needed because various projects, including the starter project, require it.

To activate your virtual environment, run

source .venv/bin/activate

while in the directory you created your .venv in.
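
To confirm that activation worked and that the environment really runs Python 3.11, a quick check from inside Python can help. The following is a minimal sketch using only the standard library; the function name describe_interpreter is made up for illustration and is not part of uv or the starter project.

```python
import sys


def describe_interpreter() -> dict:
    """Summarize which Python is running and whether it lives in a virtual environment."""
    return {
        "executable": sys.executable,
        "version": tuple(sys.version_info[:3]),
        # Inside a virtual environment, sys.prefix points at the environment,
        # while sys.base_prefix still points at the Python it was created from.
        "in_venv": sys.prefix != sys.base_prefix,
        "at_least_3_11": sys.version_info >= (3, 11),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If in_venv is False or at_least_3_11 is False after activation, the environment was probably created without the --python 3.11 flag or was not activated in this shell.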

Installing Starter Project

Opening the project in VSCode

  • Create a VSCode workspace config file (a .code-workspace file) that looks something like this:
{
  "folders": [
    {
      "name": "Getting Started",
      "path": "/home/ubuntu/Github/getting_started"
    },
    {
      "name": "Workspaces",
      "path": "/home/ubuntu/.code-workspaces"
    }
  ],
  "settings": {
    "cmake.configureSettings": {
      "CMAKE_CUDA_COMPILER": "/usr/bin/nvcc",
      "CMAKE_PREFIX_PATH": [
        "/home/ubuntu/.virtualenvs/getting-started/lib/python3.11/site-packages/torch/share/cmake"
      ],
      "PYTHON_EXECUTABLE": "/home/ubuntu/.virtualenvs/getting-started/bin/python",
      "TORCH_CUDA_ARCH_LIST": "'8.0'"
    },
    "python.defaultInterpreterPath": "/home/ubuntu/.virtualenvs/getting-started/bin/python",
    "ruff.path": [
      "/home/ubuntu/.virtualenvs/getting-started/bin/ruff"
    ]
  }
}
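
The config above hard-codes the project and virtual environment paths, so it needs editing on any other machine. As a rough sketch, the path-dependent parts can be generated from two inputs. The helper name build_workspace is hypothetical, and this version leaves out the CUDA-specific settings and the second "Workspaces" folder from the example above; add them back by hand if you need them.

```python
import json
from pathlib import PurePosixPath


def build_workspace(project_dir: str, venv_dir: str, python_version: str = "3.11") -> dict:
    """Build the path-dependent part of a VSCode workspace config (illustrative helper)."""
    venv = PurePosixPath(venv_dir)
    python_bin = str(venv / "bin" / "python")
    # Location of the CMake files that ship with a pip-installed PyTorch.
    torch_cmake = venv / "lib" / f"python{python_version}" / "site-packages" / "torch" / "share" / "cmake"
    return {
        "folders": [{"name": "Getting Started", "path": project_dir}],
        "settings": {
            "cmake.configureSettings": {
                "CMAKE_PREFIX_PATH": [str(torch_cmake)],
                "PYTHON_EXECUTABLE": python_bin,
            },
            "python.defaultInterpreterPath": python_bin,
            "ruff.path": [str(venv / "bin" / "ruff")],
        },
    }


if __name__ == "__main__":
    workspace = build_workspace(
        "/home/ubuntu/Github/getting_started",
        "/home/ubuntu/.virtualenvs/getting-started",
    )
    print(json.dumps(workspace, indent=2))
```

Redirect the printed JSON into a file ending in .code-workspace and open that file in VSCode.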

Useful Brain Dump Stuff

  • Use breakpoint() to debug code
  • Check out the mlfab examples directory for some ideas
  • It is a good idea to try to write the full training loop yourself to figure out what's going on
  • Run nvidia-smi to see the GPUs and their statuses/any active processes
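
To make the advice about writing the training loop yourself concrete, here is a minimal sketch of one in plain Python, with no PyTorch or mlfab involved. It fits a line with full-batch gradient descent so that every step (forward pass, loss, gradients, parameter update) is visible. The function name train_linear and the debug flag are assumptions made for illustration; in PyTorch, autograd and an optimizer would take over the gradient and update steps.

```python
def train_linear(xs, ys, lr=0.1, epochs=500, debug=False):
    """Fit y = w * x + b by full-batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    losses = []
    for epoch in range(epochs):
        # Forward pass: predictions and per-sample errors.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        loss = sum(e * e for e in errors) / n
        losses.append(loss)

        # Backward pass: analytic gradients of the mean squared error.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n

        if debug and epoch == 0:
            # Drops into the debugger so w, b, loss and the gradients can be inspected.
            breakpoint()

        # Parameter update.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, losses


if __name__ == "__main__":
    # Synthetic data drawn from a known line, so the result is easy to sanity-check.
    xs = [i / 10 - 1 for i in range(21)]
    ys = [3.0 * x + 0.5 for x in xs]
    w, b, losses = train_linear(xs, ys)
    print(f"w={w:.3f} b={b:.3f} first_loss={losses[0]:.4f} last_loss={losses[-1]:.6f}")
```

Passing debug=True pauses the loop on the first epoch, which is a convenient place to try out breakpoint() before using it in a real training script.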