== Prismatic REPL Script Guide ==
Here are some suggestions for running the generate.py REPL script if you would like to get started with OpenVLA (you can find the script in the repo's '''scripts''' folder).
== Prerequisites ==
Make sure the images you give the model show the robot's end effector.
[[File:Coke can2.png|400px|Can pickup task]]
== Starting REPL mode ==
Then, run generate.py. The script starts by initializing the generation playground with the Prismatic model prism-dinosiglip+7b, which is downloaded from the Hugging Face Hub.
The model configuration is found and then the model is loaded with the following components:
* Vision Backbone: dinosiglip-vit-so-384px
* Language Model (LLM) Backbone: llama2-7b-pure (this is also where the hf token comes into play)
* Architecture Specifier: no-align+fused-gelu-mlp
* Checkpoint Path: the model checkpoint is loaded from a specific path in the cache.
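Because the LLM backbone needs a Hugging Face token, it can help to check that the token is readable before starting a long model download. The following is a minimal sketch, not code taken from generate.py: the helper name <code>read_hf_token</code>, the file name and the environment variable name are all assumptions chosen for illustration.

```python
import os
from pathlib import Path
from typing import Union


def read_hf_token(source: Union[str, Path]) -> str:
    """Return a Hugging Face token.

    Hypothetical helper: a Path is treated as a file containing the token,
    and a plain string is treated as the name of an environment variable.
    """
    if isinstance(source, Path):
        # Strip the trailing newline that most editors add to the file.
        return source.read_text().strip()
    # Raises KeyError if the environment variable is not set.
    return os.environ[source]
```

For example, <code>read_hf_token(Path(".hf_token"))</code> would read the token from a file in the current directory, assuming you saved it there.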
You should see this in your terminal:
[[File:Openvla1.png|800px|prismatic models]]
''After loading the model, the script enters a REPL mode, allowing the user to interact with the model. The REPL mode provides a default generation setup and waits for user inputs.''
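To make the idea of a REPL concrete, here is a minimal read-eval-print loop sketch. It is not the code in generate.py: <code>run_repl</code>, <code>echo_model</code> and the quit commands are hypothetical, and the stand-in function only echoes the prompt instead of calling the Prismatic model.

```python
from typing import Callable, Iterable, List


def run_repl(generate: Callable[[str], str], lines: Iterable[str]) -> List[str]:
    """Pass each input line to `generate` until a quit command is seen."""
    outputs = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore empty input
        if line.lower() in {"q", "quit"}:  # hypothetical quit commands
            break
        outputs.append(generate(line))
    return outputs


def echo_model(prompt: str) -> str:
    # Stand-in for the real model call; it only echoes the prompt.
    return f"[model output for: {prompt}]"


if __name__ == "__main__":
    # Fixed demo inputs; an interactive session would read from the terminal.
    for reply in run_repl(echo_model, ["What is in the image?", "", "q", "ignored"]):
        print(reply)
```

The loop takes its inputs as an iterable so it can be tried without a terminal; passing <code>sys.stdin</code> instead of a list would make it interactive.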
In short, generate.py lets you interactively test generation with the Prismatic model prism-dinosiglip+7b. At the REPL prompt, you can enter the following commands: