OK I got it to work.
The problem is that the exercise description doesn't make it clear that the notebook has to be run from inside the repo folder already cloned in the previous step, “Generating a Synthetic Dataset Using Replicator”.
The confusion happened when I got to “Fine-Tuning and Validating an AI Perception Model > Lecture: Training a Model With Synthetic Data” and ran into this instruction:
“Optional: Training Your Own Model
For those interested in training their own model, follow these steps using the Synthetic Data Generation Training Workflow:
- Clone the GitHub project and navigate to the local_train.ipynb notebook.
- Set up the TAO Toolkit via a Docker container.
- Download a pre-trained object detection model.
- Convert your dataset into TFRecords (a format optimized for faster data iteration).
- Specify training parameters such as batch size and learning rate.
- Train the model using TAO Toolkit.
- Evaluate its performance on test data.
- Visualize results to assess how well the model detects objects.”
I know it’s optional, but I want to train my model anyway.
So I cloned the GitHub repo from step 1 above into a different folder from the repo used to generate synthetic data in the previous section. That fresh clone does NOT contain “/workspace/tao-experiments/palletjack_sdg/palletjack_data/distractors_warehouse/Camera/rgb” etc., so when I ran local_train.ipynb from it, I got that error.
What the exercise should have told me to do is: “go to the folder where you saved the repo from the previous step (where you generated the synthetic data) and run local_train.ipynb from there”. That repo DOES contain “/workspace/tao-experiments/palletjack_sdg/palletjack_data/distractors_warehouse/Camera/rgb” and the other needed folders, because they were created during the synthetic data generation step.
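Before launching anything, a quick sanity check helps confirm you are in the right clone. This is just a sketch: the palletjack_data path comes from the error message, and where it actually sits on disk depends on how the course maps the container’s /workspace folder:

```bash
# Run from the root of the repo used for synthetic data generation.
# Printing a path means the generated images exist and this is the right
# clone; no output means this is probably the fresh, data-less clone.
find . -type d -path "*palletjack_data*" -name rgb
```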
Also, os.environ["LOCAL_PROJECT_DIR"] = "<LOCAL_PATH_OF_CLONED_REPO>" should be set to the path of the project used in the synthetic data generation step, not the repo linked from the list above.
Here are the steps that I used to make it work:
0 - Make sure you have completed all the previous steps to generate synthetic data:
. Clone the repo: NVIDIA-AI-IOT/synthetic_data_generation_training_workflow (workflow for generating synthetic data and training CV models)
. Configure generate_data.sh
. Run generate_data.sh (see the command sketch below)
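For reference, step 0 in command form looks roughly like this (a sketch: the repo URL matches the project name above, but the edits to generate_data.sh and the script’s exact location follow the previous course section):

```bash
# Clone the workflow repo used for synthetic data generation.
git clone https://github.com/NVIDIA-AI-IOT/synthetic_data_generation_training_workflow.git
cd synthetic_data_generation_training_workflow

# Configure generate_data.sh as the course describes (paths, frame counts),
# then run it; adjust the path if the script lives in a subfolder.
bash generate_data.sh
```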
1 - Open the Ubuntu CLI (a Linux environment, e.g. WSL, if running on Windows)
2 - Create a separate Conda environment that uses Python 3.10: “conda create -n tao-py310 python=3.10”
If this environment has already been created, skip this step.
3 - Activate the Python 3.10 Conda environment: “conda activate tao-py310” (both commands are sketched together below)
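Both commands together (the environment name tao-py310 is just the one I chose; any name works):

```bash
# Create the Python 3.10 environment once (skip if it already exists)...
conda create -n tao-py310 python=3.10 -y

# ...and activate it in every new shell session.
conda activate tao-py310
```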
4 - Connect to NVIDIA’s Docker container registry (see the login sketch after this list):
. Open the Docker Desktop application and click the play button on the container
. In the Ubuntu CLI, run “docker login nvcr.io”
. Log in to the NVIDIA registry: the username is the literal string $oauthtoken and the password is your NGC API key (if you logged in before, the credentials are already saved; if not, get an API key from https://org.ngc.nvidia.com/setup/api-keys, rotating it if you need a new one)
. Run “docker ps -a” to list all containers (running and stopped)
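The login exchange looks like this (note that $oauthtoken is the literal username NGC expects, not a placeholder):

```bash
# Authenticate against NVIDIA's container registry (NGC).
docker login nvcr.io
# Username: $oauthtoken   <- typed literally
# Password: <your NGC API key from https://org.ngc.nvidia.com/setup/api-keys>

# List all containers (running and stopped) to confirm yours is up.
docker ps -a
```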
5 - In the Ubuntu CLI, navigate to the folder where the GitHub project for synthetic data generation was cloned:
. i.e., the repo from the course section “Generating a Synthetic Dataset Using Replicator > Activity: Understanding Basics of the SDG Script”
6 - Open the notebook in this folder from the Ubuntu CLI: “jupyter notebook local_train.ipynb --allow-root” (steps 5 and 6 are sketched as commands below)
. Copy the URL Jupyter prints into a web browser, then click the notebook to open it
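Steps 5 and 6 as commands (a sketch; the path is wherever you cloned the synthetic data generation repo):

```bash
# Go to the clone that already contains the generated palletjack data...
cd /path/to/synthetic_data_generation_training_workflow

# ...and launch the training notebook from there.
jupyter notebook local_train.ipynb --allow-root
# Jupyter prints a URL with a token; paste it into a browser and open the notebook.
```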
7 - Inside the notebook, uncomment and set os.environ["LOCAL_PROJECT_DIR"] = "<LOCAL_PATH_OF_CLONED_REPO>", replacing the placeholder with the actual path where I saved the project cloned for the synthetic data generation step
8 - Run all cells in the notebook
In my opinion, the exercise should include the steps described above to make it easier to follow.
In brief, the exercise’s description and steps should guide the user through everything that’s needed and make it explicit that the notebook must be run from the repo used in the previous step, not from a fresh clone of the one linked in the list.