Others

Can I recover data from a released instance?

No, once an instance is released, the data cannot be recovered.

What should I do if the server CPU is fully utilized?

First, check which processes or applications are consuming the CPU, as shown below, and then decide whether they should be optimized or stopped.
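For example, you can list the busiest processes with standard Linux tools (a minimal sketch; adjust to your workload):

# Live view of per-process CPU usage
top

# Or: a one-off list of the top 10 processes sorted by CPU usage
ps aux --sort=-%cpu | head -n 11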

Why can't I open JupyterLab?

Images imported through the image import feature do not install JupyterLab by default. If you need JupyterLab, please install and configure it yourself.

In official images, JupyterLab is installed in the base virtual environment.
If you modified the base environment (e.g., changed the Python version, installed packages that conflict with JupyterLab, uninstalled Jupyter, or made other breaking changes), the installed Jupyter may be damaged and JupyterLab will no longer open. Troubleshooting steps and solutions are below:

# 1. Check whether the Python version in the base environment matches the Python version of the image selected when creating the instance.
(base) root@492132307857413:/# python -V
Python 3.10.10

# 2. Use `pip list | grep jupyter` to check whether the currently installed jupyter is missing any packages, then compare against the list below
(base) root@492132307857413:/# pip list | grep jupyter
jupyter_client 8.6.0
jupyter_core 5.5.0
jupyter-events 0.9.0
jupyter-lsp 2.2.0
jupyter_server 2.11.1
jupyter_server_terminals 0.4.4
jupyterlab 4.0.8
jupyterlab-language-pack-zh-CN 4.0.post3
jupyterlab-pygments 0.2.2
jupyterlab_server 2.25.0

# 3. If a package is missing, install it with pip install <package_name>. For example, if jupyter_core is missing, install it as follows
(base) root@492132307857413:/# pip install jupyter_core

# 4. After installation, restart jupyterlab using the following command
(base) root@492132307857413:/# supervisord ctl restart jupyterlab

# 5. Check the running status of jupyterlab. If the status is Running, it’s normal, and you can access it via the console
(base) root@492132307857413:/# supervisord ctl status jupyterlab
jupyterlab Running pid 40, uptime 0:15:43

# If it shows another status, please submit a ticket for technical support
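If several of the packages above are missing or have mismatched versions, it may be simpler to reinstall them together. A minimal sketch, pinning the versions from the list above (adjust if your image uses different versions), followed by a restart:

pip install jupyterlab==4.0.8 jupyter_server==2.11.1 jupyter_client==8.6.0 jupyter_core==5.5.0
supervisord ctl restart jupyterlab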

Why can't I use the GPU?

If you are running deep learning training and notice that the GPU is not being used, try the following troubleshooting steps:

1. Ensure GPU information can be viewed with nvidia-smi

nvidia-smi

2. Check that the framework used by your code (TensorFlow, PyTorch, etc.) is properly installed with GPU support in the instance environment

# TensorFlow framework check
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))

# PyTorch framework check
import torch
print(torch.cuda.is_available())

3. Check if the installed CUDA version is compatible with your deep learning framework version

Official images ship with framework, CUDA, and Python versions that are officially supported and known to work together.
If you installed a different framework version in an official image, verify the compatibility of that framework with the installed CUDA version against the framework's official documentation.

# Check the CUDA version
nvcc -V

4. Explicitly specify GPU device in training code

# TensorFlow framework
with tf.device('/GPU:0'):
    model.fit(...)

# PyTorch framework
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
# Ensure data is also moved to GPU
inputs, labels = data[0].to(device), data[1].to(device)

5. Set environment variables

For some frameworks, you may need to set environment variables to indicate which GPUs should be used. For example, with CUDA:

export CUDA_VISIBLE_DEVICES=0
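You can also set the variable for a single run by prefixing the training command; train.py below is a hypothetical script name:

# Make only GPU 0 visible to this run (train.py is a placeholder)
CUDA_VISIBLE_DEVICES=0 python train.py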

Why can't I connect to SSH or JupyterLab after restoring an instance from an image?

If you cannot connect via SSH or open JupyterLab after restoring an instance from an image, try restarting the instance first. If it still fails, please submit a ticket for technical support.

If the instance was restored from a custom imported image, note that JupyterLab is not installed by default, but SSH should still work. If SSH also fails, please submit a ticket for technical support.
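If one of the two connections still works and you can open a terminal on the instance, it may help to confirm whether the services are actually running before submitting a ticket. A minimal sketch, assuming an official image where JupyterLab is managed by supervisord as shown earlier:

# Check whether the SSH daemon is running
ps aux | grep sshd

# Check the JupyterLab service status (official images only)
supervisord ctl status jupyterlab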

HuggingFace Cache

By default, HuggingFace caches models in /root/.cache. You can configure the cache to save to the data disk as follows:

# Execute in terminal:
export HF_HOME=/gz-data/hf-cache/

# Or in Python code:
import os
os.environ['HF_HOME'] = '/gz-data/hf-cache/'
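
Note that the export only applies to the current shell session. To make the setting persistent across sessions, you can append it to the shell startup file (assuming bash is the login shell):

# Persist the cache location for future shell sessions
echo 'export HF_HOME=/gz-data/hf-cache/' >> /root/.bashrc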