Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
388 views
in Technique[技术] by (71.8m points)

python - How to install python3.8 and tensorflow-gpu==2.4 on google colab

Here are the steps I follow and I fail to get tensorflow to work .................................................................................................................................................

Install python3.8

!sudo add-apt-repository ppa:deadsnakes/ppa
!sudo apt-get update
!sudo apt-get install python3.8
!sudo apt install python3.8-distutils
!sudo apt install python3.8-venv python3.8-dev
!sudo apt-get install python3-pip
!python3.8 -m pip install -U pip setuptools

Install tensorflow

!python3.8 -m pip install tensorflow tensorflow-gpu
!python3.8 -c 'import tensorflow'

OUT:

2021-01-05 03:06:33.030994: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2021-01-05 03:06:33.031039: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.

Install cuda 11 using tensorflow docs

!sudo apt-get update -y
!sudo apt-get install -y libcupti-dev
!export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/extras/CUPTI/lib64
!sudo apt-get install nvidia-kernel-common-455 libnvidia-gl-455 libnvidia-fbc1-455
 libnvidia-ifr1-455 libnvidia-cfg1-455 xserver-xorg-video-nvidia-455
  libnvidia-encode-455 nvidia-utils-455 nvidia-dkms-455 libnvidia-decode-455
   nvidia-compute-utils-455 nvidia-kernel-source-455 libnvidia-extra-455
    libnvidia-compute-455 nvidia-driver-455
# Add NVIDIA package repositories
!wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
!sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
!sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
!sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"
!sudo apt-get update

!wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb

!sudo apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
!sudo apt-get update

# Install NVIDIA driver
!sudo apt-get install --no-install-recommends nvidia-driver-455
# Reboot. Check that GPUs are visible using the command: nvidia-smi

!wget https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libnvinfer7_7.1.3-1+cuda11.0_amd64.deb
!sudo apt install ./libnvinfer7_7.1.3-1+cuda11.0_amd64.deb
!sudo apt-get update

# Install development and runtime libraries (~4GB)
!sudo apt-get install --no-install-recommends 
    cuda-11-0 
    libcudnn8=8.0.4.30-1+cuda11.0  
    libcudnn8-dev=8.0.4.30-1+cuda11.0


# Install TensorRT. Requires that libcudnn8 is installed above.
!sudo apt-get install -y --no-install-recommends libnvinfer7=7.1.3-1+cuda11.0 
    libnvinfer-dev=7.1.3-1+cuda11.0 
    libnvinfer-plugin7=7.1.3-1+cuda11.0

I then get this error when I run a script dqn.py that uses tensorflow using !python3.8 dqn.py

or/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-01-05 20:58:19.951459: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-01-05 20:58:19.952528: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-01-05 20:58:19.986076: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-01-05 20:58:19.986638: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:00:04.0 name: Tesla T4 computeCapability: 7.5
coreClock: 1.59GHz coreCount: 40 deviceMemorySize: 14.73GiB deviceMemoryBandwidth: 298.08GiB/s
2021-01-05 20:58:19.986691: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-01-05 20:58:20.006886: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-01-05 20:58:20.006998: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2021-01-05 20:58:20.014581: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-01-05 20:58:20.019778: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-01-05 20:58:20.031744: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-01-05 20:58:20.035045: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2021-01-05 20:58:20.038185: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-01-05 20:58:20.038305: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-01-05 20:58:20.038885: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-01-05 20:58:20.039409: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-01-05 20:58:20.039914: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-01-05 20:58:20.040041: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-01-05 20:58:20.040578: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:00:04.0 name: Tesla T4 computeCapability: 7.5
coreClock: 1.59GHz coreCount: 40 deviceMemorySize: 14.73GiB deviceMemoryBandwidth: 298.08GiB/s
2021-01-05 20:58:20.040616: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-01-05 20:58:20.040653: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-01-05 20:58:20.040677: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2021-01-05 20:58:20.040698: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-01-05 20:58:20.040720: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-01-05 20:58:20.040740: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-01-05 20:58:20.040759: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2021-01-05 20:58:20.040779: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-01-05 20:58:20.040840: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-01-05 20:58:20.041448: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-01-05 20:58:20.041944: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-01-05 20:58:20.041994: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Traceback (most recent call last):
  File "dqn.py", line 337, in <module>
    mod1 = dqn_conv(en.observation_space.shape, en.action_space.n)
  File "/content/models.py", line 19, in dqn_conv
    x = Conv2D(32, 8, 4, activation='relu')(x0)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 951, in __call__
    return self._functional_construction_call(inputs, args, kwargs,
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 1090, in _functional_construction_call
    outputs = self._keras_tensor_symbolic_call(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 822, in _keras_tensor_symbolic_call
    return self._infer_output_signature(inputs, args, kwargs, input_masks)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 862, in _infer_output_signature
    self._maybe_build(inputs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 2710, in _maybe_build
    self.build(input_shapes)  # pylint:disable=not-callable
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/layers/convolutional.py", line 198, in build
    self.kernel = self.add_weight(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 623, in add_weight
    variable = self._add_variable_with_custom_getter(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/training/tracking/base.py", line 805, in _add_variable_with_custom_getter
    new_variable = getter(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/base_layer_utils.py", line 130, in make_variable
    return tf_variables.VariableV1(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/ops/variables.py", line 260, in __call__
    return cls._variable_v1_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/ops/variables.py", line 206, in _variable_v1_call
    return previous_getter(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/ops/variables.py", line 199, in <lambda>
    previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/ops/variable_scope.py", line 2604, in default_variable_creator
    return resource_variable_ops.ResourceVariable(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/ops/variables.py", line 264, in __call__
    return super(VariableMetaclass, cls).__call__(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/ops/resource_variable_ops.py", line 1574, in __init__
    self._init_from_args(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/ops/resource_variable_ops.py", line 1712, in _init_from_args
    initial_value = initial_value()
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/initializers/initializers_v2.py", line 409, in __call__
    return super(VarianceScaling, self).__call__(

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)
等待大神答复

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...