TensorFlow + Kerasの環境構築メモ
はじめに
2週間ほど前に機械学習やkaggleの勉強用に新しいPCを買いました。
このPCにTensorFlowとKerasをインストールした際の作業メモを残します。
インストール環境
インストールした環境は以下のとおりです。
pythonはすでにAnacondaでインストールしてあります。
色々言われるAnacondaですが、使いたいパッケージが揃っていて、やはり便利。
項目 | 製品名またはバージョン |
---|---|
PC | GALLERIA ZZ |
OS | Ubuntu18.04 Desktop |
GPU | NVIDIA GeForce GTX1080Ti |
Python | 3.6.5 (Anaconda 5.2) |
今回参考にした情報
今回は以下の2つを参考にインストールしました。
https://qiita.com/gott/items/28524953547d13894e08
https://medium.com/@taylordenouden/installing-tensorflow-gpu-on-ubuntu-18-04-89a142325138
インストール手順
CUDA-9.0をインストール
下のURLからCUDA Toolkit 9.0をダウンロードしました。
https://developer.nvidia.com/cuda-90-download-archive
今回は"Select Target Platform"を以下のとおりに選択します。
項目 | 選択した値 |
---|---|
Operating System | Linux |
Architecture | x86_64 |
Distribution | Ubuntu |
Version | 17.04 |
Installer Type | runfile(local) |
ダウンロードしたら下記のコマンドを実行してインストールを開始します。
$ cd ~/Downloads # 今回のダウンロード先 $ sudo chmod +x cuda_9.0.176_384.81_linux.run $ ./cuda_9.0.176_384.81_linux.run --override
長い免責事項の後に出てくる選択肢には、
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 384.81?
のみ
no
を選択し、他はすべてyes
と答えます。
実際の内容は以下のとおりです。
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ----------------- Do you accept the previously read EULA? accept/decline/quit: accept You are attempting to install on an unsupported configuration. Do you wish to continue? (y)es/(n)o [ default is no ]: yes Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 384.81? (y)es/(n)o/(q)uit: no Install the CUDA 9.0 Toolkit? (y)es/(n)o/(q)uit: yes Enter Toolkit Location [ default is /usr/local/cuda-9.0 ]: /usr/local/cuda-9.0 is not writable. Do you wish to run the installation with 'sudo'? (y)es/(n)o: yes Please enter your password: Do you want to install a symbolic link at /usr/local/cuda? (y)es/(n)o/(q)uit: yes Install the CUDA 9.0 Samples? (y)es/(n)o/(q)uit: yes Enter CUDA Samples Location [ default is /home/katsuya ]: Installing the CUDA Toolkit in /usr/local/cuda-9.0 ...
CUDNN 7.1をインストールする
Download cuDNN v7.1.4 (May 16, 2018), for CUDA 9.0
の
cuDNN v7.1.4 Library for Linux
をダウンロードしました。
以下のコマンドを実行してライブラリを適切なディレクトリに配置します。
$ tar -zxvf cudnn-9.0-linux-x64-v7.1.tgz $ sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda-9.0/lib64/ $ sudo cp cuda/include/cudnn.h /usr/local/cuda-9.0/include/ $ sudo chmod a+r /usr/local/cuda-9.0/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
NVIDIA CUDA Profile Tools Interfaceをインストールする
これはメモリロードの状況やボトルネックの同定などの機能を提供するパッケージです。
下記のとおり、aptで簡単にインストールできます。
$ sudo apt install libcupti-dev
環境変数を設定する
.bashrcの末尾に次の2行を追加します。
export PATH="/usr/local/cuda-9.0/bin:$PATH" export LD_LIBRARY_PATH="/usr/local/cuda/lib64:$LD_LIBRARY_PATH"
完了したら次のコマンドを実行して、環境変数を有効化します。
$ source ~/.bashrc
TensorFlowとKerasをインストールする
やっとTensorFlowとKerasをインストールできる環境が整いました。
あとは
$ pip install tensorflow-gpu $ pip install keras
を実行すればインストール作業は終了です。
テストコードを実行してみる
実際に動作するか試してみます。
テストコードは
https://github.com/keras-team/keras/blob/master/examples/cifar10_cnn.py
をそのまま拝借してkeras_test_code.py
を作成します。
'''Train a simple deep CNN on the CIFAR10 small images dataset. It gets to 75% validation accuracy in 25 epochs, and 79% after 50 epochs. (it's still underfitting at that point, though). ''' from __future__ import print_function import keras from keras.datasets import cifar10 from keras.preprocessing.image import ImageDataGenerator from keras.models import Sequential from keras.layers import Dense, Dropout, Activation, Flatten from keras.layers import Conv2D, MaxPooling2D import os batch_size = 32 num_classes = 10 epochs = 100 data_augmentation = True num_predictions = 20 save_dir = os.path.join(os.getcwd(), 'saved_models') model_name = 'keras_cifar10_trained_model.h5' # The data, split between train and test sets: (x_train, y_train), (x_test, y_test) = cifar10.load_data() print('x_train shape:', x_train.shape) print(x_train.shape[0], 'train samples') print(x_test.shape[0], 'test samples') # Convert class vectors to binary class matrices. y_train = keras.utils.to_categorical(y_train, num_classes) y_test = keras.utils.to_categorical(y_test, num_classes) model = Sequential() model.add(Conv2D(32, (3, 3), padding='same', input_shape=x_train.shape[1:])) model.add(Activation('relu')) model.add(Conv2D(32, (3, 3))) model.add(Activation('relu')) model.add(MaxPooling2D(pool_size=(2, 2))) model.add(Dropout(0.25)) model.add(Conv2D(64, (3, 3), padding='same')) model.add(Activation('relu')) model.add(Conv2D(64, (3, 3))) model.add(Activation('relu')) model.add(MaxPooling2D(pool_size=(2, 2))) model.add(Dropout(0.25)) model.add(Flatten()) model.add(Dense(512)) model.add(Activation('relu')) model.add(Dropout(0.5)) model.add(Dense(num_classes)) model.add(Activation('softmax')) # initiate RMSprop optimizer opt = keras.optimizers.rmsprop(lr=0.0001, decay=1e-6) # Let's train the model using RMSprop model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy']) x_train = x_train.astype('float32') x_test = x_test.astype('float32') x_train /= 255 x_test /= 255 if not data_augmentation: print('Not using data augmentation.') model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_data=(x_test, y_test), shuffle=True) else: print('Using real-time data augmentation.') # This will do preprocessing and realtime data augmentation: datagen = ImageDataGenerator( featurewise_center=False, # set input mean to 0 over the dataset samplewise_center=False, # set each sample mean to 0 featurewise_std_normalization=False, # divide inputs by std of the dataset samplewise_std_normalization=False, # divide each input by its std zca_whitening=False, # apply ZCA whitening zca_epsilon=1e-06, # epsilon for ZCA whitening rotation_range=0, # randomly rotate images in the range (degrees, 0 to 180) # randomly shift images horizontally (fraction of total width) width_shift_range=0.1, # randomly shift images vertically (fraction of total height) height_shift_range=0.1, shear_range=0., # set range for random shear zoom_range=0., # set range for random zoom channel_shift_range=0., # set range for random channel shifts # set mode for filling points outside the input boundaries fill_mode='nearest', cval=0., # value used for fill_mode = "constant" horizontal_flip=True, # randomly flip images vertical_flip=False, # randomly flip images # set rescaling factor (applied before any other transformation) rescale=None, # set function that will be applied on each input preprocessing_function=None, # image data format, either "channels_first" or "channels_last" data_format=None, # fraction of images reserved for validation (strictly between 0 and 1) validation_split=0.0) # Compute quantities required for feature-wise normalization # (std, mean, and principal components if ZCA whitening is applied). datagen.fit(x_train) # Fit the model on the batches generated by datagen.flow(). model.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size), epochs=epochs, validation_data=(x_test, y_test), workers=4) # Save model and weights if not os.path.isdir(save_dir): os.makedirs(save_dir) model_path = os.path.join(save_dir, model_name) model.save(model_path) print('Saved trained model at %s ' % model_path) # Score trained model. scores = model.evaluate(x_test, y_test, verbose=1) print('Test loss:', scores[0]) print('Test accuracy:', scores[1])
実行結果はご覧のとおりです。
katsuya@katsuya:~$ python ./keras_test_code.py /home/katsuya/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`. from ._conv import register_converters as _register_converters Using TensorFlow backend. Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz 170500096/170498071 [==============================] - 203s 1us/step x_train shape: (50000, 32, 32, 3) 50000 train samples 10000 test samples Using real-time data augmentation. Epoch 1/100 2018-09-05 01:20:30.859870: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA 2018-09-05 01:20:31.135949: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:897] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2018-09-05 01:20:31.137171: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1405] Found device 0 with properties: name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582 pciBusID: 0000:01:00.0 totalMemory: 10.91GiB freeMemory: 10.42GiB 2018-09-05 01:20:31.137233: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0 2018-09-05 01:20:36.353767: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix: 2018-09-05 01:20:36.353854: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0 2018-09-05 01:20:36.353880: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0: N 2018-09-05 01:20:36.363145: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10075 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1) 1563/1563 [==============================] - 18s 11ms/step - loss: 1.8860 - acc: 0.3071 - val_loss: 1.5808 - val_acc: 0.4329 Epoch 2/100 1563/1563 [==============================] - 10s 6ms/step - loss: 1.5933 - acc: 0.4186 - val_loss: 1.4743 - val_acc: 0.4629
正常に動いていることが確認できました。