Build the perfect TensorFlow input pipeline with the tf.data API.
Why? Because your data does not fit in memory, you need pre-processing, and you want to decouple the loading and pre-processing of data from the parallel computation that trains your model.
A single-host performance guide for getting maximum throughput out of your CPU/GPU system when training your neural network model.
ETL (extract, transform, load) is normally performed on your CPU, while the training of your model runs on your GPU. The guide covers optimal parallelization across multiple CPU cores and asynchronous buffer management, including tf.data.AUTOTUNE.
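As a minimal sketch of the idea (the toy range source and the doubling map are just illustrative stand-ins for real extraction and pre-processing steps): tf.data.AUTOTUNE lets the runtime choose the degree of map parallelism and the prefetch buffer size, so CPU pre-processing overlaps with GPU training.

```python
import tensorflow as tf

AUTOTUNE = tf.data.AUTOTUNE

dataset = (
    tf.data.Dataset.range(10)                           # extract: toy in-memory source
    .map(lambda x: x * 2, num_parallel_calls=AUTOTUNE)  # transform: parallel CPU pre-processing
    .batch(4)                                           # group examples into batches
    .prefetch(AUTOTUNE)                                 # load: overlap preparation with consumption
)

for batch in dataset:
    print(batch.numpy())
```

The same map/batch/prefetch pattern applies unchanged when the source is TFRecord files or images on disk instead of a toy range.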
Link to main literature:
https://www.tensorflow.org/guide/data
https://www.tensorflow.org/guide/data...
https://github.com/tensorflow/docs/bl...
https://colab.research.google.com/git...
#tf.data
#TensorFlowAPI
#InputPipeline
00:00 TF2 Input Pipeline
00:50 ETL process
01:45 Multiple CPU cores
02:32 TF2 code for tf.data
06:00 COLAB Jupyter NB tf.data API
11:05 Summary