I get this question a lot in my deep learning courses: how do I save a neural network after I’ve trained it?
This is a real-world problem.
In most of my courses, we are focused on the “hard part” – how to actually train the model, both mathematically and then translating that math into code. Thinking about saving and loading files seems silly in comparison to multivariate calculus and numerical optimization techniques.
Today, we are going to discuss saving (and loading) a trained neural network.
In TensorFlow specifically, this is non-trivial.
In fact, it’s hard to even turn your model into a class, because variables in TensorFlow only have values inside sessions.
Once the session is over, the variables are lost.
For us, this seemed ok, because we would train the variables, show that the cost decreased, and end things there.
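To make the "variables are lost" point concrete, here is a minimal sketch. It assumes you are running TensorFlow 2.x through its `tensorflow.compat.v1` module (on TensorFlow 1.x a plain `import tensorflow as tf` plays the same role), and the `tf.assign` call is only a stand-in for a real training step.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # use TF1-style graphs and sessions

W = tf.Variable(0.0, name="W")
update = tf.assign(W, 5.0)  # stand-in for "training" W
init = tf.global_variables_initializer()

with tf.Session() as session:
    session.run(init)
    session.run(update)
    trained_value = session.run(W)  # the updated value, inside this session

with tf.Session() as session:
    session.run(init)  # a new session knows nothing about the old one
    fresh_value = session.run(W)  # back to the initial value

print(trained_value, fresh_value)
```

The second session has no memory of what happened in the first, which is exactly the gap the Saver fills.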
The key component for saving and loading variables (as in tf.Variable, not variables in general) is the tf.train.Saver object.
self.saver = tf.train.Saver()
Once we have this, there are 2 functions we are interested in:
saver.save(session, filename)
saver.restore(session, filename)
As you can see, they both depend on the same 2 parameters, and they both require you to be inside a session.
That means, these functions will only be called within blocks that look like:
with tf.Session() as session:
    ... do stuff ...
The filename can be whatever you wish to name it.
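One detail worth seeing for yourself is that the filename acts more like a prefix than a single file. The sketch below saves one variable and lists whatever ends up on disk. The path and the variable shape are made up for illustration, and it again assumes the `tensorflow.compat.v1` import.

```python
import glob
import os
import tempfile

import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # TF1-style graphs and sessions

W = tf.Variable(tf.zeros((3, 2)), name="W")
saver = tf.train.Saver()  # with no arguments, covers the variables defined so far

# Hypothetical location; any writable path works.
filename = os.path.join(tempfile.mkdtemp(), "my_model.ckpt")

with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    saver.save(session, filename)

# List whatever was written under that prefix.
written = sorted(os.path.basename(p) for p in glob.glob(filename + "*"))
print(written)
```

You should see a handful of files that all start with `my_model.ckpt`. When restoring, you pass the same prefix you passed to save, not one of the individual files.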
In pseudocode, our train function will look something like:
with tf.Session() as session:
    run gradient descent
    saver.save(session, outputfile)
Notice that at this point we have never looked at the trained model parameters. If we wanted to, we could have extracted them inside the train session, with something like:
W_value = session.run(W)
And then later in the predict function, we could start a new session and feed in this trained W, but that is not the TensorFlow way (plus it would be really ugly).
Instead, our predict function will look like this:
with tf.Session() as session:
    saver.restore(session, outputfile)
    prediction = session.run(predict_op, ...)
And TensorFlow will automatically know to load the saved variables from your output file.
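Putting the train and predict pseudocode together, a runnable version might look like the sketch below. The toy data, the learning rate, the number of steps, and the checkpoint path are all assumptions made for illustration, and the model is a tiny two-input logistic regression rather than anything from the course code.

```python
import os
import tempfile

import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # TF1-style graphs and sessions

# Toy data, made up for illustration: label is 1 when x0 + x1 > 0.
rng = np.random.RandomState(0)
X_data = rng.randn(200, 2).astype(np.float32)
Y_data = (X_data.sum(axis=1) > 0).astype(np.float32).reshape(-1, 1)

X = tf.placeholder(tf.float32, shape=(None, 2))
Y = tf.placeholder(tf.float32, shape=(None, 1))
W = tf.Variable(tf.zeros((2, 1)), name="W")
b = tf.Variable(tf.zeros((1,)), name="b")
logits = tf.matmul(X, W) + b
cost = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=Y, logits=logits))
train_op = tf.train.GradientDescentOptimizer(0.5).minimize(cost)
predict_op = tf.cast(logits > 0, tf.float32)
saver = tf.train.Saver()

outputfile = os.path.join(tempfile.mkdtemp(), "logreg.ckpt")  # hypothetical path


def train():
    with tf.Session() as session:
        session.run(tf.global_variables_initializer())
        for _ in range(100):  # run gradient descent
            session.run(train_op, feed_dict={X: X_data, Y: Y_data})
        saver.save(session, outputfile)


def predict(inputs):
    with tf.Session() as session:
        # No initializer here: restore sets W and b from the checkpoint.
        saver.restore(session, outputfile)
        return session.run(predict_op, feed_dict={X: inputs})


train()
accuracy = np.mean(predict(X_data) == Y_data)
print("train accuracy:", accuracy)
```

Notice that predict never sees the trained values as Python objects. They travel from one session to the other only through the checkpoint.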
Technically, this is all you need to know to create a class-based neural network that defines the fit(X, Y) and predict(X) functions.
What about saving the actual model (object instance) to a file, and then reloading it at a later time?
We again require the Saver object but it’s a little more difficult, in particular because Saver actually has quite a lot of functionality and we need to pass in some params to ensure we are saving and then reloading the correct variables.
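One of the parameters Saver accepts is a dictionary mapping checkpoint names to variables, which lets you control which variables are saved and how they are matched up on reload. The sketch below is only one way to use it and is not necessarily what the final code does. The names "weights" and "bias" and the two separate graphs are assumptions chosen to mimic a model being rebuilt from scratch.

```python
import os
import tempfile

import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # TF1-style graphs and sessions

filename = os.path.join(tempfile.mkdtemp(), "model.ckpt")  # hypothetical path

# Graph used at save time.
with tf.Graph().as_default():
    W = tf.Variable([[1.0], [2.0]], name="W")
    b = tf.Variable([3.0], name="b")
    # Keys are the names stored in the checkpoint (chosen for this example).
    saver = tf.train.Saver({"weights": W, "bias": b})
    with tf.Session() as session:
        session.run(tf.global_variables_initializer())
        saver.save(session, filename)

# A fresh graph, as a newly created model object would build.
with tf.Graph().as_default():
    W2 = tf.Variable(tf.zeros((2, 1)), name="some_other_name")
    b2 = tf.Variable(tf.zeros((1,)), name="another_name")
    # Same keys, so each stored value is routed to the right variable.
    saver2 = tf.train.Saver({"weights": W2, "bias": b2})
    with tf.Session() as session:
        saver2.restore(session, filename)
        W_restored, b_restored = session.run([W2, b2])

print(W_restored, b_restored)
```

Because the mapping is explicit, the reload does not depend on whatever names TensorFlow happened to assign to the variables in the new graph.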
I don’t want to give too much away here because I like my students to actually practice coding and problem solving.
If you get stuck, the final code is available at:
It shows you how to save and load a Logistic Regression model on the MNIST data (one weight and one bias), and it will be added later to my Theano and TensorFlow basics course.
We use Logistic Regression so that you may see the techniques on a simple model without getting bogged down by the complexity of a neural network.
Here are some steps you may want to take to tackle this problem:
- Try extending the file tensorflow2.py (folder: ann_class2) to make a LogisticRegression class that implements the fit(X, Y) and predict(X) functions and makes use of tf.train.Saver.
- Once you’ve got that working, add the functions save(filename) and load(filename). Note that the load function should be a static or class method because calling it implies you do not yet have a LogisticRegression object (load is what creates one).
- One weight and one bias is easy. Extend it to a neural network with an arbitrary number of layers and hidden units.
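If the "load should be a class method" point is unclear, here is the bare pattern with the TensorFlow part deliberately left out, so the exercise is still yours to do. Everything in it is hypothetical: the class name, the D and K hyperparameters, and the choice of JSON for the non-TensorFlow state.

```python
import json
import os
import tempfile


class LogisticRegressionSketch:
    """Hypothetical skeleton: only the save/load plumbing, no TensorFlow."""

    def __init__(self, D, K, savefile):
        self.D = D                # input dimensionality
        self.K = K                # number of classes
        self.savefile = savefile  # where a Saver checkpoint would live

    def save(self, filename):
        # Store what is needed to rebuild the object; the tf.Variable values
        # themselves would be written separately with tf.train.Saver.
        state = {"D": self.D, "K": self.K, "savefile": self.savefile}
        with open(filename, "w") as f:
            json.dump(state, f)

    @classmethod
    def load(cls, filename):
        # Called on the class, because no instance exists yet.
        with open(filename) as f:
            state = json.load(f)
        return cls(state["D"], state["K"], state["savefile"])


path = os.path.join(tempfile.mkdtemp(), "model.json")
LogisticRegressionSketch(784, 10, "weights.ckpt").save(path)
model = LogisticRegressionSketch.load(path)
print(model.D, model.K, model.savefile)
```

The part left for you is what predict does with `savefile` once the object has been rebuilt.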