A Tool Developer’s Guide to TensorFlow Model Files

Most users shouldn’t need to care about the internal details of how TensorFlow stores data on disk, but you might if you’re a tool developer. For example, you may want to analyze models, or convert back and forth between TensorFlow and other formats. This guide tries to explain some of the details of how you can work with the main files that hold model data, to make it easier to develop those kind of tools.

Protocol Buffers

All of TensorFlow’s file formats are based on Protocol Buffers, so to start it’s worth getting familiar with how they work. The summary is that you define data structures in text files, and the protobuf tools generate classes in C, Python, and other languages that can load, save, and access the data in a friendly way. We often refer to Protocol Buffers as protobufs, and I’ll use that convention in this guide.

GraphDef

The foundation of computation in TensorFlow is the Graph object. This holds a network of nodes, each representing one operation, connected to each other as inputs and outputs. After you’ve created a Graph object, you can save it out by calling as_graph_def(), which returns a GraphDef object.

The GraphDef class is an object created by the ProtoBuf library from the definition in tensorflow/core/framework/graph.proto. The protobuf tools parse this text file, and generate the code to load, store, and manipulate graph definitions. If you see a standalone TensorFlow file representing a model, it’s likely to contain a serialized version of one of these GraphDef objects saved out by the protobuf code.

This generated code is used to save and load the GraphDef files from disk. The code that actually loads the model looks like this:

graph_def = graph_pb2.GraphDef()

This line creates an empty GraphDef object, the class that’s been created from the textual definition in graph.proto. This is the object we’re going to populate with the data from our file.

with open(FLAGS.graph, "rb") as f:

Here we get a file handle for the path we’ve passed in to the script

  if FLAGS.input_binary:
    graph_def.ParseFromString(f.read())
  else:
    text_format.Merge(f.read(), graph_def)

Text or Binary?

There are actually two different formats that a ProtoBuf can be saved in. TextFormat is a human-readable form, which makes it nice for debugging and editing, but can get large when there’s numerical data like weights stored in it. You can see a small example of that in graph_run_run2.pbtxt.

Binary format files are a lot smaller than their text equivalents, even though they’re not as readable for us. In this script, we ask the user to supply a flag indicating whether the input file is binary or text, so we know the right function to call. You can find an example of a large binary file inside the inception_v3 archive, as inception_v3_2016_08_28_frozen.pb.

The API itself can be a bit confusing - the binary call is actually ParseFromString(), whereas you use a utility function from the text_format module to load textual files.

Nodes

Once you’ve loaded a file into the graph_def variable, you can now access the data inside it. For most practical purposes, the important section is the list of nodes stored in the node member. Here’s the code that loops through those:

for node in graph_def.node

Each node is a NodeDef object, defined in tensorflow/core/framework/node_def.proto. These are the fundamental building blocks of TensorFlow graphs, with each one defining a single operation along with its input connections. Here are the members of a NodeDef, and what they mean.

name

Every node should have a unique identifier that’s not used by any other nodes in the graph. If you don’t specify one as you’re building a graph using the Python API, one reflecting the name of operation, such as “MatMul”, concatenated with a monotonically increasing number, such as “5”, will be picked for you. The name is used when defining the connections between nodes, and when setting inputs and outputs for the whole graph when it’s run.

op

This defines what operation to run, for example "Add", "MatMul", or "Conv2D". When a graph is run, this op name is looked up in a registry to find an implementation. The registry is populated by calls to the REGISTER_OP() macro, like those in tensorflow/core/ops/nn_ops.cc.

input

A list of strings, each one of which is the name of another node, optionally followed by a colon and an output port number. For example, a node with two inputs might have a list like ["some_node_name", "another_node_name"], which is equivalent to ["some_node_name:0", "another_node_name:0"], and defines the node’s first input as the first output from the node with the name "some_node_name", and a second input from the first output of "another_node_name"

device

In most cases you can ignore this, since it defines where to run a node in a distributed environment, or when you want to force the operation onto CPU or GPU.

attr

This is a key/value store holding all the attributes of a node. These are the permanent properties of nodes, things that don’t change at runtime such as the size of filters for convolutions, or the values of constant ops. Because there can be so many different types of attribute values, from strings, to ints, to arrays of tensor values, there’s a separate protobuf file defining the data structure that holds them, in tensorflow/core/framework/attr_value.proto.

Each attribute has a unique name string, and the expected attributes are listed when the operation is defined. If an attribute isn’t present in a node, but it has a default listed in the operation definition, that default is used when the graph is created.

You can access all of these members by calling node.name, node.op, etc. in Python. The list of nodes stored in the GraphDef is a full definition of the model architecture.

Freezing

One confusing part about this is that the weights usually aren’t stored inside the file format during training. Instead, they’re held in separate checkpoint files, and there are Variable ops in the graph that load the latest values when they’re initialized. It’s often not very convenient to have separate files when you’re deploying to production, so there’s the freeze_graph.py script that takes a graph definition and a set of checkpoints and freezes them together into a single file.

What this does is load the GraphDef, pull in the values for all the variables from the latest checkpoint file, and then replace each Variable op with a Const that has the numerical data for the weights stored in its attributes It then strips away all the extraneous nodes that aren’t used for forward inference, and saves out the resulting GraphDef into an output file.

Weight Formats

If you’re dealing with TensorFlow models that represent neural networks, one of the most common problems is extracting and interpreting the weight values. A common way to store them, for example in graphs created by the freeze_graph script, is as Const ops containing the weights as Tensors. These are defined in tensorflow/core/framework/tensor.proto, and contain information about the size and type of the data, as well as the values themselves. In Python, you get a TensorProto object from a NodeDef representing a Const op by calling something like some_node_def.attr['value'].tensor.

This will give you an object representing the weights data. The data itself will be stored in one of the lists with the suffix _val as indicated by the type of the object, for example float_val for 32-bit float data types.

The ordering of convolution weight values is often tricky to deal with when converting between different frameworks. In TensorFlow, the filter weights for the Conv2D operation are stored on the second input, and are expected to be in the order [filter_height, filter_width, input_depth, output_depth], where filter_count increasing by one means moving to an adjacent value in memory.

Hopefully this rundown gives you a better idea of what’s going on inside TensorFlow model files, and will help you if you ever need to manipulate them.