The Boost C++ Libraries


Chapter 11: Serialization


This book is licensed under a Creative Commons License.


11.1 General

The Boost.Serialization library converts objects in a C++ application into a sequence of bytes that can be saved and loaded again later to restore the objects. Several data formats, including XML, are available; each defines the rules by which the sequence of bytes is generated. All formats supported by Boost.Serialization are proprietary in some respect. For example, the XML format cannot be used to exchange data with applications that were not developed in C++ using Boost.Serialization. All data stored in the XML format is geared towards restoring the same C++ objects that were previously saved. The sole advantage of the XML format is that serialized C++ objects are easier to read, which helps, for example, during debugging.


11.2 Archive

The main concept of Boost.Serialization is the archive. An archive is a sequence of bytes representing serialized C++ objects. Objects can be added to an archive to serialize them and loaded from an archive to restore them. Restoring the same C++ objects that were previously saved requires the same data types to be used.

The following is a simple example.

#include <boost/archive/text_oarchive.hpp> 
#include <iostream> 

int main() 
{ 
  boost::archive::text_oarchive oa(std::cout); 
  int i = 1; 
  oa << i; 
} 

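Boost.Serialization provides multiple archive classes such as boost::archive::text_oarchive, defined in boost/archive/text_oarchive.hpp. boost::archive::text_oarchive serializes objects as a text stream. The above application writes 22 serialization::archive 5 1 to the standard output stream. The leading 22 is the length of the signature serialization::archive, the 5 is the archive format version, which depends on the Boost release and may therefore differ, and the final 1 is the value of i.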

As can be seen, the object oa of type boost::archive::text_oarchive can be used just like a stream to serialize a variable via <<. Nonetheless, archives should not be considered as regular streams storing arbitrary data. In order to restore the data at a later point, it is necessary to use the same data types in the same order they were previously saved. The following example serializes and restores the variable of type int.

#include <boost/archive/text_oarchive.hpp> 
#include <boost/archive/text_iarchive.hpp> 
#include <iostream> 
#include <fstream> 

void save() 
{ 
  std::ofstream file("archiv.txt"); 
  boost::archive::text_oarchive oa(file); 
  int i = 1; 
  oa << i; 
} 

void load() 
{ 
  std::ifstream file("archiv.txt"); 
  boost::archive::text_iarchive ia(file); 
  int i = 0; 
  ia >> i; 
  std::cout << i << std::endl; 
} 

int main() 
{ 
  save(); 
  load(); 
} 

While boost::archive::text_oarchive is used to serialize data as a text stream, boost::archive::text_iarchive is used to restore data from such a text stream. In order to use the class, the header file boost/archive/text_iarchive.hpp must be included.

Constructors of archives expect an input or output stream as the argument. The stream is used to serialize or to restore data, respectively. While the above application accesses a file, other streams such as a stringstream could be used instead.

#include <boost/archive/text_oarchive.hpp> 
#include <boost/archive/text_iarchive.hpp> 
#include <iostream> 
#include <sstream> 

std::stringstream ss; 

void save() 
{ 
  boost::archive::text_oarchive oa(ss); 
  int i = 1; 
  oa << i; 
} 

void load() 
{ 
  boost::archive::text_iarchive ia(ss); 
  int i = 0; 
  ia >> i; 
  std::cout << i << std::endl; 
} 

int main() 
{ 
  save(); 
  load(); 
} 

This application also writes 1 to the standard output stream. Unlike the previous one, however, it serializes the data to a stringstream instead of a file.

So far, only primitive data types have been serialized. The following example shows how to serialize objects of user-defined data types.

#include <boost/archive/text_oarchive.hpp> 
#include <boost/archive/text_iarchive.hpp> 
#include <iostream> 
#include <sstream> 

std::stringstream ss; 

class person 
{ 
public: 
  person() 
  { 
  } 

  person(int age) 
    : age_(age) 
  { 
  } 

  int age() const 
  { 
    return age_; 
  } 

private: 
  friend class boost::serialization::access; 

  template <typename Archive> 
  void serialize(Archive &ar, const unsigned int version) 
  { 
    ar & age_; 
  } 

  int age_; 
}; 

void save() 
{ 
  boost::archive::text_oarchive oa(ss); 
  person p(31); 
  oa << p; 
} 

void load() 
{ 
  boost::archive::text_iarchive ia(ss); 
  person p; 
  ia >> p; 
  std::cout << p.age() << std::endl; 
} 

int main() 
{ 
  save(); 
  load(); 
} 

To serialize objects of user-defined data types, a method named serialize() must be defined. It is called whenever the object is serialized to or restored from a byte stream. Since serialize() is used for both serializing and restoring, Boost.Serialization offers the operator & in addition to << and >>. With &, there is no need to distinguish between serializing and restoring within serialize().

serialize() is automatically called any time an object is serialized or restored. It should never be called explicitly and thus should be declared as private. In this case, the class boost::serialization::access must be declared as friend which allows Boost.Serialization to access the method.

There may be situations in which an existing class cannot be modified to add the serialize() method. This is true, for example, for classes from the C++ standard library or any other library.

#include <boost/archive/text_oarchive.hpp> 
#include <boost/archive/text_iarchive.hpp> 
#include <iostream> 
#include <sstream> 

std::stringstream ss; 

class person 
{ 
public: 
  person() 
  { 
  } 

  person(int age) 
    : age_(age) 
  { 
  } 

  int age() const 
  { 
    return age_; 
  } 

private: 
  friend class boost::serialization::access; 

  template <typename Archive> 
  friend void serialize(Archive &ar, person &p, const unsigned int version); 

  int age_; 
}; 

template <typename Archive> 
void serialize(Archive &ar, person &p, const unsigned int version) 
{ 
  ar & p.age_; 
} 

void save() 
{ 
  boost::archive::text_oarchive oa(ss); 
  person p(31); 
  oa << p; 
} 

void load() 
{ 
  boost::archive::text_iarchive ia(ss); 
  person p; 
  ia >> p; 
  std::cout << p.age() << std::endl; 
} 

int main() 
{ 
  save(); 
  load(); 
} 

In order to serialize data types that cannot be modified, the free-standing function serialize() can be defined as shown in the above example. The function expects a reference to an object of the corresponding data type as its second argument.

If the data type to be serialized contains private properties which cannot be accessed via public methods, things get more complicated. In this case, the data type may need to be modified. The serialize() function in the above application would not be able to access the age_ property without the friend declaration.

Fortunately, Boost.Serialization provides corresponding serialize() functions for many classes of the C++ standard. To serialize objects based on C++ standard classes, additional header files need to be included.

#include <boost/archive/text_oarchive.hpp> 
#include <boost/archive/text_iarchive.hpp> 
#include <boost/serialization/string.hpp> 
#include <iostream> 
#include <sstream> 
#include <string> 

std::stringstream ss; 

class person 
{ 
public: 
  person() 
  { 
  } 

  person(int age, const std::string &name) 
    : age_(age), name_(name) 
  { 
  } 

  int age() const 
  { 
    return age_; 
  } 

  std::string name() const 
  { 
    return name_; 
  } 

private: 
  friend class boost::serialization::access; 

  template <typename Archive> 
  friend void serialize(Archive &ar, person &p, const unsigned int version); 

  int age_; 
  std::string name_; 
}; 

template <typename Archive> 
void serialize(Archive &ar, person &p, const unsigned int version) 
{ 
  ar & p.age_; 
  ar & p.name_; 
} 

void save() 
{ 
  boost::archive::text_oarchive oa(ss); 
  person p(31, "Boris"); 
  oa << p; 
} 

void load() 
{ 
  boost::archive::text_iarchive ia(ss); 
  person p; 
  ia >> p; 
  std::cout << p.age() << std::endl; 
  std::cout << p.name() << std::endl; 
} 

int main() 
{ 
  save(); 
  load(); 
} 

The example extends the person class by a name of type std::string. In order to serialize this property, the header file boost/serialization/string.hpp must be included which offers the appropriate free-standing serialize() function.

As mentioned before, Boost.Serialization defines serialize() functions for many classes of the C++ standard. These are defined in header files that carry the same name as the corresponding header files from the C++ standard. In order to serialize objects of type std::string, the header file boost/serialization/string.hpp must be included. For serializing an object of type std::vector, the header file boost/serialization/vector.hpp must be used instead. It is therefore fairly obvious which header file must be included in any given scenario.

One argument of serialize(), which has been ignored so far, is version. This argument matters when archives must remain readable by future versions of an application; such archives are called forward compatible in this chapter. The next example treats archives of the person class as forward compatible. Since the original version of person did not contain a name, the new version of person should still be able to read old archives created without one.

#include <boost/archive/text_oarchive.hpp> 
#include <boost/archive/text_iarchive.hpp> 
#include <boost/serialization/string.hpp> 
#include <boost/serialization/version.hpp> 
#include <iostream> 
#include <sstream> 
#include <string> 

std::stringstream ss; 

class person 
{ 
public: 
  person() 
  { 
  } 

  person(int age, const std::string &name) 
    : age_(age), name_(name) 
  { 
  } 

  int age() const 
  { 
    return age_; 
  } 

  std::string name() const 
  { 
    return name_; 
  } 

private: 
  friend class boost::serialization::access; 

  template <typename Archive> 
  friend void serialize(Archive &ar, person &p, const unsigned int version); 

  int age_; 
  std::string name_; 
}; 

template <typename Archive> 
void serialize(Archive &ar, person &p, const unsigned int version) 
{ 
  ar & p.age_; 
  if (version > 0) 
    ar & p.name_; 
} 

BOOST_CLASS_VERSION(person, 1) 

void save() 
{ 
  boost::archive::text_oarchive oa(ss); 
  person p(31, "Boris"); 
  oa << p; 
} 

void load() 
{ 
  boost::archive::text_iarchive ia(ss); 
  person p; 
  ia >> p; 
  std::cout << p.age() << std::endl; 
  std::cout << p.name() << std::endl; 
} 

int main() 
{ 
  save(); 
  load(); 
} 

The macro BOOST_CLASS_VERSION assigns a version number to a class. The version number for the person class in the above application is set to 1. If BOOST_CLASS_VERSION is not used, the version number is 0 by default.

The version number is stored within the archive and thus is part of it. While the version number specified for a particular class via the BOOST_CLASS_VERSION macro is used during serialization, the version argument of serialize() is set to the value stored in the archive when the object is restored. If the new version of person reads an archive containing an object serialized with the old version, version is 0 and the name_ property is not restored, since the old version did not contain such a property. This mechanism is how Boost.Serialization supports forward compatible archives.


11.3 Pointers and references

Boost.Serialization can also serialize pointers and references. Since a pointer stores the address of an object, serializing the address certainly does not make much sense. While serializing pointers and references, the object referenced is automatically serialized instead.

#include <boost/archive/text_oarchive.hpp> 
#include <boost/archive/text_iarchive.hpp> 
#include <iostream> 
#include <sstream> 

std::stringstream ss; 

class person 
{ 
public: 
  person() 
  { 
  } 

  person(int age) 
    : age_(age) 
  { 
  } 

  int age() const 
  { 
    return age_; 
  } 

private: 
  friend class boost::serialization::access; 

  template <typename Archive> 
  void serialize(Archive &ar, const unsigned int version) 
  { 
    ar & age_; 
  } 

  int age_; 
}; 

void save() 
{ 
  boost::archive::text_oarchive oa(ss); 
  person *p = new person(31); 
  oa << p; 
  std::cout << std::hex << p << std::endl; 
  delete p; 
} 

void load() 
{ 
  boost::archive::text_iarchive ia(ss); 
  person *p; 
  ia >> p; 
  std::cout << std::hex << p << std::endl; 
  std::cout << std::dec << p->age() << std::endl; // std::hex is sticky, so switch back to decimal
  delete p; 
} 

int main() 
{ 
  save(); 
  load(); 
} 

The above application creates a new object of type person using new and assigns it to the pointer p. The pointer - not *p - is then serialized. Boost.Serialization automatically serializes the object itself as referenced by p and not the address of the object.

If the archive is restored, p does not necessarily refer to the same address. A new object is created and its address is assigned to p instead. Boost.Serialization only guarantees that the object is identical to the one serialized, not that its address is the same.

Since modern C++ uses smart pointers in connection with dynamically allocated memory, Boost.Serialization provides support accordingly.

#include <boost/archive/text_oarchive.hpp> 
#include <boost/archive/text_iarchive.hpp> 
#include <boost/serialization/scoped_ptr.hpp> 
#include <boost/scoped_ptr.hpp> 
#include <iostream> 
#include <sstream> 

std::stringstream ss; 

class person 
{ 
public: 
  person() 
  { 
  } 

  person(int age) 
    : age_(age) 
  { 
  } 

  int age() const 
  { 
    return age_; 
  } 

private: 
  friend class boost::serialization::access; 

  template <typename Archive> 
  void serialize(Archive &ar, const unsigned int version) 
  { 
    ar & age_; 
  } 

  int age_; 
}; 

void save() 
{ 
  boost::archive::text_oarchive oa(ss); 
  boost::scoped_ptr<person> p(new person(31)); 
  oa << p; 
} 

void load() 
{ 
  boost::archive::text_iarchive ia(ss); 
  boost::scoped_ptr<person> p; 
  ia >> p; 
  std::cout << p->age() << std::endl; 
} 

int main() 
{ 
  save(); 
  load(); 
} 

The example uses the smart pointer boost::scoped_ptr to manage a dynamically allocated object of type person. In order to serialize such a pointer, the header file boost/serialization/scoped_ptr.hpp must be included.

In case a smart pointer of type boost::shared_ptr should be serialized, the header file boost/serialization/shared_ptr.hpp must be used instead.

The following application now uses a reference in place of a pointer.

#include <boost/archive/text_oarchive.hpp> 
#include <boost/archive/text_iarchive.hpp> 
#include <iostream> 
#include <sstream> 

std::stringstream ss; 

class person 
{ 
public: 
  person() 
  { 
  } 

  person(int age) 
    : age_(age) 
  { 
  } 

  int age() const 
  { 
    return age_; 
  } 

private: 
  friend class boost::serialization::access; 

  template <typename Archive> 
  void serialize(Archive &ar, const unsigned int version) 
  { 
    ar & age_; 
  } 

  int age_; 
}; 

void save() 
{ 
  boost::archive::text_oarchive oa(ss); 
  person p(31); 
  person &pp = p; 
  oa << pp; 
} 

void load() 
{ 
  boost::archive::text_iarchive ia(ss); 
  person p; 
  person &pp = p; 
  ia >> pp; 
  std::cout << pp.age() << std::endl; 
} 

int main() 
{ 
  save(); 
  load(); 
} 

As shown, Boost.Serialization can also serialize references without any issue. Just like with pointers, the referenced object is serialized automatically.


11.4 Serialization of class hierarchy objects

To serialize objects based on class hierarchies, a derived class must call the boost::serialization::base_object() function inside its serialize() method. This function guarantees that inherited properties of base classes are correctly serialized as well. The following example shows a class named developer which is derived from person.

#include <boost/archive/text_oarchive.hpp> 
#include <boost/archive/text_iarchive.hpp> 
#include <boost/serialization/string.hpp> 
#include <iostream> 
#include <sstream> 
#include <string> 

std::stringstream ss; 

class person 
{ 
public: 
  person() 
  { 
  } 

  person(int age) 
    : age_(age) 
  { 
  } 

  int age() const 
  { 
    return age_; 
  } 

private: 
  friend class boost::serialization::access; 

  template <typename Archive> 
  void serialize(Archive &ar, const unsigned int version) 
  { 
    ar & age_; 
  } 

  int age_; 
}; 

class developer 
  : public person 
{ 
public: 
  developer() 
  { 
  } 

  developer(int age, const std::string &language) 
    : person(age), language_(language) 
  { 
  } 

  std::string language() const 
  { 
    return language_; 
  } 

private: 
  friend class boost::serialization::access; 

  template <typename Archive> 
  void serialize(Archive &ar, const unsigned int version) 
  { 
    ar & boost::serialization::base_object<person>(*this); 
    ar & language_; 
  } 

  std::string language_; 
}; 

void save() 
{ 
  boost::archive::text_oarchive oa(ss); 
  developer d(31, "C++"); 
  oa << d; 
} 

void load() 
{ 
  boost::archive::text_iarchive ia(ss); 
  developer d; 
  ia >> d; 
  std::cout << d.age() << std::endl; 
  std::cout << d.language() << std::endl; 
} 

int main() 
{ 
  save(); 
  load(); 
} 

Both person and developer contain a private serialize() method allowing objects based on either class to be serialized. Since developer is derived from person, the serialize() method must ensure that properties inherited from person are also serialized.

Inherited properties are serialized by accessing the base class inside the serialize() method of the child class using boost::serialization::base_object(). It is mandatory to use this function over e.g. static_cast since only boost::serialization::base_object() ensures correct serialization.

Addresses of objects dynamically created can be assigned to pointers of the corresponding base class type. The following example shows that Boost.Serialization still serializes them correctly.

#include <boost/archive/text_oarchive.hpp> 
#include <boost/archive/text_iarchive.hpp> 
#include <boost/serialization/string.hpp> 
#include <boost/serialization/export.hpp> 
#include <iostream> 
#include <sstream> 
#include <string> 

std::stringstream ss; 

class person 
{ 
public: 
  person() 
  { 
  } 

  person(int age) 
    : age_(age) 
  { 
  } 

  // Virtual destructor: developer objects are deleted through person*.
  virtual ~person() 
  { 
  } 

  virtual int age() const 
  { 
    return age_; 
  } 

private: 
  friend class boost::serialization::access; 

  template <typename Archive> 
  void serialize(Archive &ar, const unsigned int version) 
  { 
    ar & age_; 
  } 

  int age_; 
}; 

class developer 
  : public person 
{ 
public: 
  developer() 
  { 
  } 

  developer(int age, const std::string &language) 
    : person(age), language_(language) 
  { 
  } 

  std::string language() const 
  { 
    return language_; 
  } 

private: 
  friend class boost::serialization::access; 

  template <typename Archive> 
  void serialize(Archive &ar, const unsigned int version) 
  { 
    ar & boost::serialization::base_object<person>(*this); 
    ar & language_; 
  } 

  std::string language_; 
}; 

BOOST_CLASS_EXPORT(developer) 

void save() 
{ 
  boost::archive::text_oarchive oa(ss); 
  person *p = new developer(31, "C++"); 
  oa << p; 
  delete p; 
} 

void load() 
{ 
  boost::archive::text_iarchive ia(ss); 
  person *p; 
  ia >> p; 
  std::cout << p->age() << std::endl; 
  delete p; 
} 

int main() 
{ 
  save(); 
  load(); 
} 

The application creates an object of type developer inside the save() function and assigns it to a pointer of type person* which in turn is serialized via << accordingly.

As mentioned in the previous section, the referenced object is automatically serialized. In order to have Boost.Serialization recognize that an object of type developer must be serialized even though the pointer is of type person*, the class developer needs to be declared accordingly. This is done via the BOOST_CLASS_EXPORT macro defined in boost/serialization/export.hpp. Since the type developer does not appear in the pointer definition, Boost.Serialization would not be able to serialize an object of type developer correctly without the macro.

The macro BOOST_CLASS_EXPORT must be used if objects of child classes should be serialized via a pointer to their corresponding base class.

Because registration is static, one disadvantage of BOOST_CLASS_EXPORT is that it may register classes that are never serialized. Boost.Serialization offers a solution for exactly this scenario.

#include <boost/archive/text_oarchive.hpp> 
#include <boost/archive/text_iarchive.hpp> 
#include <boost/serialization/string.hpp> 
#include <boost/serialization/export.hpp> 
#include <iostream> 
#include <sstream> 
#include <string> 

std::stringstream ss; 

class person 
{ 
public: 
  person() 
  { 
  } 

  person(int age) 
    : age_(age) 
  { 
  } 

  // Virtual destructor: developer objects are deleted through person*.
  virtual ~person() 
  { 
  } 

  virtual int age() const 
  { 
    return age_; 
  } 

private: 
  friend class boost::serialization::access; 

  template <typename Archive> 
  void serialize(Archive &ar, const unsigned int version) 
  { 
    ar & age_; 
  } 

  int age_; 
}; 

class developer 
  : public person 
{ 
public: 
  developer() 
  { 
  } 

  developer(int age, const std::string &language) 
    : person(age), language_(language) 
  { 
  } 

  std::string language() const 
  { 
    return language_; 
  } 

private: 
  friend class boost::serialization::access; 

  template <typename Archive> 
  void serialize(Archive &ar, const unsigned int version) 
  { 
    ar & boost::serialization::base_object<person>(*this); 
    ar & language_; 
  } 

  std::string language_; 
}; 

void save() 
{ 
  boost::archive::text_oarchive oa(ss); 
  oa.register_type<developer>(); 
  person *p = new developer(31, "C++"); 
  oa << p; 
  delete p; 
} 

void load() 
{ 
  boost::archive::text_iarchive ia(ss); 
  ia.register_type<developer>(); 
  person *p; 
  ia >> p; 
  std::cout << p->age() << std::endl; 
  delete p; 
} 

int main() 
{ 
  save(); 
  load(); 
} 

Instead of using the BOOST_CLASS_EXPORT macro, the above application calls the template method register_type(). The type to be registered is passed as the template argument.

Please note that register_type() must be called both in save() and load().

The advantage of register_type() is that only classes used for serialization must be registered. While developing e.g. a library, one does not know which classes a developer may use for serialization later on. While the macro BOOST_CLASS_EXPORT certainly makes it easy, it may register types that are not going to be serialized.


11.5 Wrapper functions for optimization

Now that the basics of serializing objects are covered, this section introduces wrapper functions that optimize the serialization process. These functions mark objects in a way that allows Boost.Serialization to apply certain optimization techniques.

The following example uses Boost.Serialization without any wrapper function.

#include <boost/archive/text_oarchive.hpp> 
#include <boost/archive/text_iarchive.hpp> 
#include <boost/array.hpp> 
#include <iostream> 
#include <sstream> 

std::stringstream ss; 

void save() 
{ 
  boost::archive::text_oarchive oa(ss); 
  boost::array<int, 3> a = { 0, 1, 2 }; 
  oa << a; 
} 

void load() 
{ 
  boost::archive::text_iarchive ia(ss); 
  boost::array<int, 3> a; 
  ia >> a; 
  std::cout << a[0] << ", " << a[1] << ", " << a[2] << std::endl; 
} 

int main() 
{ 
  save(); 
  load(); 
} 

The above application serializes the array to the text stream 22 serialization::archive 5 0 0 3 0 1 2, which is held in the stringstream ss, and writes 0, 1, 2 to the standard output stream after restoring the array. Using the wrapper function boost::serialization::make_array(), the text stream can be shortened to 22 serialization::archive 5 0 1 2.

#include <boost/archive/text_oarchive.hpp> 
#include <boost/archive/text_iarchive.hpp> 
#include <boost/serialization/array.hpp> 
#include <boost/array.hpp> 
#include <iostream> 
#include <sstream> 

std::stringstream ss; 

void save() 
{ 
  boost::archive::text_oarchive oa(ss); 
  boost::array<int, 3> a = { 0, 1, 2 }; 
  oa << boost::serialization::make_array(a.data(), a.size()); 
} 

void load() 
{ 
  boost::archive::text_iarchive ia(ss); 
  boost::array<int, 3> a; 
  ia >> boost::serialization::make_array(a.data(), a.size()); 
  std::cout << a[0] << ", " << a[1] << ", " << a[2] << std::endl; 
} 

int main() 
{ 
  save(); 
  load(); 
} 

boost::serialization::make_array() expects the address and the length of an array. Since the length is hard-coded, it does not need to be serialized as part of the object of type boost::array. The function can be used whenever classes such as boost::array or std::vector contain an array that can be directly serialized. Additional properties that normally would be serialized are not serialized.

Another wrapper function provided by Boost.Serialization is boost::serialization::make_binary_object(). Similar to boost::serialization::make_array(), it expects an address and a length. boost::serialization::make_binary_object() is used solely for binary data without any underlying structure while boost::serialization::make_array() is used for arrays.


11.6 Exercises

  1. Develop an application that serializes an arbitrary number of employee records to a file and restores them. Each record consists of the name, the department and a unique identification number of an employee. Records should be displayed on the screen after restoring them. Use sample records to test the application.

  2. Extend the application by storing the birth date for each employee. The application should still be able to restore records serialized with the older version created in the previous exercise.