Protocol Buffers has first-class Python support: the protoc compiler generates Python classes from a .proto schema, and the protobuf runtime serializes them to a compact binary format. It's commonly used in data science, backend services, and microservices where efficient serialization matters.
This guide shows you everything you need to get started with Protocol Buffers in Python, from installation to writing your first working examples. If you're comparing serialization formats, check out our Protobuf vs JSON comparison.
What You'll Need
- Python: Version 3.7 or higher (download from python.org)
- pip: Python package manager (usually comes with Python)
- Protocol Buffer Compiler: We'll install this next
Step 1: Install Protocol Buffers
Install the Python protobuf library:
pip install protobuf
Next, install the Protocol Buffer compiler. On Windows:
# Download from https://github.com/protocolbuffers/protobuf/releases
# Extract and add to PATH, or use chocolatey:
choco install protoc
On macOS:
brew install protobuf
On Linux:
sudo apt install protobuf-compiler
Verify installation:
protoc --version # Should show: libprotoc 3.x.x or higher
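If you would rather run this check from Python (for example in a setup script), the following is a minimal sketch using only the standard library. The helper name find_protoc_version is hypothetical and not part of any protobuf API.

```python
import shutil
import subprocess
from typing import Optional


def find_protoc_version() -> Optional[str]:
    """Return the output of `protoc --version`, or None if protoc is not on PATH."""
    protoc_path = shutil.which("protoc")
    if protoc_path is None:
        return None
    result = subprocess.run(
        [protoc_path, "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    # An empty result is treated the same as "not found"
    return result.stdout.strip() or None


version = find_protoc_version()
if version is None:
    print("protoc not found on PATH")
else:
    print(f"Found {version}")
```

A None result usually means the compiler is not installed or its directory is missing from PATH.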
Step 2: Set Up Your Project
Create a simple project structure:
protobuf-python-example/
├── protos/
│   └── person.proto
├── generated/
│   └── (generated files go here)
└── main.py
You can create this manually or with these commands:
mkdir protobuf-python-example
cd protobuf-python-example
mkdir protos generated
touch main.py
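On systems without touch (such as a plain Windows command prompt), the same layout can be created from Python. This is a small sketch using pathlib; create_project is a hypothetical helper name.

```python
from pathlib import Path


def create_project(root: Path) -> None:
    """Create the tutorial's directory layout under `root`."""
    (root / "protos").mkdir(parents=True, exist_ok=True)
    (root / "generated").mkdir(parents=True, exist_ok=True)
    # touch() creates an empty main.py and leaves an existing one alone
    (root / "main.py").touch(exist_ok=True)
```

Calling create_project(Path("protobuf-python-example")) produces the same folders and empty main.py as the commands above, and it is safe to run more than once.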
Step 3: Create Your .proto File
Create protos/person.proto. Learn more about proto3 syntax in the official language guide:
syntax = "proto3";
package tutorial;
message Person {
  string name = 1;
  int64 id = 2;  // int64: the example IDs used later exceed the int32 maximum of 2147483647
  string email = 3;
  repeated string phone_numbers = 4;
  
  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }
  
  message PhoneNumber {
    string number = 1;
    PhoneType type = 2;
  }
  
  repeated PhoneNumber phones = 5;
  
  // Additional fields (plain proto3 scalars, so presence is not tracked)
  string address = 6;
  bool is_active = 7;
}
message AddressBook {
  repeated Person people = 1;
}
Key points:
- repeated: Creates a container that behaves like a Python list
- message: Becomes a Python class
- Numbers (1, 2, 3) are field tags, not values
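To see why the numbers are tags rather than values, it helps to look at how a field is written to the wire: each value is preceded by a key built from the field number and a wire type, and the field name never appears in the binary data. The sketch below hand-encodes one string field in pure Python for illustration only; the function names are made up for this example, and real code should always use the generated classes.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


VARINT = 0  # wire type for int32, int64, bool, enum
LEN = 2     # wire type for string, bytes, nested messages


def encode_key(field_number: int, wire_type: int) -> bytes:
    """Combine a field number and wire type into the key that precedes a value."""
    return encode_varint((field_number << 3) | wire_type)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Encode one string field: key, payload length, then the UTF-8 bytes."""
    payload = text.encode("utf-8")
    return encode_key(field_number, LEN) + encode_varint(len(payload)) + payload


# name = 1 in person.proto: key byte 0x0a, length 3, then the UTF-8 bytes of "Ada"
print(encode_string_field(1, "Ada").hex())  # 0a03416461
```

Because only the number is stored, renaming a field is safe for existing data, while changing its number is not.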
Step 4: Compile the .proto File
Run the protobuf compiler to generate Python code:
protoc -I=protos --python_out=generated protos/person.proto
This creates generated/person_pb2.py. This file contains all the classes you need.
Command breakdown:
- -I=protos: Import path, the directory where protoc looks for .proto files
- --python_out=generated: Output directory for the generated Python code
- protos/person.proto: Source file to compile
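If you want to run the compile step from a build script instead of typing it by hand, one possible approach is to call protoc through subprocess. This is a sketch under the assumption that protoc is on PATH; build_protoc_command and compile_proto are hypothetical helper names.

```python
import subprocess
from pathlib import Path
from typing import List


def build_protoc_command(proto_dir: str, out_dir: str, proto_file: str) -> List[str]:
    """Build the argument list for the protoc invocation shown above."""
    return [
        "protoc",
        f"-I={proto_dir}",
        f"--python_out={out_dir}",
        (Path(proto_dir) / proto_file).as_posix(),
    ]


def compile_proto(proto_dir: str = "protos", out_dir: str = "generated",
                  proto_file: str = "person.proto") -> None:
    """Run protoc; raises CalledProcessError if compilation fails."""
    # protoc expects the output directory to exist already
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_protoc_command(proto_dir, out_dir, proto_file), check=True)
```

Calling compile_proto() from the project root is equivalent to the command in this step.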
Step 5: Use Protocol Buffers in Python
Create main.py:
import sys
sys.path.append('generated')
import person_pb2
def create_person():
    """Create a Person message"""
    person = person_pb2.Person()
    person.name = "Telecom Subscriber"
    person.id = 917123456789
    person.email = "[email protected]"
    
    # Add simple phone numbers
    person.phone_numbers.append("+91-9876543210")
    person.phone_numbers.append("+91-9876543211")
    
    # Add structured phone numbers
    phone = person.phones.add()
    phone.number = "+91-9876543212"
    phone.type = person_pb2.Person.MOBILE
    
    # Optional fields
    person.address = "Cell Tower Sector A, Base Station 001"
    person.is_active = True
    
    return person
def serialize_example():
    """Serialize to binary"""
    person = create_person()
    
    print("Created Person:")
    print(f"Name: {person.name}")
    print(f"ID: {person.id}")
    print(f"Email: {person.email}")
    print(f"Phone numbers: {list(person.phone_numbers)}")
    
    # Serialize to bytes
    binary_data = person.SerializeToString()
    print(f"\nSerialized to {len(binary_data)} bytes")
    
    return binary_data
def deserialize_example(binary_data):
    """Deserialize from binary"""
    person = person_pb2.Person()
    person.ParseFromString(binary_data)
    
    print("\nDeserialized Person:")
    print(f"Name: {person.name}")
    print(f"ID: {person.id}")
    print(f"Email: {person.email}")
    print(f"Active: {person.is_active}")
    
    return person
def file_example():
    """Save to file and read back"""
    person = create_person()
    
    # Write to file
    with open("person.bin", "wb") as f:
        f.write(person.SerializeToString())
    print("\nSaved to person.bin")
    
    # Read from file
    person_from_file = person_pb2.Person()
    with open("person.bin", "rb") as f:
        person_from_file.ParseFromString(f.read())
    print(f"Read from file: {person_from_file.name}")
def address_book_example():
    """Work with multiple people"""
    address_book = person_pb2.AddressBook()
    
    # Add first person
    person1 = address_book.people.add()
    person1.name = "Telecom Subscriber"
    person1.id = 917123456789
    person1.email = "[email protected]"
    
    # Add second person
    person2 = address_book.people.add()
    person2.name = "Network Admin"
    person2.id = 919876543210
    person2.email = "[email protected]"
    
    print(f"\nAddress book has {len(address_book.people)} people")
    
    # Iterate through people
    for person in address_book.people:
        print(f"- {person.name} (ID: {person.id})")
if __name__ == "__main__":
    # Run examples
    binary_data = serialize_example()
    deserialize_example(binary_data)
    file_example()
    address_book_example()
Step 6: Run Your Application
Run the script:
python main.py
Expected output:
Created Person:
Name: Telecom Subscriber
ID: 917123456789
Email: [email protected]
Phone numbers: ['+91-9876543210', '+91-9876543211']

Serialized to 102 bytes

Deserialized Person:
Name: Telecom Subscriber
ID: 917123456789
Email: [email protected]
Active: True

Saved to person.bin
Read from file: Telecom Subscriber

Address book has 2 people
- Telecom Subscriber (ID: 917123456789)
- Network Admin (ID: 919876543210)
The exact byte count depends on the field values, so the number you see may differ.
Common Operations in Python
Check if Field is Set
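# HasField() only works for fields that track presence: message-typed fields,
# oneof members, and fields declared with the `optional` keyword.
# email is a plain proto3 string in person.proto, so HasField('email') raises
# ValueError; compare against the default value (empty string) instead.
if person.email:
    print(f"Email: {person.email}")
Clear a Field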
person.ClearField('email')
# Or clear entire message
person.Clear()
Copy a Message
person2 = person_pb2.Person()
person2.CopyFrom(person1)
Merge Messages
# Merge person2 into person1
person1.MergeFrom(person2)
Convert to JSON
from google.protobuf.json_format import MessageToJson
json_string = MessageToJson(person)
print(json_string)
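One detail that can surprise people: by default the JSON output uses lowerCamelCase names, so a field like phone_numbers comes out as phoneNumbers. The sketch below illustrates this with SourceContext, a small message that ships with the protobuf package, used here only as a stand-in so the example runs without generated code.

```python
from google.protobuf import source_context_pb2  # bundled with the protobuf package
from google.protobuf.json_format import MessageToDict, Parse

# SourceContext has a single field: string file_name = 1
ctx = source_context_pb2.SourceContext(file_name="person.proto")

# Default: field names are converted to lowerCamelCase
print(MessageToDict(ctx))  # {'fileName': 'person.proto'}

# Keep the names exactly as written in the .proto file
print(MessageToDict(ctx, preserving_proto_field_name=True))  # {'file_name': 'person.proto'}

# The parser accepts either spelling
from_camel = Parse('{"fileName": "a.proto"}', source_context_pb2.SourceContext())
from_snake = Parse('{"file_name": "a.proto"}', source_context_pb2.SourceContext())
print(from_camel == from_snake)  # True
```

MessageToJson accepts the same preserving_proto_field_name argument, which is worth setting if other services expect the snake_case names.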
Parse from JSON
from google.protobuf.json_format import Parse
json_str = '{"name": "Mobile User", "id": 919123456789}'
person = Parse(json_str, person_pb2.Person())
Print Debug String
# Great for debugging
print(person)
# Or build the text-format string explicitly:
from google.protobuf import text_format
print(text_format.MessageToString(person))
Working with Repeated Fields
Repeated fields work like Python lists:
# Add items one by one
person.phone_numbers.append("555-1111")
person.phone_numbers.append("555-2222")
# Extend with multiple items
numbers = ["555-3333", "555-4444"]
person.phone_numbers.extend(numbers)
# Get length
count = len(person.phone_numbers)
# Access by index
first_number = person.phone_numbers[0]
# Iterate
for number in person.phone_numbers:
    print(number)
# Clear all items
del person.phone_numbers[:]
# For nested messages, use add()
phone = person.phones.add()
phone.number = "555-7777"
phone.type = person_pb2.Person.HOME
Best Practices for Python
Always Check Field Presence
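HasField() reports whether a field was explicitly set, but it only works for fields that track presence: message-typed fields, oneof members, and fields declared with the optional keyword. For plain proto3 scalars it raises ValueError, so compare those against their default value instead.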
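The difference between the two kinds of field can be seen with Type, a proto3 message bundled with the protobuf package. It is used here purely as a stand-in so the sketch runs without generated code; the same rules apply to your own messages.

```python
from google.protobuf import type_pb2  # well-known proto3 messages bundled with protobuf

msg = type_pb2.Type()

# source_context is a message-typed field, so presence is tracked
print(msg.HasField("source_context"))  # False
msg.source_context.file_name = "person.proto"
print(msg.HasField("source_context"))  # True

# name is a plain proto3 string: HasField() refuses to answer
try:
    msg.HasField("name")
except ValueError as exc:
    print(f"ValueError: {exc}")

# For such fields, compare against the default value instead
name_is_set = msg.name != ""
print(name_is_set)  # False
```

If you need to distinguish "unset" from "set to the default" for a scalar, declaring it optional in the .proto file is the usual way to get presence tracking.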
Use Binary Mode for Files
Always open files in binary mode ('wb' or 'rb') when working with protobuf data.
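A quick way to see why this matters: serialized messages are raw bytes, and Python's text mode will not accept them. The sketch below uses a hard-coded byte string as a stand-in for the result of SerializeToString(), so it runs without any generated code.

```python
import os
import tempfile

# Stand-in for person.SerializeToString(); real output is arbitrary bytes like these
payload = b"\x0a\x03Ada\x10\x96\x01"

path = os.path.join(tempfile.mkdtemp(), "person.bin")

with open(path, "wb") as f:  # binary write
    f.write(payload)

with open(path, "rb") as f:  # binary read
    restored = f.read()

print(restored == payload)  # True

# Text mode rejects bytes outright
try:
    with open(path, "w") as f:
        f.write(payload)
except TypeError as exc:
    print(f"TypeError: {exc}")
```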
Handle Parse Errors
Wrap ParseFromString() in try-except to handle corrupted data gracefully. It raises google.protobuf.message.DecodeError when the bytes are not a valid message.
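As a rough sketch of that pattern, the example below parses a Timestamp, a message bundled with the protobuf package that stands in for Person here. The helper name parse_timestamp is hypothetical.

```python
from typing import Optional

from google.protobuf import timestamp_pb2  # stand-in message bundled with protobuf
from google.protobuf.message import DecodeError


def parse_timestamp(data: bytes) -> Optional[timestamp_pb2.Timestamp]:
    """Return the parsed message, or None if `data` cannot be decoded."""
    message = timestamp_pb2.Timestamp()
    try:
        message.ParseFromString(data)
    except DecodeError as exc:
        print(f"Could not decode message: {exc}")
        return None
    return message


good = timestamp_pb2.Timestamp(seconds=150).SerializeToString()
truncated = good[:-1]  # drop the last byte to simulate a cut-off file

print(parse_timestamp(good) is not None)   # True
print(parse_timestamp(truncated) is None)  # True
```

Keep in mind that not every kind of corruption is detectable: bytes that happen to form a valid encoding will parse without an error, so a DecodeError check complements validation of the parsed values rather than replacing it.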
Don't Edit Generated Files
Never modify *_pb2.py files. They're auto-generated and will be overwritten.
Using Virtual Environments
It's good practice to use a virtual environment:
# Create virtual environment
python -m venv venv
# Activate (Windows)
venv\Scripts\activate
# Activate (macOS/Linux)
source venv/bin/activate
# Install protobuf
pip install protobuf
# Create requirements.txt
pip freeze > requirements.txt
# Later, install from requirements
pip install -r requirements.txt
Using Protobuf with Web Frameworks
Here's a simple Flask example:
from flask import Flask, request, Response
import person_pb2
app = Flask(__name__)
@app.route('/api/person', methods=['POST'])
def create_person():
    # Parse binary protobuf from request
    person = person_pb2.Person()
    person.ParseFromString(request.data)
    
    # Process the person...
    print(f"Received: {person.name}")
    
    # Return binary protobuf
    return Response(
        person.SerializeToString(),
        mimetype='application/x-protobuf'
    )
@app.route('/api/person/json', methods=['POST'])
def create_person_json():
    # Accept JSON, return protobuf
    from google.protobuf.json_format import Parse
    
    person = Parse(request.data, person_pb2.Person())
    return Response(
        person.SerializeToString(),
        mimetype='application/x-protobuf'
    )
Common Issues
Issue: Cannot import person_pb2
Solution: Make sure you've compiled the .proto file and the generated folder is in your Python path. Use sys.path.append('generated') at the top of your script.
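Note that sys.path.append('generated') is resolved relative to the current working directory, so the import can still fail when the script is launched from another folder. One way around this, sketched below with a hypothetical helper named add_generated_to_path, is to build the path from the script's own location.

```python
import sys
from pathlib import Path


def add_generated_to_path(script_file: str, folder: str = "generated") -> Path:
    """Put <script directory>/<folder> on sys.path and return the resolved path."""
    generated_dir = Path(script_file).resolve().parent / folder
    entry = str(generated_dir)
    if entry not in sys.path:
        sys.path.insert(0, entry)
    return generated_dir


# In main.py this would be used as:
#   add_generated_to_path(__file__)
#   import person_pb2
```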
Issue: protoc command not found
Solution: Install the Protocol Buffer compiler using your system's package manager, or download from the official GitHub releases page.
Issue: TypeError with repeated fields
Solution: For nested messages, use add() instead of append(). Example: person.phones.add()
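Scalar repeated fields are type-checked too, and a repeated field cannot be replaced by plain assignment. The sketch below shows both errors using FieldMask, a message bundled with the protobuf package whose paths field is a repeated string; it is only a stand-in for phone_numbers so the example runs without generated code.

```python
from google.protobuf import field_mask_pb2  # bundled message with `repeated string paths`

mask = field_mask_pb2.FieldMask()
mask.paths.append("name")
mask.paths.extend(["email", "phones"])
print(list(mask.paths))  # ['name', 'email', 'phones']

# Elements are type-checked against the schema
try:
    mask.paths.append(42)
except TypeError as exc:
    print(f"TypeError: {exc}")

# A repeated field cannot be assigned directly; modify it in place
try:
    mask.paths = ["name"]
except AttributeError as exc:
    print(f"AttributeError: {exc}")

del mask.paths[:]            # clear the existing items
mask.paths.extend(["name"])  # then refill
print(list(mask.paths))  # ['name']
```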
Issue: DecodeError when reading files
Solution: Ensure you're opening files in binary mode ('rb'). Also check that the .proto schema matches the serialized data.
Additional Resources
Official Documentation & References
- Official Python Protobuf Tutorial - Google's official Python guide
- Python Protobuf on GitHub - Source code and examples
- Protobuf on PyPI - Python package repository
Conclusion
Protocol Buffers fits naturally into Python's ecosystem. The generated code is clean and the API is straightforward. Python's dynamic nature makes working with protobuf messages intuitive and easy.
Start with simple examples like we covered here, then expand to more complex use cases. Whether you're building microservices, data pipelines, or API clients, protobuf provides efficient serialization that scales well.