Avro Schema Examples - Ready-to-Use Templates

Practical Avro schemas you can copy and customize for your projects

January 2026 · 11 min read

Introduction

Setting up Kafka topics or building Hadoop pipelines? You need good schema examples to get started fast.

Here are battle-tested schemas from real production systems. Copy them straight into your Schema Registry or tweak them for your needs.

New to Avro Schemas?

Read the schema basics guide first if you're not familiar with types and structures.

Example 1: User Profile

Classic user profile with required and optional fields. Works for user services, auth systems, CRM apps.

{
  "type": "record",
  "name": "UserProfile",
  "namespace": "com.example.users",
  "doc": "User profile information",
  "fields": [
    {
      "name": "userId",
      "type": "long",
      "doc": "Unique user identifier"
    },
    {
      "name": "username",
      "type": "string",
      "doc": "Login username"
    },
    {
      "name": "email",
      "type": "string"
    },
    {
      "name": "firstName",
      "type": "string"
    },
    {
      "name": "lastName",
      "type": "string"
    },
    {
      "name": "phoneNumber",
      "type": ["null", "string"],
      "default": null,
      "doc": "Optional phone"
    },
    {
      "name": "dateOfBirth",
      "type": ["null", {"type": "int", "logicalType": "date"}],
      "default": null
    },
    {
      "name": "createdAt",
      "type": {"type": "long", "logicalType": "timestamp-millis"}
    },
    {
      "name": "isActive",
      "type": "boolean",
      "default": true
    }
  ]
}

Why This Works

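Required fields (userId, email) have no default, so every record must supply them, which keeps data clean. Optional fields (phoneNumber, dateOfBirth) are unions with "null" listed first and a null default; that is the pattern that lets you add such fields later without breaking existing readers. The timestamp-millis logical type stores createdAt as milliseconds since the Unix epoch, so every consumer interprets it the same way.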
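The required/optional split described above can be checked mechanically. Here is a minimal sketch using only the Python standard library; the helper names `classify_fields` and `bad_null_defaults` are made up for illustration, and the schema is a trimmed copy of UserProfile. It relies on the Avro rule that a union field's default must match the first branch of the union, which is why "null" comes first.

```python
import json

# Trimmed copy of the UserProfile schema above (only a few fields kept).
SCHEMA = json.loads("""
{
  "type": "record",
  "name": "UserProfile",
  "fields": [
    {"name": "userId", "type": "long"},
    {"name": "email", "type": "string"},
    {"name": "phoneNumber", "type": ["null", "string"], "default": null},
    {"name": "isActive", "type": "boolean", "default": true}
  ]
}
""")

def classify_fields(schema):
    """Split field names into (required, defaulted) based on the 'default' key."""
    required, defaulted = [], []
    for field in schema["fields"]:
        (defaulted if "default" in field else required).append(field["name"])
    return required, defaulted

def bad_null_defaults(schema):
    """Return union fields with a null default whose first branch is not "null"."""
    bad = []
    for field in schema["fields"]:
        ftype = field["type"]
        has_null_default = field.get("default", "missing") is None
        if isinstance(ftype, list) and has_null_default and ftype[0] != "null":
            bad.append(field["name"])
    return bad

required, defaulted = classify_fields(SCHEMA)
print(required)                    # ['userId', 'email']
print(defaulted)                   # ['phoneNumber', 'isActive']
print(bad_null_defaults(SCHEMA))   # []
```

A field declared as ["string", "null"] with a null default would show up in the last list.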

Example 2: E-Commerce Order

Complete order schema with line items, addresses, payment info. Standard pattern for e-commerce and order management.

{
  "type": "record",
  "name": "Order",
  "namespace": "com.example.orders",
  "fields": [
    {
      "name": "orderId",
      "type": "string",
      "doc": "UUID format"
    },
    {
      "name": "customerId",
      "type": "long"
    },
    {
      "name": "orderDate",
      "type": {"type": "long", "logicalType": "timestamp-millis"}
    },
    {
      "name": "status",
      "type": {
        "type": "enum",
        "name": "OrderStatus",
        "symbols": [
          "PENDING",
          "CONFIRMED",
          "PROCESSING",
          "SHIPPED",
          "DELIVERED",
          "CANCELLED",
          "REFUNDED"
        ]
      }
    },
    {
      "name": "items",
      "type": {
        "type": "array",
        "items": {
          "type": "record",
          "name": "OrderItem",
          "fields": [
            {"name": "productId", "type": "string"},
            {"name": "productName", "type": "string"},
            {"name": "quantity", "type": "int"},
            {"name": "unitPrice", "type": "double"},
            {"name": "discount", "type": ["null", "double"], "default": null}
          ]
        }
      }
    },
    {
      "name": "shippingAddress",
      "type": {
        "type": "record",
        "name": "Address",
        "fields": [
          {"name": "street", "type": "string"},
          {"name": "city", "type": "string"},
          {"name": "state", "type": "string"},
          {"name": "zipCode", "type": "string"},
          {"name": "country", "type": "string"}
        ]
      }
    },
    {
      "name": "subtotal",
      "type": "double"
    },
    {
      "name": "tax",
      "type": "double"
    },
    {
      "name": "shippingCost",
      "type": "double"
    },
    {
      "name": "total",
      "type": "double"
    }
  ]
}

In Production

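You'll see this pattern in most online stores: an order holds an array of items, and each item carries its own price and quantity. The status enum restricts values to a fixed set, so a typo like "SHIPED" fails when the record is serialized instead of reaching consumers. The monetary fields use double here to keep the example short; for real money amounts, prefer the decimal logical type shown in Example 6.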
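If you want to catch bad orders before they reach a serializer, a small pre-check along these lines can help. This is only a sketch: `order_problems` is a hypothetical helper, and the "at least one item" and "positive quantity" rules are business assumptions, not something the Avro schema itself enforces (an empty items array is valid Avro).

```python
# Symbols copied from the OrderStatus enum in the schema above.
ORDER_STATUSES = {
    "PENDING", "CONFIRMED", "PROCESSING", "SHIPPED",
    "DELIVERED", "CANCELLED", "REFUNDED",
}

def order_problems(order):
    """Return a list of human-readable problems found in an order dict."""
    problems = []
    status = order.get("status")
    if status not in ORDER_STATUSES:
        problems.append(f"unknown status: {status!r}")
    items = order.get("items", [])
    if not items:
        problems.append("order has no items")  # assumed business rule
    for item in items:
        if item["quantity"] <= 0:               # assumed business rule
            problems.append(f"non-positive quantity for {item['productId']}")
    return problems

order = {
    "status": "SHIPED",  # typo on purpose
    "items": [{"productId": "p-1", "quantity": 2}],
}
print(order_problems(order))  # ["unknown status: 'SHIPED'"]
```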

Example 3: Kafka Event Message

Generic event schema for event-driven architectures. Includes metadata, payload, tracking info.

{
  "type": "record",
  "name": "KafkaEvent",
  "namespace": "com.example.events",
  "doc": "Generic event message",
  "fields": [
    {
      "name": "eventId",
      "type": "string",
      "doc": "UUID"
    },
    {
      "name": "eventType",
      "type": "string",
      "doc": "e.g., USER_CREATED, ORDER_PLACED"
    },
    {
      "name": "eventVersion",
      "type": "string",
      "default": "1.0"
    },
    {
      "name": "timestamp",
      "type": {"type": "long", "logicalType": "timestamp-millis"}
    },
    {
      "name": "source",
      "type": "string",
      "doc": "Service that generated this"
    },
    {
      "name": "userId",
      "type": ["null", "string"],
      "default": null
    },
    {
      "name": "payload",
      "type": "string",
      "doc": "JSON-encoded event data"
    },
    {
      "name": "metadata",
      "type": {
        "type": "map",
        "values": "string"
      },
      "default": {}
    },
    {
      "name": "correlationId",
      "type": ["null", "string"],
      "default": null,
      "doc": "For tracing related events"
    }
  ]
}

Event Pattern

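This gives you a consistent envelope for all events. The payload is a JSON-encoded string, so one schema covers every event type, while the metadata map stays structured. The trade-off is that Avro does not validate what is inside the payload; consumers have to parse and check it themselves. The correlationId lets you trace related events across microservices.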
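To make the envelope idea concrete, here is a rough sketch of a producer-side helper that fills in the KafkaEvent fields. The function name `make_event` and the sample service names are invented for illustration; the result is a plain dict that you would hand to whichever Avro serializer you use.

```python
import json
import time
import uuid

def make_event(event_type, source, payload, user_id=None,
               correlation_id=None, metadata=None):
    """Build a dict shaped like the KafkaEvent record above."""
    return {
        "eventId": str(uuid.uuid4()),
        "eventType": event_type,
        "eventVersion": "1.0",
        "timestamp": int(time.time() * 1000),  # timestamp-millis
        "source": source,
        "userId": user_id,
        "payload": json.dumps(payload, separators=(",", ":")),  # JSON string
        "metadata": dict(metadata or {}),
        "correlationId": correlation_id,
    }

# Two events from different (hypothetical) services share one correlationId.
correlation = str(uuid.uuid4())
placed = make_event("ORDER_PLACED", "order-service",
                    {"orderId": "o-1", "total": "19.99"},
                    correlation_id=correlation)
charged = make_event("PAYMENT_CAPTURED", "payment-service",
                     {"orderId": "o-1"}, correlation_id=correlation)

print(placed["payload"])  # {"orderId":"o-1","total":"19.99"}
print(placed["correlationId"] == charged["correlationId"])  # True
```

Consumers decode the payload with `json.loads` and should validate it themselves, since the schema only guarantees that it is a string.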

Example 4: IoT Sensor Data

Time-series sensor data for IoT apps, industrial monitoring, smart home systems. Optimized for high-volume streaming.

{
  "type": "record",
  "name": "SensorReading",
  "namespace": "com.example.iot",
  "fields": [
    {
      "name": "sensorId",
      "type": "string"
    },
    {
      "name": "timestamp",
      "type": {"type": "long", "logicalType": "timestamp-millis"}
    },
    {
      "name": "sensorType",
      "type": {
        "type": "enum",
        "name": "SensorType",
        "symbols": [
          "TEMPERATURE",
          "HUMIDITY",
          "PRESSURE",
          "MOTION",
          "LIGHT",
          "SOUND"
        ]
      }
    },
    {
      "name": "value",
      "type": "double"
    },
    {
      "name": "unit",
      "type": "string",
      "doc": "celsius, percentage, pascals, etc."
    },
    {
      "name": "location",
      "type": ["null", {
        "type": "record",
        "name": "GeoLocation",
        "fields": [
          {"name": "latitude", "type": "double"},
          {"name": "longitude", "type": "double"},
          {"name": "altitude", "type": ["null", "double"], "default": null}
        ]
      }],
      "default": null
    },
    {
      "name": "batteryLevel",
      "type": ["null", "int"],
      "default": null,
      "doc": "0-100 percentage"
    },
    {
      "name": "quality",
      "type": {
        "type": "enum",
        "name": "DataQuality",
        "symbols": ["GOOD", "QUESTIONABLE", "BAD"],
        "default": "GOOD"
      }
    }
  ]
}

High-Volume Design

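This design suits streams of millions of readings per day. Required fields like sensorId and timestamp are written without field names or tags in Avro's binary encoding, which keeps messages compact. Optional location and batteryLevel cost a single byte each when null, so the same schema works for both stationary and mobile sensors without wasting space.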
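To see where the compactness comes from, it helps to look at how Avro's binary encoding writes integers: zigzag encoding followed by a variable-length (base-128) representation. The sketch below reimplements that in plain Python for illustration only; `encode_long` and `encode_string` are hypothetical names, not part of any Avro library, and real code should use an Avro library instead.

```python
def encode_long(n):
    """Encode an int the way Avro encodes int/long: zigzag, then varint."""
    z = (n << 1) ^ (n >> 63)  # zigzag: small magnitudes become small values
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)

def encode_string(s):
    """Avro strings are a long byte-length followed by UTF-8 bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data

print(encode_long(0))    # b'\x00'  (also the union index for a null branch)
print(encode_long(-1))   # b'\x01'
print(encode_long(1))    # b'\x02'
print(encode_long(64))   # b'\x80\x01'

# A timestamp-millis value for 2026-01-01T00:00:00Z fits in 6 bytes.
print(len(encode_long(1767225600000)))    # 6
# A 9-character sensor id takes 1 length byte + 9 data bytes.
print(len(encode_string("sensor-42")))    # 10
```

A null value itself takes zero bytes, so an unset optional field costs only the one-byte union index.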

Example 5: Application Logs

Log schema for centralized logging, observability platforms, security monitoring. Works with log aggregation tools.

{
  "type": "record",
  "name": "LogEntry",
  "namespace": "com.example.logging",
  "fields": [
    {
      "name": "timestamp",
      "type": {"type": "long", "logicalType": "timestamp-micros"}
    },
    {
      "name": "level",
      "type": {
        "type": "enum",
        "name": "LogLevel",
        "symbols": ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]
      }
    },
    {
      "name": "service",
      "type": "string"
    },
    {
      "name": "hostname",
      "type": "string"
    },
    {
      "name": "message",
      "type": "string"
    },
    {
      "name": "logger",
      "type": "string"
    },
    {
      "name": "thread",
      "type": ["null", "string"],
      "default": null
    },
    {
      "name": "exception",
      "type": ["null", {
        "type": "record",
        "name": "ExceptionInfo",
        "fields": [
          {"name": "type", "type": "string"},
          {"name": "message", "type": "string"},
          {"name": "stackTrace", "type": "string"}
        ]
      }],
      "default": null
    },
    {
      "name": "tags",
      "type": {
        "type": "map",
        "values": "string"
      },
      "default": {}
    },
    {
      "name": "traceId",
      "type": ["null", "string"],
      "default": null,
      "doc": "Distributed tracing ID"
    },
    {
      "name": "spanId",
      "type": ["null", "string"],
      "default": null
    }
  ]
}

Production Logging

This schema supports modern observability setups. The LogLevel enum keeps severity values consistent across services. The optional exception record captures error details without bloating normal log entries. The traceId and spanId fields can carry OpenTelemetry trace and span identifiers for distributed tracing.
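As a rough illustration of how an application might populate this record, the sketch below turns a Python exception into the ExceptionInfo shape and builds a LogEntry dict. The helper names and the sample service and host values are assumptions made for the example; only the field names come from the schema above.

```python
import time
import traceback

def exception_info(exc):
    """Map a Python exception onto the ExceptionInfo record."""
    return {
        "type": type(exc).__name__,
        "message": str(exc),
        "stackTrace": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
    }

def make_log_entry(level, service, hostname, logger, message,
                   exc=None, trace_id=None, span_id=None):
    """Build a dict shaped like the LogEntry record."""
    return {
        "timestamp": time.time_ns() // 1_000,  # timestamp-micros
        "level": level,
        "service": service,
        "hostname": hostname,
        "message": message,
        "logger": logger,
        "thread": None,
        "exception": exception_info(exc) if exc is not None else None,
        "tags": {},
        "traceId": trace_id,
        "spanId": span_id,
    }

try:
    1 / 0
except ZeroDivisionError as err:
    entry = make_log_entry("ERROR", "billing", "host-1", "billing.invoice",
                           "invoice calculation failed", exc=err)

print(entry["exception"]["type"])     # ZeroDivisionError
print(entry["exception"]["message"])  # division by zero
```

Entries without an exception keep the field as null, so ordinary log lines stay small.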

Example 6: Financial Transaction

Banking or payment transaction with precise decimal handling. Critical for financial systems where accuracy matters.

{
  "type": "record",
  "name": "Transaction",
  "namespace": "com.example.finance",
  "fields": [
    {
      "name": "transactionId",
      "type": "string"
    },
    {
      "name": "timestamp",
      "type": {"type": "long", "logicalType": "timestamp-millis"}
    },
    {
      "name": "type",
      "type": {
        "type": "enum",
        "name": "TransactionType",
        "symbols": [
          "DEPOSIT",
          "WITHDRAWAL",
          "TRANSFER",
          "PAYMENT",
          "REFUND",
          "FEE"
        ]
      }
    },
    {
      "name": "amount",
      "type": {
        "type": "bytes",
        "logicalType": "decimal",
        "precision": 19,
        "scale": 4
      },
      "doc": "Precise to 4 decimal places"
    },
    {
      "name": "currency",
      "type": "string",
      "doc": "ISO 4217 code (USD, EUR, GBP)"
    },
    {
      "name": "fromAccount",
      "type": "string"
    },
    {
      "name": "toAccount",
      "type": ["null", "string"],
      "default": null
    },
    {
      "name": "status",
      "type": {
        "type": "enum",
        "name": "TransactionStatus",
        "symbols": ["PENDING", "COMPLETED", "FAILED", "REVERSED"]
      }
    },
    {
      "name": "description",
      "type": ["null", "string"],
      "default": null
    },
    {
      "name": "metadata",
      "type": {
        "type": "map",
        "values": "string"
      },
      "default": {}
    }
  ]
}

Critical: Use Decimal for Money

Never use float or double for money! The decimal logical type stores an exact unscaled integer plus a fixed scale, so values are not subject to binary floating-point rounding errors. With precision=19 and scale=4, you can represent amounts up to 999,999,999,999,999.9999 exactly.
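The sketch below shows both halves of that argument in plain Python: the floating-point error itself, and how a decimal value maps to the bytes that Avro's decimal logical type stores (the two's-complement, big-endian form of the unscaled integer). The function names are illustrative, and rejecting values that would need rounding is a design choice of this sketch rather than an Avro requirement; in practice your Avro library handles this conversion.

```python
from decimal import Decimal

PRECISION, SCALE = 19, 4  # from the Transaction schema above

def encode_decimal(value, precision=PRECISION, scale=SCALE):
    """Convert a Decimal to big-endian two's-complement unscaled bytes."""
    quantized = value.quantize(Decimal(1).scaleb(-scale))
    if quantized != value:
        raise ValueError(f"{value} needs more than {scale} decimal places")
    if len(quantized.as_tuple().digits) > precision:
        raise ValueError(f"{value} exceeds precision {precision}")
    unscaled = int(quantized.scaleb(scale))       # 12.3456 -> 123456
    magnitude = unscaled if unscaled >= 0 else ~unscaled
    length = magnitude.bit_length() // 8 + 1      # room for the sign bit
    return unscaled.to_bytes(length, "big", signed=True)

def decode_decimal(data, scale=SCALE):
    """Inverse of encode_decimal."""
    unscaled = int.from_bytes(data, "big", signed=True)
    return Decimal(unscaled).scaleb(-scale)

# Why double is a problem for money:
print(0.1 + 0.2 == 0.3)                                   # False
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True

encoded = encode_decimal(Decimal("12.3456"))
print(encoded.hex())            # 01e240
print(decode_decimal(encoded))  # 12.3456
```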

Example 7: Product Catalog

Product catalog with categories, variants, inventory. For e-commerce and inventory management.

{
  "type": "record",
  "name": "Product",
  "namespace": "com.example.catalog",
  "fields": [
    {
      "name": "productId",
      "type": "string"
    },
    {
      "name": "sku",
      "type": "string"
    },
    {
      "name": "name",
      "type": "string"
    },
    {
      "name": "description",
      "type": "string"
    },
    {
      "name": "category",
      "type": {
        "type": "array",
        "items": "string"
      },
      "doc": "Hierarchy like ['Electronics', 'Computers', 'Laptops']"
    },
    {
      "name": "brand",
      "type": "string"
    },
    {
      "name": "price",
      "type": "double"
    },
    {
      "name": "currency",
      "type": "string",
      "default": "USD"
    },
    {
      "name": "variants",
      "type": {
        "type": "array",
        "items": {
          "type": "record",
          "name": "ProductVariant",
          "fields": [
            {"name": "variantId", "type": "string"},
            {"name": "size", "type": ["null", "string"], "default": null},
            {"name": "color", "type": ["null", "string"], "default": null},
            {"name": "sku", "type": "string"},
            {"name": "price", "type": "double"},
            {"name": "stockQuantity", "type": "int"}
          ]
        }
      },
      "default": []
    },
    {
      "name": "images",
      "type": {
        "type": "array",
        "items": "string"
      },
      "default": [],
      "doc": "Image URLs"
    },
    {
      "name": "specifications",
      "type": {
        "type": "map",
        "values": "string"
      },
      "default": {}
    },
    {
      "name": "isActive",
      "type": "boolean",
      "default": true
    },
    {
      "name": "createdAt",
      "type": {"type": "long", "logicalType": "timestamp-millis"}
    },
    {
      "name": "updatedAt",
      "type": {"type": "long", "logicalType": "timestamp-millis"}
    }
  ]
}

Handling Variations

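This handles products with multiple variants (shirts in different sizes and colors, for example). The variants array keeps related items together while tracking stock separately for each variant. The category array supports multi-level classification. As in Example 2, price uses double for brevity; prefer the decimal logical type from Example 6 for real prices.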
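On the consumer side, per-variant inventory is easy to work with once a record is decoded into a dict. The following is a small sketch with made-up sample data and hypothetical helper names (`total_stock`, `find_variants`); only the field names come from the schema above.

```python
# Sample decoded Product record, reduced to the fields used below.
product = {
    "productId": "p-100",
    "name": "Basic T-Shirt",
    "variants": [
        {"variantId": "v-1", "size": "M", "color": "blue",
         "sku": "TS-M-BLU", "stockQuantity": 12},
        {"variantId": "v-2", "size": "L", "color": "blue",
         "sku": "TS-L-BLU", "stockQuantity": 0},
        {"variantId": "v-3", "size": "M", "color": "red",
         "sku": "TS-M-RED", "stockQuantity": 5},
    ],
}

def total_stock(product):
    """Sum stock across all variants of a product."""
    return sum(v["stockQuantity"] for v in product["variants"])

def find_variants(product, size=None, color=None, in_stock_only=False):
    """Filter variants; None means 'any' for size and color."""
    result = []
    for v in product["variants"]:
        if size is not None and v["size"] != size:
            continue
        if color is not None and v["color"] != color:
            continue
        if in_stock_only and v["stockQuantity"] <= 0:
            continue
        result.append(v["sku"])
    return result

print(total_stock(product))                                        # 17
print(find_variants(product, color="blue"))                        # ['TS-M-BLU', 'TS-L-BLU']
print(find_variants(product, color="blue", in_stock_only=True))    # ['TS-M-BLU']
```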

Schema Design Tips

Things I've learned from these examples:

Always Include:

  • Namespace (avoids conflicts in Schema Registry)
  • Doc strings (helps future you and teammates)
  • Timestamps (always useful for debugging)
  • Defaults for optional fields
  • Enums for fixed value sets

Avoid:

  • Float/double for money (use decimal)
  • Adding required fields (no default) to an existing schema
  • Generic names like "data" or "info"
  • Too much nesting (3 levels max)
  • Reordering fields when readers decode without the writer's schema (binary encoding follows field order)
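The point about defaults can be turned into a quick pre-flight check. When a reader uses a newer schema to decode data written with an older one, any field missing from the old data must have a default, otherwise schema resolution fails. The sketch below is a simplified illustration for top-level record fields only; `fields_missing_defaults` is a hypothetical helper, not a replacement for your Schema Registry's compatibility checks.

```python
OLD_SCHEMA = {
    "type": "record",
    "name": "UserProfile",
    "fields": [
        {"name": "userId", "type": "long"},
        {"name": "email", "type": "string"},
    ],
}

NEW_SCHEMA = {
    "type": "record",
    "name": "UserProfile",
    "fields": [
        {"name": "userId", "type": "long"},
        {"name": "email", "type": "string"},
        # Safe: old records get null for this field.
        {"name": "phoneNumber", "type": ["null", "string"], "default": None},
        # Unsafe: old records have no value and there is no default.
        {"name": "country", "type": "string"},
    ],
}

def fields_missing_defaults(old_schema, new_schema):
    """Names of fields added in new_schema that have no default value."""
    old_names = {f["name"] for f in old_schema["fields"]}
    return [
        f["name"]
        for f in new_schema["fields"]
        if f["name"] not in old_names and "default" not in f
    ]

print(fields_missing_defaults(OLD_SCHEMA, NEW_SCHEMA))  # ['country']
```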

Test These Schemas

Copy any schema above and test it before you register it.
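One low-effort way to sanity-check a copied schema is a small lint pass before handing it to a real Avro parser. The sketch below uses only the standard library and checks three things: the text is valid JSON, names follow Avro's naming rule (a letter or underscore, then letters, digits, or underscores), and no named type or enum symbol is defined twice. The `lint` function is an illustrative helper, it ignores namespaces when comparing names, and it does not replace parsing with an actual Avro library or your Schema Registry.

```python
import json
import re

NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def lint(schema, problems=None, seen=None):
    """Collect simple structural problems from a parsed Avro schema."""
    problems = [] if problems is None else problems
    seen = set() if seen is None else seen
    if isinstance(schema, list):            # union: check every branch
        for branch in schema:
            lint(branch, problems, seen)
    elif isinstance(schema, dict):
        kind = schema.get("type")
        if kind in ("record", "enum", "fixed"):
            name = schema.get("name", "")
            if not NAME_RE.match(name):
                problems.append(f"invalid or missing name: {name!r}")
            elif name in seen:
                problems.append(f"duplicate named type: {name}")
            seen.add(name)
        if kind == "record":
            for field in schema.get("fields", []):
                fname = field.get("name", "")
                if not NAME_RE.match(fname):
                    problems.append(f"invalid field name: {fname!r}")
                lint(field.get("type"), problems, seen)
        elif kind == "enum":
            symbols = schema.get("symbols", [])
            if len(symbols) != len(set(symbols)):
                problems.append(f"duplicate symbols in {schema.get('name')}")
        elif kind == "array":
            lint(schema.get("items"), problems, seen)
        elif kind == "map":
            lint(schema.get("values"), problems, seen)
        elif isinstance(kind, (dict, list)):  # nested type definition
            lint(kind, problems, seen)
    return problems

text = """
{"type": "record", "name": "SensorReading", "fields": [
  {"name": "sensorId", "type": "string"},
  {"name": "bad-name", "type": ["null", "int"], "default": null}
]}
"""
print(lint(json.loads(text)))  # ["invalid field name: 'bad-name'"]
```

An empty list means the schema passed these basic checks; it can still fail stricter validation in a full parser.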
