Introduction
Setting up Kafka topics or building Hadoop pipelines? You need good schema examples to get started fast.
Here are battle-tested schemas from real production systems. Copy them straight into your Schema Registry or tweak them for your needs.
New to Avro Schemas?
Read the schema basics guide first if you're not familiar with types and structures.
Example 1: User Profile
Classic user profile with required and optional fields. Works for user services, auth systems, CRM apps.
{
"type": "record",
"name": "UserProfile",
"namespace": "com.example.users",
"doc": "User profile information",
"fields": [
{
"name": "userId",
"type": "long",
"doc": "Unique user identifier"
},
{
"name": "username",
"type": "string",
"doc": "Login username"
},
{
"name": "email",
"type": "string"
},
{
"name": "firstName",
"type": "string"
},
{
"name": "lastName",
"type": "string"
},
{
"name": "phoneNumber",
"type": ["null", "string"],
"default": null,
"doc": "Optional phone"
},
{
"name": "dateOfBirth",
"type": ["null", {"type": "int", "logicalType": "date"}],
"default": null
},
{
"name": "createdAt",
"type": {"type": "long", "logicalType": "timestamp-millis"}
},
{
"name": "isActive",
"type": "boolean",
"default": true
}
]
}
Why This Works
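Required fields such as userId and email have no default, so every record has to supply them, which keeps data clean. Optional fields such as phoneNumber are unions that list null first and default to null, so you can add fields like these later without breaking readers of older data. The timestamp-millis logical type stores createdAt as milliseconds since the Unix epoch, so every consumer reads it the same way.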
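To make the required-versus-optional point concrete, here is a minimal sketch in plain Python of what a reader does when a record is missing a field. The fill_defaults helper is hypothetical and only mimics the idea; it is not the Avro library's API, and the schema is a trimmed copy of the one above.

```python
import json

# Trimmed copy of the UserProfile schema: one required field, two with defaults.
SCHEMA = json.loads("""
{
  "type": "record",
  "name": "UserProfile",
  "fields": [
    {"name": "userId", "type": "long"},
    {"name": "phoneNumber", "type": ["null", "string"], "default": null},
    {"name": "isActive", "type": "boolean", "default": true}
  ]
}
""")


def fill_defaults(schema, record):
    """Hypothetical helper: copy the record, applying schema defaults for missing fields."""
    result = {}
    for field in schema["fields"]:
        name = field["name"]
        if name in record:
            result[name] = record[name]
        elif "default" in field:
            result[name] = field["default"]
        else:
            # No value and no default: this is what makes a field "required".
            raise ValueError(f"missing required field: {name}")
    return result


# A record written before phoneNumber and isActive existed still resolves.
print(fill_defaults(SCHEMA, {"userId": 42}))
```

A record without userId raises instead, which is the behavior you want from a required field.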
Example 2: E-Commerce Order
Complete order schema with line items, addresses, payment info. Standard pattern for e-commerce and order management.
{
"type": "record",
"name": "Order",
"namespace": "com.example.orders",
"fields": [
{
"name": "orderId",
"type": "string",
"doc": "UUID format"
},
{
"name": "customerId",
"type": "long"
},
{
"name": "orderDate",
"type": {"type": "long", "logicalType": "timestamp-millis"}
},
{
"name": "status",
"type": {
"type": "enum",
"name": "OrderStatus",
"symbols": [
"PENDING",
"CONFIRMED",
"PROCESSING",
"SHIPPED",
"DELIVERED",
"CANCELLED",
"REFUNDED"
]
}
},
{
"name": "items",
"type": {
"type": "array",
"items": {
"type": "record",
"name": "OrderItem",
"fields": [
{"name": "productId", "type": "string"},
{"name": "productName", "type": "string"},
{"name": "quantity", "type": "int"},
{"name": "unitPrice", "type": "double"},
{"name": "discount", "type": ["null", "double"], "default": null}
]
}
}
},
{
"name": "shippingAddress",
"type": {
"type": "record",
"name": "Address",
"fields": [
{"name": "street", "type": "string"},
{"name": "city", "type": "string"},
{"name": "state", "type": "string"},
{"name": "zipCode", "type": "string"},
{"name": "country", "type": "string"}
]
}
},
{
"name": "subtotal",
"type": "double"
},
{
"name": "tax",
"type": "double"
},
{
"name": "shippingCost",
"type": "double"
},
{
"name": "total",
"type": "double"
}
]
}
In Production
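You'll see this pattern in most online stores: an order holds an array of line items, and each item carries its own price and quantity. The status enum restricts the field to a fixed set of values, so a typo like "SHIPED" is rejected when the record is written instead of reaching consumers. Note that the monetary fields here use double for brevity; for real money, prefer the decimal logical type shown in Example 6.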
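Here is a small sketch of the check the status enum gives you. The check_status function is a hypothetical stand-in for what an Avro writer does with enum symbols; only the symbol list comes from the schema above.

```python
# Symbols copied from the OrderStatus enum in the schema above.
ORDER_STATUS = (
    "PENDING", "CONFIRMED", "PROCESSING", "SHIPPED",
    "DELIVERED", "CANCELLED", "REFUNDED",
)


def check_status(value):
    """Hypothetical pre-serialization check: reject anything outside the enum."""
    if value not in ORDER_STATUS:
        raise ValueError(f"{value!r} is not one of: {', '.join(ORDER_STATUS)}")
    return value


print(check_status("SHIPPED"))
try:
    check_status("SHIPED")  # typo
except ValueError as err:
    print("rejected:", err)
```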
Example 3: Kafka Event Message
Generic event schema for event-driven architectures. Includes metadata, payload, tracking info.
{
"type": "record",
"name": "KafkaEvent",
"namespace": "com.example.events",
"doc": "Generic event message",
"fields": [
{
"name": "eventId",
"type": "string",
"doc": "UUID"
},
{
"name": "eventType",
"type": "string",
"doc": "e.g., USER_CREATED, ORDER_PLACED"
},
{
"name": "eventVersion",
"type": "string",
"default": "1.0"
},
{
"name": "timestamp",
"type": {"type": "long", "logicalType": "timestamp-millis"}
},
{
"name": "source",
"type": "string",
"doc": "Service that generated this"
},
{
"name": "userId",
"type": ["null", "string"],
"default": null
},
{
"name": "payload",
"type": "string",
"doc": "JSON-encoded event data"
},
{
"name": "metadata",
"type": {
"type": "map",
"values": "string"
},
"default": {}
},
{
"name": "correlationId",
"type": ["null", "string"],
"default": null,
"doc": "For tracing related events"
}
]
}
Event Pattern
This gives you a consistent envelope for all events. The payload is a JSON-encoded string, so one schema can carry any event type, though Avro won't validate what's inside it; the metadata map stays structured as string key-value pairs. The correlationId lets you trace related events across microservices.
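A rough sketch of filling in this envelope, using only the Python standard library. The make_event helper and the service names are assumptions for illustration; the field names come from the schema above.

```python
import json
import time
import uuid


def make_event(event_type, source, payload, correlation_id=None,
               user_id=None, metadata=None):
    """Hypothetical helper: build a dict shaped like the KafkaEvent schema."""
    return {
        "eventId": str(uuid.uuid4()),
        "eventType": event_type,
        "eventVersion": "1.0",
        "timestamp": int(time.time() * 1000),  # timestamp-millis
        "source": source,
        "userId": user_id,
        "payload": json.dumps(payload),        # JSON-encoded event data
        "metadata": dict(metadata or {}),
        "correlationId": correlation_id,
    }


# Two events from different (made-up) services share one correlationId.
created = make_event("USER_CREATED", "user-service", {"userId": 42},
                     correlation_id=str(uuid.uuid4()))
placed = make_event("ORDER_PLACED", "order-service", {"orderId": "A-1"},
                    correlation_id=created["correlationId"])
print(placed["correlationId"] == created["correlationId"])
```

Consumers that care about a given eventType decode the payload with json.loads; everyone else can route on the envelope fields alone.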
Example 4: IoT Sensor Data
Time-series sensor data for IoT apps, industrial monitoring, smart home systems. Optimized for high-volume streaming.
{
"type": "record",
"name": "SensorReading",
"namespace": "com.example.iot",
"fields": [
{
"name": "sensorId",
"type": "string"
},
{
"name": "timestamp",
"type": {"type": "long", "logicalType": "timestamp-millis"}
},
{
"name": "sensorType",
"type": {
"type": "enum",
"name": "SensorType",
"symbols": [
"TEMPERATURE",
"HUMIDITY",
"PRESSURE",
"MOTION",
"LIGHT",
"SOUND"
]
}
},
{
"name": "value",
"type": "double"
},
{
"name": "unit",
"type": "string",
"doc": "celsius, percentage, pascals, etc."
},
{
"name": "location",
"type": ["null", {
"type": "record",
"name": "GeoLocation",
"fields": [
{"name": "latitude", "type": "double"},
{"name": "longitude", "type": "double"},
{"name": "altitude", "type": ["null", "double"], "default": null}
]
}],
"default": null
},
{
"name": "batteryLevel",
"type": ["null", "int"],
"default": null,
"doc": "0-100 percentage"
},
{
"name": "quality",
"type": {
"type": "enum",
"name": "DataQuality",
"symbols": ["GOOD", "QUESTIONABLE", "BAD"],
"default": "GOOD"
}
}
]
}
High-Volume Design
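This is designed for streams of millions of readings per day. Avro's binary encoding writes no field names, so always-present fields like sensorId and timestamp stay compact. Optional location and batteryLevel cost a single byte when they're null, so the same schema works for both stationary and mobile sensors without wasting space. The default inside the DataQuality enum is the enum's own fallback symbol, used when a reader meets a symbol it doesn't know; it is not a field default, so writers still have to set quality.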
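To see why these messages stay small, here is a sketch of how Avro's binary encoding writes a long (zigzag, then a variable-length integer) and a ["null", "int"] union. The function names are made up for this example; the byte layout follows the Avro specification.

```python
def encode_long(n):
    """Avro int/long encoding: zigzag, then base-128 varint, low group first."""
    z = (n << 1) ^ (n >> 63)  # zigzag: small magnitudes become small unsigned values
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)  # 7 payload bits plus a continuation bit
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_optional_int(value):
    """Encode a ["null", "int"] union: the branch index, then the value if present."""
    if value is None:
        return encode_long(0)  # branch 0 is null, and null itself writes no bytes
    return encode_long(1) + encode_long(value)


print(encode_long(64).hex())                  # 8001, as in the spec's example table
print(len(encode_long(1_700_000_000_000)))    # bytes for a timestamp-millis value
print(len(encode_optional_int(None)))         # bytes for a null batteryLevel
print(len(encode_optional_int(87)))           # bytes for batteryLevel = 87
```

A null optional field costs one byte for the branch index and nothing else, which is the "without wasting space" part.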
Example 5: Application Logs
Log schema for centralized logging, observability platforms, security monitoring. Works with log aggregation tools.
{
"type": "record",
"name": "LogEntry",
"namespace": "com.example.logging",
"fields": [
{
"name": "timestamp",
"type": {"type": "long", "logicalType": "timestamp-micros"}
},
{
"name": "level",
"type": {
"type": "enum",
"name": "LogLevel",
"symbols": ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]
}
},
{
"name": "service",
"type": "string"
},
{
"name": "hostname",
"type": "string"
},
{
"name": "message",
"type": "string"
},
{
"name": "logger",
"type": "string"
},
{
"name": "thread",
"type": ["null", "string"],
"default": null
},
{
"name": "exception",
"type": ["null", {
"type": "record",
"name": "ExceptionInfo",
"fields": [
{"name": "type", "type": "string"},
{"name": "message", "type": "string"},
{"name": "stackTrace", "type": "string"}
]
}],
"default": null
},
{
"name": "tags",
"type": {
"type": "map",
"values": "string"
},
"default": {}
},
{
"name": "traceId",
"type": ["null", "string"],
"default": null,
"doc": "Distributed tracing ID"
},
{
"name": "spanId",
"type": ["null", "string"],
"default": null
}
]
}
Production Logging
This supports modern observability setups. The LogLevel enum keeps severity values consistent across services. The optional exception record captures error details without bloating normal log entries. The traceId and spanId fields can carry OpenTelemetry trace context for distributed tracing.
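Here is a sketch of producing the two trickier parts of a LogEntry: the timestamp-micros value and the optional exception record. The to_micros and exception_info helpers are hypothetical names, and the service, hostname, and logger values are placeholders.

```python
import traceback
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_micros(dt):
    """timestamp-micros: whole microseconds since the Unix epoch."""
    delta = dt - EPOCH
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds


def exception_info(exc):
    """Build a dict shaped like the ExceptionInfo record."""
    return {
        "type": type(exc).__name__,
        "message": str(exc),
        "stackTrace": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
    }


try:
    1 / 0
except ZeroDivisionError as exc:
    entry = {
        "timestamp": to_micros(datetime.now(timezone.utc)),
        "level": "ERROR",
        "service": "billing",       # placeholder
        "hostname": "host-1",       # placeholder
        "message": "calculation failed",
        "logger": "billing.calc",   # placeholder
        "thread": None,
        "exception": exception_info(exc),
        "tags": {},
        "traceId": None,
        "spanId": None,
    }

print(entry["exception"]["type"])
```

For ordinary log lines, exception stays None and only the union's one-byte branch marker is written.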
Example 6: Financial Transaction
Banking or payment transaction with precise decimal handling. Critical for financial systems where accuracy matters.
{
"type": "record",
"name": "Transaction",
"namespace": "com.example.finance",
"fields": [
{
"name": "transactionId",
"type": "string"
},
{
"name": "timestamp",
"type": {"type": "long", "logicalType": "timestamp-millis"}
},
{
"name": "type",
"type": {
"type": "enum",
"name": "TransactionType",
"symbols": [
"DEPOSIT",
"WITHDRAWAL",
"TRANSFER",
"PAYMENT",
"REFUND",
"FEE"
]
}
},
{
"name": "amount",
"type": {
"type": "bytes",
"logicalType": "decimal",
"precision": 19,
"scale": 4
},
"doc": "Precise to 4 decimal places"
},
{
"name": "currency",
"type": "string",
"doc": "ISO 4217 code (USD, EUR, GBP)"
},
{
"name": "fromAccount",
"type": "string"
},
{
"name": "toAccount",
"type": ["null", "string"],
"default": null
},
{
"name": "status",
"type": {
"type": "enum",
"name": "TransactionStatus",
"symbols": ["PENDING", "COMPLETED", "FAILED", "REVERSED"]
}
},
{
"name": "description",
"type": ["null", "string"],
"default": null
},
{
"name": "metadata",
"type": {
"type": "map",
"values": "string"
},
"default": {}
}
]
}
Critical: Use Decimal for Money
Never use float or double for money! The decimal logical type ensures precise arithmetic without floating-point errors. With precision=19 and scale=4, you can handle amounts up to 999,999,999,999,999.9999 accurately.
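Here is a sketch of what the decimal logical type does to an amount on the wire: the value is scaled to an integer and stored as big-endian two's-complement bytes, per the Avro specification. The encode_decimal and decode_decimal names are made up for this example; Avro libraries handle this step for you.

```python
from decimal import Decimal

SCALE = 4  # matches "scale": 4 in the amount field


def encode_decimal(value, scale=SCALE):
    """Turn a Decimal into Avro decimal bytes: the unscaled integer, big-endian."""
    scaled = value * (10 ** scale)
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{value} has more than {scale} decimal places")
    unscaled = int(scaled)
    length = (unscaled.bit_length() + 8) // 8  # leaves room for the sign bit
    return unscaled.to_bytes(length, "big", signed=True)


def decode_decimal(data, scale=SCALE):
    """Inverse of encode_decimal."""
    return Decimal(int.from_bytes(data, "big", signed=True)).scaleb(-scale)


# The floating-point problem the decimal type avoids:
print(0.1 + 0.2 == 0.3)                                   # False
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True

amount = Decimal("123.45")
raw = encode_decimal(amount)
print(raw.hex(), decode_decimal(raw))
```

Amounts with more than four decimal places are rejected here rather than silently rounded, which is usually what you want for money.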
Example 7: Product Catalog
Product catalog with categories, variants, inventory. For e-commerce and inventory management.
{
"type": "record",
"name": "Product",
"namespace": "com.example.catalog",
"fields": [
{
"name": "productId",
"type": "string"
},
{
"name": "sku",
"type": "string"
},
{
"name": "name",
"type": "string"
},
{
"name": "description",
"type": "string"
},
{
"name": "category",
"type": {
"type": "array",
"items": "string"
},
"doc": "Hierarchy like ['Electronics', 'Computers', 'Laptops']"
},
{
"name": "brand",
"type": "string"
},
{
"name": "price",
"type": "double"
},
{
"name": "currency",
"type": "string",
"default": "USD"
},
{
"name": "variants",
"type": {
"type": "array",
"items": {
"type": "record",
"name": "ProductVariant",
"fields": [
{"name": "variantId", "type": "string"},
{"name": "size", "type": ["null", "string"], "default": null},
{"name": "color", "type": ["null", "string"], "default": null},
{"name": "sku", "type": "string"},
{"name": "price", "type": "double"},
{"name": "stockQuantity", "type": "int"}
]
}
},
"default": []
},
{
"name": "images",
"type": {
"type": "array",
"items": "string"
},
"default": [],
"doc": "Image URLs"
},
{
"name": "specifications",
"type": {
"type": "map",
"values": "string"
},
"default": {}
},
{
"name": "isActive",
"type": "boolean",
"default": true
},
{
"name": "createdAt",
"type": {"type": "long", "logicalType": "timestamp-millis"}
},
{
"name": "updatedAt",
"type": {"type": "long", "logicalType": "timestamp-millis"}
}
]
}
Handling Variations
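This handles products with multiple variants (shirts in different sizes and colors, for example). The variants array keeps related variants together on one product while tracking inventory separately for each. The category array supports multi-level classification. As in Example 2, the price fields use double for brevity; for real money, prefer the decimal logical type from Example 6.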
Schema Design Tips
Things I've learned from these examples:
Always Include:
- Namespace (avoids conflicts in Schema Registry)
- Doc strings (helps future you and teammates)
- Timestamps (always useful for debugging)
- Defaults for optional fields
- Enums for fixed value sets
Avoid:
- Float/double for money (use decimal)
- Adding required fields without defaults (readers can't fill them in for old records)
- Generic names like "data" or "info"
- Too much nesting (3 levels max)
- Changing field order (the binary encoding follows field order, so data is misread without the writer's schema)
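A few of the tips above can be checked mechanically. Here is a rough sketch of a linter over a parsed schema, using only the Python standard library. The lint function, the MONEY_HINTS name list, and the sample schema are all assumptions for illustration; the name matching is a crude heuristic, not a rule from Avro.

```python
import json

# Assumption: field names containing these words probably hold money.
MONEY_HINTS = ("price", "amount", "total", "tax", "cost")


def lint(schema, path=""):
    """Hypothetical linter: yield warnings for a parsed Avro schema."""
    if isinstance(schema, list):  # union: check every branch
        for branch in schema:
            yield from lint(branch, path)
    elif isinstance(schema, dict):
        kind = schema.get("type")
        if kind == "record":
            where = f"{path}{schema['name']}"
            if not path and "namespace" not in schema:
                yield f"{where}: top-level record has no namespace"
            for field in schema["fields"]:
                fpath = f"{where}.{field['name']}"
                ftype = field["type"]
                if isinstance(ftype, list) and "null" in ftype and "default" not in field:
                    yield f"{fpath}: optional field has no default"
                if ftype in ("float", "double") and any(
                    hint in field["name"].lower() for hint in MONEY_HINTS
                ):
                    yield f"{fpath}: floating-point type on a money-like field"
                yield from lint(ftype, fpath + ".")
        elif kind == "array":
            yield from lint(schema["items"], path)
        elif kind == "map":
            yield from lint(schema["values"], path)


# A deliberately flawed sample schema, not one of the examples above.
SAMPLE = json.loads("""
{
  "type": "record",
  "name": "Order",
  "fields": [
    {"name": "orderId", "type": "string"},
    {"name": "note", "type": ["null", "string"]},
    {"name": "items", "type": {"type": "array", "items": {
      "type": "record", "name": "OrderItem", "fields": [
        {"name": "unitPrice", "type": "double"}
      ]}}}
  ]
}
""")

for warning in lint(SAMPLE):
    print(warning)
```

Checks like doc strings or nesting depth could be added the same way, one rule per condition.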
Test These Schemas
Copy any schema above and test it with your own tooling before registering it.
Official Resources
- Avro Specification: full schema reference
- Official Examples: more from Apache
- Schema Registry: managing schemas
Related Guides
- What is Apache Avro?: Avro introduction
- Schema Guide: all types explained
- Format Examples: schema with data