fastavro.read

class reader(fo, reader_schema=None, return_record_name=False)

Iterator over the records in an Avro file.

Parameters
  • fo (file-like) – Input stream

  • reader_schema (dict, optional) – Reader schema

  • return_record_name (bool, optional) – If true, when reading a union of records, the result will be a tuple where the first value is the name of the record and the second value is the record itself

Example:

from fastavro import reader
with open('some-file.avro', 'rb') as fo:
    avro_reader = reader(fo)
    for record in avro_reader:
        process_record(record)

Because fo can be any file-like object, another common pattern is to read from an io.BytesIO buffer:

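from io import BytesIO
from fastavro import writer, reader

# `schema`, `records` and `process_record` are assumed to be defined elsewhere
fo = BytesIO()
writer(fo, schema, records)
fo.seek(0)  # rewind so the reader starts at the file header
for record in reader(fo):
    process_record(record)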

metadata

Key-value pairs in the header metadata

codec

The codec used when writing

writer_schema

The schema used when writing

reader_schema

The schema used when reading (if provided)

class block_reader(fo, reader_schema=None, return_record_name=False)

Iterator over the Block objects in an Avro file.

Parameters
  • fo (file-like) – Input stream

  • reader_schema (dict, optional) – Reader schema

  • return_record_name (bool, optional) – If true, when reading a union of records, the result will be a tuple where the first value is the name of the record and the second value is the record itself

Example:

from fastavro import block_reader
with open('some-file.avro', 'rb') as fo:
    avro_reader = block_reader(fo)
    for block in avro_reader:
        process_block(block)

metadata

Key-value pairs in the header metadata

codec

The codec used when writing

writer_schema

The schema used when writing

reader_schema

The schema used when reading (if provided)

class Block(bytes_, num_records, codec, reader_schema, writer_schema, offset, size, return_record_name=False)

An Avro block. Yields records when iterated over.

num_records

Number of records in the block

writer_schema

The schema used when writing

reader_schema

The schema used when reading (if provided)

offset

Offset of the block from the beginning of the Avro file

size

Size of the block in bytes

schemaless_reader(fo, writer_schema, reader_schema=None, return_record_name=False)

Reads a single record written using schemaless_writer()

Parameters
  • fo (file-like) – Input stream

  • writer_schema (dict) – Schema used when calling schemaless_writer

  • reader_schema (dict, optional) – If the schema has changed since being written then the new schema can be given to allow for schema migration

  • return_record_name (bool, optional) – If true, when reading a union of records, the result will be a tuple where the first value is the name of the record and the second value is the record itself

Example:

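import fastavro

# `schema` is assumed to be the schema (a dict) that was passed to schemaless_writer
parsed_schema = fastavro.parse_schema(schema)
with open('file.avro', 'rb') as fp:
    record = fastavro.schemaless_reader(fp, parsed_schema)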

Note: The schemaless_reader can only read a single record.

is_avro(path_or_buffer)

Return True if path (or buffer) points to an Avro file.

Parameters
  • path_or_buffer (path to file or file-like object) – Path to the file, or an open file-like object