fastavro.read
- class reader(fo, reader_schema=None, return_record_name=False)
Iterator over records in an avro file.
- Parameters
fo (file-like) – Input stream
reader_schema (dict, optional) – Reader schema
return_record_name (bool, optional) – If true, when reading a union of records, the result will be a tuple where the first value is the name of the record and the second value is the record itself
Example:
    from fastavro import reader

    with open('some-file.avro', 'rb') as fo:
        avro_reader = reader(fo)
        for record in avro_reader:
            process_record(record)
Because fo can be any file-like object, another common usage is to read from an io.BytesIO buffer:
    from io import BytesIO
    from fastavro import writer, reader

    fo = BytesIO()
    writer(fo, schema, records)
    fo.seek(0)
    for record in reader(fo):
        process_record(record)
- metadata – Key-value pairs in the header metadata
- codec – The codec used when writing
- writer_schema – The schema used when writing
- reader_schema – The schema used when reading (if provided)
- class block_reader(fo, reader_schema=None, return_record_name=False)
Iterator over Block objects in an avro file.
- Parameters
fo (file-like) – Input stream
reader_schema (dict, optional) – Reader schema
return_record_name (bool, optional) – If true, when reading a union of records, the result will be a tuple where the first value is the name of the record and the second value is the record itself
Example:
    from fastavro import block_reader

    with open('some-file.avro', 'rb') as fo:
        avro_reader = block_reader(fo)
        for block in avro_reader:
            process_block(block)
- metadata – Key-value pairs in the header metadata
- codec – The codec used when writing
- writer_schema – The schema used when writing
- reader_schema – The schema used when reading (if provided)
- class Block(bytes_, num_records, codec, reader_schema, writer_schema, offset, size, return_record_name=False)
An avro block. Will yield records when iterated over.
- num_records – Number of records in the block
- writer_schema – The schema used when writing
- reader_schema – The schema used when reading (if provided)
- offset – Offset of the block from the beginning of the avro file
- size – Size of the block in bytes
- schemaless_reader(fo, writer_schema, reader_schema=None, return_record_name=False)
Reads a single record written using schemaless_writer().
- Parameters
fo (file-like) – Input stream
writer_schema (dict) – Schema used when calling schemaless_writer
reader_schema (dict, optional) – If the schema has changed since being written then the new schema can be given to allow for schema migration
return_record_name (bool, optional) – If true, when reading a union of records, the result will be a tuple where the first value is the name of the record and the second value is the record itself
Example:
    import fastavro

    parsed_schema = fastavro.parse_schema(schema)
    with open('file.avro', 'rb') as fp:
        record = fastavro.schemaless_reader(fp, parsed_schema)
Note: The schemaless_reader can only read a single record.
- is_avro(path_or_buffer)
Return True if path (or buffer) points to an Avro file.
- Parameters
path_or_buffer (path to file or file-like object) – Path to the file, or a file-like object, to check