Commit d989336

Add notes/gotchas on schema fingerprints (#112)
1 parent 906d178 commit d989336

1 file changed: +14 -1 lines changed

README.md

Lines changed: 14 additions & 1 deletion
@@ -571,4 +571,17 @@ The first that matches, is the one that will be used.
 In order to initialize the single connectors, a configuration will be created merging the specific part
 (i.e. hbase/mongo/confluent) with the outer layer: in case of duplicated entries the more specific one will be used.
 
-Registration of the schema, will work with the connector set as registrar.
+Registration of the schema, will work with the connector set as registrar.
+
+### Notes on Avro Single-Object Encoding Specification
+
+#### Canonical form ignores default fields
+
+As specified in the background section, Darwin leverages the Avro Single-Object Encoding specification to store the schema fingerprint alongside the Avro data.
+To create the fingerprint, the schema is converted into its [parsing-canonical form](https://avro.apache.org/docs/1.8.1/spec.html#Transforming+into+Parsing+Canonical+Form), which strips all attributes that are not needed for reading/writing, such as `doc` and `aliases`. However, it also removes the `default` attribute, so two schemas that differ only in a field's default value have an identical fingerprint.
+The `default` attribute matters for compatibility, because a reader relies on it to fill in fields that are missing from the writer's schema. Keep this in mind when debugging broken compatibility on a subject: two schemas with the same fingerprint may still differ in their defaults.
+
+Sources:
+
+- [AVRO-2002](https://issues.apache.org/jira/browse/AVRO-2002)
+- [Adding or removing a default does not register a new schema version](https://github.com/salsify/avro-schema-registry/issues/38)
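
The gotcha described in the added notes can be reproduced with a short Python sketch. It is neither Darwin's code nor the Avro library: `strip_schema` is a deliberately simplified stand-in for the parsing canonical form (it keeps only the attributes the specification preserves, in the prescribed order, and skips fullname resolution and the other normalization steps), while `fingerprint64` follows the CRC-64-AVRO reference algorithm from the Avro specification. All function names and the `User` schema are hypothetical and chosen only for illustration.

```python
import json

# Attributes kept by the parsing canonical form, in the order the spec prescribes.
KEPT_KEYS = ("name", "type", "fields", "symbols", "items", "values", "size")

# Seed of the CRC-64-AVRO fingerprint, as given in the Avro specification.
EMPTY = 0xC15D213AA4D7A795


def _build_table():
    """Precompute the 256-entry lookup table for CRC-64-AVRO."""
    table = []
    for i in range(256):
        fp = i
        for _ in range(8):
            fp = (fp >> 1) ^ (EMPTY if fp & 1 else 0)
        table.append(fp)
    return table


FP_TABLE = _build_table()


def strip_schema(node):
    """Simplified canonicalization: drop every attribute outside KEPT_KEYS.

    Assumption: fullname resolution and the other normalization steps of the
    real transformation are omitted, so this is not a full implementation.
    """
    if isinstance(node, dict):
        return {k: strip_schema(node[k]) for k in KEPT_KEYS if k in node}
    if isinstance(node, list):
        return [strip_schema(n) for n in node]
    return node


def canonical_form(schema_json: str) -> str:
    """Serialize the stripped schema without whitespace."""
    return json.dumps(strip_schema(json.loads(schema_json)), separators=(",", ":"))


def fingerprint64(data: bytes) -> int:
    """CRC-64-AVRO over the given bytes."""
    fp = EMPTY
    for byte in data:
        fp = (fp >> 8) ^ FP_TABLE[(fp ^ byte) & 0xFF]
    return fp


# Two hypothetical schemas that differ only in `doc` and `default`.
WITHOUT_DEFAULT = '{"type":"record","name":"User","fields":[{"name":"age","type":"int"}]}'
WITH_DEFAULT = (
    '{"type":"record","name":"User","doc":"a user",'
    '"fields":[{"name":"age","type":"int","default":0}]}'
)

fp_a = fingerprint64(canonical_form(WITHOUT_DEFAULT).encode("utf-8"))
fp_b = fingerprint64(canonical_form(WITH_DEFAULT).encode("utf-8"))
print(fp_a == fp_b)  # True: the default is gone before the fingerprint is computed
```

Because both schemas collapse to the same canonical text, a fingerprint lookup cannot tell them apart. When debugging a compatibility failure, compare the full schema JSON rather than the fingerprints.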
