[SchemaRegistry] prepare for release (#21040)
fixes: #20970
swathipil committed Oct 4, 2021
1 parent aeacef5 commit 139aae2
Showing 4 changed files with 53 additions and 55 deletions.
@@ -1,10 +1,10 @@
# Release History

## 1.0.0b3 (Unreleased)
## 1.0.0b3 (2021-10-05)

### Features Added

- `auto_register_schemas` keyword argument has been added to `AvroSerializer`, which will allow for automatically registering schemas passed in to the `serialize`.
- `auto_register_schemas` keyword argument has been added to `AvroSerializer`, which will allow for automatically registering schemas passed in to the `serialize`, when set to `True`, otherwise `False` by default.
- `value` parameter in `serialize` on `AvroSerializer` takes type `Mapping` rather than `Dict`.
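A note on the `Mapping` entry above: `typing.Mapping` is a wider type than `Dict`, so read-only and ordered mappings also satisfy the annotation. The sketch below only illustrates that typing difference with the standard library; `field_names` is a hypothetical stand-in, not part of `AvroSerializer`, and how the serializer itself treats non-dict mappings is not shown in this diff.

```python
from collections import OrderedDict
from types import MappingProxyType
from typing import Any, List, Mapping


def field_names(value: Mapping[str, Any]) -> List[str]:
    """Hypothetical function that only reads from `value`."""
    return sorted(value.keys())


record = {"name": "Ben", "favorite_number": 7, "favorite_color": "red"}

# A plain dict, an OrderedDict, and a read-only proxy are all Mappings.
candidates = [record, OrderedDict(record), MappingProxyType(record)]
results = [field_names(candidate) for candidate in candidates]

# Only the first two are dict instances; the proxy would not satisfy `Dict`.
is_dict = [isinstance(candidate, dict) for candidate in candidates]
```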

### Breaking Changes
46 changes: 23 additions & 23 deletions sdk/schemaregistry/azure-schemaregistry-avroserializer/README.md
@@ -27,25 +27,25 @@ To use this package, you must have:
* Python 2.7, 3.6 or later - [Install Python][python]

### Authenticate the client
Interaction with Schema Registry Avro Serializer starts with an instance of SchemaRegistryAvroSerializer class. You need the endpoint, AAD credential and schema group name to instantiate the client object.
Interaction with Schema Registry Avro Serializer starts with an instance of AvroSerializer class. You need the endpoint, AAD credential and schema group name to instantiate the client object.

**Create client using the azure-identity library:**

```python
from azure.schemaregistry import SchemaRegistryClient
from azure.schemaregistry.serializer.avroserializer import SchemaRegistryAvroSerializer
from azure.schemaregistry.serializer.avroserializer import AvroSerializer
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()
endpoint = '<< ENDPOINT OF THE SCHEMA REGISTRY >>'
schema_group = '<< GROUP NAME OF THE SCHEMA >>'
schema_registry_client = SchemaRegistryClient(endpoint, credential)
serializer = SchemaRegistryAvroSerializer(schema_registry_client, schema_group)
serializer = AvroSerializer(schema_registry_client, schema_group)
```

## Key concepts

### SchemaRegistryAvroSerializer
### AvroSerializer

Provides API to serialize to and deserialize from Avro Binary Encoding plus a
header with schema ID. Uses [SchemaRegistryClient][schemaregistry_client] to get schema IDs from schema content or vice versa.
@@ -85,21 +85,21 @@ The following sections provide several code snippets covering some of the most c

### Serialization

Use `SchemaRegistryAvroSerializer.serialize` method to serialize dict data with the given avro schema.
Use `AvroSerializer.serialize` method to serialize dict data with the given avro schema.
The method would automatically register the schema to the Schema Registry Service and keep the schema cached for future serialization usage.

```python
import os
from azure.schemaregistry import SchemaRegistryClient
from azure.schemaregistry.serializer.avroserializer import SchemaRegistryAvroSerializer
from azure.schemaregistry.serializer.avroserializer import AvroSerializer
from azure.identity import DefaultAzureCredential

token_credential = DefaultAzureCredential()
endpoint = os.environ['SCHEMA_REGISTRY_ENDPOINT']
schema_group = "<your-group-name>"
group_name = "<your-group-name>"

schema_registry_client = SchemaRegistryClient(endpoint, token_credential)
serializer = SchemaRegistryAvroSerializer(schema_registry_client, schema_group)
serializer = AvroSerializer(client=schema_registry_client, group_name=group_name)

schema_string = """
{"namespace": "example.avro",
@@ -114,26 +114,26 @@ schema_string = """

with serializer:
dict_data = {"name": "Ben", "favorite_number": 7, "favorite_color": "red"}
encoded_bytes = serializer.serialize(dict_data, schema_string)
encoded_bytes = serializer.serialize(dict_data, schema=schema_string)
```

### Deserialization

Use `SchemaRegistryAvroSerializer.deserialize` method to deserialize raw bytes into dict data.
Use `AvroSerializer.deserialize` method to deserialize raw bytes into dict data.
The method would automatically retrieve the schema from the Schema Registry Service and keep the schema cached for future deserialization usage.

```python
import os
from azure.schemaregistry import SchemaRegistryClient
from azure.schemaregistry.serializer.avroserializer import SchemaRegistryAvroSerializer
from azure.schemaregistry.serializer.avroserializer import AvroSerializer
from azure.identity import DefaultAzureCredential

token_credential = DefaultAzureCredential()
endpoint = os.environ['SCHEMA_REGISTRY_ENDPOINT']
schema_group = "<your-group-name>"
group_name = "<your-group-name>"

schema_registry_client = SchemaRegistryClient(endpoint, token_credential)
serializer = SchemaRegistryAvroSerializer(schema_registry_client, schema_group)
serializer = AvroSerializer(client=schema_registry_client, group_name=group_name)

with serializer:
encoded_bytes = b'<data_encoded_by_azure_schema_registry_avro_serializer>'
@@ -148,12 +148,12 @@ Integration with [Event Hubs][eventhubs_repo] to send serialized avro dict data
import os
from azure.eventhub import EventHubProducerClient, EventData
from azure.schemaregistry import SchemaRegistryClient
from azure.schemaregistry.serializer.avroserializer import SchemaRegistryAvroSerializer
from azure.schemaregistry.serializer.avroserializer import AvroSerializer
from azure.identity import DefaultAzureCredential

token_credential = DefaultAzureCredential()
endpoint = os.environ['SCHEMA_REGISTRY_ENDPOINT']
schema_group = "<your-group-name>"
group_name = "<your-group-name>"
eventhub_connection_str = os.environ['EVENT_HUB_CONN_STR']
eventhub_name = os.environ['EVENT_HUB_NAME']

@@ -169,7 +169,7 @@ schema_string = """
}"""

schema_registry_client = SchemaRegistryClient(endpoint, token_credential)
avro_serializer = SchemaRegistryAvroSerializer(schema_registry_client, schema_group)
avro_serializer = AvroSerializer(client=schema_registry_client, group_name=group_name)

eventhub_producer = EventHubProducerClient.from_connection_string(
conn_str=eventhub_connection_str,
@@ -179,7 +179,7 @@ eventhub_producer = EventHubProducerClient.from_connection_string(
with eventhub_producer, avro_serializer:
event_data_batch = eventhub_producer.create_batch()
dict_data = {"name": "Bob", "favorite_number": 7, "favorite_color": "red"}
payload_bytes = avro_serializer.serialize(data=dict_data, schema=schema_string)
payload_bytes = avro_serializer.serialize(dict_data, schema=schema_string)
event_data_batch.add(EventData(body=payload_bytes))
eventhub_producer.send_batch(event_data_batch)
```
@@ -192,17 +192,17 @@ Integration with [Event Hubs][eventhubs_repo] to receive `EventData` and deseria
import os
from azure.eventhub import EventHubConsumerClient
from azure.schemaregistry import SchemaRegistryClient
from azure.schemaregistry.serializer.avroserializer import SchemaRegistryAvroSerializer
from azure.schemaregistry.serializer.avroserializer import AvroSerializer
from azure.identity import DefaultAzureCredential

token_credential = DefaultAzureCredential()
endpoint = os.environ['SCHEMA_REGISTRY_ENDPOINT']
schema_group = "<your-group-name>"
group_name = "<your-group-name>"
eventhub_connection_str = os.environ['EVENT_HUB_CONN_STR']
eventhub_name = os.environ['EVENT_HUB_NAME']

schema_registry_client = SchemaRegistryClient(endpoint, token_credential)
avro_serializer = SchemaRegistryAvroSerializer(schema_registry_client, schema_group)
avro_serializer = AvroSerializer(client=schema_registry_client, group_name=group_name)

eventhub_consumer = EventHubConsumerClient.from_connection_string(
conn_str=eventhub_connection_str,
@@ -236,7 +236,7 @@ headers, can be enabled on a client with the `logging_enable` argument:
import sys
import logging
from azure.schemaregistry import SchemaRegistryClient
from azure.schemaregistry.serializer.avroserializer import SchemaRegistryAvroSerializer
from azure.schemaregistry.serializer.avroserializer import AvroSerializer
from azure.identity import DefaultAzureCredential

# Create a logger for the SDK
@@ -250,13 +250,13 @@ logger.addHandler(handler)
credential = DefaultAzureCredential()
schema_registry_client = SchemaRegistryClient("<your-end-point>", credential)
# This client will log detailed information about its HTTP sessions, at DEBUG level
serializer = SchemaRegistryAvroSerializer(schema_registry_client, "<your-group-name>", logging_enable=True)
serializer = AvroSerializer(client=schema_registry_client, group_name="<your-group-name>", logging_enable=True)
```

Similarly, `logging_enable` can enable detailed logging for a single operation,
even when it isn't enabled for the client:
```py
serializer.serialie(dict_data, schema_content, logging_enable=True)
serializer.serialize(dict_data, schema=schema_definition, logging_enable=True)
```

## Next steps
2 changes: 1 addition & 1 deletion sdk/schemaregistry/azure-schemaregistry/CHANGELOG.md
@@ -1,6 +1,6 @@
# Release History

## 1.0.0b3 (Unreleased)
## 1.0.0b3 (2021-10-05)

### Breaking Changes

56 changes: 27 additions & 29 deletions sdk/schemaregistry/azure-schemaregistry/README.md
@@ -27,7 +27,7 @@ To use this package, you must have:
* Python 2.7, 3.6 or later - [Install Python][python]

### Authenticate the client
Interaction with Schema Registry starts with an instance of SchemaRegistryClient class. You need the endpoint and AAD credential to instantiate the client object.
Interaction with Schema Registry starts with an instance of SchemaRegistryClient class. You need the fully qualified namespace and AAD credential to instantiate the client object.

**Create client using the azure-identity library:**

@@ -36,8 +36,9 @@ from azure.schemaregistry import SchemaRegistryClient
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()
endpoint = '<< ENDPOINT OF THE SCHEMA REGISTRY >>'
schema_registry_client = SchemaRegistryClient(endpoint, credential)
# Namespace should be similar to: '<your-eventhub-namespace>.servicebus.windows.net/'
fully_qualified_namespace = '<< FULLY QUALIFIED NAMESPACE OF THE SCHEMA REGISTRY >>'
schema_registry_client = SchemaRegistryClient(fully_qualified_namespace, credential)
```

## Key concepts
@@ -57,7 +58,6 @@ The following sections provide several code snippets covering some of the most c
### Register a schema

Use `SchemaRegistryClient.register_schema` method to register a schema.
When registering a schema, the `Schema` and `SchemaProperties` will be cached in the `SchemaRegistryClient` instance, so that any subsequent calls to `get_schema_id` and `get_schema` corresponding to the same schema can use the cached value rather than going to the service.

```python
import os
@@ -66,11 +66,11 @@ from azure.identity import DefaultAzureCredential
from azure.schemaregistry import SchemaRegistryClient

token_credential = DefaultAzureCredential()
endpoint = os.environ['SCHEMA_REGISTRY_ENDPOINT']
schema_group = "<your-group-name>"
schema_name = "<your-schema-name>"
serialization_type = "Avro"
schema_content = """
fully_qualified_namespace = os.environ['SCHEMA_REGISTRY_FULLY_QUALIFIED_NAMESPACE']
group_name = "<your-group-name>"
name = "<your-schema-name>"
format = "Avro"
schema_definition = """
{"namespace": "example.avro",
"type": "record",
"name": "User",
@@ -82,16 +82,15 @@ schema_content = """
}
"""

schema_registry_client = SchemaRegistryClient(endpoint=endpoint, credential=token_credential)
schema_registry_client = SchemaRegistryClient(fully_qualified_namespace=fully_qualified_namespace, credential=token_credential)
with schema_registry_client:
schema_properties = schema_registry_client.register_schema(schema_group, schema_name, serialization_type, schema_content)
schema_id = schema_properties.schema_id
schema_properties = schema_registry_client.register_schema(group_name, name, schema_definition, format)
id = schema_properties.id
```

### Get the schema by id

Get the schema content and its properties by schema id.
When looking up the schema content by schema id, the `Schema` will be cached in the `SchemaRegistryClient` instance so that subsequent requests for this schema id do not need to go the service.

```python
import os
@@ -100,19 +99,18 @@ from azure.identity import DefaultAzureCredential
from azure.schemaregistry import SchemaRegistryClient

token_credential = DefaultAzureCredential()
endpoint = os.environ['SCHEMA_REGISTRY_ENDPOINT']
schema_id = '<your-schema-id>'
fully_qualified_namespace = os.environ['SCHEMA_REGISTRY_FULLY_QUALIFIED_NAMESPACE']
id = '<your-schema-id>'

schema_registry_client = SchemaRegistryClient(endpoint=endpoint, credential=token_credential)
schema_registry_client = SchemaRegistryClient(fully_qualified_namespace=fully_qualified_namespace, credential=token_credential)
with schema_registry_client:
schema = schema_registry_client.get_schema(schema_id)
schema_content = schema.schema_content
schema = schema_registry_client.get_schema(id)
schema_definition = schema.schema_definition
```

### Get the id of a schema

Get the schema id of a schema by schema content and its properties.
When looking up the schema id, the `Schema` and `SchemaProperties` will be cached in the `SchemaRegistryClient` instance, so that subsequent requests for this schema do not need to go the service.

```python
import os
@@ -121,11 +119,11 @@ from azure.identity import DefaultAzureCredential
from azure.schemaregistry import SchemaRegistryClient

token_credential = DefaultAzureCredential()
endpoint = os.environ['SCHEMA_REGISTRY_ENDPOINT']
schema_group = "<your-group-name>"
schema_name = "<your-schema-name>"
serialization_type = "Avro"
schema_content = """
fully_qualified_namespace = os.environ['SCHEMA_REGISTRY_FULLY_QUALIFIED_NAMESPACE']
group_name = "<your-group-name>"
name = "<your-schema-name>"
format = "Avro"
schema_definition = """
{"namespace": "example.avro",
"type": "record",
"name": "User",
@@ -137,10 +135,10 @@ schema_content = """
}
"""

schema_registry_client = SchemaRegistryClient(endpoint=endpoint, credential=token_credential)
schema_registry_client = SchemaRegistryClient(fully_qualified_namespace=fully_qualified_namespace, credential=token_credential)
with schema_registry_client:
schema_properties = schema_registry_client.get_schema_id(schema_group, schema_name, serialization_type, schema_content)
schema_id = schema_properties.schema_id
schema_properties = schema_registry_client.register_schema(group_name, name, schema_definition, format)
id = schema_properties.id
```

## Troubleshooting
@@ -173,13 +171,13 @@ logger.addHandler(handler)

credential = DefaultAzureCredential()
# This client will log detailed information about its HTTP sessions, at DEBUG level
schema_registry_client = SchemaRegistryClient("you_end_point", credential, logging_enable=True)
schema_registry_client = SchemaRegistryClient("your_fully_qualified_namespace", credential, logging_enable=True)
```

Similarly, `logging_enable` can enable detailed logging for a single operation,
even when it isn't enabled for the client:
```py
schema_registry_client.get_schema(schema_id, logging_enable=True)
schema_registry_client.get_schema(id, logging_enable=True)
```

## Next steps
Expand Down
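
The renames in the README hunks above can be summarized as a small lookup table. This is a minimal sketch for readers updating their own scripts: the old and new names are taken from the snippets in this diff, but `RENAMES` and `rename_kwargs` are hypothetical helpers, and whether each new name is accepted as a keyword argument by the library is an assumption this diff does not confirm (only `fully_qualified_namespace=`, `client=`, `group_name=` and `schema=` appear as keywords here).

```python
# Old -> new names as they appear in this diff's README snippets.
RENAMES = {
    "endpoint": "fully_qualified_namespace",
    "schema_group": "group_name",
    "schema_name": "name",
    "serialization_type": "format",
    "schema_content": "schema_definition",
    "schema_id": "id",
}


def rename_kwargs(old_kwargs):
    """Hypothetical helper: return a copy of `old_kwargs` with updated names."""
    return {RENAMES.get(key, key): value for key, value in old_kwargs.items()}


old_call = {
    "schema_group": "<your-group-name>",
    "schema_name": "<your-schema-name>",
    "serialization_type": "Avro",
    "schema_content": "<your-schema-definition>",
}
new_call = rename_kwargs(old_call)
```

The positional order in the `register_schema` snippet also changes in this commit: the old call passes `(schema_group, schema_name, serialization_type, schema_content)`, while the new one passes `(group_name, name, schema_definition, format)`, so code that passes these arguments positionally would need reordering as well as renaming.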
