protoc, the protobuf compiler, does not generate code for your favorite programming language by default,
but it can be extended to do so via plugins. Although a plugin for PHP already exists, it generates rather outdated PHP code
that relies on setters, getters, inheritance, and other practices no longer considered acceptable for DTOs (which is exactly what protobuf messages are).
For this reason, we have written this plugin. In addition to addressing the issues mentioned above, it is also required for generating
gRPC code based on non-blocking PHP (currently using the amphp ecosystem).
To use the plugin, download the compiled phar archive of the required version from the releases page, for example with curl.
You can do this when building a Docker image.
- First of all, you must install the required version of protoc itself:
ARG PROTOC_VERSION=32.1
ARG PROTOC_PLUGIN_VERSION=0.1.4
RUN apk add --no-cache curl unzip && \
curl -LO "https://github.com/protocolbuffers/protobuf/releases/download/v${PROTOC_VERSION}/protoc-${PROTOC_VERSION}-linux-x86_64.zip" && \
unzip "protoc-${PROTOC_VERSION}-linux-x86_64.zip" -d /usr/local && \
rm "protoc-${PROTOC_VERSION}-linux-x86_64.zip"
- After that, install the plugin:
RUN curl -L https://github.com/thesis-php/protoc-plugin/releases/download/${PROTOC_PLUGIN_VERSION}/protoc-gen-php \
-o /usr/local/bin/protoc-gen-php \
&& chmod +x /usr/local/bin/protoc-gen-php
Now you can use the plugin like any other protoc plugin. For example, to generate code for all files from the protos/ folder and place it in the genproto/ folder, you can use the following command:
protoc \
--plugin=protoc-gen-php-plugin=/usr/local/bin/protoc-gen-php \
protos/*.proto \
--php-plugin_out=genproto
Let's assume that the protos/ folder contains a request.proto file with the following schema:
syntax = "proto3";
package thesis.api;
message Request {
string id = 1;
}
After running the generation command, a Thesis/Api folder containing the Request.php file will appear in the genproto/ directory:
<?php
/**
* Code generated by thesis/protoc-plugin. DO NOT EDIT.
* Versions:
* thesis/protoc-plugin — v0.1.2
* protoc — v6.32.1
* Source: protos/request.proto
*/
declare(strict_types=1);
namespace Thesis\Api;
use Thesis\Protobuf\Reflection;
/**
* @api
*/
final readonly class Request
{
public function __construct(
#[Reflection\Field(1, Reflection\StringT::T)]
public string $id = '',
) {}
}
After generating the code, install the thesis/protobuf library (if you haven't done so already), which is responsible for serializing and deserializing protobuf messages.
The plugin's behavior is configured through CLI options.
If you want to change the namespace under which the code will be generated, use the php_namespace option:
protoc \
--plugin=protoc-gen-php-plugin=/usr/local/bin/protoc-gen-php \
protos/*.proto \
--php-plugin_out=php_namespace="Thesis\\Api\\V1":src
If you are generating code within an application, you most likely want to place it in the genproto/ or generated/ folder
at the same level as your application code (which is typically stored in your src/ or app/ folder).
In this case, you would want the code to be placed in a folder that reflects the full namespace.
For example, if the proto schema sets option php_namespace = "Thesis\\Api" or declares package thesis.api,
all messages within this schema will be placed in the genproto/Thesis/Api/ folder.
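The mapping from package name to namespace to directory can be sketched as follows. This is only an illustration of the layout described above, not the plugin's code: `packageToNamespace` and `classPath` are hypothetical helpers, and the sketch assumes that each dot-separated package segment is simply capitalized.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: derive a PHP namespace from a proto package name,
 * assuming each dot-separated segment is simply capitalized.
 */
function packageToNamespace(string $package): string
{
    return implode('\\', array_map(ucfirst(...), explode('.', $package)));
}

/**
 * Hypothetical helper: build a class file path whose directories
 * mirror the full namespace.
 */
function classPath(string $outDir, string $namespace, string $class): string
{
    return rtrim($outDir, '/') . '/' . str_replace('\\', '/', $namespace) . '/' . $class . '.php';
}

$namespace = packageToNamespace('thesis.api');

echo $namespace, PHP_EOL;                                   // Thesis\Api
echo classPath('genproto', $namespace, 'Request'), PHP_EOL; // genproto/Thesis/Api/Request.php
```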
However, if you want to generate the code and later vendor it as a package, you can place all the code in the src/ folder
and configure autoloading via composer without creating unnecessary nesting. To achieve this, you can use the src_path option with a dot (.) as its value:
protoc \
--plugin=protoc-gen-php-plugin=/usr/local/bin/protoc-gen-php \
protos/*.proto \
--php-plugin_out=src_path=.:src
Our plugin can generate not only protobuf messages, but also the client and server for gRPC services.
If you do not want to generate gRPC code, you can disable this behavior using the grpc=none option:
protoc \
--plugin=protoc-gen-php-plugin=/usr/local/bin/protoc-gen-php \
protos/*.proto \
--php-plugin_out=grpc=none:genproto
If you want to generate only the client code, use grpc=client:
protoc \
--plugin=protoc-gen-php-plugin=/usr/local/bin/protoc-gen-php \
protos/*.proto \
--php-plugin_out=grpc=client:genproto
To generate only the server code, use grpc=server. By default, and when passing grpc=client,server, both the client and server will be generated.
You can pass multiple options separated by commas. For example, if you want to specify a different namespace, place all the code in the root directory, and generate only the gRPC client, you can write the following:
protoc \
--plugin=protoc-gen-php-plugin=/usr/local/bin/protoc-gen-php \
protos/*.proto \
--php-plugin_out=php_namespace="Thesis\\Api\\V1",src_path=.,grpc=client:genproto
As mentioned above, the plugin generates simple DTOs without setters, getters, or inheritance. All metadata for protobuf serialization is stored in Thesis\Protobuf\Reflection\* attributes, and the generated DTOs only have a constructor with promoted properties.
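In practice, working with such a DTO might look like the sketch below. The `Request` class here is a stand-in that mirrors the generated one, with the `Reflection\Field` attribute left out so the snippet runs without thesis/protobuf installed.

```php
<?php
declare(strict_types=1);

// Stand-in for the generated class; the Reflection\Field attribute is
// omitted so the snippet has no external dependencies.
final readonly class Request
{
    public function __construct(
        public string $id = '',
    ) {}
}

$empty = new Request();                 // every field falls back to its default
$request = new Request(id: 'abc-123');  // named arguments keep call sites readable

// Properties are readonly: an attempted write throws an \Error.
try {
    $request->id = 'other';
} catch (\Error $e) {
    echo 'Cannot modify: ', $e->getMessage(), PHP_EOL;
}

// To "change" a value, build a new instance instead.
$changed = new Request(id: 'other');

var_dump($empty->id, $request->id, $changed->id);
```

Since there are no setters, an instance cannot change after construction, which makes the messages safe to share between concurrent tasks.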
To avoid issues with integer overflow when using int64/uint64 types, \BcMath\Number will be used.
For other numeric scalars, the int and float types will be used respectively.
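The overflow problem is easy to reproduce. The following sketch uses only PHP itself and assumes a 64-bit build of PHP 8.4+ with the bcmath extension; it does not involve the plugin or thesis/protobuf.

```php
<?php
declare(strict_types=1);

use BcMath\Number;

// Native integers silently turn into floats once they exceed PHP_INT_MAX,
// which loses precision for large uint64 values.
$overflowed = PHP_INT_MAX + 1;
var_dump(is_float($overflowed)); // bool(true)

// The largest uint64 value does not fit into a PHP int,
// but BcMath\Number holds it exactly.
$maxUint64 = new Number('18446744073709551615');
$previous = $maxUint64->sub(1);

echo (string) $maxUint64, PHP_EOL; // 18446744073709551615
echo (string) $previous, PHP_EOL;  // 18446744073709551614

// compare() returns 1 when the left operand is greater.
var_dump($maxUint64->compare($previous)); // int(1)
```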
When using proto2, it will be explicitly specified whether lists are packed. This is necessary because in proto2 only lists with the corresponding option explicitly set could be packed.
Meanwhile, in proto3, this rule is applied implicitly, but only for types for which it was also possible in proto2.
In other words, for a list of int32 in proto2, code will be generated with an attribute like this:
final readonly class Request
{
/**
* @param list<int> $ids
*/
public function __construct(
#[Reflection\Field(1, new Reflection\ListT(Reflection\Int32T::T, true))]
public array $ids = [],
) {}
}
Note the second argument of the ListT attribute: it will be set according to the [packed = bool] option.
For proto3 this argument will be omitted. Also note that lists will always have [] as their default value, because in protobuf all values are optional.
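To see what the packed flag changes on the wire, here is a minimal sketch of the two encodings for a repeated int32 field, following the general protobuf wire format. It is not the thesis/protobuf serializer: `varint`, `encodeUnpacked`, and `encodePacked` are hypothetical helpers, and only non-negative values are handled.

```php
<?php
declare(strict_types=1);

/** Encode a non-negative integer as a protobuf base-128 varint. */
function varint(int $value): string
{
    $bytes = '';
    do {
        $byte = $value & 0x7F;
        $value >>= 7;
        // Set the continuation bit while more bytes follow.
        $bytes .= chr($value > 0 ? $byte | 0x80 : $byte);
    } while ($value > 0);

    return $bytes;
}

/** Unpacked: every element carries its own tag (wire type 0, varint). */
function encodeUnpacked(int $fieldNumber, array $values): string
{
    $out = '';
    foreach ($values as $value) {
        $out .= varint(($fieldNumber << 3) | 0) . varint($value);
    }

    return $out;
}

/** Packed: one tag (wire type 2, length-delimited), one length, then all elements. */
function encodePacked(int $fieldNumber, array $values): string
{
    if ($values === []) {
        return ''; // an empty list is simply not written
    }
    $payload = implode('', array_map(varint(...), $values));

    return varint(($fieldNumber << 3) | 2) . varint(strlen($payload)) . $payload;
}

echo bin2hex(encodeUnpacked(1, [1, 2, 300])), PHP_EOL; // 0801080208ac02
echo bin2hex(encodePacked(1, [1, 2, 300])), PHP_EOL;   // 0a040102ac02
```

The packed form saves one tag byte per element after the first, which is why a decoder has to know which of the two layouts a proto2 field uses.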
Since maps can have int64/uint64 types as keys, for which we use \BcMath\Number, we cannot use the native array type, as its keys cannot be objects.
A typical solution to this problem is to store a list of key-value pairs indexed by a hash of the key, which keeps lookups as fast as in a regular array.
Therefore, for all fields of type map<K, V>, regardless of the key type, the \Thesis\Protobuf\Map<K, V> type will be used for consistency.
It implements \ArrayAccess, \Countable, and \IteratorAggregate to smooth over the inconvenience of not being able to use a regular array.
use Thesis\Protobuf;
use Thesis\Protobuf\Reflection;
final readonly class Request
{
/**
* @param Protobuf\Map<string, string> $options
*/
public function __construct(
#[Reflection\Field(1, new Reflection\MapT(Reflection\StringT::T, Reflection\StringT::T))]
public Protobuf\Map $options = new Protobuf\Map(),
) {}
}
Such fields will never be nullable (especially since maps in protobuf cannot be required or optional) and will have an empty Map object as their default value.
This will simplify interaction with this type.
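The idea of pairs indexed by a hash of the key can be illustrated with a small self-contained sketch. This is not the actual \Thesis\Protobuf\Map implementation: `PairMap` and `BigId` are hypothetical names, and `BigId` merely stands in for \BcMath\Number so that the snippet has no extension requirements.

```php
<?php
declare(strict_types=1);

/** Hypothetical object key standing in for \BcMath\Number. */
final readonly class BigId implements Stringable
{
    public function __construct(public string $digits) {}

    public function __toString(): string
    {
        return $this->digits;
    }
}

/** Minimal hash-backed map; NOT the actual \Thesis\Protobuf\Map. */
final class PairMap implements ArrayAccess, Countable, IteratorAggregate
{
    /** @var array<string, array{mixed, mixed}> hash => [key, value] */
    private array $pairs = [];

    private static function hash(mixed $key): string
    {
        // The type prefix keeps int 1 and string "1" apart.
        return get_debug_type($key) . ':' . $key;
    }

    public function offsetExists(mixed $offset): bool
    {
        return isset($this->pairs[self::hash($offset)]);
    }

    public function offsetGet(mixed $offset): mixed
    {
        return $this->pairs[self::hash($offset)][1] ?? null;
    }

    public function offsetSet(mixed $offset, mixed $value): void
    {
        $this->pairs[self::hash($offset)] = [$offset, $value];
    }

    public function offsetUnset(mixed $offset): void
    {
        unset($this->pairs[self::hash($offset)]);
    }

    public function count(): int
    {
        return count($this->pairs);
    }

    public function getIterator(): Generator
    {
        // Generators may yield non-scalar keys, so object keys survive iteration.
        foreach ($this->pairs as [$key, $value]) {
            yield $key => $value;
        }
    }
}

$map = new PairMap();
$map[new BigId('18446744073709551615')] = 'max uint64';
$map['plain'] = 'string key';

echo count($map), PHP_EOL;                             // 2
echo $map[new BigId('18446744073709551615')], PHP_EOL; // max uint64
foreach ($map as $key => $value) {
    echo $key, ' => ', $value, PHP_EOL;
}
```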
Since oneof in protobuf can contain variants with the same data types, we cannot use a native union.
For this reason, a class is generated for each variant, and each of them implements a sealed interface (enforced through static analysis).
Consider the following protobuf schema:
syntax = "proto3";
package thesis.api;
message Request {
oneof contact {
string phone = 1;
string email = 2;
int64 chat_id = 3;
}
}
First of all, a Request class will be generated:
namespace Thesis\Api;
final readonly class Request
{
public function __construct(
#[Reflection\OneOf([
\Thesis\Api\Request\ContactPhone::class,
\Thesis\Api\Request\ContactEmail::class,
\Thesis\Api\Request\ContactChatId::class,
])]
public ?\Thesis\Api\Request\Contact $contact = null,
) {}
}
Then, the Contact interface will be generated:
namespace Thesis\Api\Request;
/**
* @api
* @phpstan-sealed (
* ContactPhone |
* ContactEmail |
* ContactChatId
* )
*/
interface Contact {}
Note the namespace: this interface and all its implementations will be generated in a namespace nested relative to the object, just like all nested types of this message.
And for each variant, the following classes will be generated:
namespace Thesis\Api\Request;
/**
* @api
*/
final readonly class ContactChatId implements \Thesis\Api\Request\Contact
{
public function __construct(
#[Reflection\Field(3, Reflection\Int64T::T)]
public \BcMath\Number $chatId = new \BcMath\Number(0),
) {}
}
/**
* @api
*/
final readonly class ContactEmail implements \Thesis\Api\Request\Contact
{
public function __construct(
#[Reflection\Field(2, Reflection\StringT::T)]
public string $email = '',
) {}
}
/**
* @api
*/
final readonly class ContactPhone implements \Thesis\Api\Request\Contact
{
public function __construct(
#[Reflection\Field(1, Reflection\StringT::T)]
public string $phone = '',
) {}
}
By default, all fields with scalar data types will have corresponding default values (0 for numbers, false for booleans, and so on).
If proto2 is used and the field is marked as optional, scalar types will become nullable, with null as the default value.
The same applies to optional in proto3. Lists and maps, however, will always be non-nullable (especially since they cannot be required or optional) but will have empty default values.
Meanwhile, all objects will always be nullable, regardless of required/optional labels, which allows the serializer to quickly skip such fields and avoid writing unnecessary data.
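Taken together, the oneof and nullability rules mean that a consumer first checks for null (nothing set) and then branches on the concrete variant. A minimal sketch, using stand-ins for the generated types: attributes are omitted, and `chatId` is a plain string here instead of \BcMath\Number so the snippet stays dependency-free. The `describe` function is a hypothetical consumer.

```php
<?php
declare(strict_types=1);

// Stand-ins for the generated types (attributes omitted; chatId simplified to string).
interface Contact {}

final readonly class ContactPhone implements Contact
{
    public function __construct(public string $phone = '') {}
}

final readonly class ContactEmail implements Contact
{
    public function __construct(public string $email = '') {}
}

final readonly class ContactChatId implements Contact
{
    public function __construct(public string $chatId = '0') {}
}

final readonly class Request
{
    public function __construct(public ?Contact $contact = null) {}
}

/** Hypothetical consumer: branch on the variant that is actually set. */
function describe(Request $request): string
{
    $contact = $request->contact;

    return match (true) {
        $contact === null => 'no contact set',
        $contact instanceof ContactPhone => "phone: {$contact->phone}",
        $contact instanceof ContactEmail => "email: {$contact->email}",
        $contact instanceof ContactChatId => "chat: {$contact->chatId}",
    };
}

echo describe(new Request()), PHP_EOL;                                    // no contact set
echo describe(new Request(new ContactEmail('dev@example.com'))), PHP_EOL; // email: dev@example.com
```

Because two variants share the string type, only the wrapper class tells phone and email apart, which is exactly what a native union could not express.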
Our plugin supports generating all types of communication between client and server, including client-side, server-side, and bidirectional streaming.
Consider the following service:
syntax = "proto3";
package thesis.api.v1;
import "google/protobuf/empty.proto";
message Message {}
message Heartbeat {}
message Queue {}
service QueueService {
rpc State(google.protobuf.Empty) returns (Queue);
rpc Push(stream Message) returns (google.protobuf.Empty);
rpc Pull(google.protobuf.Empty) returns (stream Message);
rpc Heartbeats(stream Heartbeat) returns (stream Heartbeat);
}
The following client will be generated (method bodies are intentionally omitted for simplicity):
namespace Thesis\Api\V1;
use Amp\Cancellation;
use Amp\NullCancellation;
use Thesis\Grpc\Client;
use Thesis\Grpc\Metadata;
/**
* @api
*/
final readonly class QueueServiceClient
{
public function __construct(
private Client $client,
) {}
public function state(
\Google\Protobuf\Empty_ $request,
Metadata $md = new Metadata(),
Cancellation $cancellation = new NullCancellation(),
): \Thesis\Api\V1\Queue {}
/**
* @return Client\ClientStreamChannel<\Thesis\Api\V1\Message, \Google\Protobuf\Empty_>
*/
public function push(
Metadata $md = new Metadata(),
Cancellation $cancellation = new NullCancellation(),
): Client\ClientStreamChannel {}
/**
* @return Client\ServerStreamChannel<\Google\Protobuf\Empty_, \Thesis\Api\V1\Message>
*/
public function pull(
\Google\Protobuf\Empty_ $request,
Metadata $md = new Metadata(),
Cancellation $cancellation = new NullCancellation(),
): Client\ServerStreamChannel {}
/**
* @return Client\BidirectionalStreamChannel<\Thesis\Api\V1\Heartbeat, \Thesis\Api\V1\Heartbeat>
*/
public function heartbeats(
Metadata $md = new Metadata(),
Cancellation $cancellation = new NullCancellation(),
): Client\BidirectionalStreamChannel {}
}
See thesis/grpc for details on how to use streams and Metadata.
To use the generated gRPC code, install the thesis/grpc and amphp/amp packages, if you haven't done so already.
When generating the server, two types will be generated: first, the server interface itself, which you need to implement,
and second, a registrar class that registers your implementation with the gRPC server, adapting the interface methods through intermediate handlers.
Let's take a look at the server interface:
namespace Thesis\Api\V1;
use Amp\Cancellation;
use Thesis\Grpc\Metadata;
use Thesis\Grpc\Server;
/**
* @api
*/
interface QueueServiceServer
{
public function state(
\Google\Protobuf\Empty_ $request,
Metadata $md,
Cancellation $cancellation,
): \Thesis\Api\V1\Queue;
/**
* @param Server\ClientStreamChannel<\Thesis\Api\V1\Message, \Google\Protobuf\Empty_> $stream
*/
public function push(Server\ClientStreamChannel $stream, Metadata $md, Cancellation $cancellation): void;
/**
* @param Server\ServerStreamChannel<\Google\Protobuf\Empty_, \Thesis\Api\V1\Message> $stream
*/
public function pull(
\Google\Protobuf\Empty_ $request,
Server\ServerStreamChannel $stream,
Metadata $md,
Cancellation $cancellation,
): void;
/**
* @param Server\BidirectionalStreamChannel<\Thesis\Api\V1\Heartbeat, \Thesis\Api\V1\Heartbeat> $stream
*/
public function heartbeats(
Server\BidirectionalStreamChannel $stream,
Metadata $md,
Cancellation $cancellation,
): void;
}
And now the registrar class:
namespace Thesis\Api\V1;
use Override;
use Thesis\Grpc\Server;
/**
* @api
*/
final readonly class QueueServiceServerRegistry implements Server\ServiceRegistry
{
public function __construct(
private \Thesis\Api\V1\QueueServiceServer $server,
) {}
#[Override]
public function services(): iterable
{
yield new Server\Service('thesis.api.v1.QueueService', [
new Server\Rpc(
new Server\Handle('State', \Google\Protobuf\Empty_::class),
new Server\UnaryHandler($this->server->state(...)),
),
new Server\Rpc(
new Server\Handle('Push', \Thesis\Api\V1\Message::class),
new Server\ClientStreamHandler($this->server->push(...)),
),
new Server\Rpc(
new Server\Handle('Pull', \Google\Protobuf\Empty_::class),
new Server\ServerStreamHandler($this->server->pull(...)),
),
new Server\Rpc(
new Server\Handle('Heartbeats', \Thesis\Api\V1\Heartbeat::class),
new Server\BidirectionalStreamHandler($this->server->heartbeats(...)),
),
]);
}
}
After generating the code, you may notice a file called autoload.metadata.php and classes named *DescriptorRegistry. They contain the original proto files, stored as base64-encoded protobuf messages, that the plugin used to generate your code.
Don't be alarmed. These files are necessary for implementing server-side reflection and (de)serialization of the google.protobuf.Any type,
which contains the full message path within the schema. This approach is also used in other ecosystems.
To avoid manually registering the *DescriptorRegistry classes in the descriptor pool, it is recommended to add the path to autoload.metadata.php to your composer.json file.
In long-running applications, for which the thesis project is designed, such a file will be loaded by Composer only once:
"autoload": {
"psr-4": {...},
"files": [
"src/autoload.metadata.php"
]
}
After registering this file in autoload.files, all types from your package will appear in \Thesis\Protobuf\Pool\Registry.
Please note that application developers should not use it directly. This storage is needed for implementing various libraries, such as the server-side reflection mentioned above.
The protobuf ecosystem has many well-defined types for various tasks, including the so-called well-known types, which include Timestamp, Duration, Empty, and many others.
To avoid having to generate these types each time, it is recommended to use libraries where this code has already been generated.
- thesis/protobuf-known-types: generated well-known types.
- thesis/google-types: generated types from the google/type package, which include, for example, Money, Color, PhoneNumber, and many others.
- thesis/googleapis-rpc-types: primarily used by gRPC itself, as it contains types for status and errors.
- thesis/protobuf-descriptor-types: types used by the plugin itself and for implementing so-called server reflection.
- thesis/protobuf-compiler-types: types used for communication between protoc and its plugins.
- proto2
- proto3
- proto2 scalar defaults
- proto2 enum defaults
- proto3 optional
- extensions
- groups
- nested types
- packed repeated
- oneof
- grpc
- client streaming
- server streaming
- bidirectional streaming
- option php_namespace
- option php_class_prefix
- option php_metadata_namespace
- comments