Run Vision Transformer in PHP with phpy: A Complete Step‑by‑Step Guide
This article explains how to implement and run a Vision Transformer (ViT) model in PHP using the phpy extension, covering ViT fundamentals, installation of Python dependencies, full PHP and Python code examples, and practical application scenarios for PHP developers.
Background
Vision Transformer (ViT) has become popular in deep learning for its strong performance on image classification tasks. While most implementations are written in Python with PyTorch, PHP developers can also run ViT by leveraging the phpy extension, which enables PHP to call Python modules directly.
ViT Model Characteristics
Input images are split into fixed-size patches, and each patch is embedded into a 1-D vector via Patch Embedding.
The core of the model is a Transformer Encoder block built around Multi-head Attention. Compared with the original Transformer, ViT moves layer normalization ahead of the attention and MLP sub-layers (pre-norm).
After several encoder blocks are stacked, a fully-connected head produces the class predictions. The encoder stack is referred to as the backbone.
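The patch-splitting step can be sketched in plain Python without any deep-learning library. The helper name `split_into_patches` and the synthetic image are assumptions made for illustration. The sketch only partitions the pixels and does not apply the learned projection that the listings in this article implement with `Conv2d` and `Linear`.

```python
def split_into_patches(image, patch_size):
    """Split a 2-D list into non-overlapping patches, each flattened row by row."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [image[top + dy][left + dx]
                     for dy in range(patch_size)
                     for dx in range(patch_size)]
            patches.append(patch)
    return patches


# Synthetic 28x28 grayscale image: each pixel value encodes its position.
image = [[row * 28 + col for col in range(28)] for row in range(28)]

patches = split_into_patches(image, patch_size=4)
num_patches = len(patches)    # 7 * 7 patches
patch_dim = len(patches[0])   # 4 * 4 pixels per patch
seq_len = num_patches + 1     # one extra slot for the class token

print(num_patches, patch_dim, seq_len)  # prints: 49 16 50
```

With 28x28 inputs and a patch size of 4, the encoder therefore sees a sequence of 50 tokens, which matches the `patch_count ** 2 + 1` positional embedding length used in the code in this article.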
What is phpy?
phpy is a PHP extension that allows PHP code to import and use Python modules. By calling PyCore::import(), PHP can access libraries such as torch and torch.nn, making it possible to run complex deep-learning models without leaving the PHP environment.
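The PHP listing in this article adds two tensors with `$operator->__add__($x, $this->pos_emb)` after importing Python's standard `operator` module, presumably because the `+` operator is not applied to imported Python objects from PHP (that reason is an assumption, not something the source states). The Python-side behaviour it relies on can be sketched as follows. The `Vec` class is a hypothetical stand-in for a tensor.

```python
import operator


class Vec:
    """Tiny stand-in for a tensor (hypothetical, for illustration only)."""

    def __init__(self, values):
        self.values = list(values)

    def __add__(self, other):
        # Element-wise addition, the behaviour that `+` dispatches to.
        return Vec(a + b for a, b in zip(self.values, other.values))


x = Vec([1.0, 2.0, 3.0])
pos = Vec([0.5, 0.5, 0.5])

# The function form and the operator form are equivalent in Python.
via_operator = operator.__add__(x, pos)
via_plus = x + pos

print(via_operator.values)  # prints: [1.5, 2.5, 3.5]
```

Because `operator` exposes each Python operator as an ordinary function, any module imported through `PyCore::import()` can be used the same way from PHP.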
Installation
First install the required Python packages, for example:
pip install torch
A typical installation log (truncated) looks like:
Collecting torch
  Downloading torch-2.4.0-cp39-cp39-manylinux1_x86_64.whl (797.2 MB)
Collecting nvidia-cufft-cu12==11.0.2.54
  Downloading nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)
... (additional dependencies) ...
Successfully installed torch-2.4.0 filelock-3.15.4 fsspec-2024.6.1 ...
PHP Implementation
<?php
declare(strict_types=1);

/**
 * ViT class defines the Vision Transformer structure.
 */
class Vit {
    private mixed $emb_size;
    private int $patch_size;
    private int $patch_count;
    private $conv;
    private $patch_emb;
    private $cls_token;
    private $pos_emb;
    private $tranformer_enc;
    private $cls_linear;
    private $torch; // imported torch module
    private $nn;    // imported torch.nn module

    /**
     * Constructor initializes model parameters and layers.
     * @param int $emb_size Embedding size, default 16.
     */
    public function __construct($emb_size = 16) {
        $this->torch = PyCore::import('torch');
        $this->nn = PyCore::import('torch.nn');
        $this->emb_size = $emb_size;
        $this->patch_size = 4;
        $this->patch_count = intdiv(28, $this->patch_size);
        $this->conv = $this->nn->Conv2d(
            in_channels: 1,
            out_channels: pow($this->patch_size, 2),
            kernel_size: $this->patch_size,
            padding: 0,
            stride: $this->patch_size,
        );
        $this->patch_emb = $this->nn->Linear(pow($this->patch_size, 2), $this->emb_size);
        $this->cls_token = $this->torch->randn([1, 1, $this->emb_size]);
        $this->pos_emb = $this->torch->randn([1, pow($this->patch_count, 2) + 1, $this->emb_size]);
        $encoder_layer = $this->nn->TransformerEncoderLayer(
            $this->emb_size, 2,
            dim_feedforward: 2 * $this->emb_size,
            dropout: 0.1,
            activation: 'relu',
            layer_norm_eps: 1e-5,
            batch_first: true
        );
        $this->tranformer_enc = $this->nn->TransformerEncoder($encoder_layer, 3);
        $this->cls_linear = $this->nn->Linear($this->emb_size, 10);
    }

    /**
     * Forward pass of the model.
     * @param mixed $x Input tensor.
     * @return mixed Model output.
     */
    public function forward($x) {
        $operator = \PyCore::import('operator');
        $x = $this->conv->forward($x);
        $batch_size = $x->size(0);
        $out_channels = $x->size(1);
        $height = $x->size(2);
        $width = $x->size(3);
        $x = $x->view($batch_size, $out_channels, $height * $width);
        $x = $x->permute([0, 2, 1]);
        $x = $this->patch_emb->forward($x);
        $cls_token = $this->cls_token->expand([$x->size(0), 1, $x->size(2)]);
        $x = $this->torch->cat([$cls_token, $x], 1);
        $x = $operator->__add__($x, $this->pos_emb);
        $x = $this->tranformer_enc->forward($x);
        return $this->cls_linear->forward($x->select(1, 0));
    }
}

// Import torch library
$torch = PyCore::import('torch');
// Initialize ViT model
$vit = new Vit();
// Create a random input tensor (5, 1, 28, 28)
$x = $torch->rand(5, 1, 28, 28);
// Forward pass
$y = $vit->forward($x);
// Print result
PyCore::print($y);
Running the PHP Code
# php ViT.php
tensor([[ 1.4124e-01, -2.2445e-01, -4.8343e-02, 1.0453e+00, 2.6407e-01,
-1.0721e+00, -4.5355e-01, 9.3695e-01, 2.0814e-01, -6.9242e-01],
[ 1.3197e-01, -1.7860e-01, -3.5619e-02, 1.0052e+00, 3.5701e-01,
-1.0619e+00, -5.5952e-01, 8.9957e-01, 2.2079e-01, -7.3373e-01],
...])
Python Reference Implementation
from torch import nn
import torch


class ViT(nn.Module):
    def __init__(self, emb_size=16):
        super().__init__()
        self.patch_size = 4
        self.patch_count = 28 // self.patch_size
        self.conv = nn.Conv2d(in_channels=1, out_channels=self.patch_size**2,
                              kernel_size=self.patch_size, padding=0, stride=self.patch_size)
        self.patch_emb = nn.Linear(in_features=self.patch_size**2, out_features=emb_size)
        self.cls_token = nn.Parameter(torch.rand(1, 1, emb_size))
        self.pos_emb = nn.Parameter(torch.rand(1, self.patch_count**2 + 1, emb_size))
        self.tranformer_enc = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=emb_size, nhead=2, batch_first=True),
            num_layers=3)
        self.cls_linear = nn.Linear(in_features=emb_size, out_features=10)

    def forward(self, x):
        x = self.conv(x)
        x = x.view(x.size(0), x.size(1), self.patch_count**2)
        x = x.permute(0, 2, 1)
        x = self.patch_emb(x)
        cls_token = self.cls_token.expand(x.size(0), 1, x.size(2))
        x = torch.cat((cls_token, x), dim=1)
        x = self.pos_emb + x
        y = self.tranformer_enc(x)
        return self.cls_linear(y[:, 0, :])


if __name__ == '__main__':
    vit = ViT()
    x = torch.rand(5, 1, 28, 28)
    y = vit(x)
    print(y.shape)
Application Scenarios and Significance
PHP is widely used for web development but lacks native deep‑learning support. By using phpy, developers can directly call Python frameworks such as PyTorch or TensorFlow, integrating sophisticated AI algorithms into PHP applications—for example, real‑time prediction services or complex data‑processing pipelines.
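For a prediction service, the raw scores returned by the model still have to be turned into a class label. The following is a minimal sketch of that last step in plain Python, using the first row of the sample output shown earlier in this article. The `softmax` helper is an assumption for illustration, and because the model here is untrained (its weights are random), the resulting label carries no meaning.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# First row of the sample output from the PHP run (10 class scores).
logits = [0.14124, -0.22445, -0.048343, 1.0453, 0.26407,
          -1.0721, -0.45355, 0.93695, 0.20814, -0.69242]

probs = softmax(logits)
# Index of the highest probability is the predicted class.
predicted = max(range(len(probs)), key=probs.__getitem__)

print(predicted)  # prints: 3
```

In a real service the same step would run on the output of a trained model, and the class index would be mapped to a human-readable label before being returned to the client.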
Conclusion
With the phpy extension, PHP developers can run Python-based deep-learning models such as Vision Transformer from within PHP, which extends PHP's capabilities and opens new possibilities for AI-enhanced web applications. As AI continues to evolve, closer integration between PHP and Python may enable further innovation.
Source: https://segmentfault.com/a/1190000045240156