Intel has launched LLM Scaler version 1.0, which, according to the company, was developed with ease of use and industry standards in mind. Its optimizations are said to offer up to an 80% performance increase through multi-GPU scaling and PCIe P2P data transfers.
Intel unveiled Project Battlematrix alongside its Arc Pro GPUs at Computex 2025. It was designed as a complete solution for inference workstations running multiple Arc Pro GPUs.
In its roadmap, the company promised to deliver its first container deployment: an "inference optimized" container with vLLM staging and basic telemetry support in the third quarter.

That promise has now been fulfilled with LLM Scaler v1.0, which can be downloaded from Intel's page on GitHub. It is a win for a company in need of victories.
The software is delivered as a container and brings reliability and enterprise-grade management features such as ECC, SR-IOV, telemetry, and remote firmware updates.
Containers are software packages that bundle all the elements needed to run in any environment. They virtualize at the operating-system level and can run anywhere, from a private data center to the public cloud or even a developer's personal laptop.
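As a purely illustrative, config-style sketch of what deploying such a container typically looks like: the image name, tag, and flags below are hypothetical placeholders, not Intel's actual published commands, which are documented on Intel's GitHub page.

```shell
# Hypothetical sketch only: registry, image name, and device flags are
# placeholders, not Intel's real published invocation.
docker pull example.registry.io/llm-scaler:1.0   # fetch the container image
docker run --rm -it \
  --device /dev/dri \                            # expose the GPUs to the container
  -p 8000:8000 \                                 # publish the inference API port
  example.registry.io/llm-scaler:1.0
```

Because everything the workload needs ships inside the image, the same command works on a workstation, in a data center, or in the cloud.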
Full List of Resources and Optimizations

The package includes the oneCCL benchmark tool for qualification and the XPU Manager, which allows monitoring and managing GPU power, GPU firmware updates, GPU diagnostics, and GPU memory bandwidth.
As for vLLM itself, the additions are as follows:
- TPOP performance optimization for long input lengths (>4K): up to 1.8x improvement for 40K sequence lengths on the 32B KPI model and up to 4.2x for 40K sequence lengths on the 70B KPI model;
- Performance optimizations with a 10% improvement in output throughput for 8B–32B KPI models compared to the previous version;
- Online quantization per layer to reduce required GPU memory;
- PP (pipeline parallelism) support in vLLM (experimental);
- torch.compile (experimental);
- Speculative decoding (experimental);
- Support for embedding and reranker models;
- Improved multimodal model support;
- Automatic detection of maximum model length;
- Data parallelism support.
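To make the parallelism and quantization items above concrete, here is a hedged sketch using upstream vLLM's standard `serve` CLI; the model name is just an example, and the flags available in Intel's container build may differ from upstream vLLM.

```shell
# Hedged sketch based on upstream vLLM's serve CLI; Intel's build may
# expose different or additional flags. The model name is illustrative.
vllm serve meta-llama/Llama-3.1-8B-Instruct \
  --pipeline-parallel-size 2 \   # experimental pipeline parallelism across GPUs
  --quantization fp8             # online quantization to reduce GPU memory use
```

Pipeline parallelism splits the model's layers across GPUs, which is what makes multi-GPU Arc Pro workstations the target use case here.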

Following the previously published roadmap, this update will be followed by a more robust container release in the same quarter, which should offer improved performance and vLLM serving.
Finally, in the fourth quarter, Intel plans to launch the full feature set.
A Difficult Year

Despite the release, it has not been an easy year for Intel. The company's shares have hit their lowest value in 16 years. In addition, the company changed CEOs in March, which gave its stock some breathing room, but just last week Trump called for the resignation of Lip-Bu Tan.
The two have since talked and smoothed things over, but that does not change the fact that the conflict undermines the company's performance and the market's perception of its value.
Also this week, Craig Barrett, former CEO of Intel, presented a bold plan for the company's recovery. Part of his plan is to take advantage of Trump's interest in strengthening domestic industry and securing a domestic supply of chips for companies such as Apple, Google, and Nvidia.
Source: Intel's page on GitHub.
Source: https://www.adrenaline.com.br/intel/atualizacao-para-arc-pro-gpus-traz-ganho-de-80-e-novos-recursos-no-project-battlematrix/
