


Videocard Intel 945.14.324 Xp Exe

Newsgroups comp.lang.basic.visual.misc
Date 2023-12-26 14:17 -0800
Message-ID <7c286d24-d9cd-439b-a13d-76f983ee1bfen@googlegroups.com> (permalink)
Subject Videocard Intel 945.14.324 Xp Exe
From Janette Leupold <leupoldjanette@gmail.com>



We'll start with the mobile benchmarks, since Intel used its two high-end models for these. Based on the numbers, Intel suggests its A770M can outperform the RTX 3060 mobile, and the A730M can outperform the RTX 3050 Ti mobile. The overall scores put the A770M 12% ahead of the RTX 3060, and the A730M was 13% ahead of the RTX 3050 Ti. However, looking at the individual game results, the A770M was anywhere from 15% slower to 30% faster, and the A730M was 21% slower to 48% faster.



That's a big spread in performance, and tweaks to some settings could have a significant impact on the fps results. Still, overall the list of games and settings used here looks pretty decent. However, Intel tested the Nvidia GPUs in laptops equipped with the older Core i7-11800H CPU, and then used the latest and greatest Core i9-12900HK for the A770M and the Core i7-12700H for the A730M. There's no question that the Alder Lake CPUs are faster than the previous generation Tiger Lake variants, though without doing our own testing we can't say for certain how much CPU bottlenecks come into play.



There's also the question of how much power the various chips used, as the Nvidia GPUs have a wide power range. The RTX 3050 Ti can run at anywhere from 35W to 80W (Intel used a 60W model), and the RTX 3060 mobile has a range from 60W to 115W (Intel used an 85W model). Intel's Arc GPUs also have a power range, from 80W to 120W on the A730M and from 120W to 150W on the A770M. While Intel didn't specifically state the power level of its GPUs, even the minimum Arc ratings (80W and 120W) exceed the 60W and 85W Nvidia configurations, so the Arc GPUs would have to be running at higher power in both cases.










Switching over to the desktop side of things, Intel provided the above A380 benchmarks. Note that this time the target is much lower, with the GTX 1650 and RX 6400 budget GPUs going up against the A380. Intel still has higher-end cards coming, but here's how it looks in the budget desktop market.



Even with the usual caveats about manufacturer provided benchmarks, things aren't looking too good for the A380. The Radeon RX 6400 delivered 9% better performance than the Arc A380, with a range of -9% to +31%. The GTX 1650 did even better, with a 19% overall margin of victory and a range of just -3% up to +37%.



And look at the list of games: Age of Empires 4, Apex Legends, DOTA 2, GTAV, Naraka Bladepoint, NiZhan, PUBG, Warframe, The Witcher 3, and Wolfenstein Youngblood? Some of those are more than five years old, several are known to be pretty light in terms of requirements, and in general that's not a list of demanding titles. We get the idea of going after esports competitors, sort of, but wouldn't a serious esports gamer already have something more potent than a GTX 1650?



Keep in mind that Intel potentially has a part that will have four times as much raw compute, which we expect to see in an Arc A770 with a fully enabled ACM-G10 chip. If drivers and performance don't hold it back, such a card could still theoretically match the RTX 3070 and RX 6700 XT, but drivers are very much a concern right now.



On that note, our own Arc A380 review has a slightly different result. We tested eight standard games at 1080p medium, 1080p ultra, and 1440p ultra. Here's what our testing looks like, which came a month or two after Intel's initial tests and used newer drivers.


Xe-core represents just one of the building blocks used for Intel's Arc GPUs. Like previous designs, the next level up from the Xe-core is called a render slice (analogous to an Nvidia GPC, sort of) that contains four Xe-core blocks. In total, a render slice contains 64 Vector and Matrix Engines, plus additional hardware. That additional hardware includes four ray tracing units (one per Xe-core), geometry and rasterization pipelines, samplers (TMUs, aka Texture Mapping Units), and the pixel backend (ROPs).



The above block diagrams may or may not be fully accurate down to the individual block level. For example, looking at the diagrams, it would appear each render slice contains 32 TMUs and 16 ROPs. That would make sense, but Intel has not yet confirmed those numbers (even though that's what we used in the above specs table).
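The per-slice counts described above scale linearly with the number of render slices, which can be sketched in a few lines of Python. This is only an illustration: the class and constant names are hypothetical, the per-Xe-core Vector Engine count is derived from the stated 64 per slice, and the TMU and ROP figures are the unconfirmed numbers read off the diagrams.

```python
from dataclasses import dataclass

# Per-slice building blocks as described in the text.
XE_CORES_PER_SLICE = 4
VECTOR_ENGINES_PER_XE_CORE = 16   # derived: 64 per slice / 4 Xe-cores
RTUS_PER_XE_CORE = 1              # one ray tracing unit per Xe-core
TMUS_PER_SLICE = 32               # read off the diagrams, not confirmed by Intel
ROPS_PER_SLICE = 16               # read off the diagrams, not confirmed by Intel


@dataclass(frozen=True)
class ArcConfig:
    """Hypothetical helper that scales per-slice counts to a whole GPU."""
    render_slices: int

    @property
    def xe_cores(self) -> int:
        return self.render_slices * XE_CORES_PER_SLICE

    @property
    def vector_engines(self) -> int:
        return self.xe_cores * VECTOR_ENGINES_PER_XE_CORE

    @property
    def rtus(self) -> int:
        return self.xe_cores * RTUS_PER_XE_CORE

    @property
    def tmus(self) -> int:
        return self.render_slices * TMUS_PER_SLICE

    @property
    def rops(self) -> int:
        return self.render_slices * ROPS_PER_SLICE


if __name__ == "__main__":
    for slices in (2, 8):
        cfg = ArcConfig(slices)
        print(slices, cfg.xe_cores, cfg.vector_engines, cfg.rtus, cfg.tmus, cfg.rops)
```

With eight render slices this gives 32 Xe-cores and 32 RTUs; with two render slices it gives 8 Xe-cores and 128 Vector Engines, matching the figures quoted elsewhere in the article.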



The ray tracing units (RTUs) are another interesting item. Intel detailed their capabilities and says each RTU can do up to 12 ray/box BVH intersections per cycle, along with a single ray/triangle intersection. There's dedicated BVH hardware as well (unlike on AMD's RDNA 2 GPUs), so a single Intel RTU should pack substantially more ray tracing power than a single RDNA 2 ray accelerator or maybe even an Nvidia RT core. The catch is that the maximum number of RTUs is only 32, whereas AMD has up to 80 ray accelerators and Nvidia has 84 RT cores. But Intel isn't really looking to compete with the top cards this round.
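To put the per-RTU rates in context, here is a rough Python sketch of the peak per-cycle intersection throughput implied by the figures above. The function name is hypothetical, and these are theoretical peaks only: clock speed, memory behavior, and BVH quality are ignored, and the numbers are not comparable across vendors.

```python
# Per-RTU peak rates as stated by Intel in the text above.
BOX_TESTS_PER_RTU_PER_CYCLE = 12
TRIANGLE_TESTS_PER_RTU_PER_CYCLE = 1


def peak_intersections_per_cycle(rtus: int) -> dict:
    """Return theoretical peak intersection tests per clock cycle for a GPU."""
    return {
        "ray_box": rtus * BOX_TESTS_PER_RTU_PER_CYCLE,
        "ray_triangle": rtus * TRIANGLE_TESTS_PER_RTU_PER_CYCLE,
    }


if __name__ == "__main__":
    # 32 RTUs is the maximum configuration mentioned in the text.
    print(peak_intersections_per_cycle(32))
```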



In our testing of the Arc A380, we found ray tracing performance was relatively weak, which is understandable considering its eight RTUs. However, thanks to the architecture and likely the 6GB of VRAM, ray tracing performance did tend to match or even exceed AMD's RX 6500 XT. Again, the high-end cards could end up being quite decent, and Intel claims the A750 can more than match the RTX 3060 in DXR performance while the A770 should be closer to the RTX 3060 Ti or even RTX 3070.


More important than how it works will be how many game developers choose to use XeSS. They already have access to both DLSS and AMD FSR, which target the same problem of boosting performance and image quality. Adding a third option, from the newcomer to the dedicated GPU market no less, seems like a stretch for developers. However, Intel does offer a potential advantage over DLSS.



XeSS is designed to work in two modes. The highest performance mode utilizes the XMX hardware to do the upscaling and enhancement, but of course, that would only work on Intel's Arc GPUs. That's the same problem as DLSS, except with zero existing installation base, which would be a showstopper in terms of developer support. But Intel has a solution: XeSS will also work, in a lower performance mode, using DP4a instructions (a dot product of four INT8 values packed into a single 32-bit register, accumulated into a 32-bit result).
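For readers unfamiliar with DP4a, the operation can be emulated in plain Python. This is a sketch of the arithmetic only, not of how XeSS uses it: the helper names are hypothetical, the values are treated as signed INT8, and the 32-bit wraparound of a real hardware accumulator is not modeled.

```python
import struct


def pack_int8x4(values):
    """Pack four signed 8-bit integers into one unsigned 32-bit word."""
    return struct.unpack("<I", struct.pack("<4b", *values))[0]


def unpack_int8x4(word):
    """Split a 32-bit word back into four signed 8-bit integers."""
    return struct.unpack("<4b", struct.pack("<I", word))


def dp4a(a_word, b_word, acc):
    """Emulate DP4a: dot product of two packed INT8x4 words plus an accumulator.

    Note: real hardware accumulates into a 32-bit register; overflow
    wraparound is not modeled in this sketch.
    """
    a = unpack_int8x4(a_word)
    b = unpack_int8x4(b_word)
    return acc + sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    a = pack_int8x4([1, 2, 3, 4])
    b = pack_int8x4([5, 6, 7, 8])
    print(dp4a(a, b, 10))
```

The appeal for inference workloads is that a single instruction performs four multiplies and four adds on data that fits in one ordinary 32-bit register.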



DP4a is widely supported by other GPUs, including Intel's previous generation Xe LP and multiple generations of AMD and Nvidia GPUs (Nvidia Pascal and later, or AMD Vega 20 and later), which means XeSS in DP4a mode will run on virtually any modern GPU. Support might not be as universal as AMD's FSR, which runs in shaders and basically works on any DirectX 11 or later capable GPU as far as we're aware, but quality should be better than FSR 1.0 and might even take on FSR 2.0 as well. It would also be very interesting if Intel supported Nvidia's Tensor cores, through DirectML or a similar library, but that wasn't discussed.



The big question will still be developer uptake. We'd love to see similar quality to DLSS 2.x, with support covering a broad range of graphics cards from all competitors. That's definitely something Nvidia is still missing with DLSS, as it requires an RTX card. But RTX cards already make up a huge chunk of the high-end gaming PC market, probably around 90% or more (depending on how you quantify high-end). So Intel basically has to start from scratch with XeSS, and that makes for a long uphill climb.






Besides the wafer shot, Intel also provided these two die shots for Xe HPG. The larger die has eight clusters in the center area that would correlate to the eight render slices. The memory interfaces are along the bottom edge and the bottom half of the left and right edges, and there are four 64-bit interfaces, for 256-bit total. Then there's a bunch of other stuff that's a bit more nebulous, for video encoding and decoding, display outputs, etc.



The smaller die has two render slices, giving it just 128 Vector Engines. It also only has a 96-bit memory interface (the blocks in the lower-right edges of the chip), which could put it at a disadvantage relative to other cards. Then there's the other 'miscellaneous' bits and pieces, for things like the QuickSync Video Engine. Obviously, performance will be substantially lower than the bigger chip.



While the smaller chip appears to be slower than all the current RTX 30-series GPUs, it does put Intel in an interesting position. The A380 checks in at a theoretical 4.1 TFLOPS, which means it ought to be able to compete with a GTX 1650 Super, with additional features like AV1 encoding/decoding support that no other GPU currently has. 6GB of VRAM also gives Intel a potential advantage, and on paper the A380 ought to land closer to the RX 6500 XT than the RX 6400.
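The 4.1 TFLOPS figure can be reproduced with simple arithmetic. The sketch below is only a back-of-the-envelope check: the article gives the 128 Vector Engines, while the 8 FP32 lanes per Vector Engine, the 2 FLOPs per lane per clock (one fused multiply-add), and the 2.0 GHz clock are assumptions that are not stated in the text. The function name is hypothetical.

```python
def fp32_tflops(vector_engines, clock_ghz, lanes_per_engine=8, flops_per_lane=2):
    """Theoretical peak FP32 throughput in TFLOPS.

    lanes_per_engine, flops_per_lane, and the clock are assumptions,
    not figures taken from the article.
    """
    gflops = vector_engines * lanes_per_engine * flops_per_lane * clock_ghz
    return gflops / 1000.0


if __name__ == "__main__":
    # 128 Vector Engines is the count given for the smaller die.
    print(round(fp32_tflops(128, 2.0), 1))
```

Under the same assumptions, a chip with 512 Vector Engines at the same clock would come out at four times this figure, which is where the "four times as much raw compute" estimate comes from.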



That's not currently the case, according to Intel's own benchmarks as well as our own testing (see above), but perhaps further tuning of the drivers could give a solid boost to performance. We certainly hope so, but let's not count those chickens before they hatch.


What do you mean? Your Apple laptop does not have an Intel GPU. (Edit: unless I misunderstood, and you have an old pre-M1 MacBook, but then forget about running any kind of GPU computing on there). There is no way to add a graphics card internally or externally to any recent Apple computer. You are stuck with the integrated Apple Silicon graphics system.


Integrated graphics (including Intel's) offer the same types of performance benefits that one would find in discrete graphics. They also have some advantages because they are on the same die as the CPU. Recently, integrated GPUs have become capable of zero-copy operations, which means you aren't limited by host-to-device transfer bandwidth in the same way you are with dedicated GPUs. Modern integrated GPUs can access the same address space as your CPU; in OpenCL, you release and acquire this memory in a similar way to doing DMA streaming from host to device in CUDA/OpenCL. As a result, you can often get more of a speedup with certain types of batch SIMD processing than you could with a discrete card.


I suspect that Blender may never add support for Intel OpenCL, because of the effort and lack of power of their integrated GPUs compared to discrete cards. It wouldn't help the majority of their users, and certainly not their core users.


