• Worked on the firmware side a bit. The microcontroller now reads the VSA-100 I2C temperature sensor and its own internal temperature sensor, and monitors the PGOOD signals of the PSUs.
    If the VSA or MCU temperature is greater than a certain threshold, or if one of the power rails goes down (e.g. due to overtemperature), the MCU immediately turns off the power rails and sends an emergency shutdown signal to the laptop. This is tested and works fine.
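    A minimal sketch of that shutdown condition, written in Python for readability rather than as the actual firmware. The threshold values and the function name are assumptions for illustration; the post does not state the real limits.

```python
VSA_TEMP_LIMIT_C = 85  # assumed threshold, not taken from the firmware
MCU_TEMP_LIMIT_C = 85  # assumed threshold, not taken from the firmware


def must_shut_down(vsa_temp_c, mcu_temp_c, pgood_signals):
    """Return True if either temperature exceeds its limit or any PGOOD is low."""
    over_temp = vsa_temp_c > VSA_TEMP_LIMIT_C or mcu_temp_c > MCU_TEMP_LIMIT_C
    rail_down = not all(pgood_signals)
    return over_temp or rail_down
```

    When this returns True, the MCU would turn off the rails and signal the laptop, as described above.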
    The fan is now controlled depending on the VSA-100 temperature. For now it's just a simple LUT; I'll implement some hysteresis later.
    Backlight control was switched back to the MCU, and the scaler backlight control is fed into the MCU. The scaler can override the MCU backlight control only when it wants to turn it off.
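    For the fan control, a LUT with hysteresis could look roughly like the sketch below. The table values, the 3 °C hysteresis band and the function names are all made up for illustration; only the idea (step up immediately, step down only after the temperature has dropped a bit further) is the point.

```python
# Hypothetical table: (temperature threshold in deg C, fan duty in %).
FAN_LUT = [(0, 0), (40, 30), (50, 50), (60, 75), (70, 100)]
HYSTERESIS_C = 3  # assumed: how far below a threshold we must fall to step down


def lut_index(temp_c):
    """Index of the highest LUT entry whose threshold is <= temp_c (0 if none)."""
    index = 0
    for i, (threshold, _duty) in enumerate(FAN_LUT):
        if temp_c >= threshold:
            index = i
    return index


def next_fan_index(temp_c, current_index):
    """Step up immediately; step down only once the temperature is
    HYSTERESIS_C below the threshold of the current step."""
    target = lut_index(temp_c)
    if target > current_index:
        return target
    if target < current_index:
        # Looking up temp + hysteresis keeps the current step until
        # the temperature has really dropped below the band.
        return min(current_index, lut_index(temp_c + HYSTERESIS_C))
    return current_index
```

    With these example values, a fan sitting at the 50 °C step stays there at 49 °C and only drops to the 40 °C step once the sensor reads below 47 °C, which avoids the fan speed flapping around a threshold.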

    Now for the more interesting part. The PCIe to PCI bridge has 5 GPIOs. Two of them are used for the I2C EEPROM that stores the bridge configuration. The remaining 3 GPIOs I used to implement a bidirectional communication bus between the bridge and the MCU. These pins are set up as RX, TX and CLOCK. Both the MCU and the host side implement a state machine that is basically a shift register and packages/parses the data.
    On the host side, by reading/writing the PCI configuration registers of the PCIe-PCI bridge, the GPIOs are configured as input/output and their values can be set or read. The MCU also stores configuration persistently in the FRAM.
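    To illustrate the shift-register idea, here is a small Python model of one direction of such a clocked link. The frame layout (start byte, command, value, XOR checksum), the MSB-first bit order and all names are assumptions for the sake of the example; the post does not describe the real frame format.

```python
START_BYTE = 0xA5  # assumed frame marker, not from the original design


def build_frame(command, value):
    """Package a command/value pair as [START, command, value, checksum]."""
    command &= 0xFF
    value &= 0xFF
    return [START_BYTE, command, value, START_BYTE ^ command ^ value]


def shift_out(frame):
    """Yield (clock, data) pin states, MSB first, data valid on the rising edge."""
    for byte in frame:
        for bit in range(7, -1, -1):
            data = (byte >> bit) & 1
            yield (0, data)  # set the data pin while the clock is low
            yield (1, data)  # rising edge: the receiver samples here


class ShiftReceiver:
    """Samples the data pin on rising clock edges into an 8-bit shift register."""

    def __init__(self):
        self.prev_clock = 0
        self.shift_reg = 0
        self.bit_count = 0
        self.bytes = []

    def on_pins(self, clock, data):
        if clock == 1 and self.prev_clock == 0:  # rising edge detected
            self.shift_reg = ((self.shift_reg << 1) | data) & 0xFF
            self.bit_count += 1
            if self.bit_count == 8:
                self.bytes.append(self.shift_reg)
                self.bit_count = 0
        self.prev_clock = clock

    def parse(self):
        """Return (command, value) for a complete valid frame, else None."""
        if len(self.bytes) < 4:
            return None
        start, command, value, checksum = self.bytes[:4]
        if start != START_BYTE or checksum != (start ^ command ^ value):
            return None
        return (command, value)
```

    Feeding every pair from `shift_out(build_frame(0x10, 0x7F))` into `ShiftReceiver.on_pins` and then calling `parse()` gives back `(0x10, 0x7F)`. On the real hardware the host side would toggle the pins through the bridge's PCI configuration registers instead of a Python generator.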

    Now, this is the first time I've written a GUI application (or a Windows application, for that matter), so the design may not be the prettiest thing, but it works.
    I wrote this in Visual Studio Express C# 2008, which, unexpectedly, was a joy to use.

    It also runs in the system tray when minimized:

    Backlight control from the OS now works, the VSA-100 frequency can be changed, the VSA-100 core voltage can be set anywhere from 2.5V to 3.1V, and the framebuffer size can be switched between 32MB and 64MB without taking the laptop apart and flipping a switch.

    It can also read back the data that's persistently stored on the card: card model and revision, plus the temperature sensor readings.

  • Amazing!!! People here have been using the VSA-100 for 20 years, and now you just write a tool that can read the internal sensor? AFAIK that hasn't been done before either...

  • Thanks for the kind words!


    Tobi, sorry to disappoint: I'm not reading the internal VSA temperature. There's an external temperature sensor right underneath the VSA, in the middle of all the thermal vias.



    Edited once, last by sdz (June 21, 2024 at 6:30 AM).

  • Today I worked a bit on this, and I got 1920x1080 working perfectly (without breaking other resolutions).

    This will need further testing on multiple boards (there's only 1 at the moment), as the FPGA is running a bit out of spec at this resolution.

    I looked closer at the display and the image is still shifted to the right (not as much as before), so I started to investigate. It's either the VSA, my FPGA code, or the scaler.

    - Tested the scaler using the devboard I previously made, connected to a PC and to the laptop display. It looked fine.

    - Generated a test image from the FPGA (on the MXM card) in the laptop. It looked fine.

    - Plugged a V5 DVI into a PC, connected to the scaler devboard and to the laptop display: the offset is there.

    - V5 DVI in a PC, connected to a regular HDMI monitor: the offset is there.

  • Another small update, I made a board with VSA-100 REV 320, and 200MHz rated RAM (sourcing 200MHz BGA SDRAM is next to impossible...).


    With the current cooling, I was able to OC to 208MHz at 3.1V VCORE.


    I suspect 210MHz may be possible, with an improved cooling system.

    It also does 200MHz at 2.7V VCORE, which is really nice.

  • Very nice result and OC! I'm running an Avenger at 225MHz and get 310X points in 3DMark 01.

    It would be very interesting to see a 16-bit result with your VSA-100 to compare.

    Good Work my friend! :thumbup:

  • Thanks! That's a nice OC on your Avenger!

    Here are the results:


    1024x768, 16bpp, 166MHz, 2725

    1024x768, 16bpp, 200MHz, 3194

    1024x768, 16bpp, 208MHz, 3296


    I'm pretty sure that if the VSA had only 16MB of RAM, it would score below your V3.
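    As a quick sanity check on how the score scales with core clock, here is a small Python calculation using only the three results listed above. The function name is just for illustration, and 166MHz is treated as the nominal value.

```python
# 3DMark scores at 1024x768, 16bpp, keyed by core clock in MHz (from the post).
results = {166: 2725, 200: 3194, 208: 3296}


def scaling(base_mhz, oc_mhz):
    """Return (clock gain, score gain, score gain per unit of clock gain)."""
    clock_gain = oc_mhz / base_mhz - 1
    score_gain = results[oc_mhz] / results[base_mhz] - 1
    return clock_gain, score_gain, score_gain / clock_gain


for mhz in (200, 208):
    clock_gain, score_gain, efficiency = scaling(166, mhz)
    print(f"{mhz}MHz: clock +{clock_gain:.1%}, score +{score_gain:.1%}, "
          f"efficiency {efficiency:.2f}")
```

    Going from 166MHz to 200MHz is about a 20% clock increase for about a 17% higher score, so the score follows the core clock fairly closely but not one to one.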

  • Hm, this is really impressive! Thank you!!!

    Maybe it's not the memory that makes the VSA faster than the Avenger clock for clock.

    If I remember right, the Avenger has one pipeline with two TMUs, and the VSA has two pipelines with only one TMU per pipe.

    Maybe that gives it some advantages?

  • Purely from a fillrate perspective: with two pixel pipelines, the VSA-100 simply has double the theoretical pixel fillrate per clock cycle. An Avenger can push one pixel per clock, so at 166 MHz that's 166 MPixels/s. Given its two texel pipelines, the texture fillrate is naturally double that: 333 MTexels/s.

    The VSA-100 has the same fillrate for both pixels and texels, namely two per clock. So that's 333 MPixels/s and 333 MTexels/s per chip at 166 MHz.

    Let's remember 3dfx's slogan from the times before programmable GPUs: "Fillrate is king". ;)

    However, and here's where my speculation starts: having two pipelines means you enter the land of instruction-level parallelism and will start to suffer from the law of diminishing returns. This might hold the VSA-100 back. I'm not sure how much those diminishing returns differ between pixel and texel pipelines.

    There is also the possibility that the Avenger's pipelines are shorter than the VSA-100's. But I don't know enough to say anything meaningful about that.
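    The fillrate arithmetic above can be written down in a few lines of Python. The function and variable names are only illustrative. Note that plugging in a flat 166 MHz yields 332 rather than 333; the 333 figure presumably comes from the actual clock being closer to 166.67 MHz.

```python
def fillrates(clock_mhz, pixel_pipelines, tmus_per_pipeline):
    """Return theoretical (MPixels/s, MTexels/s) for a single chip."""
    mpixels = clock_mhz * pixel_pipelines
    mtexels = mpixels * tmus_per_pipeline
    return mpixels, mtexels


# Avenger: one pixel pipeline with two TMUs.
avenger = fillrates(166, pixel_pipelines=1, tmus_per_pipeline=2)
# VSA-100: two pixel pipelines with one TMU each.
vsa100 = fillrates(166, pixel_pipelines=2, tmus_per_pipeline=1)

print("Avenger:", avenger)
print("VSA-100:", vsa100)
```

    At the same clock both chips end up with the same theoretical texel rate; the VSA-100 only pulls ahead on pixel rate, so any advantage should show mostly where rendering is not limited by multitexturing.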


  • I noticed that 3DMark sometimes scored 50-80 points lower. This was due to the CPU clock varying quite a lot. When fully loading one core, that core's clock would randomly jump between 800MHz, 2.2GHz and 3.5GHz. Not a temperature or TDP issue. I did a BIOS update, which didn't do much, after which I managed to "fix" the issue using ThrottleStop.

    After that I upgraded the CPU from a 4710MQ to a 4810MQ, and installed Amigamerlin 3.1 R1 instead of SFFT 1.9.



    194 extra 3D Marks.


    I will repeat the benchmark for 16bpp soon.

  • Today I had some free time and decided to finally address the bug where the image is shifted 4 pixels to the right.

    First thought was that there is something wrong with the horizontal timings coming out of the VSA-100. The following is from the V2 documentation, but it also applies here (hsync polarity is inverted compared to the VSA-100, but that doesn't matter):

    Adjusting the horizontal back porch (hBackPorch in the above diagram) was a good place to start. I modified the FPGA code to shrink the back porch by doubling the horizontal sync pulse length (hSyncOn in the above diagram).

    FBI_HSYNC_IBUF is the horizontal sync signal coming from the VSA-100, hsync_out is the increased horizontal sync pulse that is fed to the TMDS encoder block, FBI_VIDEO_D0_IBUF is one of the 12 bits of RGB data sent by the VSA.
    This did absolutely nothing.

    Next I took a closer look at the blanking signal coming from the VSA:

    FBI_BLANK_IBUF is the blanking signal generated by the VSA, FBI_VIDEO_D0_IBUF is one of the 12 bits of RGB data sent by the VSA.
    The VSA sends the video data exactly 4 pixel clocks after the blanking signal is deasserted. The delay is always 4 pixel clocks, regardless of the resolution. This seems wrong, and 4 pixel clocks match the 4-pixel shift to the right, so I delayed the rising edge of the blanking signal by exactly 4 pixel clocks in the FPGA. Now it looks like this:

    blank_out is the delayed signal, that is then fed to the TMDS encoder block at the same time as the RGB data.
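    The behaviour of that delay can be modelled in a few lines of Python. This is only a software model of the idea, not the FPGA code: it assumes the signal is sampled once per pixel clock and that 1 means "blanking deasserted" (active video); the function name is made up.

```python
DELAY_CLOCKS = 4  # pixel clocks between blank deassert and the first video data


def delay_rising_edge(blank_in, delay=DELAY_CLOCKS):
    """Delay only the rising edge of a sampled signal by `delay` clocks.

    `blank_in` is a list of 0/1 samples, one per pixel clock. The output goes
    high once the input has been high for more than `delay` clocks, and goes
    low in the same clock as the input (the falling edge is not delayed).
    """
    out = []
    high_count = 0
    for sample in blank_in:
        high_count = high_count + 1 if sample else 0
        out.append(1 if high_count > delay else 0)
    return out


blank_in = [0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0]
blank_out = delay_rising_edge(blank_in)
print(blank_in)
print(blank_out)
```

    In this example the input rises at sample 2 and the output rises at sample 6, i.e. 4 clocks later, while both fall at sample 9.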


    This is how it looked before:

    And this is how it looks now:

    The sticker is there as a reference.


    This fixed the issue for all resolutions. Obviously, this fix applies only to the V4 M4800 and not to regular DVI V4 and V5 cards. In the future I'll try to find a universal fix.

    Edited once, last by sdz (October 4, 2024 at 1:12 AM).

  • sdz, did you see Anthony's V5 6k replica project? Maybe you could work with him to upgrade it with DVI. And maybe you could dig into the AAlchemy cards. Maybe we could get custom 8 or 16 VSA-100 cards... Or, even better, maybe better drivers ;)

  • Bier.jpg

    Yes, I have seen his 6000 cards.

    It's possible that such an upgrade will exist at some point.

    Making 8 or 16 VSA cards, while possible, would require a ton of work on drivers. It's not something that I'm willing to dedicate that much time to.

    Next I'd like to make a 6000 with native HDMI out, and some other goodies, but I need an original card for reverse engineering some bits, and no one is willing to sell.