Commit Graph

2145 Commits

Author SHA1 Message Date
Daniil Tatianin e5f2e4c696 pciinit: don't misalign large BARs
Previously we would unconditionally lower the alignment for large BARs
in case their alignment was greater than "pci_mem64_top >> 11", this
would make it impossible to use these devices by the kernel:
    [   13.821108] pci 0000:9c:00.0: can't claim BAR 1 [mem 0x66000000000-0x67fffffffff 64bit pref]: no compatible bridge window
    [   13.823492] pci 0000:9d:00.0: can't claim BAR 1 [mem 0x64000000000-0x65fffffffff 64bit pref]: no compatible bridge window
    [   13.824218] pci 0000:9e:00.0: can't claim BAR 1 [mem 0x62000000000-0x63fffffffff 64bit pref]: no compatible bridge window
    [   13.828322] pci 0000:8a:00.0: can't claim BAR 1 [mem 0x6e000000000-0x6ffffffffff 64bit pref]: no compatible bridge window
    [   13.830691] pci 0000:8b:00.0: can't claim BAR 1 [mem 0x6c000000000-0x6dfffffffff 64bit pref]: no compatible bridge window
    [   13.832218] pci 0000:8c:00.0: can't claim BAR 1 [mem 0x6a000000000-0x6bfffffffff 64bit pref]: no compatible bridge window

Fix it by only overwriting the alignment in case it's actually greater
than the desired by the BAR window.

Fixes: 96a8d130a8 ("be less conservative with the 64bit pci io window")
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-04-15 09:21:37 -04:00
Kevin O'Connor 731c88d503 stdvgaio: Only read/write one color palette entry at a time
Introduce stdvga_dac_read_many() and stdvga_dac_write_many() for
writing multiple dac palette entries.  Convert the stdvga_dac_read()
and stdvga_dac_write() low-level IO access functions in stdvgaio.c to
access just one color palette entry.

Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
2024-04-13 13:19:56 -04:00
Daniel Verkamp 5d87ff2542 vbe: Add VBE 2.0+ OemData field to struct vbe_info
Per the VBE 2.0 specification, the VBE controller information is 512
bytes long when the "VBE2" signature is provided, instead of the
original 256 bytes.

src/bootsplash.c uses the original pre-VBE-2.0 256-byte structure while
also filling in the "VBE2" signature, so a video BIOS that makes use of
the VBE2 OemData area could write past the end of the allocated region.

The original bootsplash code did not have this bug; it was introduced
when the bootsplash VBE structures were merged with the VGA ROM struct
definitions.

Fixes: 69e941c159 ("Merge bootsplash and VGA ROM vbe structure definitions")
Signed-off-by: Daniel Verkamp <daniel@drv.nu>
2024-03-10 13:00:27 -04:00
Igor Mammedov 163fd9f087 fix smbios blob length overflow
When tables are more than 64K, size of copied tables will be
truncated due to cast from u32 to u16, and as result only
a small portion of the tables will be copied in the end.
That leads to corrupted tables (a part from QEMU and
remainder is whatever was in memory block allocated for
the tables).

Fix it by making qtables_len 32bit int.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
2024-03-03 12:40:12 -05:00
Max Tottenham 82faf1d5c8 Add LBA 64bit support for reads beyond 2TB.
When booting from a >2TB drive/filesystem, it's possible what the
kernel/bootloader may be updated and written out at an LBA address
beyond what is normally accessible by the READ(10) SCSI commands.
If this happens to the kernel grub will fail to boot the kernel
as it will call into the BIOS with an LBA address >2TB, and the
BIOS will return an error. Per the SCSI spec, >2TB drives should
return 0XFFFFFFFF, and a READ CAPACITY(16) command should be issued
to determine the full size of the drive, READ(16) commands can then
be used in order to read data at LBA addresses beyond 2TB (64 bit
LBA addresses)

Signed-off-by: Max Tottenham <mtottenh@akamai.com>
Message-ID: <20240125150050.3775834-2-mtottenh@akamai.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2024-01-26 15:59:34 +01:00
Andrej Kruták 3f082f38bf Add AHCI Power ON + ICC_ACTIVE into port setup code
Windows appears to put the AHCI port into 'Partial power management state'
during reboot, the command puts it back into 'active state'.

AHCI/1: link down 0x00000231 (SCR STAT register)
 ->
AHCI/1: link up 0x00000133

Signed-off-by: Andrej Krutak andrej.krutak@sysgo.com
Message-ID: <1531455205.6484.1704814463638@ox.sysgo.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2024-01-25 10:12:22 +01:00
Mark Cave-Ayland 3ae8888611 esp-scsi: terminate DMA transfer when ESP data transfer completes
When the ESP data transfer completes indicated by the STAT_TC flag being set,
terminate the DMA transfer by issuing a DMA IDLE command. Otherwise in the case
where the guest sends a reset followed by an ESP command, the DMA signal remains
enabled and so the next SeaBIOS DMA transfer begins immediately when the next
ESP command is received rather than waiting until the data is ready and the DMA
command is issued.

With this fix it is possible to boot a Windows XP ISO to the installer and
complete a full installation within QEMU directly using SeaBIOS.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-ID: <20240101121942.383191-1-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2024-01-25 10:08:37 +01:00
Gerd Hoffmann a6ed6b701f limit address space used for pci devices.
For better compatibility with old linux kernels,
see source code comment.

Also rename some variables to make the code more
readable, following suggestions by Kevin.

Related (same problem in ovmf):
https://github.com/tianocore/edk2/commit/c1e853769046

Cc: Kevin O'Connor <kevin@koconnor.net>
Reported-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2023-11-13 13:49:59 +01:00
Gerd Hoffmann 1e1da7a963 check for e820 conflict
Add support to check for overlaps with e820 entries.
In case the 64bit pci io window has conflicts move it down.

The only known case where this happens is AMD processors
with 1TB address space which has some space just below
1TB reserved for HT.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2023-08-24 10:56:21 +02:00
Gerd Hoffmann ecc51f211f qemu: log reservations in fw_cfg e820 table
With loglevel 1 (same we use for RAM entries),
so it is included in the firmware log by default.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2023-08-24 10:56:21 +02:00
Gerd Hoffmann 96a8d130a8 be less conservative with the 64bit pci io window
Current seabios code will only enable and use the 64bit pci io window in
case it runs out of space in the 32bit pci mmio window below 4G.

This patch will also enable the 64bit pci io window when
  (a) RAM above 4G is present, and
  (b) the physical address space size is known, and
  (c) seabios is running on a 64bit capable processor.

This operates with the assumption that guests which are ok with memory
above 4G most likely can handle mmio above 4G too.

In case the 64bit pci io window is enabled also assign more memory to
prefetchable pci bridge windows and the complete 64bit pci io window.

The total mmio window size is 1/8 of the physical address space.
Minimum bridge windows size is 1/256 of the total mmio window size.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2023-08-24 10:56:21 +02:00
Gerd Hoffmann bcfed7e270 move 64bit pci window to end of address space
When the size of the physical address space is known (PhysBits is not
zero) move the 64bit pci io window to the end of the address space.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2023-08-24 10:56:21 +02:00
Gerd Hoffmann 90eeb0c855 detect physical address space size
Check for pae and long mode using cpuid.  If present also read the
physical address bits.  Apply some qemu sanity checks (see below).
Record results in PhysBits and LongMode variables.  In case we are not
sure what the address space size is leave the PhysBits variable unset.

On qemu we have the problem that for historical reasons x86_64
processors advertise 40 physical address space bits by default, even in
case the host supports less than that so actually using the whole
address space will not work.

Because of that the code applies some extra sanity checks in case we
find 40 (or less) physical address space bits advertised.  Only
known-good values (which is 40 for amd processors and 36+39 for intel
processors) will be accepted as valid.

Recommendation is to use 'qemu -cpu ${name},host-phys-bits=on' to
advertise valid physical address space bits to the guest.  Some distro
builds enable this by default, and most likely the qemu default will
change in near future too.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2023-08-24 10:56:21 +02:00
Gerd Hoffmann be84867613 better kvm detection
In case kvm emulates features of another hypervisor (for example hyperv)
two VMM CPUID blocks will be present, one for the emulated hypervisor
and one for kvm itself.

This patch makes seabios loop over the VMM CPUID blocks to make sure it
will properly detect kvm when multiple blocks are present.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2023-08-24 10:56:21 +02:00
Mark Cave-Ayland 7a4003be25 esp-scsi: handle non-DMA SCSI commands with no data phase
The existing esp-scsi state machine checks for the STAT_TC bit to exit state 1
but in the case where there is no data phase, a non-DMA command is executed
which doesn't set STAT_TC. This only works because QEMU currently always sets
STAT_TC just after issuing every SCSI command.

Update the esp-scsi state machine so that in the case where there is no data
phase, we immediately execute CMD_ICCS instead of waiting for STAT_TC to be
set which will never happen with a non-DMA CMD_SELATN command.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-ID: <20230807065300.366070-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2023-08-24 10:40:09 +02:00
Mark Cave-Ayland cf4b829f0c esp-scsi: check for INTR_BS/INTR_FC instead of STAT_TC for command completion
The ESP SELATN command used to send SCSI commands from the ESP to the SCSI bus
is not a DMA command and therefore does not affect the STAT_TC bit. The only
reason this works at all is due to a bug in QEMU which (currently) always
updates the STAT_TC bit in ESP_RSTAT regardless of the state of the ESP_CMD_DMA
bit.

According to the NCR datasheet [1] the INTR_BS/INTR_FC bits are set when the
SELATN command has completed, so update the existing logic to check for these
bits in ESP_RINTR instead. Note that the read of ESP_RINTR needs to be
restricted to state == 0 as reading ESP_RINTR resets the ESP_RSTAT register
which breaks the STAT_TC check when state == 1.

This commit also includes an extra read of ESP_INTR to clear all the interrupt
bits before submitting the SELATN command to ensure that we don't accidentally
immediately progress to the data phase handling logic where ESP_RINTR bits have
already been set by a previous ESP command.

[1] "NCR 53C94, 53C95, 53C96 Advanced SCSI Controller"
    NCR_53C94_53C95_53C96_Data_Sheet_Feb90.pdf

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20230807065300.366070-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2023-08-24 10:40:09 +02:00
Mark Cave-Ayland db50227d4e esp-scsi: flush FIFO before sending SCSI command
The ESP FIFO is used as a buffer for DMA requests and so isn't guaranteed to
be empty in the case of SCSI errors or a mixed DMA/non-DMA request. Flush the
FIFO before sending a SCSI command to guarantee that it is correctly
positioned at the start of the FIFO.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230807065300.366070-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2023-08-24 10:40:09 +02:00
Tony Titus via SeaBIOS 774a823a96 Increase BUILD_MAX_E820 to 128
For platforms with high number of numa nodes, 32 e820 entries are not
enough. Linux kernel sets the maximum e820 entries to a base value of
128. Setting BUILD_MAX_E820 to 128 to be in sync with this base value.

Signed-off-by: Tony Titus <tonydt@amazon.com>
Message-ID: <20230728044148.58041-1-tonydt@amazon.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2023-08-24 10:32:06 +02:00
Niklas Cassel via SeaBIOS 1281e340ad ahci: handle TFES irq correctly
According to AHCI 1.3.1, 5.3.8.1 RegFIS:Entry, if ERR_STAT is set in the
received FIS, the HBA shall jump to state ERR:FatalTaskfile, which will
raise a TFES IRQ.

This means that if ERR_STAT is set in the recevied FIS, PxIS.TFES will
be set, without either PxIS.DHRS or PxIS.PSS being set.

SeaBIOS function ahci_port_setup() will try to identify an AHCI device
by sending an ATAPI identify device command. However, such a command
will be aborted with ERR_STAT set for a regular (non-ATAPI) device.

ahci_command() already performs the correct error recovery steps when
status is correctly set, so simply modify ahci_command() to read the
correct status when PxIS.TFES is set.

It is safe to read PxTFD when PxIS.TFES is set, even for systems with a
port multiplier, see AHCI 1.3.1, 9.3.7 PxTFD Register Information:
"When a taskfile error occurs (PxIS.TFES is set to '1'), the host may
refer to the values in PxTFD. The values in PxTFD at this time are
guaranteed to correspond to the device that reported the taskfile error
condition."

Without this, each boot will be delayed by 32 seconds, waiting for the
AHCI command to timeout.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2023-06-21 22:02:32 -04:00
Lukas Stockner via SeaBIOS cd933454b5 virtio-blk: Fix integer overflow for large max IO sizes
When the maximum IO size supported by the virtio-blk backend is large
enough (>= 32MiB for 512B sectors), the computed blk_num_max will
overflow. In particular, if it's a multiple of 32MiB, blk_num_max
will end up as zero, causing IO requests to fail.

This is triggered by e.g. the SPDK virtio-blk vhost-user backend.

To fix it, just limit blk_num_max to 65535 before converting to u16.

Signed-off-by: Lukas Stockner <lstockner@genesiscloud.com>
2023-06-13 11:11:25 -04:00
José Martínez 4db444b9a7 Fix high memory zone initialization in CSM mode
malloc_high() cannot allocate any memory in CSM mode due to an empty
ZoneHigh. SeaBIOS cannot find any disk to boot from because device
initialization fails.

The bug was introduced in 1.16.1 (commit dc88f9b) when the meaning of
BUILD_MAX_HIGHTABLE changed but CSM code was not updated. This patch
reverts to the previous behavior by using BUILD_MIN_HIGHTABLE in CSM
methods.

Signed-off-by: José Martínez <xose@google.com>
2023-06-13 11:01:34 -04:00
David Woodhouse ea1b7a0733 xen: require Xen info structure at 0x1000 to detect Xen
When running under Xen, hvmloader places a table at 0x1000 with the e820
information and BIOS tables. If this isn't present, SeaBIOS will
currently panic.

We now have support for running Xen guests natively in QEMU/KVM, which
boots SeaBIOS directly instead of via hvmloader, and does not provide
the same structure.

As it happens, this doesn't matter on first boot. because although we
set PlatformRunningOn to PF_QEMU|PF_XEN, reading it back again still
gives zero. Presumably because in true Xen, this is all already RAM. But
in QEMU with a faithfully-emulated PAM config in the host bridge, it's
still in ROM mode at this point so we don't see what we've just written.

On reboot, however, the region *is* set to RAM mode and we do see the
updated value of PlatformRunningOn, do manage to remember that we've
detected Xen in CPUID, and hit the panic.

It's not trivial to detect QEMU vs. real Xen at the time xen_preinit()
runs, because it's so early. We can't even make a XENVER_extraversion
hypercall to look for hints, because we haven't set up the hypercall
page (and don't have an allocator to give us a page in which to do so).

So just make Xen detection contingent on the info structure being
present. If it wasn't, we were going to panic anyway. That leaves us
taking the standard QEMU init path for Xen guests in native QEMU,
which is just fine.

Untested on actual Xen but ObviouslyCorrect™.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2023-02-01 20:14:39 -05:00
Qi Zhou 645a64b491 usb: fix wrong init of keyboard/mouse's if first interface is not boot protocol
There is always some endpoint descriptors after each interface descriptor, We
should only decrement num_iface if interface type is USB_DT_INTERFACE, see
https://www.beyondlogic.org/usbnutshell/usb5.shtml#ConfigurationDescriptors

Signed-off-by: Qi Zhou <atmgnd@outlook.com>
2022-11-23 11:31:15 -05:00
Xuan Zhuo 3208b098f5 virtio: finalize features before using device
Under the standard of Virtio 1.0, the initialization process of the
device must first write sub-features back to device before
using device, such as finding vqs.

There are four places using vp_find_vq().

1. virtio-blk.pci:    put the code of finalizing features in front of using device
2. virtio-blk.mmio:   put the code of finalizing features in front of using device
3. virtio-scsi.pci:   is ok
4. virtio-scsi.mmio:  add the code of finalizing features before vp_find_vq()

Link: https://www.mail-archive.com/qemu-devel@nongnu.org/msg920776.html
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20221114035818.109511-3-xuanzhuo@linux.alibaba.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2022-11-23 09:28:54 +01:00
Xuan Zhuo 5ea5c64c20 virtio-mmio: read/write the hi 32 features for mmio
Under mmio, when we read the feature from the device, we should read the
high 32-bit part. Similarly, when writing the feature back, we should
also write back the high 32-bit feature.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20221114035818.109511-2-xuanzhuo@linux.alibaba.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2022-11-23 09:28:54 +01:00
Igor Mammedov 61e901bbaa acpi: parse Alias object
Since QEMU commit
  47a373faa6 (acpi: pc/q35: drop ad-hoc PCI-ISA bridge AML routines and let bus ennumeration generate AML)
SeaBIOS fails to parse ISA bridge AML with:

   parse_termlist: parse error, skip from 92/517
   ...
   ACPI: no PS/2 keyboard present

due to Alias term in DSDT which isn't handled by SeaBIOS properly.
Add dumb Alias parsing which just skips over term,
so the rest of AML could be parsed successfully.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reported-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20221118142755.3879231-1-imammedo@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2022-11-23 09:23:30 +01:00
Xiaofei Lee 85d56f812f virtio-blk: Fix incorrect type conversion in virtio_blk_op()
When using spdk aio bdev driver, the qemu command line like this:

qemu-system-x86_64 \
    -chardev socket,id=char0,path=/tmp/vhost.0 \
    -device vhost-user-blk-pci,id=blk0,chardev=char0 \
    ...

Boot failure message as below:

e820 map has 7 items:
  0: 0000000000000000 - 000000000009fc00 = 1 RAM
  1: 000000000009fc00 - 00000000000a0000 = 2 RESERVED
  2: 00000000000f0000 - 0000000000100000 = 2 RESERVED
  3: 0000000000100000 - 000000007ffdd000 = 1 RAM
  4: 000000007ffdd000 - 0000000080000000 = 2 RESERVED
  5: 00000000feffc000 - 00000000ff000000 = 2 RESERVED
  6: 00000000fffc0000 - 0000000100000000 = 2 RESERVED
enter handle_19:
  NULL
Booting from Hard Disk...
Boot failed: could not read the boot disk

Fixes: a05af290ba ("virtio-blk: split large IO according to size_max")

Acked-by: Andy Pei <andy.pei@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Xiaofei Lee <hbuxiaofei@gmail.com>
2022-11-22 13:11:49 -05:00
Gerd Hoffmann 46de2eec93 virtio-blk: use larger default request size
Bump default from 8 to 64 blocks.  Using 8 by default leads
to requests being splitted on qemu, which slows down boot.

Some (temporary) debug logging added showed that almost all
requests on a standard fedora install are less than 64 blocks,
so that should bring us back to 1.15 performance levels.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2022-07-07 12:05:19 +02:00
Gerd Hoffmann dc88f9b72d malloc: use large ZoneHigh when there is enough memory
In case there is enough memory installed use a large ZoneHigh.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2022-04-27 09:16:05 +02:00
Gerd Hoffmann 3b91e8e9fe malloc: use variable for ZoneHigh size
Use the variable highram_size instead of the BUILD_MAX_HIGHTABLE #define
for the ZoneHigh size. Initialize the new variable with the old #define,
so behavior does not change.

This allows to easily adjust the ZoneHigh size at runtime in a followup
patch.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2022-04-27 09:15:25 +02:00
Volker Rümelin 01774004c7 reset: force standard PCI configuration access
After a reset of a QEMU -machine q35 guest, the PCI Express
Enhanced Configuration Mechanism is disabled and the variable
mmconfig no longer matches the configuration register PCIEXBAR
of the Q35 chipset. Until the variable mmconfig is reset to 0,
all pci_config_*() functions no longer work.

The variable mmconfig is located in one of the read-only C-F
segments. To reset it the pci_config_*() functions are needed,
but they do not work.

Replace all pci_config_*() calls with Standard PCI Configuration
Mechanism pci_ioconfig_*() calls until mmconfig is overwritten
with 0 by a fresh copy of the BIOS.

This fixes

In resume (status=0)
In 32bit resume
Attempting a hard reboot
Unable to unlock ram - bridge not found

and a reset loop with QEMU -accel tcg.

Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
2022-04-04 17:13:00 -04:00
Volker Rümelin d24f42b0d8 pci: refactor the pci_config_*() functions
Split out the Standard PCI Configuration Access Mechanism
pci_ioconfig_*() functions from the pci_config_*() functions.
The standard PCI CAM functions will be used in the next patch.

Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
2022-04-04 17:13:00 -04:00
Florian Larysch 829b0f1a7c nvme: fix LBA format data structure
The LBA Format Data structure is dword-sized, but struct nvme_lba_format
erroneously contains an additional member, misaligning all LBAF
descriptors after the first and causing them to be misinterpreted.
Remove it.

Signed-off-by: Florian Larysch <fl@n621.de>
Reviewed-by: Alexander Graf <graf@amazon.com>
2022-02-03 17:50:00 -05:00
Jan Beulich via SeaBIOS dc776a2d9c nvme: avoid use-after-free in nvme_controller_enable()
Commit b68f313c91 ("nvme: Record maximum allowed request size")
introduced a use of "identify" past it being passed to free(). Latch the
value of interest into a local variable.

Reported-by: Coverity (ID 1497613)
Signed-off-by: Jan Beulich <jbeulich@suse.com>
2022-01-27 11:32:47 -05:00
Kevin O'Connor 15a102e062 sercon: Fix missing GET_LOW() to access rx_bytes
The variable rx_bytes is marked VARLOW, but there was a missing
GET_LOW() to access rx_bytes.  Fix by copying rx_bytes to a local
variable and avoid the repetitive segment memory accesses.

Reported-by: Gabe Black <gabe.black@gmail.com>
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
2022-01-27 11:28:41 -05:00
Kevin O'Connor 6d462830e7 nvme: Only allocate one dma bounce buffer for all nvme drives
There is no need to create multiple dma bounce buffers as the BIOS
disk code isn't reentrant capable.

Also, verify that the allocation succeeds.

Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
Reviewed-by: Alexander Graf <graf@amazon.com>
2022-01-27 11:26:11 -05:00
Kevin O'Connor f13b650015 nvme: Build the page list in the existing dma buffer
Commit 01f2736cc9 ("nvme: Pass large I/O requests as PRP lists")
introduced multi-page requests using the NVMe PRP mechanism. To store the
list and "first page to write to" hints, it added fields to the NVMe
namespace struct.

Unfortunately, that struct resides in fseg which is read-only at runtime.
While KVM ignores the read-only part and allows writes, real hardware and
TCG adhere to the semantics and ignore writes to the fseg region. The net
effect of that is that reads and writes were always happening on address 0,
unless they went through the bounce buffer logic.

This patch builds the PRP maintenance data in the existing "dma bounce
buffer" and only builds it when needed.

Fixes: 01f2736cc9 ("nvme: Pass large I/O requests as PRP lists")
Reported-by: Matt DeVillier <matt.devillier@gmail.com>
Signed-off-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
Reviewed-by: Alexander Graf <graf@amazon.com>
2022-01-27 11:26:01 -05:00
Kevin O'Connor 0a40653f30 nvme: Pass prp1 and prp2 directly to nvme_io_xfer()
When using a prp2 parameter, build it in nvme_prpl_xfer() and pass it
directly to nvme_io_xfer().

Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
Reviewed-by: Alexander Graf <graf@amazon.com>
2022-01-27 11:25:36 -05:00
Kevin O'Connor 9404f597b2 nvme: Convert nvme_build_prpl() to nvme_prpl_xfer()
Rename nvme_build_prpl() to nvme_prpl_xfer() and directly invoke
nvme_io_xfer() or nvme_bounce_xfer() from that function.

Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
Reviewed-by: Alexander Graf <graf@amazon.com>
2022-01-27 11:25:26 -05:00
Kevin O'Connor 4eff93e7b0 nvme: Add nvme_bounce_xfer() helper function
Move bounce buffer processing to a new helper function.

Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
Reviewed-by: Alexander Graf <graf@amazon.com>
2022-01-21 11:23:31 -05:00
Kevin O'Connor da18ec909a nvme: Rework nvme_io_readwrite() to return -1 on error
Rename nvme_io_readwrite() to nvme_io_xfer() and change it so it
implements the debugging dprintf() and it returns -1 on an error.

Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
Reviewed-by: Alexander Graf <graf@amazon.com>
2022-01-21 11:23:26 -05:00
Kevin O'Connor e4f02c1251 smm: Suppress gcc array-bounds warnings
Add a hack to suppress spurious gcc array-bounds warning (on at least
gcc v11).

Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
2022-01-21 11:08:33 -05:00
Kevin O'Connor 98dd53b994 memmap: Fix gcc out-of-bounds warning
Use a different definition for the linker script symbol to avoid a gcc
warning.

Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
2021-12-18 12:08:53 -05:00
Andy Pei a05af290ba virtio-blk: split large IO according to size_max
if driver reads data larger than VIRTIO_BLK_F_SIZE_MAX,
it will cause some issue to the DMA engine.

So when upper software wants to read data larger than
VIRTIO_BLK_F_SIZE_MAX, virtio-blk driver split one large
request into multiple smaller ones.

Signed-off-by: Andy Pei <andy.pei@intel.com>
Signed-off-by: Ding Limin <dinglimin@cmss.chinamobile.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2021-12-18 11:52:43 -05:00
Andy Pei 815d749865 virtio-blk: abstract a function named virtio_blk_op_one_segment to handle r/w request
abstract virtio-blk queue operation to form a function named virtio_blk_op_one_segment

Signed-off-by: Andy Pei <andy.pei@intel.com>
Signed-off-by: Ding Limin <dinglimin@cmss.chinamobile.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2021-12-18 11:52:36 -05:00
Andy Pei 27b573d4f5 virtio-blk: add feature VIRTIO_BLK_F_SIZE_MAX and VIRTIO_BLK_F_SEG_MAX
according to virtio spec, add feature VIRTIO_BLK_F_SIZE_MAX
and VIRTIO_BLK_F_SEG_MAX parse to virtio blk driver.

Signed-off-by: Andy Pei <andy.pei@intel.com>
Signed-off-by: Ding Limin <dinglimin@cmss.chinamobile.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2021-12-18 11:52:22 -05:00
Igor Mammedov b06d956d75 pci: let firmware reserve IO for pcie-pci-bridge
With [1] patch hotplug of rtl8139 succeeds, with caveat that it
fails to initialize IO bar, which is caused by [2] that makes
firmware skip IO reservation for any PCIe device, which isn't
correct in case of pcie-pci-bridge.
Fix it by exposing hotplug type and making IO resource optional
only if PCIe hotplug is in use.

[1]
 "pci: reserve resources for pcie-pci-bridge to fix regressed hotplug on q35"
[2]
Fixes: 76327b9f32 ("fw/pci: do not automatically allocate IO region for PCIe bridges")
Signed-off-by: Igor Mammedov imammedo@redhat.com
Tested-by: Laurent Vivier <lvivier@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
CC: mapfelba@redhat.com
CC: kraxel@redhat.com
CC: mst@redhat.com
CC: lvivier@redhat.com
CC: jusual@redhat.com
2021-12-18 11:49:22 -05:00
Igor Mammedov bba24ef84b pci: reserve resources for pcie-pci-bridge to fix regressed hotplug on q35
If QEMU is started with unpopulated pcie-pci-bridge with ACPI PCI
hotplug enabled (default since QEMU-6.1), hotplugging a PCI device
into one of the bridge slots fails due to lack of resources.

once linux guest is booted (test used Fedora 34), hotplug NIC from
QEMU monitor:
  (qemu) device_add rtl8139,bus=pcie-pci-bridge-0,addr=0x2

guest fails hotplug with:
  pci 0000:01:02.0: [10ec:8139] type 00 class 0x020000
  pci 0000:01:02.0: reg 0x10: [io  0x0000-0x00ff]
  pci 0000:01:02.0: reg 0x14: [mem 0x00000000-0x000000ff]
  pci 0000:01:02.0: reg 0x30: [mem 0x00000000-0x0003ffff pref]
  pci 0000:01:02.0: BAR 6: no space for [mem size 0x00040000 pref]
  pci 0000:01:02.0: BAR 6: failed to assign [mem size 0x00040000 pref]
  pci 0000:01:02.0: BAR 0: no space for [io  size 0x0100]
  pci 0000:01:02.0: BAR 0: failed to assign [io  size 0x0100]
  pci 0000:01:02.0: BAR 1: no space for [mem size 0x00000100]
  pci 0000:01:02.0: BAR 1: failed to assign [mem size 0x00000100]
  8139cp: 8139cp: 10/100 PCI Ethernet driver v1.3 (Mar 22, 2004)
  PCI Interrupt Link [GSIG] enabled at IRQ 22
  8139cp 0000:01:02.0: no MMIO resource
  8139cp: probe of 0000:01:02.0 failed with error -5

Reason for this is that commit [1] didn't take into account
pcie-pci-bridge, marking bridge as non hotpluggable instead of
handling it as possibly SHPC capable bridge.
Fix issue by checking if pcie-pci-bridge is SHPC capable and
if it is mark it as hotpluggable.

Fixes regression in QEMU-6.1 and later, since it was switched
to ACPI based PCI hotplug on Q35 by default at that time.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2001732
[1]
Fixes: 3aa31d7d63 ("hw/pci: reserve IO and mem for pci express downstream ports with no devices attached")
Signed-off-by: Igor Mammedov imammedo@redhat.com
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
CC: mapfelba@redhat.com
CC: kraxel@redhat.com
CC: mst@redhat.com
CC: lvivier@redhat.com
CC: jusual@redhat.com
2021-12-18 11:48:35 -05:00
Eduardo Habkost fa69276802 smbios: Support SMBIOS 3.0 entry point at smbios_romfile_setup()
Support SMBIOS 3.0 entry points if exposed by QEMU in fw_cfg.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-12-18 11:39:13 -05:00
Eduardo Habkost 401d3132fd smbios: Support SMBIOS 3.0 entry point at copy_table()
This will make coreboot code (scan_tables()) and xen code
(xen_biostable_setup()) copy SMBIOS 3.0 entry points if
found.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-12-18 11:39:13 -05:00