PCIe to PCIe communication on custom board

Hi, we are trying to do PCIe to PCIe communication on a custom board, but the link comes up only sometimes and most of the time it does not. We are trying to send video from the endpoint to the root port.

*** Please note that this reply is generated by LLM automatically ***

Based on the provided context, I’ll try to provide a comprehensive answer to your question.

It appears that you are trying to establish PCIe to PCIe communication on a custom board using the Jetson AGX Orin module. The link is not being established most of the time, and you are trying to send video from the endpoint to the root port.

Firstly, let’s review the PCIe controller features on the Jetson AGX Orin module. According to the provided context, the Jetson AGX Orin module has six PCIe controllers with the following specifications:

  • Speed: All controllers support up to Gen4 speed.
  • Lane width:
    • C0, C5: up to x8
    • C4: up to x4
    • C1, C2, C3: x1
  • Dual mode: Controllers C0, C4, and C5 support dual mode, which means they can also be configured as endpoints.

To troubleshoot the issue, let’s go through some possible causes:

  1. Hardware Requirements: Ensure that your custom board meets the hardware requirements for PCIe communication. This includes proper lane configuration, signal integrity, and power management.
  2. Lane Configuration: Verify that the lane configuration on your custom board matches the specifications of the Jetson AGX Orin module. You can refer to the “UPHY Lane Configuration” section in the provided context for more information.
  3. Signal Integrity: Signal integrity is crucial for establishing a reliable PCIe link. Ensure that the signal quality is acceptable by checking for excessive over/undershoot and glitches on signal edges. You can use tools like oscilloscopes or signal analyzers to verify signal integrity.
  4. Power Management: Ensure that the power management on your custom board is properly configured. This includes ensuring that the PCIe controllers are properly powered and that the power rails are stable.
  5. Software Configuration: Ensure that the software configuration on your Jetson AGX Orin module is correct. This includes configuring the PCIe controllers, setting up the endpoint and root port, and enabling the necessary drivers. A quick kernel-log check (sketched after this list) usually shows whether the controller ever attempted link-up.
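
As a quick first check for item 5 (a minimal sketch; the exact message text depends on the kernel/JetPack release, and 141a0000 is only an assumption for the C5 controller address), the kernel log on the RP side shows whether the controller ever attempted and completed link training:

# On the root-port Jetson, look for PCIe controller messages in the kernel log
dmesg | grep -iE "pcie|141a0000"
# A failed bring-up typically logs "Phy link never came up" from the controller driver,
# while a successful one reports the negotiated link speed and width.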

To further troubleshoot the issue, you can try the following:

  1. Verify PCIe Link Status: Use lspci (for example lspci -vv) and the sysfs link attributes to verify the PCIe link status on your Jetson AGX Orin module; see the sketch after this list.
  2. Check PCIe Controller Configuration: Verify that the device tree and controller configuration (controller enabled, operating mode, lane mapping) on your Jetson AGX Orin module match your board design.
  3. Enable PCIe Debugging: Enable PCIe debugging on your Jetson AGX Orin module to get more detailed information about the PCIe link establishment process.
  4. Test with a Different Endpoint: Try testing the PCIe communication with a different endpoint device to rule out any issues with the current endpoint device.
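
A minimal sketch of item 1 from the RP side (the BDF 0005:00:00.0 is only an assumption for the C5 root port; substitute whatever lspci actually reports on your board):

sudo lspci -vv | grep -E "^[0-9a-f]{4}:|LnkCap:|LnkSta:"   # capable vs. negotiated speed/width per device
cat /sys/bus/pci/devices/0005:00:00.0/max_link_speed       # what the root port is capable of
cat /sys/bus/pci/devices/0005:00:00.0/current_link_speed   # what was actually negotiated
cat /sys/bus/pci/devices/0005:00:00.0/current_link_width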

For comparison, the Jetson AGX Xavier series also has six PCIe controllers with the same basic layout: C0 and C5 up to x8, C4 up to x4, C1, C2, and C3 at x1, all up to Gen4 speed, with dual-mode (endpoint-capable) support on C0, C4, and C5. However, the Jetson AGX Xavier series has a different default PCIe configuration, with C5 configured as x8, C0 configured as x4, and C1 and C3 configured as x1.

In summary, to establish a reliable PCIe to PCIe communication on your custom board using the Jetson AGX Orin module, ensure that your custom board meets the hardware requirements, verify the lane configuration and signal integrity, and configure the software correctly. Additionally, try troubleshooting the issue by verifying the PCIe link status, checking the PCIe controller configuration, enabling PCIe debugging, and testing with a different endpoint device.

*** Please give the thumbs up if you get answers you like. Or provide feedback to help us improve the answer. ***

One piece of information I want from your side: we are doing this with PCIe endpoint mode, following PCIe Endpoint Mode — NVIDIA Jetson Linux Developer Guide.

So, in the same way, can I try root port mode?

Not sure what the question is here. The default mode is already root port mode.

Actually, I am trying to get the RP to maintain link training.

The default software, without EP mode enabled, configures the port as a root port.

One thing: our custom hardware is designed so that both modules boot at the same time. But according to the documented process, the endpoint should be booted and put into endpoint mode first, and only after that should the root port boot and lspci be checked. So there is a difference. Are there any suggestions on this?

You could try to unbind and re-bind, or reload, the PCIe driver on the RP side. This will re-enumerate the bus and redo link training.
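
A minimal sketch of that, run as root, assuming C5 is the root-port controller in use, that it shows up as 141a0000.pcie, and that the platform driver is named tegra194-pcie (verify the exact device and driver names with ls /sys/bus/platform/drivers/ on your release):

# Unbind and re-bind the root-port controller to force re-enumeration / link training
echo 141a0000.pcie > /sys/bus/platform/drivers/tegra194-pcie/unbind
echo 141a0000.pcie > /sys/bus/platform/drivers/tegra194-pcie/bind
# Alternatively, remove the already-enumerated root port and rescan from the PCI core
echo 1 > /sys/bus/pci/devices/0005:00:00.0/remove
echo 1 > /sys/bus/pci/rescan

If both Jetsons boot at the same time, doing this on the RP side after the endpoint has finished its EP setup effectively recreates the documented "EP first, then RP" ordering without changing your boot sequence.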

Sure, I will do that and let you know.

One more thing: I got this error in minicom from the root port…

Ubuntu 22.04.5 LTS astrarp-desktop ttyTCU0

astrarp-desktop login: [ 31.532557] audit: type=1326 audit(1755680741.808:2): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=kernel pid=2258 comm=“cupsd” exe=“/snap/cups/1102/sbin/cupsd” sig=0 arch=c00
[ 32.014834] audit: type=1326 audit(1755680742.288:3): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=kernel pid=2258 comm=“cupsd” exe=“/snap/cups/1102/sbin/cupsd” sig=0 arch=c00000b7 syscall=55 compa0
[ 32.037795] ttyAMA ttyAMA0: 1 input overrun(s)
[ 32.040580] audit: type=1326 audit(1755680742.316:4): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=kernel pid=2271 comm=“cups-browsed” exe=“/snap/cups/1102/sbin/cups-browsed” sig=0 arch=c00000b7 sy0
[ 59.178525] CPU:0, Error: cbb-fabric@0x13a00000, irq=182
[ 59.178535] **************************************
[ 59.178537] CPU:0, Error:cbb-fabric, Errmon:2
[ 59.178542] Error Code : TIMEOUT_ERR
[ 59.178544] Overflow : Multiple TIMEOUT_ERR
[ 59.178551]
[ 59.178552] Error Code : TIMEOUT_ERR
[ 59.178553] MASTER_ID : CCPLEX
[ 59.178554] Address : 0x3a080000
[ 59.178555] Cache : 0x0 – Device Non-Bufferable
[ 59.178556] Protection : 0x2 – Unprivileged, Non-Secure, Data Access
[ 59.178558] Access_Type : Read
[ 59.178558] Access_ID : 0x10
[ 59.178559] Fabric : cbb-fabric
[ 59.178560] Slave_Id : 0x16
[ 59.178561] Burst_length : 0x0
[ 59.178562] Burst_type : 0x1
[ 59.178562] Beat_size : 0x2
[ 59.178563] VQC : 0x0
[ 59.178564] GRPSEC : 0x7e
[ 59.178564] FALCONSEC : 0x0
[ 59.178567] PCIE_C5_SLV_TIMEOUT_STATUS : 0x1
[ 59.178569] **************************************
[ 59.178585] WARNING: CPU: 0 PID: 131 at drivers/soc/tegra/cbb/tegra234-cbb.c:608 tegra234_cbb_isr+0x13c/0x170
[ 59.178771] ---[ end trace 0000000000000008 ]---
[ 60.982943] CPU:0, Error: cbb-fabric@0x13a00000, irq=182
[ 60.982952] **************************************
[ 60.982953] CPU:0, Error:cbb-fabric, Errmon:2
[ 60.982960] Error Code : TIMEOUT_ERR
[ 60.982962] Overflow : Multiple TIMEOUT_ERR
[ 60.982970]
[ 60.982971] Error Code : TIMEOUT_ERR
[ 60.982972] MASTER_ID : CCPLEX
[ 60.982974] Address : 0x3a04000c
[ 60.982976] Cache : 0x1 – Bufferable
[ 60.982979] Protection : 0x2 – Unprivileged, Non-Secure, Data Access
[ 60.982981] Access_Type : Write
[ 60.982983] Access_ID : 0x2
[ 60.982984] Fabric : cbb-fabric
[ 60.982986] Slave_Id : 0x16
[ 60.982987] Burst_length : 0x0
[ 60.982989] Burst_type : 0x1
[ 60.982990] Beat_size : 0x2
[ 60.982991] VQC : 0x0
[ 60.982991] GRPSEC : 0x7e
[ 60.982992] FALCONSEC : 0x0
[ 60.982995] PCIE_C5_SLV_TIMEOUT_STATUS : 0x1
[ 60.982997] **************************************
[ 60.983014] WARNING: CPU: 0 PID: 131 at drivers/soc/tegra/cbb/tegra234-cbb.c:608 tegra234_cbb_isr+0x13c/0x170
[ 60.983182] ---[ end trace 0000000000000009 ]---

dmesg_pcie_20th_august.txt (3.9 KB)

What is the PCIe device in use on this RP?

NVIDIA Jetson AGX Orin module

Which Jetpack version is in use and how to reproduce this issue?

If detection is intermittent (sometimes working, sometimes not), then it could be a hardware issue.

The JetPack version is 6.0. Whenever we try to bring up the PCIe link, sometimes it comes up at a downgraded speed but is not stable; sometimes it comes up immediately, but most of the time I need to try more than 2 or 3 times.
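
As a side note, a minimal sketch of how to confirm the downgrade and manually request a link retrain from the RP side (again assuming 0005:00:00.0 is the root port; the Retrain Link bit is bit 5 of the PCIe Link Control register):

cat /sys/bus/pci/devices/0005:00:00.0/current_link_speed   # what was negotiated
cat /sys/bus/pci/devices/0005:00:00.0/max_link_speed       # what the port advertises
# Set the Retrain Link bit in the root port's Link Control register
sudo setpci -s 0005:00:00.0 CAP_EXP+0x10.w=0x20:0x20

If the link repeatedly trains down or drops again after a retrain, that points back to the signal-integrity and hardware checks mentioned above.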

Please move to JetPack 6.2.

OK, I will try that and let you know.

Thanks

Sorry, one update here; please take a look at this.

root@astrarp-desktop:/home/astra-rp# sudo apt-cache show nvidia-jetpack
Package: nvidia-jetpack
Source: nvidia-jetpack (6.2.1)
Version: 6.2.1+b38
Architecture: arm64
Maintainer: NVIDIA Corporation
Installed-Size: 194
Depends: nvidia-jetpack-runtime (= 6.2.1+b38), nvidia-jetpack-dev (= 6.2.1+b38)
Homepage: Jetson - Embedded AI Computing Platform | NVIDIA Developer
Priority: standard
Section: metapackages
Filename: pool/main/n/nvidia-jetpack/nvidia-jetpack_6.2.1+b38_arm64.deb
Size: 29300
SHA256: dd9cb893fbe7f80d2c2348b268f17c8140b18b9dbb674fa8d79facfaa2050c53
SHA1: dc630f213f9afcb6f67c65234df7ad5c019edb9c
MD5sum: 9c8dc61bdab2b816dcc7cd253bcf6482
Description: NVIDIA Jetpack Meta Package
Description-md5: ad1462289bdbc54909ae109d1d32c0a8

Package: nvidia-jetpack
Source: nvidia-jetpack (6.2)
Version: 6.2+b77
Architecture: arm64
Maintainer: NVIDIA Corporation
Installed-Size: 194
Depends: nvidia-jetpack-runtime (= 6.2+b77), nvidia-jetpack-dev (= 6.2+b77)
Homepage: Jetson - Embedded AI Computing Platform | NVIDIA Developer
Priority: standard
Section: metapackages
Filename: pool/main/n/nvidia-jetpack/nvidia-jetpack_6.2+b77_arm64.deb
Size: 29298
SHA256: 70553d4b5a802057f9436677ef8ce255db386fd3b5d24ff2c0a8ec0e485c59cd
SHA1: 9deab64d12eef0e788471e05856c84bf2a0cf6e6
MD5sum: 4db65dc36434fe1f84176843384aee23
Description: NVIDIA Jetpack Meta Package
Description-md5: ad1462289bdbc54909ae109d1d32c0a8

Package: nvidia-jetpack
Source: nvidia-jetpack (6.1)
Version: 6.1+b123
Architecture: arm64
Maintainer: NVIDIA Corporation
Installed-Size: 194
Depends: nvidia-jetpack-runtime (= 6.1+b123), nvidia-jetpack-dev (= 6.1+b123)
Homepage: Jetson - Embedded AI Computing Platform | NVIDIA Developer
Priority: standard
Section: metapackages
Filename: pool/main/n/nvidia-jetpack/nvidia-jetpack_6.1+b123_arm64.deb
Size: 29312
SHA256: b6475a6108aeabc5b16af7c102162b7c46c36361239fef6293535d05ee2c2929
SHA1: f0984a6272c8f3a70ae14cb2ca6716b8c1a09543
MD5sum: a167745e1d88a8d7597454c8003fa9a4
Description: NVIDIA Jetpack Meta Package
Description-md5: ad1462289bdbc54909ae109d1d32c0a8

root@astrarp-desktop:/home/astra-rp#

Hi ??

@WayneWWW and @carolyuu, multiple JetPack versions are shown here, so is this correct?

Hi, is there any suggestion?