Issue on AMD

jubinp
Participant

Hello,

Does anyone have an idea about the error below, which appears when restarting the rtmhs service?

service rtmhs restart
Redirecting to /bin/systemctl restart rtmhs.service
Job for rtmhs.service failed because the control process exited with error code. See "systemctl status rtmhs.service" and "journalctl -xe" for details.

Also, during interface configuration I need to set the port that receives the mirrored traffic to capture mode, and I get the output below:

Stopping service amdstatsmerger...

Stopping service cba...

Stopping service nfc...

Stopping service page2trans...

Stopping service rtmgate...

Stopping service rtmhs...

Applying changes...
No such error ?!

Please suggest.

Babar_Qayyum
Leader

Hello Mr. J.

The data memory limit is the maximum memory size that the AMD process is allowed to use (in megabytes); it is pre-configured to an optimum value at AMD software installation time but can be changed later if you add memory to the system.

RTMHS - The core AMD monitoring process. Use the recommended value, which is 70% of the system RAM size.
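
As a rough illustration of the 70% guideline above, the sketch below derives a candidate value in megabytes from the MemTotal line of /proc/meminfo. The function name recommended_limit_mb is made up for this example, and the result is only a starting point to type into rtminst, not a value taken from the Dynatrace documentation.

```shell
#!/bin/sh
# Convert a RAM size in kB (the unit used by /proc/meminfo) into 70% of
# that size expressed in MB. Integer arithmetic, so the result rounds down.
# recommended_limit_mb is a hypothetical helper name, not part of the AMD software.
recommended_limit_mb() {
    echo $(( $1 * 70 / 100 / 1024 ))
}

# Only query the running system when /proc/meminfo is available (Linux).
if [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    echo "70% of system RAM: $(recommended_limit_mb "$total_kb") MB"
fi
```

On a host with 16 GB of RAM this prints a value a little under 11500 MB.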

Review the link below for more insight.

https://www.dynatrace.com/support/doc/dcrum/installation/install-amd/after-you-install-the-amd-software/rtm-configuration-tool-rtminst/

Also check /usr/adlex/log/rtm.log and /var/log/adlex/rtm.log for more insight into this situation.

Regards,

Babar

Hello Babar,

Checking the logs, I get the error below. Any suggestions?

=================================== RTMHS start ======================================= ===
I 2018-04-01 04:06:01.033 [load.driver] @/usr/adlex/rtmhs/bin/load.drivers: 98 <LDDRV>
I 2018-04-01 04:06:01.042 [load.driver] @/usr/adlex/rtmhs/bin/load.drivers:611 <LDDRV> Hyperthreading already disabled
C 2018-04-01 04:06:01.072 [load.driver] @/usr/adlex/rtmhs/bin/load.drivers:893 <LDDRV> none or more than one driver runs sniffing devices! configuration verification failed!
I 2018-04-01 04:06:11.283 [load.driver] @/usr/adlex/rtmhs/bin/load.drivers: 96 <LDDRV>
I 2018-04-01 04:06:11.283 [load.driver] @/usr/adlex/rtmhs/bin/load.drivers: 97 <LDDRV>
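
To pick the failing entry out of a longer log, a small filter like the sketch below can help. It assumes that every entry starts with a one-letter severity followed by a date, and that "C" marks the critical entries, as the excerpt above suggests; critical_entries is a made-up helper name.

```shell
#!/bin/sh
# Print only the entries whose severity letter is "C" from a log read on stdin.
# Assumption: entries look like "<letter> YYYY-MM-DD hh:mm:ss.mmm [component] ...".
critical_entries() {
    grep -E '^C [0-9]{4}-[0-9]{2}-[0-9]{2} ' || true
}

# Sample input: two entries copied from the log excerpt in this thread.
critical_entries <<'EOF'
I 2018-04-01 04:06:01.042 [load.driver] @/usr/adlex/rtmhs/bin/load.drivers:611 <LDDRV> Hyperthreading already disabled
C 2018-04-01 04:06:01.072 [load.driver] @/usr/adlex/rtmhs/bin/load.drivers:893 <LDDRV> none or more than one driver runs sniffing devices! configuration verification failed!
EOF
```

Against a real log file the call would be `critical_entries < /path/to/logfile`.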

While running the AMD setup, it does not show the option below for rtmhs:

Options:
1 - RTMHS data memory limit

and shows only the following:
Options:
1 - page2trans memory limit
X - Exit

Hello Mr. J.

First press 1 - Data memory limit,
then
press 1 - RTM data memory limit
to configure the value (go with the recommended value).

Did you configure the sniffing interface?

none or more than one driver runs sniffing devices! configuration verification failed! 

Which AMD version do you have?

Regards,

Babar

Hello Babar,

AMD version : 17.0.3.112 (High Speed).

After entering 1, it shows only the following, and not the RTM data memory limit:

1 - page2trans memory limit

X - Exit

While configuring the sniffing interface in capture mode, I am not able to complete the configuration, and I get the output below:

Stopping service amdstatsmerger...

Stopping service cba...

Stopping service nfc...

Stopping service page2trans...

Stopping service rtmgate...

Stopping service rtmhs...

Applying changes... No such error ?!

and then it comes out of the setting screen.

Hello Mr. J.

This is strange.

Is the cable connected to the 10G/sniffing card?

Did you configure the RTMHS data memory limit?

Regards,

Babar

Is the cable connected to the 10G/sniffing card?

>>> details of port where traffic is mirrored

ethtool en**

Settings for en**:

Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
                      100baseT/Half 100baseT/Full
                      1000baseT/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
                       100baseT/Half 100baseT/Full
                       1000baseT/Full
Advertised pause frame use: Symmetric
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 1
Transceiver: internal
Auto-negotiation: on
MDI-X: off (auto)
Supports Wake-on: pumbg
Wake-on: g
Current message level: 0x00000007 (7)
drv probe link
Link detected: yes
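
When several ports have to be compared, the three fields that matter here (speed, duplex, link state) can be pulled out of the ethtool output with a short filter. This is only a sketch: link_summary is a made-up helper name, and the sample input is copied from the output above.

```shell
#!/bin/sh
# Reduce `ethtool <interface>` output (read on stdin) to one summary line.
# link_summary is a hypothetical helper name.
link_summary() {
    awk -F': *' '
        /^[[:space:]]*Speed:/         { speed = $2 }
        /^[[:space:]]*Duplex:/        { duplex = $2 }
        /^[[:space:]]*Link detected:/ { link = $2 }
        END { printf "speed=%s duplex=%s link=%s\n", speed, duplex, link }'
}

# Sample input taken from the ethtool output posted in this thread.
link_summary <<'EOF'
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
Link detected: yes
EOF
```

On the probe itself the call would be `ethtool <interface> | link_summary`, repeated for each port.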

Did you configure the RTMHS data memory limit?

>>> No.

Hello Mr. J.

Select full duplex speed or auto-negotiation. Enter [P]# where # is the number of the entry in the table and then select the appropriate option.

  • When an AMD with 1 GB copper wire network cards is connected to a 100 Mbps tap or a switch mirror port and auto-negotiation fails, you can force the traffic sniffing devices to full duplex at 100 Mbps to avoid performance degradation.

Sniffing interfaces

  • Do NOT configure network for sniffing interfaces. Do NOT use DHCP nor assign a static IP address to them. Sniffing interfaces are administered by the monitoring software. Make sure no network configuration is set up for these interfaces.
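
One way to check the rule above is to confirm that a capture port carries no IPv4 address. The sketch below parses the one-line-per-address format printed by `ip -o -4 addr show`; has_ipv4 is a made-up helper name, and the interface names and addresses in the sample are invented for illustration.

```shell
#!/bin/sh
# Reads `ip -o -4 addr show` output on stdin and prints "yes" if the
# interface named in $1 has an IPv4 address, otherwise "no".
# has_ipv4 is a hypothetical helper name.
has_ipv4() {
    awk -v ifc="$1" '$2 == ifc && $3 == "inet" { found = 1 }
                     END { print (found ? "yes" : "no") }'
}

# Sample input; eth3 and eth0 and the address are made up for this example.
sample='2: eth3    inet 192.0.2.5/24 brd 192.0.2.255 scope global eth3'
echo "$sample" | has_ipv4 eth3   # communication port: an address is expected
echo "$sample" | has_ipv4 eth0   # capture port: should not have one
```

On a live system the input would come from `ip -o -4 addr show | has_ipv4 <interface>`.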

Did you enable automatic AMD setup?

These options are not available in the rtminst menu while automatic AMD setup is enabled, because the settings are adjusted automatically:


  • Data memory limit
  • Driver parameters set
  • RTM HS driver parameters

Regards,

Babar

Hello Babar,

Enable automatic setup is displayed.

---------------------------------------------------------------------------------------------

Automatic AMD setup: disabled

Options:
1 - Interface identification and network setup

2 - AMD setup

X - Exit
Select an option and press 'Enter': 1
Checking for missing drivers... done


Bringing all interfaces up... done

Validating the link speed for et***... speed 10000Mbps seems to work
Validating the link speed for en****... speed 10000Mbps seems to work
Network setup:

Default gateway: 10.*.*.*

Hostname: *************


AMD type: No probe type enabled!!!


Driver type: native
  |          |           |Aval.      |Link     |       |       |Link  |IP/
 #|Name      |HW type    |driver     |param.   |Mode   |Traffic|det.  |Label
--|----------|-----------|-----------|---------|-------|-------|------|--------------
 1|en**      |igb        |Native     |         |Unused |In     |Y     |
 2|en**      |igb        |Native     |         |Unused |In     |Y     |
 3|en****    |enic       |Native     |         |Unused |In     |Y     |
 4|et**      |enic       |Native     |10000M-FD|Comm.  |In     |Y     |10.*.*.*


(*) - changed since last refresh
port type colors: capture, communication, IP assigned, IP DHCP, other
!! -

Problems:
Error: Capture port not defined, use M command
In commands below substitute # with an interface number from table.

Type: 'M #' to setup port mode
'P #' to setup link parameters
'I #' to show details about interface and blink if possible
'T' to change driver mode
'A' to set all connected unused ports to capture mode
'S' to save configuration and apply changes
'X' to exit
press ENTER to refresh
Select an option and press 'Enter': x

----------------------------------------------------------------------------------------------

What needs to be checked here? Please suggest.

Hello Mr. J.

Just a wild guess: HS AMD 17.0 supports only RHEL 7.3 and 7.4.

Which OS do you have?

https://answers.dynatrace.com/spaces/160/open-q-a_2/questions/196271/how-to-change-network-setup-of-an-existing-amd-hs.html

Regards,

Babar

Hello Babar,

It's already on 7.4.

=== OS type

Red Hat Enterprise Linux Server release 7.4 (Maipo)

But what could be the reason that the rtmhs service does not restart and gives the error below?

service rtmhs restart
Redirecting to /bin/systemctl restart rtmhs.service
Job for rtmhs.service failed because the control process exited with error code. See "systemctl status rtmhs.service" and "journalctl -xe" for details.

Please suggest.

Hello Mr. J.

I suggest you open a support ticket while we are troubleshooting.

First, try to configure the automatic AMD setup.

Second, try using this kernel version:

3.10.0-514.26.1.el7.x86_64

Regards,

Babar

Hello Babar,

When I run the command, I get the output below. Any suggestions on this?

[root@*********** network-scripts]# systemctl status rtmhs.service
● rtmhs.service - RTMHS service
Loaded: loaded (/usr/lib/systemd/system/rtmhs.service; disabled; vendor preset: disabled)
Active: activating (auto-restart) (Result: exit-code) since Wed 2018-04-04 15:35:14 IST; 7s ago
Process: 24517 ExecStartPre=/bin/bash -c /usr/bin/perl /usr/adlex/rtmhs/bin/load.drivers >> /var/log/adlex/rtmhs.log 2>&1 (code=exited, status=110)
Process: 24511 ExecStartPre=/bin/bash -c /usr/bin/perl -e 'use hssetup; $0='autoconfig'; rtmhs::autoconfig();' >> /var/log/adlex/rtmhs.log 2>&1 (code=exited, status=0/SUCCESS)
Process: 24508 ExecStartPre=/usr/bin/chown adlex:adlex /var/run/adlex (code=exited, status=0/SUCCESS)
Process: 24505 ExecStartPre=/bin/bash -c cd /var/run 2>/dev/null && /usr/bin/mkdir -p adlex (code=exited, status=0/SUCCESS)
Apr 04 15:35:14 dcctrlmappprr2 systemd[1]: Failed to start RTMHS service.
Apr 04 15:35:14 dcctrlmappprr2 systemd[1]: Unit rtmhs.service entered failed state.
Apr 04 15:35:14 dcctrlmappprr2 systemd[1]: rtmhs.service failed.
[root@dcctrlmappprr2 network-scripts]#
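
To see at a glance which start-up step failed, the Process lines of the status output can be filtered for non-zero exit codes. The sketch below is one way to do that; failed_steps is a made-up helper name, and the sample lines are copied from the output above.

```shell
#!/bin/sh
# Keep only the "Process:" lines of `systemctl status` output (read on
# stdin) whose exit status is not 0/SUCCESS.
# failed_steps is a hypothetical helper name.
failed_steps() {
    grep 'code=exited' | grep -v 'status=0/SUCCESS' || true
}

# Sample input: two lines taken from the status output in this thread.
failed_steps <<'EOF'
Process: 24517 ExecStartPre=/bin/bash -c /usr/bin/perl /usr/adlex/rtmhs/bin/load.drivers >> /var/log/adlex/rtmhs.log 2>&1 (code=exited, status=110)
Process: 24508 ExecStartPre=/usr/bin/chown adlex:adlex /var/run/adlex (code=exited, status=0/SUCCESS)
EOF
```

On the probe the input would come from `systemctl status rtmhs.service --no-pager -l | failed_steps`. In the output above, the only failing step is load.drivers (status=110), and that step redirects its messages to /var/log/adlex/rtmhs.log, so that file is the one to read next.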

Hello Mr. J.

I suspect something is wrong with the driver compilation.

Check /usr/adlex/log/rtm.log and /var/log/adlex/rtm.log for more insight into this situation.

Did you open the support case?

Did you enable the automatic AMD setup?

Regards,

Babar

Hello Babar,

Yes, we enabled automatic AMD setup but then disabled it, as the services were not starting up.

I checked the logs and got this:

=================================== RTMHS start ======================================= ===
I 2018-04-01 04:06:01.033 [load.driver] @/usr/adlex/rtmhs/bin/load.drivers: 98 <LDDRV>
I 2018-04-01 04:06:01.042 [load.driver] @/usr/adlex/rtmhs/bin/load.drivers:611 <LDDRV> Hyperthreading already disabled
C 2018-04-01 04:06:01.072 [load.driver] @/usr/adlex/rtmhs/bin/load.drivers:893 <LDDRV> none or more than one driver runs sniffing devices! configuration verification failed!
I 2018-04-01 04:06:11.283 [load.driver] @/usr/adlex/rtmhs/bin/load.drivers: 96 <LDDRV>
I 2018-04-01 04:06:11.283 [load.driver] @/usr/adlex/rtmhs/bin/load.drivers: 97 <LDDRV

Hello Mr. J.

The message says that something is wrong with the driver.

Would you mind opening a support case while we try to figure it out?

<LDDRV> none or more than one driver runs sniffing devices! configuration verification failed!
I 2018-04-01 04:06:11.283 [load.driver] @/usr/adlex/rtmhs/bin/load.drivers: 96 <LDDRV>
I 2018-04-01 04:06:11.283 [load.driver] @/usr/adlex/rtmhs/bin/load.drivers: 97 <LDDRV

Do you have ERSPAN?

Review the link below about ERSPAN.

https://answers.dynatrace.com/spaces/160/open-q-a_2/questions/190412/can-we-add-more-than-one-sniffing-link-to-a-single.html

Regards,

Babar

Hello Babar,

Could you let me know what ERSPAN is? And what needs to be checked to figure out the driver problem?

Currently we have 4 NICs and have configured the 4th NIC as the communication port. But when configuring capture mode, i.e. for the 1st NIC, we get the error mentioned earlier in this thread.

Thanks for understanding.

Hello Mr. J.

Have a look at the configuration below.

Interface #1 is configured for iLO, so we see Traffic (In) and Link det. (Yes).

Interface #5 is configured for communication, where the traffic is (In+Out) and Link det. (Yes).

Interface #11 is our 10G card, where we receive the SPAN traffic, so we see Traffic (In) and Link det. (Yes).

But in your case I can see that all the interfaces are showing link detected. Why?

Regards,

Babar

Hello Babar,

Thanks for the information. What needs to be checked now, and is it on the network side or the Linux side? Any suggestions?

Hello Mr. J.

My question was: why do all of your interfaces show a detected link?

Regards,

Babar

Hello Babar,

So only 2 NICs need to show Y, and the others should show N?

Hello Mr. J.

This indicates whether you have SPAN/communication traffic on all the ports. I recommend reviewing the link below for a better understanding. If you need more than this, you can contact Dynatrace Support for their expert services to configure your environment properly.

https://www.dynatrace.com/support/doc/dcrum/installation/install-amd/after-you-install-the-amd-software/rtm-configuration-tool-rtminst/

Regards,

Babar

Hello Babar,

Is there any other way to check for the driver errors that are appearing in the log files?

Any suggestions.

Hello Mr. J.

How many Ethernet cables are connected to the AMD?

How many 10 Gb cables are connected to the AMD?

Which 10 Gb card model do you have?

Did you review the link below for the NICs that are compatible with the custom driver for use with the AMD?

https://www.dynatrace.com/support/doc/dcrum/installation/hardware-recommendations/nics-tested-by-dynatrace/

Regards,

Babar

Hello Babar,

Attaching NIC details of AMD probe.

nic-details.txt

Hello Mr. J.

For sniffing with custom drivers, AMD uses 10 Gbps interfaces (dual port for sizes M and L) based on Intel 82599 chipset.

The 10 Gbps card below is from Cisco Systems Inc.

*-network
description: Ethernet interface
product: VIC Ethernet NIC
vendor: Cisco Systems Inc
physical id: 0
bus info: pci@0000:09:00.0
logical name: enp9s0
version: a2
serial: 78:ba:f9:cb:54:a3
size: 10Gbit/s
capacity: 10Gbit/s
width: 64 bits
clock: 33MHz
capabilities: msi msix pm pciexpress bus_master cap_list ethernet physical fibre 10000bt-fd
configuration: autonegotiation=off broadcast=yes driver=enic driverversion=2.3.0.31 duplex=full firmware=4.0(1d) latency=0 link=yes multicast=yes port=fibre speed=10Gbit/s
resources: irq:37 memory:c6c00000-c6c07fff memory:c6c08000-c6c09fff

See the below manufacturer documentation for supported SFP and SFP+ module options and restrictions.

I guess you will have to talk to Dynatrace Support in this situation.

Regards,

Babar

Hello Babar,

Thanks for the information. So in this case the mirrored traffic we receive is on eno1, which is 1Gbit/s. For sniffing, do we have to use the 10Gbit/s card, which here is the Cisco one?

Is there any possibility of modifying the current environment so that the AMD services start up?

Hello Mr. J.

Why do you have mirrored traffic on the 1Gbit/s port instead of the 10Gbit/s one?

How are you receiving the SPAN traffic?

From the Core Switches > Aggregators > AMD

or

From the Core Switches > AMD

Regards,

Babar

Hello Babar,

Spanning has been done from Core Switches > AMD.

There are 4 NICs connected to the probe: two 10G NICs, of which one carries the management IP and the other is free, and two 1G NICs, of which one is receiving traffic and the other is free. The network team will mirror application-level traffic to the remaining NIC; so far they have mirrored only third-party traffic, which the probe is receiving.

Hello Mr. J.

Do you mean that the sniffing traffic will be only on the 1G cards?

And that the 10G card will be used for management?

Have a look at the link below about the coexistence of 1 Gb and 10 Gb cards for sniffing traffic.

https://answers.dynatrace.com/spaces/160/open-q-a_2/questions/120572/spf-card-coexist-with-1gb-port.html

Regards,

Babar

Hello Babar,

That is not the case. They will also be mirroring on the 10 Gbps port, as there was freeze end.

Can you let us know whether the port mirroring on the 1 Gbps port is the reason we cannot complete rtminst and configure the NIC in capture mode?

Hello Mr. J.

I read that 1 Gb and 10 Gb can't coexist for SPAN traffic. You can therefore try the configuration with either the 1 Gb only or the 10 Gb only, to rule this option out.

Regards,

Babar

Hello Babar,

I know it is a silly question, but how do we "try the configuration with either the 1 Gb only or the 10 Gb only"?

So we can't move ahead with the drivers currently on the AMD probe?

Any suggestion much appreciated.

Hello Mr. J.

I suggest you unplug all the cables from the probe server and keep only one 1 Gb cable for communication and the 10 Gb cable for the SPAN traffic (if traffic is available on it).

If you don't have SPAN traffic on the 10 Gb port, then remove that cable too and keep another 1 Gb cable for the SPAN traffic.

Regards,

Babar

Hello Babar,

Thanks for quick response.

Does this mean the rtmhs service is really failing to start because of this? If it depends on which NICs are used for spanning and communication, we will take it up with the Linux/network team to change it. So only 2 NICs are required here, one for SPAN and the other for communication, and all the rest should be removed.

Just for reference, I am attaching the logs from running the command.

rtmhs.txt

Hello Mr. J.

You can have more than one sniffing connection, but the concern is the coexistence of 1 Gb and 10 Gb for spanning at the same time.

So I suggest you first check with 1 Gb communication and 1 Gb spanning. Later on you can span other VLAN data and add one more 1 Gb card.

If you want to use the 10 Gb card, then remove the 1 Gb SPAN cable from the AMD server.

Regards,

Babar