on 24 Nov 2023 08:00 AM
Some messages you see during network monitoring require action on your side, while others are purely informational and need no response.
During OneAgent installation, the host might experience a network communication disruption on all input and output ports. The disruption typically lasts until the end of the OneAgent installation and is associated with the Npcap installation.
For OneAgent version 1.255+, Npcap is installed by default and may cause a network disruption on:
For OneAgent version 1.253 and earlier, you need to uninstall any existing WinPcap driver to allow Npcap installation. Do this on all Windows versions, except for Windows Server 2019 build 1809 without hotfix KB4571748. For more information, see Uninstall WinPcap driver to allow Npcap installation.
OneAgents operating on Linux might generate a page allocation failure message in the logs during network communication monitoring. The message format is:
Apr 29 08:07:54 (host name) kernel: oneagentnetwork: page allocation failure. order:4, mode:0xd0
This message is purely informational and does not indicate any problems with Dynatrace components, the network that you monitor, or any data loss.
In some situations, Network Agent stops working and fails to restart on Windows. This may be because the pcap driver (either Npcap or WinPcap) has been removed or has malfunctioned. In such cases, the Network Agent logs include the IstallInfo: appName: serviceName:no-pcap message:
2022-05-18 20:11:07.744 UTC [00003e90] info [native] NetworkAgent::init:318: Running with IstallInfo: appName: serviceName:no-pcap dllPath:C:\Windows\system32 isInstalled:0 isRunning:0 isError:1
2022-05-18 20:11:07.744 UTC [00003e90] info [native] NetworkAgent::init:320: IstallInfo: appName: serviceName:no-pcap dllPath:C:\Windows\system32 isInstalled:0 isRunning:0 isError:1
2022-05-18 20:11:07.744 UTC [00003e90] info [native] NetworkAgent::init:321: Pcap not operational, Network Agent will shutdown gracefully
2022-05-18 20:11:07.744 UTC [00003e90] info [native] NetworkAgent::~NetworkAgent:136: pktReader.finalize()
2022-05-18 20:11:07.744 UTC [00003e90] info [native] NetworkAgent::~NetworkAgent:139: delete pcapIfc
2022-05-18 20:11:07.744 UTC [00003e90] info [native] DataReporter::~DataReporter:104: DataReporter::~DataReporter
2022-05-18 20:11:07.744 UTC [00003e90] info [native] pkt_analysis::SessAnalyzer::~SessAnalyzer:210: Session insert counter=0, session remove counter=0
2022-05-18 20:11:07.744 UTC [00003e90] info [native] ... last message repeated 2 times ...
2022-05-18 20:11:07.744 UTC [00003e90] info [native] Agent::~Agent:950: delete mTimeProvider
2022-05-18 20:11:07.744 UTC [00003e90] info [native] Agent::~Agent:952: delete mconfig
2022-05-18 20:11:07.744 UTC [00003e90] info [native] Agent::~Agent:955: delete mlog
In such cases, Network Agent is repeatedly restarted by OneAgent Watchdog, but exits early without providing network metrics. In OneAgent versions earlier than 1.241, this condition also caused multiple Network Agent segmentation faults due to flawed error handling logic. Since OneAgent version 1.241, a Network Agent initialization failure no longer results in a segmentation fault.
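To spot this condition across many restarts, the log can be scanned for the serviceName:no-pcap marker. The following is a minimal Python sketch, not a Dynatrace tool: the function name find_pcap_failures is hypothetical, and the parsing assumes the line layout shown in the excerpt above (timestamp in the first two whitespace-separated fields, status flags written as name:digit).

```python
import re

# Status flags as they appear in the "IstallInfo" log line
# (the spelling "IstallInfo" is taken verbatim from the log excerpt).
_FLAGS = re.compile(r"(isInstalled|isRunning|isError):(\d)")


def find_pcap_failures(lines):
    """Return (timestamp, flags) tuples for lines reporting serviceName:no-pcap."""
    failures = []
    for line in lines:
        if "serviceName:no-pcap" not in line:
            continue
        # Assumption: the first two fields are the date and the time.
        timestamp = " ".join(line.split()[:2])
        flags = {name: int(value) for name, value in _FLAGS.findall(line)}
        failures.append((timestamp, flags))
    return failures


sample = [
    "2022-05-18 20:11:07.744 UTC [00003e90] info [native] NetworkAgent::init:320: "
    "IstallInfo: appName: serviceName:no-pcap dllPath:C:\\Windows\\system32 "
    "isInstalled:0 isRunning:0 isError:1",
    "2022-05-18 20:11:07.744 UTC [00003e90] info [native] NetworkAgent::init:321: "
    "Pcap not operational, Network Agent will shutdown gracefully",
]

for timestamp, flags in find_pcap_failures(sample):
    print(timestamp, flags)
```

Many matches at regular intervals would be consistent with the Watchdog restart loop described above.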
To verify that at least one pcap driver is installed, execute the driverquery command. If the driver is not installed, it will be installed automatically during the next upgrade.
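The driverquery check can also be scripted. The Python sketch below is illustrative only and rests on two assumptions that are not stated in this article: that the drivers are listed under the module names npcap (Npcap) and npf (WinPcap), and that driverquery is called with its /fo csv and /nh options so that the module name is the first CSV column. The function name installed_pcap_drivers is hypothetical. Verify the module names on your own hosts before relying on the result.

```python
import csv
import io
import subprocess
import sys

# Assumption: module names under which Npcap and WinPcap drivers are listed.
PCAP_MODULES = {"npcap", "npf"}


def installed_pcap_drivers(driverquery_csv):
    """Return pcap driver module names found in `driverquery /fo csv /nh` output."""
    rows = csv.reader(io.StringIO(driverquery_csv))
    found = {row[0].lower() for row in rows if row and row[0].lower() in PCAP_MODULES}
    return sorted(found)


# driverquery exists only on Windows, so the live call is skipped elsewhere.
if __name__ == "__main__" and sys.platform == "win32":
    output = subprocess.run(
        ["driverquery", "/fo", "csv", "/nh"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    drivers = installed_pcap_drivers(output)
    print("pcap drivers found:", ", ".join(drivers) if drivers else "none")
```

Keeping the parsing separate from the driverquery call makes it possible to test the check against saved output from another host.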
Hi @piotr_szwarc, I ran into a "Host or monitoring unavailable" problem in a Linux environment. It was in fact the monitoring, the Dynatrace OneAgent. I searched the OneAgent logs around that time and found these lines from just 20 minutes before:
2024-03-19 15:02:21.177 UTC [0000050c] info [native] Stats,type=SessAnlzr,sess_ins=301,sess_rm=309,sess_cnt=4
2024-03-19 15:02:21.177 UTC [0000050c] info [native] Stats,type=DataReporter,mq_bytes_send=130018,mq_msg_send=30,mq_bytes_fail=0,mq_msg_fail=0,mq_msg_skip=0
2024-03-19 15:02:21.177 UTC [0000050c] info [native] Stats,type=CPU,system=5985200000,proces=2500000,utilization=0.000417697,os_util=0.01
2024-03-19 15:02:21.177 UTC [0000050c] info [native] Memory statistic DataReporter::mqTransport 131,994 B LocalEndpointsImpl IPv4addrCnt=2 IPv6addrCnt=0 endpointsCnt=4,717 total=245,380B TCPAnalyzer singleKey=72 singleStats=1,136 boundSinglePerc=464 maxStatsCnt=121 maxBucketCnt=6,151 maxRxDLlen=227 total=210,444B PktReader::NetIfc ifcCnt=2 addrCnt=2 total=1,360B
2024-03-19 15:04:12.736 UTC [0000061d] info [native] NetworkAgent::runPktAnlzr:710: done
2024-03-19 15:04:13.035 UTC [0000061c] info [native] NetworkAgent::runPktRcver:615: done
2024-03-19 15:04:13.150 UTC [0000061e] info [native] NetworkAgent::runDataReporter:725: done
2024-03-19 15:04:13.200 UTC [0000050c] info [native] NetworkAgent::main_loop:882: done
2024-03-19 15:04:13.200 UTC [0000050c] info [native] NetworkAgent::~NetworkAgent:130: pktReader.finalize()
2024-03-19 15:04:13.200 UTC [0000050c] info [native] PktReader::finalize:454: Network Agent statistics for Ifc: ens192:
2024-03-19 15:04:13.200 UTC [0000050c] info [native] PktReader::finalize:469: pcap_pkts_processed=61084,pcap_pkts_dropped=0,rdr_pkts_processed=61083,rdr_pkts_dropped=1,cont_puts=61083,cont_drops=0
2024-03-19 15:04:13.200 UTC [0000050c] info [native] PktReader::finalize:454: Network Agent statistics for Ifc: ens224:
2024-03-19 15:04:13.200 UTC [0000050c] info [native] PktReader::finalize:469: pcap_pkts_processed=52860593,pcap_pkts_dropped=0,rdr_pkts_processed=52860484,rdr_pkts_dropped=109,cont_puts=52860484,cont_drops=0
2024-03-19 15:04:13.225 UTC [0000050c] info [native] DataReporter::~DataReporter:104: DataReporter::~DataReporter
2024-03-19 15:04:13.225 UTC [0000050c] info [native] DataReporter::~DataReporter:107: DataReporter::~DataReporter queue remove
2024-03-19 15:04:13.225 UTC [0000050c] info [native] DataReporter::~DataReporter:110: DataReporter::~DataReporter delete transport
2024-03-19 15:04:13.233 UTC [0000050c] info [native] pkt_analysis::SessAnalyzer::~SessAnalyzer:210: Session insert counter=0, session remove counter=0
2024-03-19 15:04:13.233 UTC [0000050c] info [native] ... last message repeated 1 time
2024-03-19 15:04:13.233 UTC [0000050c] info [native] pkt_analysis::SessAnalyzer::~SessAnalyzer:210: Session insert counter=459149, session remove counter=459042
2024-03-19 15:04:13.247 UTC [0000050c] info [native] Agent::~Agent:920: delete mTimeProvider
2024-03-19 15:04:13.249 UTC [0000050c] info [native] Agent::~Agent:923: delete mconfig
2024-03-19 15:04:13.266 UTC [0000050c] info [native] Agent::~Agent:926: delete mlog
And the log then stops. The agent kept reporting data for another 20 minutes, and then it stopped. I don't know whether this is really related, but it is the closest thing to the problem that I was able to find in the logs, and it could make sense. This has apparently happened before, and the solution was again to restart the OneAgent service. Can you give me some clue? I am lost. Why is this happening? How can I keep this particular OneAgent from failing?
Regards.